Heater
Posts: 18335
Joined: Tue Jul 17, 2012 3:02 pm

Schrödinger's Code - Undefined behavior in theory and practice

Mon May 31, 2021 11:03 am

If you have ever had the sneaking feeling that you are losing your mind when trying to get your C or C++ code working, or if you have ever started to think the compilers are completely insane, then rest assured, you are not alone.

This is a very nicely and easily understandable essay on what you have been suffering from:

"Schrödinger's Code - Undefined behaviour in theory and practice": https://queue.acm.org/detail.cfm?id=3468263
Memory in C++ is a leaky abstraction .

Heater
Posts: 18335
Joined: Tue Jul 17, 2012 3:02 pm

Re: Schrödinger's Code - Undefined behavior in theory and practice

Mon Jun 07, 2021 6:32 pm

What, nobody much interested in writing correct C/C++?

pidd
Posts: 2262
Joined: Fri May 29, 2020 8:29 pm
Location: Wirral, UK

Re: Schrödinger's Code - Undefined behavior in theory and practice

Mon Jun 07, 2021 8:54 pm

C is the culmination of putting every possible programming trap in one place (apart from leading whitespace).

I do like the basic concept of the article: until you try it, you have no idea if it will work or how it will fail. For me this has turned the whole Schrödinger's cat idea from hypothetical garbage into a realistic perspective.

There is a lot to be said for trial and error, look at SpaceX's amazing progress compared to NASA's own slow attempts not to have any whoopsies.

Heater
Posts: 18335
Joined: Tue Jul 17, 2012 3:02 pm

Re: Schrödinger's Code - Undefined behavior in theory and practice

Mon Jun 07, 2021 9:13 pm

pidd wrote:
Mon Jun 07, 2021 8:54 pm
C is the culmination of putting every possible programming trap in one place (apart from leading whitespace).
Yeah. Except I would not say C was the culmination. Rather it is the origin. The culmination is C++. And it has not finished culminating yet :)

As for white space errors C and C++ have them as well. Consider something like:

Code: Select all

if (a == b)
     c = d;
Now somebody comes along and wants to do something else inside that "if" block and they end up with this:

Code: Select all

if (a == b)
     c = d;
     e = f;
Oops. That "e = f" is not conditional. What they really wanted was:

Code: Select all

if (a == b) {
     c = d;
     e = f;
}
This has been the cause of many bugs. Famously perhaps Apple’s SSL bug: https://nakedsecurity.sophos.com/2014/0 ... ial-patch/ Note the line "goto fail; /* MISTAKE! THIS LINE SHOULD NOT BE HERE */"
pidd wrote:
Mon Jun 07, 2021 8:54 pm
I do like the basic concept of the article: until you try it, you have no idea if it will work or how it will fail. For me this has turned the whole Schrödinger's cat idea from hypothetical garbage into a realistic perspective.
Scary isn't it!
pidd wrote:
Mon Jun 07, 2021 8:54 pm
There is a lot to be said for trial and error, look at SpaceX's amazing progress compared to NASA's own slow attempts not to have any whoopsies.
"trial and error" should be your tests.

Except that is not enough; be sure to use the memory, thread and other sanitisers as well.
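
As a minimal sketch of what running the sanitisers can look like, assuming GCC or Clang: both accept `-fsanitize=address,undefined`, which instruments the program so that out-of-bounds accesses and many undefined operations are reported at run time. The function `sum` and the file name `demo.c` are made up for illustration.

```c
/* Build (GCC or Clang): cc -g -fsanitize=address,undefined demo.c */
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sums the first n elements of v.
   An off-by-one such as `i <= n` would read past the end of the array.
   A plain test run might still pass by luck; an instrumented build is
   meant to report the bad read when it happens. */
long sum(const int *v, size_t n) {
    assert(v != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

The thread sanitiser is a separate build (`-fsanitize=thread`), since it cannot be combined with the address sanitiser.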

swampdog
Posts: 718
Joined: Fri Dec 04, 2015 11:22 am

Re: Schrödinger's Code - Undefined behavior in theory and practice

Mon Jun 07, 2021 10:20 pm

I did give it a read last week and tried the suggested options on my personal library code. Some of it is timestamped 2004. Compiled just fine. I didn't want to appear smug but as you've nudged the thread. ;-)

I guess it helps that it started life cross-platform: windows/cygwin/bsd/linux, not that it is any more - so got thrown at a lot of compilers.

Another thing is efficiency. It's a computer. I mostly don't care. I'd much rather write "StrRemL(&s,n)" than "s.erase(s.begin(),s.begin()+n)" and load a whole file into a container with sdFileLoad() than muck about reading it a line at a time. Less to debug.

The examples were orientated toward C rather than C++ and as I wrap all my non-trivial C calls in C++ sanity wrappers it's difficult for odd behaviour to occur.

Code: Select all

sdStr FC
sdGetCwd(void) SD_THROW((sde_Sys))
/*[$PROTO
Return current working directory or raise sde_Sys on failure. The caller will
not see ERANGE from underlying api call (ie when buffer is too small) as this
call will handle that situation.
]*/
{sdMem<SD_C>    m       (sdMaxPath());

 while (!sd_t_getcwd<SD_C>(m,m.tSize()))        {
        if (ERANGE != sdErrno())
                throw sde_Sys(SD_SFL(sdGetCwd))
        ;
        m += sdMaxPath();
 }

 return m.t_ptr();
}
..where sdStr (these days) is std::string, SD_C is char (windows has wchar_t) and sdMem<> saves a leak.

jahboater
Posts: 7150
Joined: Wed Feb 04, 2015 6:38 pm
Location: Wonderful West Dorset

Re: Schrödinger's Code - Undefined behavior in theory and practice

Mon Jun 07, 2021 10:47 pm

Heater wrote:
Mon Jun 07, 2021 9:13 pm
As for white space errors C and C++ have them as well. Consider something like:

Code: Select all

if (a == b)
     c = d;
Now somebody comes along and wants to do something else inside that "if" block and they end up with this:

Code: Select all

if (a == b)
     c = d;
     e = f;
Oops. That "e = f" is not conditional.
Why can't people just heed the compiler warnings?

Code: Select all

try.c:29:3: error: this 'if' clause does not guard...  (-Werror=misleading-indentation)
   29 |   if (a == b)
      |   ^~
try.c:31:6: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'
   31 |      e = f;
      |      ^

Heater
Posts: 18335
Joined: Tue Jul 17, 2012 3:02 pm

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 1:36 am

jahboater wrote:
Mon Jun 07, 2021 10:47 pm
Why can't people just heed the compiler warnings?
There is no warning unless one has "-Wall" set.

I'm pretty sure there was no such warning for the first couple of decades of my using C.

lurk101
Posts: 658
Joined: Mon Jan 27, 2020 2:35 pm
Location: Cumming, GA (US)

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 2:57 am

"The correct approach, of course, is to write software whose behavior is predictable according to the relevant language standards. Diligent programmers study standards carefully and leverage tools to keep undefined operations out of their code."

^ Says it all!
How to make your arguments stronger? Longer is not the answer.

Heater
Posts: 18335
Joined: Tue Jul 17, 2012 3:02 pm

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 3:30 am

lurk101 wrote:
Tue Jun 08, 2021 2:57 am
"The correct approach, of course, is to write software whose behavior is predictable according to the relevant language standards. Diligent programmers study standards carefully and leverage tools to keep undefined operations out of their code."

^ Says it all!
All very true. However impossible in practice.

Even the most diligent, experienced programmers make mistakes.

Microsoft, for example, has boat loads of very good programmers and yet they report that ~70% of the vulnerabilities Microsoft assigns a CVE each year continue to be memory safety issues. That is to say programmers writing undefined behaviour into their code. See: https://msrc-blog.microsoft.com/2019/07 ... cure-code/

A better idea would be to use languages that don't allow one to write code with unpredictable behaviour. An old idea, dating back to ALGOL.

I'll match your quote from the article with this quote from the same article:
Quicksort inventor C.A.R. Hoare summarized one philosophy in his Turing Award lecture: The behavior of every syntactically correct program should be completely predictable from its source code. For the sake of safety, security, and programmer sanity, it must be impossible for a program to "run wild." Ensuring well-defined behavior imposes runtime overheads (e.g., array bounds checks), but predictability justifies the cost. Today, "safe" languages such as Java embody Hoare's advice.
Emphasis mine.

I think that really says it all.

Which we can update today as we now have languages that will ensure predictable behaviour with no run time overheads.

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 28918
Joined: Sat Jul 30, 2011 7:41 pm

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 9:25 am

Heater wrote:
Tue Jun 08, 2021 3:30 am
lurk101 wrote:
Tue Jun 08, 2021 2:57 am
"The correct approach, of course, is to write software whose behavior is predictable according to the relevant language standards. Diligent programmers study standards carefully and leverage tools to keep undefined operations out of their code."

^ Says it all!
All very true. However impossible in practice.

Even the most diligent, experienced programmers make mistakes.

Microsoft, for example, has boat loads of very good programmers and yet they report that ~70% of the vulnerabilities Microsoft assigns a CVE each year continue to be memory safety issues. That is to say programmers writing undefined behaviour into their code. See: https://msrc-blog.microsoft.com/2019/07 ... cure-code/

A better idea would be to use languages that don't allow one to write code with unpredictable behaviour. An old idea, dating back to ALGOL.

I'll match your quote from the article with this quote from the same article:
Quicksort inventor C.A.R. Hoare summarized one philosophy in his Turing Award lecture: The behavior of every syntactically correct program should be completely predictable from its source code. For the sake of safety, security, and programmer sanity, it must be impossible for a program to "run wild." Ensuring well-defined behavior imposes runtime overheads (e.g., array bounds checks), but predictability justifies the cost. Today, "safe" languages such as Java embody Hoare's advice.
Emphasis mine.

I think that really says it all.

Which we can update today as we now have languages that will ensure predictable behaviour with no run time overheads.
I know a number of MS programmers that I would not employ, yet who have managed to avoid the MS cull for at least 15 years.

So I would not hold them up as paragons of programming virtue.
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Working in the Application's Team.

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 28918
Joined: Sat Jul 30, 2011 7:41 pm

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 9:26 am

Heater wrote:
Tue Jun 08, 2021 1:36 am
jahboater wrote:
Mon Jun 07, 2021 10:47 pm
Why can't people just heed the compiler warnings?
There is no warning unless one has "-Wall" set.

I'm pretty sure there was no such warning for the first couple of decades of my using C.
Any compiler from the last 10 years at least should have -Wall or similar. Your experience prior to that is irrelevant. Technology has improved.

jahboater
Posts: 7150
Joined: Wed Feb 04, 2015 6:38 pm
Location: Wonderful West Dorset

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 9:28 am

Heater wrote:
Tue Jun 08, 2021 3:30 am
Quicksort inventor C.A.R. Hoare summarized one philosophy in his Turing Award lecture: The behavior of every syntactically correct program should be completely predictable from its source code. For the sake of safety, security, and programmer sanity, it must be impossible for a program to "run wild." Ensuring well-defined behavior imposes runtime overheads (e.g., array bounds checks), but predictability justifies the cost. Today, "safe" languages such as Java embody Hoare's advice.
Emphasis mine.

I think that really says it all.

Which we can update today as we now have languages that will ensure predictable behavior with no run time overheads.
Programs written in such languages may still fail, but they are likely to fail gracefully.

Regardless of the programming language, I think it's worth getting into the mindset of proving to yourself beyond all doubt that UB or other failure conditions cannot occur. If the code is too complex, then add explicit checks (thus enabling the proof). Sanitize unpredictable input.

Over-allocation of resources is usually the cheap-and-nasty option.
While integers at risk of overflow may sometimes be replaced by 64-bit ones, which could obviate the need for overflow checks (at little cost on modern hardware), it's not a good idea to allocate far too much memory just to be safe.
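
A minimal sketch of the widening idea, assuming `int` is 32 bits or narrower (true on common platforms, but an assumption). The name `add_checked` is made up for illustration: the sum is formed in 64 bits, where it cannot overflow, and is range-checked before being narrowed again.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: adds a and b.
   Returns true and stores the sum in *out when it fits in an int,
   returns false (leaving *out untouched) when it does not. */
bool add_checked(int a, int b, int *out) {
    assert(out != NULL);
    /* Assumes int is at most 32 bits, so the 64-bit sum cannot overflow. */
    int64_t wide = (int64_t)a + (int64_t)b;
    if (wide < INT_MIN || wide > INT_MAX)
        return false;
    *out = (int)wide;
    return true;
}
```

Calling `add_checked(INT_MAX, 1, &r)` returns false instead of executing a signed overflow.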

jahboater
Posts: 7150
Joined: Wed Feb 04, 2015 6:38 pm
Location: Wonderful West Dorset

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 9:44 am

Heater wrote:
Tue Jun 08, 2021 3:30 am
Which we can update today as we now have languages that will ensure predictable behavior with no run time overheads.
You provided a relevant quote some time ago, the gist of it was:

1) features in the newer, more expressive, languages may obviate the need for run-time checks.

2) some code can be proven correct (not to have UB say), or otherwise, at compile time.

3) other code must have run-time checks.

---------------------------------------------------------------------
1) yes

2) I think this is compiler technology dependent. A by-product of the heavy optimization that C compilers do is the advanced data-flow analysis which can provide more accurate diagnostics.
Compilers check all sorts of things. I see that GCC now does a strict analysis of the worst case buffer size needed for a sprintf, even with a complex format.
In general, as long as the warnings are switched on, I think C does a good job of static checking. The simple syntax may help.

3) I don't believe this is cost free:

n = array[ atoi(argv[1]) ];

this expression simply must have two or more explicit checks, involving actual code and run-time overhead.
(not that that's a bad thing).
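
A sketch of what those explicit checks might look like. `parse_index` is a made-up helper, and `strtol` stands in for `atoi` because `atoi` gives no way to detect bad input (its behaviour is undefined when the value cannot be represented).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parses text as a decimal index into an array of
   len elements. Returns true and stores the index in *out only when the
   text is a complete number that lies inside the array. */
bool parse_index(const char *text, size_t len, size_t *out) {
    assert(text != NULL && out != NULL);
    char *end;
    errno = 0;
    long v = strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return false;                    /* check 1: not a clean number */
    if (errno == ERANGE)
        return false;                    /* check 2: does not fit in a long */
    if (v < 0 || (unsigned long)v >= len)
        return false;                    /* check 3: outside the array */
    *out = (size_t)v;
    return true;
}
```

With this, `n = array[i]` is only reached after `parse_index(argv[1], array_len, &i)` has returned true, and the caller still has to confirm that `argv[1]` exists at all.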

swampdog
Posts: 718
Joined: Fri Dec 04, 2015 11:22 am

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 10:54 am

I could understand windows programmers not writing good code at the winapi level. A quitting app could consume increased resources. What's the point in checking every tiny memory allocation? By the time the os returns null to an app, windows itself was already flaky. The docs reinforced this: after a while you'd get a sense of m$ trying to hide a problem. Slap a framework on top of it and there's not a chance of dealing with it. In that respect I guess .net and c# were a step in the right direction. You'd struggle to implement something like my sdMem<> because of the lack of structured (aka LPVOID) winapi datatypes.

It does no harm when learning, to write your own "malloc", not so much for handling pointers in this case but because you can set a limit on its total size and therefore use it to test your code.
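
A rough sketch of that idea, with made-up names (`test_set_budget`, `test_malloc`, `copy_bytes`). It wraps the real `malloc` and refuses requests once a byte budget is used up, so the failure path of the code under test can actually be exercised. As a simplification, the budget is not refunded on `free`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static size_t budget_left = 0;   /* bytes the test still allows */

/* Hypothetical test hook: sets how many bytes may be allocated in total. */
void test_set_budget(size_t bytes) { budget_left = bytes; }

/* Wrapper around malloc that fails once the budget is exhausted. */
void *test_malloc(size_t size) {
    if (size > budget_left)
        return NULL;
    void *p = malloc(size);
    if (p != NULL)
        budget_left -= size;
    return p;
}

/* Example code under test: copies n bytes and reports failure to the
   caller instead of dereferencing a null pointer. */
char *copy_bytes(const char *src, size_t n) {
    assert(src != NULL);
    char *dst = test_malloc(n);
    if (dst == NULL)
        return NULL;             /* the path a limited allocator lets us reach */
    memcpy(dst, src, n);
    return dst;
}
```

Setting the budget to a few bytes and calling `copy_bytes` twice is then enough to see both the success and the failure path without exhausting real memory.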

LdB
Posts: 1703
Joined: Wed Dec 07, 2016 2:29 pm

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 11:16 am

Linux and Windows, like most modern O/S, overcommit memory, so even if a malloc works it doesn't mean much other than congratulations, you got a pointer. This whole thread is just random generalizations and sheds little light on anything meaningful, and I love the concept of "correct c++ code".

If you don't like the way compilers are then write your own because we really don't care what you think. Personally all I care about is what the compiler writers think and what they are doing; anything beyond that is a waste of time.

swampdog
Posts: 718
Joined: Fri Dec 04, 2015 11:22 am

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 11:36 am

LdB wrote:
Tue Jun 08, 2021 11:16 am
Linux and Windows, like most modern O/S, overcommit memory, so even if a malloc works it doesn't mean much other than congratulations, you got a pointer. This whole thread is just random generalizations and sheds little light on anything meaningful, and I love the concept of "correct c++ code".

If you don't like the way compilers are then write your own because we really don't care what you think. Personally all I care about is what the compiler writers think and what they are doing; anything beyond that is a waste of time.
Did "we" get out of bed on the wrong side today?

#define malloc MyMalloc
#define free MyFree
//app

Quickest compiler I never had to write.

jahboater
Posts: 7150
Joined: Wed Feb 04, 2015 6:38 pm
Location: Wonderful West Dorset

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 12:46 pm

swampdog wrote:
Tue Jun 08, 2021 10:54 am
What's the point in checking every tiny memory allocation?
It's good practice, even though (on Linux) the OOM (Out Of Memory) killer will likely terminate your process first.

This thread is about much more than that.

Did you read Heater's link?
https://queue.acm.org/detail.cfm?id=3468263
It is well written and a good read, though I didn't see anything unexpected.

lurk101
Posts: 658
Joined: Mon Jan 27, 2020 2:35 pm
Location: Cumming, GA (US)

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 3:34 pm

jahboater wrote:
Tue Jun 08, 2021 12:46 pm
Did you read Heater's link?
https://queue.acm.org/detail.cfm?id=3468263
It is well written and a good read, though I didn't see anything unexpected.
I did, and expected to be enlightened in ways to avoid 'undefined behavior'. Instead I got a long dissertation on white space, misplaced or missing brackets, out-of-bounds indexing, integer overflow, and such. All of which have caused countless bugs, but are not really what I'd call undefined behavior.

Valid syntax with undefined behavior:

Code: Select all

void a_function(volatile int n) {
   ...
}
What does volatile mean here? Or worse, what did the coder think it meant?

jahboater
Posts: 7150
Joined: Wed Feb 04, 2015 6:38 pm
Location: Wonderful West Dorset

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 3:47 pm

lurk101 wrote:
Tue Jun 08, 2021 3:34 pm
out-of-bounds indexing, integer overflow, and such. All of which have caused countless bugs, but are not really what I'd call undefined behavior.
These are most definitely UB.

LdB
Posts: 1703
Joined: Wed Dec 07, 2016 2:29 pm

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 4:01 pm

Annex J of the C11 standard lists 199 undefined behaviours, and it is vital that some of them be there. There are more than that, but those are the important ones that are tracked in implementation.

The article didn't tell you why undefined behaviours exist and why we won't close them ... you see they have an upside.

@lurk101
void a_function(volatile int n) {
...
}
You assume the programmer doesn't know what he is doing. On a couple of CPUs that has a very precise meaning and is needed. Just because you don't know what it will do in a generic example doesn't make it wrong (that is the portability problem). If you really want to do that, most compilers that understand it would change it to the keyword "register", to indicate that the programmer wants to carry n in a register.

The one none of you talked about is variables in a function call, the fact you don't define input and output
void a_function(char* n) {
...
}
Can you tell if it's valid to write to n from that definition ... it can be undefined because n may point to ROM.
Last edited by LdB on Tue Jun 08, 2021 4:07 pm, edited 1 time in total.

pidd
Posts: 2262
Joined: Fri May 29, 2020 8:29 pm
Location: Wirral, UK

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 4:06 pm

jahboater wrote:
Tue Jun 08, 2021 3:47 pm
lurk101 wrote:
Tue Jun 08, 2021 3:34 pm
out-of-bounds indexing, integer overflow, and such. All of which have caused countless bugs, but are not really what I'd call undefined behavior.
These are most definitely UB.
I thought that was the point of comparing it to Schrödinger, before you run the program it equally has no bugs and has bugs, so it is undefined, only after you run the program (open the box) does anything become defined.

ejolson
Posts: 7476
Joined: Tue Mar 18, 2014 11:47 am

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 4:07 pm

jamesh wrote:
Tue Jun 08, 2021 9:26 am
Heater wrote:
Tue Jun 08, 2021 1:36 am
jahboater wrote:
Mon Jun 07, 2021 10:47 pm
Why can't people just heed the compiler warnings?
There is no warning unless one has "-Wall" set.

I'm pretty sure there was no such warning for the first couple of decades of my using C.
Any compiler from the last 10 years at least should have -Wall or similar. Your experience prior to that is irrelevant. Technology has improved.
Badly formatted whitespace in C code could be fixed automatically since 1976. In 1982 indent became an official part of the Berkeley Software Distribution.

https://en.m.wikipedia.org/wiki/Indent_(Unix)

Much of the static code analysis bundled into today's C compilers using -Wall was part of lint which dates back to 1978.

Note that creating one large executable was not possible within the 16-bit address space of the PDP-11. A modular design was also favoured since Multics had recently provided a practical example showing that bigger isn't always better.

While CPU architectures have advanced to the point computers with a 64-bit address space can be had for US $35, no similar increases in human memory or abilities have occurred. Thus, a modular design made of tools designed to do one thing well is still a useful simplifying force to enable human understanding.

In particular, no matter how impressive, a computer does not promote human interests unless people can understand what it does, why and how to fix it. Seen another way, language features that enable the construction of very large programs don't solve the problem that such designs aren't a very good idea in the first place.

LdB
Posts: 1703
Joined: Wed Dec 07, 2016 2:29 pm

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 4:15 pm

I think most programmers would simplify it to

If I wanted to drive a crash proof slow Volvo compiler I would buy one, I prefer the freedom and speed and trust in my ability to not crash the compiler.

lurk101
Posts: 658
Joined: Mon Jan 27, 2020 2:35 pm
Location: Cumming, GA (US)

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 5:42 pm

LdB wrote:
Tue Jun 08, 2021 4:15 pm
I think most programmers would simplify it to

If I wanted to drive a crash proof slow Volvo compiler I would buy one, I prefer the freedom and speed and trust in my ability to not crash the compiler.
I think it's mentioned early in the article under safety vs. performance.

lurk101
Posts: 658
Joined: Mon Jan 27, 2020 2:35 pm
Location: Cumming, GA (US)

Re: Schrödinger's Code - Undefined behavior in theory and practice

Tue Jun 08, 2021 5:53 pm

LdB wrote:
Tue Jun 08, 2021 4:01 pm
void a_function(volatile int n) {
...
}
You assume the programmer doesn't know what he is doing. On a couple of CPUs that has a very precise meaning and is needed. Just because you don't know what it will do in a generic example doesn't make it wrong (that is the portability problem). If you really want to do that, most compilers that understand it would change it to the keyword "register", to indicate that the programmer wants to carry n in a register.
Actually it drives compiler writers batty!
The one none of you talked about is variables in a function call, the fact you don't define input and output
void a_function(char* n) {
...
}
Can you tell if it's valid to write to n from that definition ... it can be undefined because n may point to ROM.
You mean writing through n? You can't tell, nor is there any expectation that you should in this case. Writing to ROM is not an undefined operation, it's a dumb programming error.

As for writing to n, I can tell it would be perfectly valid.
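
To illustrate the distinction between writing to n and writing through n, and how a signature can document input versus output, here is a small sketch. The function names are invented for the example. `const` on the pointed-to type tells both the reader and the compiler that the function will not write through the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Input only: the const promises not to write through s.
   Writing to s itself (s++) is fine, it only changes the local copy. */
size_t count_char(const char *s, char c) {
    assert(s != NULL);
    size_t n = 0;
    while (*s != '\0') {
        if (*s == c)
            n++;
        s++;
    }
    return n;
}

/* Output: no const, so the caller should expect writes through s and
   must pass storage that is actually writable. */
void fill_char(char *s, char c) {
    assert(s != NULL);
    for (; *s != '\0'; s++)
        *s = c;
}
```

An attempt to assign through `s` inside `count_char` would be rejected at compile time, whereas nothing in the type of `fill_char` stops a caller from handing it read-only storage.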
