Credit: Gary Neill
Many modern dynamic languages lack tools for understanding complex failures.
Thank you for the interesting article. However, I have one comment / question regarding the broken code example. The article states that "[t]his simple program has a fatal flaw: while trying to clear each item in the ptrs array at lines 14–15, it clears an extra element before the array (where ii = −1)." I agree that the out of bounds access and write on the ptrs array is not good. However, wouldn't dereferencing and writing to an uninitialized pointer be more of a problem here?
I admit I am not familiar with Illumos and it is not clear to me what hardware this example was run on but it seems like writing to below the ptrs array with the negative index is probably just going to write to an area of the stack that isn't currently in use (ptrs, ii, pushed registers, previous stack frame, etc, all being above ptrs[-1]). However, since the stack isn't initialized *(ptrs[ii]) is going to access whatever address happens to be in memory at ptrs[ii] and *(ptrs[ii]) = 0; is going to try and write a 0 to that address. Wouldn't this attempt to write to a random location in memory be more fatal to the program's execution?
Berin, thanks for commenting. Yes, this program does a lot of broken things, and dereferencing one of the array elements is ultimately what lead to the crash. If it hadn't, then corrupting arbitrary memory (either by writing past the end of the array or by successfully dereferencing those pointers) may well have triggered a crash sometime later. In all of these cases, there'd be little hope of root-causing the bug without some kind of memory dump (assuming code inspection is intractable, as it would be in a more realistic program).