So Mars and I have been tracking down some code generation bugs with one of the new 2006 features for the better part of the afternoon. We're working on Windows, where code generator bugs are notoriously difficult to track down. You see, Mars is well versed in the ways of MacsBug, which was awesome for PPC back in the OS 9 days. However, there's not much in the way of good kernel-level debuggers on OS X (well, gdb, sorta kinda) -- but CodeWarrior will at least help. CodeWarrior trying to help on Windows is an absolute joke. Half the time you can't even be sure that what it's showing you really exists.
We started tracking the crash down by using my copy of Visual Studio 6 on Windows. When the app would crash, we'd hit the Debug button on the ensuing dialog and break into VC++. We'd break in, and it'd show me the asm. Yay, right? Wrong. The act of selecting the asm would sometimes change it -- and what we were looking at had absolutely no context information whatsoever. It was just a sea of asm and we seemed to be out in the weeds somewhere.
However, Windows has a great kernel level debugger with WinDbg. It's powerful (if a bit confusing to use for the uninitiated) and free. A great combination. I've had it installed for years and I pull it out on occasion to help with this sort of issue. So I whipped it out and started debugging.
Poof! Within 30 seconds, I found that we were crashing while trying to unlock an object. The object pointer was pure bunk. But it was confusing -- the method where the crash was happening had absolutely no reference-counted objects in it. So why were we getting into an unlock object call when there were no objects to unlock?
Now came a quandary. I needed to be able to break into a good debugger that would show me accurate, real-time assembly. However, I couldn't manually set any breakpoints, and I had no clue where the function resided when the app was loaded into memory (I really didn't feel like tracking it down via the PE32 headers and mapping it by hand). Dun... dun dun dun... DUHHHHHH!
Enter awesome hack of the day.
[rbcode]
// Declare the Win32 DebugBreak function, then call it to trigger a breakpoint right here
Declare Sub DebugBreak Lib "Kernel32" ()
DebugBreak
[/rbcode]
This handy snippet of debuggery love tells the OS to break the application into the currently executing debugger if one is attached. Ooooh! I bet you can see where this is going.
I shoved this declare into the method I wished to see disassembled and debugged. I launched the app under WinDbg (by selecting File->Open Executable), and hit F5 to unconditionally run the app. Then, when the app got to my function -- poof, I broke into WinDbg. Then I opened up the disassembly window and started stepping. Ta da!
00c558c3 e88e40c07b call kernel32!DebugBreak (7c859956)
00c558c8 8b65dc mov esp,[ebp-0x24] ss:0023:0012e98c=0012e984
00c558cb 8d45f5 lea eax,[ebp-0xb]
00c558ce 8945f0 mov [ebp-0x10],eax
00c558d1 33c9 xor ecx,ecx
00c558d3 03c1 add eax,ecx
00c558d5 8945ec mov [ebp-0x14],eax
00c558d8 8a10 mov dl,[eax]
00c558da 50 push eax
00c558db 8bc2 mov eax,edx
00c558dd 6698 cbw
00c558df 98 cwde
00c558e0 8bd0 mov edx,eax
00c558e2 58 pop eax
00c558e3 8855ea mov [ebp-0x16],dl
00c558e6 50 push eax
00c558e7 8bc2 mov eax,edx
00c558e9 6698 cbw
00c558eb 98 cwde
00c558ec 8bd0 mov edx,eax
00c558ee 58 pop eax
00c558ef 8955e4 mov [ebp-0x1c],edx
00c558f2 3bd1 cmp edx,ecx
00c558f4 0f9cc3 setl bl
00c558f7 83e301 and ebx,0x1
00c558fa 885deb mov [ebp-0x15],bl
00c558fd 895dd8 mov [ebp-0x28],ebx
00c55900 e905000000 jmp 00c5590a
As you can see, I had just stepped my way out of the DebugBreak call in the Kernel32 module, and was now sitting in the function I wished to be debugging. w00t!
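One caveat, as far as I understand the Win32 behavior: if no debugger is attached when DebugBreak runs, the breakpoint exception goes unhandled and the app will typically go down (or pop the crash dialog). Here's a minimal sketch of a guarded version that only breaks when a debugger is present. It assumes the standard Kernel32 IsDebuggerPresent function and REALbasic's TargetWin32 compile-time constant, and I haven't run it against the build discussed above.

[rbcode]
#If TargetWin32 Then
  // Both declares point at standard Kernel32 exports
  Declare Function IsDebuggerPresent Lib "Kernel32" () As Integer
  Declare Sub DebugBreak Lib "Kernel32" ()
  // IsDebuggerPresent returns nonzero when a user-mode debugger (such as WinDbg) is attached
  If IsDebuggerPresent() <> 0 Then
    DebugBreak
  End If
#EndIf
[/rbcode]

Dropped into the method you want to inspect, this should behave like the bare declare when the app is launched under WinDbg, and do nothing when the app is run normally or built for another platform.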
While I certainly don't expect anyone reading this blog entry to ever need to use WinDbg when tracking down crashes in their Win32 apps made with REALbasic.... I do figure that writing about it will help me remember this neat little trick so that I can do it in the future and skip CW/VS hackery. :-D
Oh well, off to crawl through some assembly.
Neat. But.... Windows ;^)
Yes, sadly he's teamed up with the wrong side in the OS wars and is gradually persuading the company he works for to burn down the homes of the faithful.
All I can say is, I'm glad I'm not a beta-tester :)
Steve, I hope you're joking.
Aaron, thanks for this-- sure is interesting! By the way, you mentioned tracking down the function manually in the PE32 headers. This made me think-- does the "Include Function Names" setting do anything for Windows binaries, and if so, what? I never see any meaningful symbols when I get a crash on Windows.
@Adam - Yes, I'm itching to become a beta tester. As for the rest, you decide
@Aaron - it's your fault I've got a headache. I've been up all night poring over assembler mnemonics I haven't dealt with for ages, trying to recognise patterns I've forgotten, among code I have no hope of knowing where it fits in to the grand scheme of things. And all from pure nosiness with no practical end in view.
@Steve -- just to ease your pain a little bit... The offending code isn't shown here, and it was a parameter passing bug that would stomp on the stack if you used a new (to 2006r1) feature in a specific way. So it'd be very tough to notice without stepping thru the assembly since it's nothing more than a mov instruction that was causing the problem.
@Adam -- for Windows and Linux, it does nothing useful. In fact, I don't think it does anything at all; the setting may be ignored.
Aaron, it wasn't your code that gave me the headache.
Problem was your awesome hack gave me ideas above my station and I used WinDbg looking for an old problem in a VB5 app where I'm convinced the compiler is optimising away a side-effect.
I never did get to grips with exactly what I was seeing but I did manage to get very frustrated.
Have looked at it again today and now know I'm going insane because it's beginning to make sense to me :)
Steve said: "All I can say is, I’m glad I’m not a beta-tester :)"
http://www.mactech.com/news/?p=1008171
Steve, still glad you're not a beta tester? ;)
@adam1234 - Makes no difference to me. I don't drink cocoa ;)
All I do on the Mac is struggle to make the basics of my Windows and Linux stuff perform properly and look vaguely like a native app.
Unix is my spiritual home but most of my gui experience is MS Windows.
@Steve: That clears it up. I thought your early comment was in earnest, not sarcastic. Heh.
I'd whine about there not being a Mac equivalent, but I don't have enough grapes, and I'm finding - at the moment - less and less of a need for one.
(Which only means one of three things:
1.) I need to make sure I get more ZZZ's before I code,
2.) I'm getting pretty darn good at coding, or
3.) I'm ____ing up so bad I can't tell a mistake from a correct response.)
(I hope it ain't #3, BTW - I've never thought of myself being that buttheaded.)