When I first loaded msdsrv.exe into IDA Pro, I had no idea what I was dealing with. No strings, no obvious behavior — just raw assembly. What followed was one of the most satisfying reverse engineering sessions I've had: peeling back the layers of a real-world keylogger, instruction by instruction.
This post walks through exactly how I reverse engineered this sample using IDA Pro's free edition. We'll cover imports analysis, Windows hooking, compound conditionals in assembly, and jump table mechanics — all the good stuff.
Why Keyloggers Are Interesting to Reverse Engineer
Keyloggers are deceptively simple malware. Their whole job is to capture what you type and write it somewhere — a file, a remote server, a registry key. But at the assembly level, they use some fascinating Windows internals: low-level keyboard hooks, virtual key code mappings, and modifier key detection.
Understanding how they work at this level teaches you patterns you'll see across a huge range of malware families.
The Sample: msdsrv.exe
The file is a 32-bit Windows PE executable. Before doing anything dynamic, I loaded it straight into IDA Pro for static analysis. No sandboxes, no execution — just the disassembler and MSDN documentation.
Step 1: Checking the Imports Tab
The Imports tab in IDA is often the fastest way to build a theory about what malware does before you read a single instruction.
For msdsrv.exe, these stood out immediately:
| API | Purpose |
|---|---|
| GetKeyState | Check whether a specific key is up, down, or toggled, as seen by the calling thread |
| GetAsyncKeyState | Check a key's state at the moment of the call, independent of the message queue |
| SetWindowsHookExA | Install a hook procedure (system-wide or per-thread) |
| OpenClipboard | Open the clipboard for access |
| GetClipboardData | Read data from the clipboard |
| CallNextHookEx | Pass the hook event to the next handler in the chain |
Just from the imports, the picture is clear: this binary installs a keyboard hook and reads keystrokes. Classic keylogger architecture.
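As a rough illustration of that import-based triage, here is a minimal Python sketch. The category names, the API groupings, and the helper names (`triage_imports`, `looks_like_keylogger`) are hypothetical choices made for this example; they are not output from IDA or from any real tool, and the heuristic is deliberately crude.

```python
# Hypothetical triage helper: the categories and groupings are illustrative.
SUSPICIOUS_APIS = {
    "hooking": {"SetWindowsHookExA", "SetWindowsHookExW", "CallNextHookEx"},
    "key_state": {"GetKeyState", "GetAsyncKeyState"},
    "clipboard": {"OpenClipboard", "GetClipboardData"},
}


def triage_imports(import_names):
    """Group imported API names by the capability they hint at."""
    found = {}
    for category, apis in SUSPICIOUS_APIS.items():
        hits = sorted(apis.intersection(import_names))
        if hits:
            found[category] = hits
    return found


def looks_like_keylogger(import_names):
    """Heuristic only: hook installation plus key-state polling."""
    found = triage_imports(import_names)
    return "hooking" in found and "key_state" in found


# The six imports listed in the table above.
imports = [
    "GetKeyState", "GetAsyncKeyState", "SetWindowsHookExA",
    "OpenClipboard", "GetClipboardData", "CallNextHookEx",
]
print(triage_imports(imports))
print(looks_like_keylogger(imports))
```

A match here is only a theory to test in the disassembly. Plenty of legitimate software imports the same APIs.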
Step 2: Locating SetWindowsHookEx References
I double-clicked SetWindowsHookExA in the Imports tab, then pressed x to view all cross-references. Three calls showed up:
- 0x00403FD1
- 0x0040440F
- 0x0040F5B7
The key parameter here is idHook — it tells you what kind of events the hook captures. Cross-referencing with MSDN's SetWindowsHookEx docs:
| Address | idHook value | Constant | Meaning |
|---|---|---|---|
| 0x00403FD1 | 0x0D (13) | WH_KEYBOARD_LL | Low-level keyboard hook |
| 0x0040440F | 0x0D (13) | WH_KEYBOARD_LL | Low-level keyboard hook |
| 0x0040F5B7 | 0x05 (5) | WH_CBT | Window/UI change hook |
Two of the three calls install a low-level keyboard hook. That's the primary keystroke capture mechanism. The WH_CBT hook monitors window activity — likely used to track which application is active when keystrokes are captured.
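To make the idHook lookup repeatable, the mapping can be written down once. This is a small sketch in Python: the numeric values are the documented WH_* constants from the Windows SDK, while the helper name `hook_name` is made up for the example.

```python
# Hook type constants as documented for SetWindowsHookEx.
HOOK_TYPES = {
    -1: "WH_MSGFILTER",
    0: "WH_JOURNALRECORD",
    1: "WH_JOURNALPLAYBACK",
    2: "WH_KEYBOARD",
    3: "WH_GETMESSAGE",
    4: "WH_CALLWNDPROC",
    5: "WH_CBT",
    6: "WH_SYSMSGFILTER",
    7: "WH_MOUSE",
    9: "WH_DEBUG",
    10: "WH_SHELL",
    11: "WH_FOREGROUNDIDLE",
    12: "WH_CALLWNDPROCRET",
    13: "WH_KEYBOARD_LL",
    14: "WH_MOUSE_LL",
}


def hook_name(id_hook):
    """Translate a pushed idHook value into its symbolic constant."""
    return HOOK_TYPES.get(id_hook, f"unknown (0x{id_hook:X})")


# (call site, idHook) pairs taken from the table above.
call_sites = [(0x00403FD1, 0x0D), (0x0040440F, 0x0D), (0x0040F5B7, 0x05)]
for address, id_hook in call_sites:
    print(f"0x{address:08X}: idHook={id_hook} -> {hook_name(id_hook)}")
```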
Step 3: Navigating to the Hook Procedure
The hook procedure is the callback function that runs every time a keyboard event fires. I jumped to it directly using IDA's g hotkey:
g → 4024D0
Pressing Spacebar switches from Text view to Graph view. Zooming out on this function reveals something impressive: dozens of decision blocks branching in every direction. This is not a simple function — it handles an enormous number of cases.
I used View → Open Subviews → Function Calls to get a high-level summary of everything called from inside this function. The list was dominated by repeated calls to GetKeyState and GetAsyncKeyState — the actual keystroke readers.
Step 4: The Compound Expression — Filtering for Numeric Keys
I jumped to 0x40259F — the address of a GetAsyncKeyState call — and looked at what comes before it (addresses 0x40258B–0x402597).
Here's the assembly:
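; At 0x402585
MOV EAX, [lParam] ; lParam is a pointer to KBDLLHOOKSTRUCT
MOV EAX, [EAX] ; dereference → vkCode (virtual key code) now in EAX
CMP EAX, 30h ; compare vkCode with '0' (the comparison the JB below relies on)
; At 0x40258E
JB 0x40273B ; jump if EAX < 0x30 (below '0' key)
CMP EAX, 39h ; compare vkCode with '9' (the comparison the JA below relies on)
; At 0x402597
JA 0x40273B ; jump if EAX > 0x39 (above '9' key)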
The lParam parameter of a low-level keyboard hook (WH_KEYBOARD_LL) points to a KBDLLHOOKSTRUCT. The first member (vkCode) is the virtual key code of the key that triggered the event.
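The structure layout explains why a single dereference is enough. The sketch below models KBDLLHOOKSTRUCT with Python's ctypes, using fixed-width types in place of the Windows typedefs so it runs on any platform. The sample event (the 1 key, with an illustrative scan code) is invented for the example.

```python
import ctypes


class KBDLLHOOKSTRUCT(ctypes.LittleEndianStructure):
    # Field order follows the documented Windows structure. DWORD is modeled
    # as c_uint32 and ULONG_PTR as c_size_t (its width depends on the host).
    _fields_ = [
        ("vkCode", ctypes.c_uint32),
        ("scanCode", ctypes.c_uint32),
        ("flags", ctypes.c_uint32),
        ("time", ctypes.c_uint32),
        ("dwExtraInfo", ctypes.c_size_t),
    ]


# vkCode sits at offset 0, so MOV EAX, [EAX] on the pointer reads it directly.
print(KBDLLHOOKSTRUCT.vkCode.offset)

# Hypothetical event: the '1' key (0x31) with an illustrative scan code.
event = KBDLLHOOKSTRUCT(vkCode=0x31, scanCode=0x02)
raw = bytes(event)

# Emulate the dereference: read the first 4 bytes as a little-endian DWORD.
vk_code = int.from_bytes(raw[:4], "little")
print(hex(vk_code))
```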
The compound expression does this:
- If the key code is below 0x30 → bail out (not a digit key)
- If the key code is above 0x39 → bail out (not a digit key)
- Otherwise, continue to 0x40259D: the key pressed is 0 through 9
This is a range check using two separate conditional jumps — a classic assembly pattern for if (key >= '0' && key <= '9').
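Here is the same range check modeled in Python, once in the two-jump shape the disassembly shows and once as the compound condition it most likely came from. The function names are made up for the example, and this is a model of the logic, not recovered source code.

```python
VK_0 = 0x30  # virtual key code for the '0' key
VK_9 = 0x39  # virtual key code for the '9' key


def is_digit_key_asm_style(vk_code):
    """Mirror the two conditional jumps: each failed test bails out."""
    if vk_code < VK_0:  # JB: below '0'
        return False
    if vk_code > VK_9:  # JA: above '9'
        return False
    return True  # fall through to the digit-handling path


def is_digit_key(vk_code):
    """The source-level equivalent as one compound condition."""
    return VK_0 <= vk_code <= VK_9


# Both forms agree for every byte-sized key code.
assert all(is_digit_key_asm_style(vk) == is_digit_key(vk) for vk in range(256))
print([hex(vk) for vk in range(256) if is_digit_key(vk)])
```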
Step 5: Checking the Shift Key with GetAsyncKeyState
Once we know a numeric key was pressed, the code checks whether Shift is held down. At 0x40259F:
PUSH 10h ; VK_SHIFT = 0x10
CALL GetAsyncKeyState ; was Shift pressed?
TEST AX, AX ; test the lower 16 bits of return value
JZ 0x40270D ; jump if Shift is NOT pressed
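From MSDN: GetAsyncKeyState returns a 16-bit SHORT. The most significant bit is set if the key is currently down, and the least significant bit is set if the key was pressed since the previous call. TEST AX, AX does not isolate either bit; it only checks whether the whole 16-bit value is zero.
So: if AX is zero (Shift is not down and has not been pressed since the last call), we jump to 0x40270D and skip the special character handling. If AX is nonzero, the code treats Shift as pressed and continues.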
This makes perfect sense — pressing Shift + 1 on a US keyboard produces !. The keylogger needs to know whether Shift was held to log the actual character the user typed, not just the key code.
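To see how the documented bits of the GetAsyncKeyState return value relate to what TEST AX, AX actually branches on, here is a small Python sketch. The function name and the dictionary keys are invented for illustration; the bit meanings are the documented ones.

```python
KEY_DOWN_BIT = 0x8000        # most significant bit of the 16-bit SHORT
PRESSED_SINCE_LAST = 0x0001  # least significant bit


def decode_async_key_state(ret):
    """Split a GetAsyncKeyState return value into its documented bits."""
    ax = ret & 0xFFFF  # the sample only tests AX, the low 16 bits of EAX
    return {
        "down_now": bool(ax & KEY_DOWN_BIT),
        "pressed_since_last_call": bool(ax & PRESSED_SINCE_LAST),
        "test_ax_nonzero": ax != 0,  # the condition TEST AX, AX / JZ depends on
    }


for value in (0x0000, 0x0001, 0x8000, 0x8001):
    print(hex(value), decode_async_key_state(value))
```

Note that a value of 0x0001 (pressed earlier, not down now) still passes the sample's check, so the test is slightly looser than "Shift is held right now".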
Step 6: The Jump Table — Mapping Digits to Special Characters
Now it gets interesting. At 0x4025BC:
JMP ds:off_403380[EAX*4]
This is a jump table — a common compiler optimization for switch statements. Instead of a chain of if/else comparisons, the code multiplies EAX by 4 (each address is 4 bytes) and jumps to the corresponding entry in the table at 0x403380.
Before the jump, there's a normalization step:
; At 0x4025B0
ADD EAX, 0FFFFFFD0h ; This is -0x30 in two's complement
; If EAX = 0x39 ('9'), result = 9
; If EAX = 0x30 ('0'), result = 0
Then a bounds check:
CMP EAX, 9
JA 0x403339 ; if above 9, don't access table (safety check)
So the jump table has 10 entries (0 through 9), mapping each digit's key code to a branch that handles the Shift+digit combination.
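The normalization, bounds check, and indexed jump can be modeled in a few lines of Python. One loud assumption: the table contents below are the shifted digits of a US keyboard layout. The analysis here only confirms that index 1 resolves to "!", so the other nine entries are illustrative, and the function names are made up.

```python
MASK32 = 0xFFFFFFFF

# Assumption: shifted digits on a US layout, indexed 0..9. Only index 1 ("!")
# is confirmed by the disassembly; the rest are illustrative.
SHIFTED_DIGITS = [")", "!", "@", "#", "$", "%", "^", "&", "*", "("]


def normalize(vk_code):
    """ADD EAX, 0FFFFFFD0h: 32-bit wraparound addition, equal to vk - 0x30."""
    return (vk_code + 0xFFFFFFD0) & MASK32


def dispatch(vk_code):
    index = normalize(vk_code)
    # CMP EAX, 9 / JA is an unsigned compare, so codes below 0x30 wrap to a
    # huge value and are rejected along with codes above 0x39.
    if index > 9:
        return None  # default path (0x403339 in the sample)
    return SHIFTED_DIGITS[index]  # JMP ds:off_403380[EAX*4]


print(normalize(0x39), normalize(0x30))
print(hex(normalize(0x2F)))
print(dispatch(0x31))
```

The single unsigned comparison is doing double duty, which is why compilers emit one JA here and not a pair of signed checks.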
Step 7: What Happens at the Jump Target (EAX = 1)
If EAX is 1 (the 1 key with Shift held), execution jumps to 0x4025E4. Here's what I found:
; At 0x4025E4
CMP byte_4375D0, 0 ; check a flag byte
; At 0x4025EB
MOV EAX, offset aExclamation ; pointer to "!" string
; At 0x4025F0
JNZ 0x4025F7 ; jump if flag is NOT zero (ALT not pressed)
MOV EAX, 1 ; else: ALT is pressed, use raw value instead
The flag at 0x4375D0 is set earlier in the function. Tracing back:
; At 0x402567
CMP EDI, 100h ; is wParam == WM_KEYDOWN (ALT not pressed)?
SETZ DL ; DL = 1 if yes, 0 if no (ALT pressed)
MOV byte_4375D0, DL ; store the result
WM_KEYDOWN (0x100) is the message for an ordinary key press with no ALT modifier. WM_SYSKEYDOWN (0x104) is sent when a key is pressed while ALT is held (F10 also generates it). The keylogger explicitly tracks this because:
- Shift + 1 without ALT → log !
- Shift + 1 with ALT → log the raw value 1 instead (ALT+number combos have different meanings in many applications)
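The flag logic is easy to get backwards, so here is a minimal Python model of it. The function names are hypothetical, and representing the raw value as the string "1" is a simplification for the example; the disassembly shows the value 1 being loaded into EAX.

```python
WM_KEYDOWN = 0x0100
WM_SYSKEYDOWN = 0x0104


def no_alt_flag(w_param):
    """CMP EDI, 100h / SETZ DL: 1 when the message is WM_KEYDOWN, else 0."""
    return 1 if w_param == WM_KEYDOWN else 0


def resolve_shift_1(w_param):
    """Model of the branch at 0x4025E4 (jump table index 1)."""
    flag = no_alt_flag(w_param)  # the sample stores this in byte_4375D0
    result = "!"                 # MOV EAX, offset aExclamation
    if flag == 0:                # JNZ not taken: ALT is held
        result = "1"             # fall through and use the raw value
    return result


print(resolve_shift_1(WM_KEYDOWN))
print(resolve_shift_1(WM_SYSKEYDOWN))
```

The MOV that loads the "!" pointer sits between the CMP and the JNZ, which works because MOV does not modify the flags set by CMP.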
Once the character is resolved, it gets pushed as an argument to sub_402070, which — after following a few more call chains — writes the keystroke to disk.
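Finally, at 0x402600, a jump to 0x403339 leads to CallNextHookEx, which passes the event down the hook chain. Skipping this call would keep other installed hooks from seeing the event, and returning a nonzero value without it would swallow the keystroke entirely. Either would make the hook noticeable, so even malicious hooks follow the rules here.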
How to Verify This Analysis
If you want to walk through this yourself:
- Load msdsrv.exe into IDA Free
- Open the Imports tab and search for SetWindowsHookExA
- Press x on the API name to view cross-references
- Check the idHook argument pushed before each CALL
- Press g and jump to 0x4024D0 to reach the hook procedure
- Use View → Open Subviews → Function Calls to see the full call graph
- Press g again and go to 0x40259F to examine the GetAsyncKeyState call
Hotkeys used in this session:
- Spacebar → Toggle Graph/Text view
- x → View cross-references
- g → Jump to address
- ; → Add comment to instruction
What I Learned
Reversing this keylogger taught me several things that textbooks don't fully convey:
Windows hooks are legitimate APIs. SetWindowsHookEx is in every Windows SDK. The same API that powers accessibility tools powers keyloggers. Context is everything.
Assembly compound expressions have patterns. Back-to-back JB and JA checking the same register? That's a range check. Once you see it a few times, you start recognizing it instantly.
Jump tables are elegant but tricky. The JMP ds:table[EAX*4] pattern is generated by compilers for dense switch statements. Spotting the normalization (ADD EAX, -0x30) before the table access was the key to understanding the structure.
Following return values matters. EAX carries return values in 32-bit x86. Every time I saw TEST EAX, EAX or CMP EAX, something right after a CALL, that was the program inspecting the value the call had just returned, often to decide whether it succeeded. Building this reflex makes code reading much faster.
ALT modifier handling reveals attacker intent. The fact that this keylogger specifically distinguishes WM_KEYDOWN from WM_SYSKEYDOWN shows the author wanted accurate character logging, not just key codes. That's a sign of deliberate, targeted development.
Common Mistakes Table
| Mistake | What Actually Happens |
|---|---|
| Confusing JZ and JNZ after TEST EAX, EAX | You get the branch condition backwards; this is the #1 source of errors when reading conditionals |
| Reading ADD with a large hex operand as addition | 0FFFFFFD0h is negative in two's complement; it's actually -0x30 |
| Skipping the normalization step before a jump table | You'll misread the table index and end up at the wrong target address |
| Ignoring wParam in hook callbacks | It tells you the message type (WM_KEYDOWN vs WM_SYSKEYDOWN); missing this means you miss the ALT detection logic |
| Not cross-referencing lParam with MSDN | lParam in low-level keyboard hooks points to a KBDLLHOOKSTRUCT; without that knowledge, dereferencing it makes no sense |
| Treating all SetWindowsHookEx calls the same | The idHook value completely changes the behavior; always check what type of hook is being installed |
Conclusion
What started as a black-box executable became a fully understood keystroke capture mechanism — just through IDA's disassembler and MSDN documentation. No dynamic execution, no sandbox, no behavioral analysis.
The most valuable thing about this kind of static analysis is that it works even when malware is designed to evade sandboxes. The code still has to run somewhere. And when it does, it follows rules — calling conventions, API contracts, register semantics. Those rules are exactly what make reverse engineering possible.