I know the strlen function in assembly has been covered on the internet before, but I figured I’d cover it again, just in case someone like me is searching Google for it. Here’s the code:
00404334 sub_404334 proc near
00404334 push edi
00404335 push eax
00404336 push ecx
00404337 mov edi, edx
00404339 xor eax, eax
0040433B repne scasb
0040433D jnz short loc_404341
0040433F not ecx
00404341 loc_404341:
00404341 pop eax
00404342 add ecx, eax
00404344 pop eax
00404345 pop edi
00404346 jmp sub_4041BC
00404346 sub_404334 endp
The inputs for this function come from edx and ecx. Edx holds a pointer to our string (in this example, the string is the location of the Windows directory, so I’m going to say it’s “C:\Windows”). Ecx holds the maximum length of the string, which is 256 in this example. This is important because ecx is used as a countdown while the string is scanned. Let’s go line by line:
00404334 push edi
00404335 push eax
00404336 push ecx
These three lines just save the registers to the stack so their original values can be restored later, standard stuff.
00404337 mov edi, edx
00404339 xor eax, eax
Edx (which is an LPCSTR pointing to “C:\Windows”) is copied into edi, because scasb always reads from the address in edi. Eax is XOR’d with itself to reset it to 0. The next instruction will compare each byte of the string with al, so essentially it’s searching for the NULL character ‘\0’.
0040433B repne scasb
This instruction starts at the address in edi and compares each byte of the string to whatever is in al (which is ‘\0’ or NULL right now). After every compare it advances edi and decrements ecx. The repne prefix (repeat while not equal) keeps it going until a match is found or ecx reaches 0. In our example “C:\Windows” (terminated by NULL, like a good string should be), 11 bytes are compared, the 10 characters plus the terminator, so ecx drops from 256 to 245 (ecx after each byte: C 255, : 254, \ 253, W 252, i 251, n 250, d 249, o 248, w 247, s 246, \0 245).
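To make the counting concrete, here is a rough C model of what repne scasb does when al is 0. It is only a sketch: the name repne_scasb_model and the zf out-parameter are made up for illustration, and it assumes the direction flag is clear so that edi moves forward.

```c
#include <stdint.h>

/* Hypothetical model of `repne scasb` with AL = 0 and the direction flag clear.
   Returns the value left in ECX; *zf receives the final zero flag. */
static uint32_t repne_scasb_model(const unsigned char *edi, uint32_t ecx, int *zf)
{
    *zf = 0;  /* assumption: ZF starts clear (the real instruction leaves flags alone if ECX is 0) */
    while (ecx != 0) {
        unsigned char byte = *edi++;  /* scasb reads [edi], then advances edi */
        ecx--;                        /* the count drops after every compare */
        *zf = (byte == 0);            /* ZF is set when the byte equals AL */
        if (*zf)
            break;                    /* repne stops as soon as ZF is set */
    }
    return ecx;
}
```

Calling it on “C:\Windows” with a count of 256 leaves 245, because the terminating NULL costs one decrement on top of the 10 characters.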
0040433D jnz short loc_404341
If ecx ran out before a NULL byte was found (the zero flag is clear), jump to location 0x404341. In our example the NULL was found, so the jump is not taken.
0040433F not ecx
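Flip all the bits in ecx. Treated as a signed number, this makes ecx = -ecx - 1 (a plain negation would be the neg instruction). Note that if no NULL was found within the maximum length (ecx = 0), this instruction is skipped by the jump in the previous instruction, and the function ends up with the maximum length as its result. In our example, however, ecx holds 245 after the scan (one decrement for each of the 10 characters plus one for the terminator), so it becomes -246 (or 0xFFFFFF0A). The extra -1 is exactly what cancels the decrement spent on the terminator.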
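The difference between flipping bits and negating is easy to check in C, where ~ is the same operation as the not instruction. A tiny sketch (not_model is a made-up name), assuming a 32-bit two’s complement register:

```c
#include <stdint.h>

/* Hypothetical model of `not ecx` on a 32-bit register: every bit is inverted. */
static uint32_t not_model(uint32_t ecx)
{
    return ~ecx;  /* same as (0 - ecx) - 1 in 32-bit wraparound arithmetic */
}
```

Feeding it 245 gives 0xFFFFFF0A, which reads as -246 when treated as a signed value.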
00404341 loc_404341:
00404341 pop eax
00404342 add ecx, eax
Ecx’s starting value (256, remember?) was the last thing pushed, so it is the first thing popped, this time into eax. Eax is then added to ecx and the result is stored in ecx. Therefore:
eax = 256
ecx = ecx + eax
ecx = -246 + 256
ecx = 10
The length of the string now resides in ecx, so we can restore the original eax and edi and jump away in the ending instructions:
00404344 pop eax
00404345 pop edi
00404346 jmp sub_4041BC
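Putting the steps together, the whole routine behaves like a strlen with an upper bound. Here is a C sketch of the same logic. The name bounded_length_model is hypothetical, the comments map each line back to the disassembly, and whatever sub_4041BC does with the result is left out because the listing doesn’t show it.

```c
#include <stdint.h>

/* Hypothetical C model of sub_404334: edx points to the string,
   ecx is the maximum length. Returns the value left in ECX. */
static uint32_t bounded_length_model(const char *edx, uint32_t ecx)
{
    uint32_t max = ecx;        /* push ecx (saved for the add at the end) */
    const char *edi = edx;     /* mov edi, edx */
    int found = 0;             /* stands in for the zero flag */

    while (ecx != 0) {         /* repne scasb with al = 0 */
        ecx--;
        if (*edi++ == 0) {
            found = 1;
            break;
        }
    }

    if (found)                 /* jnz skips the next line when no NULL was found */
        ecx = ~ecx;            /* not ecx */

    return ecx + max;          /* pop eax / add ecx, eax (wraps around to the length) */
}
```

With “C:\Windows” and a maximum of 256 this returns 10. If no NULL turns up within the maximum, it returns the maximum itself.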
And that is one way to get the length of a string in assembly.