Inline strlen function in assembly

June 6, 2008

I know the strlen function in assembly has been covered on the internet before, but I figured I’d cover it again, just in case someone like me goes searching Google for it ;) Here’s the code:

00404334 sub_404334 proc near
00404334  push edi
00404335  push eax
00404336  push ecx
00404337  mov edi, edx
00404339  xor eax, eax
0040433B  repne scasb
0040433D  jnz short loc_404341
0040433F  not ecx
00404341 loc_404341:
00404341  pop eax
00404342  add ecx, eax
00404344  pop eax
00404345  pop edi
00404346  jmp sub_4041BC
00404346 sub_404334 endp

The inputs for this function come from edx and ecx. Edx holds a pointer to our string (in this example, the string is the location of the Windows directory, so I’m going to say it’s “C:\Windows”). Ecx holds the maximum length of the string, which is 256 in this example. This is important because ecx is used as a countdown while the string is checked. Let’s go line-by-line:

00404334  push edi
00404335  push eax
00404336  push ecx

These 3 lines just save the registers to the stack so their values are not lost, standard stuff. The saved ecx does double duty: it is popped again further down to recover the maximum length.

00404337  mov edi, edx
00404339  xor eax, eax

edx (which is a LPCSTR to “C:\Windows”) is moved into edi (you’ll see why in a second). Eax is XOR’d with itself to reset it to 0. The next instruction will compare each character in the string with al, so essentially it’s searching for the NULL character '\0'.

0040433B  repne scasb

This instruction works from the address in edi, comparing each byte of the string to whatever is in al (which is '\0' or NULL right now). Every comparison advances edi by one byte (assuming the direction flag is clear) and decrements ecx. As long as it does not find a match and ecx has not reached zero (repne: repeat while Not Equal), it moves to the next character. In our example “C:\Windows” (terminated by NULL, like a good string should be), ecx will decrease from 256 to 245, because the terminator is counted as well (C – 255, : – 254, \ – 253, W – 252, i – 251, n – 250, d – 249, o – 248, w – 247, s – 246, \0 – 245).
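To make the countdown easier to follow, here is a rough C model of what repne scasb does to the registers. The struct and function names are my own, the direction flag is assumed to be clear, and this is only a sketch of the behaviour, not how the CPU implements it.

```c
#include <stdint.h>

/* Registers and flag touched by `repne scasb` (hypothetical model). */
struct scan_state {
    const uint8_t *edi; /* next byte to compare */
    uint32_t ecx;       /* remaining count */
    int zf;             /* zero flag: 1 if the last compare matched */
};

/* Model of `repne scasb` with DF = 0 (an assumption): repeat while
   ecx != 0, and stop early as soon as a byte equal to al is found. */
void repne_scasb(struct scan_state *st, uint8_t al)
{
    while (st->ecx != 0) {
        st->zf = (*st->edi == al); /* scasb compares al with [edi] */
        st->edi++;                 /* DF = 0, so edi moves forward */
        st->ecx--;                 /* every compare costs one count */
        if (st->zf) {
            break;                 /* repne stops once ZF is set */
        }
    }
}
```

Note that the count is spent on the matching byte too, so the terminator itself uses up one decrement. If ecx starts at 0 the loop body never runs and zf keeps whatever value it had before.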

0040433D  jnz short loc_404341

If ecx ran out before a NULL byte was found (the zero flag is clear because the last comparison did not match), jump to location 0x404341. In our example, the jump is not taken.

0040433F  not ecx

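Flip all the bits in ecx. If ecx is treated as a signed number, this makes ecx = -ecx - 1 (a bitwise NOT is the negation minus one). That extra 1 cancels the decrement that was spent on the NULL terminator. Note that if no NULL byte is found before the count runs out (ecx = 0), this instruction is skipped by the jump in the previous instruction. In our example, however, the scan leaves 245 in ecx (256 minus the 11 bytes compared), so ecx becomes -246 (or 0xFFFFFF0A).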
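If the difference between NOT and negation is easy to forget, a tiny C helper shows it. The function name is made up for illustration, and the cast back to a signed type relies on the usual two's complement behaviour of mainstream compilers.

```c
#include <stdint.h>

/* Model of `not ecx`, with the result reinterpreted as a signed value.
   Converting an out-of-range value to int32_t is implementation-defined
   in C; this assumes the common two's complement wraparound. */
int32_t not_as_signed(uint32_t ecx)
{
    return (int32_t)(uint32_t)~ecx;
}
```

For any input x the result is -x - 1, so 245 turns into -246, whose raw 32-bit pattern is 0xFFFFFF0A.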

00404341 loc_404341:
00404341  pop eax
00404342  add ecx, eax

Ecx’s starting value (256, remember?) is popped back into eax. Eax is then added to ecx and the result is stored in ecx. Therefore:

eax = 256
ecx = ecx + eax
ecx = -246 + 256
ecx = 10
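The same arithmetic works for any string, not just this one. Below is a sketch in C that follows the routine instruction by instruction up to the final jmp and returns whatever ends up in ecx. The function name is hypothetical, the direction flag is assumed to be clear, and the zero flag is modelled as starting at 1 because xor eax, eax leaves it set.

```c
#include <stdint.h>

/* Hypothetical register-level model of sub_404334, minus the final jmp.
   Returns the value left in ecx. Assumes DF = 0. */
uint32_t strlen_model(const char *edx, uint32_t ecx)
{
    uint32_t saved_ecx = ecx;                  /* push ecx */
    const uint8_t *edi = (const uint8_t *)edx; /* mov edi, edx */
    uint8_t al = 0;                            /* xor eax, eax */
    int zf = 1;                                /* xor leaves ZF set */

    while (ecx != 0) {                         /* repne scasb */
        zf = (*edi++ == al);
        ecx--;
        if (zf) {
            break;
        }
    }
    if (zf) {                                  /* jnz not taken */
        ecx = (uint32_t)~ecx;                  /* not ecx */
    }
    return (uint32_t)(ecx + saved_ecx);        /* pop eax; add ecx, eax */
}
```

With the terminator at index k the scan leaves max - (k + 1) in ecx, the NOT turns that into k - max, and adding max back gives k. When no terminator is found within the limit, ecx is 0, the NOT is skipped, and the result is simply the maximum. One corner case the model suggests: with a maximum of 0 the scan never runs, ZF is still set from the xor, and the result comes out as 0xFFFFFFFF.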

The length of the string now resides in ecx. We can restore our original registers and jump away in the ending instructions:

00404344  pop eax
00404345  pop edi
00404346  jmp sub_4041BC

And that is one way to get the length of a string in assembly.
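For comparison, here is roughly what the whole routine computes when written in plain C. This is my own sketch with a made-up name, not code recovered from the binary, and it assumes the maximum length is at least 1.

```c
#include <stddef.h>

/* Hypothetical C counterpart of the routine: count the bytes before the
   first NUL, but never look at more than max_len bytes. */
size_t bounded_strlen(const char *s, size_t max_len)
{
    size_t n = 0;
    while (n < max_len && s[n] != '\0') {
        n++;
    }
    return n;
}
```

For maximum lengths of 1 or more this should behave like the POSIX strnlen function: bounded_strlen("C:\\Windows", 256) returns 10, and a string with no terminator inside the limit returns the limit itself.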

 

1 Comment to "Inline strlen function in assembly"

  1. Martijn wrote:

    Hi,

    Nice explanation.

    May I ask why you are not clearing or setting the direction flag (cld/std)? If DF is set (=1) I believe that the scasb instruction will decrease edi instead of increase.

    I haven’t done anything with assembly since long so I could be easily mistaken :-)

    Best Regards Martijn

 