Saturday, 2 April 2011

No such file or directory after ld.

I was toying with the Linux development tools trying to figure out the whole compilation process of a program (something I should have done a loooong time ago) and ran into this interesting error. Or, rather, quite a boring one, but with a baffling manifestation for a permanent newbie like me.

What I wanted to do was to go through the whole source->compiler->assembler->linker->binary tool invocation chain manually instead of relying on GCC. I made a typical C program:
#include <stdio.h>

int main(void)
{
    printf("Le ha-ha.\n");
    return 0;
}
Ran a typical compiler with the -S option to get a typical assembly source rather than a typical ready-to-go binary:
gcc -S hello.c -o hello.S
Assembled it into a typical ELF object file:
as hello.S -o hello.o
And finally, linked it with libc, which contains printf(), and the crt* startup objects:
ld hello.o /usr/lib/crt* /usr/lib/libc.so -o hello
I say, that was quite simple! Let's run the bastard!
$ ./hello
bash: ./hello: No such file or directory
Huh? I guess there WAS an error, but the stupid tools didn't report it. Let's see which file is missing:
$ ls
hello hello.c hello.o hello.S
Err, what? The binary is present? What about the permissions?
$ ls -lh ./hello
-rwxr-xr-x 1 * * 4.3K 2011-04-03 03:43 hello
WTF?
$ file hello
hello: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15, not stripped
So the file actually EXISTS and it IS executable. However, when I run it, the system says it's absent.

Guess what? Linux really does fail to find a file when I invoke my program. It's just not my binary.

Let's dive into the details. If you run GCC with the -v option, it prints all the commands it executes. You can find the linking stage there too, although it's performed through a wrapper called collect2. My line was the following:
"/usr/lib/gcc/i486-linux-gnu/4.4.1/collect2" "--build-id" "--eh-frame-hdr" "-m" "elf_i386" "--hash-style=both" "-dynamic-linker" "/lib/ld-linux.so.2" "-z" "relro" "/usr/lib/gcc/i486-linux-gnu/4.4.1/../../../../lib/crt1.o" "/usr/lib/gcc/i486-linux-gnu/4.4.1/../../../../lib/crti.o" "/usr/lib/gcc/i486-linux-gnu/4.4.1/crtbegin.o" "-L/usr/lib/gcc/i486-linux-gnu/4.4.1" "-L/usr/lib/gcc/i486-linux-gnu/4.4.1" "-L/usr/lib/gcc/i486-linux-gnu/4.4.1/../../../../lib" "-L/lib/../lib" "-L/usr/lib/../lib" "-L/usr/lib/gcc/i486-linux-gnu/4.4.1/../../.." "-L/usr/lib/i486-linux-gnu" "/tmp/ccWw7lET.o" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "-lc" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "/usr/lib/gcc/i486-linux-gnu/4.4.1/crtend.o" "/usr/lib/gcc/i486-linux-gnu/4.4.1/../../../../lib/crtn.o"
After some trial and error I found out that the option I needed was this:
"-dynamic-linker" "/lib/ld-linux.so.2"
It specifies the name of the dynamic linker that will be used on program invocation. But how on earth was I supposed to know that the default dynamic linker wasn't good if the manpage for ld says:
The default dynamic linker is normally correct; don't use this unless you know what you are doing.
Liars. Let's find out what the default dynamic linker is:
$ ld hello.o /usr/lib/crt* /usr/lib/libc.so -o hello.without.explicit.dl
$ ./hello.without.explicit.dl
bash: ./hello.without.explicit.dl: No such file or directory
$ ld --dynamic-linker=/lib/ld-linux.so.2 hello.o /usr/lib/crt* /usr/lib/libc.so -o hello.with.explicit.dl
$ ./hello.with.explicit.dl
Le ha-ha.
$ objdump -s hello.without.explicit.dl > hello.without.explicit.dl.objdump
$ objdump -s hello.with.explicit.dl > hello.with.explicit.dl.objdump
$ diff -C1 hello.with.explicit.dl.objdump hello.without.explicit.dl.objdump
*** hello.with.explicit.dl.objdump 2011-04-03 04:14:11.000000000 +0400
--- hello.without.explicit.dl.objdump 2011-04-03 04:14:00.000000000 +0400
***************
*** 1,7 ****

! hello.with.explicit.dl: file format elf32-i386

Contents of section .interp:
! 8048114 2f6c6962 2f6c642d 6c696e75 782e736f /lib/ld-linux.so
! 8048124 2e3200 .2.
Contents of section .note.ABI-tag:
--- 1,7 ----

! hello.without.explicit.dl: file format elf32-i386

Contents of section .interp:
! 8048114 2f757372 2f6c6962 2f6c6962 632e736f /usr/lib/libc.so
! 8048124 2e3100 .1.
Contents of section .note.ABI-tag:
The only difference was the string value in the .interp section (which apparently specifies the path to the dynamic loader). And instead of /usr/lib/libc.so.1 what I needed was /lib/ld-linux.so.2. So... What does libc.so.1 look like?
$ ls /usr/lib/libc.so.1
ls: cannot access /usr/lib/libc.so.1: No such file or directory
There we have it. So the error we saw was the error about a missing dynamic loader, not the binary itself! But how was I supposed to know that from that message without stepping on this rake once? Beats me.
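
For what it's worth, here is a small sketch of the check I could have run instead of guessing. It assumes readelf from binutils is installed, and the function name why_missing is made up for this post. It asks readelf which program interpreter the binary requests and then tests whether that file exists.

```shell
# Hypothetical helper: explain a "No such file or directory" on a file that exists.
# Assumes readelf (binutils) is on PATH.
why_missing() {
    bin=$1
    if [ ! -e "$bin" ]; then
        echo "$bin: the file itself is missing"
        return 0
    fi
    # readelf -l prints a line of the form:
    #   [Requesting program interpreter: /lib/ld-linux.so.2]
    interp=$(readelf -l "$bin" 2>/dev/null |
        sed -n 's/.*\[Requesting program interpreter: \(.*\)\].*/\1/p')
    if [ -z "$interp" ]; then
        echo "$bin: no program interpreter requested (static binary or not ELF)"
    elif [ ! -e "$interp" ]; then
        echo "$bin: requested interpreter $interp does not exist"
    else
        echo "$bin: requested interpreter $interp exists"
    fi
}
```

Running why_missing ./hello on the broken binary from this post should point at /usr/lib/libc.so.1 rather than at hello itself.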

A query to Google shows that /usr/lib/libc.so.1 is used on SCO UnixWare systems, not Linux. Why ld doesn't default to the proper dynamic linker on an i386 Ubuntu system and, on top of that, confuses the user by warning against touching the --dynamic-linker option is another question I can't answer.
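
Rather than hard-coding the path, one option is to ask GCC which dynamic linker it would hand to the linker and reuse that. The sketch below assumes gcc is installed and that its -v output contains a -dynamic-linker argument, as mine did. The function name gcc_dynamic_linker and the output name hello.gcc are made up for illustration.

```shell
# Hypothetical helper: print the dynamic linker path GCC passes to the linker.
# Assumes gcc is on PATH and that its link line includes -dynamic-linker.
gcc_dynamic_linker() {
    # gcc -v writes the commands it runs to stderr. Some versions quote the
    # arguments, so strip double quotes before matching.
    gcc -v "$@" 2>&1 | tr -d '"' |
        sed -n 's/.*-dynamic-linker \([^ ]*\).*/\1/p'
}

# Usage sketch (needs hello.c and hello.o from above):
#   dl=$(gcc_dynamic_linker hello.c -o hello.gcc)
#   ld --dynamic-linker="$dl" hello.o /usr/lib/crt* /usr/lib/libc.so -o hello
```

Note that gcc -v actually compiles and links hello.c along the way, so this leaves a hello.gcc binary behind.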

Finally, it seems that the older Linux systems used to complain about a "bad ELF interpreter" which was kind of right. I wonder if the modern behaviour can be considered a bug.

Lesson learned? Even robust tools used for many years on a multitude of platforms may try to trick you. Especially robust tools used for many years on a multitude of platforms.
