
Building executables with the GNU Linker

Wednesday, April 4th, 2012

You can invoke the linker, ld, directly instead of going through the gcc command to generate an executable for a C program. Linking is the final stage of compilation. The linker takes the relocatable object files, links them with static and shared libraries, and generates an executable file. On Linux, ld is part of GNU binutils.

You can use the which command to find the linker installed on your machine.

$ which ld
/usr/bin/ld

The linker accepts relocatable object files as its input, so you first need to generate them for your C programs. Once you have all the .o files, you can run ld on them. Let’s take the following program.

#include <stdio.h>

int main(void)
{
      int a = 10, b = 20, c;
      c = a + b;
      printf("\nAddition:%d\n", c);
      return 0;
}

Let’s compile this program and use the linker to build an executable file.

$ gcc -c test.c -o test.o
$ ld test.o -o test

ld: warning: cannot find entry symbol _start; defaulting to 0000000008048074
test.o: In function `main':
test.c:(.text+0x38): undefined reference to `printf'

When I ran the ld command, it gave a warning and an error. Look at the warning first: it says that ld cannot find the symbol _start, which is the entry point of your program. As a fallback, it sets the entry point to the address 0x08048074, the start of the .text section, which in this case is where the main function is located.

Entry point for your C program is not main

We tend to think that the entry point of our program is the main function, but it is not. When you build a program, gcc links in some extra startup code. This startup code defines a symbol named _start. It is the code that executes first, and it in turn calls our main function. The startup code lives in relocatable object files on your Linux machine, and gcc passes these object files to the linker when you compile a program. So, when we call ld ourselves, we need to pass these startup object files to the linker.

These files are crt1.o, crti.o, and crtn.o. They are stored somewhere on your machine, and we need to find their path. You can use the find or locate command for that; I will use locate.
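As a rough mental model only, the relationship between _start and main() can be sketched in plain C. This is not the real startup code, which is written in assembly and, on glibc, reaches main() through __libc_start_main. The names user_main and startup_model are made up for this sketch.

```c
#include <stdio.h>

/* Hypothetical stand-in for the program's main(). */
int user_main(void)
{
      int a = 10, b = 20;
      printf("\nAddition:%d\n", a + b);
      return 0;
}

/* Conceptual model of the startup code: call the program's main()
   and hand its return value on as the exit status. */
int startup_model(int (*entry)(void))
{
      int status = entry();   /* the startup code calls main() */
      return status;          /* the real startup code calls exit(status) here
                                 and never returns */
}
```

The point of the sketch is the ordering: something runs before main(), and that same code is what main() returns to.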

$ locate crt1.o
/usr/lib/i386-linux-gnu/Mcrt1.o
/usr/lib/i386-linux-gnu/Scrt1.o
/usr/lib/i386-linux-gnu/crt1.o
/usr/lib/i386-linux-gnu/gcrt1.o

Look at the locate command output. It says that crt1.o is in /usr/lib/i386-linux-gnu. The remaining two files, crti.o and crtn.o, are also in this directory. We need to pass these object files to the linker.

Need to pass libraries to the linker

The error given by the linker is test.c:(.text+0x38): undefined reference to `printf'. It says that the linker cannot find the code for the printf function. The code for printf lives in a library, which on Linux is libc.so or libc.a.

We should tell the linker to link our test.o file with the C library, libc. Libraries are passed to the linker with the -l flag, so libc becomes -lc.

Let’s now pass the startup files and libc to the linker and create an executable from test.o.

$ ld /usr/lib/i386-linux-gnu/crt1.o /usr/lib/i386-linux-gnu/crti.o /usr/lib/i386-linux-gnu/crtn.o test.o -lc -o test

You can now run ls and see that an executable named test has been created.

Running the program

We have got the executable file test. Let’s run this program.

$ ./test
bash: ./test: No such file or directory

It throws an error saying "No such file or directory", even though the test executable exists. Why? To find out, let’s see what type of file test is. You can use the file command for this.

$ file test
test: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped

The file command says that test is an executable and that it is dynamically linked, which means it uses shared libraries. Indeed, it uses the shared library libc.so, which we passed to the linker to link with test.o. By default the linker creates a dynamically linked executable, using shared libraries in preference to static ones.

When a dynamically linked executable is run, the shared libraries must be loaded into memory along with the program. This is not required for a static executable. Linux uses a program called the dynamic linker/loader to load all the shared libraries into memory. When we create an executable with the linker, the path of this dynamic loader must also be recorded in the executable file. If the recorded path does not exist on the system, the program cannot be started, and the shell reports "No such file or directory" even though the executable itself is there.

The dynamic loader used by Linux is ld.so. It is itself an executable shared object. In my case it is:
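To see which loader path an executable actually records, one option is to read its PT_INTERP program header. The following is a minimal sketch, assuming a Linux system with <elf.h> and a file in the machine's native byte order; the function name read_interp is made up for illustration.

```c
#include <elf.h>      /* ELF structures and constants (Linux/glibc) */
#include <stdio.h>
#include <string.h>

/* Sketch: copy the PT_INTERP string (the dynamic loader path) of an ELF
   file into buf. Returns 0 on success, -1 if the file cannot be read, is
   not ELF, or has no PT_INTERP segment (for example a static executable).
   Assumes the file uses the same byte order as the machine running this. */
int read_interp(const char *path, char *buf, size_t buflen)
{
    unsigned char ident[EI_NIDENT];
    unsigned long long phoff = 0, off = 0, len = 0;
    unsigned int phentsize = 0, phnum = 0, type = 0, i;
    int rc = -1;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    if (fread(ident, 1, EI_NIDENT, f) != EI_NIDENT ||
        memcmp(ident, ELFMAG, SELFMAG) != 0)
        goto done;
    rewind(f);

    /* The ELF header tells us where the program header table is. */
    if (ident[EI_CLASS] == ELFCLASS32) {
        Elf32_Ehdr eh;
        if (fread(&eh, sizeof eh, 1, f) != 1)
            goto done;
        phoff = eh.e_phoff; phentsize = eh.e_phentsize; phnum = eh.e_phnum;
    } else if (ident[EI_CLASS] == ELFCLASS64) {
        Elf64_Ehdr eh;
        if (fread(&eh, sizeof eh, 1, f) != 1)
            goto done;
        phoff = eh.e_phoff; phentsize = eh.e_phentsize; phnum = eh.e_phnum;
    } else {
        goto done;
    }

    /* Walk the program headers looking for PT_INTERP. */
    for (i = 0; i < phnum; i++) {
        if (fseek(f, (long)(phoff + (unsigned long long)i * phentsize),
                  SEEK_SET) != 0)
            goto done;
        if (ident[EI_CLASS] == ELFCLASS32) {
            Elf32_Phdr ph;
            if (fread(&ph, sizeof ph, 1, f) != 1)
                goto done;
            type = ph.p_type; off = ph.p_offset; len = ph.p_filesz;
        } else {
            Elf64_Phdr ph;
            if (fread(&ph, sizeof ph, 1, f) != 1)
                goto done;
            type = ph.p_type; off = ph.p_offset; len = ph.p_filesz;
        }
        if (type != PT_INTERP)
            continue;
        /* p_filesz includes the terminating NUL of the path. */
        if (len == 0 || len > buflen)
            goto done;
        if (fseek(f, (long)off, SEEK_SET) != 0 ||
            fread(buf, 1, (size_t)len, f) != len)
            goto done;
        buf[len - 1] = '\0';
        rc = 0;
        break;
    }

done:
    fclose(f);
    return rc;
}
```

Called from a small main() as read_interp("./test", buf, sizeof buf), it would report the loader path stored in the executable. The readelf -l command from binutils shows the same information as "Requesting program interpreter".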

/lib/i386-linux-gnu/ld-2.15.so

There is a symbolic link to this file in /lib named ld-linux.so.2, like this:

$ ls -l /lib/ld-linux.so.2
lrwxrwxrwx 1 root root 25 Jul 2 22:22 /lib/ld-linux.so.2 -> i386-linux-gnu/ld-2.15.so

You need to pass the path of this loader to the linker ld using the --dynamic-linker option, like this:

$ ld --dynamic-linker /lib/ld-linux.so.2 /usr/lib/i386-linux-gnu/crt1.o /usr/lib/i386-linux-gnu/crti.o /usr/lib/i386-linux-gnu/crtn.o test.o -lc -o test
$ ./test

Addition:30

Changing the entry point for a Program

We saw that _start is the entry point of a program, but you can change it. To do so, use the -e option of ld. In that case you do not need to pass any startup files, like this:

$ ld --dynamic-linker /lib/ld-linux.so.2 -e main  test.o -lc  -o test

I have set the entry point of my program to the main function.

$ ./test

Addition:30
Segmentation fault (core dumped)

The program prints its output and then terminates with a segmentation fault. When a program is run, a process is created, and that process has to be terminated properly when the program completes. This is done with the exit() function. When you use the startup files, the startup code calls exit() after main() returns. From your main() function you just return to the startup code, and the startup code takes care of terminating the program.

But when we make main() the entry point of our program, there is no startup code. We still have a return statement in our program, and even if we don’t write one in main(), the compiler adds a return automatically. So at the end of main() we try to return to a caller that does not exist. No code ran before main(), so control jumps to an invalid address, hence the segmentation fault.

How do we fix this? You need to exit the program in the main() function itself, like this:

Now compile and run the program; it should execute successfully.

A program without main() function

With the above concept, can we write a program that doesn’t have a main() function? Yes, we can: give any other function as the entry point of your program. For example, let’s take this program:

$ vim test.c
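The original listing is missing at this point. As a hedged stand-in, here is a minimal sketch of what such a test.c could look like. Only the name mymain comes from the text; the add helper and the function bodies are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>   /* exit() */

/* Hypothetical helper, kept separate from the entry point. */
int add(int a, int b)
{
      return a + b;
}

/* mymain is the function handed to the linker as the entry point
   (ld -e mymain). It must not return, because there is no startup
   code to return to; it ends the process with exit() instead. */
void mymain(void)
{
      printf("\nAddition:%d\n", add(10, 20));
      exit(0);
}
```

Because the file defines no main() at all, it can only be linked by naming another entry point with -e; a plain gcc build would fail with an undefined reference to main.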

In the above program, there is no main() function. Let’s compile this and create an executable.

$ gcc -c test.c -o test.o
$ ld --dynamic-linker /lib/ld-linux.so.2 -e mymain test.o -lc -o test

Look at the linker command: I changed the entry point of my program to mymain(). That’s it, run the program now.