ASLR implementation in Linux Kernel 3.7

by @Jonathan Salwan - 2013-01-19

In this short note, we'll see how ASLR is implemented in the Linux Kernel 3.7. The function called when the kernel wants to generate a pseudo-random number is get_random_int(). This function is located in drivers/char/random.c.

static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash);
unsigned int get_random_int(void)
{
        __u32 *hash;
        unsigned int ret;

        if (arch_get_random_int(&ret))
                return ret;

        hash = get_cpu_var(get_random_int_hash);

        hash[0] += current->pid + jiffies + get_cycles();
        md5_transform(hash, random_int_secret);
        ret = hash[0];
        put_cpu_var(get_random_int_hash);

        return ret;
}

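First, get_random_int() asks the architecture for a hardware-generated random number through arch_get_random_int() and returns it directly if one is available. Otherwise, it fetches a hash with get_cpu_var(), which returns the current processor's copy of the per-CPU variable get_random_int_hash. Then, it adds some other information to hash[0] to generate the random number: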

  • Current PID
  • Jiffies
  • Number of CPU cycles

Jiffies is a global kernel variable that counts the timer ticks (IRQ0) since the machine was started.

The number of CPU cycles is obtained with the rdtsc instruction on Intel architectures. The call trace for the get_cycles function on Intel is:

So far, the first step for the random number is:

first_step = (random int) + (current PID) + (IRQ0 ticks) + (RDTSC)

For the second step, get_random_int() calls md5_transform() with the result of the first step. md5_transform() is the core of the MD5 algorithm: it alters an existing MD5 hash kept in its first argument to reflect the addition of 16 long-words of new data passed as its second argument, here random_int_secret.

After these two steps, we have a pseudo-random number that is very difficult to predict.

random_int = md5_transform((random_int + PID + IRQ0_ticks + RDTSC), random_int_secret)
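The two steps above can be sketched in user space as follows. This is only an illustration of the add-then-transform structure: toy_transform() is a hypothetical stand-in mixer and is NOT MD5, and the pid, jiffies and cycles values are passed in as plain parameters because a user-space program cannot read the kernel's own values. The function and macro names are made up for this sketch.

```c
#include <stdint.h>

#define DIGEST_WORDS 4   /* same size as MD5_DIGEST_WORDS */
#define SECRET_WORDS 16  /* 16 words of "new data", like random_int_secret */

/* Hypothetical stand-in for md5_transform(): a toy mixer, NOT MD5.
 * It only shows that the hash state is altered in place by the secret. */
void toy_transform(uint32_t hash[DIGEST_WORDS],
                   const uint32_t secret[SECRET_WORDS])
{
        for (int i = 0; i < SECRET_WORDS; i++) {
                uint32_t *h = &hash[i % DIGEST_WORDS];

                *h ^= secret[i];
                *h *= 0x9E3779B1u;
                *h = (*h << 13) | (*h >> 19);   /* rotate left by 13 */
                hash[(i + 1) % DIGEST_WORDS] += *h;
        }
}

/* Step 1: add the per-call inputs to hash[0].
 * Step 2: transform the whole hash with the secret.
 * The result is the new hash[0]; the state is kept for the next call. */
uint32_t next_random_int_sketch(uint32_t hash[DIGEST_WORDS],
                                const uint32_t secret[SECRET_WORDS],
                                uint32_t pid, uint32_t jiffies,
                                uint32_t cycles)
{
        hash[0] += pid + jiffies + cycles;
        toy_transform(hash, secret);
        return hash[0];
}
```

Because the hash state is carried over from one call to the next, the output depends on every earlier call made on the same CPU as well as on the current inputs.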

To generate an address between two bounds, the kernel uses the randomize_range function. This function simply calls get_random_int(), applies a modulo to get a number between the 'start' and 'end' values (leaving room for 'len'), and page-aligns the result.

unsigned long
randomize_range(unsigned long start, unsigned long end, unsigned long len)
{
        unsigned long range = end - len - start;

        if (end <= start + len)
                return 0;
        return PAGE_ALIGN(get_random_int() % range + start);
}
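To see what the modulo and the page alignment do to a given random value, here is a minimal user-space sketch of the same arithmetic. It assumes a 4096-byte page and takes the random number as a parameter instead of calling get_random_int(); the names randomize_range_sketch, PAGE_SIZE_SKETCH and PAGE_ALIGN_SKETCH are made up for this example.

```c
/* Assumption: 4 KiB pages. The kernel takes this from its own PAGE_SIZE. */
#define PAGE_SIZE_SKETCH 4096UL

/* Round an address up to the next page boundary. */
#define PAGE_ALIGN_SKETCH(addr) \
        (((addr) + PAGE_SIZE_SKETCH - 1) & ~(PAGE_SIZE_SKETCH - 1))

/* Same arithmetic as randomize_range(), with the random value supplied
 * by the caller so that the result can be checked by hand. */
unsigned long randomize_range_sketch(unsigned long start, unsigned long end,
                                     unsigned long len, unsigned int rnd)
{
        unsigned long range = end - len - start;

        /* Not enough room for 'len' bytes between start and end. */
        if (end <= start + len)
                return 0;
        return PAGE_ALIGN_SKETCH(rnd % range + start);
}
```

For example, with start = 0x1000, end = 0x10000, len = 0x1000 and rnd = 5, the range is 0xE000, the unaligned value is 0x1005, and the returned address is 0x2000. Since the alignment rounds up, the result can reach end - len (0xF000 here) but never goes past it when start, end and len are page-aligned.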

When the kernel loads an ELF binary, the load_elf_binary function in fs/binfmt_elf.c is called. Part of this function initializes the memory pointers of the process, such as the code, data and stack sections. The following code is an extract of the load_elf_binary function.

[...]

if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
        current->flags |= PF_RANDOMIZE;

[...]

current->mm->end_code = end_code;
current->mm->start_code = start_code;
current->mm->start_data = start_data;
current->mm->end_data = end_data;
current->mm->start_stack = bprm->p;

#ifdef arch_randomize_brk
        if ((current->flags & PF_RANDOMIZE) && (randomize_va_space > 1)) {
                current->mm->brk = current->mm->start_brk = arch_randomize_brk(current->mm);
#ifdef CONFIG_COMPAT_BRK
                current->brk_randomized = 1;
#endif
        }
#endif

[...]

We can see that if the randomize_va_space variable is greater than 1 and the PF_RANDOMIZE flag is set, the base address of brk is randomized with the arch_randomize_brk function. The following scheme is a call trace from the load_elf_binary function through the different randomization functions.
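Separately from that call trace, the two conditions in the extract can be modelled in user space to check which combinations lead to a randomized brk. This is a simplified sketch: struct task_sketch and both function names are hypothetical, and the two flag values are defined locally for illustration (they are meant to mirror the kernel's ADDR_NO_RANDOMIZE and PF_RANDOMIZE constants).

```c
#include <stdbool.h>

/* Local copies for illustration; the kernel defines its own constants. */
#define ADDR_NO_RANDOMIZE 0x0040000UL
#define PF_RANDOMIZE      0x00400000UL

/* Hypothetical, minimal stand-in for the fields of 'current' used here. */
struct task_sketch {
        unsigned long personality;
        unsigned long flags;
};

/* First test of the extract: set PF_RANDOMIZE unless the personality
 * forbids it or randomize_va_space is 0. */
void apply_randomize_flag(struct task_sketch *t, int randomize_va_space)
{
        if (!(t->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
                t->flags |= PF_RANDOMIZE;
}

/* Second test of the extract: brk is randomized only when PF_RANDOMIZE
 * is set AND randomize_va_space is greater than 1. */
bool brk_is_randomized(const struct task_sketch *t, int randomize_va_space)
{
        return (t->flags & PF_RANDOMIZE) && (randomize_va_space > 1);
}
```

With randomize_va_space set to 1, the task gets PF_RANDOMIZE but its brk base is left alone; only a value of 2 (or more) together with a personality that does not set ADDR_NO_RANDOMIZE reaches arch_randomize_brk in the extract above.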