Tag: linux

Writing a simple kernel module

Why do we need a kernel module?

Sometimes we need to carry out a privileged operation that is not available in ring 3. Linux kernel modules are a way to run code in ring 0. Although Linux provides a lot of APIs to userland, the need for a kernel module still arises sometimes.

A Linux kernel module is a piece of compiled binary code that is inserted directly into the Linux kernel, running at ring 0, the lowest-numbered and most privileged ring of execution in the x86-64 processor. Code here runs without the protection checks applied to userland code, so a bug can bring down the whole system, but it avoids the overhead of crossing the user/kernel boundary and has access to everything in the system.

Getting Started

Make a folder to hold your kernel module code.

$ mkdir sample_module

Create the file that will hold the main module code and add the sample contents below. Here the file is named technicalityinside.c

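#include <linux/init.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Akash Panda");

/* Called when the module is loaded. Returns 0 on success or a negative errno value on failure. */
static int technicalityinside_init(void) {
        printk(KERN_ALERT "Module loaded\n");
        return 0;
}

/* Called when the module is unloaded. */
static void technicalityinside_exit(void) {
        printk(KERN_ALERT "Goodbye cruel world\n");
}

/* Register the load and unload entry points with the kernel. */
module_init(technicalityinside_init);
module_exit(technicalityinside_exit);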

Now we have the simplest of all modules. Let us go through what it says, piece by piece.

The #include lines pull in the header files needed for Linux kernel module development.
MODULE_LICENSE accepts several license strings:

  • “GPL” [GNU Public License v2 or later]
  • “GPL v2” [GNU Public License v2]
  • “GPL and additional rights” [GNU Public License v2 rights and more]
  • “Dual BSD/GPL” [GNU Public License v2 or BSD license choice]
  • “Dual MIT/GPL” [GNU Public License v2 or MIT license choice]
  • “Dual MPL/GPL” [GNU Public License v2 or MPL license choice]

We define both the init (loading) and exit (unloading) functions as static. The init function returns an int (0 on success), while the exit function returns void.

Please note that at the end of the file we have used the module_init and module_exit macros to register these functions. This gives us the freedom to name the init and exit functions as we like.

Now let us write the Makefile. Each recipe line must begin with a tab character, not spaces.

obj-m += technicalityinside.o
all:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

Once the Makefile is ready, we can run make to compile our module. This produces technicalityinside.ko.

Once it is compiled, we are ready to insert the module and test it.

$ sudo insmod technicalityinside.ko

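Once we have loaded the module, we can see its output in the kernel log. On Debian and Ubuntu the log is written to /var/log/kern.log; on other systems, dmesg shows the same messages.

$ tail -4 /var/log/kern.log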

Once we see that our module is working, we can remove it with the rmmod command.

$ sudo rmmod technicalityinside

To learn more about loading and unloading kernel modules, follow this article.

Linux x86 ring usage overview

In x86 protected mode, the CPU is always in one of 4 rings. The Linux kernel only uses 0 and 3:

  • 0 for kernel
  • 3 for users

This is the most hard-and-fast definition of kernel versus userland.
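The ring a piece of code is running in can be observed directly on x86. The sketch below is not from the original post: it assumes GCC or Clang inline assembly and an x86 target, and the function name current_ring is made up for illustration. It reads the CS segment selector, whose low two bits hold the current privilege level.

```c
/* Returns the current privilege level (0-3) on x86, or -1 when built
 * for another architecture. The low two bits of the CS segment
 * selector hold the ring the CPU is currently executing in.
 * Assumes GCC/Clang extended inline assembly. */
int current_ring(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned short cs;
    __asm__ volatile ("mov %%cs, %0" : "=r"(cs));
    return cs & 3;
#else
    return -1; /* not x86: nothing to report */
#endif
}
```

Called from an ordinary userland program on x86 Linux, this should report ring 3. The same read performed inside a kernel module would report ring 0.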

Why does Linux not use rings 1 and 2?

Intel's intent in providing rings 1 and 2 was for the OS to put device drivers at that level, so that they are privileged but somewhat separated from the rest of the kernel code.

Rings 1 and 2 are, in a way, “mostly” privileged. They can access supervisor pages, but if they attempt to use a privileged instruction, they still raise a general protection fault (GPF) like ring 3 would. So it is not a bad place for drivers, as Intel planned.

VirtualBox, a virtual machine monitor, has put guest kernel code in ring 1 when running without hardware virtualization. Some operating systems may use these rings, but it is not a common design today.

What can each ring do?

  • ring 0 can do anything
  • ring 3 cannot run several instructions and write to several registers, most notably:
    • cannot change its own ring!
    • cannot modify the page tables.
    • cannot register interrupt handlers.
    • cannot do IO instructions like in and out, and thus cannot make arbitrary hardware accesses.
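The restrictions above can be seen from userland. The following is a minimal sketch, not part of the original post, assuming an x86 Linux system and GCC or Clang; the function name is hypothetical. A child process executes hlt, an instruction reserved for ring 0, and the parent reports which signal terminated the child.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the privileged `hlt` instruction in a child process and returns
 * the signal that terminated the child, 0 if the child exited normally,
 * or -1 on error or on a non-x86 build. */
int signal_from_privileged_instruction(void)
{
#if defined(__x86_64__) || defined(__i386__)
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(SIGSEGV, SIG_DFL);  /* make sure the default action applies */
        __asm__ volatile ("hlt");  /* ring 0 only: faults when run in ring 3 */
        _exit(0);                  /* only reached if the instruction did not fault */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
#else
    return -1;
#endif
}
```

On Linux the general protection fault raised by the CPU is delivered to the offending process as SIGSEGV, so the child dies while the parent and the rest of the system carry on.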

What is the point of having multiple rings?

There are two major advantages of separating kernel and userland:

  • it is easier to make programs as you are more certain one won’t interfere with the other. E.g., one userland process does not have to worry about overwriting the memory of another program because of paging, nor about putting hardware in an invalid state for another process.
  • it is more secure. E.g. file permissions and memory separation could prevent a hacking app from reading your bank data. This supposes, of course, that you trust the kernel.
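The memory separation described above can be sketched with two processes. This example is an illustration added here, with a hypothetical function name: after fork, the child overwrites its copy of a variable, and the parent checks that its own copy is untouched.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that overwrites its own copy of `value`, then returns
 * the parent's copy. Returns -1 if fork or waitpid fails or the child
 * does not exit cleanly. */
int parent_value_after_child_write(void)
{
    volatile int value = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        value = 2;                  /* changes only the child's address space */
        _exit(value == 2 ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return value;                   /* the parent's copy was never modified */
}
```

Both processes use the same virtual address for value, but the kernel's page tables map that address to separate physical pages once the child writes to it, so the parent still sees 1.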


Checking if Hugepages are enabled in Linux

Issue:

You want to check whether transparent hugepages are enabled on your Linux system.

Solution:

Simple:

  • cat /sys/kernel/mm/transparent_hugepage/enabled

You will get an output like this:

  • [always] madvise never

You’ll see a list of all possible options (always, madvise, never), with the currently active option enclosed in brackets.
madvise is the default on many distributions; the actual default depends on how the kernel was configured.

madvise means transparent hugepages are only enabled for memory regions that explicitly request hugepages using madvise(2).
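A program can request hugepages for a region roughly as follows. This is a minimal sketch assuming Linux with glibc; the helper name alloc_thp_hinted and its advised out-parameter are made up for illustration. The call is only a hint: the kernel may still back the region with normal pages.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Maps `len` bytes of anonymous memory and asks the kernel to back it
 * with transparent hugepages. Returns NULL if the mapping fails.
 * *advised is set to 1 if madvise accepted the hint and 0 otherwise
 * (for example when the kernel was built without THP support). */
void *alloc_thp_hinted(size_t len, int *advised)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
#ifdef MADV_HUGEPAGE
    *advised = (madvise(p, len, MADV_HUGEPAGE) == 0);
#else
    *advised = 0;   /* headers do not expose the hugepage advice flag */
#endif
    return p;
}
```

The region is released with munmap(p, len) as usual. The mapping is returned even when the hint is rejected, since the memory is still usable with normal pages.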

always means transparent hugepages are always enabled for each process. This usually increases performance, but if you have a use case with many processes that each consume only a small amount of memory, your overall memory usage could grow drastically.

never means transparent hugepages won’t be enabled even if requested using madvise.
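To read the active option from a program rather than by eye, it is enough to extract the bracketed word from the line shown earlier. The helper below is a sketch with a hypothetical name, thp_active_mode; it works on a string, so the caller is assumed to have already read the line from /sys/kernel/mm/transparent_hugepage/enabled.

```c
#include <stddef.h>
#include <string.h>

/* Copies the bracketed (active) option from a line such as
 * "always [madvise] never" into out. Returns 0 on success, or -1 if
 * no bracketed option is found or out is too small. */
int thp_active_mode(const char *line, char *out, size_t out_size)
{
    const char *lb = strchr(line, '[');
    if (lb == NULL)
        return -1;
    const char *rb = strchr(lb + 1, ']');
    if (rb == NULL)
        return -1;
    size_t len = (size_t)(rb - lb - 1);
    if (len + 1 > out_size)
        return -1;              /* no room for the option plus terminator */
    memcpy(out, lb + 1, len);
    out[len] = '\0';
    return 0;
}
```

For the sample output above, "[always] madvise never", the function stores "always" in out.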

For details, take a look at Documentation/vm/transhuge.txt in the Linux kernel documentation (Documentation/admin-guide/mm/transhuge.rst in newer kernel trees).

How to change the default value

Option 1: Modify sysfs directly as root (the setting reverts to the default upon reboot):

  • echo always >/sys/kernel/mm/transparent_hugepage/enabled
  • echo madvise >/sys/kernel/mm/transparent_hugepage/enabled
  • echo never >/sys/kernel/mm/transparent_hugepage/enabled
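The same change can be made from a program by writing one of the three words to the sysfs file. The function below is a sketch, not an established API: the name set_thp_mode is hypothetical, and the path is passed in as a parameter so that the code can also be tried against an ordinary file.

```c
#include <stdio.h>
#include <string.h>

/* Writes `mode` to `path` if it is one of "always", "madvise" or
 * "never". Returns 0 on success, -1 on an invalid mode or I/O error. */
int set_thp_mode(const char *path, const char *mode)
{
    if (strcmp(mode, "always") != 0 &&
        strcmp(mode, "madvise") != 0 &&
        strcmp(mode, "never") != 0)
        return -1;

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;              /* typically a permissions error if not root */
    int ok = (fputs(mode, f) >= 0);
    if (fclose(f) != 0)         /* buffered write errors surface here */
        ok = 0;
    return ok ? 0 : -1;
}
```

On a real system the call would be set_thp_mode("/sys/kernel/mm/transparent_hugepage/enabled", "madvise"), run with root privileges. As with echo, the change does not survive a reboot.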

Option 2: Change system default by recompiling kernel with modified config (this is only recommended if you’re using your own custom kernel anyway):

In order to set default to always, set these options:

  • CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
  • # Comment out CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y

In order to set default to madvise, set these options:

  • CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
  • # Comment out CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
