KSM – What is it?

Kernel Samepage Merging (KSM) allows de-duplication of memory in Linux and was released with kernel version 2.6.32. KSM tries to find identical memory pages and merge them to free memory. It focuses on pages that are seldom updated, since merging frequently changing pages would be inefficient.

Originally KSM was developed for virtual machines. If these virtual machines use the same programs or operating systems, the overall memory usage can be reduced dramatically and more virtual machines can be operated with the available physical RAM.

Tests by Red Hat have shown that 52 virtual machines, each running Windows XP with 1 GB RAM, can be operated on a single server with only 16 GB RAM.

The following command lets you check whether KSM is compiled into the kernel:

$ grep KSM /boot/config-`uname -r`
CONFIG_KSM=y

Further information and configuration options can be found in the sysfs file system:

$ ls -1  /sys/kernel/mm/ksm/
full_scans
pages_shared
pages_sharing
pages_to_scan
pages_unshared
pages_volatile
run
sleep_millisecs
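As a sketch of how these files are used, KSM can be started and its effect estimated from the counters above (requires root; the 4 KiB page size is an assumption that holds on most x86 systems, and the numbers are purely illustrative):

```shell
# Start the KSM kernel thread (0 = stop, 1 = run, 2 = unmerge all pages)
echo 1 | sudo tee /sys/kernel/mm/ksm/run

# pages_sharing counts duplicate pages that now point at a shared page
pages_sharing=$(cat /sys/kernel/mm/ksm/pages_sharing)

# Rough memory saving, assuming a 4 KiB page size
echo "saved: $(( pages_sharing * 4 )) KiB"
```

A high pages_sharing/pages_shared ratio indicates effective sharing; a high pages_unshared count suggests KSM is scanning mostly unique pages.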

Checking if Hugepages are enabled in Linux

Issue:

You want to check whether transparent hugepages are enabled on your Linux system.

Solution:

Simple:

  • cat /sys/kernel/mm/transparent_hugepage/enabled

You will get an output like this:

  • [always] madvise never

You’ll see a list of all possible options (always, madvise, never), with the currently active option enclosed in brackets.
madvise is the default.

This means transparent hugepages are only enabled for memory regions that explicitly request hugepages using madvise(2).

always means transparent hugepages are always enabled for every process. This usually increases performance, but if you have a use case with many processes that each consume only a small amount of memory, your overall memory usage could grow drastically.

never means transparent hugepages won’t be enabled even if requested using madvise.

For details, take a look at Documentation/vm/transhuge.txt in the Linux kernel documentation.
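For use in a script, the bracketed value can be extracted with a one-liner (a minimal sketch; it only assumes the output format shown above):

```shell
# Print only the active THP mode, i.e. the word enclosed in brackets
active=$(sed 's/.*\[\(.*\)\].*/\1/' /sys/kernel/mm/transparent_hugepage/enabled)
echo "THP mode: $active"
```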

How to change the default value

Option 1: Modify sysfs directly (the setting reverts to the default upon reboot):

  • echo always >/sys/kernel/mm/transparent_hugepage/enabled
  • echo madvise >/sys/kernel/mm/transparent_hugepage/enabled
  • echo never >/sys/kernel/mm/transparent_hugepage/enabled
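To make the setting survive a reboot without recompiling, the default can also be set on the kernel command line via the transparent_hugepage= parameter (sketched here for GRUB; the file path and update command may differ on your distribution):

```shell
# In /etc/default/grub, append the parameter to the kernel command line:
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet splash transparent_hugepage=madvise"
# then regenerate the grub configuration and reboot:
sudo update-grub
```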

Option 2: Change the system default by recompiling the kernel with a modified config (only recommended if you’re using your own custom kernel anyway):

In order to set default to always, set these options:

  • CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
  • # Comment out CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y

In order to set default to madvise, set these options:

  • CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
  • # Comment out CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y

How to crack a System Design Interview? – A step-by-step guide to System Design Interviews

Let me begin by saying that there is no single answer to this question. But what I have learnt from my own experience and that of my friends is that there are three basic areas you need to be good at.

  1. Clarifications – Asking clarifying questions until you are quite clear about what the problem really is.
  2. Solid background – A solid background really helps; it comes into play when you can base an answer on your own experience. If you have some development experience, you may already have been asked to build something with similar requirements.
  3. Preparation – Most important: “How prepared are you?” I am not saying that you cannot crack the interview without preparation; preparation simply acts as a catalyst.

Step 1: Clarification Questions

Such interviews give you a chance to demonstrate your real-world knowledge. There is no right or wrong answer to the problems.
A good system design question always sounds ambiguous, and that is deliberate: it is supposed to give you a chance to demonstrate the following:

  1. How you would think about the problem space
  2. How you think about bottlenecks
  3. What you can do to remove these bottlenecks.

Asking good clarification questions helps a lot in such interviews. Clarifications may be one of several things:

  1. What is the scope of our application?
  2. What do you expect of the user behavior? Do you want me to make some assumptions?
  3. I am lost; can you give me some direction to proceed?
  4. I think there will be a bottleneck in my solution. Should I explain it to you, so that I may get a nod from you to go further?

You should always ask clarifying questions about the problem. You are never judged on whether you asked a particular question during the interview, but you are definitely judged on how you think about the problem.

Use your background to your advantage
Your experience and background can vary widely from those of other candidates. You bring a unique set of values and expertise, and that is what makes you valuable and irreplaceable. Regardless of what field you’re in, people care about what you can bring to the table.

Step 2: System interface definition

You should define what APIs are expected. This will help you establish the exact requirements of the software and also ensure that you haven’t got anything wrong.

Step 3: Estimation

We should always estimate the scale of the system that we are designing. This will help us in focusing on scaling, load balancing and caching.

  • What scale is expected from the system?
  • How much storage will we need?
  • What network bandwidth usage are we expecting?
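Such estimates are quick to script. As an illustration, here is a back-of-the-envelope storage estimate for a hypothetical photo-sharing service (every input number below is a made-up assumption, not a benchmark):

```shell
# Assumed workload -- hypothetical numbers for illustration only
uploads_per_day=2000000      # 2 M new photos per day
avg_photo_kib=200            # average photo size in KiB
retention_years=5

# Total storage needed over the retention period
total_kib=$(( uploads_per_day * avg_photo_kib * 365 * retention_years ))
echo "storage needed: $(( total_kib / 1024 / 1024 )) GiB"
```

Stating such numbers out loud, even rough ones, shows the interviewer that your design decisions (sharding, caching, replication) are grounded in scale rather than guesswork.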

Step 4: Define Data model and make high level design

Defining the data model early will clarify how data will flow among different components of the system.

Draw a block diagram with 5-6 boxes representing the core components of our system. We should identify enough components that are needed to solve the actual problem from end-to-end.

Step 5: Detailed Design

Once we get our high level design, we can move on to detailed design of the components one by one.

I will be writing more blogs about specific problems and their solution along with the process of thinking and getting to the answer.

Why is System Design Preparation needed?

Recent interview processes put a lot of emphasis on System Design. Good performance in a System Design interview shows that you are capable of working in the position. It also improves your chances of securing a higher salary.
Generally, people struggle with System Design interviews. There could be many reasons for it.
Some reasons are :

  1. Inexperience in developing large-scale applications
  2. Never having thought about why certain APIs, processes, code structures or endpoints were chosen

Generally, such interviews are open-ended. There may be many right solutions to the problem. It all depends on how you see the problem, how you approach it, and how you explain your solution, from your point of view, to the interviewer. As I said, there is no one correct answer; one should be good enough with design patterns to generally be able to think of a solution to the problems.

Requirements for cracking a System Design Interview:

  1. A good hold on system design patterns.
  2. Good object-oriented programming skills.
  3. The ability to apply design patterns to real-world problems.

Going through source code of libvirt

Clone the source code:

git clone https://libvirt.org/git/libvirt.git

The directory listing inside the repository is something like this.

libvirt is written in the C programming language. It has bindings in different languages – C++, C#, Java, PHP, Ruby, and so on.

docs, daemon, and src are a few of the important directories. libvirt is a well-documented project; the documentation can be found at https://libvirt.org.

libvirtd opens connections and operates based on driver modes. On initialization, the drivers are registered with libvirtd.
Different types of drivers are part of libvirtd. Each driver has a registration API, which loads up the driver-specific function references for the libvirt APIs to call.

As we see in the above figure, there is a public API exposed to the client. When the client calls the public API, the call is delegated to the specific driver implementation depending on the connection URI.

libvirt, qemu and kvm

libvirt talks to the various hypervisors underneath it. It is an open-source API and a domain-management tool for managing different hypervisors.

The command-line client of libvirt is virsh.

The goal of the libvirt library is to provide a common and stable layer for managing virtual machines running on a hypervisor.
To manage VMs, libvirt provides APIs for VM provisioning, creation, modification, monitoring, control and migration.

The libvirt process is daemonized, and is known as libvirtd.

What happens when a libvirt client, such as virsh, requests a service from libvirtd?
Based on the connection URI passed by the client, libvirtd opens a connection to the hypervisor. This is how clients such as virsh or virt-manager ask libvirtd to start talking to the hypervisor. Like any other daemon, it provides services to its clients.

Most people think that libvirt is restricted to the single, local node where it is running; that is not true. libvirt has remote support built into the library. So, any libvirt tool (for example virt-manager) can remotely connect to a libvirt daemon over the network, just by passing an extra connect argument.

Connection sample URI:

1. qemu:///system
2. qemu:///session

[1] requests to connect locally as “root” to the daemon supervising QEMU/KVM virtual machines.
[2] requests to connect locally as a normal user to its own set of QEMU/KVM domains.

libvirt also supports remote connections. To make a remote connection, only a small change to the connection URI is needed.
The connection format is:
[driver+transport]://[username@hostname]:[port]/[path][?extra parameters]

An example virsh command for a remote connection is:

$ virsh --connect qemu+ssh://root@remotelocation.server/system list --all
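The format above can be assembled from its parts; here is a small sketch (the user and hostname are placeholders, not real machines):

```shell
# Build a libvirt connection URI from its components:
# [driver+transport]://[username@hostname]:[port]/[path][?extra parameters]
driver=qemu
transport=ssh
user=root
host=remote.example.com
path=system

uri="${driver}+${transport}://${user}@${host}/${path}"
echo "$uri"    # -> qemu+ssh://root@remote.example.com/system

# Then, for example: virsh --connect "$uri" list --all
```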

Setting up Pintos on a local machine

Introduction

Pintos is a simple operating system framework for the 80x86 architecture. It supports kernel threads, loading and running user programs, and a file system, but it implements all of these in a very simple way. This assignment was used at MIT as part of an OS course.

Doing this as an assignment will give you a clear idea of various OS topics. This guide will take you through installing Pintos on your native Ubuntu machine. Pintos is a very small operating system, and you can run it on a virtual machine. This tutorial will use QEMU to run Pintos.

Once you have finished setting up Pintos, you can start working on your assignments.

Setting up

Step 1: Install Qemu

If you are using Ubuntu: sudo apt-get install qemu

Step 2: Download Pintos

Download Pintos from here. Extract it into your home folder, e.g. /home/username/os, where ‘username’ is your user name.

Step 3: Set GDBMACROS

Now open the script ‘pintos-gdb’ (in $HOME/os/pintos/src/utils) in any text editor. Find the variable GDBMACROS and set it to point to ‘$HOME/os/pintos/src/misc/gdb-macros’.

Your ‘pintos-gdb’ script will then contain GDBMACROS=/home/username/os/pintos/src/misc/gdb-macros
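Instead of editing by hand, the variable can be patched with sed (a sketch; it assumes pintos-gdb contains a line beginning with GDBMACROS=):

```shell
cd "$HOME/os/pintos/src/utils"
# Point GDBMACROS at the macro file shipped under src/misc
sed -i "s|^GDBMACROS=.*|GDBMACROS=$HOME/os/pintos/src/misc/gdb-macros|" pintos-gdb
grep '^GDBMACROS=' pintos-gdb    # verify the change
```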

Step 4: Compiling utilities

Go to the utils directory: cd /home/username/os/pintos/src/utils

And compile utilities folder: make

If this produces the error undefined reference to ‘floor’, edit “Makefile” in the current directory, replace LDFLAGS = -lm with LDLIBS = -lm, and compile again.
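The same fix can be applied non-interactively (a sketch assuming the offending line reads exactly LDFLAGS = -lm). Moving -lm from LDFLAGS to LDLIBS places the math library after the object files on the link command line, which is what lets the linker resolve floor:

```shell
cd "$HOME/os/pintos/src/utils"
# Link -lm after the object files instead of before them
sed -i 's/^LDFLAGS = -lm$/LDLIBS = -lm/' Makefile
make
```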
