Tuesday 17 May 2016

Generating call graphs of C code - Linux call graphs

Recently I was trying to analyse how kernel modules get installed in a Linux kernel. After spending a lot of time rummaging through "kernel/module.c" in the Linux source tree using ctags, cscope, and vim, I could only discover a few functions. As a side note, the following two functions are the main system calls used by tools like insmod, modprobe, etc. to load a kernel module.
  • init_module() - kernel/module.c
  • finit_module() - kernel/module.c
The latter was added in Linux 3.8 and allows loading a module through a file descriptor of the module's file. This is useful when the authenticity of the module can be determined from its location in the file system. The former is the older system call and expects a binary ELF image of the module to be supplied to it. Both functions call "load_module()", defined in the same file, which was the centre of my research.

This is when the idea of generating and analysing call graphs struck me. I will start by explaining how to generate the call graph of a simple C program, and then move on to generating call graphs of "module.c", located at "<Linux_source>/kernel/module.c".


Note - I learnt about generating call graphs from this page. This blog post of mine is written to add the insights I got while following the instructions over there.

Installing the required tools
  1. graphviz
    • sudo apt-get install graphviz
  2. egypt - website
    • wget "http://www.gson.org/egypt/download/egypt-1.10.tar.gz"
    • tar xzf egypt-1.10.tar.gz
    • cd egypt-1.10
    • perl Makefile.PL
    • make
    • make install (may need sudo)
Generating call graph

Generating a call graph is a three-step process.
  1. Compile the C code with the -fdump-rtl-expand flag set.

    eg.
    gcc myfile.c -fdump-rtl-expand
    


    This will cause gcc to generate an RTL file, which represents the C code in an intermediate format that is easier to parse for a call graph than the original C code. More about RTL can be found here.
  2. Convert the obtained expand file to a representation understood by the dot utility of the graphviz package. This is done by egypt.

    eg.
    egypt myfile.c.192r.expand
    


    This converts the RTL description in the myfile.c.192r.expand file to a simple directed graph representation. If function f1 calls function f2, the representation will simply be "f1"->"f2".

    This is a very simple directed graph representation of the call graph, and thus we can do a lot of processing in this stage, for example colouring certain edges, changing stroke styles, etc. More about this later.
  3. Create an SVG using the dot utility of the graphviz package.

    Pipe the output of the previous stage to dot.

    eg.

    egypt myfile.c.192r.expand | dot -Tsvg -o myfile_callgraph.svg
    

    Explanation -
      -T : output format, here svg
      -o : output file name
    For more options refer to the man page of dot (man dot).
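The "192" in the file names above is a gcc pass number, and it is not the same for every gcc version. Newer releases may also put a prefix in front of the source file name. Here is a minimal sketch of a helper that finds the dump by pattern instead of by a hard-coded name. The function name find_expand is my own invention, not part of gcc or egypt.

```shell
#!/bin/sh
# find_expand SRC: print the RTL expand dump that gcc wrote for SRC
# into the current directory.
# Hypothetical helper: it matches the pass number (192 in this post)
# and any name prefix with a glob instead of hard-coding them.
find_expand() {
    src=$(basename "$1")
    for f in ./*"$src".*r.expand; do
        # If nothing matches, the glob stays literal, so test for a real file.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    echo "no expand dump found for $1" >&2
    return 1
}

# Intended use, assuming gcc, egypt and dot are installed:
#   gcc -c -fdump-rtl-expand myfile.c
#   egypt "$(find_expand myfile.c)" | dot -Tsvg -o myfile_callgraph.svg
```

If several dumps match, the first one in glob order is printed, so it is best run in a directory that holds one dump per source file.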

Example 1 - a simple C program to find the nth Fibonacci number.


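#include <stdio.h>

/* Return the nth Fibonacci number (fib(1) = fib(2) = 1). */
int fib(int n){
    if(n<=2) return 1;
    return fib(n-1)+fib(n-2);
}

int main(){
    int n=10,x=0;
    x=fib(n);   /* use n instead of a second hard-coded 10 */
    printf("%d\n",x);
    return 0;
}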

Save it as fib.c. Issue the following commands:
  • gcc -fdump-rtl-expand fib.c 
    

  • egypt fib.c.192r.expand | dot -Tpng -o fibonacci_call_graph.png   # png instead of svg, because Blogger doesn't support svg uploads
    
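We will get a call graph with two nodes: main calls fib, and fib calls itself (the recursion). printf does not appear, because egypt leaves out external functions unless asked to include them.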



Example 2 - "module.c" in "<Linux_source>/kernel/module.c"
  • Edit <Linux_source>/kernel/Makefile and add the following lines

    ccflags-y+=-fdump-rtl-expand
    CCFLAGS+=-fdump-rtl-expand
    

  • OR, pass it as an argument to make.

    make CFLAGS="-g -fdump-rtl-expand"
    

  • Compile Linux.

    make
    

  • Pass the generated module.c.192r.expand to egypt, and pipe the result to dot to generate the graph as png or svg.

    egypt module.c.192r.expand | dot -Tsvg -o module_callgraph.svg
    
Several ".expand" files will be generated in the "<Linux_source>/kernel/" directory. We can make one large call graph, containing the interleaving of functions from all these files, simply by passing them all to egypt.


egypt *.expand | dot -Tsvg -o module_callgraph.svg


To include functions whose code is not in the files passed to egypt, use the "--include-external" option of egypt.

  
egypt --include-external *.expand | dot -Tsvg -o module_callgraph.svg


Editing directed graph (digraph) generated by egypt

Since the representation of the call graph generated by egypt is very simple, we can use several attributes of graphviz's dot language (given here). For example, suppose that in the svg obtained from module.c.192r.expand we want to colour all incoming arcs to the function "load_module()" red. We can use the following script.


egypt module.c.192r.expand |awk '{if ($3 == "\"load_module\"") {$4="[style=solid color=red]";} print $0; }'|dot -Tsvg -o module_callgraph_coloured.svg
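The same idea extends to outgoing arcs by testing the first field as well as the third. The sketch below runs on a small hand-written sample so that it works without egypt or a kernel tree. The sample only imitates the edge format the one-liner above relies on (caller, "->", callee, attributes), and "some_helper" is a made-up function name. sample_graph and colour_edges are names I chose for illustration.

```shell
#!/bin/sh
# Hand-written sample in the edge format the awk one-liner above assumes.
# It is NOT real egypt output; "some_helper" is a made-up callee.
sample_graph() {
    cat <<'EOF'
digraph callgraph {
"init_module" -> "load_module" [style=solid];
"finit_module" -> "load_module" [style=solid];
"load_module" -> "some_helper" [style=solid];
}
EOF
}

# colour_edges FUNC: red for calls into FUNC, blue for calls made by FUNC.
# Lines that are not edges (header, closing brace) are printed unchanged.
colour_edges() {
    awk -v fn="\"$1\"" '
        $2 == "->" && $3 == fn { $4 = "[style=solid color=red];" }
        $2 == "->" && $1 == fn { $4 = "[style=solid color=blue];" }
        { print }
    '
}

sample_graph | colour_edges load_module
```

With real data the last line would become something like egypt module.c.192r.expand | colour_edges load_module | dot -Tsvg -o module_callgraph_coloured.svg.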


Look at dot's man page; it contains a plethora of options that can be used to customize various aspects of the graph. Things like graph size, rotation, etc. can be specified by passing command-line arguments to dot.

Additional Resources - 
  1. ftracer - A toolkit for tracing C/C++ programs (including multi-threaded programs); it can generate a call graph for you. Link.
  2. gprof2dot - This is a Python script to convert the output from many profilers into a dot graph. Link.
  3. stackoverflow.com discussion. Link.
  4. LD_PRELOAD, and LD_DEBUG - Link1 , Link2, also see its usage in ftracer.

Note - My various attempts to generate expand files for the code in every directory of the Linux source failed. I tried adding the "-fdump-rtl-expand" flag to the Makefile in the root of the source tree, exporting CFLAGS, CCFLAGS, etc. from bash, but it didn't work. Only when I added the flag to the Makefile in the "<Linux_source>/kernel/" directory were the expand files for the C files in that directory generated. It may not be a wise idea anyway to generate a call graph for the entire kernel. The call graph for module.c alone is quite large, making it difficult to interpret using traditional tools.
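One way to tame a graph of that size is to keep only the arcs that touch the function of interest before handing the graph to dot. This is a rough sketch under the same assumption as the colouring script, namely that each edge line has the caller in the first field, "->" in the second and the callee in the third. The function name neighbourhood is my own.

```shell
#!/bin/sh
# neighbourhood FUNC: keep only the edges that start or end at FUNC.
# Lines that are not edges (the "digraph ... {" header and the closing "}")
# pass through, so the result is still a complete dot graph.
neighbourhood() {
    awk -v fn="\"$1\"" '
        $2 != "->"           { print; next }
        $1 == fn || $3 == fn { print }
    '
}

# Intended use, assuming egypt and dot are installed:
#   egypt *.expand | neighbourhood load_module | dot -Tsvg -o load_module_only.svg
```

This shows direct callers and callees only. Anything two or more calls away from the chosen function is dropped.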

Note - The discussion on this page is for generating call graphs for code written in C only.

Update - Call graphs can also be generated using doxygen, by enabling the CALL_GRAPH and CALLER_GRAPH options (together with HAVE_DOT) in its configuration file before running it on the source tree.
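As a rough sketch of the doxygen route, the helper below appends the relevant settings to an existing configuration file. The option names are doxygen's own. The helper name enable_call_graphs is made up, and the choice of EXTRACT_ALL and EXTRACT_STATIC is my assumption about what is usually wanted for undocumented C code.

```shell
#!/bin/sh
# enable_call_graphs FILE: append call-graph settings to a doxygen config.
# Hypothetical helper; a later assignment in the file replaces an earlier one.
enable_call_graphs() {
    cat >> "$1" <<'EOF'
# Appended overrides for call graphs
HAVE_DOT       = YES
CALL_GRAPH     = YES
CALLER_GRAPH   = YES
EXTRACT_ALL    = YES
EXTRACT_STATIC = YES
EOF
}

# Intended use, assuming doxygen and graphviz are installed:
#   doxygen -g Doxyfile          # write a default configuration
#   enable_call_graphs Doxyfile
#   doxygen Doxyfile
```

The graphs then appear per function in the generated documentation, instead of as one big picture like the one egypt produces.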