Chapter V. Developing open-source systems

Table of Contents
V.1. The structure of Linux systems
V.1.1. History of Unix
V.1.2. The Open Source software development model
V.1.3. The Linux operating system
V.1.4. Linux distributions
V.1.5. X Window System
V.1.6. Embedded Linux
V.2. The GCC compiler
V.2.1. The origins of GCC
V.2.2. Steps of compilation with GCC
V.2.3. Host and Target
V.2.4. The frequently used options of GCC
V.2.5. The make utility
V.2.6. The gdb debugger
V.3. Posix C, C++ system libraries
V.3.1. stdio.h
V.3.2. math.h
V.3.3. stdlib.h
V.3.4. time.h
V.3.5. stdarg.h
V.3.6. string.h
V.3.7. dirent.h
V.3.8. sys/stat.h
V.3.9. unistd.h

V.1. The structure of Linux systems

In the previous parts of this book, we have mainly dealt with software development for Microsoft Windows. However, other successful software platforms are also widespread nowadays. One of these is the family of Unix-like systems, many code portions, concepts, solutions and ideas of which have also been integrated into Windows. The following part of the book deals with the differences and similarities between the two.

V.1.1. History of Unix

The history of Unix started in 1969. Two programmers at Bell Laboratories, Ken Thompson and Dennis Ritchie, were not satisfied with the operating system of a disused DEC PDP-7 computer. They decided to create an interactive, multi-user, multitasking operating system for machines mainly used for software development. They named the system UNICS, which later became UNIX. The first versions were written in assembly language; alongside them, Thompson created the B language on the basis of BCPL. UNIX was finished in 1971 and contained a word processor, an assembler and the text-formatting tools needed for documentation. In 1972, Ritchie created the C language, relying on the B language. At that time, many people thought that it was impossible to create a high-level programming language able to function on processors with different word lengths and different memory organisations. Once the system had been rewritten in C (1973), it became portable: for each CPU with a new instruction set, only the machine-specific part of the compiler and the assembler of that CPU had to be written. After that, the operating system and the application software could be compiled for the new hardware. In 1975, Unix was given to universities, like Berkeley. At that university, the source code was further developed under the name BSD, and some developments, like the Sockets software package that handles Internet communication, were taken back by Bell Labs into the successor USG Unix. Unix became the tool of the scientific community.

An important element of Unix is standardization. Because many variants existed from the beginning, system calls (e.g. opening a file, printing a character to the console) were standardized so that each piece of software could run on all of them. Software that uses only system calls complying with the standard can run on any standard Unix operating system. The standard was named POSIX in 1988. Many organisations dealt with standardization, so the owner of the UNIX brand name changed many times: X/Open, OSF, The Open Group. Today the standard is maintained by the Austin Common Standards Revision Group, which was formed from these groups. It has more than 500 members, coming from industry, the governmental sector and the open-source community. A system can be called Unix only if it has been certified by The Open Group (of course, after paying the costs of certification). For example, Mac OS X, HP-UX, Sun Solaris and IBM AIX can be called Unix. BSD is not Unix, in spite of the fact that the standard system functions (POSIX, IEEE 1003.1) are tested under BSD.

V.1.2. The Open Source software development model

BSD introduced a new principle: that of open source. Open source means that all the sources needed to compile an operating system (and its utility software) are legally available, so they can be viewed freely, modified and passed on to third parties. The GNU project shares this view: it creates good-quality open-source tools for Unix-type operating systems. Its best-known project is the C compiler launched in 1987, called GCC (originally the GNU C Compiler, today the GNU Compiler Collection). Since then it has come to support more programming languages (C, C++, Fortran). GNU software is distributed under the GPL (General Public License). The difference between the GPL and the open-source licences of BSD is that whoever distributes a product under the GPL is obliged to provide its source code, whereas the BSD licences do not require this.

V.1.3. The Linux operating system

The first Unix (and BSD) systems only worked on mainframe computers because the so-called "personal computers" of the eighties did not have enough memory even to load the kernel of the operating system. The IBM PC later changed this situation: Microsoft sold a Unix under the name Xenix for these machines, but it was not a success. A better attempt was Minix, created by professor Andy Tanenbaum: a Unix clone made for educational purposes for the PCs of the 80's (with 8086 and 80286 CPUs). Microsoft Windows was not particularly successful (or fast) on these CPUs either. Then, as a result of a major development effort, Intel launched the 80386 CPU with 32-bit memory addressing and a true protected mode. This CPU was able to support multitasking operating systems.

A Finnish student, Linus Torvalds, was studying the memory management of the 80386 CPU. He published his developments on a mailing list on the Internet, some people helped him, and before long the kernel of a Unix-like operating system was born. It was named Linux, in a way similar to the transformation of UNICS into UNIX. It was released under the GNU GPL license and was distributed on the Internet, which by that time (the middle and late 90's) was available to most people. Linux was also chosen as the operating system of many servers because, on a relatively cheap PC, it sometimes achieved a higher performance than certified Unix systems on mainframe computers that were one or two times more expensive. It should be noted that Linux cannot be called UNIX. Although Linux complies with the POSIX standards, nobody paid for its certification, so (just like BSD) it cannot be called Unix, only "Unix-like". Strictly speaking, Linux means only the kernel, but the kernel alone is not a usable system. In Linux, the C compiler is the already mentioned GCC and most of the utility software comes from the GNU project, so the system is called GNU/Linux at the request of the GNU project. According to its definition, a Linux kernel "is written in a way that it could be compiled with GCC".

V.1.4. Linux distributions

By distribution we mean a complete system that can be installed on a computer with an empty hard disk. It has to contain various utilities: a disk partitioner, a bootloader, a package selector (a web server and a workstation used for word processing do not need the same utility software and settings) and a configuration tool. There are many such distributions. Users of any one of them tend to think that the one they have chosen is the best, that the others are bad and that the users of other distributions are "incompetent beginners". All distributions contain the kernel bearing the name of Linus and, of course, one of the versions of the GCC compiler. The "package manager" that handles boot scripts and installable software packages is implemented differently in different distributions. Some software companies "adopted" a distribution to solve the problem of product support, since free distributions come without support and banks and internet service providers must not use unsupported products. In most cases, the supported distributions that can be bought are the same as the ones that can be downloaded from the website of that distribution. There is also a distribution tailored for big companies that includes a database manager, but of course this has a correspondingly higher price.

V.1.5. X Window System

When Unix was developed, graphical displays were not widespread. Even the alphanumeric display (terminal) connected to the computer by a serial cable was a novelty at that time. That is why Unix (and Linux, too) communicates with users through a terminal, basically by means of a command interpreter (shell). If you think that this architecture (a host machine with a lot of terminals connected to it) is outdated, you should look at a cloud-based application: the Internet replaces the serial cables and many computers replace the single one, but logically the whole still appears as one computer. The graphical user interface (GUI) also appeared in the Linux world as a tool that makes it easier to learn how to use a computer. Now it is not the two-character commands, unchanged for more than 40 years, that we have to keep in mind, but the place where we have to click for a given operation, and the manufacturers (see e.g. Microsoft Office) enthusiastically rearrange these places in each new version. The Unix-style GUI is called the X Window System; it supports the well-known basic notions: mouse, cursor, icon and click. The properties of the X Window System:

  • it is not part of the operating system, just an application; Unix works without it

  • it is network-transparent: an Internet (TCP/IP) connection is sufficient between the running application and its graphical display

  • it has no user interface of its own: it consists only of a graphical desktop and a mouse cursor. To be usable, it needs a Window Manager (WM) that handles windows and icons and launches the applications behind the icons

  • there is an open-source version that can be used under GPL operating systems. Its source can be found on the X.org website if we want to compile it. It does not contain graphics card drivers; these can be downloaded from the xfree86.org website. Linux distributions contain xfree86, and some of them can be installed graphically.

V.1.6. Embedded Linux

As Linux was written in C, it can be compiled for other CPUs. TCP/IP support, graphics (X Window System) and a wide range of server software (web, file, print, database) make it an ideal operating system for products connected to the Internet (ADSL routers, network hard drives, DVD and entertainment devices such as media players, cell phones and tablets, industrial automation), since it can be tailored to these devices at a low cost. It has only one disadvantage: it requires a lot of memory in comparison with the size and power supply of these devices (in portable devices, the power supply is a battery). Its best-known form is Android, Google's operating system for cell phones, which runs on a modified Linux kernel and was developed with modified GNU application software.

V.2. The GCC compiler

V.2.1. The origins of GCC

The GCC compiler was developed for the GNU operating system within the GNU project. It contains Fortran, Java, C, C++ and Ada front ends and a set of standard header files and libraries for compiling, linking, executing and debugging software. Of the GNU system, almost everything needed for a Unix-like operating system was completed, with the exception of the kernel. Conversely, the applications, such as the compiler, the shell and the utilities, were useful for Linux, of which only the kernel had been completed. When the GCC compiler was designed and written, the main aim was to create a free tool of good quality. Another aim was to support different computer architectures. The GNU system is the result of the work of developers collaborating on the Internet: the source code of the compiler is available to anyone. If someone corrects or extends it and sends the modification to the organisation coordinating its development, the modification can be integrated into the system.

V.2.2. Steps of compilation with GCC

The GCC compiler creates executable files from .c source files in the way presented in Section IV.1.1. Unlike the Microsoft compiler under Windows, GCC first creates from the .c files text files in assembly language, which contain the machine instructions in textual form, and these files are then translated to object files by an assembler.

Let's see the following very simple example code, which is typed with a text editor and saved in a file named x.c:

#include <stdio.h>
 
int main(void) {
  int i;
  i=0;
  i++;
  printf("%i\n",i);
  return 0;
}

This code is first processed by the preprocessor, which resolves all #include and #define directives: it performs the replacements prescribed by #define directives (this code does not require this step) and, at each #include directive, it copies in the named header. We chose the simplest header needed to print text on the screen. The preprocessed C source, stored in the file x.i, is the following:

# 1 "x.c"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "x.c"
# 1 "/usr/include/stdio.h" 1 3 4
# 40 "/usr/include/stdio.h" 3 4
# 1 "/usr/include/sys/cdefs.h" 1 3 4
# 59 "/usr/include/sys/cdefs.h" 3 4
# 1 "/usr/include/machine/cdefs.h" 1 3 4
# 60 "/usr/include/sys/cdefs.h" 2 3 4
# 1 "/usr/include/sys/cdefs_elf.h" 1 3 4
# 62 "/usr/include/sys/cdefs.h" 2 3 4
# 41 "/usr/include/stdio.h" 2 3 4
# 1 "/usr/include/sys/featuretest.h" 1 3 4
# 42 "/usr/include/stdio.h" 2 3 4
# 1 "/usr/include/sys/ansi.h" 1 3 4
# 35 "/usr/include/sys/ansi.h" 3 4
# 1 "/usr/include/machine/int_types.h" 1 3 4
# 47 "/usr/include/machine/int_types.h" 3 4
typedef signed char __int8_t;
typedef unsigned char __uint8_t;
typedef short int __int16_t;
typedef unsigned short int __uint16_t;
typedef int __int32_t;
typedef unsigned int __uint32_t;
typedef long int __int64_t;
typedef unsigned long int __uint64_t;
typedef long __intptr_t;
typedef unsigned long __uintptr_t;
# 36 "/usr/include/sys/ansi.h" 2 3 4
typedef char * __caddr_t;
typedef __uint32_t __gid_t;
typedef __uint32_t __in_addr_t;
typedef __uint16_t __in_port_t;
typedef __uint32_t __mode_t;
typedef __int64_t __off_t;
typedef __int32_t __pid_t;
typedef __uint8_t __sa_family_t;
typedef unsigned int __socklen_t;
typedef __uint32_t __uid_t;
typedef __uint64_t __fsblkcnt_t;
typedef __uint64_t __fsfilcnt_t;
# 43 "/usr/include/stdio.h" 2 3 4
# 1 "/usr/include/machine/ansi.h" 1 3 4
# 75 "/usr/include/machine/ansi.h" 3 4
typedef union {
 __int64_t __mbstateL;
 char __mbstate8[128];
} __mbstate_t;
# 45 "/usr/include/stdio.h" 2 3 4
typedef unsigned long size_t;
# 1 "/usr/include/sys/null.h" 1 3 4
# 51 "/usr/include/stdio.h" 2 3 4
typedef __off_t fpos_t;
# 74 "/usr/include/stdio.h" 3 4
struct __sbuf {
 unsigned char *_base;
 int _size;
};
# 105 "/usr/include/stdio.h" 3 4
typedef struct __sFILE {
 unsigned char *_p;
 int _r;
 int _w;
 unsigned short _flags;
 short _file;
 struct __sbuf _bf;
 int _lbfsize;
 void *_cookie;
 int (*_close)(void *);
 int (*_read) (void *, char *, int);
 fpos_t (*_seek) (void *, fpos_t, int);
 int (*_write)(void *, const char *, int);
 struct __sbuf _ext;
 unsigned char *_up;
 int _ur;
 unsigned char _ubuf[3];
 unsigned char _nbuf[1];
 struct __sbuf _lb;
 int _blksize;
 fpos_t _offset;
} FILE;
extern FILE __sF[];
# 214 "/usr/include/stdio.h" 3 4
void clearerr(FILE *);
int fclose(FILE *);
int feof(FILE *);
int ferror(FILE *);
int fflush(FILE *);
int fgetc(FILE *);
int fgetpos(FILE * __restrict, fpos_t * __restrict);
char *fgets(char * __restrict, int, FILE * __restrict);
FILE *fopen(const char * __restrict , const char * __restrict);
int fprintf(FILE * __restrict , const char * __restrict, ...)
    __attribute__((__format__(__printf__, 2, 3)));
int fputc(int, FILE *);
int fputs(const char * __restrict, FILE * __restrict);
size_t fread(void * __restrict, size_t, size_t, FILE * __restrict);
FILE *freopen(const char * __restrict, const char * __restrict,
     FILE * __restrict);
int fscanf(FILE * __restrict, const char * __restrict, ...)
    __attribute__((__format__(__scanf__, 2, 3)));
int fseek(FILE *, long, int);
int fsetpos(FILE *, const fpos_t *);
long ftell(FILE *);
size_t fwrite(const void * __restrict, size_t, size_t, FILE * __restrict);
int getc(FILE *);
int getchar(void);
void perror(const char *);
int printf(const char * __restrict, ...)
    __attribute__((__format__(__printf__, 1, 2)));
int putc(int, FILE *);
int putchar(int);
int puts(const char *);
int remove(const char *);
void rewind(FILE *);
int scanf(const char * __restrict, ...)
    __attribute__((__format__(__scanf__, 1, 2)));
void setbuf(FILE * __restrict, char * __restrict);
int setvbuf(FILE * __restrict, char * __restrict, int, size_t);
int sscanf(const char * __restrict, const char * __restrict, ...)
    __attribute__((__format__(__scanf__, 2, 3)));
FILE *tmpfile(void);
int ungetc(int, FILE *);
int vfprintf(FILE * __restrict, const char * __restrict, __builtin_va_list)
    __attribute__((__format__(__printf__, 2, 0)));
int vprintf(const char * __restrict, __builtin_va_list)
    __attribute__((__format__(__printf__, 1, 0)));
char *gets(char *);
int sprintf(char * __restrict, const char * __restrict, ...)
    __attribute__((__format__(__printf__, 2, 3)));
char *tmpnam(char *);
int vsprintf(char * __restrict, const char * __restrict,
    __builtin_va_list)
    __attribute__((__format__(__printf__, 2, 0)));
int rename (const char *, const char *);
# 285 "/usr/include/stdio.h" 3 4
char *ctermid(char *);
char *cuserid(char *);
FILE *fdopen(int, const char *);
int fileno(FILE *);
void flockfile(FILE *);
int ftrylockfile(FILE *);
void funlockfile(FILE *);
int getc_unlocked(FILE *);
int getchar_unlocked(void);
int putc_unlocked(int, FILE *);
int putchar_unlocked(int);
int pclose(FILE *);
FILE *popen(const char *, const char *);
# 332 "/usr/include/stdio.h" 3 4
int snprintf(char * __restrict, size_t, const char * __restrict, ...)
    __attribute__((__format__(__printf__, 3, 4)));
int vsnprintf(char * __restrict, size_t, const char * __restrict,
     __builtin_va_list)
    __attribute__((__format__(__printf__, 3, 0)));
int getw(FILE *);
int putw(int, FILE *);
char *tempnam(const char *, const char *);
# 361 "/usr/include/stdio.h" 3 4
typedef __off_t off_t;
int fseeko(FILE *, __off_t, int);
__off_t ftello(FILE *);
int vscanf(const char * __restrict, __builtin_va_list)
    __attribute__((__format__(__scanf__, 1, 0)));
int vfscanf(FILE * __restrict, const char * __restrict, __builtin_va_list)
    __attribute__((__format__(__scanf__, 2, 0)));
int vsscanf(const char * __restrict, const char * __restrict,
    __builtin_va_list)
    __attribute__((__format__(__scanf__, 2, 0)));
# 398 "/usr/include/stdio.h" 3 4
int asprintf(char ** __restrict, const char * __restrict, ...)
    __attribute__((__format__(__printf__, 2, 3)));
char *fgetln(FILE * __restrict, size_t * __restrict);
char *fparseln(FILE *, size_t *, size_t *, const char[3], int);
int fpurge(FILE *);
void setbuffer(FILE *, char *, int);
int setlinebuf(FILE *);
int vasprintf(char ** __restrict, const char * __restrict,
    __builtin_va_list)
    __attribute__((__format__(__printf__, 2, 0)));
const char *fmtcheck(const char *, const char *)
    __attribute__((__format_arg__(2)));
FILE *funopen(const void *,
  int (*)(void *, char *, int),
  int (*)(void *, const char *, int),
  fpos_t (*)(void *, fpos_t, int),
  int (*)(void *));
int __srget(FILE *);
int __swbuf(int, FILE *);
static __inline int __sputc(int _c, FILE *_p) {
 if (--_p->_w >= 0 || (_p->_w >= _p->_lbfsize && (char)_c != '\n'))
  return (*_p->_p++ = _c);
 else
  return (__swbuf(_c, _p));
}
 
# 2 "x.c" 2
int main(void) {
  int i;
  i=0;
  i++;
  printf("%i\n",i);
  return 0;
}

The code has become rather long: we can see that an included file can include several other files as well. The preprocessed source, compiled into assembly, is stored in the file named x.s:

    .file    "x.c"
    .section    .rodata
.LC0:
    .string    "%i\n"
    .text
.globl main
    .type    main, @function
main:
.LFB3:
    pushq    %rbp
.LCFI0:
    movq    %rsp, %rbp
.LCFI1:
    subq    $16, %rsp
.LCFI2:
    movl    $0, -4(%rbp)
    incl    -4(%rbp)
    movl    -4(%rbp), %esi
    movl    $.LC0, %edi
    movl    $0, %eax
    call    printf
    movl    $0, %eax
    leave
    ret
.LFE3:
    .size    main, .-main
    .section    .eh_frame,"a",@progbits
.Lframe1:
    .long    .LECIE1-.LSCIE1
.LSCIE1:
    .long    0x0
    .byte    0x1
    .string    "zR"
    .uleb128 0x1
    .sleb128 -8
    .byte    0x10
    .uleb128 0x1
    .byte    0x3
    .byte    0xc
    .uleb128 0x7
    .uleb128 0x8
    .byte    0x90
    .uleb128 0x1
    .align 8
.LECIE1:
.LSFDE1:
    .long    .LEFDE1-.LASFDE1
.LASFDE1:
    .long    .LASFDE1-.Lframe1
    .long    .LFB3
    .long    .LFE3-.LFB3
    .uleb128 0x0
    .byte    0x4
    .long    .LCFI0-.LFB3
    .byte    0xe
    .uleb128 0x10
    .byte    0x86
    .uleb128 0x2
    .byte    0x4
    .long    .LCFI1-.LCFI0
    .byte    0xd
    .uleb128 0x6
    .align 8
.LEFDE1:
    .ident    "GCC: (GNU) 4.1.3 20080704"

Here as well, we can recognise our program code: e.g. i++ has become incl -4(%rbp). The compiler has also inserted its version number, but only as text. This text remains in the executable file even after linking.

By default (i.e. without any specific option), these work files are deleted; only the executable file remains besides the C file. If no name is assigned to the executable, it will be a.out. To give the executable file another name, the compiler should be run with the option -o filename:

Command                     Result              Execution
gcc x.c                     a.out               ./a.out
gcc -o x x.c                x                   ./x
gcc -save-temps -o x x.c    x.i, x.s, x.o, x    ./x

V.2.3. Host and Target

The source code of a compiler (in our case the C(++) compiler) is given. The compiler itself turns one text file (the .c source) into another text file, the assembly source file. So basically, the source of a compiler can be compiled on any computer so that it can be executed on that computer. An executable file is created from the compiler code, and it can be launched (in Unix as gcc, in Windows as gcc.exe). The system on which the compiler is executed is called the compiler host, since compilation will take place on it. When the compiler itself is compiled, the CPU instruction set for which it will generate code has to be specified. This is called the compiler target. A compiler is generally used on a personal computer or a notebook. It is there that program development takes place, so the host system is generally Intel-Linux or Intel-Windows. The 64-bit version is called amd64 or x86_64 for historical reasons. If the compiled code will be executed by the same kind of computer and operating system that runs the compiler (target == host), then the compiler is called "native". Microsoft Visual Studio C++ is a native compiler: it runs under Windows and its output is also a Windows application. Embedded Linux systems generally do not contain the GCC compiler because of its size, so the computers running them are not suitable for code development. In that case, the program code is written and compiled on a PC, and only the target is set to the embedded system. This process is called "cross-compiling". The whole program structure of SOCs (see Chapter VI) is often created in this way. The input of a cross compiler is a complete directory structure with source codes, and its output is the content of the ROM of the SOC.

V.2.4. The frequently used options of GCC

Since the gcc compiler is designed to be run from the command line, its behaviour can be modified with switches typed on the command line. The basic command is gcc sourcefile, which compiles the source file into an executable file named a.out. Attention: under Unix, not only source code written in C but also the options are case-sensitive.

Option          Example                    Effect
-o output_file  gcc -o x x.c               An executable file named x is created instead of a.out.
-Wall           gcc -Wall x.c              Enables a large set of warnings about questionable constructs, e.g. printf arguments that do not match the format string or another interpretation of the data a pointer points to. Despite its name, it does not enable every warning: a decrease in precision during conversion, for example, is reported only with -Wconversion.
-g              gcc -g x.c                 The executable file will also contain debugging symbols, so it can be debugged with the gdb debugger.
-save-temps     gcc -save-temps x.c        Does not delete the temporary files created during the compilation process.
-c              gcc -c x.c                 Does not create an executable file, only an object file (x.o); used when the program is made up of more than one source file to be compiled.
-x language     gcc -x c x.c               Specifies the programming language. If it is not given, it is deduced from the file name's extension: a .cpp extension therefore means the C++ language.
-E              gcc -E x.c                 Only the preprocessor is executed, so no assembly file is created.
-S              gcc -S x.c                 Compiles only until the assembly file is created; no object file is created.
-v              gcc -v x.c                 Verbose: shows the commands executed during compilation, from preprocessing to linking.
-O[number]      gcc -O3 x.c                Optimization: there is no optimization with -O0 and the strongest optimization with -O3: compilation takes longer and uses more memory, but the compiled program usually runs faster.
-D macro        gcc -Dtest x.c             Defines a macro. It is the same as #define in the source code.
-Idirectory     gcc -I. x.c                The directory is added to the search path of include files, so the includes of our program can also be there. The option is a capital "I", as in "Include".
-nostdinc       gcc -nostdinc x.c          Does not search the standard directories for include files. On a SOC, the includes of the given system have to be used; our example code will not compile this way either without a suitable -I option.
-llibrary       gcc x.c -lm                Links the program with the named library (here libm, the mathematical library). The option is a lower-case "l", as in "link". The directories searched for libraries can be extended with the related -Ldirectory option (capital "L").
-shared         gcc -shared lib.c          Does not compile an executable program but a shared library, which can be used later by other programs.
-time           gcc -time x.c              Prints the time spent on each compilation step in the Unix manner: user and system time.
-fverbose-asm   gcc -S -fverbose-asm x.c   Comments are inserted in the assembly file, e.g. the actual line of the C source, in order to make it more legible.

V.2.5. The make utility

We have previously described how the gcc compiler compiles a source file written in C. However, software may be bigger: it can be made up of more than one source file. If there are several C source files, each is first compiled with the -c option, and after compilation the object files are linked by a program named ld (the linker). A shell script (a text file containing commands, like .bat or .cmd files in Windows) could also be written to control this process, specifying the order in which the files are compiled and the way they are linked. If the code is really big, however, compilation may take days. Development is not easy in the case of huge programs: it would not be sensible to compile all C files each time some source files are modified, because it is unnecessary to regenerate the object files created from unchanged files. Modifications of files can be tracked in a simple way: if a .c file is newer (on the basis of the date of its latest modification) than the .o file created from it, the .c file has been modified, so the .o file is out of date and the .c file has to be recompiled. This relationship between files is called a "dependency". A similar condition can be formulated for executable programs to track the changes of their .o files. The program that checks these conditions is called make. To use it, we create a file named Makefile in which we specify, for each file, which files it is derived from and which command should be used if it has to be rebuilt. Besides that, make offers further possibilities: a Makefile can also define additional targets, for example which commands should be executed when make clean or make install is issued. A Makefile example for a program composed of more than one file:

# Note: every command line must begin with a TAB character, not with spaces.
# Libraries (-l options) are listed after the object files that use them.
tr1660: main.o libtr1660.a screen.o
	gcc -o tr1660 main.o libtr1660.a screen.o -lcurses
	size tr1660

condi: condi.o libtr1660.a screen.o
	gcc -o condi condi.o libtr1660.a screen.o -lm -lcurses
	rm condi.o libtr1660.a screen.o
	size condi

libtr1660.a: tr1660.h tr1660.c
	gcc -Wall -c -o tr1660.o tr1660.c
	ar rcs libtr1660.a tr1660.o
	rm tr1660.o

main.o: main.c
	gcc -c -Wall -o main.o main.c

condi.o: condi.c
	gcc -c -Wall -o condi.o condi.c

screen.o: screen.c screen.h
	gcc -c -Wall -o screen.o screen.c

clean:
	rm -f *.o *.a *.so tr1660 condi *~ a.out

The name of the main program is tr1660; it is made up of three separate parts: main.c, the library libtr1660.a (created from tr1660.h and tr1660.c) and screen.c. The program named condi can also be built, from condi.c, using the same library, and the whole project can be cleaned with make clean.

V.2.6. The gdb debugger

We may find that the program does not work the way we want. The program can be compiled, but the operating system sends an error message and aborts it, or the result printed out is erroneous. In that case a debugger can be used. It can place breakpoints in the program: the program runs at full speed until one of these points, then execution stops there, the values of variables can be inspected, and afterwards the program can be executed step by step. In the GNU project, this program is called gdb. It should be noted that in the debugger, code lines and variables are shown in C, while it is the machine code of the program that runs. This is only possible if the program is compiled by gcc with the -g option. In that case, symbols are inserted into the executable file which map the machine code instructions to the lines of the source file from which they were created.

In the next example, let's debug the code of Section V.2.2: let's insert a breakpoint in the program, execute the program step by step and track the value of the variable.

First, let's compile the program with debugging information:

bash-4.2# gcc -Wall -g x.c -o x

Now let's run the debugger with the program:

bash-4.2# gdb x
GNU gdb 6.5
Copyright (C) 2006 Free Software Foundation, Inc.

Let's ask for a source listing in case we no longer remember what we have written.

(gdb) l
1       #include <stdio.h>
2       
3       int main(void) {
4         int i;
5         i=0;
6         i++;
7         printf("%i\n",i);
8         return 0;
9       }

Let's insert a breakpoint at line 5:

(gdb) b 5
Breakpoint 1 at 0x400918: file x.c, line 5.

The program can be executed now.

(gdb) r
Starting program: /root/prg.c/x 
 
Breakpoint 1, main () at x.c:5
5    i=0;

The program stopped at line 5. Let's make a step: let's execute i=0!

(gdb) n
6    i++;

Let's execute i++ too!

(gdb) n
7    printf("%i\n",i);

Let's print out the value of i !

(gdb) p i
$1 = 1

Let's make the program run at full speed.

(gdb) c
Program exited normally.

Debugging is finished; the debugger can be exited.

(gdb) q

V.3. Posix C, C++ system libraries

In Section V.1.1, we explained that a program written in C(++) is portable if it can be compiled on several systems. For that purpose, standardization organisations have defined a common function library, the elements of which can be found in all systems complying with the POSIX standards. These functions contain frequently used code that a programmer may need. For example, programmers do not have to write code for printing floating-point numbers, since the function printf("%f") is at their disposal when #include <stdio.h> is used. Although POSIX was about Unix-type systems, these libraries are available in Visual Studio as well. The library consists of a huge number of header files. However, the following part only presents the functions mentioned during the training, because describing all functions would exceed the size limits of this book. The whole system can be found on Wikipedia, in the article named "C POSIX library". In the following parts, the described functions are grouped on the basis of their header files.

V.3.1. stdio.h

stdio.h (c++: cstdio) contains basic I/O operations.

  • fopen: opens a file

  • freopen: reopens a file

  • fflush: writes the buffered output of a stream to the underlying file

  • fclose: closes a file

  • fread: reads data from a file

  • fwrite: writes data to a file

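  • getchar/fgetc: reads a character from the standard input/a file (getc is like fgetc, but it may be implemented as a macro)

  • gets/fgets: reads a string from the standard input/a file. gets cannot limit the number of characters read and was removed from the C11 standard, so fgets should be used instead

  • putchar/fputc: writes a character to the standard output/a file (putc is like fputc, but it may be implemented as a macro)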

  • puts/fputs: writes a string to the standard output/a file

  • ungetc: puts an already read character back into a stream

  • scanf/fscanf/sscanf: reads formatted data from the standard input/a file/a string. See format settings in Section I.2

  • printf/fprintf/sprintf/snprintf: writes formatted data to the standard output/a file/a string/a string of given length. snprintf() never writes more than the given number of bytes, so it cannot overrun the buffer.

  • ftell: returns the current file position

  • fseek: sets the current file position, moves the file pointer to a specific location

  • rewind: moves the file position indicator to the starting position of the file

  • feof: checks whether a previous read operation has reached the end of the file

  • perror: prints a text and the error message belonging to the current value of errno to the standard error file

  • remove: deletes a file

  • rename: renames a file

  • tmpfile: creates a temporary file with a unique name and returns the pointer to that file

  • tmpnam: returns the name of a temporary file with a unique name

  • EOF: end of file constant

  • NULL: a pointer constant that is guaranteed not to point anywhere

  • SEEK_CUR: constant for fseek, the new position is relative to the current file position

  • SEEK_END: constant for fseek, the new position is relative to the end of the file

  • SEEK_SET: constant for fseek, the new position is relative to the beginning of the file

  • stdin, stdout, stderr: definitions of the standard files

  • FILE: the structure of file variables

  • size_t: the unsigned integer type of the result of the sizeof operator.

V.3.2. math.h

The file named math.h (c++: cmath) contains mathematical functions and constants.

  • abs/fabs: absolute value for integer/real numbers (in C, abs is declared in stdlib.h)

  • div/ldiv: the quotient and remainder of integer division, returned in a structure of type div_t/ldiv_t (in C, they are declared in stdlib.h)

  • exp: exponential function

  • log: natural (base e) logarithm function

  • log10: base 10 logarithm function

  • sqrt: square root function

  • pow: raises a number to a given power (there is no power operator in C++, only a function)

  • sin/cos/tan/atan: trigonometrical functions: sine/cosine/tangent/inverse tangent.

  • sinh/cosh/tanh/atanh: hyperbolic functions: hyperbolic sine/cosine/tangent/inverse hyperbolic tangent

  • ceil: returns the smallest integer that is not smaller than its argument (rounds up)

  • floor: returns the largest integer that is not bigger than its argument (rounds down)

  • round: rounds to the nearest integer; halfway cases are rounded away from zero.

V.3.3. stdlib.h

stdlib.h (c++: cstdlib) contains other frequently used functions needed in Unix: conversion between strings and numbers, random number generation, memory allocation and process control.

  • atof: converts a string to a floating point value (in practice, to a double)

  • atoi: converts a string to an integer

  • atol: converts a string to a long integer

  • rand: returns a pseudo-random integer between 0 and RAND_MAX

  • srand: sets the starting value (seed) of the random number generator. It is normally initialised with the system time

  • malloc: memory allocation

  • realloc: resizes an already allocated memory block

  • calloc: allocates memory for an array and initialises all bytes of the allocated memory block to zero

  • free: deallocates allocated memory

  • abort: makes the program abort without cleaning up

  • exit: makes the program exit normally with cleaning up

  • atexit: registers a function that will be called when the program is exited

  • getenv: returns the value of the environment variable of the given name, or NULL if it does not exist. The environment variables of the program are stored in the format name=value

  • system: calls the command processor of the operating system with the command specified in its argument. It can be used to run external commands.

V.3.4. time.h

The header time.h (c++: ctime) contains the data structures and functions necessary to handle time. A Unix-type time is an integer representing the number of seconds elapsed since midnight on the 1st of January 1970 in UTC (formerly called GMT), so that computers in different time zones can communicate well with each other. Each computer has to compute the corresponding local time with the help of its current time zone. Using 32-bit values for storing time seemed to be a good idea when the Unix system was designed, but nowadays it has to be taken into consideration that a signed 32-bit time cannot represent moments after 03:14:07 UTC on the 19th of January 2038. On current 64-bit systems time_t is a 64-bit value, which is sufficient for about 292 billion years into the future.

  • time: returns the current time in a time_t format (Unix time)

  • difftime: calculates the difference between two times

  • clock: returns the processor time used by the current program (as a process) since the beginning of its execution, in clock ticks. Dividing it by CLOCKS_PER_SEC gives seconds

  • asctime: converts a tm structure into a string

  • ctime: converts a time of type time_t into a string, as local time

  • strftime: like asctime, but the format can be specified (e.g. yyyy-mm-dd hh:mm:ss)

  • gmtime: converts a UTC time of type time_t (integer) into a tm structure

  • localtime: converts a time of type time_t (integer) into a tm structure expressed in local time

  • mktime: converts a local time from a tm structure into a time of type time_t

  • tm: time data type. Members: tm_sec: seconds, tm_min: minutes, tm_hour: hours, tm_mday: day of the month, tm_mon: month (0: January, 11: December), tm_year: years since 1900, tm_wday: day of the week (0: Sunday), tm_yday: days since 1st January, tm_isdst: daylight saving active flag

  • time_t: time of integer type, number of seconds since 1st January 1970

  • clock_t: integer type, for counting CPU clock ticks.

V.3.5. stdarg.h

stdarg.h (c++: cstdarg) contains the types and macros that are needed to write functions with a varying number of arguments. Some of these functions (like printf) are treated in Section II.2.

  • va_list: type of lists of a varying number of elements

  • va_start: macro that sets the list to the first of the varying arguments

  • va_arg: returns the next element in the list

  • va_end: closes the processing of the list; it has to be called before the function returns

  • va_copy: copies the list into another list

V.3.6. string.h

The header string.h (c++: cstring) contains functions and constants for handling "C style" strings, which end with a 0 byte and are composed of characters of 1 byte. Manipulation of strings and contiguous memory blocks is an important task in every programming language. There are functions that work on a data array until the closing 0 byte, and there are functions that do not take the end sign into consideration: the latter always have a parameter that indicates the length of the data to be processed.

  • strcpy: copies a string until the closing 0 byte

  • strncpy: copies at most the given number of characters from a string. If the source string is longer than that, the result is not closed by a 0 byte

  • strcat: appends a string to another, concatenation, until the zero closing byte is reached

  • strncat: appends at most the given number of characters of a string to another and closes the result with a 0 byte

  • strlen: calculates the length of a string (until 0 end sign)

  • strcmp: compares two strings, until 0 end sign. Output: 0, if the two strings contain the same characters, a negative number if the first argument is smaller (in ASCII) than the second, positive if the second argument is smaller

  • strncmp: compares two strings, in the provided character length. The output is the same as defined at strcmp()

  • strchr: searches for a character (char) in a string. If it finds that, it returns a pointer to that character, if not, it returns NULL

  • strstr: searches for a string in another string. Like strchr(), it returns a pointer to the first occurrence or NULL

  • strtok: splits a string into tokens. Attention! It modifies the original content of the string

  • strerror: converts a numeric error code returned by the operating system (errno) into a string containing the error message, in English by default

  • memset: fills the memory block a pointer points to with the same byte value; the number of bytes is given in an argument

  • memcpy: copies a memory block a pointer points to into an area the other pointer points to; the number of bytes to be copied is given in an argument

  • memmove: like memcpy, but the memory blocks can overlap (that is, one of the pointers + length > the other pointer). The result is the same as if the data were first copied into a temporary memory space and then to the target memory space

  • memcmp: compares two memory blocks, at a given length

  • memchr: searches for the given character in the memory block, at a given length. If it does not find that, it returns NULL.

V.3.7. dirent.h

The file named dirent.h contains data types and functions to manage directories on data storage devices. dirent.h is not part of the C standards, but POSIX systems contain it. Microsoft Visual C++ does not contain dirent.h; there, the corresponding Win32 functions should be used instead (FindFirstFile, FindNextFile, FindClose).

  • struct dirent: structure type used to return information about a directory entry. It contains the serial number of the file (d_ino) and the name of the file (d_name) as an array of type char[]. On some systems it can also contain file offset, record size, name length and file type.

  • opendir: opens the directory of the given name and returns a pointer of type DIR* that identifies it (NULL on error)

  • readdir: reads the next entry from an opened directory. At the end of the directory (if there are no more entries), returns NULL

  • readdir_r: reentrant (thread-safe) variant of readdir(), which stores the entry in a buffer supplied by the caller. Newer glibc versions mark it as deprecated and recommend readdir()

  • seekdir: moves the position used by readdir() to a location previously returned by telldir()

  • rewinddir: moves the position used by readdir() back to the first element

  • telldir: returns the current position used by readdir()

  • closedir: closes the directory. (The number of file and directory descriptors is finite.)

V.3.8. sys/stat.h

This header file contains functions and constants related to the state of files.

  • struct stat: the structure into which the state of a file is read. The members of the structure:

    • st_dev: the device where the file is

    • st_ino: the inode of the file

    • st_mode: the access rights of the file and its type. The type can be tested with macros: S_ISREG: ordinary file, S_ISDIR: directory.

    • st_nlink: the number of hard links (that is, file names) belonging to the file

    • st_uid: the user id of the owner of the file

    • st_gid: the group id of the owner of the file

    • st_rdev: the id of the device if the file is not a regular file but a device file

    • st_size: the size of the file in bytes

    • st_blksize: the preferred block size for I/O operations on the file

    • st_blocks: the number of blocks allocated for the file

    • st_atime: time of the latest access

    • st_mtime: time of the latest content modification

    • st_ctime: time of the latest state modification (e.g. change of owner or access rights)

  • stat: a function that stores the state of a file (the name of which is given) in the struct stat its pointer argument points to. It returns 0 on success and -1 on error. If the file is a symbolic link, stat returns the parameters of the file the link points to

  • fstat: like stat(), but it does not take a file name argument but the descriptor of the opened file

  • lstat: like stat(), but it does not return the state of the original file but the link.

V.3.9. unistd.h

The file named unistd.h contains constants and functions related to program execution and management under standard Unix-type systems. Only some of them are treated here:

  • chdir: changes the current directory

  • chown: changes the owner of a file

  • close: closes a file using its descriptor

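  • crypt: encrypts (hashes) a password

  • dup: duplicates an open file descriptor

  • execl/execle/execv: replaces the program run by the current process with a new program

  • _exit: exits the program immediately, without calling the functions registered with atexit

  • fork: creates a subprocess (child) as a copy of the calling parent process, including its environment. Both processes continue running after the call: fork returns 0 in the child and the process id of the child in the parent. The parent does not wait for the child automatically; it can do so with wait() or waitpid() from sys/wait.h

  • link: creates a hard link to a file (symbolic links are created with symlink)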

  • lseek: moves the file pointer

  • nice: changes the nice value of a process

  • pipe: creates a pipeline for data

  • read: reads from a file using its descriptor

  • rmdir: deletes a directory

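  • setuid: sets the user id of the current process

  • sleep: waits for the given number of seconds

  • sync: writes the buffers of the file system to the storage device

  • unlink: removes a file name; the file itself is deleted when no more names and open descriptors refer to it

  • write: writes to a file using its descriptor.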