CPP, the C/C++ Preprocessor

CPP is the C & C++ Preprocessor. When you compile C or C++, instead of just reading the file from disk, the compiler asks CPP to read the file. CPP is designed to operate on a stream of text: it reads one line at a time, does whatever operations it needs to do, and then writes that line out to the compiler. To CPP, the lines below the current line are the future, and it has no clue about what will happen in the future (until it gets there).

Most preprocessor things are done with preprocessor statements, which are lines which start with a #. These lines tell the preprocessor what to do, and are not part of the output that is sent to the compiler.

Comments

When CPP encounters a comment, it removes it.
However, it keeps the newlines so that line numbers will still be accurate.

Comments come in two different forms. The first one begins with /* and ends with */. The other begins with // and ends at the end of the line.

// This is a comment (which cannot be longer than one line)
// (unless you have another comment right below...)

/* this is also a comment
    but this comment continues
    over several lines */

#if / #endif

#if works a lot like a comment, but it depends on the value of a condition. In the examples below, the comment and the #if 0 block output nothing, while the contents of the #if 1 block are kept:

/*
   this will get removed
*/
#if 0
   this will also get removed
#endif
#if 1
   this will not be removed
#endif
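
One practical use of #if 0 is switching off code that already contains /* */ comments, since those comments do not nest. The snippet below is only a sketch; cups_brewed is a made-up function used to show which lines survive preprocessing.

```cpp
// Hypothetical function, only here to show which lines reach the compiler.
int cups_brewed() {
  int cups = 0;
#if 0
  cups += 100;  /* the compiler never sees this line,
                   and the comment inside it causes no trouble */
#endif
#if 1
  cups += 1;    // this line is kept
#endif
  return cups;
}
```

Calling cups_brewed() returns 1, because only the line inside #if 1 is left after preprocessing.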

Expressions

After #if you can put an expression, which gets evaluated using integer (whole-number) math. You can use the following operators: + - * / % ~ ! ^ & | << >> && || == != < > <= >= ( ) If the result is 0, the code inside the #if is removed; if it is anything else, it is kept.

Expressions can also contain defined(SOME_DEFINE), which evaluates to 1 if SOME_DEFINE is defined and 0 otherwise.
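
Here is a small sketch of an expression in action. The names TEA_BAGS, CUPS_PER_BAG and NOT_DEFINED_ANYWHERE are invented for this example. It also shows one detail worth knowing: a name that is not defined at all counts as 0 inside an #if expression.

```cpp
#define TEA_BAGS 3
#define CUPS_PER_BAG 2

// 3 * 2 = 6, and 6 >= 5, so the first branch is kept.
#if defined(TEA_BAGS) && (TEA_BAGS * CUPS_PER_BAG) >= 5
constexpr bool kEnoughTea = true;
#else
constexpr bool kEnoughTea = false;
#endif

// NOT_DEFINED_ANYWHERE is never defined, so it is treated as 0 here.
#if NOT_DEFINED_ANYWHERE + 1 == 1
constexpr bool kUndefinedIsZero = true;
#else
constexpr bool kUndefinedIsZero = false;
#endif
```

Both constants end up true. Because undefined names quietly become 0, a misspelled define in an #if will not cause an error by itself.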

#define

This creates a new define. Defines work like a dictionary, translating one word into something else. It’s basically a search-and-replace for the source code.

Defines can be used in different ways. Sometimes all we care about is if the define exists or not, like this:

#define  HAVE_TEA
#if defined(HAVE_TEA)
   drink_tea();
#endif

In this case HAVE_TEA is added to the dictionary, but it doesn’t expand to anything. It also doesn’t matter what it expands to, because the #if only checks whether it’s in the dictionary or not.

#define TEA_FLAVOR "Earl Grey"
   drink_tea(TEA_FLAVOR);

In this example, TEA_FLAVOR expands to "Earl Grey", so the final output becomes:

  drink_tea("Earl Grey");
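
Because the replacement is pure text, a define that expands to an expression can interact with the code around it in surprising ways. The sketch below uses made-up names (CUPS_PLAIN, CUPS_WRAPPED) to show why such defines are often wrapped in parentheses.

```cpp
#define CUPS_PLAIN 2 + 3
#define CUPS_WRAPPED (2 + 3)

// Expands to 2 + 3 * 2, which is 8 because * binds tighter than +.
constexpr int kPlainTimesTwo = CUPS_PLAIN * 2;

// Expands to (2 + 3) * 2, which is 10.
constexpr int kWrappedTimesTwo = CUPS_WRAPPED * 2;
```

The preprocessor knows nothing about operator precedence; it just pastes the text in, and the compiler sees the result.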

Defines can also have arguments, which makes them look a lot like function calls. However, they are quite different: define expansion happens before compilation and returns code, while function calls happen when the code is eventually executed, and they return data, not code.

#define drink_tea(FLAVOR) boil(); steep(FLAVOR); drink();

  drink_tea("Earl Grey");

The output would be:

   boil(); steep("Earl Grey"); drink();

Of course, boil, steep and drink could themselves be defines, in which case they would be expanded further.
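
One thing to watch out for with a define that expands to several statements is what happens after an if. The sketch below is an illustration, not ProffieOS code: boil, steep and drink are stand-ins that just record that they ran, and the do { ... } while (0) wrapper is a common convention for making the expansion behave like a single statement.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the functions above; they only log that they ran.
std::vector<std::string> tea_log;
void boil() { tea_log.push_back("boil"); }
void steep(const char* flavor) { tea_log.push_back(std::string("steep ") + flavor); }
void drink() { tea_log.push_back("drink"); }

// Same shape as the define above: three separate statements.
#define DRINK_TEA_PLAIN(FLAVOR) boil(); steep(FLAVOR); drink();
// Wrapped so the whole expansion acts as one statement.
#define DRINK_TEA_WRAPPED(FLAVOR) do { boil(); steep(FLAVOR); drink(); } while (0)

std::vector<std::string> drink_plain_if(bool thirsty) {
  tea_log.clear();
  if (thirsty) DRINK_TEA_PLAIN("Earl Grey")  // only boil() is guarded by the if
  return tea_log;
}

std::vector<std::string> drink_wrapped_if(bool thirsty) {
  tea_log.clear();
  if (thirsty) DRINK_TEA_WRAPPED("Earl Grey");  // all three calls are guarded
  return tea_log;
}
```

With thirsty set to false, drink_plain_if still steeps and drinks (two log entries), while drink_wrapped_if does nothing at all.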

Sometimes people use capital letters for defines to separate them from variables, function calls and other things that happen later. This is merely convention; neither the preprocessor nor the compiler gives any special significance to capital letters.

Note that defines start working when they are defined, and only affect lines after that define.

  drink_tea();
#define drink_tea drink_coke
  drink_tea();

This would expand to:

  drink_tea();
  drink_coke();
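
The same idea as a compilable sketch. The return values 1 and 2 are arbitrary markers so we can tell which function was actually called, and the #undef at the end (which removes a define again) is only there so the define does not leak into later code.

```cpp
int drink_tea() { return 1; }   // arbitrary marker value
int drink_coke() { return 2; }  // arbitrary marker value

// The define does not exist yet, so this really calls drink_tea().
int before_define() { return drink_tea(); }

#define drink_tea drink_coke

// From here on, drink_tea is replaced, so this calls drink_coke().
int after_define() { return drink_tea(); }

#undef drink_tea  // remove the define again
```

before_define() returns 1 and after_define() returns 2, even though both functions look identical in the source.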

#else

Between #if and #endif, you can use an #else to invert the condition.

#if 0
This will be removed
#else
This will not be removed
#endif

#elif

#elif is short for else if. These two examples do the same thing:

#if 0
removed
#elif 1
not removed
#else
also removed
#endif

#if 0
removed
#else
#if 1
not removed
#else
also removed
#endif
#endif
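
A slightly more realistic sketch of #if, #elif and #else picking one of several alternatives. TEA_KIND and its meanings are invented for this example.

```cpp
#define TEA_KIND 2  // hypothetical: 1 = green, 2 = black, anything else = herbal

#if TEA_KIND == 1
constexpr int kSteepMinutes = 2;
#elif TEA_KIND == 2
constexpr int kSteepMinutes = 4;
#else
constexpr int kSteepMinutes = 6;
#endif
```

Only one of the three definitions reaches the compiler, so there is no conflict between them. Here kSteepMinutes ends up as 4.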

#ifdef, #ifndef

#ifdef is short for “if defined” and #ifndef is short for “if not defined”.

// These two are the same
#ifdef HAVE_TEA
#if defined(HAVE_TEA)

// These two are the same.
#ifndef HAVE_TEA
#if !defined(HAVE_TEA)

(In case you didn’t know, ! means “not” in C/C++.)
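
A common use of #ifndef is to supply a default value for something that may or may not have been defined earlier, for instance in a config file. This is only a sketch; TEA_CUPS and its fallback value are made up.

```cpp
// A config file could define TEA_CUPS before this point; here nothing does.
#ifndef TEA_CUPS
#define TEA_CUPS 2  // fallback value, chosen arbitrarily for this example
#endif

// By now TEA_CUPS is always defined, one way or the other.
#ifdef TEA_CUPS
constexpr int kCups = TEA_CUPS;
#endif
```

If TEA_CUPS had been defined as something else before the #ifndef, the fallback would have been skipped and kCups would have picked up that value instead.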

#include

#include reads another file and inserts it at the current location. The file to include can be specified in three different ways:

// This syntax is used to include "system" files.
// The preprocessor has a list of directories
// where it will look for the file.
#include <some_file.h>

// This is used to include "local" files, which are
// part of the program that is being compiled.
// The path is relative to the current file.
#include "some_file.h"

// The file name can also be specified with a define, like
#define SOME_FILE "some_file.h"
#include SOME_FILE

ProffieOS uses the third option above to include the config file.
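
Since #include simply pastes the file in, including the same file twice would define everything in it twice. The usual convention to avoid that is an "include guard" built from #ifndef and #define. The sketch below fakes it in a single file by repeating the contents of a hypothetical tea.h, as if it had been included two times.

```cpp
// ---- first copy of the hypothetical header tea.h ----
#ifndef TEA_H
#define TEA_H
int cups_of_tea() { return 2; }
#endif

// ---- second copy, as if tea.h had been included again ----
#ifndef TEA_H
#define TEA_H
int cups_of_tea() { return 2; }  // removed: TEA_H is already defined
#endif
```

Without the guard, the compiler would complain that cups_of_tea is defined twice. With it, the second copy is removed before the compiler ever sees it.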

#error, #warning

These two let you output a message. For instance, you could do something like:

#if defined(HAVE_TEA)  && defined(HAVE_NO_TEA)
#error Make up your mind!
#endif

The only difference between #error and #warning is that #error will prevent the program from compiling successfully, while #warning will continue compiling after outputting the message.
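
These are handy for sanity-checking configuration values. A sketch, assuming a made-up config value TEA_TEMPERATURE_C and the #warning spelling accepted by GCC and Clang:

```cpp
#define TEA_TEMPERATURE_C 95  // hypothetical config value

#if TEA_TEMPERATURE_C > 100
#error TEA_TEMPERATURE_C is above boiling, check the config.
#elif TEA_TEMPERATURE_C < 60
#warning TEA_TEMPERATURE_C is quite low, the tea may be weak.
#endif

constexpr int kTeaTemperatureC = TEA_TEMPERATURE_C;
```

With the value 95, neither message is triggered and the code compiles normally. Changing it to 120 would stop the build with the #error text.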

Stringification

The preprocessor also has a way to turn an argument to a function-like define into a string, like this:

#define CHECK(X) if (!X) error(#X " is not true")
   CHECK(have_tea);

The output of this would be if (!have_tea) error("have_tea" " is not true"); (Note that C/C++ will concatenate two string literals next to each other, so "A" "B" is the same as "AB".)
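
One of the subtle parts of # is that it turns the argument into a string exactly as written, without expanding it first. If you want the value of another define as a string, a common trick is to go through one extra define. In this sketch, TEA_CUPS, STRINGIFY and STRINGIFY_VALUE are all names invented for the example.

```cpp
#include <string>

#define TEA_CUPS 2
#define STRINGIFY(X) #X
#define STRINGIFY_VALUE(X) STRINGIFY(X)  // X is expanded before STRINGIFY sees it

// The argument is stringified as written.
std::string name_as_text() { return STRINGIFY(TEA_CUPS); }

// The extra step lets TEA_CUPS expand to 2 first.
std::string value_as_text() { return STRINGIFY_VALUE(TEA_CUPS); }
```

name_as_text() returns "TEA_CUPS", while value_as_text() returns "2".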

Concatenation

The preprocessor also has ways to concatenate tokens. This is often useful to build function or variable names.

#define CONCAT(A, B)  A##B
  CONCAT(pole, vault)

This would output polevault.
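
As a sketch of building function names this way, here is a define that picks a function based on its argument. The brew_green and brew_black functions and their return values are made up for illustration.

```cpp
// Hypothetical functions; the return values are arbitrary temperatures.
int brew_green() { return 80; }
int brew_black() { return 100; }

// BREW(green) becomes brew_green(), BREW(black) becomes brew_black().
#define BREW(KIND) brew_##KIND()

int green_temperature() { return BREW(green); }
int black_temperature() { return BREW(black); }
```

Note that green and black are never variables or strings here; they are just pieces of text that get glued onto brew_ before the compiler sees anything.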

#line

#line is special and is not often used in code. Instead, it is output by the preprocessor when there is a change in line numbers. This helps the compiler tell you where an error actually occurred. Example:

Some stuff
#include "some_file.h"
More stuff

This would actually output:

Some stuff
#line 1 "some_file.h"
The contents of some_file.h
#line 3 "whatever your file is called..."
More stuff

This way, the compiler will know that More stuff came from line 3 in your file.
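
You can see the effect of #line yourself with the standard __LINE__ and __FILE__ defines, which expand to the line number and file name the preprocessor currently believes it is at. A small sketch, with a made-up file name:

```cpp
#include <string>

#line 100 "pretend_file.h"
int reported_line() { return __LINE__; }
std::string reported_file() { return __FILE__; }
```

The line right after the #line directive is treated as line 100 of pretend_file.h, so reported_line() returns 100 and reported_file() returns "pretend_file.h", no matter where this code really lives.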

System defines

Before reading your file, the preprocessor will add a bunch of defines which depend on the compiler, the system, the processor and other useful things. These defines usually start with an _ to mark them as special, but not always. Examples include, but are not limited to: __GNUC__, __arm__ and __linux__.
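
A sketch of how these defines are typically used to adapt code to the compiler. The function name compiler_name is made up. Clang is checked first because it also defines __GNUC__ for compatibility.

```cpp
// Returns a short name for the compiler, based on its predefined defines.
const char* compiler_name() {
#if defined(__clang__)
  return "clang";
#elif defined(__GNUC__)
  return "gcc";
#elif defined(_MSC_VER)
  return "msvc";
#else
  return "unknown";
#endif
}
```

Only one of the return statements survives preprocessing, so the function body the compiler sees is a single line.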

Surprisingly, this covers almost everything the preprocessor can do. There is more to learn, of course: some things interact in subtle ways, and knowing the right way to use the preprocessor is a bit of an art form. However, this article should hopefully cover enough to understand how we use the preprocessor in ProffieOS.

I hate the preprocessor so much… I really do. :sweat_smile:

It’s useful, but it’s so ugly. I hate it, and yet I couldn’t do a lot without it!!

It’s far too useful to be ugly IMHO.
I wish it could do more.
Sometimes I think C++ templates would have been better if they were implemented in the preprocessor.

Sometimes I miss the preprocessor when working in other languages, like Java.

There are plenty of things I don’t like about templates… IMO besides for very simple uses I really don’t like using them without Concepts, but I much prefer them to macros.

They can be ugly and useful at the same time :laughing:

Agreed.