How to read a C++ file

profezzorn · September 17, 2024, 3:17am

Before you can read a C++ file, you need to have some basic understanding of the preprocessor. Sometimes, it can make things hard to read, but if you understand what it’s doing, you should be able to puzzle it out.

So, if you haven’t already, go read this post…

Once we can see past what the preprocessor is doing, we can read what is in a C++ file:

variables
functions
namespaces and “using” statements
static asserts
type declarations

At this point you may think I’m pulling your leg… 5 things, is that it? Well, yes, it is, but each of these hides a lot of stuff. Also, functions and type declarations can be templated, which we’ll talk about later. For now, let’s dive into these 5 things.

Variables

A variable is a memory location which can hold a value of some kind. It is also given a name so that you can refer to it easily and get or set its value. Generally speaking, variables in C++ look like “TYPE NAME;”. Here are a few examples:

int apples;
int64_t pears;
float orange_juice_litres;

Here, int, int64_t and float are types. Types specify how much memory is needed for the variable, and what you can do with it. “int” means whole numbers, “float” can have decimal numbers, etc.
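
To make the “how much memory” part concrete, here is a small sketch using sizeof. The function names are made up for this example, and the sizes of int and float depend on your compiler and chip, so treat the numbers in the comments as typical rather than guaranteed.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper names, made up for this example.
// sizeof(TYPE) tells you how many bytes a variable of that type needs.
std::size_t bytes_for_int()   { return sizeof(int); }      // commonly 4, but it depends on the platform
std::size_t bytes_for_int64() { return sizeof(int64_t); }  // 64 bits, so 8 bytes when a byte is 8 bits
std::size_t bytes_for_float() { return sizeof(float); }    // commonly 4
```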

Variables can also be initialized to a value, like this:

int apples = 10;
int64_t pears = 100;
float orange_juice_litres = 1000.0;

Unfortunately, C++ allows for some variable declarations that are harder to read, like:

int *ptr;         // pointer to integer
int a, b, c, d;  // four variables: a, b, c, d
int arr[33];    // array of 33 integers

// function pointer pointing to function that takes
// three integers as input and returns integer
int (*funcptr)(int, int, int);

Function pointers are particularly troublesome, but luckily, well-written code doesn’t usually contain these types of variables, because you can use a typedef (discussed below) to make things simpler.

What’s important to know is that when a variable appears in the file, and not inside a function, struct or class, it is a global variable, which means that its memory is accessible from the whole file. The memory is already allocated when the program starts, and it doesn’t go away until the program ends.
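
Here is a minimal sketch of the difference between a global variable and one that lives inside a function. The names visitor_count, count_visitor and scratch are made up for this example.

```cpp
// Hypothetical names, made up for this example.
int visitor_count = 0;  // global: exists from program start to program end

void count_visitor() {
  int scratch = 1;  // not global: only exists while this function is running
  visitor_count = visitor_count + scratch;
}
```

Each call to count_visitor() gets a fresh scratch, but visitor_count keeps its value between calls.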

Functions

Functions are bits of code. They take a number of input values, do some things and return a single output value. The pattern for functions looks like this:

TYPE NAME ( ARGUMENT LIST ) {
  CODE;
}

Here are a few examples:

void hello_world() {
  printf("hello world\n");
}
int add_one(int x) { return x + 1; }
int biggest(int a, int b, int c) {
  if (a >= b && a >= c) return a;
  if (b >= c) return b;
  return c;
}

The first function (called hello_world) takes no arguments. add_one takes one argument and biggest takes three. All the arguments need to be of type “int”.
Obviously there is a lot of stuff that can go inside a function, like if statements, function calls and things like “return”. For now, we’re just going to call all of that stuff CODE and get to it later.
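
As a small taste of what a function call looks like, here is a sketch that feeds the output of one function into another. add_one and biggest are repeated from above so the snippet stands on its own; biggest_plus_one is a made-up name for this example.

```cpp
int add_one(int x) { return x + 1; }

int biggest(int a, int b, int c) {
  if (a >= b && a >= c) return a;
  if (b >= c) return b;
  return c;
}

// Hypothetical function: the value returned by biggest()
// becomes the input value of add_one().
int biggest_plus_one(int a, int b, int c) {
  return add_one(biggest(a, b, c));
}
```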

Namespaces and “using” statements

Namespaces are a way to organize code and prevent name collisions. Usually C++ programs are made of lots of files, and if two of them have variables with the same name, it can cause a lot of problems. Namespaces help avoid that. Namespaces come in two flavors: named and anonymous. Here is what they look like:

namespace MyNameSpace {
  // this variable will be called MyNameSpace::x
  // outside of the namespace.
  int x = 0;
}
namespace {
  // this variable has no name outside of the namespace
  int x = 0;
  int count() { x = x + 1; return x; }
}

Using statements allow you to access variables inside of a namespace without using the full name each time, like:

// Without using, we have to use the full name.
void decrement_x() { MyNameSpace::x = MyNameSpace::x - 1; }

// With using, we can use a shorter name.
using namespace MyNameSpace;
void increment_x() { x = x + 1; }

I’m here to tell you though, that while it’s tempting to use “using” statements a lot to make the code shorter, it often makes the code harder to read because it can be harder to tell where things came from. Lots of code on the internet tends to start with “using namespace std;” but in my code I generally write out the “std::string” and “std::vector” instead. As usual, tastes will vary.
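
Here is a sketch of namespaces doing their job: two variables with the same name that don’t collide. The Kitchen and Garage names are made up for this example. It also shows one way to keep a “using” from spreading too far, which is to put it inside a function.

```cpp
// Hypothetical names, made up for this example.
namespace Kitchen { int spoons = 3; }
namespace Garage  { int spoons = 7; }  // same name, no collision

// Full names make it obvious where each value comes from.
int all_spoons() { return Kitchen::spoons + Garage::spoons; }

int kitchen_spoons() {
  using namespace Kitchen;  // only in effect until the end of this function
  return spoons;            // means Kitchen::spoons here
}
```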

Static asserts

These are less common, but they are basically a way to create an error if some compile-time condition isn’t met. Examples:

// static_assert(CONDITION, MESSAGE);
static_assert(NUM_BLADES >= 0, 
              "NUM_BLADES cannot be negative");

Obviously only expressions that can be evaluated at compile time can be used this way. Similar things can be done with #if and #error, but static_assert can calculate more things, like the size of a type, number of elements in an array or if two types are the same.
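
Here is a sketch of those three checks. NUM_BUTTONS, button_pins and PinNumber are made up for this example, and std::is_same comes from the standard <type_traits> header.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical values, made up for this example.
const std::size_t NUM_BUTTONS = 2;
const int button_pins[] = { 3, 4 };
using PinNumber = uint8_t;  // a type alias, see "type aliases" further down

// The size of a type.
static_assert(sizeof(int64_t) == 8, "int64_t should be 8 bytes");

// The number of elements in an array.
static_assert(sizeof(button_pins) / sizeof(button_pins[0]) == NUM_BUTTONS,
              "need exactly one pin per button");

// Whether two types are the same.
static_assert(std::is_same<PinNumber, uint8_t>::value,
              "PinNumber should be an alias for uint8_t");
```

If any of these conditions were false, the compiler would stop and print the message.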

Type declarations

Ok, this is the big one. A lot of C++ revolves around types. There are basically three ways to declare a type:

enums
type aliases (typedef / using)
struct / class

Perhaps the most important part to know is that type declarations do not do anything by themselves. They allocate no memory and don’t run any code. Not until you create a variable or value of that type is memory allocated, and then the code inside can operate on that memory.

enums

Enums are fairly simple. They are a way to create a type that has a set of possible values, and give names to those values. Here is an example:

enum CoinFlip {
   HEADS,
   TAILS,
   LANDED_ON_EDGE,
   COIN_LOST
};

By default, C++ will make an enum that is compatible with an “int”, so it’s ok to write “int coinflip = TAILS;”. Values assigned to the names in an enum generally start with 0 and are increased by one each time, so HEADS=0, TAILS=1, LANDED_ON_EDGE=2 and COIN_LOST=3. However, you can also set the values explicitly like this:

enum CoinFlip {
   HEADS = 1,
   TAILS = 2,
   LANDED_ON_EDGE = 1000,
   COIN_LOST = 100
};
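
Since enum values are known at compile time, a static_assert is a handy way to double-check the numbering. This is just a sketch: the Mixed enum and the as_int function are made up for this example, and show that counting continues from whatever explicit value came last.

```cpp
enum CoinFlip { HEADS, TAILS, LANDED_ON_EDGE, COIN_LOST };

// Hypothetical enum: after an explicit value, counting continues from there.
enum Mixed { FIRST = 5, SECOND, THIRD = 20, FOURTH };

static_assert(HEADS == 0 && TAILS == 1, "default values start at 0");
static_assert(COIN_LOST == 3, "and go up by one each time");
static_assert(SECOND == 6 && FOURTH == 21, "counting continues after explicit values");

int as_int() {
  int coinflip = TAILS;  // a plain enum converts to int on its own
  return coinflip;
}
```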

Having enums be compatible with integers can be handy, but it can also lead to programming errors where the wrong value is assigned to a variable of type CoinFlip. C++ has a way to make an enum that isn’t compatible with an int:

enum class CoinFlip {
   HEADS,
   TAILS,
   LANDED_ON_EDGE,
   COIN_LOST
};
int x = HEADS; // does not work.
CoinFlip y = 100; // does not work.
CoinFlip z = CoinFlip::TAILS; // does work
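
If you really do need the number behind an “enum class” value, you have to ask for it explicitly. Here is a sketch using static_cast; the helper names to_number and is_heads are made up for this example.

```cpp
enum class CoinFlip {
   HEADS,
   TAILS,
   LANDED_ON_EDGE,
   COIN_LOST
};

// Hypothetical helpers, made up for this example.
int to_number(CoinFlip flip) {
  return static_cast<int>(flip);  // the conversion has to be spelled out
}

bool is_heads(CoinFlip flip) {
  return flip == CoinFlip::HEADS;  // comparing two CoinFlip values works fine
}
```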

type aliases

There are basically two ways to make a type alias. One is called “typedef” and the other one is with the “using” keyword.

// These two do the same thing.
typedef int MyIntegerAlias;
using MyIntegerAlias = int;

The “typedef” uses the same syntax as variable definitions, which means that all the horrible things you can do with a variable definition apply here too. Personally I think “using” is a lot clearer. However, “typedef” has been around longer, so older code generally uses “typedef”.

Type aliases might seem like a simple thing, but they can be quite powerful when used correctly in code, especially when you use them together with templates (more about that below).
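
Remember the hard-to-read function pointer from the variables section? Here is a sketch of how an alias cleans that up. All the names here (ThreeIntFuncOld, ThreeIntFunc, sum3, call_it) are made up for this example.

```cpp
// Hypothetical names, made up for this example.
// Both lines name the same type: "pointer to a function that takes
// three integers and returns an integer".
typedef int (*ThreeIntFuncOld)(int, int, int);  // typedef form
using ThreeIntFunc = int (*)(int, int, int);    // "using" form

int sum3(int a, int b, int c) { return a + b + c; }

// With the alias, the variable looks like "TYPE NAME" again.
ThreeIntFunc funcptr = sum3;

int call_it() { return funcptr(1, 2, 3); }
```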

struct / class

struct is short for “structure”. In C, structures can only contain variables, kind of like this:

struct Box {
  float width;
  float height;
  float depth;
};

If we made a variable of type Box, we would need enough memory for three floats, and those floats can then be read or written individually, like so:

  Box box;
  box.width = 8.0;
  box.height = 11.0;
  box.depth = 5.0;

Now a class is a struct that can also contain code, like so:

class Box {
  // this makes things accessible from outside the class
  public:
    float width;
    float height;
    float depth;

    float getVolume() { return width * height * depth; }
};

float example() {
  Box box;
  box.width = 8.0;
  box.height = 11.0;
  box.depth = 5.0;
  float volume = box.getVolume();
  return volume;
}

A function which lives inside a class, like “getVolume” above, is called a “method”. In C++ structs can also contain code, and the only difference between a struct and a class is that by default, all variables and code written inside a struct are accessible from the outside, while in a class, nothing is accessible from the outside by default. (Which is why I added “public” in the class example above.)
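
Here is a sketch of that default-access difference. OpenBox, ClosedBox and access_example are made-up names for this example.

```cpp
// Hypothetical types, made up for this example.
struct OpenBox {
  float width;  // accessible from outside by default
};

class ClosedBox {
  float width;  // NOT accessible from outside by default
 public:
  void setWidth(float w) { width = w; }
  float getWidth() { return width; }
};

float access_example() {
  OpenBox a;
  a.width = 1.0;  // fine

  ClosedBox b;
  // b.width = 2.0;  // would not compile: width is private
  b.setWidth(2.0);   // we have to go through the public methods instead

  return a.width + b.getWidth();
}
```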

Classes and structs can contain variables, functions (methods), types and enums. The really cool part is that you can create as many instances of a class as you like, and the code inside the class always knows which instance it belongs to. For example:

void example() {
  Box a; a.width=2.0; a.height=2.0; a.depth=2.0;
  Box b; b.width=3.0; b.height=3.0; b.depth=3.0;
  Box c; c.width=1.0; c.height=2.0; c.depth=3.0;
  float v1 = a.getVolume(); // this will return 8.0
  float v2 = b.getVolume(); // this will return 27.0
  float v3 = c.getVolume(); // this will return 6.0
}

Basically all the power of object-oriented programming comes from this trick of keeping track of which instance we are working with right now.
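
Under the hood, every method gets a hidden pointer called “this”, which points at the instance the method was called on. You rarely have to write it out, but here is a sketch that does, just to make the trick visible. The Counter class and counter_example function are made up for this example.

```cpp
// Hypothetical class, made up for this example.
class Counter {
 public:
  int value = 0;

  // Writing "value" alone would mean exactly the same thing as "this->value".
  void increment() { this->value = this->value + 1; }
};

int counter_example() {
  Counter a;
  Counter b;
  a.increment();  // inside this call, "this" points at a
  a.increment();
  b.increment();  // inside this call, "this" points at b
  return a.value * 10 + b.value;  // a.value is 2 and b.value is 1
}
```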

Templates

Templates allow you to write generic code which can handle multiple different types, without having to re-write the same code over and over again. There are a lot of complications with templates, so I’m not going to go over all of them, but I can at least describe the basics.

Templates can apply to functions, structs, classes and type aliases created with the “using” statement. Here are some examples:

template<typename T>
T max(T a, T b) {
  if (a > b) return a;
  return b;
}

// max<int>(1, 2)
// max<float>(3.14, 7.5);
// max<std::string>("a", "b")

template<class VALUE_TYPE>
struct Vector2d {
   VALUE_TYPE x;
   VALUE_TYPE y;
   VALUE_TYPE length() { return sqrt(x * x + y * y); }
};

// Vector2d<float> tmp;
// Vector2d<double> tmp;

Note that in some cases the compiler can determine the template types automatically, like if you call max(5,6). Since both arguments are of type “int”, the compiler will use max<int> automatically.
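
Here is a sketch of when the compiler can and can’t figure out the type on its own. I’ve called the function larger here, a made-up name, just to stay clear of the max that the standard library already has.

```cpp
// Hypothetical name: same idea as the max template above.
template<typename T>
T larger(T a, T b) {
  if (a > b) return a;
  return b;
}

// Both arguments are int, so the compiler picks larger<int> by itself.
int deduced_int() { return larger(5, 6); }

// larger(5, 6.5) would not compile: one argument says T is int and the
// other says T is double. Spelling out the type settles it.
double explicit_double() { return larger<double>(5, 6.5); }
```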

One very important thing that you can do with templates is called “template specialization”. This allows you to have a generic version, but special implementations for some types, like:

// This is used if T is anything but a float.
template<typename T>
T abs(T x) {
   if (x < 0) return -x;
   return x;
}

// This is used if you use abs<float>
template<>
float abs<float>(float x) {
  return fabs(x);
}

This sort of thing kind of works like an if-statement for types, which allows templates to do a lot of complicated pre-processing. ProffieOS does a fair amount of this sort of thing, so be prepared for that.
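
To show what I mean by “if-statement for types”, here is a minimal sketch using a specialized struct. IsFloat is a made-up name for this example and is not taken from ProffieOS; it just answers the question “is T a float?” at compile time.

```cpp
// Hypothetical template, made up for this example.
// Generic version: used for every type that has no specialization.
template<typename T>
struct IsFloat {
  static const bool value = false;
};

// Specialization: used only when T is float.
template<>
struct IsFloat<float> {
  static const bool value = true;
};

// The answers are known at compile time, so static_assert can check them.
static_assert(!IsFloat<int>::value, "int is not a float");
static_assert(IsFloat<float>::value, "float is a float");
```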

CODE

Ok, but none of this stuff does anything, what about the CODE???
Patience young padawan, patience… The stuff that goes into functions and methods will be described in an upcoming post.