C++ Play - my own short guide to C++11/14

C++ has evolved dramatically in the past years. It is virtually a new language, a huge improvement over the C++ I used to practice in the 2000s. Much more expressive templates, richer libraries, more emphasis on compile-time safety and performance improvements. Far from the C with classes in the early 2000s and far from the convoluted template constructs which appeared later to improve compile-time issue detection and type safety. To refresh my memory and also get up to speed with the new trends, I have spent a couple of hours digging through the docs, doing some tutorials. Here are my notes.

Improvements to classes - inline initializers for class members:

class Test{

	int x = 0; // new in C++
	int y = 0;

	std::string test_str = "Hello World";
}

Smart pointers in the standard library

In the 2000s you really had three choices for memory management in the C++ world:

Manual (new / delete) with clear rules for owership in the “style-guide” of the project.
The Boost library’s shared_ptr and weak_ptr for shared ownership or std::auto_ptr, the only smart pointer from the STD library, now deprecated.
Custom-built smart pointers, autolists or any other custom constructs.

Now smart pointers are included in the standard library and they are the preferred way of writing C++ code.

#include <memory>

// ...

class Test {
// ...
public:
	Test(int x, int y, const std::string& s) { /*...*/ }

}

void test_shared_ptr() {

	// the recommended way to initialize a smart pointer because the ref-counter is
	// allocated in the same call with the memory for Test
	// => less memory fragmentation and better locality
	shared_ptr<Test> pTest = make_shared<Test>(10, 10, "Hello World");
	weak_ptr<Test> pwTest = pTest;
}

Another smart pointer construct is the unique_ptr< >, which replaces the now deprecated auto_ptr . I think it is worth considering using it whenever possible instead of boost::shared_ptr<> and boost::weak_ptr<>. I think it sends a stronger signal about ownership intent when used in new code.

unique_ptr<Test> source_unique_ptr() {
	return make_unique<Test>();
}

// t is received by value. In this case, this function
// can only be called using std::move  on the pointer
void test_sink(unique_ptr<Test> t) {
	cout << t->x << endl;
}

void test_func_ref_unique_ptr(unique_ptr<Test>& t) {
	t = make_unique<Test>(); // destructor will be called here
							 // for the object sent by parameter
	t->x = 5;
}

void test_smart_pointers() {
	auto pt = source_unique_ptr();
	test_func_ref_unique_ptr(pt);
	test_sink(move(pt)); // pt is now empty!
}

Note: unique_ptr<> cannot be sent by value - it will result in a compilation error. If the receiving function accepts the unique_ptr<> by value as parameter, it can only be passed using the std::move function, which will nullify the pointer in the calling function. This is called the sink pattern and it is exemplified in the test_sync function above. The copy constructor for unique_ptr<> is explicitly deleted (later on this).

Improvements to collections

C++11 brings us unordered associative containers, a fixed size array class and a singly-linked list (std::forwad_list): http://www.cplusplus.com/reference/stl/.

Because of the new language features (move semantics, std::initializer_list and variadic templates), collections have become much more useful and expressive For instance, now you can actually consider storing a full instance in a collection instead of just a pointer, without worrying about endless copies and reallocations. This improves data locality and cache friendliness (provided that you have the appropriate move operations in place). For instance, old pattern in which we push pointers to std::vector<FullType*> is now similar in iteration speed with std::lists<FullType>::emplace_back. See below.

Here are some small examples:

void test_collections() {

	vector<unique_ptr<Test>> v_Test;

	// uses move semantics to fully change the ownership of the pointer to the collection.
	// provided that we had a full type with move constructor / move assignment,
	// on vector reallocation only the move operations would have been called.
	v_Test.push_back(make_unique<Test>()); 

	// initializer list for vector and list
	vector<int> v_Int = { 1, 2, 3, 4, 5 };

	list<Test> tst = {
		Test()
	};

	// new member function, based on variadic templates argument. 
	// No copy constructor, no move. 
	// Just construct the object at the end of the collection, invoking the right parameters
	// Better locality as the two pointers to back and forward from the list are stored next to the 
	// allocated memory for the Test object in this case.
	tst.emplace_back(4, 5, "Test test test"); 
}

void test_collections_priority_queue() {

	// deque does not do copies on increase; a little bit slower random access though
	priority_queue<Test, deque<Test> > tst; 

	tst.emplace(3, 0, "A");
	tst.emplace(1, 0, "B");
	tst.emplace(2, 0, "C");

	std::cout << "Popping out elements..." << endl;
	while (!tst.empty())
	{
		const Test& t = tst.top();
		cout << t.str.c_str() << endl;
		tst.pop();
	}
	std::cout << endl;
}

Lambdas

I think it is pretty obvious why lambdas were introduced. Before lamdas, you were bind either to pointer to functions or to functor objects, which were implemended “far” from the point of interest. This made the std-algorithms harder to use, due to code verbosity and scrolling back and forth in the source code. All in all, lambdas are mostly syntactic sugar, but a very useful one.

Returning a lambda from a function and storing a lambda in a variable

std::function<double()> getPIFunction() {
	static double PI = 3.1415;

	return []() { // because PI is declared as static, it does not need to be captured explicitly;
		return PI;
	};
}

//....

auto PI_func = getPIFunction();
cout << typeid(PI_func).name() << ": " << PI_func() << endl;
// output: class std::function<double __cdecl(void)>: 3.1415

Using lambdas in std::algorithms - passing them as arguments

vector<int> arr = { 1, 2, 3, 4, 5 };

for_each(arr.begin(), arr.end(), [] (int value) {
	cout << value << endl;
});

vector<int> cubes;

// back_inserter - very useful function
transform(arr.begin(), arr.end(), back_inserter(cubes), [](int n) { 
	return n * n * n;
});

for_each(cubes.begin(), cubes.end(), [](int  value) {
	cout << value << endl;
});

Improvements to lambdas starting with C++14

These include type deduction for lambda parameters - no need to send them explicitly, auto is enough and also initialization for capture parameters - useful when, for instance, you want to move an object when captured to lambda instead of just simply copy it by value, or apply any type of other transformation.

The code below uses another cool feature from C++ 11, the variadic templates. The template invoke simply invokes the functor it receives with the list of parameters it receives, all in a typesafe matter. This pattern can be applied when you want a template to store a function and then invoke it later, with different parameters.

Same for lambdas, the programmer might simply not care what the p1 and p2 types are - they are deduced by the compiler given the way the function is invoked - in this case p1 is an int and p2 is a const char*. Note though, this check is performed at compile time, so it is not at all a form of dynamic invocation.

template<typename Fn, typename... T> void invoke(const Fn& fn, const T&... params){
	fn(params...);
}

void test_lambdas_cxx14() {

	auto ptr = make_unique<Test>();

	// auto parameters for lambda, initialization for capture parameters (c++14)
	invoke( [ptr = move(ptr)] (auto p1, auto p2){ 

		cout << ptr->str.c_str() << " " << p1 << " " << p2 << endl;

	}, 2, "Hello World");

	// should be null because it was moved in the lambda initialization
	cout << "Value of ptr:" <<  ptr.get() << endl; 

}

A little bit of extra explanation.

int x = 1;
[x = x + 1](){
	cout << x << endl;
}();
cout << x << endl;

This code prints 2 followed by 1. The reason is that x in the lambda is a different variable than x in the outer function. The code above is identical to the following (also running code, although __x is not specifically declared anywhere)

int x = 1;
[__x = x + 1](){
	cout << __x << endl;
}();
cout << x << endl;

Lambdas can also be recursive

But in this case, they need to be explicily captured like in the code below. Please note the capture by reference. If an attempt is made to send the lambda by value the code compiles but you get a runtime error - the lambda has been declared but not defined yet.

void test_recursive_lambdas() {

	// a redundant implementation but serves the purpose :)
	std::function<int(int)> fibonacci = [&fibonacci](int n) -> int{

		if (n < 1)
			return -1;
		if (n == 1 || n == 2)
			return 1;

		return fibonacci(n - 1) + fibonacci(n - 2);
	};

	std::vector<int> v = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

	// range for; prefered to use int& in order to reference elements in v
	// range for is a new C++ 11 construct
	for (int &number : v) { 
		cout << fibonacci(number) << "; ";
	}

	cout << endl;
}

this can also be captured, like in the snipped below:

void CaptureThis::CaptureThis() {
	// "this" must be in the capture list for the following to work
	PrintSmth([this](const std::string& out) { 

	this->x++; // works
	this->y++; // works

	x++; // works
	y++; // works

	cout << this->x << " " << this->y << " " << out.c_str() << endl;

	}
	, "Hello World");
}

Changing values and mutable lambdas

	vector<int> numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

	int prev = 0;

	for_each(numbers.begin(), numbers.end(), [prev](int& value) mutable { 
		// value is sent by reference, so I can modify it in place
		// I send prev by value => prev does not modify in the upper scope BUT!
		// because I define the function as mutable, 
		// I can modify it in the scope of the lambda (in the generated class)
		prev = value = prev + value;
	});

	for_each(numbers.begin(), numbers.end(), [](int val) {
		cout << val << endl;
	});

The output of this program is:

Because the lambda was postfixed with the mutable keyword, it allows me to modify the captured values. However, since the capture was not done by reference, it is not modified in the outer scope. If we were to write in pseudo-c++, the generated lambdas are similar with the following classes:

By-value capture without the mutable postfix:

class Lambda{
	const Type t;
	operator() {
		// t cannot change
	}
}

By-value with the mutable postfix:

class Lambda{
	Type t;
	operator () {
		// t can change
	}
}

In the example above, since we declared the input parameter for the lambda as a reference (int&), it gives me direct write access to the vector content.

Decltype and Declval

Two usage scenarios:

We need a way to compute at compile-time the return type of a template function.

// starting with c++14 compiler can autodeduce return type
template<typename T1, typename T2, typename T3, typename T4> 
auto multiply(T1 t1, T2 t2, T3 t3, T4 t4)  /*-> decltype (t1 * t2 * t3 * t4) */ { 

	auto ret = t1 * t2 * t3 * t4;

	cout << typeid(ret).name() << endl;

	return ret;
}

//...

int x = 5;
double y = 5.4;
float z = 3.1f;
long w = 6;

cout << multiply(x, y, z, w) << endl;

We have a private constructor, but we still need the type of the class:

class A {
private: 
	A() {}
public:
	static shared_ptr<A> create() {
		return shared_ptr<A>(new A());
	}
};

//...

// declval can only be used where it will not be evaluated, 
//but only checked for type - like in decltype
cout << typeid(decltype(declval<A>())).name() << endl; 

Variadic templates (templates with a variable number of arguments)

I think this is one of the coolest features of C++11, as it allows very clean usage of libaries. In an example above, we saw the emplace_back method of std::collections which permits the caller to send the constructor parameters directly to the method, and the method inside does the allocation of the object. Here is a more complex example, which shows how to:

Create a template function with a variable number of parameters
Call another function for each parameter
Recursively extract each parameter and use it in a strongly typed manner
Use the library in a very natural way.

Let’s assume we want to write a library class which outputs CSV to a stream. We want to use it in a straight forward manner, like in the snippet below:

template<class Stream> auto createCSVPrinter(Stream& s) -> CSVPrinter<Stream> {
	return CSVPrinter<Stream>(s);
}

void test_variadic_templates() {
	
	// let's simply use a template function to deduce the type of cout
	auto csvPrinter = createCSVPrinter(cout); 

	// we are outputting a row or parameters of different types
	csvPrinter.print_row(5, 20, 32.5f, 22, "Hello World");
	csvPrinter.print_row(5, 20, 32.5f, 22, "Hello World");
	csvPrinter.print_row(5, 20, 32.5f, 22, "Hello World");
	csvPrinter.print_row(5, 20, 32.5f, 22, "Hello World");
}

The CSVPrinter class is clearly a template of type Stream which includes a method, print_row which receives a variable number of parameters of different types.

template<typename Stream> class CSVPrinter {

private:
	int row_count = 0;
	Stream* stream = nullptr;

	// called for each parameter
	template<typename T> const T& validate(const T& t) { 
		//cout << typeid(T).name() << endl;
		return t;
	}

	template<typename FirstColumn, typename... Columns>  
	void print_row_internal(const FirstColumn& v, const Columns&... reminder_columns) {
		*stream << v << ", ";
		print_row_internal(reminder_columns...);
	}

	template<typename LastColumn> 
	void print_row_internal(const LastColumn& v) {
		*stream << v;
	}


public:

	CSVPrinter(Stream& _stream) : stream(&_stream) {}

	CSVPrinter(const CSVPrinter& other) {
		this->row_count = other->row_count;
		this->stream = other->stream;
	}

	~CSVPrinter() { cout << endl; }

	template<typename... Columns> 
	void print_row(const Columns&... cols) {

		if (row_count > 0) { cout << ", " << endl; }		
		row_count++;

		// function validate is called for each parameter in cols
		print_row_internal(validate(cols)...); 
	}
};

Let’s look at the print_row function:

It calls print_row_internal with validate(cols).... This syntax allows calling the validate function for each col prior to calling print_row_internal.
print_row_internal has two specializations. One for a single parameter, which closes the recursive loop, and one with one specialized (extracted) parameter and the rest of the colls as bulk. This recursive definition allows processing each parameter in a type-safe manner, one by one.