Articles & Books

Details of std::mdspan from C++23 -- Bartlomiej Filipek

In this article, we’ll see details of std::mdspan, a new view type tailored to multidimensional data. We’ll go through type declaration, creation techniques, and options to customize the internal functionality.

Details of std::mdspan from C++23

by Bartlomiej Filipek

From the article:

In this article, we’ll see details of std::mdspan, a new view type tailored to multidimensional data. We’ll go through type declaration, creation techniques, and options to customize the internal functionality.

Type declaration 

The type is declared in the following way:

template<  
     class T,  
     class Extents,  
     class LayoutPolicy = std::layout_right,  
     class AccessorPolicy = std::default_accessor<T> 
> class mdspan; 

And it has its own header <mdspan>.

The main proposal for this feature can be found at https://wg21.link/P0009

Following the pattern from std::span, we have a few options to create mdspan: with dynamic or static extent.

The key difference is that rather than just one dimension, we can specify multiple. The declaration also gives more options in the form of LayoutPolicy and AccessorPolicy. More on that later.
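To make the dynamic versus static extent distinction concrete, here is a minimal sketch. It assumes a C++23 toolchain that ships `<mdspan>`; the helper name `element_at` and the 2x3 sample buffer are hypothetical. On toolchains without the header, the fallback branch spells out the row-major arithmetic that the default `std::layout_right` performs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <version>
#ifdef __cpp_lib_mdspan
#include <mdspan>
#endif

// Hypothetical helper: reads element (row, col) from a fixed 2x3 buffer.
int element_at(std::size_t row, std::size_t col)
{
    static const std::array<int, 6> data{1, 2, 3, 4, 5, 6};
#ifdef __cpp_lib_mdspan
    // Dynamic extents: both dimensions are passed at run time.
    std::mdspan<const int, std::dextents<std::size_t, 2>>
        dynamic_view(data.data(), 2, 3);
    // Static extents: both dimensions are part of the type.
    std::mdspan<const int, std::extents<std::size_t, 2, 3>>
        static_view(data.data());

    const int from_dynamic = dynamic_view[row, col];
    const int from_static = static_view[row, col];
    assert(from_dynamic == from_static);  // same data, same default layout
    return from_static;
#else
    // Fallback without <mdspan>: row-major indexing done by hand.
    return data[row * 3 + col];
#endif
}
```

Both views refer to the same six integers; only the place where the extents live (constructor arguments versus template arguments) differs.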

C++26: No More UB in Lexing -- Sandor Dargo

Undefined behavior in C++ is a well-known source of headaches for developers, but surprisingly, even the lexing process contained cases of it, until now. Thanks to P2621R3 by Corentin Jabot, unterminated strings, macro-generated universal character names, and spliced UCNs are now formally defined, aligning the standard with real-world compiler behavior.

C++26: No More UB in Lexing

by Sandor Dargo

From the article:

If you ever used C++, for sure you had to face undefined behaviour. Even though it gives extra freedom for implementers, it’s dreaded by developers as it may cause havoc in your systems and it’s better to avoid it if possible.

Surprisingly, even the lexing process in C++ can result in undefined behaviour. Thanks to Corentin Jabot’s work and his P2621R3, that won’t be the case anymore. Because it was accepted as a defect report going back to C++98, you already benefit from it if you use a new enough compiler.

Truth be told, compilers didn’t do anything dangerous. They handled the cases below safely and deterministically. So this change is really about updating the standard to match what implementers already do.

Let’s quickly see the three cases.

Unterminated strings

     // unterminated string used to be UB
     const char * foo = "

Who would have thought that an unterminated string or a character was UB?! Despite the permissive standard, all major compilers identified it as ill-formed. From now on, even the standard says so.
 

On Trying to Log an Exception as it Leaves your Scope -- Raymond Chen

A customer attempted to log exceptions using a scope_exit handler, expecting it to capture and process exceptions during unwinding. However, they encountered crashes because ResultFromCaughtException requires an actively caught exception, which isn’t available during unwinding, leading to an unexpected termination.

On Trying to Log an Exception as it Leaves your Scope

by Raymond Chen

From the article:

A customer wanted to log exceptions that emerged from a function, so they used the WIL scope_exit object to specify a block of code to run during exception unwinding.

void DoSomething()
{
    auto logException = wil::scope_exit([&] {
        Log("DoSomething failed",
            wil::ResultFromCaughtException());
    });

    ⟦ do stuff that might throw exceptions ⟧

    // made it to the end - cancel the logging
    logException.release();
}

They found, however, that instead of logging the exception, the code in the scope_exit was crashing.

They debugged into the ResultFromCaughtException function, which eventually reaches something like this:

try
{
    throw;
}
catch (⟦ blah blah ⟧)
{
    ⟦ blah blah ⟧
}
catch (⟦ blah blah ⟧)
{
    ⟦ blah blah ⟧
}
catch (...)
{
    ⟦ blah blah ⟧
}

The idea is that the code rethrows the exception, then tries to catch it in various ways, and when it is successful, it uses the caught object to calculate a result code.
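The rethrow-and-catch idiom can be sketched in standard C++ without WIL. This is only an illustration: `describe_current_exception` and `run_and_describe` are hypothetical names, and the string results stand in for whatever result code a real library computes. The important detail is that a bare `throw;` needs an exception that is currently being handled; outside a catch block there is none, and the program calls `std::terminate`.

```cpp
#include <cassert>
#include <exception>
#include <new>
#include <stdexcept>
#include <string>

// Hypothetical helper; call it only from inside a catch block.
std::string describe_current_exception()
{
    // Outside a handler there is nothing to rethrow, and a bare
    // `throw;` would call std::terminate.
    assert(std::current_exception() != nullptr);
    try {
        throw;  // rethrow the exception currently being handled
    } catch (const std::bad_alloc&) {
        return "out of memory";
    } catch (const std::exception& e) {
        return std::string("std::exception: ") + e.what();
    } catch (...) {
        return "unknown exception";
    }
}

// Hypothetical caller: the helper runs while the exception is caught.
std::string run_and_describe(bool fail)
{
    try {
        if (fail) {
            throw std::runtime_error("boom");
        }
        return "ok";
    } catch (...) {
        return describe_current_exception();
    }
}
```

A destructor that runs during stack unwinding is not inside a handler, which is why the same helper called from a scope-exit lambda has no exception to rethrow.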

Bit Fields, Byte Order and Serialization -- Wu Yongwei

Network packets can be represented as bit fields. Wu Yongwei explores some issues to be aware of and offers solutions.

Bit Fields, Byte Order and Serialization

by Wu Yongwei

From the article:

In order to store data most efficiently, the C language has supported bit fields since its early days. While saving a few bytes of memory isn’t as critical today, bit fields remain widely used in scenarios like network packets. Endianness adds complexity to bit field handling – especially since network packets are typically big-endian, while most modern architectures are little-endian. This article explores these problems and their solutions, including my reflection-based serialization project.

Memory layout of bit fields

The memory layout of bit fields is implementation-defined. In a typical little-endian environment, bit fields start from the lower bits of the lower byte and extend toward higher bits and bytes. In a typical big-endian environment, bit fields start from the higher bits of the lower byte and extend toward lower bits and higher bytes.

Let’s consider a practical scenario. Suppose we want to use a 32-bit integer to store a date. How should we achieve this? A simple approach is to store the number of days from a fixed point of time (e.g. 1 January 1900). We can calculate the number of years that can be expressed as follows:
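The calculation amounts to dividing the number of distinct 32-bit values by the average length of a year. A minimal sketch, assuming an unsigned day counter and a mean Gregorian year of 365.2425 days (both are assumptions made for illustration, and `representable_years` is a hypothetical name):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Rough number of years addressable by a 32-bit day counter.
double representable_years(double days_per_year = 365.2425)
{
    assert(days_per_year > 0.0);
    // Assumption: unsigned 32-bit counter, i.e. 2^32 distinct day values.
    const double day_values =
        static_cast<double>(std::numeric_limits<std::uint32_t>::max()) + 1.0;
    return day_values / days_per_year;
}
```

Under these assumptions the day-count representation spans roughly 11.76 million years.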

However, with this approach, extracting specific year, month, and day information becomes very cumbersome. A simpler way is to store the year, month, and day as bit fields. We can define the following struct, using only 32 bits:

  struct Date {
    int      year  : 23;
    unsigned month : 4;
    unsigned day   : 5;
  };

Our intention is to use a 23-bit signed integer for the year (ranging from -4,194,304 to 4,194,303), a 4-bit unsigned integer for the month (0–15, covering legal values 1–12), and a 5-bit unsigned integer for the day (0–31, covering legal values 1–31). This representation is similarly compact, with a slightly narrower range, but it’s quite sufficient and much more convenient for many common usages (excepting interval calculation).
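As a quick check of those ranges, the sketch below stores and reads back the extreme values. The `make_date` helper is hypothetical and only guards the documented ranges. Since bit-field layout is implementation-defined, the code deliberately relies on member access alone and makes no assumption about where each field sits in memory.

```cpp
#include <cassert>

struct Date {
    int      year  : 23;  // signed: -4,194,304 .. 4,194,303
    unsigned month : 4;   // 0 .. 15
    unsigned day   : 5;   // 0 .. 31
};

// Hypothetical helper that rejects values outside the legal ranges
// before they would be silently truncated by the bit fields.
Date make_date(int year, unsigned month, unsigned day)
{
    assert(year >= -4194304 && year <= 4194303);
    assert(month >= 1 && month <= 12);
    assert(day >= 1 && day <= 31);

    Date d{};
    d.year = year;
    d.month = month;
    d.day = day;
    return d;
}
```

On common compilers this struct typically occupies four bytes, but that is a property of the implementation and not something the language guarantees.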

Creating a Generic Insertion Iterator, Part 2 -- Raymond Chen

Last time, our generic insertion iterator failed to meet the requirements of default constructibility and assignability because it stored the lambda directly. To fix this, we now store a pointer to the lambda instead, ensuring the iterator meets standard requirements while still allowing flexible insertion logic.

Creating a Generic Insertion Iterator, Part 2

by Raymond Chen

From the article:

Last time, we tried to create a generic insertion iterator but ran into trouble because our iterator failed to satisfy the iterator requirements of default constructibility and assignability.

We ran into this problem because we stored the lambda as a member of the iterator.
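The underlying limitation can be checked with type traits. In this sketch (the function names are hypothetical), the closure type of a capturing lambda is neither default constructible nor copy assignable, whereas a plain pointer to that closure is both:

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// True when T has the two properties the iterator requirements need.
template <typename T>
constexpr bool default_constructible_and_assignable(const T&)
{
    return std::is_default_constructible_v<T> &&
           std::is_copy_assignable_v<T>;
}

bool closure_qualifies()
{
    std::vector<int> v;
    auto lambda = [&v](int x) { v.push_back(x); };
    return default_constructible_and_assignable(lambda);
}

bool pointer_to_closure_qualifies()
{
    std::vector<int> v;
    auto lambda = [&v](int x) { v.push_back(x); };
    return default_constructible_and_assignable(&lambda);
}

// Usage: the closure fails the check, the pointer passes it.
void check_closure_properties()
{
    assert(!closure_qualifies());
    assert(pointer_to_closure_qualifies());
}
```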

So let’s not do that!

Instead of saving the lambda, we’ll just save a pointer to the lambda.

template<typename Lambda>
struct generic_output_iterator
{
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using pointer = void;
    using reference = void;
    using difference_type = void;

    generic_output_iterator(Lambda&& lambda) noexcept :
        insert(std::addressof(lambda)) {}

    generic_output_iterator& operator*() noexcept
        { return *this; }
    generic_output_iterator& operator++() noexcept
        { return *this; }
    generic_output_iterator& operator++(int) noexcept
        { return *this; }

    template<typename Value>
    generic_output_iterator& operator=(
        Value&& value)
    {
        (*insert)(std::forward<Value>(value));
        return *this;
    }

protected:
    Lambda* insert;

};

template<typename Lambda>
generic_output_iterator<Lambda>
generic_output_inserter(Lambda&& lambda) noexcept {
    return generic_output_iterator<Lambda>(
        std::forward<Lambda>(lambda));
}

template<typename Lambda>
generic_output_iterator(Lambda&&) ->
    generic_output_iterator<Lambda>;

This requires that the lambda remain valid for the lifetime of the iterator, but that may not be a significant burden. Other iterators also retain references that are expected to remain valid for the lifetime of the iterator. For example, std::back_inserter(v) requires that v remain valid for as long as you use the inserter. And if you use the iterator immediately, then the requirement will be satisfied.
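A sketch of such immediate use follows. The iterator is a condensed copy of the pointer-storing version from the excerpt; the `doubled_sample` function and its data are hypothetical. The temporary lambda lives until the end of the full expression that contains the `std::copy` call, so the stored pointer stays valid for as long as the algorithm runs.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <memory>
#include <utility>
#include <vector>

// Condensed copy of the pointer-storing iterator from the excerpt.
template<typename Lambda>
struct generic_output_iterator
{
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using pointer = void;
    using reference = void;
    using difference_type = void;

    generic_output_iterator(Lambda&& lambda) noexcept :
        insert(std::addressof(lambda)) {}

    generic_output_iterator& operator*() noexcept { return *this; }
    generic_output_iterator& operator++() noexcept { return *this; }
    generic_output_iterator& operator++(int) noexcept { return *this; }

    template<typename Value>
    generic_output_iterator& operator=(Value&& value)
    {
        (*insert)(std::forward<Value>(value));
        return *this;
    }

protected:
    Lambda* insert;  // points at a lambda owned by the caller
};

template<typename Lambda>
generic_output_iterator<Lambda>
generic_output_inserter(Lambda&& lambda) noexcept {
    return generic_output_iterator<Lambda>(
        std::forward<Lambda>(lambda));
}

// Hypothetical usage: the lambda is a temporary that outlives std::copy.
std::vector<int> doubled_sample()
{
    const std::vector<int> input{1, 2, 3};
    std::vector<int> output;
    std::copy(input.begin(), input.end(),
        generic_output_inserter([&](int x) { output.push_back(2 * x); }));
    assert(output.size() == input.size());
    return output;
}
```

Storing the returned iterator in a variable and using it after the statement ends would leave the pointer dangling, which is the lifetime requirement described above.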

Creating a Generic Insertion Iterator, Part 1 -- Raymond Chen

In our previous post, we created an inserter iterator for unhinted insertion, and now we’re taking it a step further by generalizing it into a boilerplate-only version. This generic output iterator allows for custom insertion logic using a lambda, but as we’ll see, it doesn’t fully satisfy iterator requirements, something we’ll attempt to fix next time.

Creating a Generic Insertion Iterator, Part 1

by Raymond Chen

From the article:

Last time, we created an inserter iterator that does unhinted insertion. We noticed that most of the iterator is just boilerplate, so let’s generalize it into a version that is all-boilerplate.

// Do not use: See discussion
template<typename Lambda>
struct generic_output_iterator
{
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using pointer = void;
    using reference = void;
    using difference_type = void;

    generic_output_iterator(Lambda&& lambda) :
        insert(std::forward<Lambda>(lambda)) {}

    generic_output_iterator& operator*() noexcept
        { return *this; }
    generic_output_iterator& operator++() noexcept
        { return *this; }
    generic_output_iterator& operator++(int) noexcept
        { return *this; }

    template<typename Value>
    generic_output_iterator& operator=(
        Value&& value)
    {
        insert(std::forward<Value>(value));
        return *this;
    }

protected:
    std::decay_t<Lambda> insert;

};

template<typename Lambda>
generic_output_iterator<Lambda>
generic_output_inserter(Lambda&& lambda) {
    return generic_output_iterator<Lambda>(
        std::forward<Lambda>(lambda));
}

template<typename Lambda>
generic_output_iterator(Lambda&&) ->
    generic_output_iterator<Lambda>;

For convenience, I provided both a deduction guide and a maker function, so you can use whichever version appeals to you. (The C++ standard library has a lot of maker functions because they predate class template argument deduction (CTAD) and deduction guides.)
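The same choice exists for standard library types. As a small illustration using `std::pair` (picked here only as a familiar example; the function name is hypothetical), the maker function and CTAD produce the same type:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

bool maker_and_ctad_agree()
{
    auto made = std::make_pair(1, 2.5);  // maker function, predates CTAD
    std::pair deduced(1, 2.5);           // CTAD, C++17 and later

    static_assert(std::is_same_v<decltype(made), decltype(deduced)>,
                  "both spellings yield std::pair<int, double>");
    assert(made.first == deduced.first);
    return made == deduced;
}
```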

Using Senders/Receivers -- Lucian Radu Teodorescu

C++26 will introduce senders/receivers. Lucian Radu Teodorescu demonstrates how to use them to write multithreaded code.

Using Senders/Receivers

by Lucian Radu Teodorescu

From the article:

This is a follow-up to the article in the previous issue of Overload, which introduced the upcoming C++26 senders/receivers framework [WG21Exec]. While the previous article focused on presenting the main concepts and outlining what will be standardized, this article demonstrates how to use the framework to build concurrent applications.

The goal is to showcase examples that are closer to real-world software rather than minimal examples. We address three problems that can benefit from multi-threaded execution: computing the Mandelbrot fractal, performing a concurrent sort, and applying a graphical transformation to a set of images.

All the code examples are available on GitHub [ExamplesCode]. We use stdexec [stdexec], the reference implementation for the senders/receivers proposal. Additionally, some features included in the examples are not yet accepted by the standard committee, though we hope they will be soon.

How do I create an inserter iterator for unhinted insertion into std::map? -- Raymond Chen

The C++ standard library provides various inserters like back_inserter, front_inserter, and inserter, but for associative containers like std::map, only inserter is available, requiring a hint. However, if elements arrive in an unpredictable order, providing a hint could be inefficient, so a custom inserter that performs unhinted insertion can be a useful alternative.

How do I create an inserter iterator that does unhinted insertion into an associative container like std::map?

by Raymond Chen

From the article:

The C++ standard library contains various types of inserters:

  • back_inserter(c) which uses c.push_back(v).
  • front_inserter(c) which uses c.push_front(v).
  • inserter(c, it) which uses c.insert(it, v).

C++ standard library associative containers do not have push_back or push_front methods; your only option is to use the inserter. But we also learned that the hinted insertion can speed up the operation if the hint is correct, or slow it down if the hint is wrong. (Or it might not have any effect at all.)

What if you know that the items are arriving in an unpredictable order? You don’t want to provide a hint, because that’s a pessimization. The inserter requires you to pass a hint. What do you do if you don’t want to provide a hint?
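One possible answer is a small output iterator that calls the single-argument `c.insert(v)`. The sketch below is modeled on the shape of `std::insert_iterator`; the names `unhinted_insert_iterator`, `unhinted_inserter`, and `copy_into_map` are hypothetical and are not claimed to match the article's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical output iterator: each assignment calls c.insert(v)
// without a position hint.
template <typename Container>
class unhinted_insert_iterator
{
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;

    explicit unhinted_insert_iterator(Container& c) noexcept
        : container(std::addressof(c)) {}

    unhinted_insert_iterator& operator=(
        const typename Container::value_type& v)
    {
        container->insert(v);
        return *this;
    }
    unhinted_insert_iterator& operator=(
        typename Container::value_type&& v)
    {
        container->insert(std::move(v));
        return *this;
    }

    unhinted_insert_iterator& operator*() noexcept { return *this; }
    unhinted_insert_iterator& operator++() noexcept { return *this; }
    unhinted_insert_iterator operator++(int) noexcept { return *this; }

private:
    Container* container;  // the container must outlive the iterator
};

template <typename Container>
unhinted_insert_iterator<Container> unhinted_inserter(Container& c) noexcept
{
    return unhinted_insert_iterator<Container>(c);
}

// Hypothetical usage: items arrive in no particular order.
std::map<int, char> copy_into_map()
{
    const std::vector<std::pair<int, char>> items{
        {3, 'c'}, {1, 'a'}, {2, 'b'}, {1, 'x'}};
    std::map<int, char> m;
    std::copy(items.begin(), items.end(), unhinted_inserter(m));
    assert(m.size() <= items.size());
    return m;
}
```

Because `std::map::insert` leaves an existing key untouched, the second entry for key 1 is ignored and the map ends up with three elements.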

C++26: Erroneous Behaviour -- Sandor Dargo

With C++26, the introduction of erroneous behavior provides a well-defined alternative to undefined behavior when reading uninitialized values, making it easier to diagnose and fix potential bugs. This blog post explores the impact of P2795R5, how compilers will handle erroneous values, and the new [[indeterminate]] attribute for cases where deliberate uninitialized values are needed.

C++26: Erroneous Behaviour

by Sandor Dargo

From the article:

If you pick a random talk at a C++ conference these days, there is a fair chance that the speaker will mention safety at least a couple of times. It’s probably fine like that. The committee and the community must think about improving both the safety situation and the reputation of C++.

If you follow what’s going on in this space, you are probably aware that people have different perspectives on safety. I think almost everybody finds it important, but they would solve the problem in their own way.

A big source of issues is certain manifestations of undefined behaviour. It affects both the safety and the stability of software. I remember that a few years ago when I was working on some services which had to support a 10x growth, one of the important points was to eliminate undefined behaviour as much as possible. One main point for us was to remove uninitialized variables which often lead to crashing services.

Thanks to P2795R5 by Thomas Köppe, uninitialized reads won’t be undefined behaviour anymore - starting from C++26. Instead, they will get a new behaviour called “erroneous behaviour”.

The great advantage of erroneous behaviour is that it will work just by recompiling existing code. It will diagnose where you forgot to initialize variables. You don’t have to systematically go through your code and let’s say declare everything as auto to make sure that every variable has an initialized value. Which you probably wouldn’t do anyway.

But what is this new behaviour that on C++ Reference is even listed on the page of undefined behaviour? It’s well-defined, yet incorrect behaviour that compilers are recommended to diagnose. Is recommended enough?! Well, with the growing focus on safety, you can rest assured that an implementation that wouldn’t diagnose erroneous behaviour would be soon out of the game.
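A minimal sketch of how this looks in source code, assuming a compiler that implements P2795; older compilers are expected to ignore the unknown attribute, possibly with a warning. The function `pick` is hypothetical, and it assigns both variables before reading them, so it is well-defined under every standard version.

```cpp
#include <cassert>

int pick(bool use_large)
{
    // No initializer: in C++26, reading `result` before assigning it
    // would be erroneous behaviour; before C++26 it was undefined.
    int result;

    // C++26 opt-out: reading `scratch` before assigning it would
    // remain undefined behaviour.
    int scratch [[indeterminate]];

    scratch = use_large ? 1000 : 10;  // assigned before any read
    result = scratch + 1;
    assert(result > scratch);
    return result;
}
```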

Contracts for C++ Explained in 5 Minutes -- Timur Doumler

Contract assertions, introduced in proposal P2900 for C++26, provide a robust mechanism for runtime correctness checks, offering more flexibility and power than the traditional assert macro. This blog post will explore how contract assertions work, their various evaluation semantics, and how they can improve code reliability with preconditions, postconditions, and custom violation handlers.

Contracts for C++ Explained in 5 Minutes

by Timur Doumler

From the article:

With P2900, we propose to add contract assertions to the C++ language. This proposal is in the final stages of wording review before being included in the draft Standard for C++26.

It has been suggested by some members of the C++ standard committee that this feature is too large, too complicated, and hard to teach. As it turns out, the opposite is true: contract assertions are actually very simple and can be explained in just five minutes. In this blog post, we will do exactly this!

As the name says, contract assertions are assertions — correctness checks that the programmer can add to their code to detect bugs at runtime. So they’re just like the existing assert macro, except they’re not macros (which fixes a bunch of problems) and they’re way more flexible and powerful!


contract_assert replaces assert

Our replacement for assert is called contract_assert. Unlike assert, contract_assert is a proper keyword, but it works in the same way:
 
auto w = getWidget(); 
contract_assert(w.isValid());  // a contract assertion
processWidget(w);
 
The default behaviour is also the same: the assertion is checked, and if the check fails, the program prints a diagnostic message and terminates.
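Since compiler support for P2900 is not yet widespread, the sketch below expresses the same check with the existing assert macro so that it builds today. The `Widget` type and the bodies of `getWidget` and `processWidget` are hypothetical stand-ins for the names in the snippet above.

```cpp
#include <cassert>

// Hypothetical widget type for illustration.
struct Widget {
    int id = 0;
    bool isValid() const { return id > 0; }
};

Widget getWidget() { return Widget{42}; }
int processWidget(const Widget& w) { return w.id * 2; }

int run()
{
    auto w = getWidget();
    // With P2900 support this line would read:
    //     contract_assert(w.isValid());
    assert(w.isValid());
    return processWidget(w);
}
```

In both spellings a failed check, by default, prints a diagnostic and terminates the program.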