Frank de Alcantara
Pai, marido, professor e engenheiro.
Siga no Twitter

C++ Here the Journey Begins

Item: C++ Here the Journey Begins
Rating: 5
Author: Frank de Alcantara

This chapter explores the fundamental building blocks of C++20 that are essential for competitive programming. By understanding and mastering these concepts, programmers can write more efficient and effective code to solve complex problems.

Throughout the chapter, we’ll focus on techniques for efficient input and output handling, which is critical in competitive programming when dealing with large datasets. We’ll cover I/O optimizations, such as using std::ios_base::sync_with_stdio(false) and std::cin.tie(nullptr) to reduce synchronization overhead, and explore how to use mmap for fast file I/O.

After I/O and errors we’ll deal with vectors and matrices, two data structures that are widely used in competitive programming algorithms. C++20 introduces new features, such as std::span and std::ranges, which provide additional ways to work with these data structures. We’ll explore how to leverage these features to handle and process data in a competitive environment.

Next, we delve into loops, which are a core part of any C++ program. We’ll look at various types of loops, from traditional for loops to the new range-based for loops in C++20. We’ll also investigate how to optimize loops using techniques like early exit conditions and the std::execution::par execution policy for parallel execution.

By the end of this chapter, you’ll have a solid understanding of the fundamental C++20 concepts and techniques that are essential for competitive programming. You’ll be well-prepared to apply these skills to solve a variety of programming challenges efficiently and effectively.

2.4. Data Manipulation in C++20

Modern C++20 provides powerful tools and abstractions for handling structured data efficiently. This section explores four fundamental approaches to managing and manipulating data collections: vectors for one-dimensional data structures, matrices for two-dimensional data arrangements, and spans and ranges for flexible data views and iterations. These mechanisms form the backbone of efficient data handling in competitive programming and general-purpose applications.

Vectors are flexible. They change size and are easy to use. You can insert, remove, and resize them with little effort. They work well for many tasks in competitive programming. Vectors hold one-dimensional data, but can also represent matrices. These two-dimensional vectors are often used for grids, tables, or simulations.

Matrices, built from vectors, handle multi-dimensional problems. They are good for game boards, adjacency matrices, and dynamic programming. You can change rows and columns, transpose the matrix, or access submatrices. Vectors and matrices give you control over how you store and process data. Starting with vectors.

2.4.1. Working with Vectors

In C++20, vectors have several important features that make them a powerful tool for managing collections of elements. First, vectors dynamically resize. This means they grow automatically when you add elements beyond their current capacity. You don’t need to manually manage memory like you would with arrays.

Vectors also provide random access in constant time, $O(1)$. You can access any element by its index using [] notation, just as you would with a regular array. This makes it easy to work with elements directly without needing to traverse the vector.

C++ also has the std::array. While std::array offers better performance due to its fixed size and stack allocation, making it more efficient in terms of memory access and iteration speed, it’s practically unsuitable for competitive programming scenarios. The core reason is that competitive programming problems typically deal with dynamic, runtime-determined input sizes. Since std::array requires the size to be known at compile time, it becomes essentially useless when you don’t know beforehand how many elements you’ll need to store, which is the case for the vast majority of competitive programming challenges. In contrast, std::vector’s ability to dynamically resize and handle arbitrary input sizes makes it the go-to container for competitive programming, despite its slightly higher overhead. The minimal performance advantage of std::array becomes irrelevant when weighed against the fundamental need for runtime size flexibility in competitive programming contexts.

In C++, the std::vector class is part of the Standard Template Library (STL). It is a dynamic array that manages collections of elements. Unlike arrays, vectors resize when elements are added or removed. This makes them useful in competitive programming when data sizes change during execution.

Vectors offer random access to elements. They support iteration with iterators and allow dynamic resizing. They provide functions like push_back, pop_back, insert, erase, and resize. These make managing data easier without manual memory handling.

The following code fragment shows how to create a vector in C++20, reserve memory, and initialize elements. We start by creating a vector, reserving space for $10$ elements, and then resizing it to hold $10$ elements initialized to $0$.

std::vector<int> vec;
vec.reserve(10); // Reserves memory for 10 elements without changing size
vec.resize(10, 5); // Resizes the vector to 10 elements, all initialized to 0

vec.reserve(10); reserves memory for $10$ elements but doesn’t affect the vector’s size. This means no elements are created yet, but space is allocated in advance to avoid future reallocations. Then, vec.resize(10, 0); creates $10$ elements, all initialized to $5$, using the reserved memory.

The class template std::vector is part of the Standard Template Library (STL) and is a dynamic array. It allows for efficient management of collections of elements. When you write std::vector<int>, you are creating a vector where each element is of type int.

std::vector is a generic class. This means it can store elements of any type, not just integers. For example, std::vector<double> stores double values, and std::vector<std::string> stores strings. The class is defined with a template parameter that specifies the type of the elements.

Here’s an example of creating vectors with different types:

std::vector<int> intVec; // Vector of integers
std::vector<double> doubleVec; // Vector of doubles
std::vector<std::string> stringVec; // Vector of strings

When you create a std::vector<int>, the compiler generates a specialized version of the std::vector class for integers. The same applies to any other type you use with std::vector. This is the power of generic programming, the ability to write code that works with any type while maintaining type safety.

One important feature of std::vector is efficient memory management. By using methods like reserve and shrink_to_fit, you can control the vector’s capacity. Reserving memory early ensures that the vector has enough space for future elements. This avoids multiple reallocations, which can be expensive because each reallocation involves copying the entire vector to a new memory location. When you know in advance how many elements the vector will need, calling reserve improves performance by minimizing these costly reallocations. We saw reserve before so let’s take a look on shrink_to_fit.

shrink_to_fit requests the vector to reduce its capacity to match its size. This helps free unused memory after adding or removing elements. It’s not guaranteed, but it allows the system to optimize memory usage. Here’s a simple example:

std::vector<int> vec;
vec.reserve(100);  // Reserve space for 100 elements
vec.resize(10);    // Resize to hold 10 elements
vec.shrink_to_fit(); // Reduce capacity to fit size

In this code fragment, we reserve space for $100$ elements, resize it to $10$, and then call shrink_to_fit to match capacity to the actual number of elements. It ensures memory is not wasted on unused space.

Vectors classe, std::vector, also come with several built-in functions that make them easy to use. For example, push_back allows you to add an element to the end of the vector, while pop_back removes the last element. The insert function lets you add elements at specific positions within the vector. For example, inserting a value into a vector’ end:

intVec.push_back(5); // Adds the value 5 to the end of the vector

intVec.push_back(5); appends the value $5$ to the end of the vectorWhen the vector’s capacity is reached, this operation triggers a reallocation of memory, typically doubling the vector’s capacity, which incurs an $O(n)$ time complexity for copying all elements to the new memory location. The following code fragment shows how to inserts a value into a vector at position $5$, provided that the vector has at least $6$ elements:

if (vec.size() > 5) {
    vec.insert(vec.begin() + 5, 32);
}

vec.insert(vec.begin() + 5, 42); inserts the value $42$ at position $5$ in the vector vec. The condition vec.size() > 5 ensures the vector has enough elements to avoid out-of-bounds errors. This operation has a time complexity of $O(n)$ since all elements after position $5$ need to be shifted one position to the right to make room for the new value. Now, let’s see another code fragment.

if (vec.size() > 5) vec.insert(vec.begin() + 5, 42);

By removing the block braces {}, the code becomes more concise while keeping the logic intact. This is useful when clarity is maintained without additional syntax.

As we are studying competitive programming, we must try to type less and faster. This is where using comes in:

using VI = std::vector<int>; //defining an alias to std::vector<int>
VI vec;
if (vec.size() > 5) vec.insert(vec.begin() + 5, 42);

The using statement creates a shorthand for std::vector<int>, reducing typing. VI vec; now declares the vector vec as a vector of integers. The rest of the logic remains unchanged, making the code easier to write without losing clarity.

When working with data, insert is not enough. The following code fragment removes the last element from the vector, followed by the removal of the element at position $3$, assuming the vector has at least $4$ elements:

if (!vec.empty()) { //vector is not empty so we can remove the last element
    vec.pop_back();
}

if (vec.size() > 3) { //as vec size is bigger than 3 we can erase elements at position 3
    vec.erase(vec.begin() + 3);
}

vec.pop_back(); removes the last element from the vector while vec.erase(vec.begin() + 3); removes the element at position $3$. Using predefined macros, we can also reduce typing for common operations:

#define ERASE_AT(vec, pos) vec.erase(vec.begin() + pos)
if (!vec.empty()) vec.pop_back();
if (vec.size() > 3) ERASE_AT(vec, 3);

In the previous code fragment, we must use the #define macro since the alternatives won’t bring any advantages in a competitive programming environment. Creating a template function would require more typing during the competition and could potentially introduce function call overhead, while using a lambda would not only require more typing but also make the code less readable. The #define macro simply tells the preprocessor to perform a text substitution, which is both concise to write and efficiently executed, making it the most practical choice for competitive programming where typing speed and code performance are indispensable factors.

We also can create vectors with a default value in all positions. The following code fragment shows hot to creates a new integer vector with $5$ elements, all initialized to the value $7$:

std::vector<int> vec2(5, 7);

No significant reduction can be achieved here without compromising clarity.

In the next code fragment, a more complex example, the vector vec2 is resized to $10$ elements, and each element is filled with a random value between $1$ and $100$:

vec2.resize(10);

auto seed = std::chrono::high_resolution_clock::now().time_since_epoch().count();

std::xoshiro256** gen(seed);
std::uniform_int_distribution<int> distribution(1, 100);

for (size_t i = 0; i < vec2.size(); ++i) {
    vec2[i] = distribution(gen);
}

Let’s examine this code fragment thoroughly. The line vec2.resize(10); changes the size of the vector to hold $10$ elements. If the vector had fewer than $10$ elements, new ones are added. If it had more, extra elements are removed.

Next, the line auto seed = std::chrono::high_resolution_clock::now().time_since_epoch().count(); creates a seed for the random number generator. It uses the current time, converts it to a count of time units. Since Xoshiro256**[^note1] works with 64-bit integers directly, we don’t need type casting. This ensures the random numbers change each time the program runs.

The code then defines the generator std::xoshiro256** gen(seed); using the seed. This creates a Xoshiro256** random number generator, which is notably faster than Mersenne Twister while maintaining high-quality randomization. The next line, std::uniform_int_distribution<int> distribution(1, 100);, sets up the distribution. It ensures random numbers between 1 and 100 are generated evenly.

Finally, the for loop runs through each element in the vector. It assigns a random number to each position, filling the vector with random values between $1$ and $100$. We can rewrite this same code with less typing.

vec2.resize(10);

std::xoshiro256** gen(std::random_device{}());
std::uniform_int_distribution<int> dist(1, 100);

for (auto& v : vec2) v = dist(gen);

In this shorter version, we use std::random_device{}() directly as the seed source, which provides a non-deterministic seed when available. The range-based for loop makes the code more concise and readable. Both versions produce the same result: a vector of $10$ random integers between $1$ and $100$.

In C++20, we use the <random> library for generating random numbers. This library gives us flexible and efficient ways to create random data. It includes generators and distributions to produce random numbers in various forms. Generators are engines that produce random raw numbers, like std::xoshiro256**, which is one of the newest and most efficient generators in C++20; Distributions are objects that transform the raw numbers from generators into specific ranges or patterns, like uniform_int_distribution for integers in a range, or normal_distribution for normally distributed values. The Table 2.4.A presents a summary of the generators available in C++20.

Name Availability in C++ Purpose Algorithm Performance

std::mt19937 C++11 and later General purpose Mersenne Twister with period $2^{19937}-1$ Good overall performance, but requires 2.5 KB of state

std::mt19937_64 C++11 and later General purpose (64-bit) Mersenne Twister optimized for 64 bits Similar to mt19937, with better performance on 64-bit systems

std::xoshiro256** C++20 High performance Xoshiro256** (xor/shift/rotate) Excellent performance, small state (256 bits), period $2^{256}-1$

std::ranlux24 C++11 and later High statistical quality Subtract-with-borrow + discarding Lower performance due to number discarding

std::ranlux48 C++11 and later Cryptographic quality Subtract-with-borrow + intensive discarding Lower performance, higher statistical quality

std::minstd_rand C++11 and later Minimalistic Linear Congruential Generator (LCG) Very fast, but lower statistical quality

std::default_random_engine C++11 and later Implementation-specific Depends on the compiler Varies depending on implementation

Table 2.4.A: This table provides an overview of the random number generators available in C++20, including their purpose, algorithms used, and performance characteristics. It highlights the differences in performance, statistical quality, and state size among various generators.

Now, we’ll take a closer look at std::xoshiro256**, which is a modern random number generator. This generator is known for being significantly faster than Mersenne Twister while maintaining high-quality randomization. The generator works by taking a seed, typically a 64-bit integer, to initialize its random sequence. In the previous code fragment, we used:
std::xoshiro256** gen(seed);
Here, seed is a 64-bit integer. It ensures that every run of the program generates different numbers. We seed the generator using the current time from std::chrono::high_resolution_clock::now(). Since Xoshiro256** works directly with 64-bit values, we don’t need additional type casting.

The generator itself only produces random bits. To convert those bits into a useful range, we use a distribution. C++20 offers several kinds of distributions. For example, we use std::uniform_int_distribution<int>, which produces integers evenly spread across a given range:
std::uniform_int_distribution<int> distribution(1, 100);
This tells the program to create numbers between $1$ and $100$, all with equal probability. When we call:
distribution(gen);
The generator provides the random bits, and the distribution maps those bits to numbers between $1$ and $100$:

Name Availability in C++ Purpose Description Characteristics

std::uniform_int_distribution C++11 and later Uniform distribution Generates integers uniformly within a range Constant time performance, unbiased results

std::uniform_real_distribution C++11 and later Uniform distribution Generates real numbers uniformly within a range High precision, constant time performance

std::normal_distribution C++11 and later Gaussian distribution Generates values with a normal distribution Good for simulations, may have performance overhead

std::bernoulli_distribution C++11 and later Bernoulli trial Generates true/false with a given probability Fast, suitable for binary outcomes

std::binomial_distribution C++11 and later Binomial distribution Number of successes in a sequence of trials Effective for modeling discrete probability events

std::geometric_distribution C++11 and later Geometric distribution Generates number of failures before success Useful for waiting-time scenarios, fast computation

std::poisson_distribution C++11 and later Poisson distribution Models number of events in fixed interval Efficient for modeling rare events over time

std::exponential_distribution C++11 and later Exponential distribution Time between events in Poisson process Suitable for modeling interarrival times

std::gamma_distribution C++11 and later Gamma distribution Models waiting times with shape and scale Flexible, used in various statistical applications

std::weibull_distribution C++11 and later Weibull distribution Models failure times in reliability analysis Good for reliability modeling, shape-adjustable

std::chi_squared_distribution C++11 and later Chi-squared distribution Statistical measure of goodness-of-fit Common in hypothesis testing, moderate performance

std::student_t_distribution C++11 and later Student’s t-distribution Models differences between sample means Useful in statistical testing, heavy-tailed behavior

std::cauchy_distribution C++11 and later Cauchy distribution Heavy-tailed distribution, undefined variance Models outcomes with extreme values

std::lognormal_distribution C++11 and later Log-normal distribution Models growth rates of processes Skewed distribution, good for financial modeling

std::discrete_distribution C++11 and later Discrete distribution Generates integers based on a set of probabilities Useful for weighted random selection

std::piecewise_constant_distribution C++11 and later Piecewise constant distribution Generates values with constant probability density within subranges Good for custom-defined distributions

std::piecewise_linear_distribution C++11 and later Piecewise linear distribution Generates values with linear probability density within subranges Useful for modeling linear trends

Table 2.4.B: This table outlines the distributions available in C++20, describing their purpose, characteristics, and usage. It covers a range of statistical distributions, highlighting their computational performance, suitability for different scenarios, and potential applications.

C++20 keeps these ideas separate: generators produce bits, and distributions map those bits to useful values. You can also use other distributions like std::normal_distribution for normal (Gaussian) distributions or std::bernoulli_distribution for true/false outcomes.

Now that you have a understanding of the basics of std::vector, let’s move forward to one of its most common uses: sorting vectors. Sorting is a fundamental operation in computer science, and C++ provides efficient algorithms to handle this task directly. The std::vector allows seamless integration with sorting functions from the Standard Library, making it easy to sort elements in ascending, descending, or custom-defined order.

Vectors as Inputs and Outputs

In competitive programming, a common input format involves receiving the size of a vector as the first integer, followed by the elements of the vector separated by spaces, with a newline at the end. Below is an optimized version using fread for input and putchar for output, ensuring minimal system calls and fast execution.

This version reads the input, processes it, and then outputs the vector’s elements using the fastest possible I/O methods in C++.

#include <cstdio>
#include <vector>

int main() {
    // Buffer for reading input
    char buffer[1 << 16]; // 64 KB buffer size
    int idx = 0;

    // Read the entire input at once
    size_t bytesRead = fread(buffer, 1, sizeof(buffer), stdin);

    // Parse the size of the vector from the input
    int n = 0;
    while (buffer[idx] >= '0' && buffer[idx] <= '9') {
        n = n * 10 + (buffer[idx++] - '0');
    }
    ++idx; // Skip the space or newline after the number

    // Create the vector and fill it with elements
    std::vector<int> vec(n);
    for (int i = 0; i < n; ++i) {
        int num = 0;
        while (buffer[idx] >= '0' && buffer[idx] <= '9') {
            num = num * 10 + (buffer[idx++] - '0');
        }
        vec[i] = num;
        ++idx; // Skip the space or newline after each number
    }

    // Output the vector elements using putchar
    for (int i = 0; i < n; ++i) {
        if (vec[i] == 0) putchar('0');
        else {
            int num = vec[i], digits[10], digitIdx = 0;
            while (num) {
                digits[digitIdx++] = num % 10;
                num /= 10;
            }
            // Print digits in reverse order
            while (digitIdx--) putchar('0' + digits[digitIdx]);
        }
        putchar(' '); // Space after each number
    }
    putchar('\n'); // End the output with a newline

    return 0;
}

In the previous code, we have the following functions:

Input with fread: fread is used to read the entire input into a large buffer at once. This avoids multiple system calls, which are slower than reading in bulk.
Parsing the Input: The input is parsed from the buffer using simple character arithmetic to convert the string of numbers into integers.
Output with putchar: putchar is used to print the numbers, which is faster than std::cout for individual characters. The digits of each number are processed and printed in reverse order.

The previous code method minimizes system calls and avoids using slower I/O mechanisms like std::cin and std::cout, making it highly optimized for competitive programming scenarios where speed is the success key.

In competitive programming, it’s also common to handle input from a file provided via the command line. This scenario requires efficient reading and processing, especially when dealing with large datasets. Below is the optimized version using fread to read from a file specified in the command line argument and putchar for output.

Optimized Version Using `fread` and `putchar` with Command-Line File Input

This version reads the input file, processes it, and outputs the vector’s elements, ensuring fast I/O performance.

#include <cstdio>
#include <vector>

int main(int argc, char* argv[]) {
    // Check if the filename was provided
    if (argc != 2) {
        return 1;
    }

    // Open the file from the command line argument
    FILE* file = fopen(argv[1], "r");
    if (!file) {
        return 1;
    }

    // Buffer for reading input
    char buffer[1 << 16]; // 64 KB buffer size
    int idx = 0;

    // Read the entire input file at once
    size_t bytesRead = fread(buffer, 1, sizeof(buffer), file);
    fclose(file); // Close the file after reading

    // Parse the size of the vector from the input
    int n = 0;
    while (buffer[idx] >= '0' && buffer[idx] <= '9') {
        n = n * 10 + (buffer[idx++] - '0');
    }
    ++idx; // Skip the space or newline after the number

    // Create the vector and fill it with elements
    std::vector<int> vec(n);
    for (int i = 0; i < n; ++i) {
        int num = 0;
        while (buffer[idx] >= '0' && buffer[idx] <= '9') {
            num = num * 10 + (buffer[idx++] - '0');
        }
        vec[i] = num;
        ++idx; // Skip the space or newline after each number
    }

    // Output the vector elements using putchar
    for (int i = 0; i < n; ++i) {
        if (vec[i] == 0) putchar('0');
        else {
            int num = vec[i], digits[10], digitIdx = 0;
            while (num) {
                digits[digitIdx++] = num % 10;
                num /= 10;
            }
            // Print digits in reverse order
            while (digitIdx--) putchar('0' + digits[digitIdx]);
        }
        putchar(' '); // Space after each number
    }
    putchar('\n'); // End the output with a newline

    return 0;
}

In the previous code we have:

File Input with fread: The input is read from a file specified in the command line argument using fread. This reads the entire file into a buffer in one go, improving efficiency by reducing system calls.
File Handling: The file is opened using fopen and closed immediately after reading the data. This ensures that system resources are released as soon as the file reading is complete.
Parsing and Output: The rest of the program processes the input similarly to the previous version, parsing the numbers from the buffer and outputting them efficiently using putchar.

This approach remains highly optimized for competitive programming environments where fast I/O handling is critical. But, in Linux we can use mmap. As we can see in Code 2.4.A.

#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <vector>
#include <iostream>

int main(int argc, char* argv[]) {
    if (argc != 2) {
        return 1;
    }

    // Open the file
    int fd = open(argv[1], O_RDONLY);
    if (fd == -1) {
        return 1;
    }

    // Get the file size
    struct stat sb;
    if (fstat(fd, &sb) == -1) {
        close(fd);
        return 1;
    }
    size_t fileSize = sb.st_size;

    // Memory-map the file
    char* fileData = static_cast<char*>(mmap(nullptr, fileSize, PROT_READ, MAP_PRIVATE, fd, 0));
    if (fileData == MAP_FAILED) {
        close(fd);
        return 1;
    }

    close(fd); // The file descriptor can be closed after mapping

    // Parse the vector size
    int idx = 0;
    int n = 0;
    while (fileData[idx] >= '0' && fileData[idx] <= '9') {
        n = n * 10 + (fileData[idx++] - '0');
    }
    ++idx; // Skip the space or newline

    // Create the vector and fill it with values from the memory-mapped file
    std::vector<int> vec(n);
    for (int i = 0; i < n; ++i) {
        int num = 0;
        while (fileData[idx] >= '0' && fileData[idx] <= '9') {
            num = num * 10 + (fileData[idx++] - '0');
        }
        vec[i] = num;
        ++idx; // Skip the space or newline
    }

    // Output the vector
    for (const int& num : vec) {
        std::cout << num << " ";
    }
    std::cout << std::endl;

    // Unmap the file from memory
    munmap(fileData, fileSize);

    return 0;
}

Code 2.4.A: Using mmap in linux.

2.4.2. Using Matrices

In C++20, matrices are typically represented as vectors of vectors (std::vector<std::vector<T>>), where each inner vector represents a row of the matrix. This approach allows for dynamic sizing and easy manipulation of multi-dimensional data, making matrices ideal for problems involving grids, tables, or any 2D structure.

Matrices in C++ offer flexibility in managing data: you can resize rows and columns independently, access elements using intuitive indexing, and leverage standard vector operations for rows. Additionally, the use of ranges and views introduced in C++20 boosts the ability to iterate and transform matrix data more expressively and efficiently.

The use of matrices is common in competitive programming for tasks such as implementing dynamic programming tables, graph adjacency matrices, or performing transformations on 2D data. With the powerful capabilities of C++20’s STL, matrices become a highly adaptable and efficient way to handle complex, multi-dimensional computations in a structured manner.

Creating and Filling a Matrix

The code creates a 2x2 matrix (a vector of vectors) and fills each element with the value 1:

Standard Version:

int rows = 2, cols = 2;
std::vector<std::vector<int>> matrix(rows, std::vector<int>(cols));

for (int i = 0; i < rows; ++i) {
    for (int j = 0; j < cols; ++j) {
        matrix[i][j] = 1;
    }
}

In this fragment, $\text{std::vector<std::vector> matrix(rows, std::vector(cols));}$ creates a matrix of size $2\times 2$. The nested `for` loop fills each element of the matrix with $1$.

Optimized for Minimal Typing:

std::vector<std::vector<int>> matrix(2, std::vector<int>(2, 1));

This version eliminates the need for the explicit loop by using the constructor to initialize the matrix with 1s directly.

Displaying the Matrix

Finally, the matrix is printed in the standard format:

Standard Version:

for (const auto& row : matrix) {
    for (const auto& element : row) {
        std::cout << element << " ";
    }
    std::cout << std::endl;
}

The loop iterates over each row and prints all elements in the row, followed by a newline.

Optimized for Minimal Typing:

for (const auto& row : matrix) {
    for (int el : row) std::cout << el << " ";
    std::cout << "\n";
}

Here, we replaced std::endl with "\n" to improve performance by avoiding the unnecessary flushing of the output buffer.

Inserting Elements at a Specific Position

To insert an element at a specific position in a matrix (vector of vectors) in C++ 20, we use the insert function. This function can insert rows or columns in a specific location, modifying the structure of the matrix.

#include <iostream>
#include <vector>

int main() {
    std::vector<std::vector<int>> matrix = { {1, 2}, {3, 4} };

    // Insert a row at position 1
    matrix.insert(matrix.begin() + 1, std::vector<int>{5, 6});

    // Insert a column value at position 0 in the first row
    matrix[0].insert(matrix[0].begin(), 0);

    // Display the modified matrix
    for (const auto& row : matrix) {
        for (int el : row) std::cout << el << " ";
        std::cout << "\n";
    }

    return 0;
}

Code 2.4.B: Example to show how to insert a new row at position 1 and a new column value at position 0 in the first row. The result is a modified matrix.{: class=”legend”}

Removing the Last Element and a Specific Element

To remove the last element of a matrix or a specific element, you can use the pop_back function for removing the last row and the erase function for removing specific rows or columns.

#include <iostream>
#include <vector>

int main() {
    std::vector<std::vector<int>> matrix = { {1, 2}, {3, 4}, {5, 6} };

    // Remove the last row
    matrix.pop_back();

    // Remove the first element of the first row
    matrix[0].erase(matrix[0].begin());

    // Display the modified matrix
    for (const auto& row : matrix) {
        for (int el : row) std::cout << el << " ";
        std::cout << "\n";
    }

    return 0;
}

Code 2.4.C: Example to show how to removes the last row from the matrix and removes the first element of the first row.{: class=”legend”}

Creating a New Matrix with a Default Value

To create a new matrix filled with a default value, you can specify this value in the constructor of the vector.

#include <iostream>
#include <vector>

int main() {
    // Create a 3x3 matrix filled with the default value 7
    std::vector<std::vector<int>> matrix(3, std::vector<int>(3, 7));

    // Display the matrix
    for (const auto& row : matrix) {
        for (int el : row) std::cout << el << " ";
        std::cout << "\n";
    }

    return 0;
}

Code 2.4.C: Example to show how to initializes a 3x3 matrix with all elements set to $7$.{: class=”legend”}

Resizing and Filling with Random Values

To resize a matrix and fill it with random values, you can use the resize function along with the <random> library.

#include <iostream>
#include <vector>
#include <random>

int main() {
    std::vector<std::vector<int>> matrix;
    int rows = 3, cols = 3;

    // Resize the matrix
    matrix.resize(rows, std::vector<int>(cols));

    // Fill the matrix with random values
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<> dis(1, 10);

    for (auto& row : matrix) {
        for (auto& el : row) {
            el = dis(gen);
        }
    }

    // Display the matrix
    for (const auto& row : matrix) {
        for (int el : row) std::cout << el << " ";
        std::cout << "\n";
    }

    return 0;
}

Code 2.4.E: Example to show how resizes the matrix to 3x3 and fills it with random values between $1$ and $2.8$.{: class=”legend”}

Sorting Matrices by Rows and Columns

In C++20, we can sort matrices (represented as vectors of vectors) both by rows and by columns. Here are examples of how to do both:

2.5.7.1 Sorting by Rows

Sorting by rows is straightforward, as we can use the std::sort function directly on each row of the matrix.

#include <iostream>
#include <vector>
#include <algorithm>

int main() {
   std::vector<std::vector<int>> matrix = {
        {3, 1, 4}, {1, 5, 9}, {2, 6, 5}
    };

   // Sort each row of the matrix
   for (auto& row : matrix) {
       std::sort(row.begin(), row.end());
   }

   // Display the sorted matrix
   for (const auto& row : matrix) {
       for (int el : row) std::cout << el << " ";
       std::cout << "\n";
   }

   return 0;
}

This code sorts each row of the matrix independently. The time complexity for sorting by rows is $O(m \cdot n \log n)$, where $m$ is the number of rows and $n$ is the number of columns.

2.5.7.2 Sorting by Columns

Sorting by columns is more complex because the elements in a column are not contiguous in memory. We need to extract each column, sort it, and then put the sorted elements back into the matrix.

#include <iostream>
#include <vector>
#include <algorithm>

int main() {
   std::vector<std::vector<int>> matrix = { {3, 1, 4}, {1, 5, 9}, {2, 6, 5} };
   int rows = matrix.size();
   int cols = matrix[0].size();

   // Sort each column of the matrix
   for (int j = 0; j < cols; ++j) {
       std::vector<int> column;
       for (int i = 0; i < rows; ++i) {
           column.push_back(matrix[i][j]);
       }
       std::sort(column.begin(), column.end());
       for (int i = 0; i < rows; ++i) {
           matrix[i][j] = column[i];
       }
   }

   // Display the sorted matrix
   for (const auto& row : matrix) {
       for (int el : row) std::cout << el << " ";
       std::cout << "\n";
   }

   return 0;
}

This code sorts each column of the matrix independently. The time complexity for sorting by columns is $O(n \cdot m \log m)$, where $n$ is the number of columns and $m$ is the number of rows.

Note that this method of sorting by columns is not the most efficient for very large matrices, as it involves many data copies. For large matrices, it might be more efficient to use an approach that sorts the row indices based on the values in a specific column.

Optimizing Matrix Input and Output in Competitive Programming

Let’s explore optimized techniques in C++ that minimize system calls and maximize execution speed.

Typically, the input for a matrix consists of:

Two integers $n$ and $m$, representing the number of rows and columns, respectively.
$n \times m$ elements of the matrix, separated by spaces and newlines.

For example:

Optimized Reading with `fread`

To optimize reading, we can use fread to load the entire input at once into a buffer, then parse the numbers from the buffer. This approach reduces the number of system calls compared to reading the input one character or one line at a time.

#include <cstdio>
#include <vector>

int main() {
    char buffer[1 << 16];
    size_t bytesRead = fread(buffer, 1, sizeof(buffer), stdin);
    size_t idx = 0;

    auto readInt = [&](int& num) {
        while (idx < bytesRead && (buffer[idx] < '0' || buffer[idx] > '9') && buffer[idx] != '-') ++idx;
        bool neg = false;
        if (buffer[idx] == '-') {
            neg = true;
            ++idx;
        }
        num = 0;
        while (idx < bytesRead && buffer[idx] >= '0' && buffer[idx] <= '9') {
            num = num * 10 + (buffer[idx++] - '0');
        }
        if (neg) num = -num;
    };

    int n, m;
    readInt(n);
    readInt(m);

    std::vector<std::vector<int>> matrix(n, std::vector<int>(m));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < m; ++j) {
            readInt(matrix[i][j]);
        }
    }

    // Matrix processing...

    return 0;
}

In this code: We define a lambda function readInt to read integers from the buffer, handling possible whitespace and negative numbers. The readInt function skips over any non-digit characters and captures negative signs. This ensures robust parsing of the input data.

Optimized Output with `putchar_unlocked`

For output, using putchar_unlocked offers better performance than std::cout or even putchar, as it is not thread-safe and thus faster.

#include <cstdio>
#include <vector>

void writeInt(int num) {
    if (num == 0) {
        putchar_unlocked('0');
        return;
    }
    if (num < 0) {
        putchar_unlocked('-');
        num = -num;
    }
    char digits[10];
    int idx = 0;
    while (num) {
        digits[idx++] = '0' + num % 10;
        num /= 10;
    }
    while (idx--) {
        putchar_unlocked(digits[idx]);
    }
}

int main() {
    // Assume matrix is already populated
    int n = /* number of rows */;
    int m = /* number of columns */;
    std::vector<std::vector<int>> matrix = /* your matrix */;

    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < m; ++j) {
            writeInt(matrix[i][j]);
            putchar_unlocked(j == m - 1 ? '\n' : ' ');
        }
    }

    return 0;
}

In this code: We define a function writeInt to output integers efficiently. It handles zero and negative numbers correctly, and we use putchar_unlocked for faster character output.

putchar_unlocked is a non-thread-safe version of putchar. It writes a character to stdout without locking the output stream, eliminating the overhead associated with ensuring thread safety. This makes putchar_unlocked faster than putchar, which locks the output stream to prevent concurrent access from multiple threads.

When comparing putchar and putchar_unlocked, we find that putchar is thread-safe and locks stdout to prevent data races, but incurs overhead due to locking. On the other hand, putchar_unlocked is not thread-safe and does not lock stdout, making it faster due to the absence of locking overhead.

Here’s an example of using putchar_unlocked to output an integer efficiently:
#include <cstdio>

void writeInt(int num) {
   if (num == 0) {
       putchar_unlocked('0');
       return;
   }
   if (num < 0) {
       putchar_unlocked('-');
       num = -num;
   }
   char digits[10];
   int idx = 0;
   while (num) {
       digits[idx++] = '0' + (num % 10);
       num /= 10;
   }
   while (idx--) {
       putchar_unlocked(digits[idx]);
   }
}

int main() {
   int number = 12345;
   writeInt(number);
   putchar_unlocked('\n');
   return 0;
}
In contrast, using putchar would involve replacing putchar_unlocked with putchar:
#include <cstdio>

void writeInt(int num) {
   if (num == 0) {
       putchar('0');
       return;
   }
   if (num < 0) {
       putchar('-');
       num = -num;
   }
   char digits[10];
   int idx = 0;
   while (num) {
       digits[idx++] = '0' + (num % 10);
       num /= 10;
   }
   while (idx--) {
       putchar(digits[idx]);
   }
}

int main() {
   int number = 12345;
   writeInt(number);
   putchar('\n');
   return 0;
}
putchar_unlocked is best used in single-threaded programs where maximum output performance is required. It’s particularly >useful in competitive programming scenarios where execution time is critical and the program is guaranteed to be >single-threaded.

However, caution must be exercised when using putchar_unlocked. It is not thread-safe, and in multi-threaded applications, >using it can lead to data races and undefined behavior. Additionally, it is a POSIX function and may not be available or >behave differently on non-POSIX systems.

Both putchar and putchar_unlocked are functions from the C standard library <cstdio>, which is included in C++ for >compatibility purposes. The prototype for putchar is int putchar(int character);, which writes the character to stdout >and returns the character written, or EOF on error. It is thread-safe due to internal locking mechanisms.

The prototype for putchar_unlocked is int putchar_unlocked(int character);. It’s a faster version of putchar without >internal locking, but it’s not thread-safe and may not be part of the C++ standard in all environments.

If both performance and thread safety are needed, consider using buffered output or high-performance C++ I/O techniques. For >example:
#include <iostream>
#include <vector>

int main() {
   std::ios::sync_with_stdio(false);
   std::cin.tie(nullptr);

   std::vector<int> numbers = {1, 2, 3, 4, 5};
   for (int num : numbers) {
       std::cout << num << ' ';
   }
   std::cout << '\n';

   return 0;
}
By untethering C++ streams from C streams using std::ios::sync_with_stdio(false); and untangling cin from cout with >std::cin.tie(nullptr);, you can achieve faster I/O while maintaining thread safety and standard compliance.

The time complexity for reading and writing matrices is $O(nm)$, where $n$ and $m$ are the dimensions of the matrix. The space complexity is also $O(nm)$, as we store the entire matrix in memory. However, the constant factors are significantly reduced compared to standard I/O methods, leading to faster execution times in practice.

Using `mmap` on Unix Systems for Matrices

On Unix systems, we can use mmap to map a file (or standard input) directly into memory, potentially improving I/O performance even further.

#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <vector>
#include <cstdio>

int main() {
    struct stat sb;
    fstat(0, &sb); // File descriptor 0 is stdin
    size_t fileSize = sb.st_size;
    char* data = static_cast<char*>(mmap(nullptr, fileSize, PROT_READ, MAP_PRIVATE, 0, 0));

    size_t idx = 0;

    auto readInt = [&](int& num) {
        while (idx < fileSize && (data[idx] < '0' || data[idx] > '9') && data[idx] != '-') ++idx;
        bool neg = false;
        if (data[idx] == '-') {
            neg = true;
            ++idx;
        }
        num = 0;
        while (idx < fileSize && data[idx] >= '0' && data[idx] <= '9') {
            num = num * 10 + (data[idx++] - '0');
        }
        if (neg) num = -num;
    };

    int n, m;
    readInt(n);
    readInt(m);

    std::vector<std::vector<int>> matrix(n, std::vector<int>(m));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < m; ++j) {
            readInt(matrix[i][j]);
        }
    }

    munmap(data, fileSize);

    // Matrix processing...

    return 0;
}

Note: Using mmap can be risky, as it relies on the entire input being available and may not be portable across different systems or handle input streams properly. Use it only when you are certain of the input’s nature and when maximum performance is essential.

Remember: The efficiency of these approaches comes at the cost of increased code complexity and reduced readability. In scenarios where performance is not critical, standard I/O methods are preferable for their simplicity and maintainability.

2.5 Fundamental Tool: Sorting

In C++, sorting a vector is typically done using the std::sort algorithm, which has a time complexity of $O(n \log n)$.. This algorithm employs highly optimized techniques, ensuring efficient sorting even for large datasets. Next, we will explore how to implement sorting operations on vectors, covering different sorting criteria and practical examples.

In C++20, std::sort is part of the <algorithm> header, retaining a similar interface to previous versions but with performance improvements and support for new C++20 features like ranges.

The sorting function uses the Introsort algorithm (a combination of QuickSort, HeapSort, and InsertionSort) to ensure an average time complexity of $O(n \log n)$ and is highly optimized for various data types.

Introsort, short for Introspective Sort¹, is the underlying algorithm used by std::sort in the C++ Standard Library. It is designed to be an efficient hybrid sorting algorithm that adapts its behavior based on the input data characteristics. Introsort was invented by David Musser in 1997 and is chosen for its combination of speed, worst-case performance guarantees, and memory efficiency.

Introsort starts by using QuickSort, a fast divide-and-conquer sorting algorithm. However, if the recursion depth becomes too large, indicating the possibility of poor performance (close to $O(n^2)$), it switches to HeapSort, which guarantees $O(n \log n)$ complexity. Finally, when the remaining partitions are small (typically fewer than 16 elements), it switches to InsertionSort, which is efficient for small arrays. Going deeper:

QuickSort Phase: Initially, Introsort uses QuickSort, partitioning the array around a pivot and recursively sorting the subarrays. QuickSort is selected for its average-case complexity of $O(n \log n)$ and excellent practical performance. The pivot selection in modern implementations of Introsort often uses the median-of-three method, which improves partitioning stability. The median-of-three deserves a bit of attention:

The median-of-three method is a technique used in sorting algorithms, particularly in QuickSort and Introsort, to select a better pivot element. The pivot selection in QuickSort is crucial for its efficiency, as it determines how well the array is partitioned at each step. Poor pivot selection can lead to imbalanced partitions and degraded performance, potentially reaching $O(n^2)$ time complexity. The median-of-three approach helps mitigate this issue by choosing the pivot as the median of three specific elements in the array:

First element;

Middle element;

Last element.

The median of these three values is then used as the pivot, rather than just using the first or last element, which could be problematic for sorted or nearly sorted arrays. Let’s work an example:

Consider an array $[5, 1, 9, 4, 6, 7, 3, 8, 2]$.

First element: $5$

Middle element: $6$ (index 4)

Last element: $2$

Now, we compare these three values: $5$, $6$, and $2$. The median of $5$, $6$, and $2$ is $5$. So, $5$ becomes the pivot. Or, using C++:
int medianOfThree(int a, int b, int c) {
   if ((a > b) == (a < c)) return a; // a é a mediana
   else if ((b > a) == (b < c)) return b; // b é a mediana
   else return c; // c é a mediana
}
The median-of-three technique improves QuickSort’s performance by avoiding Worst-Case Partitions. In basic QuickSort, using the first or last element as the pivot can lead to worst-case scenarios, especially for already sorted or nearly sorted arrays. The median-of-three minimizes the chance of selecting a very poor pivot by making a more balanced choice. Also, the median-of-three improves partitioning quality. By choosing the median of three, the partitioning step becomes more likely to divide the array into relatively equal-sized subarrays, which is essential for achieving the $O(n \log n)$ average-case complexity. Finally, the median-of-three technique requires only a few additional comparisons, making it a low-cost improvement in most cases.

In C++, the median-of-three is often used as part of the partitioning logic in QuickSort and Introsort implementations:
   #include <iostream>
   #include <vector>

   // Helper function to find the median of three
   int medianOfThree(int a, int b, int c) {
       if ((a > b) == (a < c)) return a; // a is the median
       else if ((b > a) == (b < c)) return b; // b is the median
       else return c; // c is the median
   }

   int main() {
       std::vector<int> arr = {5, 1, 9, 4, 6, 7, 3, 8, 2};
       int first = arr[0];
       int middle = arr[arr.size() / 2];
       int last = arr[arr.size() - 1];

       int pivot = medianOfThree(first, middle, last);

       std::cout << "The median-of-three pivot is: " << pivot << std::endl;
       return 0;
   }
Code XXXX:the function medianOfThree determines the median value among the three candidates, which is then used as the pivot for sorting

HeapSort Phase: If the recursion depth exceeds a certain limit (usually set to $2 \log n$), Introsort switches to HeapSort. This switch ensures a worst-case time complexity of $O(n \log n)$, as HeapSort is not sensitive to bad pivot choices like QuickSort. HeapSort is used to prevent QuickSort’s performance degradation on certain pathological cases, such as sorted or nearly sorted arrays.

InsertionSort Phase: When partitions become small (typically fewer than 16 elements), Introsort switches to InsertionSort. InsertionSort is faster for small arrays due to its lower overhead, despite having a $O(n^2)$ worst-case complexity. This phase takes advantage of the partially sorted state left by previous steps, making InsertionSort very efficient for these cases. Table XXXX has a summary of Introsort characteristics.

Property Description

Average-case time complexity $O(n \log n)$

Worst-case time complexity $O(n \log n)$, due to the switch to HeapSort when necessary

Space complexity $O(\log n)$ for the recursion stack

Stable vs. unstable sorting Introsort is an unstable sort, meaning that equal elements might not retain their original order after sorting

In-place sorting Introsort is an in-place sort, meaning it requires a constant amount of additional memory beyond the input array

Table XXXX: Introsort characteristics

Introsort is used in std::sort and std::ranges::sort because of its adaptive behavior, guaranteed performance, and efficiency. It adjusts based on input data characteristics, avoiding QuickSort’s $O(n^2)$ worst-case time complexity, and provides speed and efficiency across different data types and distributions.

Using Introsort in std::ranges::sort, the following code sorts the vector vec2 in ascending order. While the traditional approach using std::sort is still valid, C++20 offers a more concise syntax:

// Traditional C++17 way - still perfectly valid
std::sort(vec2.begin(), vec2.end());

// C++20 way - shorter and cleaner
std::ranges::sort(vec2);

One of the most powerful features of C++20’s ranges is the ability to compose algorithms through views. Views allow you to create a pipeline of operations that can be chained together without creating intermediate containers. For example, when sorting, you might want to filter elements, transform them, and then sort the results. With ranges, this becomes elegant and efficient:

#include <ranges>
#include <vector>
#include <algorithm>

std::vector<int> numbers = {1, -3, 4, -2, 5, -6};

// Create a pipeline: filter positive numbers, transform to squares
auto view = numbers 
    | std::views::filter([](int n) { return n > 0; })
    | std::views::transform([](int n) { return n * n; });

// Materialize the view into a vector in C++20
std::vector<int> result(view.begin(), view.end());

// Sort the result
std::ranges::sort(result);

// result now contains {1, 16, 25}

Code XXXX: Using std::ranges e std::views

This composability is particularly useful in competitive programming scenarios where you need to perform multiple operations on data before sorting. The views are lazy-evaluated, meaning they don’t create intermediate collections, making them memory-efficient. Each element flows through the entire pipeline only once, improving performance compared to separate operations that would each traverse the entire container.

The C++20 standard introduced std::ranges and std::views as a modern approach to handle sequences of elements. A range represents any sequence that provides access to its beginning and end through begin() and end() functions. In mathematical terms, a range $[a,b)$ defines a sequence where each element $x$ satisfies $a \leq x < b$.

The std::views namespace provides tools to create lightweight “views” of sequences. A view doesn’t own or modify the underlying data - it presents a different way of looking at the same sequence. When we create a view, we’re defining a transformation that will be applied to elements as we iterate over them. This lazy evaluation means no temporary containers are created.

The std::views::filter creates a view that includes only elements satisfying a given predicate. For a sequence $S$ and a predicate $P(x)$, the filtered view represents the mathematical set ${x \in S \mid P(x)}$. For example:
std::vector<int> numbers = {1, -2, 3, -4, 5};
auto positive = numbers | std::views::filter([](int n) { return n > 0; });
// positive represents the sequence {1, 3, 5}
The std::views::transform applies a function to each element in a sequence. Given a function $f(x)$, the transformed view represents the set ${f(x) \mid x \in S}$. Consider:
auto squares = numbers | std::views::transform([](int n) { return n * n; });
// squares represents the sequence {1, 4, 9, 16, 25}
In C++20, to materialize a view into a container, we use iterator construction or std::ranges::copy:
// Using iterator construction
std::vector<int> result1(view.begin(), view.end());

// Or using ranges::copy
std::vector<int> result2;
std::ranges::copy(view, std::back_inserter(result2));
These operations can be combined using the pipe operator (|), creating a pipeline where each operation feeds into the next:
auto view = numbers 
    | std::views::filter([](int n) { return n > 0; })
    | std::views::transform([](int n) { return n * n; });

// Materialize the view
std::vector<int> result(view.begin(), view.end());
This creates a mathematical transformation that can be expressed as:
\[\{ x^2 \mid x \in \text{numbers} \land x > 0 \}\]
The power of views lies in their lazy evaluation. When we create a pipeline of view operations, no computation happens until we actually need the results. Each element flows through the entire pipeline only when requested. This means that with complex transformations, we only traverse the sequence once, and we never create temporary containers for intermediate results.

This lazy evaluation becomes particularly valuable when working with large sequences or when only a portion of the results will be used. Unlike traditional algorithms that would process the entire sequence immediately, views allow us to define the transformations we want and only compute the values we actually need, when we need them. This is especially useful in competitive programming scenarios where memory efficiency is crucial.

Views can also be combined with other range algorithms. After materializing a view, we can apply any of the C++20 range algorithms to the result. This creates a powerful tool set for expressing complex sequence operations in a clear, readable manner while maintaining high performance.

Note: C++23 introduces std::ranges::to which provides a more convenient syntax for materializing views: auto result = view | std::ranges::to<std::vector>();. However, in C++20 we must use the approaches shown above.

The compile-time behavior of ranges also deserves attention. When using ranges, the C++20 compiler provides clearer and more helpful error messages when constraints are not met. For example, if we try to use std::ranges::sort with a forward list (which doesn’t provide random access), the compiler will give us a clear message about the constraint failure:
std::forward_list<int> numbers = {3, 1, 4, 1, 5};
std::ranges::sort(numbers);  // Won't compile
Instead of the cryptic template instantiation errors common in pre-C++20 code, we get a clear message indicating that std::forward_list doesn’t satisfy the random_access_range concept required by sort. Similarly, if we try to use a view with incompatible element types, the compiler will point out exactly where and why the constraint failed:
std::vector<std::string> words = {"hello", "world"};
auto view = words | std::views::transform([](int n) { return n * 2; });  // Won't compile
The compiler will clearly indicate that we’re trying to use an int parameter in a lambda that’s receiving a std::string. These improved error messages make it easier to identify and fix type-related issues when working with ranges and views.

For competitive programming, we might want a shorter syntax. While macros are one option:

#define SORT(v) std::ranges::sort(v)
SORT(vec2);

A type-safe alternative using C++20 features would be a simple constexpr lambda:

constexpr auto sort = [](auto& v) { 
   std::ranges::sort(v); 
};
sort(vec2);

For cases where custom comparators are needed (like sorting in descending order), both approaches work:

// Traditional way
std::sort(vec2.begin(), vec2.end(), std::greater<>());

// C++20 way
std::ranges::sort(vec2, std::greater<>());

// Or using our lambda
constexpr auto rsort = [](auto& v) { 
   std::ranges::sort(v, std::greater<>()); 
};

rsort(vec2);

The time complexity remains $O(n \log n)$ where $n$ is the number of elements in the vector. For competitive programming, both traditional and C++20 approaches are equally efficient, but the C++20 syntax can save a few keystrokes while maintaining type safety.

Note: While C++20 offers more advanced features like concepts and complex constraints, in competitive programming scenarios, the simpler approaches shown above are usually sufficient and more practical under time pressure.

We can use std::ranges::sort with std::greater<>() comparator to sort elements in descending order. The implementation requires the ranges header and has a time complexity of $O(n \log n)$ where $n$ is the number of elements.

#include <ranges>
#include <vector>

std::vector<int> vec = {5, 2, 9, 1, 5, 6};
std::ranges::sort(vec, std::greater<>());

This code fragment shows how to create a sequence where each element $a_i$ and $a_{i+1}$ satisfy:

\[a_i \geq a_{i+1} \text{ for } i \in [0, n-1]\]

std::ranges::sort allows the creation of custom projections and comparators (e.g. Odd Numbers First). The following code fragment implements this using C++20 features:

constexpr auto odd_even_sort = [](std::ranges::input_range auto&& range) {
   std::ranges::sort(range, {}, [](int x) { 
       return std::pair{x % 2 == 0, x}; 
   });
};

std::vector<int> vec = {5, 2, 9, 1, 5, 6};
odd_even_sort(vec);

In this case, for elements $a$ and $b$, the ordering follows:

\[\begin{cases} \text{true} & \text{if } a \bmod 2 > b \bmod 2 \\ a < b & \text{if } a \bmod 2 = 1 \text{ and } b \bmod 2 = 1 \\ a > b & \text{if } a \bmod 2 = 0 \text{ and } b \bmod 2 = 0 \end{cases}\]

The next example uses a projection to sort elements based on their remainder when divided by 3. The operation has time complexity $O(n \log n)$.

std::vector<int> vec = {5, 2, 9, 1, 5, 6};
std::ranges::sort(vec, {}, [](int x) { return x % 3; });

For elements $a$ and $b$, the ordering is determined by:

\[a \prec b \iff a \bmod 3 < b \bmod 3\]

In addition to basic types, std::ranges::sort efficiently handles custom data types through projections and comparators. The following code fragment demonstrates sorting a vector of student records based on multiple criteria: GPA (descending) and age (ascending when GPAs are equal).

struct Student {
   std::string name;
   double gpa;
   int age;
   
   Student(std::string n, double g, int a) 
       : name(n), gpa(g), age(a) {}
};

std::vector<Student> students = {
   {"Alice", 3.8, 20},
   {"Bob", 3.8, 19},
   {"Charlie", 3.9, 22},
   {"David", 3.7, 21}
};

constexpr auto float_eq = [](double a, double b) {
   return std::abs(a - b) < 
          std::numeric_limits<double>::epsilon() * 
          std::max(std::abs(a), std::abs(b));
};

std::ranges::sort(students, {}, [&](const Student& s) {
   return std::tuple{-s.gpa, s.age};  // negative gpa for descending order
});

In this case, the time complexity remains $O(n \log n)$ where $n$ is the number of students. The comparator performs constant time operations:

\[\begin{cases} \text{Compare GPAs} & O(1) \\ \text{Compare ages if GPAs equal} & O(1) \end{cases}\]

When dealing with floating-point comparisons, the code uses $\epsilon$-comparison ($1e-9$) to handle potential floating-point precision issues:

\[|a - b| < \epsilon \implies a \approx b\]

The space complexity is $O(1)$ as the comparison operations require only a constant amount of additional memory, independent of the input size.

This example demonstrates a composite projection using std::tuple. In C++20, we can go further and chain projections in different ways. For instance, if we want to normalize GPA scores before sorting:

auto view = students 
    | std::views::transform([](Student& s) {
        s.gpa = std::clamp(s.gpa, 0.0, 4.0); 
        return s;
    });
std::ranges::sort(view, {}, [](const Student& s) {
    return std::tuple{-s.gpa, s.age};
});

Projections in C++20 provide a powerful way to customize how algorithms view and process data. A projection is a unary callable that transforms elements before they are processed by the algorithm, without modifying the original data. While the previous examples showed basic projections, C++20’s design allows for sophisticated projection patterns:

Projections can transform data without copying:
std::ranges::sort(students, {}, &Student::gpa);  // Project directly to gpa
Multiple transformations can be composed:
 std::ranges::sort(students, {}, [](const Student& s) {
    return std::make_tuple(
        s.gpa / 4.0,  // Normalize GPA
        s.age,        // Secondary key
        s.name        // Tertiary key
    );
});
Projections can be used with views:
auto ranked_students = students
    | std::views::transform([](const Student& s) {
        return std::tuple{s.department, s.gpa}; 
    })
    | std::views::filter([](const auto& t) { 
        return std::get<0>(t) == "CS"; 
    });
std::ranges::sort(ranked_students);
In each case, the projection maintains the algorithm’s original complexity ($O(n \log n)$ for sort) while providing a clean separation between data access and algorithmic logic. This separation makes the code more maintainable and allows for flexible composition of operations, a key feature for competitive programming where quick modifications might be needed.

These projection capabilities in C++20’s std::ranges::sort provide a robust foundation for complex sorting operations. However, sometimes maintaining the relative order of equal elements is crucial - a property that std::ranges::sort doesn’t guarantee. For these cases, C++20 provides std::ranges::stable_sort, which ensures that elements comparing equal maintain their original relative order. Let’s explore how stable sorting works and when to use it.

2.5.1 Stable Sorting of Vectors

In C++, sorting a vector while preserving the relative order of equal elements is achieved using std::stable_sort from the <algorithm> header. Starting with C++20, stable_sort fully integrates with concepts, requiring elements to satisfy std::sortable, which encompasses std::movable and std::totally_ordered. This function ensures that elements considered equal retain their original order after sorting², which is essential in applications requiring stable sorting, such as when sorting records based on multiple criteria.

Concepts are predicates on types that can be used to constrain templates. The std::stable_sort uses several important concepts:

std::sortable: The fundamental concept required for types that can be sorted

std::movable: Ensures the type can be efficiently moved

std::totally_ordered: Ensures the type has a well-defined total ordering

std::totally_ordered_with<T, U>: Used when comparing two different types

Usage example:
template<typename T>
requires std::sortable<T>
void sort_data(std::vector<T>& data) {
    std::stable_sort(data.begin(), data.end());
}
Concepts make template errors clearer and allow the compiler to check type requirements at compile time. The Code XXXX, show’s a example of all those concepts use.

#include <iostream>
#include <vector>
#include <algorithm>
#include <string>
#include <concepts>
#include <compare>

    /**
    * @brief Student structure that demonstrates the implementation of C++20 concepts
    * 
    * This structure satisfies multiple C++20 concepts:
    * 1. std::movable - Through move constructor and move assignment operator
    * 2. std::totally_ordered - Through the spaceship operator (<=>)
    * 3. std::sortable - Implicitly through the combination of movable and totally_ordered
    * 
    * The structure also demonstrates proper memory management practices with
    * the rule of five implementation.
    */
struct Student {
    std::string name;
    int id;
    float grade;

    /**
    * @brief Implements three-way comparison operator (spaceship operator)
    * 
    * The '= default' implementation automatically provides:
    * - All six comparison operators (<, <=, >, >=, ==, !=)
    * - Member-wise comparison in declaration order
    * - Type safety through proper return type deduction
    * 
    * This satisfies std::totally_ordered concept requirements.
    */
    auto operator<=>(const Student&) const = default;
    
    /**
    * @brief Move constructor and assignment operator
    * 
    * These operations are marked noexcept to guarantee strong exception safety
    * and to satisfy std::movable concept requirements.
    * The '= default' implementation provides efficient member-wise moving.
    */
    Student(Student&&) noexcept = default;
    Student& operator=(Student&&) noexcept = default;
    
    /**
    * @brief Custom constructor using member initializer list
    * 
    * Note the use of std::move for the string member to prevent unnecessary copying.
    * This is an optimization that's especially important when dealing with 
    * resource-managing types like std::string.
    */
   Student(std::string n, int i, float g) 
       : name(std::move(n)), id(i), grade(g) {}

    /**
    * @brief Copy operations
    * 
    * Even though we implement move operations, we still want our type to be copyable.
    * The '= default' implementation provides member-wise copying.
    */
    Student(const Student&) = default;
    Student& operator=(const Student&) = default;
};

/**
* @brief Utility function to print Student information
* 
* Takes a const reference to prevent copying and allow printing of both 
* lvalues and rvalues without modification.
*/
void print_student(const Student& s) {
    std::cout << "Name: " << s.name 
              << ", ID: " << s.id 
              << ", Grade: " << s.grade << "\n";
}

/**
* @brief Compile-time concept verification
* 
* These static_assert statements verify at compile-time that our Student type
* satisfies the required concepts. If any assertion fails, we get a clear
* compilation error instead of cryptic template error messages.
*/
static_assert(std::movable<Student>, 
   "Student must satisfy std::movable to ensure efficient container operations");
static_assert(std::totally_ordered<Student>, 
     "Student must satisfy std::totally_ordered to enable sorting and comparison");

/**
* @brief Template function demonstrating concept constraints
* 
* This function template shows how to use concepts in constraints:
* - std::sortable requires that elements can be sorted (moved and compared)
* - std::totally_ordered ensures all comparison operators are available
* 
* @tparam T The type of elements to sort
* @param data Vector of elements to be sorted
*/
template<typename T>
requires std::sortable<T*> && std::totally_ordered<T>
void stable_sort_data(std::vector<T>& data) {
    std::stable_sort(data.begin(), data.end());
}

/**
* @brief Demonstrates std::totally_ordered_with concept
* 
* This template function shows how to compare different types that have a
* valid ordering relationship between them. The concept ensures that both
* types can be meaningfully compared.
* 
* @tparam T First type to compare
* @tparam U Second type to compare
* @return bool Result of the comparison
*/
template<typename T, typename U>
requires std::totally_ordered_with<T, U>
bool compare_different_types(const T& t, const U& u) {
    return t < u;  // Valid because T and U satisfy totally_ordered_with
}

int main() {
    // Creating a test vector with intentionally duplicated grades
    // to demonstrate the stability of std::stable_sort
    std::vector<Student> students = {
        {"Alice", 2, 85.5f},
        {"Bob", 1, 85.5f},    // Same grade as Alice
        {"Charlie", 3, 90.0f},
        {"David", 4, 85.5f}   // Same grade as Alice and Bob
    };

    std::cout << "Original order:\n";
    for (const auto& s : students) print_student(s);

    /**
    * Using std::stable_sort with a projection.
    * Key points:
    * 1. std::less<>{} - Empty comparator using type deduction
    * 2. &Student::grade - Member pointer projection
    * 3. Stability guarantee - Equal elements maintain relative order
    */
    std::stable_sort(students.begin(), students.end(),
                    std::less<>{},        // Comparator
                    &Student::grade);      // Projection

    std::cout << "\nAfter stable sort by grade (notice preserved order of equal grades):\n";
    for (const auto& s : students) print_student(s);

    // Using our concept-constrained template function
    std::cout << "\nUsing template function with concepts (sorts using operator<=):\n";
    stable_sort_data(students);  // Will use the spaceship operator
    for (const auto& s : students) print_student(s);

    // Demonstrating std::totally_ordered_with between different types
    Student s1{"Test", 1, 90.0f};
    float grade = 85.0f;
    
    /**
     * This comparison is possible because:
     * 1. Student::grade is a float
     * 2. float satisfies totally_ordered_with float
     * 3. Our compare_different_types function is properly constrained
     */
    bool result = compare_different_types(s1.grade, grade);
    std::cout << "\nComparing different types (grade comparison): " 
              << (result ? "greater" : "less") << "\n";

    return 0;
}

Code XXX: put legend here

The static_assert mechanism in C++ provides compile-time verification of conditions. When combined with C++20 concepts, it becomes an essential tool for type checking and constraint validation during compilation. The basic syntax is straightforward:
static_assert(constant_expression, message);
One of the fundamental uses of static_assert is verifying type properties. For instance, when working with container types, we often need to ensure that our types support move operations. Here’s how we verify that a type satisfies the movable concept:
static_assert(std::movable<Student>, 
    "Student type must satisfy movable requirements for container operations");
When the compiler processes this static_assert, it verifies that Student implements the necessary operations for the std::movable concept. This includes checking for:
Student(Student&&) noexcept;              // Move constructor
Student& operator=(Student&&) noexcept;   // Move assignment operator
When working with sorting algorithms, we need to ensure our types can be properly ordered. The totally_ordered concept helps verify this requirement:
static_assert(std::totally_ordered<Student>,
    "Student type must provide consistent ordering operations");
This verification ensures that Student provides all necessary comparison operations, either through individual operators or through the three-way comparison operator (<=>). A complete implementation might look like:
struct Student {
    std::string name;
    float grade;
    
    // The spaceship operator provides all comparison operations
    auto operator<=>(const Student&) const = default;
    
    // Move operations
    Student(Student&&) noexcept = default;
    Student& operator=(Student&&) noexcept = default;
};
C++20 allows us to create more complex requirements by combining concepts. Here’s how we might define and check a custom concept for student-like types:
template<typename T>
concept StudentLike = requires(T t) {
    { t.name } -> std::convertible_to<std::string>;
    { t.grade } -> std::convertible_to<float>;
    requires std::movable<T>;
 requires std::totally_ordered<T>;
};

static_assert(StudentLike<Student>,
    "Student type must provide name and grade accessors");
When a static_assert fails, the compiler provides detailed information about the failure. For example, trying to use a type with const members that prevent moving:
struct StudentWithConstMember {
    const std::string name;  // const prevents moving
    float grade;
};

// This would fail at compile time with a detailed error message
// static_assert(std::movable<StudentWithConstMember>);
The compiler would generate an error message explaining that the type cannot be moved due to the const member.

Template Code Verification: Static assertions are particularly useful in template code to verify that template parameters meet necessary requirements:
template<typename T>
void process_student_data(std::vector<T>& data) {
    static_assert(StudentLike<T>,
        "Template type must provide student-like interface");
    // Processing code
}
Complex Requirements: We can create more specific checks by combining concepts with type traits. This allows us to verify multiple properties simultaneously:
static_assert(std::sortable<Student*>,
    "Student type must support sorting operations");

static_assert(std::is_nothrow_move_constructible_v<Student> &&
              std::totally_ordered<Student>,
    "Student type must support efficient operations");
Remember that static_assert operates at compile time, which means it can verify properties of types but not properties of objects:
void example_function(const std::vector<Student>& students) {
    // This wouldn't compile - size() is a runtime value
    // static_assert(students.size() > 0);
    
    // Instead, we verify type properties
    static_assert(std::ranges::range<decltype(students)>);
}
Static assertions help catch potential issues early in the development process, provide clear error messages when requirements aren’t met, and serve as inline documentation of type requirements. When used with C++20 concepts, they create a powerful system for ensuring type safety and correct usage of template code.

Unlike std::sort, which may not maintain the original order of equal elements due to its underlying algorithm, std::stable_sort guarantees stability by using a variation of the Merge Sort algorithm². Merge Sort has a time complexity of $O(n \log n)$ and is inherently stable because it preserves the order of equal elements during the merge steps.

For example, consider a vector of Person structures. We can define the Person struct without unnecessary templates and ensure type safety by correctly using C++20 features:

struct Person {
    std::string name;
    int age;
};

std::vector<Person> people = {
    {"Alice", 30},
    {"Bob", 25},
    {"Charlie", 25},
    {"David", 35}
};

If we want to sort people by age while maintaining the original order among people with the same age, we can use std::stable_sort with C++20 projections:

// Using C++20 projections with std::stable_sort
std::stable_sort(people.begin(), people.end(), std::less<>{}, &Person::age);

// Or using the ranges version (C++20)
std::ranges::stable_sort(people, std::less<>{}, &Person::age);

Where std::less<>{} is the default comparator that uses the < operator. &Person::age is the projection that tells the sort function to compare the age member of Person objects.

In C++20, projections allow algorithms to operate on a specific member or property of an object. A projection maps an object to a value used for comparisons or other operations. This is done by passing a pointer to the member variable or member function to the algorithm.

When sorting objects, projections extract the value of a specified member from each element in the collection. For example, sorting a vector of Person objects by age uses a projection that extracts the age member from each Person.

Example of using projections with std::ranges::stable_sort:
#include &lt;algorithm&gt;
#include &lt;vector&gt;
#include &lt;string&gt;
#include &lt;ranges&gt;

struct Person {
    std::string name;
    int age;
};

int main() {
    std::vector&lt;Person&gt; people = {
        {"Alice", 30},
        {"Bob", 25},
        {"Charlie", 25},
        {"David", 35}
    };
    
    // Sorting with a projection on the 'age' member
    std::ranges::stable_sort(people, std::less&lt;&gt;{}, &amp;Person::age);
    
    return 0;
}
In this code, &Person::age extracts the age member from each Person object in the vector. The comparison uses these extracted values, sorting the vector based on age.

Consider a vector $V$ containing $n$ elements of type T. Each element $v_i$ in $V$ has a member $m$, represented as $v_i.m$. When using a projection in a sorting algorithm:
\[\text{compare}(v_i.m, v_j.m) \quad \forall \, i, j \in [0, n-1]\]
Here, the projection extracts the member $m$ from each element $v_i$, and the sorting is based on $m$. The sorting algorithm rearranges the elements such that:
\[v_i.m \leq v_{i+1}.m \quad \text{for all} \, i \in [0, n-2]\]
Projections can be used in any C++20 algorithm that accepts comparators, enabling operations based on specific members of objects.

Comparison Operators in C++20: C++20 introduces new comparison operators and concepts, enhancing type safety and consistency. The standard library includes several predefined comparators, simplifying sorting and related algorithms. Among these are std::less<>, std::greater<>, and the three-way comparison operator (<=>), also known as the spaceship operator.

Standard Comparators: std::less<> and std::greater<>. std::less<> performs a less-than comparison ($<$), returning true if the first argument is less than the second. It is often used as the default comparator in sorting functions.

Example of using std::less<> in sorting:
#include <algorithm>
#include <vector>

std::vector<int> numbers = {4, 2, 5, 1, 3};

int main() {
    // Sorting with std::less<>
    std::sort(numbers.begin(), numbers.end(), std::less<>{});
    
    return 0;
}
Similarly, std::greater<> is a comparator that represents the greater-than operator ($>$). It can be used to sort elements in descending order:
#include <algorithm>
#include <vector>

std::vector<int> numbers = {4, 2, 5, 1, 3};

int main() {
    // Sorting with std::greater<>
    std::sort(numbers.begin(), numbers.end(), std::greater<>{});
    
    return 0;
}
Both comparators are part of the function objects defined in <functional> header, making comparisons straightforward and intuitive.

The Three-Way Comparison Operator (<=>): The new kid in the block, the three-way comparison operator, allows comparisons to be performed in a single function, returning an integer-like result that indicates whether the left operand is less than, equal to, or greater than the right operand. This operator supports all relational comparisons (<, <=, >, >=, ==, !=).

To use the operator, a type must implement operator<=>. It can be defined explicitly or generated automatically using = default:
#include <compare>

struct Number {
    int value;
    
    // Three-way comparison operator
    auto operator<=>(const Number&) const = default;
};

int main() {
    Number a{5}, b{10};
    
    if (a < b) {
        // This uses the <=> operator for comparison
    }
    
    return 0;
}
When operator<=> is defined, the relational comparison results align as follows:
\[x <=> y = \begin{cases} -1 & \text{if } x < y \\ 0 & \text{if } x = y \\ 1 & \text{if } x > y \end{cases}\]
Mathematical Interpretation: The standard comparators correspond to the following relations:

std::less<>(x, y):
\[x < y \quad \text{if and only if} \quad \text{std::less<>{}}(x, y) = \text{true}\]

std::greater<>(x, y):

\[x > y \quad \text{if and only if} \quad \text{std::greater<>{}}(x, y) = \text{true}\]
The three-way comparison operator combines all relational checks into one expression, reducing redundancy and making comparisons more uniform. This operator simplifies the comparison logic in functions, providing clear outcomes for each comparison case.

By using std::stable_sort, we ensure that if two Person objects have the same age, their relative order in the original vector is preserved. So, after sorting, “Bob” and “Charlie”, both aged 25, will remain in their original order (“Bob” before “Charlie”) in the sorted vector. Also, after sorting, “Bob” and “Charlie”, both aged 25, will remain in their original order (Bob before Charlie) in the sorted vector.

The key difference between std::sort and std::stable_sort lies in their stability and underlying algorithms. std::sort uses Introsort, which is a hybrid of QuickSort, HeapSort, and InsertionSort, and is not guaranteed to be stable. In contrast, std::stable_sort uses a stable sorting algorithm, typically a hybrid of Merge Sort and InsertionSort, to ensure stability.

In competitive programming, using std::stable_sort can be advantageous when sorting data based on multiple criteria. With C++20, we can leverage ranges and views for more expressive and composable sorting operations. Instead of writing complex comparison functions to handle all sorting criteria at once, one can perform multiple stable sorts in succession using either traditional iterators or the new ranges interface. By sorting first on the least significant criterion and last on the most significant one, each stable sort preserves the ordering of the previous sorts for elements considered equal. For instance, suppose we have a list of students with name, grade, and age, and we want to sort them by grade in descending order, then by age in ascending order, and finally by name alphabetically. We can achieve this using either the traditional approach or C++20 ranges:

// Traditional approach with C++20 projections
std::stable_sort(students.begin(), students.end(), std::less{}, &Student::name);
std::stable_sort(students.begin(), students.end(), std::less{}, &Student::age);
std::stable_sort(students.begin(), students.end(), std::greater{}, &Student::grade);

// Using C++20 ranges and views
auto sorted_students = students 
    | std::views::stable_sort(std::less{}, &Student::name)
    | std::views::stable_sort(std::less{}, &Student::age)
    | std::views::stable_sort(std::greater{}, &Student::grade);

This sequence of stable sorts ensures that the final ordering reflects all the sorting criteria, with stability preserving the relative order from previous sorts.

Regarding the time and space complexity of std::stable_sort, its time complexity remains $O(n \log n)$, similar to std::sort. Starting with C++20, we can also use execution policies for potential parallel execution:

std::stable_sort(std::execution::par_unseq, 
                students.begin(), students.end(),
                std::less{}, &Student::grade);

However, the space complexity of std::stable_sort can be higher than that of std::sort. Specifically, std::stable_sort may require additional memory proportional to $O(n)$ in the worst case. This extra space is needed because the merging process in Merge Sort requires temporary storage to combine the sorted subarrays. During the merge step, elements from the divided subarrays are copied into a temporary array to ensure that the stability of the sort is maintained—that is, equal elements retain their original relative positions.

In contrast, std::sort often uses in-place sorting algorithms like Introsort, which require only $O(\log n)$ additional space for the recursion stack. This difference means that while std::stable_sort provides the benefit of stability, it does so at the potential cost of increased memory usage. When working with large datasets or in environments with limited memory resources, this additional space requirement can be a significant consideration.

C++20 also introduces constexpr support for std::stable_sort, enabling compile-time sorting when possible:

constexpr bool test_stable_sort() {
    std::array<int, 5> arr = {3, 1, 4, 1, 5};
    std::stable_sort(arr.begin(), arr.end());
    return std::is_sorted(arr.begin(), arr.end());
}
static_assert(test_stable_sort());

When working with types that implement the spaceship operator (operator<=>), C++20’s stable_sort automatically leverages the three-way comparison for potentially more efficient sorting operations. Additionally, the implementation may optimize memory usage through efficient move operations when working with types that satisfy std::move_constructible and std::move_assignable.

2.4.3. Using Span and Ranges

In the fast-paced world of competitive programming and high-performance computing, efficient data manipulation is paramount. C++20 introduces two powerful features - std::span and std::ranges for that.

These features are particularly important because they address common performance bottlenecks in data-intensive applications. std::span provides a lightweight, non-owning view into contiguous data, reducing unnecessary copying and allowing for flexible, efficient data access. std::ranges, on the other hand, offers a unified, composable interface for working with sequences of data, enabling more intuitive and often more performant algorithm implementations. Together, they form a potent toolkit for developers seeking to push the boundaries of what’s possible in terms of code efficiency and elegance in C++.

Using `std::span`

The std::span is a new feature introduced in C++20 that allows you to create lightweight, non-owning views of arrays and containers, such as std::vector. This avoids unnecessary copying of data and provides a flexible and efficient way to access and manipulate large blocks of data. std::span can be particularly useful when working with large datasets, file I/O, or when optimizing memory usage in competitive programming.

Unlike containers such as std::vector, std::span doesn’t own the data it references. This means it doesn’t allocate new memory and works directly with existing data, leading to lower memory overhead. Additionally, std::span can work with both static arrays and dynamic containers (like std::vector) without requiring copies. It provides safer array handling compared to raw pointers, as it encapsulates size information. Since std::span eliminates the need for memory copies, it can speed up operations where large datasets need to be processed in-place, or only certain views of data are required.

Example of std::span for Efficient Data Access:

In this example, we create a std::span from a std::vector of integers, allowing us to iterate over the vector’s elements without copying the data:

#include <iostream>
#include <span>
#include <vector>

int main() {
    // Create a vector of integers
    std::vector<int> numbers = {1, 2, 3, 4, 5};

    // Create a span view of the vector
    std::span<int> view(numbers);

    // Iterate over the span and print the values
    for (int num : view) {
        std::cout << num << " ";
    }
    std::cout << std::endl;

    return 0;
}

std::span<int> view(numbers); creates a non-owning view of the std::vector<int> numbers. This allows access to the elements of the vector without copying them. The loop for (int num : view) iterates over the elements in the std::span, just like it would with the original std::vector, but with no additional overhead from copying the data.

Efficient Use Cases for `std::span`

std::span is especially useful when you want to work with sub-ranges of arrays or vectors. For example, when working with just part of a large dataset, you can use std::span to reference a subset without slicing or creating new containers:

std::span<int> subrange = view.subspan(1, 3); // Access elements 1, 2, and 3
for (int num : subrange) {
    std::cout << num << " "; // Outputs: 2 3 4
}

When passing data to functions, std::span provides an efficient alternative to passing large vectors or arrays by reference. You can pass a span instead, ensuring that no copies are made, while maintaining full access to the original data:

void process_data(std::span<int> data) {
    for (int num : data) {
        std::cout << num << " ";
    }
    std::cout << std::endl;
}

int main() {
    std::vector<int> numbers = {10, 20, 30, 40, 50};
    process_data(numbers); // Pass the vector as a span
    return 0;
}

In the early example, the function process_data accepts a std::span, avoiding unnecessary copies and keeping the original data structure intact.

Comparing `std::span` to Traditional Methods

Feature	`std::vector`	Raw Pointers	`std::span`
Memory Ownership	Yes	No	No
Memory Overhead	High (allocates memory)	Low	Low
Bounds Safety	High	Low	High
Compatibility	Works with STL	Works with raw arrays	Works with both

Unlike std::vector, which manages its own memory, std::span does not allocate or own memory. This is similar to raw pointers but with added safety since std::span knows its size. std::span is safer than raw pointers because it carries bounds information, helping avoid out-of-bounds errors. While raw pointers offer flexibility, they lack the safety features provided by modern C++.

Practical Application: Using `std::span` in Competitive Programming

When working with large datasets in competitive programming, using std::span avoids unnecessary memory copies, making operations faster and more efficient. You can easily pass sub-ranges of data to functions without creating temporary vectors or arrays. Additionally, it allows you to maintain full control over memory without introducing complex ownership semantics, as with std::unique_ptr or std::shared_ptr.

Example: Efficiently Passing Data in a Competitive Programming Scenario:

#include <iostream>
#include <span>
#include <vector>

void solve(std::span<int> data) {
    for (int num : data) {
        std::cout << num * 2 << " "; // Example: print double each value
    }
    std::cout << std::endl;
}

int main() {
    std::vector<int> input = {100, 200, 300, 400, 500};

    // Use std::span to pass the entire vector without copying
    solve(input);

    // Use a subspan to pass only a portion of the vector
    solve(std::span<int>(input).subspan(1, 3)); // Pass elements 200, 300, 400

    return 0;
}

2.4.4. Data Manipulation with `std::ranges`

C++20 brought the <ranges> library—a tool for handling data sequences with power and ease. It works through lazy views and composable transformations. std::ranges lets you create views over containers or arrays, avoiding changes and extra copies. In competitive programming and high-performance tasks, cutting down on memory and computation is key.

With std::vector and other containers, you often need extra storage or loops for things like filtering, transforming, or slicing data. std::ranges changes that. It lets you chain operations in a simple, expressive way without losing speed. It uses lazy evaluation, meaning transformations only happen when needed, not upfront.

std::ranges revolves around “views” of data, a windows that let you look at and manipulate sequences without owning them. A view acts like a container, but it doesn’t hold the data itself. It just provides a way to interact with it, making operations light and efficient.

The advantage? std::ranges stacks operations without creating new containers, saving memory. Traditional methods create copies with every action (filter, transform, slice) adding overhead. Ranges avoid this, evaluating only when the data is accessed. Memory stays low, performance stays high, especially with big data.

Performance gains also come from optimized operations. By working lazily and directly on the data, std::ranges avoids unnecessary copies and allocations. The result is better cache usage and fewer CPU cycles wasted on managing temporary structures.

Example Filtering and Transforming Data with std::ranges:

Suppose we have a vector of integers and we want to filter out the odd numbers and then multiply the remaining even numbers by two. Using traditional methods, we would need to loop through the vector, apply conditions, and store the results in a new container. With std::ranges, this can be done in a more expressive and efficient way:

#include <iostream>
#include <vector>
#include <ranges>
#include <span>

int main() {
    // Sample data
    std::vector<int> data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

    // Using std::span to create a view over the existing data without copying
    std::span<int> data_span(data);

    // Using std::ranges to filter and transform data lazily
    auto processed_view = data_span
        | std::views::filter([](int x) { return x % 2 == 0; }) // Filter even numbers
        | std::views::transform([](int x) { return x * x; });  // Square each number

    // Iterating over the processed view
    for (int value : processed_view) {
        std::cout << value << " ";
    }

    return 0;
}

The line that creates processed_view does the heavy lifting. It uses std::span and std::ranges to work smart, not hard. Here’s what happens, step by step:

auto processed_view = data_span
        | std::views::filter([](int x) { return x % 2 == 0; })
        | std::views::transform([](int x) { return x * x; });

First, data_span is a direct window into the data. No copies, no waste. Then comes std::views::filter. It uses a lambda function, which is just a fancy name for a tiny, anonymous function that you write right there. The filter is simple—it checks each number, [] (int x) { return x % 2 == 0; }. It says, “If the number is even, keep it; if not, toss it.” No fuss.

Next is std::views::transform. It’s another lambda, [] (int x) { return x * x; }. It takes what’s left from the filter and squares each number. The job is done on the fly, no waiting. The power here is that everything happens only when needed. No intermediate results, no unnecessary containers. Just efficient, on-demand calculation. The functions do their work with precision—like a scalpel, not a sledgehammer.

The | operator in C++20 is called the pipe operator. It chains ranges together, like pieces of a machine, one feeding into the next, just like a Unix pipeline. In the processed_view line, it links std::views::filter first, then std::views::transform. Data moves step by step, each part doing its job without extra fuss. No need for temporary variables or extra code. It’s clean, efficient. Each transformation builds on the last, clear as a straight line.

But the pipe doesn’t stop with ranges. You can use it with custom classes or user types too. Overload the operator, and you’ve got yourself a clear chain of actions, each link precise and sharp. It can take complex operations and make them simple, one step feeding the next, easy to read and hard to get wrong.

The pipe works with I/O too. It lets you stack stream manipulators like a craftsman arranging his tools,std::cout | std::hex | std::uppercase. It’s smooth, no clutter. In functional code, pipes connect functions, turning data flow into a straight path. They make the code tell a story—one step at a time, each part pulling its weight.

The | operator isn’t just a trick for ranges; it’s a way to keep the code honest. It turns complex work into a direct line, clear, readable, and true to the task.

One of the main strengths of std::ranges is how you can stack operations, one on top of the other. You filter, you transform, you slice—and it all stays as a view. No extra containers, no wasted steps. You build a pipeline that works only when you call on it, no sooner. It’s efficient and lean, cutting through the data like a sharp knife. You get what you need when you need it, nothing more. The code is clean, each piece doing its job, each step feeding the next without clutter or delay.

Consider another example, where we filter, transform, and take only a part of the data:

#include <iostream>
#include <vector>
#include <ranges>
#include <span>

int main() {
    // Sample data
    std::vector<int> data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

    // Using std::span to create a direct, non-owning view of the data
    std::span<int> data_span(data);

    // Using std::ranges with std::span to filter, transform, and limit the data efficiently
    auto processed_view = data_span
        | std::views::filter([](int x) { return x % 2 == 0; })  // Filter even numbers
        | std::views::transform([](int x) { return x * x; })   // Square each number
        | std::views::take(3);                                // Take the first 3 elements

    // Iterate over the processed view
    for (int value : processed_view) {
        std::cout << value << " ";
    }

    return 0;
}

In this example, we stack three operations. First, we filter the numbers, keeping those $20$ or higher. Then, we double them. Finally, we take the first three. It all happens lazily, nothing done until you need it. When you loop through the final view, result, that’s when the work gets done. No extra containers, no wasted steps. Each transformation hits just once, right where it counts. The code stays lean, the processing sharp and efficient, doing just enough—no more, no less.

2.8. Loops the Heart of All Competitive Programming

Loops are, without a doubt, the most important part of any code, whether for competitive programming, high-performance applications, or even solving academic problems. Most programming languages offer more than one way to implement loops. In this text, since Python is only our pseudocode language, we will focus on studying loops in C++.

2.8. Understand `for` Loops

C++ provides several ways to iterate over elements in a vector, using different types of for loops. In this section, we will explore the various for loop options available in C++20, discussing their performance and code-writing efficiency. We will also analyze which loops are best suited for competitive programming based on input size—whether dealing with small or large datasets.

2.8.1 `for` Loop with Iterator

The for loop using iterators is one of the most efficient ways to iterate over a vector, especially for complex operations where you need to manipulate the elements or the iterator’s position directly.

for (auto it = vec.begin(); it != vec.end(); ++it) {
    std::cout << *it << " ";
}

Utilizing iterators directly avoids unnecessary function calls such as operator[] and allows fine-grained control over the iteration. Ideal when detailed control over the iterator is necessary or when iterating over containers that do not support direct index access (e.g., std::list).

Input Size Consideration:

For Small Inputs: This is a solid option as it allows precise control over the iteration with negligible overhead.
For Large Inputs: Highly efficient due to minimal overhead and memory usage. However, ensure that the iterator’s operations do not induce cache misses, which can slow down performance for large datasets.

2.8.2. Classic `for` Loop with Index

The classic for loop using an index is efficient and provides precise control over the iteration process.

for (size_t i = 0; i < vec.size(); ++i) {
    std::cout << vec[i] << " ";
}

Accessing elements via index is fast, but re-evaluating vec.size() in each iteration can introduce a small overhead. Useful when you need to access or modify elements by their index or when you may need to adjust the index inside the loop.

Input Size Consideration:

For Small Inputs: Efficient and straightforward, especially when the overhead of re-evaluating vec.size() is negligible.
For Large Inputs: If performance is critical, store vec.size() in a separate variable before the loop to avoid repeated function calls, which can become significant for larger datasets.

2.8.3. Range-Based `for-each` with Constant Reference

Range-based for-each with constant reference is highly efficient for reading elements since it avoids unnecessary copies.

for (const auto& elem : vec) {
    std::cout << elem << " ";
}

Using constant references avoids copying, making it very efficient for both memory and execution time. Recommended for reading elements when you don’t need to modify values or access their indices.

Input Size Consideration:

For Small Inputs: Ideal for minimal syntax and efficient execution.
For Large Inputs: Excellent choice due to the avoidance of element copies, ensuring optimal memory usage and performance.

2.8.4. Range-Based `for-each` by Value

The for-each loop can also iterate over elements by value, which is useful when you want to work with copies of the elements.

for (auto elem : vec) {
    std::cout << elem << " ";
}

Elements are copied, which can reduce performance, especially for large data types. Useful when you need to modify a copy of the elements without affecting the original vector.

Input Size Consideration:

For Small Inputs: Suitable when the overhead of copying is negligible, especially if you need to modify copies of elements.
For Large Inputs: Avoid for large datasets or large element types, as the copying can lead to significant performance degradation.

2.8.5. `for` Loop with Range Views (C++20)

C++20 introduced range views, which allow iteration over subsets or transformations of elements in a container without creating copies.

for (auto elem : vec | std::views::reverse) {
    std::cout << elem << " ";
}

Range views allow high-performance operations, processing only the necessary elements. Ideal for operations involving transformations, filtering, or iterating over subsets of elements.

Input Size Consideration:

For Small Inputs: Works well, especially when applying transformations like reversing or filtering, while maintaining code readability.
For Large Inputs: Very efficient as no extra memory is allocated, and the processing is done lazily, meaning only the required elements are accessed.

2.8.6. Parallel `for` Loop (C++17/C++20)

While not a traditional for loop, using parallelism in loops is a powerful feature introduced in C++17 and further improved in C++20.

#include <execution>

std::for_each(std::execution::par, vec.begin(), vec.end(), [](int& elem) {
elem \*= 2; // Parallelized operation
});

Uses multiple threads to process elements in parallel, offering substantial performance gains for intensive operations that can be performed independently on large datasets. It requires more setup and understanding of parallelism concepts but can provide significant performance boosts for operations on large datasets.

Input Size Consideration:

For Small Inputs: Overkill. The overhead of managing threads and synchronization outweighs the benefits for small datasets.
For Large Inputs: Extremely efficient. When dealing with large datasets, parallel processing can drastically reduce runtime, especially for computationally expensive operations.

2.8.7. Optimal `for` Loops for Competitive Programming

Choosing the right type of for loop in competitive programming depends largely on input size and the specific use case. The following table summarizes the best choices for different scenarios:

Input Size	Best `for` Loop Option	Reasoning
Small	Range-Based `for-each` with Constant Reference	Offers minimal syntax, high readability, and avoids copies, making it fast and efficient.
Small	Classic `for` Loop with Index	Provides precise control over the index, useful when index manipulation or modification is required.
Large	Iterator-Based `for` Loop	Highly efficient for large datasets due to minimal memory overhead and optimized performance.
Large	Parallel `for` Loop with `std::for_each` and `std::execution::par`	Ideal for computationally heavy tasks on large datasets, leveraging multiple threads to parallelize.
Transformations	`for` Loop with Range Views (C++20)	Ideal for processing subsets or transformations of data without creating extra copies.

2.9 Now the `while` Loop which we all love

The while loop is another fundamental control structure in C++ that is often used in competitive programming. It repeatedly executes a block of code as long as a specified condition evaluates to true. In this section, we will explore the different use cases for while loops, their performance considerations, and scenarios where they may be preferable to for loops. We will also examine their application with both small and large datasets.

2.9.1. Basic `while` Loop

A while loop continues executing its block of code until the condition becomes false. This makes it ideal for situations where the number of iterations is not known beforehand.

int i = 0;
while (i < n) {
    std::cout << i << " ";
    i++;
}

The while loop is simple and provides clear control over the loop’s exit condition. The loop runs while i < n, and the iterator i is incremented manually within the loop. This offers flexibility in determining when and how the loop terminates.

Input Size Consideration:

For Small Inputs: This structure is efficient, especially when the number of iterations is small and predictable.
For Large Inputs: The while loop can be optimized for larger inputs by ensuring that the condition is simple to evaluate and that the incrementing logic doesn’t introduce overhead.

2.9.2. `while` Loop with Complex Conditions

while loops are particularly useful when the condition for continuing the loop involves complex logic that cannot be easily expressed in a for loop.

int i = 0;
while (i < n && someComplexCondition(i)) {
    std::cout << i << " ";
    i++;
}

In this case, the loop runs not only based on the value of i, but also on the result of a more complex function. This makes while loops a good choice when the exit condition depends on multiple variables or non-trivial logic.

Input Size Consideration::

For Small Inputs: This is ideal for small inputs where the condition can vary significantly during the iterations.
For Large Inputs: Be cautious with complex conditions when dealing with large inputs, as evaluating the condition on every iteration may add performance overhead.

2.9.3. Infinite `while` Loops

An infinite while loop is a loop that runs indefinitely until an explicit break or return statement is encountered. This type of loop is typically used in scenarios where the termination condition depends on an external event, such as user input or reaching a specific solution.

while (true) {
    // Process some data
    if (exitCondition()) break;
}

The loop runs until exitCondition() is met, at which point it breaks out of the loop. This structure is useful for algorithms that require indefinite running until a specific event happens.

Input Size Consideration:

For Small Inputs: Generally unnecessary for small inputs unless the exit condition is based on dynamic factors.
For Large Inputs: Useful for large inputs when the exact number of iterations is unknown, and the loop depends on a condition that could be influenced by the data itself.

2.9.4. `do-while` Loop

The do-while loop is similar to the while loop, but it guarantees that the code block is executed at least once. This is useful when you need to run the loop at least one time regardless of the condition.

int i = 0;
do {
    std::cout << i << " ";
    i++;
} while (i < n);

In this case, the loop will print i at least once, even if i starts with a value that makes the condition false. This ensures that the loop runs at least one iteration.

Input Size Consideration:

For Small Inputs: Ideal when you need to guarantee that the loop runs at least once, such as with small datasets where the minimum iteration is essential.
For Large Inputs: Suitable for large datasets where the first iteration must occur independently of the condition.

2.9.5. `while` Loop with Early Exit

The while loop can be combined with early exit strategies using break or return statements to optimize performance, particularly when the loop can terminate before completing all iterations.

int i = 0;
while (i < n) {
    if (shouldExitEarly(i)) break;
    std::cout << i << " ";
    i++;
}

By including a condition inside the loop that checks for an early exit, you can significantly reduce runtime in cases where processing all elements is unnecessary.

Input Size Consideration:

For Small Inputs: It can improve performance when early termination conditions are common or likely.
For Large Inputs: Highly efficient for large datasets, particularly when the early exit condition is met frequently, saving unnecessary iterations.

2.9.6. Combining `while` with Multiple Conditions

A while loop can easily incorporate multiple conditions to create more complex termination criteria. This is particularly useful when multiple variables determine whether the loop should continue.

int i = 0;
while (i < n && someOtherCondition()) {
    std::cout << i << " ";
    i++;
}

This allows the loop to run based on multiple dynamic conditions, providing more control over the iteration process than a standard for loop might offer.

Input Size Consideration:

For Small Inputs: A flexible option when the conditions governing the loop may change during execution, even for small datasets.
For Large Inputs: Can be optimized for large datasets by ensuring that the condition checks are efficient and that unnecessary re-evaluations are minimized.

2.9.7. Optimal `while` Loops for Competitive Programming

Choosing the right type of while loop depends on the nature of the input and the complexity of the condition. The following table summarizes the optimal choices for different input sizes:

Input Size	Best `while` Loop Option	Reasoning
Small	Basic `while` Loop	Offers straightforward control over iteration with minimal overhead and is easy to implement.
Small	`do-while` Loop	Ensures at least one execution of the loop.
Large	`while` with Early Exit	Improves performance by terminating the loop early when a specific condition is met, saving unnecessary iterations.
Large	`while` with Complex Conditions	Allows dynamic and flexible exit conditions, making it suitable for large datasets with evolving parameters.
Continuous	Infinite `while` Loop with Explicit Breaks	Best for situations where the exact number of iterations is unknown and depends on external factors or dynamic conditions.

2.10 Special Loops in C++20

In C++20, several advanced looping techniques have been introduced, each offering unique ways to improve code efficiency and readability. While some of these techniques provide remarkable performance optimizations, not all are well-suited for competitive programming. competitive programmings often involve handling dynamic inputs and generating outputs within strict time limits, so techniques relying heavily on compile-time computation are less practical. This section focuses on the most useful loop structures for competitive programmings, emphasizing runtime efficiency and adaptability to varying input sizes.

2.10.1. Range-Based Loops with `std::ranges::views`

C++20 introduces ranges and views, which allow you to create expressive and efficient loops by operating on views of containers without copying data. Views are lazily evaluated, meaning that operations like filtering, transformation, or reversing are applied only when accessed.

Example:

#include <ranges>
#include <vector>
#include <iostream>

int main() {
    std::vector<int> vec = {1, 2, 3, 4, 5};

    // Using views to iterate in reverse
    for (auto elem : vec | std::views::reverse) {
        std::cout << elem << " ";
    }

    return 0;
}

Benefits:

Efficient and lazy evaluation ensures that operations like reversing or filtering are performed only when needed, rather than precomputing them or creating unnecessary copies of the data. This approach optimizes memory usage and speeds up execution, particularly when working with large datasets.

The syntax is also highly expressive and concise, allowing you to write clear and readable code. This is particularly useful when applying multiple transformations in sequence, as it helps maintain code simplicity while handling complex operations.

Considerations for competitive programmings:

Range views are particularly useful when working with large datasets, as they enable efficient processing by avoiding the creation of unnecessary copies and reducing memory overhead. This approach allows for smoother handling of extensive input data, improving overall performance.

Additionally, range views provide clarity and simplicity when dealing with complex operations. They streamline the process of transforming data, making it easier to apply multiple operations in a clean and readable manner, which is especially beneficial in competitive programming scenarios.

2.10.2. Parallel Loops with `std::for_each` and `std::execution::par`

C++20 enables parallelism in standard algorithms with std::execution. Using parallel execution policies, you can distribute loop iterations across multiple threads, which can drastically reduce the execution time for computationally expensive loops. This is especially useful when working with large datasets in competitive programming.

Example:

#include <execution>
#include <vector>

int main() {
    std::vector<int> vec(1000000, 1);

    std::for_each(std::execution::par, vec.begin(), vec.end(), [](int& elem) {
        elem *= 2;
    });

    return 0;
}

Benefits:

Parallel loops offer high performance, particularly when dealing with large input sizes that involve intensive computation. By utilizing multiple CPU cores, they significantly reduce execution time and handle heavy workloads more efficiently.

What makes this approach even more practical is that it requires minimal changes to existing code. The parallel execution is enabled simply by adding the execution policy std::execution::par, allowing traditional loops to run in parallel without requiring complex modifications.

Considerations for competitive programmings:

Parallel loops are highly effective for processing large datasets, making them ideal in competitive programming scenarios where massive inputs need to be handled efficiently. They can dramatically reduce execution time by distributing the workload across multiple threads.

However, they are less suitable for small inputs. In such cases, the overhead associated with managing threads may outweigh the performance gains, leading to slower execution compared to traditional loops.

2.11. `constexpr` Loops

With C++20, constexpr has been extended to allow more complex loops and logic at compile time. While this can lead to ultra-efficient code where calculations are precomputed during compilation, this technique has limited utility in competitive programming, where dynamic inputs are a central aspect of the problem. Since competitive programming requires handling varying inputs provided at runtime, constexpr loops are generally less useful in this context.

Example:

#include <array>
#include <iostream>

constexpr std::array<int, 5> generate_squares() {
    std::array<int, 5> arr{};
    for (int i = 0; i < 5; ++i) {
        arr[i] = i * i;
    }
    return arr;
}

int main() {
    constexpr auto arr = generate_squares();
    for (int i : arr) {
        std::cout << i << " ";  // 0 1 4 9 16
    }

    return 0;
}

Benefits:

Compile-time efficiency allows for faster runtime performance, as all necessary computations are completed during the compilation phase. This eliminates the need for processing during execution, leading to quicker program runs.

This approach is ideal for constant, static data. When all relevant data is known ahead of time, compile-time computation removes the need for runtime processing, providing a significant performance boost by bypassing real-time calculations.

2.11.1 Considerations for competitive programmings

While constexpr loops are not suitable for processing dynamic inputs directly, they can be strategically used to create lookup tables or pre-compute values that are then utilized during runtime calculations. This can be particularly useful in problems involving mathematical sequences, combinatorics, or other scenarios where certain calculations can be predetermined. However, it’s important to balance the use of pre-computed data with memory constraints, as large lookup tables might exceed memory limits in some competitive programming environments.

2.12. Early Exit Loops

In competitive programming, optimizing loops to exit early when a condition is met can drastically reduce execution time. This approach is especially useful when the solution does not require processing the entire input if an early condition is satisfied.

Example:

#include <vector>
#include <iostream>

int main() {
    std::vector<int> vec = {1, 2, 3, 4, 5};

    // Early exit if a condition is met
    for (int i = 0; i < vec.size(); ++i) {
        if (vec[i] == 3) break;
        std::cout << vec[i] << " ";
    }

    return 0;
}

Benefits:

Early exit loops improve efficiency by terminating as soon as a specified condition is met, thus avoiding unnecessary iterations. This approach helps save time, especially when the loop would otherwise continue without contributing to the result. This technique is particularly useful in search problems. By exiting the loop early when a target value is found, it can improve performance, reducing the overall execution time.

Early exit loops are highly practical, as they allow a solution to be reached without the need to examine all the data. By cutting down unnecessary iterations, they help reduce execution time, making them particularly useful in scenarios where a result can be determined quickly based on partial input.

2.13. Indexed Loops with Range-Based `for`

While C++ offers powerful range-based for loops, there are scenarios where accessing elements by index is essential, especially when the loop logic requires modifying the index or accessing adjacent elements. Range-based for loops cannot directly access the index, so indexed loops remain valuable for such cases.

Example:

#include <vector>
#include <iostream>

int main() {
    std::vector<int> vec = {1, 2, 3, 4, 5};

    for (size_t i = 0; i < vec.size(); ++i) {
        std::cout << vec[i] << " ";
    }

    return 0;
}

Benefits:

Indexed loops offer precise control by providing direct access to elements through their index, giving you full control over how the index changes during iteration. This level of control allow fine-tuning the behavior of the loop.

They are essential when modifying iteration behavior, especially in cases where you need to adjust the index dynamically. This is useful for tasks such as skipping elements or implementing non-linear iteration patterns, allowing for flexible loop management.

Indexed loops are well-suited for dynamic access, offering the flexibility required for more complex iteration logic. This makes them ideal for scenarios where direct control over the loop’s behavior is necessary.

However, they are less expressive compared to range-based loops. While they provide detailed control, they tend to be more verbose and less concise than the streamlined syntax offered by range-based alternatives.

2.14. Standard Library Algorithms

Using standard library algorithms like std::for_each and std::transform allows for highly optimized iteration and transformation of container elements. These algorithms are highly optimized, making them ideal for competitive programming scenarios.

Example:

#include <algorithm>
#include <vector>
#include <iostream>

int main() {
    std::vector<int> vec = {1, 2, 3, 4, 5};

    std::for_each(vec.begin(), vec.end(), [](int& x) { x *= 2; });

    for (const int& x : vec) {
        std::cout << x << " ";
    }

    return 0;
}

Benefits:

Standard library algorithms are highly optimized for performance, often surpassing the efficiency of manually written loops. Their internal optimizations make them a powerful tool for handling operations in a time-efficient manner.

Additionally, these functions are concise and clear, providing a clean and expressive syntax to apply operations on containers. This simplicity improve code readability while maintaining high performance, making them ideal for competitive programming.

Standard library algorithms are great for transformation tasks, allowing you to apply operations on container elements with minimal code. They maximize efficiency while keeping the implementation simple and concise, making them particularly effective for handling transformations in competitive programming scenarios.

2.15 Summary Table of Useful Loop Techniques

Technique	Best Use Case	Efficiency Considerations
`std::ranges::views`	Transforming or filtering large datasets	Lazily evaluated operations reduce memory overhead and improve runtime efficiency.
Parallel Loops with `std::execution::par`	Large computational tasks	Parallelism significantly improves performance for large, independent tasks.
Early Exit Loops	Search or conditional exit problems	Avoids unnecessary iterations, improving efficiency in scenarios with early exits.
Indexed Loops	Precise control over iteration	Offers flexibility and control for complex iteration logic or index manipulation.
Standard Library Algorithms	Applying transformations or actions	Well-optimized algorithms that simplify code and improve performance.

Techniques Not Recommended for competitive programmings:

Technique	Reasoning
`constexpr` Loops	Compile-time only, cannot handle dynamic input, thus impractical for runtime competitive programming problems.

Complete Series

Previous Next

References

MUSSER, D.R. (1997), Introspective Sorting and Selection Algorithms. Softw: Pract. Exper., 27: 983-993. https://doi.org/10.1002/(SICI)1097-024X(199708)27:8<983::AID-SPE117>3.0.CO;2- ↩
Sedgewick, R. (1998). Algorithms in C++ (4rd ed.). Addison-Wesley. ↩ ↩²

23 Oct 2024

« Competitive Programming in C++ Insights - Introduction Em Busca da Diagonalização de Cantor »

Name	Availability in C++	Purpose	Algorithm	Performance
`std::mt19937`	C++11 and later	General purpose	Mersenne Twister with period $2^{19937}-1$	Good overall performance, but requires 2.5 KB of state
`std::mt19937_64`	C++11 and later	General purpose (64-bit)	Mersenne Twister optimized for 64 bits	Similar to `mt19937`, with better performance on 64-bit systems
`std::xoshiro256**`	C++20	High performance	Xoshiro256** (xor/shift/rotate)	Excellent performance, small state (256 bits), period $2^{256}-1$
`std::ranlux24`	C++11 and later	High statistical quality	Subtract-with-borrow + discarding	Lower performance due to number discarding
`std::ranlux48`	C++11 and later	Cryptographic quality	Subtract-with-borrow + intensive discarding	Lower performance, higher statistical quality
`std::minstd_rand`	C++11 and later	Minimalistic	Linear Congruential Generator (LCG)	Very fast, but lower statistical quality
`std::default_random_engine`	C++11 and later	Implementation-specific	Depends on the compiler	Varies depending on implementation

Name	Availability in C++	Purpose	Description	Characteristics
`std::uniform_int_distribution`	C++11 and later	Uniform distribution	Generates integers uniformly within a range	Constant time performance, unbiased results
`std::uniform_real_distribution`	C++11 and later	Uniform distribution	Generates real numbers uniformly within a range	High precision, constant time performance
`std::normal_distribution`	C++11 and later	Gaussian distribution	Generates values with a normal distribution	Good for simulations, may have performance overhead
`std::bernoulli_distribution`	C++11 and later	Bernoulli trial	Generates true/false with a given probability	Fast, suitable for binary outcomes
`std::binomial_distribution`	C++11 and later	Binomial distribution	Number of successes in a sequence of trials	Effective for modeling discrete probability events
`std::geometric_distribution`	C++11 and later	Geometric distribution	Generates number of failures before success	Useful for waiting-time scenarios, fast computation
`std::poisson_distribution`	C++11 and later	Poisson distribution	Models number of events in fixed interval	Efficient for modeling rare events over time
`std::exponential_distribution`	C++11 and later	Exponential distribution	Time between events in Poisson process	Suitable for modeling interarrival times
`std::gamma_distribution`	C++11 and later	Gamma distribution	Models waiting times with shape and scale	Flexible, used in various statistical applications
`std::weibull_distribution`	C++11 and later	Weibull distribution	Models failure times in reliability analysis	Good for reliability modeling, shape-adjustable
`std::chi_squared_distribution`	C++11 and later	Chi-squared distribution	Statistical measure of goodness-of-fit	Common in hypothesis testing, moderate performance
`std::student_t_distribution`	C++11 and later	Student’s t-distribution	Models differences between sample means	Useful in statistical testing, heavy-tailed behavior
`std::cauchy_distribution`	C++11 and later	Cauchy distribution	Heavy-tailed distribution, undefined variance	Models outcomes with extreme values
`std::lognormal_distribution`	C++11 and later	Log-normal distribution	Models growth rates of processes	Skewed distribution, good for financial modeling
`std::discrete_distribution`	C++11 and later	Discrete distribution	Generates integers based on a set of probabilities	Useful for weighted random selection
`std::piecewise_constant_distribution`	C++11 and later	Piecewise constant distribution	Generates values with constant probability density within subranges	Good for custom-defined distributions
`std::piecewise_linear_distribution`	C++11 and later	Piecewise linear distribution	Generates values with linear probability density within subranges	Useful for modeling linear trends

Property	Description
Average-case time complexity	$O(n \log n)$
Worst-case time complexity	$O(n \log n)$, due to the switch to HeapSort when necessary
Space complexity	$O(\log n)$ for the recursion stack
Stable vs. unstable sorting	Introsort is an unstable sort, meaning that equal elements might not retain their original order after sorting
In-place sorting	Introsort is an in-place sort, meaning it requires a constant amount of additional memory beyond the input array

Frank de Alcantara

C++ Here the Journey Begins

2.4. Data Manipulation in C++20

2.4.1. Working with Vectors

Vectors as Inputs and Outputs

Optimized Version Using fread and putchar with Command-Line File Input

2.4.2. Using Matrices

Creating and Filling a Matrix

Displaying the Matrix

Inserting Elements at a Specific Position

Removing the Last Element and a Specific Element

Creating a New Matrix with a Default Value

Resizing and Filling with Random Values

Sorting Matrices by Rows and Columns

2.5.7.1 Sorting by Rows

2.5.7.2 Sorting by Columns

Optimizing Matrix Input and Output in Competitive Programming

Optimized Reading with fread

Optimized Output with putchar_unlocked

Using mmap on Unix Systems for Matrices

2.5 Fundamental Tool: Sorting

2.5.1 Stable Sorting of Vectors

2.4.3. Using Span and Ranges

Using std::span

Efficient Use Cases for std::span

Comparing std::span to Traditional Methods

Practical Application: Using std::span in Competitive Programming

2.4.4. Data Manipulation with std::ranges

2.8. Understand for Loops

2.8.1 for Loop with Iterator

2.8.2. Classic for Loop with Index

2.8.3. Range-Based for-each with Constant Reference

2.8.4. Range-Based for-each by Value

2.8.5. for Loop with Range Views (C++20)

2.8.6. Parallel for Loop (C++17/C++20)

2.8.7. Optimal for Loops for Competitive Programming

2.9 Now the while Loop which we all love

2.9.1. Basic while Loop

2.9.2. while Loop with Complex Conditions

2.9.3. Infinite while Loops

2.9.4. do-while Loop

2.9.5. while Loop with Early Exit

2.9.6. Combining while with Multiple Conditions

2.9.7. Optimal while Loops for Competitive Programming

2.10 Special Loops in C++20

2.10.1. Range-Based Loops with std::ranges::views

2.10.2. Parallel Loops with std::for_each and std::execution::par

2.11. constexpr Loops

2.11.1 Considerations for competitive programmings

2.12. Early Exit Loops

2.13. Indexed Loops with Range-Based for

2.14. Standard Library Algorithms

2.15 Summary Table of Useful Loop Techniques

Complete Series

References

Explore →

Optimized Version Using `fread` and `putchar` with Command-Line File Input

Optimized Reading with `fread`

Optimized Output with `putchar_unlocked`

Using `mmap` on Unix Systems for Matrices

Using `std::span`

Efficient Use Cases for `std::span`

Comparing `std::span` to Traditional Methods

Practical Application: Using `std::span` in Competitive Programming

2.4.4. Data Manipulation with `std::ranges`

2.8. Understand `for` Loops

2.8.1 `for` Loop with Iterator

2.8.2. Classic `for` Loop with Index

2.8.3. Range-Based `for-each` with Constant Reference

2.8.4. Range-Based `for-each` by Value

2.8.5. `for` Loop with Range Views (C++20)

2.8.6. Parallel `for` Loop (C++17/C++20)

2.8.7. Optimal `for` Loops for Competitive Programming

2.9 Now the `while` Loop which we all love

2.9.1. Basic `while` Loop

2.9.2. `while` Loop with Complex Conditions

2.9.3. Infinite `while` Loops

2.9.4. `do-while` Loop

2.9.5. `while` Loop with Early Exit

2.9.6. Combining `while` with Multiple Conditions

2.9.7. Optimal `while` Loops for Competitive Programming

2.10.1. Range-Based Loops with `std::ranges::views`

2.10.2. Parallel Loops with `std::for_each` and `std::execution::par`

2.11. `constexpr` Loops

2.13. Indexed Loops with Range-Based `for`