Wednesday, May 1, 2013

Transform, Won't do the Obligatory Catchphrase

/**
 * All about std::transform. I've never really used it because of it's strange
 * form. The idea is that you can apply a functor to a container and have it
 * write to a container via an iterator position [1]. When I started writing
 * this entry I thought it would take only a few paragraphs; the more I wrote
 * and experimented the more excited I got about the possibilities this
 * algorithm presents. Hope you enjoy this post because I enjoyed writing it!
 */

// This is a requirement to use the "transform" function.
#include <algorithm>

// This is required to use "back_inserter", which is part of the third
// attempt.
#include <iterator>

// Obvious the rest of this is just required for containers and output.
#include <iostream>
#include <set>
#include <vector>


/**
 * This is from the notorious site www.cplusplus.com [1]. If you ever work
 * with C++, this website should be infamous. This function simply takes
 * an integer by value and increments it by 1 and returns the value.
 */
int op_increase(int i)
{
    return ++i;
}


/**
 * Putting this functionality in to illustrate that if we are transforming
 * directly into a second container via its begin iterator, we need to
 * allocate the memory for that container BEFORE transforming. If this
 * macro is undefined (just imagine the code didn't exist below), then
 * you will get a Segmentation Fault.
 */
#define SECOND_ATTEMPT


/**
 * When commented in this will attempt another way of performing insertion
 * via a back_inserter. This allows us to perform a transform operation
 * without pre-allocating the size of the destination container.
 */
#define THIRD_ATTEMPT


/**
 * Here is a possible good use of transforming a bunch of elements in a
 * vector and automatically inserting them into a set. It's a neat way
 * to convert a vector to a set (just think of using identity).
 */
#define FOURTH_ATTEMPT


/**
 * The final example I have illustrates why transform needs to have its
 * memory pre-allocated. The idea is that transform doesn't have to target
 * a new container; it could target itself! [5]
 */
#define FIFTH_ATTEMPT


/**
 * The main entry point to the program.
 */
int main(int, char** )
{
    std::vector<int> input;

    // Fill a vector with a range of 1 to 10.
    for(int i=0; i<10; ++i)
    {
        input.push_back(i);
    }

    std::vector<int> destination;

    // Like I said above, this is the fix to a Segmentation Fault
    // where the memory for the destination container hasn't been
    // allocated!
    #if defined SECOND_ATTEMPT
    destination.resize(input.size());
    #endif

    // For each item in the iterator range we will actually execute
    // op_increase and then take the result from that and place it
    // into the destination container.
    std::transform( input.begin(), input.end(), destination.begin(), op_increase);

    // Print out results from the second attempt.
    std::cout << "Second Attempt: ";
    for(std::vector<int>::iterator iter=destination.begin(); iter != destination.end(); ++iter)
    {
        std::cout << *iter << ' ';
    }
    std::cout << std::endl;

    // This is all well and good, but what if for some reason you
    // don't want to pre-allocate the size of your destination container.
    #if defined THIRD_ATTEMPT

    // Reset the example by clearing out the destination container.
    destination.clear();

    // Perform the transformation using a back inserter. Back inserter's are really cool
    // because they return an iterator that can insert elements at the end of
    // a container! [2]
    std::transform( input.begin(), input.end(), std::back_inserter(destination), op_increase );

    std::cout << "Third Attempt: ";
    for(std::vector<int>::iterator iter=destination.begin(); iter != destination.end(); ++iter)
    {
        std::cout << *iter << ' ';
    }
    std::cout << std::endl;

    // There may be valid reasons to do this (although I can't
    // think of any). The advantage of pre-allocating space for your
    // container is that the allocation is only made once- if you
    // did multiple insertions via a back_inserter it will end up
    // allocating space multiple times. For a vector this can be
    // a substantial performance hit!

    #endif

    // What if we wanted to insert into a set instead. We can't pre-
    // allocate the memory for a set because it is typically implemented
    // using a balanced tree so allocation is done on the fly. A back
    // inserter cannot be used because back inserter requires a
    // container has a "push_back" method.
    #if defined FOURTH_ATTEMPT
   
    std::set<int> destinationSet;
   
    // Okay, so looking at the documentation for this I was at first
    // puzzled.
    std::transform( input.begin(), input.end(), std::inserter(destinationSet, destinationSet.begin()), op_increase );

    // What is going on here? destinationSet.begin() isn't always the best place
    // to insert (obviously, especially if this is a set!). Here is what is
    // going on:
    //
    // 1) std::inserter(destinationSet, destinationSet.begin()) returns a
    //    std::insert_iterator object.
    //
    // 2) The std::insert_iterator object when constructed stores the iterator
    //    passed in.
    //
    // 3) When std::insert_iterator::operator= is invoked, we end up invoking
    //    std::set::insert( iterator, value ) where the iterator is the
    //    iterator that was already stored. [3]
    //
    // 4) This is the part I was missing, there is a version of std::set::insert
    //    that does indeed take an iterator! I forgot that this form allows developers
    //    to specify a "hint" as to where to insert. [4]

    std::cout << "Fourth Attempt: ";
    for(std::vector<int>::iterator iter=destination.begin(); iter != destination.end(); ++iter)
    {
        std::cout << *iter << ' ';
    }
    std::cout << std::endl;

    #endif


    #if defined FIFTH_ATTEMPT
   
    std::cout << "Fifth Attempt (BEFORE): ";
    for(std::vector<int>::iterator iter=input.begin(); iter != input.end(); ++iter)
    {
        std::cout << *iter << ' ';
    }
    std::cout << std::endl;

    // Apply the transform directly on the input container! This is so frigging
    // cool because you can imagine all of the uses this could have.
    std::transform(input.begin(), input.end(), input.begin(), op_increase);

    std::cout << "Fifth Attempt (AFTER): ";
    for(std::vector<int>::iterator iter=input.begin(); iter != input.end(); ++iter)
    {
        std::cout << *iter << ' ';
    }
    std::cout << std::endl;



    #endif


    return 0;
}

/**

REFERENCES

[1] - http://www.cplusplus.com/reference/algorithm/transform

[2] - http://www.cplusplus.com/reference/iterator/back_inserter

[3] - http://www.cplusplus.com/reference/iterator/insert_iterator

[4] - http://www.cplusplus.com/reference/set/set

[5] - Stroustrop, Bjarne; The C++ Programming Language, January 2006 Page 531

*/

No comments:

Post a Comment