Friday, April 11, 2014

Boost Serialization (Part 2) : Handle/Pimpl Idiom

Introduction

In my first post on Boost Serialization [1] I walked through a simple example of how to serialize shared pointers with base classes. This post focuses on how to use Boost Serialization with the pimpl pattern (also called the handle idiom or pimpl idiom). It repeats some points from the previous post and adds a few new ones along the way.

Before Serialization is Added

Let's say you have a class called "Object"; Object is a base class that has a private implementation. Right now it doesn't have boost serialization implemented. Let's say it looks like this:

/**
 * object.h
 */
#ifndef OBJECT_H
#define OBJECT_H

#include <string>
#include <boost/shared_ptr.hpp>

struct ObjectPrivate;
class Object
{
    public:
        Object();
        virtual ~Object() { }

        std::string const& getId() const;
        void setId(std::string const& id);

    private:
        boost::shared_ptr<ObjectPrivate> pImpl;
};

#endif



/**
 * object.cpp
 */
#include "object.h"

struct ObjectPrivate
{
    std::string id;
};

Object::Object() : pImpl(new ObjectPrivate()) { }
std::string const& Object::getId() const { return pImpl->id; }
void Object::setId(std::string const& id) { pImpl->id = id; }



Very simple, but not a contrived case: libraries often set up all of their classes this way because the pattern lends itself nicely to binary compatibility, lazy evaluation, and fast swapping.
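
To make the fast-swapping point concrete, here is a minimal sketch. Widget and WidgetPrivate are hypothetical names, and std::shared_ptr (C++11) stands in for boost::shared_ptr so the example has no Boost dependency. For brevity the private struct is defined in the same file rather than hidden in a cpp file.

```cpp
#include <memory>
#include <string>

// Hypothetical private implementation; normally this lives in the cpp file.
struct WidgetPrivate
{
    std::string id;
};

// Hypothetical pimpl class; std::shared_ptr stands in for boost::shared_ptr.
class Widget
{
    public:
        explicit Widget(std::string const& id)
            : pImpl(std::make_shared<WidgetPrivate>())
        {
            pImpl->id = id;
        }

        std::string const& getId() const { return pImpl->id; }

        // Swapping exchanges the two pointers; no member data is copied.
        void swap(Widget& other) { pImpl.swap(other.pImpl); }

    private:
        std::shared_ptr<WidgetPrivate> pImpl;
};
```

However many members WidgetPrivate grows to hold, swap still only exchanges the two pointers.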

Adding Serialization

Here comes the tough part: how do you serialize something when you don't know what it is? Let's say we add the serialize method to the base class:

/**
 * object.h
 */
#ifndef OBJECT_H
#define OBJECT_H

#include <string>
#include <boost/shared_ptr.hpp>

struct ObjectPrivate;
class Object
{
    public:
        Object();
        virtual ~Object() { }

        std::string const& getId() const;
        void setId(std::string const& id);

        template <typename Archive>
        void serialize(Archive& ar, unsigned int const version)
        {
            // Fails when instantiated: ObjectPrivate is incomplete in the header.
            ar & pImpl;
        }

    private:
        boost::shared_ptr<ObjectPrivate> pImpl;
};

#endif

This won't work: the compiler knows nothing about ObjectPrivate at this point, because its definition is held in the cpp file. (The serialization of a shared_ptr is already handled by a separate header, so I'm ignoring that for now.) So how do we make this work? The key to the fix is separating the declaration of serialize from its definition: declare it in the header and define it in the cpp file. The result looks like this:

/**
 * object.h
 */
#ifndef OBJECT_H
#define OBJECT_H

#include <string>
#include <boost/shared_ptr.hpp>

struct ObjectPrivate;
class Object
{
    public:
        Object();
        virtual ~Object() { }

        std::string const& getId() const;
        void setId(std::string const& id);

        template <typename Archive>
        void serialize(Archive& ar, unsigned int const version);

    private:
        boost::shared_ptr<ObjectPrivate> pImpl;
};

#endif



/**
 * object.cpp
 */
#include "object.h"

struct ObjectPrivate
{
    std::string id;
};

Object::Object() : pImpl(new ObjectPrivate()) { }

template <typename Archive>
void Object::serialize(Archive& ar, unsigned int const version)
{
    ar & pImpl;
}

std::string const& Object::getId() const { return pImpl->id; }
void Object::setId(std::string const& id) { pImpl->id = id; }


This is how we implement the body of the serialize method. This is not the end of the story, though; sadly, this will produce a variety of errors:

Error #1: Object::serialize is not defined for whatever archive you are using.

This is a semi-obvious linker error. Remember that the serialize template is declared in the header while its definition lives in the cpp file. Since the cpp file is compiled separately from whatever code calls Object::serialize, the compiler does not know which template arguments to instantiate it with.

Put another way: say you have object.cpp and main.cpp, and main.cpp uses Object::serialize<boost::archive::text_oarchive>. Note that main.cpp is the only translation unit that knows boost::archive::text_oarchive is used.

#include "object.h"

#include <sstream>
#include <boost/archive/text_oarchive.hpp>

int main(int argc, char** argv)
{
    Object o1;
    std::stringstream out;
    boost::archive::text_oarchive oa(out);

    // The following line will invoke Object::serialize<T> where
    // T = boost::archive::text_oarchive.
    oa << o1;

    return 0;
}

When the compiler goes to build object.cpp, it knows absolutely NOTHING about boost::archive::text_oarchive! Therefore the body never gets compiled. This results in the linker error stating that Object::serialize(boost::archive::text_oarchive& ar, unsigned int const) could not be found.

Courtesy of the Boost documentation [2], the fix is to explicitly instantiate the template inside of object.cpp. The instantiations go after the definition of Object::serialize so the compiler has a body to instantiate:

/**
 * object.cpp
 */
#include "object.h"

#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>

struct ObjectPrivate
{
    std::string id;
};

Object::Object() : pImpl(new ObjectPrivate()) { }

template <typename Archive>
void Object::serialize(Archive& ar, unsigned int const version)
{
    ar & pImpl;
}

// Explicit instantiations, one per archive type that will be used.
template void Object::serialize<boost::archive::text_iarchive>(
      boost::archive::text_iarchive& ar
    , unsigned int const version );

template void Object::serialize<boost::archive::text_oarchive>(
      boost::archive::text_oarchive& ar
    , unsigned int const version );

std::string const& Object::getId() const { return pImpl->id; }
void Object::setId(std::string const& id) { pImpl->id = id; }

Manually instantiating the templates solves the problem at a price: the developer must know exactly what types of archives will be used to serialize beforehand. There is probably a more elegant solution but as the documentation states: this is the price you pay for using templates.
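
The declare, define, then explicitly instantiate pattern can be tried without Boost. The following is a minimal single-file sketch, not Boost's API: ToyOArchive, ToyIArchive, and Record are hypothetical names invented for illustration, and the toy archives only handle std::string. The comments mark which parts would live in the header and which in the cpp file.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for an output archive: writes one string per line.
class ToyOArchive
{
    public:
        explicit ToyOArchive(std::ostream& os) : out(os) { }
        ToyOArchive& operator&(std::string const& value)
        {
            out << value << '\n';
            return *this;
        }
    private:
        std::ostream& out;
};

// Hypothetical stand-in for an input archive: reads one string per line.
class ToyIArchive
{
    public:
        explicit ToyIArchive(std::istream& is) : in(is) { }
        ToyIArchive& operator&(std::string& value)
        {
            std::getline(in, value);
            return *this;
        }
    private:
        std::istream& in;
};

// ---- what would live in the header ----
class Record
{
    public:
        std::string name;

        // Declaration only; callers never see the body.
        template <typename Archive>
        void serialize(Archive& ar, unsigned int const version);
};

// ---- what would live in the cpp file ----
template <typename Archive>
void Record::serialize(Archive& ar, unsigned int const /*version*/)
{
    ar & name;
}

// Explicit instantiations, placed after the definition above.
template void Record::serialize<ToyOArchive>(ToyOArchive&, unsigned int const);
template void Record::serialize<ToyIArchive>(ToyIArchive&, unsigned int const);
```

Calling serialize with a ToyOArchive over a std::stringstream and then with a ToyIArchive over the same stream round-trips the name. With the two parts split across real files, any archive type missing from the instantiation list would bring back the linker error described above.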

Error #2: Compiler doesn't know how to serialize shared_ptr.

The telltale sign of this problem is a compiler error like:

class boost::shared_ptr<XXX> has no member named serialize

This means that boost serialization doesn't know what to do with a shared_ptr. Add the appropriate serialization header to your cpp file:

/**
 * object.cpp
 */
#include "object.h"

#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/shared_ptr.hpp>

struct ObjectPrivate
{
    std::string id;
};

Object::Object() : pImpl(new ObjectPrivate()) { }

template <typename Archive>
void Object::serialize(Archive& ar, unsigned int const version)
{
    ar & pImpl;
}

// Explicit instantiations, placed after the definition of serialize.
template void Object::serialize<boost::archive::text_iarchive>(
      boost::archive::text_iarchive& ar
    , unsigned int const version );

template void Object::serialize<boost::archive::text_oarchive>(
      boost::archive::text_oarchive& ar
    , unsigned int const version );

std::string const& Object::getId() const { return pImpl->id; }
void Object::setId(std::string const& id) { pImpl->id = id; }

The same fix works for STL containers: include the matching serialization header (for example, boost/serialization/vector.hpp for std::vector).

Error #3: Private implementation doesn't have a serialize method.

Another easy one: add a Boost serialize method to the private implementation:

/**
 * object.cpp
 */
#include "object.h"

#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/shared_ptr.hpp>

struct ObjectPrivate
{
    std::string id;

    template <typename Archive>
    void serialize(Archive& ar, unsigned int const version)
    {
        ar & id;
    }
};

Object::Object() : pImpl(new ObjectPrivate()) { }

template <typename Archive>
void Object::serialize(Archive& ar, unsigned int const version)
{
    ar & pImpl;
}

// Explicit instantiations, placed after the definition of serialize.
template void Object::serialize<boost::archive::text_iarchive>(
      boost::archive::text_iarchive& ar
    , unsigned int const version );

template void Object::serialize<boost::archive::text_oarchive>(
      boost::archive::text_oarchive& ar
    , unsigned int const version );

std::string const& Object::getId() const { return pImpl->id; }
void Object::setId(std::string const& id) { pImpl->id = id; }

In the next blog post on Boost Serialization I'll review how to make this base class perform polymorphic serialization.


[1] - http://lewismanor.blogspot.com/2013/04/boost-serialization-part-1.html

[2] - http://www.boost.org/doc/libs/1_55_0/libs/serialization/doc/pimpl.html