Lewis Manor: 2014

Friday, April 11, 2014

Boost Serialization (Part 2) : Handle/Pimpl Idiom

Introduction

In my first attempt at explaining Boost Serialization I outlined a simple example showing how to serialize shared pointers with base classes [1]. This blog post focuses on how to use Boost serialization with the pimpl pattern (also known as the handle idiom or pimpl idiom). It re-emphasizes parts of the previous post and adds some new things along the way.

Before Serialization is Added

Let's say you have a class called "Object"; Object is a base class that has a private implementation. Right now it doesn't have boost serialization implemented. Let's say it looks like this:

/**
* object.h
*/
#ifndef OBJECT_H
#define OBJECT_H

#include <string>

#include <boost/shared_ptr.hpp>

struct ObjectPrivate;
class Object
{
    public:
        Object();
        virtual ~Object() { }

        std::string const& getId() const;
        void setId(std::string const& id);

    private:
        boost::shared_ptr<ObjectPrivate> pImpl;
};

#endif

/**
* object.cpp
*/
#include "object.h"

struct ObjectPrivate
{
    std::string id;
};

Object::Object() : pImpl(new ObjectPrivate()) { }
std::string const& Object::getId() const { return pImpl->id; }
void Object::setId(std::string const& id) { pImpl->id = id; }


Very simple. This isn't a contrived case either; libraries often set up every public class this way because it lends itself nicely to binary compatibility, lazy evaluation, and fast swapping.
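
The fast-swapping claim is easy to see in code. Here is a minimal sketch, assuming C++11 std::shared_ptr in place of boost::shared_ptr so that it compiles without Boost. The swap member is a hypothetical addition for illustration; it is not part of the Object class above.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for the private implementation that normally lives in object.cpp.
struct ObjectPrivate
{
    std::string id;
};

class Object
{
    public:
        Object() : pImpl(std::make_shared<ObjectPrivate>()) { }

        std::string const& getId() const { return pImpl->id; }
        void setId(std::string const& id) { pImpl->id = id; }

        // Hypothetical addition: exchanges only the two handles, so the
        // cost does not depend on how much data ObjectPrivate holds.
        void swap(Object& other) { pImpl.swap(other.pImpl); }

    private:
        std::shared_ptr<ObjectPrivate> pImpl;
};
```

Swapping two Objects this way never copies the id string; only the pointers change hands.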

Adding Serialization

Here comes the tough part: how do you serialize something when you don't know what it is? Let's say we add the serialization method to the base class:

/**
* object.h
*/
#ifndef OBJECT_H
#define OBJECT_H

#include <string>

#include <boost/shared_ptr.hpp>

struct ObjectPrivate;
class Object
{
    public:
        Object();
        virtual ~Object() { }

        std::string const& getId() const;
        void setId(std::string const& id);

        template <typename Archive>
        void serialize(Archive& ar, unsigned int const version)
        {
            ar & pImpl;
        }

    private:
        boost::shared_ptr<ObjectPrivate> pImpl;
};

#endif

This obviously won't work: at this point the compiler doesn't know anything about ObjectPrivate. The serialization of a shared_ptr is already handled by a separate header, so I'm ignoring that for now. The ObjectPrivate implementation is held in the cpp file, so how do we make this work? The key to the fix is separating the declaration of serialize from its definition. The result looks like this:

/**
* object.h
*/
#ifndef OBJECT_H
#define OBJECT_H

#include <string>

#include <boost/shared_ptr.hpp>

struct ObjectPrivate;
class Object
{
    public:
        Object();
        virtual ~Object() { }

        std::string const& getId() const;
        void setId(std::string const& id);

        template <typename Archive>
        void serialize(Archive& ar, unsigned int const version);

    private:
        boost::shared_ptr<ObjectPrivate> pImpl;
};

#endif

/**
* object.cpp
*/
#include "object.h"

struct ObjectPrivate
{
    std::string id;
};

Object::Object() : pImpl(new ObjectPrivate()) { }

template <typename Archive>
void Object::serialize(Archive& ar, unsigned int const version)
{
    ar & pImpl;
}

std::string const& Object::getId() const { return pImpl->id; }
void Object::setId(std::string const& id) { pImpl->id = id; }

This is how we implement the body of the serialize method. It is not the end of the story though; sadly, this will report a variety of different errors:

Error #1: Object::serialize is not defined for whatever archive you are using.

This is a semi-obvious linker error. Remember that the member template is declared in the header while its body is defined in the cpp file. Since the cpp file is compiled separately from whatever code uses Object::serialize, the compiler does not know which template arguments to instantiate the body for.

Put another way: say you have object.cpp and main.cpp; main.cpp uses object.h's Object::serialize<boost::archive::text_oarchive> method. Note that main.cpp is the only file that knows boost::archive::text_oarchive is used.

#include "object.h"

#include <sstream>

#include <boost/archive/text_oarchive.hpp>

int main(int argc, char** argv)
{
    Object o1;
    std::stringstream out;
    boost::archive::text_oarchive oa(out);

    // The following line will invoke Object::serialize<T> where
    // T = boost::archive::text_oarchive.
    oa << o1;

}

When the compiler goes to build object.cpp, it knows absolutely NOTHING about boost::archive::text_oarchive! Therefore the body never gets compiled. This results in the linker error stating that Object::serialize(boost::archive::text_oarchive& ar, unsigned int const) could not be found.

Courtesy of the boost documentation [2], manually instantiate the templates inside of object.cpp:

/**
* object.cpp
*/
#include "object.h"

#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>

template void Object::serialize<boost::archive::text_iarchive>(
    boost::archive::text_iarchive& ar
, unsigned int const version );

template void Object::serialize<boost::archive::text_oarchive>(
    boost::archive::text_oarchive& ar
, unsigned int const version );

struct ObjectPrivate
{
    std::string id;
};

Object::Object() : pImpl(new ObjectPrivate()) { }

template <typename Archive>
void Object::serialize(Archive& ar, unsigned int const version)
{
    ar & pImpl;
}

std::string const& Object::getId() const { return pImpl->id; }
void Object::setId(std::string const& id) { pImpl->id = id; }

Manually instantiating the templates solves the problem at a price: the developer must know exactly what types of archives will be used to serialize beforehand. There is probably a more elegant solution but as the documentation states: this is the price you pay for using templates.
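
One way to keep the list of instantiations manageable is a small macro that emits one explicit instantiation per archive type. What follows is only a sketch under stated assumptions: ToyOArchive and ToyIArchive are hypothetical stand-ins for the Boost archives so the example compiles on its own, LM_INSTANTIATE_SERIALIZE is a name invented here, std::shared_ptr replaces boost::shared_ptr, and serialize writes the id directly rather than the pointer.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>

// Toy archives standing in for boost::archive::text_oarchive / text_iarchive.
struct ToyOArchive
{
    std::ostringstream out;
    ToyOArchive& operator&(std::string const& s) { out << s << '\n'; return *this; }
};

struct ToyIArchive
{
    std::istringstream in;
    explicit ToyIArchive(std::string const& data) : in(data) { }
    ToyIArchive& operator&(std::string& s) { std::getline(in, s); return *this; }
};

// --- what would live in object.h ---
struct ObjectPrivate;
class Object
{
    public:
        Object();
        std::string const& getId() const;
        void setId(std::string const& id);

        template <typename Archive>
        void serialize(Archive& ar, unsigned int const version);

    private:
        std::shared_ptr<ObjectPrivate> pImpl;
};

// --- what would live in object.cpp ---
struct ObjectPrivate
{
    std::string id;
};

Object::Object() : pImpl(std::make_shared<ObjectPrivate>()) { }
std::string const& Object::getId() const { return pImpl->id; }
void Object::setId(std::string const& id) { pImpl->id = id; }

template <typename Archive>
void Object::serialize(Archive& ar, unsigned int const /*version*/)
{
    ar & pImpl->id;
}

// One line per archive type instead of a hand-written instantiation each time.
#define LM_INSTANTIATE_SERIALIZE(Class, Archive) \
    template void Class::serialize<Archive>(Archive&, unsigned int const)

LM_INSTANTIATE_SERIALIZE(Object, ToyOArchive);
LM_INSTANTIATE_SERIALIZE(Object, ToyIArchive);
```

The macro does not remove the underlying limitation: every archive type still has to be named in object.cpp. It only shortens each entry to one line.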

Error #2: Compiler Doesn't know how to serialize shared_ptr.

The telltale sign that the compiler is complaining about not knowing how to serialize shared_ptr is if you see:

class boost::shared_ptr<XXX> has no member named serialize

This means that boost serialization doesn't know what to do with a shared_ptr. Add the appropriate serialization header to your cpp file:

/**
* object.cpp
*/
#include "object.h"

#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/shared_ptr.hpp>

template void Object::serialize<boost::archive::text_iarchive>(
    boost::archive::text_iarchive& ar
, unsigned int const version );

template void Object::serialize<boost::archive::text_oarchive>(
    boost::archive::text_oarchive& ar
, unsigned int const version );

struct ObjectPrivate
{
    std::string id;
};

Object::Object() : pImpl(new ObjectPrivate()) { }

template <typename Archive>
void Object::serialize(Archive& ar, unsigned int const version)
{
    ar & pImpl;
}

std::string const& Object::getId() const { return pImpl->id; }
void Object::setId(std::string const& id) { pImpl->id = id; }

The same fix works for STL containers: just include the appropriate serialization header (for example, boost/serialization/vector.hpp for std::vector).

Error #3: Private Implementation doesn't have serialize method

Another easy one: just add a Boost serialization method to the private implementation:

/**
* object.cpp
*/
#include "object.h"

#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/shared_ptr.hpp>

template void Object::serialize<boost::archive::text_iarchive>(
    boost::archive::text_iarchive& ar
, unsigned int const version );

template void Object::serialize<boost::archive::text_oarchive>(
    boost::archive::text_oarchive& ar
, unsigned int const version );

struct ObjectPrivate
{
    std::string id;

    template <typename Archive>
    void serialize(Archive& ar, unsigned int const version)
    {
        ar & id;
    }
};

Object::Object() : pImpl(new ObjectPrivate()) { }

template <typename Archive>
void Object::serialize(Archive& ar, unsigned int const version)
{
    ar & pImpl;
}

std::string const& Object::getId() const { return pImpl->id; }
void Object::setId(std::string const& id) { pImpl->id = id; }

In the next blog post on Boost serialization I'll review how to make this base class perform polymorphic serialization.

[1] - http://lewismanor.blogspot.com/2013/04/boost-serialization-part-1.html

[2] - http://www.boost.org/doc/libs/1_55_0/libs/serialization/doc/pimpl.html

Friday, March 7, 2014

Simple Public-Private Key Erlang Tutorial

Introduction

This tutorial assumes you have no background in cryptography and very little background in Erlang. Its purpose is to give a simple example of how to encode/decode simple strings inside of Erlang. Up front, this is the source material I am drawing from:

http://erlang.org/doc/man/crypto.html
http://www.erlang.org/doc/apps/public_key/using_public_key.html
http://www.openssl.org/docs/HOWTO/keys.txt
http://stackoverflow.com/questions/4294689/how-to-generate-a-key-with-passphrase-from-the-command-line

Creating the Keys

I'm creating this using Linux, in a terminal window:

> openssl genrsa -des3 -out private_key.pem 2048

This will generate a key file. It will prompt you for a password (unless you omit -des3). To verify that it worked, print the file:

> cat private_key.pem

-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3-CBC,2FFDE296A2F2FE08

Kjw01WesO4BUfs4GM31LH/DCTzlyaulsRwmGaPdoc6kyaB8TC2KGa4BxEYyKxDin
UBB2Oo/qKX4uccGQwiAfK1IY3UFA7m4HbD2RjpHOY5nru/dBhBvYisnepFNEahbg
L1vyfp8UdH4DO9el1utCBznSnCJuwGK1u9bBCZoWyaFxv09Zidc1WxtGXfZjyg/w
Z0R5PEOTrYl6vHsstFeEqDtBRAzZGjmBdgTzOYKo9VAFsHqgdGImdcf5GsXIhXl1
C55UjjLYs88/qa0snMG2qR74vZDNboFcRi8IgXwNJjx2os4GV9q2gWjLjhAcudZ2
hfU1XIo74F4tg7qGmogqkV1FRNJASnGPdITqs9Q+Ll19Qy8a8zOS1bZURZyM7gwd
nWGQU0NAYb0v0Q8VsbzIUjFba00xdRD8wxU0vE+uNHJa9foT8fhCEGwavbvbuktZ
+1JY8hqvELaWNBdSbibhe0gNGO/n2bS/oIVeGGyPzbl9ahXk+rwllFfk0tGaZIm0
S385RyqnLwS/Dy0xmBvC/88bsulqCpePrsEzkD08nVlAlxj+IltEx9447gb/0q+8
n5xX6Azk+EHMkqTfPHjUD6vu1HWqCmPMC5+VqtTPF0T9bIWnqGNjbINL9B4GbV31
YGWXM5oGknme/57Vdg2F2vmJPAhLs48oLKvnUEPl5z+W1HTWdLSUvuw2UYcTCLwx
c3U6qnqRGk6XqVQy+VYtndnWUZCQzhmTbP05GCiFyhFpAch0jGI2dIyaT+con107
rskaLkBfzaUBD2aZrpUQpMapKVe3JrA4kyMOm6jY2VOnloUrxUlnIKHtIPPCjLoC
EYVY2djVgcdMGWAvfE6YGQ6KrQfwa7JkT2lg+yJKYoA54U2yDMN5z6+5lK7LwO8S
rV0XbQcqAuo4dN10gGi0poUgK32a5GxiOAy4LJ4bcg87iFkJQhgiopEgmq7gU1F8
-----END RSA PRIVATE KEY-----

When you create an RSA private key this way, the file also contains the public key. The public key should be in its own file too:

> openssl rsa -in private_key.pem -pubout > public_key.pem

You'll be prompted for the password to the private key.

Basic Encoding

Open up your Erlang shell now and do the following (I'm not going to pretend I know what is going on):

> {ok, PemBin } = file:read_file("private_key.pem").
> [ RSAEntry ] = public_key:pem_decode(PemBin).
> PrivateKey = public_key:pem_entry_decode( RSAEntry, "yourpassword").

These were three of the most difficult lines to figure out from their documentation. It's the simplest way to obtain the RSA Private Key without fuss. Now that we have the key, you can perform an encryption operation on some plain text:

> Encrypted = public_key:encrypt_private( <<"Hello World!">>, PrivateKey ).

This returns the encrypted data, now stored in Encrypted. Please note: the first argument to encrypt_private MUST BE A BINARY! It cannot be a string; passing one gives you an error that is not helpful!

Basic Decoding

To decode this message you'll once again need to load a key; this time the public key. That's the point of asymmetric cryptography: two keys. We encrypted using the private key, now we have to use the public key to decrypt:

> { ok, PemBin2 } = file:read_file("public_key.pem").
> [ RSAEntry2 ] = public_key:pem_decode(PemBin2).
> PublicKey = public_key:pem_entry_decode( RSAEntry2 ).

Now we have the public key. Let's decode our original message:

> Decrypted = public_key:decrypt_public(Encrypted, PublicKey).

You should get <<"Hello World!">> back.

Public Key Encode, Private Key Decode

The above example shows how to encode using a private key and decode using a public key. If you want to encode using a public key and then decode using a private key, just do the following:

> A1 = public_key:encrypt_public(<<"Isn't this fun">>, PublicKey).
> A2 = public_key:decrypt_private(A1, PrivateKey).

That's all!

Thursday, March 6, 2014

Erlang Unit Testing

Erlang Unit Testing is really simple. There is a system already built into Erlang OTP called EUnit. The documentation can be found here:

http://www.erlang.org/doc/apps/eunit/chapter.html

A couple of notes on usage:

1) A test is any function whose name ends in _test. So a test function might look like:

create_user_name_test() -> "JR" = (user:create("JR"))#user.name.

2) The tests that are created do NOT have to be exported by hand. They are exported automatically when the module includes the EUnit header (-include_lib("eunit/include/eunit.hrl").).

3) To run unit tests from the shell, simply do the following:

> eunit:test( ModuleName ).

And that's it. Thus far this is what I've been using to unit test my code. It seems to work pretty well, although it causes some conflicts with my Makefile every now and then. No big deal. Another difference I noticed between Erlang unit testing and something like gtest or UnitTest++ for C++ is how little I need checker macros. There doesn't appear to be much use for stuff like CHECK_EQUAL( , ); the reason is Erlang's natural ability to pattern match on the result of a call. In C++ you can't do the following:

13 = callingMyCppFunction();

That doesn't make sense in C++. In Erlang it makes perfect sense:

13 = calling_my_erlang_function().

And the test would succeed if calling_my_erlang_function() returns 13. Why add checker macros if your language already implicitly supports a similar feature?
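
For comparison, here is a rough sketch of what a C++ checker macro has to do by hand. LM_CHECK_EQUAL, lm_failures and the body of callingMyCppFunction are hypothetical names made up for this illustration; this is not the gtest or UnitTest++ API.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical function under test.
inline int callingMyCppFunction() { return 13; }

// Counts failed checks so a test runner could report them at the end.
static int lm_failures = 0;

// Compares two values; on mismatch, records the failure and reports
// the file, line, and the two expressions that were compared.
#define LM_CHECK_EQUAL(expected, actual)                               \
    do {                                                               \
        if (!((expected) == (actual))) {                               \
            ++lm_failures;                                             \
            std::cerr << __FILE__ << ":" << __LINE__                   \
                      << ": check failed: " #expected " == " #actual   \
                      << std::endl;                                    \
        }                                                              \
    } while (0)
```

A call such as LM_CHECK_EQUAL(13, callingMyCppFunction()); plays the role of the Erlang match above, except that the comparison and the failure report have to be spelled out by the macro.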

Wednesday, March 5, 2014

YAWS Home Router Non-Root Run

Last night I was starting to contemplate how to make my site more secure. A little about my site:

It runs on a Virtual Machine so as to insulate my host machine from attacks. It's easier to restore this way too; if someone hacks it, I just restore the machine back to what it was.
It's run from a home network. I have access to the router and its settings.
It runs CrunchBang Linux, a Debian-based distro (just like Ubuntu).

Up until this point I've been running YAWS as root. The reason why I've been doing this is because privileged ports (anything less than 1024) are not accessible for non-root users. At the site:

http://yaws.hyber.org/privbind.yaws

They give several options for how to run yaws as non-root. It's important that we run as a regular user instead of root. None of the ways documented in the YAWS documentation would work for me. I did like the idea of patching the kernel and maybe I'll try it later but for now I just wanted security.

This morning I came up with a really elegant solution for my situation: use an unprivileged port and then make the router re-route traffic from port 80 to my non-privileged port! It was such an easy solution that I didn't think it would work.

Step 1: Open yaws.conf and change your port to an unprivileged port. I used 8080:

port = 8080

...

</server>

Anywhere port = 80 is shown, change it to port = 8080. The 8080 port is greater than 1024 and no other application was using it.

Step 2: Find your IP Address. Just run the ifconfig command from the terminal. In the output, the inet addr field is your IP Address.

Step 3: Open your router configuration page. Normally you can just open a web browser and go to 192.168.1.1.

Step 4: Find the Port Forwarding page. For Linksys models you can go under Applications & Gaming > Single Port Forwarding.

Step 5: Fill out the following:

External Port: 80

Internal Port: 8080

Protocol: Both

To IP Address: (your computer's IP Address)

Enabled: Checked

It's that simple. Start your YAWS server as a non-root user and it just works. No fussing with configuration files or installing any other crazy software. Although modifying the kernel does sound fun...

Tuesday, March 4, 2014

YAWS Login Page Tutorial

Introduction

This blog entry is all about creating a YAWS-based login page. This is a lot more challenging of a task than I first thought! This entry should take you from almost 0 to a working login-page stub. First some links up-front:

http://hyber.org/yaws.pdf : Chapter 7 has most of the content in a really advanced form.
http://en.wikipedia.org/wiki/Web_cookie : What is a Web cookie? Seriously, I didn't know.
http://jrserlangnotes.blogspot.com/2014/02/yaws-setup.html : Just in case you need to know how to set up YAWS.

This is going to be a LONG post because there's a lot to learn and understand about web technologies, the Erlang language, and YAWS data structures. If you are looking for a quick guide just use the YAWS guide- this tutorial is probably better for the beginner.

Prerequisites

This entry assumes you have a working YAWS system in place and are familiar with your system. This entry is using CrunchBang Linux but all of the ideas should work anywhere. The reader will also need to be able to set environment variables. This just makes our life easier.

Setup YAWS_INCLUDES

The very first thing that needs to be done is identifying where a file called yaws_api.hrl is. This file is important because it's where the "Argument" record is defined. The example in Chapter 7 of the YAWS documentation seems to hint at wanting to change the Argument (more on this later). There are other scenarios where you will want access to the yaws_api.hrl file, so let's make it easier to get to.

1) Find yaws_api.hrl on your system. In Windows you can do a fancy search in the explorer for the file. On Linux, be more fancy and amaze your friends:

> find . -name yaws_api.hrl

Obviously run this from a directory high enough in the tree that the file can be found beneath it. Since I built mine from source it's in:

/usr/local/lib/yaws/include/yaws_api.hrl

I did a search through the contents of the directory and it doesn't appear that any of them use any other include directives, so we don't have to worry about pathing so much.

2) Add an environment variable called YAWS_INCLUDES.

In Windows a shortcut key to get to the place to add system environment variables is to hold down the Windows key and press the Break button (get it, break-Windows?). I give credit to Josh for that one. The dialog that pops up should have an "Advanced system settings" where you can access the environment variables from.

In Linux, modify your .bashrc file to have this definition:

> vim ~/.bashrc

Add at the bottom:

export YAWS_INCLUDES="/usr/local/lib/yaws/include"

And save. Once back out type:

> source ~/.bashrc

That will reload your bashrc without the headaches of logging out and back in.

3) Check to make sure your environment variable was set.

Open up a Terminal in Linux (or a Command Window in Windows). In Windows type:

> echo %YAWS_INCLUDES%

And in Linux:

> env | grep YAWS_INCLUDES

How Blocking Pages Works (Part 1)

The first question that I had about the YAWS webserver (or any webserver) was: How do I make sure that pages that are sensitive are blocked to other users? New developers have no clue how this mechanism works. It all boils down to the "Arg" request.

When the customer of your website types in a URL:

http://www.yourawesomesite.com

or whatever, the browser puts together a request and sends it over a TCP Socket port 80. The YAWS webserver is listening on port 80 and turns that request into an Arg record which can then be consumed by our Erlang code via the "out" functions (remember, they take an Arg as input). What YAWS can do is it can take this request and conditionally rewrite it before it ever gets processed. This means that if a user is seen as "not logged in" we can redirect them from blocked pages to pages that are safe for general viewing (without a login). I'm going to set up an example similar to what they have in the YAWS documentation but the difference is I'm going to go in reverse.

The first step is to set up a container of allowable pages that the user will want to visit. In the YAWS book the name of the file is "myapp.erl", we are going to set up "lm_app.erl". Note: I am putting all of my Erlang modules inside of the www directory of my YAWS website. This is probably not the best place for these pieces of code!

%%=======================
%% lm_app.erl
%%
%% @version 0
%%=======================

-module(lm_app).

-include("$YAWS_INCLUDES/yaws_api.hrl").

login_pages() ->
    [ "/index.yaws", "/lewismanor.jpg" ].

Please note that I'll keep incrementing the @version tag at the top during each iteration of the code in this document. The first line that does anything is the -include directive. This is needed because eventually we want to modify the incoming request. The incoming http request is stuffed into a data structure defined by the YAWS source code. Conveniently in a step before this we found out where it lives and created the environment variable YAWS_INCLUDES.

The login_pages() function simply outlines which pages on the system don't require login credentials! In this case, I've made the main page and the title image both accessible without credentials. The next step is to create a main entry point for Argument "rewriting". This function is called "arg_rewrite". The way you specify your own custom arg_rewrite function is to change the module that YAWS looks under. For now let's create a stub for this function (just to prove what it does):

%%=======================
%% lm_app.erl
%%
%% @version 1
%%=======================

-module(lm_app).

-include("$YAWS_INCLUDES/yaws_api.hrl").

-export([arg_rewrite/1]).

login_pages() ->
    [ "/index.yaws", "/lewismanor.jpg" ].

arg_rewrite(Arg) ->
    io:fwrite("ARGUMENT REWRITING HIT!~n"),
    Arg.

This function alone isn't going to be enough since we haven't specified to YAWS what module should be used to do argument rewriting. Also please note that before continuing the code in lm_app.erl should be compiled and in beam format!

Open up the yaws.conf file and add a line to the server:

<server yourawesomesite.com>
port = 80
arg_rewrite_mod = lm_app
listen = 0.0.0.0
docroot = ...
auth_log = ...
appmods = ...
</server>

The arg_rewrite_mod line is what you want to add. This is stating to YAWS that we want the module lm_app to supply arg_rewrite. As an experiment, once you are done, restart your YAWS server (make sure that lm_app.beam is also available!).

> sudo yaws -i

After YAWS starts, navigate to your website from another panel. You should see:

ARGUMENT REWRITING HIT!

This appears in your terminal window. What is happening is that when a customer accesses your website page (index.yaws), YAWS creates an "Argument" that models the http request, and that gets fed to lm_app:arg_rewrite for modification. Whatever Argument is returned from lm_app:arg_rewrite is what is used going forward!

What's in Argument?

I have to interrupt the previous topic about blocking pages because it's a good time to cover what's in Argument. Every member of Argument is documented in Chapter 4 of the YAWS documentation (right now it's on page 14). Here are the contents (as of today):

Connection Information( clisock, client_ip_port ): Developers have access to the socket that the client is using for connection and the client's ip address and port.
Header Information (headers)
HTTP Request Information (req): The request can further be broken down into three more pieces of information: method, path, version.

During a session some information needs to be available: username, password, possible other data. To define this information we can use an hrl file. I'm using similar information as the documentation:

%%=======================

%% lm_session.hrl
%%

%% @version 0

%%=======================

-record( session, { user, passwd, udata=[]}).

Make sure that this lm_session.hrl file is accessible from within the code.

The Login Page

At this point I want to skip to the login page, the actual page that takes the username and password. Inside of index.yaws:

<html>
    <title> My Awesome Page </title>
        <form action="/login_post.yaws" method="post">
            <p> UserName <input name="uname" type="text">
            <p> Password <input name="passwd" type="password">
            <input type="submit">
        </form>
</html>

Notice I'm not using the ehtml; I ran into an error where it simply stated that YAWS had an internal error due to formatting in the ehtml attributes. It was a pretty cryptic error- so to simplify we'll use basic HTML.

The important component of the form is the "login_post.yaws" target. This means we need to add a file called login_post.yaws that will handle the login information. Another thing to take notice of is that login_post.yaws is listed in the allowable login pages.

%%=======================
%% lm_app.erl
%%
%% @version 2
%%=======================

-module(lm_app).

-include("$YAWS_INCLUDES/yaws_api.hrl").

-export([arg_rewrite/1]).

login_pages() ->
    [ "/index.yaws", "/lewismanor.jpg", "/login_post.yaws" ].

arg_rewrite(Arg) ->
    io:fwrite("ARGUMENT REWRITING HIT!~n"),
    Arg.

Now inside of login_post.yaws:

<!--

%%=======================
%% login_post.yaws
%%
%% @version 0
%%=======================
-->

<erl>

-include("lm_session.hrl").

%% Look up key K in the parsed post data L and return its value.
kv(K,L) -> { value, {K, V}} = lists:keysearch(K,1,L), V.

out(A) ->
    L = yaws_api:parse_post(A),
    User = kv("uname", L),
    Passwd = kv("passwd", L),
    { html, f("User Name: ~s<br>Password: ~s", [User, Passwd])}.

</erl>

There are some key differences between what is in the YAWS documentation and what is here. First of all, this is just a jumping off point because this code doesn't do anything other than print your username and password on the webpage. Not really useful except to debug and to foreshadow the use of cookies and authentication (next section).

Cookies

We have the login page (index.yaws) and the page that it goes to after logging in (login_post.yaws). The next thing we need is to set a cookie in the logging in page- the cookie needs to base its information on the username, password, and whether or not it is valid. Now we set up the real code:

%%=======================
%% lm_app.erl
%%
%% @version 3
%%=======================

-module(lm_app).

-include("$YAWS_INCLUDES/yaws_api.hrl").

-export([arg_rewrite/1, authenticate/2]).

login_pages() ->
    [ "/index.yaws", "/lewismanor.jpg", "/login_post.yaws" ].

arg_rewrite(Arg) ->
    io:fwrite("ARGUMENT REWRITING HIT!~n"),
    Arg.

authenticate(User, Password) ->
    if
        User =:= "me" andalso Password =:= "password" -> ok;
        true -> false
    end.

We have now added the authenticate function. Obviously you want to stub this with your own authentication logic; for now it is filled with dummy data. The username is "me" and the password is "password". Make sure to compile lm_app.erl into lm_app.beam and load the module back into yaws (see prior blog posts about how to do this).

Somewhere you have to call this authenticate function. Here is the code from my new and improved login_post.yaws file:

<!--

%%=======================
%% login_post.yaws
%%
%% @version 1
%%=======================
-->

<erl>
-include("lm_session.hrl").
kv(K,L) -> { value, {K,V}} = lists:keysearch(K,1,L), V.

out(A) ->
    L = yaws_api:parse_post(A),
    User = kv("uname", L),
    Passwd = kv("passwd", L),

    case lm_app:authenticate(User, Passwd) of
        ok ->
            { html, f("Login Succeeds!", [])};
        false ->
            { html, f("Login Fails!", [])}
    end.

</erl>

This one gave me trouble because the YAWS documentation is incorrect about how to use kv.

Now let's add cookie creation to the source code:

<!--
%%=======================
%% login_post.yaws
%%
%% @version 2
%%=======================
-->

<erl>
-include("lm_session.hrl").
kv(K,L) -> { value, {K,V}} = lists:keysearch(K,1,L), V.

out(A) ->
    L = yaws_api:parse_post(A),
    User = kv("uname", L),
    Passwd = kv("passwd", L),

    case lm_app:authenticate(User, Passwd) of
        ok ->
            S = #session{ user = User,
                          passwd = Passwd,
                          udata = [] },
            Cookie = yaws_api:new_cookie_session(S),
            [ {redirect_local, "/inside.yaws"},
              yaws_api:setcookie("lm_sid", Cookie) ];
        false ->
            { html, f("Login Fails!", [])}
    end.

</erl>

The code added in the ok branch creates a new cookie session and registers it with our server. It doesn't do much past this. My inside.yaws looks like this:

<!--
%%=======================
%% inside.yaws
%%
%% @version 0
%%=======================
-->

<html>
<body> Made it in! </body>
</html>

This part of the example just proves how redirect works. It's similar to the format of {html, "" } or {ehtml, ... } except in this case it just redirects the web browser. This process isn't finished yet. Before we continue, I have to mention that we require a couple more utility functions. These functions are actually listed in Chapter 7 of the YAWS documentation.

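%%=======================
%% lm_app.erl
%%
%% @version 4
%%=======================

-module(lm_app).

-include("$YAWS_INCLUDES/yaws_api.hrl").

-export([arg_rewrite/1, authenticate/2,
         check_cookie/2, get_cookie_val/2]).

login_pages() ->
    [ "/index.yaws", "/lewismanor.jpg", "/login_post.yaws" ].

get_cookie_val(CookieName, Arg) ->
    H = Arg#arg.headers,
    yaws_api:find_cookie_val(CookieName, H#headers.cookie).

check_cookie(A, CookieName) ->
    case get_cookie_val(CookieName, A) of
        [] -> {error, "Not Logged In" };
        Cookie -> yaws_api:cookieval_to_opaque(Cookie)
    end.

arg_rewrite(Arg) ->
    io:fwrite("ARGUMENT REWRITING HIT!~n"),
    Arg.

authenticate(User, Password) ->
    if
        User =:= "me" andalso Password =:= "password" -> ok;
        true -> false
    end.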

The functions that are added are for convenience and are used to extract the opaque structure (listed in lm_session.hrl) from the incoming cookie data. These were taken directly from the YAWS documentation.

How Blocking Pages Works (Part 2)

The bulk of the work is now finished, with the exception that customers can still navigate to hidden pages. To fix this, we need to revisit argument rewriting from above. The logic is this: if our magic cookie has been set, we have logged in and should have access to private files. If no cookie has been set, the website should redirect back to the main page (index.yaws).

Let's begin by rewriting arg_rewrite so that it will do this redirect magic for us:

%%=======================
%% lm_app.erl
%%
%% @version 5
%%=======================

-module(lm_app).

-include("$YAWS_INCLUDES/yaws_api.hrl").

-export([arg_rewrite/1, authenticate/2,
         check_cookie/2, get_cookie_val/2]).

login_pages() ->
    [ "/index.yaws", "/lewismanor.jpg", "/login_post.yaws" ].

get_cookie_val(CookieName, Arg) ->
    H = Arg#arg.headers,
    yaws_api:find_cookie_val(CookieName, H#headers.cookie).

check_cookie(A, CookieName) ->
    case get_cookie_val(CookieName, A) of
        [] -> {error, "Not Logged In" };
        Cookie -> yaws_api:cookieval_to_opaque(Cookie)
    end.

do_rewrite(Arg) ->
    Req = Arg#arg.req,
    { abs_path, Path } = Req#http_request.path,
    case lists:member(Path, login_pages()) of
        true -> Arg;
        false ->
            Arg#arg{
                req = Req#http_request{
                    path = { abs_path, "/index.yaws" }
                },
                state = { abs_path, Path }
            }
    end.

arg_rewrite(Arg) ->
    OurCookieName = "lm_sid",
    case check_cookie(Arg, OurCookieName) of
        {error, _} -> do_rewrite(Arg);
        {ok, _Session} -> Arg
    end.

authenticate(User, Password) ->
    if
        User =:= "me" andalso Password =:= "password" -> ok;
        true -> false
    end.

That's all. To explain, any request incoming to the webserver goes to arg_rewrite. Our cookie name is "lm_sid", therefore that's what we are looking for. If when we check for the cookie it doesn't exist (meaning we don't have credentials) we rewrite the request. If the cookie does exist we pass along the request as usual (by passing the Argument out).

Rewriting the request is done in do_rewrite. The original request is obtained from the incoming Argument (Arg). From that request we can determine if the request is asking for a safe page (any page listed in login_pages()). If it is, pass along the request as usual (returning Arg). If it isn't legit we create a new Argument from the existing Arg (Arg#arg{ ... }) where the path redirects to index.yaws.

Conclusion

By the time all of this is set up, the website should be accessible. When trying to access a private page (e.g. inside.yaws), it will redirect you back to the main login page. If you log in, a cookie is set and you will be granted access to those private pages. Subsequent access to the site will allow you to side-step logging in because the cookie was set.

Hopefully this blog entry helps another developer out. The best advice I can give anyone embarking on this same course is to doubt the documentation. A couple times I got stuck because I typed what was in the examples verbatim; some of the examples are flawed and written for someone who is more intermediate.

Saturday, March 1, 2014

Running Code on Load

on_load

According to:

http://www.erlang.org/doc/reference_manual/code_loading.html

We can run code as soon as a module is loaded by using a directive called on_load:

-module(test2).
-on_load(run_it/0).

run_it() -> io:format("RAN IT!~n").

Notice that an export isn't necessary! The function does have to return ok (io:format does), otherwise the module is unloaded again. So from the shell when I do this:

> c(test2).

I get:

RAN IT!
ok

Just a neat little trick.

Dynamic Compilation

From the Erlang shell you can use:

> c(modulename).

To compile and load a module. From within source code, call compile:file/1, which compiles the file to a .beam but does not load it:

> compile:file("filename.erl").

So for an example I have the contents of test1.erl:

-module(test1).
-export([dostuff/0]).

dostuff() ->
    compile:file("test2.erl"),
    test2:printMessage().

Contents of test2.erl:

-module(test2).
-export([printMessage/0]).

printMessage() -> io:fwrite("Hello World!~n", []).

Next step is to open the Erlang shell and load in test1.erl:

> c(test1).

If you open a terminal in the same directory, you'll notice the contents are:

test1.erl
test1.beam
test2.erl

But there is no test2.beam because it hasn't been compiled/run yet. Now from the Erlang shell where the compile was run do the following:

> test1:dostuff().
Hello World!
ok

If you list the directories it shows:

test1.erl
test1.beam
test2.erl
test2.beam

The test2.erl file was compiled from source, and the resulting test2.beam was loaded on the first call to test2:printMessage(). Pretty cool that we can dynamically load code as needed.

Wednesday, February 26, 2014

Binary Message Encoding

I've learned a couple of really cool pieces of syntax since yesterday, and they're all about binaries. Developers can directly manipulate strings of binary information in a natural way in Erlang. Say you have a server somewhere that is written in C++. In my case I used Qt as my socket provider. The server packs messages with code like this:

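#include <QByteArray>
#include <QString>
#include <QTcpSocket>

// Reinterprets a quint32 as raw bytes. The byte order is whatever the
// host uses (little-endian on x86).
union CharToInt
{
    quint32 number;
    char buffer[ sizeof(quint32) ];
};

void packStringMessage(QString const& str, QTcpSocket& socket)
{
    QByteArray const bytes = str.toLocal8Bit(); //< Simple char-array.
    CharToInt c;
    c.number = bytes.size(); //< Byte count; str.size() counts QChars instead.
    socket.write(c.buffer, sizeof(quint32));
    socket.write(bytes);
}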

This code is simple. It encodes the size of the string followed by the string itself. In Erlang you can simply do the following (once you have the message):

<<Size:32/little, Payload/binary>> = Message.

It will store the size of the message inside of Size and the rest of the binary data inside of Payload! What are the arguments all about? Well let's dissect:

<<Size:32/little, Payload/binary>> = Message.

Binaries are inside of << and >>. Whatever is inside is the makeup of the binary.

<<Size:32/little, Payload/binary>> = Message.

Notice that we can do the same pattern matching strategy as lists/function calls. I think this is the defining feature of Erlang.

<<Size:32/little, Payload/binary>> = Message.

The Size token is the variable that the matched bits are bound to. The :32 specifies how many bits we are using. I'm taking some liberties with this example: quint32 is always 32 bits wide, but I'm assuming the C++ side writes it in little-endian byte order, which is what the union does on a little-endian machine such as x86. That's what the /little states to Erlang. The data that Size corresponds to should be 32 bits and in little-endian notation! Really convenient.

<<Size:32/little, Payload/binary>> = Message.

The final part to this is that the end of the binary (everything else) will be stored in Payload and should be binary data. I was surprised at how easy it was to decode binary messages in Erlang.
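For comparison, here is a rough sketch of the same framing in plain C++ without Qt. It writes the size prefix one byte at a time so the wire format stays little-endian regardless of the host, and it decodes the message the way the Erlang pattern does. The names packMessage and unpackMessage are made up for this example.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: prefix the payload with its byte count as a
// 32-bit little-endian integer, independent of the host's byte order.
std::string packMessage(std::string const& payload)
{
    std::uint32_t const size = static_cast<std::uint32_t>(payload.size());
    std::string message;
    for (int shift = 0; shift < 32; shift += 8)
        message.push_back(static_cast<char>((size >> shift) & 0xFF));
    message += payload;
    return message;
}

// Hypothetical helper: the C++ counterpart of
// <<Size:32/little, Payload/binary>> = Message.
// size receives the decoded prefix; the return value is everything after it.
std::string unpackMessage(std::string const& message, std::uint32_t& size)
{
    if (message.size() < 4)
        throw std::runtime_error("message shorter than its 4-byte size prefix");
    size = 0;
    for (int i = 0; i < 4; ++i)
    {
        std::uint32_t const byte = static_cast<unsigned char>(message[i]);
        size |= byte << (8 * i);
    }
    return message.substr(4);
}
```

Like the Erlang match, unpackMessage does not check that the prefix agrees with the actual payload length; it just splits the two apart. The eight lines of decoding here are what the one-line binary pattern replaces.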

Tuesday, February 25, 2014

Redis

Taking a break from Erlang momentarily to install Redis. Redis is a database that specializes in storing key-value pairs.

Getting Redis

1) Go to the Redis downloads page and download the latest tarball source. http://redis.io/download I'm using version 2.8.6, which is the latest stable version as of today's date.

2) Unpack Redis:

> gunzip redis-2.8.6.tar.gz
> tar -xf redis-2.8.6.tar

3) Step into the redis directory and build the source:

> cd redis-2.8.6
> make
> sudo make install

At this point I'd like to give props to the Redis developers. To have a Makefile that "just works" is phenomenal! So far I'm really liking Redis- if only for the ease of building/installation.

Running Redis

I should also note at this point that, similar to how yaws requires an instance to be running either as a daemon or interactively, redis needs a running server. To run the server in interactive mode:

1) Open a Terminal.

2) Type:

> redis-server

You should see some decorative text. If you skip this step, anything that tries to connect to the server may give you an error message such as:

Could not connect to Redis at 127.0.0.1:6379: Connection refused

Not good. So to avoid this, run redis-server and be happy.

Erlang Tie-In

Introducing eredis, by GitHub user wooga. This is some cool stuff: you can use the instructions at https://github.com/wooga/eredis to connect to the redis server from your Erlang application! The instructions are good so I won't repeat them here. Something interesting to note is that the function eredis:q takes two parameters: a connection to the redis database and a list. The list is cool because it holds the string components of the query. If you are going to set some data:

eredis:q( Connection, [ "SET", "mykey", "Hello World" ] ).

It doesn't get much simpler. The return from this function is a tuple where the first element should be ok or whatever the resulting status was. The second element will be whatever values were returned.
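Passing the query as a list of strings lines up with how Redis commands travel over the wire: the Redis protocol (RESP) sends a command as an array of length-prefixed bulk strings. The sketch below shows that encoding in C++. It is only an illustration of the protocol and is not taken from eredis; the name encodeCommand is made up.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: encode the parts of a command in the Redis
// serialization protocol (RESP) as an array of bulk strings.
std::string encodeCommand(std::vector<std::string> const& parts)
{
    // "*<count>" announces how many parts follow.
    std::string out = "*" + std::to_string(parts.size()) + "\r\n";
    for (std::string const& part : parts)
    {
        // Each part is "$<byte length>", then the bytes themselves.
        out += "$" + std::to_string(part.size()) + "\r\n";
        out += part + "\r\n";
    }
    return out;
}
```

Because every part carries its own length, a value such as "Hello World" needs no quoting or escaping, which is presumably why a list of components is a more natural argument than one long command string.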

Conclusion

Being unsure about databases I really like redis. It's very simple- just key/value. Connecting to it couldn't be easier. Running it couldn't be easier. I'm thinking that the purpose of this post is to try and get the word out about this. If I run into any problems later I'll come back to do an addendum.

Monday, February 24, 2014

Makefile

Introduction

The way to structure projects in Erlang is a mystery to me. In C++, it's simple because everything names itself in the projects:

/project
    include/
        project/
    src/
    test/
    resources/
    views/

Then in the root directory I normally place my qmake or premake4 files. This seems like second nature. After building we'd also end up with some other directories to hold binaries, libraries, and object files.

In Erlang we have the notion of macro/definition files (hrl's); we also have source code (erl's). The compiled output for Erlang is a beam file. How does everything normally get structured? To better answer this question I thought I'd examine the Erlang makefile.

Getting It

I grabbed a version from git:

> git clone https://github.com/extend/erlang.mk.git

This gives you erlang.mk. Have this handy to copy into your experimental directory.

Trying it Out

Follow the steps below to make a small test:

> mkdir experiment
> cd experiment
> cp ../erlang.mk/erlang.mk ./ # Or wherever erlang.mk exists.
> vim Makefile # Or your editor of choice.

Inside of Makefile, just add:

include erlang.mk

And save. Back in the terminal try this:

> make

This won't work because we don't have any sources set up! It will complain with:

find: `src': No such file or directory
find: `src': No such file or directory
find: `src': No such file or directory
find: `src': No such file or directory
APP    experiment.app.src
cat: src/experiment.app.src: No such file or directory

This is because we need to make a directory called src and put some Erlang sources inside of it:

> mkdir src
> cd src
> vim test1.erl

Put a test program inside of test1.erl, could be something simple like:

-module(test1).
-export([blah/0]).

blah() -> "Hello".

After saving, step back up a directory and try to make again:

> cd ..
> make

This time it will create a directory called "ebin"; this is where all of the beam files end up. Some additional notes on structuring Erlang programs can be found here:

http://www.erlang.org/doc/design_principles/applications.html#7.4

The hrl files can go into an include directory.

Dependencies

We have our source code but we need to use other people's libraries/projects in our own. The Erlang Makefile is like a dream come true for this purpose: it can automatically pull down your required dependencies directly from git. I suspect it can do more than just that but for now we'll end with that.

On the lines before include erlang.mk, add the dependency settings:

DEPS = eredis
dep_eredis = https://github.com/wooga/eredis.git
include erlang.mk

Now try building. Pretty impressive, isn't it? The format here is to list your dependencies after DEPS, then for each entry add a dep_ENTRY = line followed by the location of the target library. The Makefile then automates building everything for you. I can't emphasize enough how cool this is, although most Erlang developers probably already know it.