Friday, May 17, 2013

Qt 5 for Android (Fail in MinGW)

I am very interested in running Qt 5 on Android. Qt/QML is THE platform for developing awesome user interfaces in less time than it takes to eat a delicious grilled dinner. When Digia/Qt Open Source announced that it would target Android, I fired up the grill. But I was disappointed: even though there is a wiki page with full instructions on how to set up Qt Creator to build Android apps, the setup involves a copious number of steps. Having fallen victim to instructions like these in the past (they go out of date, they don't give enough information, and so on), I figured I would write about my experience following the easy steps online.

The easy steps are found at:

http://qt-project.org/wiki/Qt5ForAndroidBuilding

Some background about myself: I am familiar with C++ and have a great love for it. I don't know a thing about developing for mobile/embedded platforms, so this will be a challenge as I'm unfamiliar with the technical mumbo jumbo. I guess this makes me the perfect guinea pig for this installation. Before beginning, take note of the system you are on; in particular, know whether you have a 64-bit or a 32-bit system. I'm doing this installation on a 32-bit system. Let's start!

0) I'm adding a step 0. All of the steps rely on tools you may not already have. The list of tools I've found so far includes:

  • Perl: I used ActivePerl v5.16.3, which seems to work so far.
  • 7-Zip: I used this to unzip the NDK once you get it.
  • The Git command line (when installing, make sure git is added to the command-line PATH).
  • The MinGW compiler, with the compiler in the path.
  • The Java Development Kit (I'll talk more about this later).
After you are done installing all of these tools, make sure they all work on the command line. Open it up and try invoking each program (perl, git, and javac). 7-Zip can be used via a context menu (right click).

1) Android SDK. I downloaded this from http://developer.android.com/sdk/index.html. When I double-clicked and installed, I selected the 4.0 package (that's what my ASUS Transformer can handle). This seemed easy enough, so I'm still feeling good about this whole thing.

2) Android NDK. They say you have two options: use the official NDK from Google or use the custom NDK provided at http://code.google.com/p/mingw-and-ndk/downloads/list (obviously they prefer the latter). This is where I get a little skittish. There are a lot of choices you can make, so the best advice I can give is to try to match your platform as closely as you can. In my case, I chose android-ndk-r8e-ma-windows-x86.7z because I want the Android NDK for Windows x86. It was the closest match.

3) Qt Creator 2.7.1. I didn't build this one; I downloaded the binary from http://qt-project.org/downloads/#qt-creator .

4) I cloned Qt 5 using the instructions on the main site:

git clone git://gitorious.org/qt/qt5.git qt5
cd qt5
perl init-repository

Surprisingly, the last command is what seems to be taking forever. This will do a bunch of checkouts into your system using git. In fact, it's taking so long I'm going to go to sleep.

5) I attempted to perform a configure call using:

./configure -developer-build -xplatform android-g++ -nomake tests -nomake examples -android-ndk <path/to/ndk> -android-sdk <path/to/sdk> -android-ndk-host windows-x86 -skip qttranslations -skip qtwebkit -skip qtserialport -skip qtwebkit-examples

This didn't work because configure did not understand the option "android-ndk-host". At this point, my confidence was shot; I knew the rest of the steps wouldn't work no matter what. I have two issues with this configure step: 1) android-ndk-host isn't a recognized flag (the biggest problem), and 2) where is the documentation for android-ndk-host? If you are on Windows, is the argument win32-x86 or windows-x86, or does it require the compiler name? Some consistency across the platform names would make this a smoother build process, but I digress.

After taking out -android-ndk-host windows-x86, it seemed to start working. At the very least, configuration "succeeded". I chalked it up to, "Maybe the Qt development team removed the android-ndk-host flag; maybe I'm going to be okay!" No.

6) To build this thing, I had to use mingw32-make. This is essentially a cross-compilation build, and the mingw32 tools are fine for it; one of the compilers in the NDK toolchain actually does the build. The next problem I ran into is that this will not build with mingw32 on Windows. It fails with:

Cannot find -lQtCore5

A valid error. I searched through the directory and did locate libQtCore5.so. Errors like this usually come from a build-script problem where the library isn't being put in the correct place. Lo and behold, I got this error when running mingw32-make:

process_begin: CreateProcess(NULL, move, libQtCore.so.5.1 "..\..\lib ", ...) failed.

Instead of trying to troubleshoot the build scripts on a platform that is fighting me the entire way, I think I'm going to jump over to Linux, where the build looks much more "stable". Sorry if it sounds like I am complaining about the Qt build, but I think these are all perfectly valid flaws in the build process. The biggest flaw is the lack of documentation about what the configure variables do.

Wednesday, May 15, 2013

There Must Be Some Use For...

There are lots of obscure features in C++. One of them is "pointers to class members". I remember once looking at some high-performance code that used this bizarre feature, and I wondered why it would be preferable. I'm not claiming this is a fair test, but it definitely leaves room for interpretation on what could be faster:

main1.cpp:

#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

enum
{
    SAMPLE_SIZE = 65536
};

class A
{
    public:
        A() : description(), id(rand() % 32767) { }

        std::string description;
        int id;

};


std::ostream& operator<<(std::ostream& out, A const& a)
{
    out << "A(" << a.id << ")";
    return out;
}


int main(int argc, char** argv)
{
    std::vector<A> elements;
    elements.reserve(SAMPLE_SIZE);
    for(size_t i=0; i<SAMPLE_SIZE; ++i)
    {
        elements.push_back(A());
    }

    std::ostream_iterator<A> iter(std::cout, "\n");
    std::copy( elements.begin(), elements.end(), iter);   
}


And I'm going to compare against main2.cpp:

#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

enum
{
    SAMPLE_SIZE = 65536
};

class A
{
    public:
        A() : description(), id(rand() % 32767) { }

        std::string description;
        int id;

};


std::ostream& operator<<(std::ostream& out, A const& a)
{
    out << "A(" << a.id << ")";
    return out;
}


int main(int argc, char** argv)
{
    std::vector<A> elements;
    elements.reserve(SAMPLE_SIZE);
    for(size_t i=0; i<SAMPLE_SIZE; ++i)
    {
        elements.push_back(A());
    }

    int A::*idPtr;
    idPtr = &A::id; // Just a numeric value.

    for(size_t i=0; i<SAMPLE_SIZE; ++i)
    {
        std::cout << "A(" << elements[i].*idPtr << ")\n";
    }   
}


The obvious difference is how we iterate over the container. In main2.cpp I'm using the pointer to class member to access the value, whereas the first program uses iterators to perform the output. Both solutions are probably acceptable, but I'm seeing better results for main2.cpp! At first I kept the sample size low (under 32767 elements), and the numbers were identical. Once I upped the sample size, a discrepancy started to appear.

The result:

main1:
real         0m0.021s
user         0m0.016s
sys          0m0.000s

main2:
real         0m0.019s
user         0m0.016s
sys          0m0.000s

Like I disclosed before, I'm not sure if this is a fair test. Commentary would be nice on this post. The compiler used was gcc 4.7 on Ubuntu running in a virtual machine. I may consider researching this topic further to see if the performance improvements are consistent.


Tuesday, May 14, 2013

The Amazing Shared Pointer

Quick entry tonight since I'm bogged down with work for actual work.

Did you know you could add a custom deleter to a shared pointer in boost? Try the following:

#include <boost/shared_ptr.hpp>
#include <iostream>

void doSomething(int* i)
{
    std::cout << "DOING SOMETHING" << std::endl;
}

int main(int argc, char** argv)
{
    boost::shared_ptr<int> blah(new int, &doSomething);
}

Obviously the second argument to the shared pointer can be any sort of callable, provided it can be invoked with a pointer to the managed type. One caveat: the custom deleter completely replaces the default delete, so if the resource actually needs freeing, the deleter has to do it (doSomething above never deletes its int).

Like I said, short and lame tonight.

Monday, May 13, 2013

Direct vs. Copy Initialization (Part 1)


#include "a_func.h"
#include <iostream>

int main(int argc, char** argv)
{
    std::cout << "Case 1 A a1;\n";
    A a1;

    std::cout << "\nCase 2: A a2 = A();\n";
    A a2 = A();

    std::cout << "\nCase 3: A a3(a2);\n";
    A a3(a2);

    std::cout << "\nCase 4: A a4; a4 = a3;\n";
    A a4;
    a4 = a3;

    std::cout << "\nCase 5: A getA() { return A(); } A a5 = getA();\n";
    A a5 = getA();

    std::cout << "\nCase 6: A getA2() { A a; return a; } A a6 = getA2();\n";
    A a6 = getA2();

    std::cout << "\nCase 7: A getA() { return A(); } A a7; a7 = getA();\n";
    A a7;
    a7 = getA();

    std::cout << "\nCase 8: A getA2() { A a; return a; } A a8; a8 = getA2();\n";
    A a8;
    a8 = getA2();

    std::cout << "\nCLEANUP!\n";
}


/**

RESULTS


Case 1 A a1;
    A()

Case 2: A a2 = A();
    A()

Case 3: A a3(a2);
    A(A const&)

Case 4: A a4; a4 = a3;
    A()
    A& operator=(A const&)

Case 5: A getA() { return A(); } A a5 = getA();
    A()

Case 6: A getA2() { A a; return a; } A a6 = getA2();
    A()

Case 7: A getA() { return A(); } A a7; a7 = getA();
    A()
    A()
    A& operator=(A const&)
    ~A()

Case 8: A getA2() { A a; return a; } A a8; a8 = getA2();
    A()
    A()
    A& operator=(A const&)
    ~A()

CLEANUP!
    ~A()
    ~A()
    ~A()
    ~A()
    ~A()
    ~A()
    ~A()
    ~A()

ANALYSIS:

Trying to clear up DIRECT vs. COPY initialization. It appears
that DIRECT INITIALIZATION requires NO CONVERSION. The
"constructor is available and is an exact match" [1]. The object
is initialized using a single constructor [2].

COPY INITIALIZATION is a little more difficult to understand.
It seems to consist of a series of "conversions". It looks
for ways to do the construction [1].

Case 1: It's obvious what's going on here, we construct an
        object of A using the default constructor. It is
        destroyed at the end of the scope of main. DIRECT.

Case 2: This one surprised me- only because I've been
        mistaken a lot. Due to copy elision (the same
        mechanism behind the RETURN VALUE OPTIMIZATION),
        we avoid the cost of a temporary and a copy
        constructor; no assignment operator is involved.

Case 3: This is a simple copy constructor object. Definitely
        is a DIRECT INITIALIZATION.

Case 4: The RETURN VALUE OPTIMIZATION doesn't apply here
        because no initialization from a temporary is
        taking place. We construct an object using
        DIRECT INITIALIZATION first, then we are forced
        to use the assignment operator.

Case 5, 6: These surprised me. Only 1 constructor for A fires?
           The compiler is eliding the copies (the return
           value optimization again). This is COPY.

Case 7: This makes sense: we default-construct a7 once,
        then getA() creates a temporary A, and finally the
        ASSIGNMENT OPERATOR copies it into a7 (the
        temporary is then destroyed). This is plain
        assignment, not initialization.

case 8: We create a8 with a default constructor, then call getA2
        which creates another object of class A. Finally the
        ASSIGNMENT OPERATOR copies it into a8- like case 7,
        this is assignment rather than initialization.

It looks like Herb Sutter has some pretty interesting
articles. Going to go over them a bit more because there is
a lot more nuance to this than I originally thought. (This
post was supposed to be short!).

REFERENCES

[1] - http://stackoverflow.com/questions/1051379/is-there-a-difference-between-copy-initialization-and-direct-initialization

[2] - http://www.gotw.ca/gotw/036.htm

*/

Sunday, May 12, 2013

Uncharted Territory

Different projects place different demands on C++. A developer writing medical software would probably focus more on making the code correct than fast; a developer writing trading software would want it fast; a developer writing in-house tools would want the code readable. C++ lets us do all of these things. Keep in mind that these are priorities: just because a trading software engineer focuses on speed doesn't mean safety goes out the window, it just means safety might have a lower priority.

I've been thinking about ways to measure the efficiency of compiled programs. Most of the time I would use a profiler to do the work (callgrind). This is overkill for small snippets of code that I want to compare. 

The first alternative would be the Linux "time" command. It's simple: call time with the name of your executable, and it comes back with listings of real time, user time, and sys time. This could be a valuable way to test small programs; I tried searching for the pitfalls of the approach. Granted, it wouldn't work for large-scale applications with threading/GUI components, unless of course there was a tremendous bottleneck (it would have to be enormous).

The second alternative is to compile to assembly language. I think this is a perfect solution if you can understand the assembly behind it. After some quick googling I found out that clang can do this:

clang++ -S -mllvm --x86-asm-syntax=intel main.cpp

This produces a main.s file with a whole bunch of assembly language output in it. My suggestion for alternative #2 is to write small programs, compile them to assembly, and count and analyze the instructions. Let's try a small program:

int main(int argc, char** argv)   
{                                 
    int i = 0;                    
    ++i;                          
}                                 

The assembly for this program is:

    .file    "main.cpp"
    .text
    .globl    main
    .align    16, 0x90
    .type    main,@function
main:                                   # @main
# BB#0:
    sub    ESP, 16
    mov    EAX, DWORD PTR [ESP + 24]
    mov    ECX, DWORD PTR [ESP + 20]
    mov    DWORD PTR [ESP + 12], 0
    mov    DWORD PTR [ESP + 8], ECX
    mov    DWORD PTR [ESP + 4], EAX
    mov    DWORD PTR [ESP], 0
    mov    EAX, DWORD PTR [ESP]
    add    EAX, 1
    mov    DWORD PTR [ESP], EAX
    mov    EAX, DWORD PTR [ESP + 12]
    add    ESP, 16
    ret
.Ltmp0:
    .size    main, .Ltmp0-main


    .section    ".note.GNU-stack","",@progbits


Wow. That looks intimidating. The first thing I noticed was a bit of meta-information at the top:

    .file    "main.cpp"
    .text
    .globl    main
    .align    16, 0x90
    .type    main,@function


This seems to be a collection of extra information that is helpful to us (the developers) and to debuggers [1].

main:                 # @main

This is the start of our main function area (a label).

    sub    ESP, 16

"As program adds data to the stack, the stack grows downward from high memory to low memory." [2]. Try the additional program below the references to see an example. The deeper we go into the call stack, the lower the addresses of the stack variables! I would have thought it would go the other way, but I'm also directionally challenged.

ESP is the register that contains the stack pointer. So when we subtract 16 from the stack pointer we are essentially moving the stack pointer down to make room for more "stuff". The incoming arguments (argc and argv) are now located at [ESP + 20] and [ESP + 24]; note that before the subtraction they sat at [ESP + 4] and [ESP + 8], just above the return address.

    mov    EAX, DWORD PTR [ESP + 24]
    mov    ECX, DWORD PTR [ESP + 20]


Since this is assembly, we need to take the contents of the source inputs and put them into registers. We cannot do a memory-to-memory move; we take the contents of memory and put them into temporary registers. This takes argv and argc and puts their contents into EAX and ECX, respectively.

    mov    DWORD PTR [ESP + 12], 0
    mov    DWORD PTR [ESP + 8], ECX
    mov    DWORD PTR [ESP + 4], EAX
    mov    DWORD PTR [ESP], 0


This is establishing the local variables. A difficult thing to remember is that the stack pointer (ESP) has already been shifted down enough to contain all of the local variables. The local variables live at the address of the stack pointer PLUS some offset into the stack.

Also, remember how we previously stuffed the original argc/argv into registers EAX and ECX? This step is COPYING those values into local stack slots (at [ESP + 4] and [ESP + 8]). The assignment of [ESP + 12] to 0 is our return value, and the assignment of 0 to the memory location at the stack pointer itself is our "i" value.

    mov    EAX, DWORD PTR [ESP]
    add    EAX, 1
    mov    DWORD PTR [ESP], EAX


This takes the contents of i's slot ([ESP]), moves it into the EAX register, adds 1 to it, then moves the contents of EAX back into i's address at [ESP]. Whew, this is getting me exhausted just looking at it!

    mov    EAX, DWORD PTR [ESP + 12]
    add    ESP, 16

Remember how a couple of steps up we stuffed a return value into [ESP + 12]? Yeah, it's back. We simply move its contents into register EAX (the return-value register). Finally, we add 16 to the stack pointer, which completely restores our prior state!

This post was all about converting to assembly and trying to read and interpret the results. I can tell that in the future we can use this technique to detect areas where we are copying values around too much, accessing memory too much, etc. Assembly output is probably the best tool for analyzing small snippets of what the compiler produces (as long as you can read it).


REFERENCES

[1] - http://www.cs.wfu.edu/~torgerse/Kokua/More_SGI/007-2418-006/sgi_html/ch07.html

[2] - http://www.c-jump.com/CIS77/ASM/Stack/S77_0040_esp_register.htm


ADDITIONAL PROGRAMS

#include <iostream>

void funcall2()
{
    int i3 = 10;
    std::cout << "Address of i3: " << &i3 << std::endl;
}

void funcall()
{
    int i2 = 13;

    std::cout << "Address of i2: " << &i2 << std::endl;
    funcall2();
}

int main(int argc, char** argv)
{
    int i1 = 0;

    std::cout << "Address of i1: " << &i1 << std::endl;
    funcall();
}


OUTPUT:

Address of i1: 0xbf8b749c
Address of i2: 0xbf8b746c
Address of i3: 0xbf8b743c

Saturday, May 11, 2013

Update on recursive_mutex.


/**
 * In one of my last posts, I started looking
 * at recursive_mutex as a "clean" way to handle
 * recursive functions and locking. What I didn't
 * realize is how "frowned upon" they are- Even
 * the bad features have their appropriate uses.
 * So why so much hate for recursive_mutex?
 *
 * The first con to using a recursive_mutex is
 * the bookkeeping cost of the lock count.
 * Every time the owning thread locks, a count
 * on the mutex is incremented. However, to
 * fully release the mutex, that same thread
 * also needs to call unlock() the same number
 * of times.
 *
 * The second con of using a recursive_mutex
 * is that it doesn't work well with scoped_lock,
 * which is a nice convenience class provided
 * by boost (remember the RAII implementation).
 *
 * The third reason (and the one I don't fully
 * understand) is that unlocking the resource
 * is sometimes not unlocking the resource. I
 * don't want to reference where I read this-
 * Mainly because it's from a blog that reads
 * more like a rant.
 *
 * Considering the three reasons, I started
 * playing around with some new implementations.
 * I am most concerned about the speed of using
 * the recursive mutex and second most concerned
 * about the readability. Therefore the prior
 * blogpost was re-written to use regular
 * vanilla mutexes.
 */

// This header is needed to get the awesome
// scoped_lock provided by boost.
#include <boost/thread/mutex.hpp>

// This is needed to use the conditions.
#include <boost/thread/condition.hpp>
#include <boost/thread.hpp>

#include <iostream>

// We have one mutex for output.
static boost::mutex mtx;


/**
 * Notice that the following two methods are
 * exactly the same as before, except that
 * they have a trailing underscore. I hope that
 * this would be a clue to readers that these
 * functions are not to be called directly.
 * Normally you would make them inaccessible to
 * other developers by making them private.
 */
template <typename T>
void print_(T const& t)
{
    std::cout << t;
}


/**
 * These methods are exact copies of the prior
 * blogpost with the exception that they do not
 * lock at all. This is recursive behavior- so
 * locking would cause deadlocks.
 */
template <typename T, typename... Parameters>
void print_(T const& t1, Parameters... parms)
{
    std::cout << t1;

    if (sizeof...(Parameters))
    {
        print_(parms...);
    }
}


/**
 * Here is where the implementation design comes
 * in. Our public function that is non-recursive
 * does the locking! It then simply calls our
 * private recursive functions. This eliminates
 * the need for a recursive lock.
 */
template <typename T>
void print(T const& t)
{
    boost::mutex::scoped_lock lock(mtx);
    print_(t);
}


/**
 * Same comment as above.
 */
template <typename T, typename... Parameters>
void print(T const& t1, Parameters... parms)
{
    boost::mutex::scoped_lock lock(mtx);
    print_(t1, parms...);
}


/**
 * For right now, ignore the condition parameter. I
 * haven't quite figured out how that all works
 * yet.
 */
void runThread(int id, boost::shared_ptr<boost::condition> cond)
{
    for(int i=0; i<5; ++i)
    {
        print("Thread #", id, " (iteration=", i, ")\n");
    }
}


int main(int argc, char** argv)
{
    boost::shared_ptr<boost::condition> cond(new boost::condition());

    boost::thread t1(runThread, 1, cond);
    boost::thread t2(runThread, 2, cond);
    boost::thread t3(runThread, 3, cond);
    boost::thread t4(runThread, 4, cond);

    t1.join();
    t2.join();
    t3.join();
    t4.join();
}

/**
 * In conclusion, developers are a very passionate bunch.
 * I understand and get it- I have the same vehement hatred
 * for auto_ptr (I want to rant about it but I won't).
 * I think when I see one person speak up about how bad
 * something is I try to push the criticism aside. When
 * an entire group of developers shun something I take
 * notice; I won't be using recursive_mutex's in my code
 * anytime soon!
 */

Variadic Templates are Hip.


/**
 * Good morning. I missed yesterday so I'll be putting up
 * another post this afternoon to make up for the post
 * missed yesterday.
 *
 * Today I'm focusing a bit more on variadic templates.
 * Yesterday's example (or maybe the day before) had
 * a print function in it that looked kind of like this:
 *
 * template <typename T1, typename T2, typename T3>
 * void print(T1 const& t1, T2 const& t2, T3 const& t3)
 * {
 * mutex.lock();
 * std::cout << t1 << t2 << t3 << std::endl;
 * mutex.unlock();
 * }
 *
 * That's just plain sloppy for a couple reasons.
 * First, we should have created some sort of object to
 * do the streaming and not put it all into a method.
 * Second, what if people want to print 4 things instead
 * of three? Or maybe just 1 thing? They can't use this.
 *
 * The solution is remarkably simple: variadic templates.
 * These are REALLY HIP and let me tell you why: they
 * reduce sloppy code like our print code. Another example:
 *
 * std::pair<int, std::string>
 *
 * Yuck. In the past other tuple libraries have come
 * into existence. They all kind of covered up the
 * inadequacy of the C++ language to handle multiple
 * template parameters!
 *
 * For me this is also an experiment to see if libraries
 * built with a different compiler will work with clang's
 * C++ compiler (I believe the other libraries were
 * built with gcc or some standard compiler, definitely
 * not clang).
 */

// Include this, because we are making going to guard
// a print statement.
#include <boost/thread.hpp>

// For the print statement.
#include <iostream>
#include <string>


// Use a recursive mutex, this will be more apparent
// when you read a little more.
static boost::recursive_mutex mtx;


/**
 * The first interesting part of this example is this
 * print function with 1 template parameter. This
 * function is used as the base case for a recursive
 * function setup [1].
 */
template <typename T>
void print(T const& t)
{
    mtx.lock();
    std::cout << t;
    mtx.unlock();
}

//#define SEE_THE_ACTION


/**
 * Now here's the HIP part. The second parameter uses
 * a "typename..." to specify that the Inputs type
 * can contain more than one template! Down below
 * that the input to the function reads
 * "Input... parameters". This function is like any
 * other- the compiler is wrapping a group of parameters
 * together into one parameter.
 */
template <typename T, typename... Inputs>
void print(T const& t, Inputs... parameters)
{
    mtx.lock();

#if defined SEE_THE_ACTION
    std::cout << "Recursive call!" << std::endl;
#endif

    std::cout << t;

    /**
     * C++11 also offers a new use of the sizeof
     * keyword. Coupled with a ..., it can return
     * the number of arguments that are part of
     * the parameter pack coming in for Inputs!
     * I thought this was hip [2].
     */
    if (sizeof...(Inputs))
    {
        /**
         * Make the recursive call. If you
         * don't use the ... after parameters, you'll
         * get an error:
         *
         * error: expression contains unexpanded parameter pack
         *
         * I'm going to write more about this in
         * the conclusion, but without the ellipsis that
         * is what you will get.
         */
        print(parameters...);
    }

    mtx.unlock();
}

void newThread(int id)
{
    for(int i=0; i<5; ++i)
    {
        print("Thread #", id, " is in loop ", i, "\n");

        boost::posix_time::milliseconds pauseTime(rand() % 1000);
        boost::this_thread::sleep(pauseTime);
    }
}


int main(int argc, char** argv)
{
    boost::thread t1(newThread, 1);
    boost::thread t2(newThread, 2);
    boost::thread t3(newThread, 3);
    boost::thread t4(newThread, 4);

    t1.join();
    t2.join();
    t3.join();
    t4.join();
}

/**
 * This entry turned out better than I thought it
 * would. I think there may be a slight performance hit
 * that may be unreasonable to some due to the recursive
 * nature of the print statement. It would be faster
 * if there was some sort of static unrolling (I have
 * nothing to back this up).
 *
 * So the ellipsis, when used after a parameter
 * pack, performs something similar to a
 * dereference operation. You can think of the
 * parameters object inside of the print function
 * as something akin to a pointer, except that it
 * is a handle to a bunch of function arguments.
 * To get those arguments out, you put the
 * ellipsis after it to "expand" the pack and
 * pass them along. The only fancy thing I see
 * the compiler doing is separating the first
 * argument from the variadic parameter pack and
 * deducing what function to call.
 */


/**

REFERENCES

[1] - http://insanecoding.blogspot.com/2010/03/c-201x-variadic-templates.html

[2] - http://www.cplusplus.com/articles/EhvU7k9E/

*/

Thursday, May 9, 2013

Boost Threads (Part 3)


/**
 * This is my third small example for boost
 * threads. This time, I'm covering the difference
 * between a regular mutex and a "recursive"
 * mutex.
 *
 * I happened upon this topic while googling the
 * words, "Reentrant Function". It caused me to
 * ask a question: What happens when a recursive
 * function attempts to lock the same mutex?
 *
 * This example shows how it can be done and how
 * it can lock up (if you use the wrong mutex).
 * Ultimately I wrote this example before finally
 * looking up the definition to reentrancy. It
 * (reentrant mutex) is a "mutual exclusion,
 * recursive lock mechanism". Let's see how it
 * is employed.
 */

// Obviously required for using threads (mutex,
// recursive mutex, etc)
#include <boost/thread.hpp>

// Needed for print statements.
#include <iostream>


/**
 * So if you want to deadlock this program, just
 * define the following macro. It makes the
 * fibonacci example instantiate a regular non-
 * recursive mutex.
 */
#if defined WANT_TO_DEADLOCK
typedef boost::mutex fibonacci_mutex;
#else

/**
 * By default this recursive mutex is defined.
 */
typedef boost::recursive_mutex fibonacci_mutex;
#endif


int fibonacci(int i)
{
    static fibonacci_mutex mtx;

    // Because this is a recursive method, it calls
    // itself, which means this mtx.lock() gets called
    // multiple times by the same thread. If this is a
    // plain mutex it will deadlock (you've already
    // locked it); if it is a recursive mutex you'll
    // be just fine.
    mtx.lock();

    int ret = (0 == i || 1 == i) ? 1 : fibonacci(i-1) + fibonacci(i - 2);

    mtx.unlock();

    return ret;
}


/**
 * This is a simple newThread launch function.
 */
void newThread(int threadId)
{
    // This mutex is just used to do printing.
    static boost::mutex mtx;

    int fib = fibonacci(threadId);

    // Simply guard the print statement so it
    // won't become garbled.
    mtx.lock();
    std::cout << "fib(" << threadId << ") = " << fib << std::endl;
    mtx.unlock();
}


int main(int argc, char** argv)
{
    boost::thread t1(newThread, 1);
    boost::thread t2(newThread, 2);
    boost::thread t3(newThread, 3);

    t1.join();
    t2.join();
    t3.join();
}

Wednesday, May 8, 2013

Boost Threads (Part 2)


/**
 * This blog entry focuses on the use of a mutex.
 */

// Obviously required for using threads.
#include <boost/thread.hpp>

// Output through cout can be garbled when using
// threads, so in this example I'm going to guard
// against cout interruption.
#include <iostream>

// Using this to put together output strings.
#include <sstream>

/**
 * If you are building this example, try commenting
 * this out and see what happens to the output.
 * It's hard to tell at times but it is indeed
 * garbled at points.
 */
#define DO_LOCK

// We have a singular mutex that guards cout. This
// is only used from PrintLocker.
static boost::mutex printMutex;


/**
 * The PrintLocker. This is an RAII class that
 * locks a mutex (if it can) and then unlocks it
 * upon destruction. This is very exception
 * savvy and will earn you extra points with your
 * colleagues.
 */
struct PrintLocker
{
    PrintLocker()
    {
#if defined DO_LOCK
        printMutex.lock();
#endif
    }

    ~PrintLocker()
    {
#if defined DO_LOCK
        printMutex.unlock();
#endif
    }
};


/**
 * This is the only print mechanism this program
 * has. It guards the cout access.
 */
void print(std::string const& stringOutput)
{
    // Check it out, by creating this object
    // you lock the mutex (if you can get it).
    // If you can't get it, the thread yields
    // and allows other active threads to run.
    PrintLocker pl;

    // Once you have the lock, feel free to
    // print to standard out.
    std::cout << stringOutput << std::endl;

    // At the end, the object pl will be
    // destroyed. Remember that in the destructor
    // the mutex will unlock. If an exception
    // is thrown within cout then pl will still
    // be destroyed and the mutex unlocked.
}


/**
 * This is just a convenience function used
 * to put together output strings via a string
 * stream.
 */
template <typename T1, typename T2, typename T3>
std::string getString(T1 const& t1, T2 const& t2, T3 const& t3)
{
    std::stringstream ss;
    ss << t1 << t2 << t3;
    return ss.str();
}


/**
 * The new thread function, which is invoked whenever
 * we create a new boost thread. It will pause randomly
 * while iterating 10 times. The random pauses help us
 * when trying to expose race conditions (just comment
 * out the #define DO_LOCK).
 */
void newThread(int threadId)
{
    print(getString("START Thread #", threadId, ""));

    for(int i=0; i<10; ++i)
    {
        print(getString(i, "th Iteration, Thread #", threadId));

        boost::posix_time::seconds pauseTime( rand() % 2 );
        boost::this_thread::sleep(pauseTime);
    }

    print(getString("END Thread #", threadId, ""));
}


int main(int argc, char** argv)
{
    boost::thread t1(newThread, 1);
    boost::thread t2(newThread, 2);
    boost::thread t3(newThread, 3);

    t1.join();
    t2.join();
    t3.join();
}


/**
 * In conclusion, boost mutexes are pretty normal.
 * They provide a nice wrapper around whatever threading
 * system is already present on your system.
 */