tiger.c:
 * Refactor the tiger_49 family so it receives two input pointers, one for each
   hash.
 * Add a parallelized TTH implementation that uses multiple workers via gomp.
 * Allow choosing between SSE2, serialized, or interleaved calls to tiger in
   tiger_2 for speed purposes.
 * Check the speed of each implementation on startup and choose the fastest.
 * Add better big-endian emulation for debugging purposes.
 * Optimize big endian code.
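
The startup speed check could look roughly like the sketch below. It is an illustration only, not the code in tiger.c: the tiger2_fn signature, the tiger2_impl table, stub_tiger2, time_impl and pick_fastest are all hypothetical names, and clock() is used purely for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature: hash two independent inputs in one call,
 * writing one 24-byte (192-bit) Tiger digest per input. */
typedef void (*tiger2_fn)(const void *in1, const void *in2, size_t len,
                          void *out1, void *out2);

/* Hypothetical table entry: one candidate implementation. */
struct tiger2_impl {
    const char *name;
    tiger2_fn fn;
};

/* Placeholder standing in for a real implementation (SSE2, serialized,
 * interleaved). It only zeroes the outputs and does not hash anything. */
static void stub_tiger2(const void *in1, const void *in2, size_t len,
                        void *out1, void *out2)
{
    (void)in1; (void)in2; (void)len;
    memset(out1, 0, 24);
    memset(out2, 0, 24);
}

/* Time `rounds` calls of one candidate; the result is in clock() ticks. */
static clock_t time_impl(tiger2_fn fn, int rounds)
{
    static unsigned char a[1024], b[1024];   /* zero-filled test input */
    unsigned char h1[24], h2[24];
    clock_t start;
    int i;

    assert(rounds > 0);
    start = clock();
    for (i = 0; i < rounds; i++)
        fn(a, b, sizeof a, h1, h2);
    return clock() - start;
}

/* Return the index of the fastest candidate, or -1 if the table is empty.
 * Ties go to the earliest entry. */
static int pick_fastest(const struct tiger2_impl *impls, int n, int rounds)
{
    int best = -1, i;
    clock_t best_ticks = 0;

    for (i = 0; i < n; i++) {
        clock_t t = time_impl(impls[i].fn, rounds);
        if (best < 0 || t < best_ticks) {
            best = i;
            best_ticks = t;
        }
    }
    return best;
}
```

At startup the caller would fill a table with the available candidates, call pick_fastest() once, and keep the winning function pointer for later tiger_2 calls. Since clock() has coarse resolution on some platforms, a real check would likely need enough rounds to make the differences measurable.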
