Modify output to provide rates.
Convert CUDA implementation to OpenCL.
Split MainLoop* into calls to a single MainLoop.
Add different Marsaglia RNG.
Minor modifications.
Print vendor and strip device name in output.
Add exception handling.
Add single/double precision in kernels. Modify output to simplify import into CSV and gnuplot.
Change Pi estimation from global division to atomic division.
Add Xeon Phi support for ACCELERATOR type.
Improved version.
Change device selection and metrology statements.
Minor changes to output filename.
Change size of iterations variables (in functions)...
Change size of Iterations variable.
Improve choice of GPU/CPU.
Modify randint call domain for 32-bit machines.
Add mention of jobs in process output.
Add Pi test for Metrology tests.