Add CeCILL v2 licence to source code
Add granularity on variable types and Marsaglia RNGs.
Minor change about check.
Extend granularity on size and Marsaglia RNGs. Add both asynchronous and synchronous MPI calls.
Add granularity choice on type of counters and type of Marsaglia generator.
Support for Intel Xeon Phi
Replace synchronous MPI calls with asynchronous ones, as in the Hybrid version.
Replace synchronous MPI Send/Receive with asynchronous ones. Initially this was only to avoid distribution of tasks, but it was a problem on OpenIB (mlx4_core.log_mtts_per_seg=5 must be added in GRUB)
Modify output to provide rates.
Convert CUDA implementation to an OpenCL one.
Split MainLoop* into calls to a single MainLoop
Add different Marsaglia RNGs.
Minor modifications.
Add vendor print and strip output on device name.
Minor changes
Add Hybrid MPI/OpenMP version
Add comment.
Add hostname print. Correct bugs.
Add exception.
Add single/double precision in kernels. Modify output to simplify import into CSV and gnuplot
Change Pi estimation from global division to atomic division.
Add Xeon Phi support for ACCELERATOR type
Improved version.
Changes on Device selection and metrology statements.
Minor changes on output filename.
Minor changes.
Change size of iterations variables (in functions)...
Change size of Iterations variable.
Minor changes on default values for bench.
Fix minor bug.
Change optimization to O3.
Change to O3 optimization.
Change name of executable.
Change atol to atoll to support long long
Add architecture information. Improve support for 32-bit architectures
Minor changes on output information about architecture.
Change unsigned long to long long for 32-bit compatibility.
Add support for 32-bit architectures and large numbers of iterations (using long long int)
Delete executable
Changes to support long (over 2^32 iterations)
Add script to reduce GProf output to show source code lines.
Merge INT and LOG elements. Add sqrt and classical test on quadrant.
Improve choice of GPU/CPU.
Modify randint call domain for 32-bit machines.
Add jobs mention in output process.
Delete executables.
Modify bench to specify OMP_NUM_THREADS, needed on ARM architecture
Add Pthreads implementation.
Add script for bench on each implementation.
Tiny modifications for input/output
Add MPI simple version
Add OpenMP version.
Add Pi test for Metrology tests.