Support for Intel Xeon Phi
Minor changes to default values for bench.
Change optimization level to O3.
Change atol to atoll to support long long values.
Add architecture information. Improve support for 32-bit architectures.
Delete executables.
Modify bench to set OMP_NUM_THREADS, which is needed on ARM architectures.
Add bench script for each implementation.
Tiny modifications to input/output.
Add OpenMP version.
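
Note on the atol to atoll change listed above: a minimal sketch in C, not the project's actual code. The helper name parse_size is hypothetical and only serves the illustration. The point is that atol returns a long, which is only 32 bits wide on typical 32-bit targets, while atoll returns a long long, which the C standard guarantees to be at least 64 bits wide.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the repository): convert a command-line
 * argument to an integer wide enough for large problem sizes.
 * atoll returns long long (at least 64 bits on every conforming platform),
 * whereas atol returns long, which may be only 32 bits. */
long long parse_size(const char *text)
{
    assert(text != NULL);   /* passing NULL to atoll is undefined behavior */
    return atoll(text);
}
```

With this helper, an argument such as "5000000000" is converted without loss, although it does not fit in a 32-bit long. Like atol, atoll reports no errors; strtoll is the usual choice when invalid input must be detected.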