
    Benchmarks on GPUs

    The benchmarks were run on the following hardware:

    • Inteli7: 8 cores (OpenMP) of an 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz in a Dell XPS laptop with 16 GB of RAM.
    • IntelUHD: the integrated Mesa Intel(R) UHD Graphics (TGL GT1) (0x9a60) with 3072 MB of video memory.
    • RTX3050: NVIDIA GeForce RTX 3050 Ti Laptop GPU card with 4096 MB of video memory.
    • RTX6000: NVIDIA Quadro RTX 6000/PCIe/SSE2 with 24576 MB of video memory, installed in a different workstation.

    On paper, the RTX6000 is about three times slower than current state-of-the-art gaming graphics cards (e.g. the RTX 4090).

    Do not hesitate to send me benchmark results for other cards. To reproduce the benchmarks on your system, follow the links to the raw scripts and results given in each section.

    Time-reversed advection in a vortex

    This benchmark runs the time-reversed advection test case, i.e. it exercises the BCG advection solver.

    See Benchmarks/advection for the commands and raw data.

    set term svg enhanced font ',11' size 1000,500
    c1 = "#99ffff"; c2 = "#4671d5"; c3 = "#ff0000"; c4 = "#f36e00"
    set auto x
    set style data histogram
    set style histogram cluster gap 1
    set style fill solid border -1
    set boxwidth 0.9
    set xtic scale 0
    set key top left
    set logscale y
    set grid
    
    set multiplot layout 1, 2
    set title 'Speed in grid points x timesteps / second'
    # 2, 3, 4, 5 are the column indices; 'lc' stands for 'linecolor'
    plot '< awk -v findex=3 -v minlevel=6 -f advection.awk advection' using 2:xtic(1) ti col lc rgb c1, \
         '' u 3 ti col lc rgb c2, \
         '' u 4 ti col lc rgb c3, \
         '' u 5 ti col lc rgb c4
    set title 'Speedup relative to 8 x Intel Core i7'
    plot '< awk -v findex=3 -v minlevel=6 -f advection.awk advection' using ($3/$2):xtic(1) ti col lc rgb c2, \
         '' u ($4/$2) ti col lc rgb c3, \
         '' u ($5/$2) ti col lc rgb c4
    unset multiplot
    Time-reversed advection in a vortex (script)
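
    The two quantities plotted in each section can be sketched as follows. This is a minimal illustration, not the code used to produce the figures. It assumes a regular 2D grid with 2^level cells per direction, and the function names and the timings in the example are hypothetical, not measured data.

```python
def grid_points(level: int, dimension: int = 2) -> int:
    """Number of cells on a regular grid with 2**level cells per direction.

    Assumption: a square (or cubic) domain discretised at a single level.
    """
    return (2 ** level) ** dimension


def speed(level: int, timesteps: int, seconds: float, dimension: int = 2) -> float:
    """Throughput in grid points x timesteps / second (left-hand plots)."""
    return grid_points(level, dimension) * timesteps / seconds


def speedup(device_speed: float, reference_speed: float) -> float:
    """Ratio of a device's throughput to the reference (right-hand plots)."""
    return device_speed / reference_speed


if __name__ == "__main__":
    # Hypothetical timings, for illustration only (not taken from the raw data)
    cpu = speed(level=10, timesteps=1000, seconds=20.0)
    gpu = speed(level=10, timesteps=1000, seconds=2.0)
    print(f"CPU: {cpu:.3g}  GPU: {gpu:.3g}  speedup: {speedup(gpu, cpu):.1f}")
```

    Because the speed is normalised by the problem size, runs at different levels can be compared directly. The speedup is the ratio of two such speeds at the same level, which is what the `($3/$2)`-style expressions in the gnuplot scripts compute column by column.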


    Time-reversed VOF advection in a vortex

    This benchmark runs the time-reversed VOF advection test case, i.e. a test of the VOF advection scheme, which is significantly more complex than the BCG scheme.

    See Benchmarks/reversed for the commands and raw data.

    set multiplot layout 1, 2
    set title 'Speed in grid points x timesteps / second'
    plot '< awk -v findex=3 -v minlevel=5 -f advection.awk reversed' using 2:xtic(1) ti col lc rgb c1, \
         '' u 3 ti col lc rgb c2, \
         '' u 4 ti col lc rgb c3, \
         '' u 5 ti col lc rgb c4
    set title 'Speedup relative to 8 x Intel Core i7'
    plot '< awk -v findex=3 -v minlevel=5 -f advection.awk reversed' using ($3/$2):xtic(1) ti col lc rgb c2, \
         '' u ($4/$2) ti col lc rgb c3, \
         '' u ($5/$2) ti col lc rgb c4
    unset multiplot
    Time-reversed VOF advection in a vortex (script)


    Saint-Venant bump

    This benchmark is a close variant of an existing test case and exercises the Saint-Venant solver.

    See Benchmarks/bump2D-gpu for the commands and raw data.

    set multiplot layout 1, 2
    set title 'Speed in grid points x timesteps / second'
    plot '< awk -v findex=3 -v minlevel=6 -f advection.awk bump2D-gpu' using 2:xtic(1) ti col lc rgb c1, \
         '' u 3 ti col lc rgb c2, \
         '' u 4 ti col lc rgb c3, \
         '' u 5 ti col lc rgb c4
    set title 'Speedup relative to 8 x Intel Core i7'
    plot '< awk -v findex=3 -v minlevel=6 -f advection.awk bump2D-gpu' using ($3/$2):xtic(1) ti col lc rgb c2, \
         '' u ($4/$2) ti col lc rgb c3, \
         '' u ($5/$2) ti col lc rgb c4
    unset multiplot
    Saint-Venant bump (script)


    Lid-driven cavity

    This benchmark runs the lid-driven cavity test case, which exercises the Navier-Stokes solver. An important difference from the previous benchmarks is that it relies on the multigrid solvers for viscosity and pressure.
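
    One reason multigrid solvers can behave differently from the previous benchmarks is the size of the coarse levels. The sketch below assumes a standard 2D hierarchy in which level l holds 4^l cells. The function names are hypothetical, and this illustrates a general property of such hierarchies, not a measurement from these benchmarks.

```python
def cells_per_level(max_level: int, dimension: int = 2) -> list[int]:
    """Cell count on every level of the hierarchy, from level 0 to max_level."""
    return [(2 ** level) ** dimension for level in range(max_level + 1)]


def coarse_fraction(max_level: int, dimension: int = 2) -> float:
    """Cells on all coarser levels, as a fraction of the finest level."""
    cells = cells_per_level(max_level, dimension)
    return sum(cells[:-1]) / cells[-1]


if __name__ == "__main__":
    for level, count in enumerate(cells_per_level(8)):
        print(f"level {level}: {count} cells")
    print(f"coarse levels / finest level: {coarse_fraction(8):.3f}")
```

    The coarse levels add only about a third to the total number of cells in 2D, but each of them must still be traversed on every cycle. The coarsest levels contain very few cells, so they may offer too little parallel work to keep a GPU busy.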

    See Benchmarks/lid for the commands and raw data.

    set multiplot layout 1, 2
    set title 'Speed in grid points x timesteps / second'
    plot '< awk -v findex=3 -v minlevel=6 -f advection.awk lid' using 2:xtic(1) ti col lc rgb c1, \
         '' u 3 ti col lc rgb c2, \
         '' u 4 ti col lc rgb c3, \
         '' u 5 ti col lc rgb c4
    set title 'Speedup relative to 8 x Intel Core i7'
    plot '< awk -v findex=3 -v minlevel=6 -f advection.awk lid' using ($3/$2):xtic(1) ti col lc rgb c2, \
         '' u ($4/$2) ti col lc rgb c3, \
         '' u ($5/$2) ti col lc rgb c4
    unset multiplot
    Lid-driven cavity (script)


    Two-dimensional turbulence

    This benchmark runs the two-dimensional turbulence example, which uses the streamfunction–vorticity Navier-Stokes solver (i.e. mostly the multigrid Poisson solver).

    See Benchmarks/turbulence for the commands and raw data.

    set multiplot layout 1, 2
    set title 'Speed in grid points x timesteps / second'
    plot '< awk -v findex=3 -v minlevel=7 -f advection.awk turbulence' using 2:xtic(1) ti col lc rgb c1, \
         '' u 3 ti col lc rgb c2, \
         '' u 4 ti col lc rgb c3, \
         '' u 5 ti col lc rgb c4
    set title 'Speedup relative to 8 x Intel Core i7'
    plot '< awk -v findex=3 -v minlevel=7 -f advection.awk turbulence' using ($3/$2):xtic(1) ti col lc rgb c2, \
         '' u ($4/$2) ti col lc rgb c3, \
         '' u ($5/$2) ti col lc rgb c4
    unset multiplot
    Two-dimensional turbulence (script)
