Performance of Basilisk
- Speed Benchmark Results
- Performance Plots

This page is a work in progress …

Performance of Basilisk

This page aims to provide order-of-magnitude estimates for the performance of real-world Basilisk runs, as diagnosed by its users. It is intended to extend the speed benchmarks for the more elementary operations presented here and here. Some may argue that speed benchmarking is impossible because applications and hardware vary so much. Thorough benchmarking is indeed hard, and any single result comes with so many disclaimers that it is of little use unless you want to redo the exact same run. The philosophy of this page is that the speed benchmark results from every application contribute to a better grasp of the solver's speed and its variability. This requires a large enough dataset, so that users can learn whether their set-up performs below par compared to (similar) other runs, and which hardware configurations work well for different applications. For example, should I invest in faster RAM, or rather in a CPU architecture with a larger cache? Or by how much should adaptivity reduce the number of grid cells before its overhead pays off?

For runs using grid adaptivity, my current understanding is that the performance varies greatly depending on the actual grid structure. The convergence characteristics of the multigrid solver are, of course, also dependent on the problem being solved. Therefore, a submitted speed result should be as complete as possible, so that the diagnosed speeds can be placed in their relevant perspective.

Please modify this page (anywhere) if you wish to contribute… Maybe it is better to also distinguish between the various solvers …

Speed Benchmark Results

Single core results

Case Name and link	Solver	Grid type	Number of Cells	(Max) level	Performance (Cells x Iterations / sec.)	Hardware	Comments	User
Game of Life	See link	Cartesian 2D	256^2	N.A.	(1.50 \pm 0.05) \times 10^7	Intel i7-6700 HQ	The system is a laptop	Antoon
Ekman Spiral	Diffusion	Adaptive bitree	\approx 2235	13	1.28 \times 10^6	Intel i7-6700 HQ	The system is a laptop	Antoon
Add more	if you wish
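
The performance column can be computed from a run's iteration count and wall-clock time. The sketch below is only an illustration, assuming the metric means the total number of cell updates (cells summed over all iterations) divided by the elapsed time. The function name `cell_updates_per_second` and all timings are hypothetical and not part of Basilisk. For an adaptive grid the cell count changes during the run, so the cells are summed per iteration instead of multiplied once.

```python
from typing import Sequence


def cell_updates_per_second(cells_per_iteration: Sequence[int],
                            wall_time_s: float) -> float:
    """Return (cells x iterations) / second for one run.

    cells_per_iteration holds the number of grid cells at each iteration,
    so a fixed grid is simply the same number repeated.
    """
    if wall_time_s <= 0:
        raise ValueError("wall_time_s must be positive")
    return sum(cells_per_iteration) / wall_time_s


# Fixed 256^2 Cartesian grid, 1000 iterations, hypothetical 4.0 s wall time.
fixed = cell_updates_per_second([256 ** 2] * 1000, 4.0)

# Adaptive grid whose cell count grows during the run (made-up numbers).
adaptive = cell_updates_per_second([600, 5000, 21000], 0.05)

print(f"fixed grid:    {fixed:.3g} cell updates / s")
print(f"adaptive grid: {adaptive:.3g} cell updates / s")
```

Summing per iteration matters for adaptive runs: multiplying the final cell count by the number of iterations would overstate the work done while the grid was still coarse.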

Parallelized using shared memory (OpenMP)

Case Name and link	Solver	Grid type	Number of Cells	(Max) level	Performance (Cells x Iterations / sec.)	Cores / threads	Hardware	Comments	User
Von Karman vortex street	Navier-Stokes	Adaptive Quadtree	600-21000	9	(5.40 \pm 0.1) \times 10^5	4 / 8	Intel i7-6700 HQ	The system is a laptop	Antoon
Von Karman vortex street	Navier-Stokes	Adaptive Quadtree	600-21000	9	4.23 \times 10^5	2 / 4	Intel i7-6700 HQ	The system is a laptop	Antoon
Add more	if you wish
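
Some entries carry an uncertainty (the \pm values). This page does not state how those were obtained. One plausible approach, shown here purely as an assumption, is to repeat the same run a few times and report the mean and the sample standard deviation. The helper name `summarize` and the three rates are made up for illustration.

```python
import statistics
from typing import Sequence, Tuple


def summarize(rates: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of repeated measurements."""
    mean = statistics.mean(rates)
    # A single run has no spread to report.
    spread = statistics.stdev(rates) if len(rates) > 1 else 0.0
    return mean, spread


# Hypothetical throughputs (cells x iterations / s) from three repeated runs.
rates = [5.3e5, 5.4e5, 5.5e5]
mean, spread = summarize(rates)

print(f"{mean:.2e} +/- {spread:.1e} cells x iterations / s")
```

Reporting the number of repetitions in the comments column would let readers judge how much weight the spread deserves.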

Parallelized using distributed memory (MPI) on a shared PCB

Case Name and link	Solver	Grid type	Number of Cells	(Max) level	Performance (Cells x Iterations / sec.)	Cores / Threads / MPI Tasks	Hardware	Comments	User
2D turbulence	\omega-\psi solver	Multigrid 2D	256^2	8	1.2 \times 10^7	4 / 8 / 4	Intel i7-6700 HQ	Performance is better than on a single CPU	Antoon
2D turbulence	\omega-\psi solver	Multigrid 2D	256^2	8	5.31 \times 10^6	4 / 8 / 16	1 x Intel i7-6700 HQ	The system is a laptop	Antoon
Convective turbulence	Navier-Stokes + SGS.h	Adaptive Octree	\approx (1 - 5)\times10^6	8	3.1 \times 10^5	6 / 12 / 6	Intel Xeon E5-1650 v2 @ 3.5 GHz	Workstation	Antoon

Parallelized using more than one node (supercomputers)

Case Name and link	Solver	Grid type	Number of Cells	(Max) level	Performance (Cells x Iterations / sec.)	Cores / threads / Tasks / Nodes / Islands / Midplanes	Hardware	Comments	User
Add more	if you wish

Performance Plots

If you did a scaling analysis of some sort, please upload your results and plot them below.
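
A minimal sketch of how scaling results might be tabulated before plotting, assuming each measurement is a (cores, throughput) pair for the same case. The helper name `scaling_table` is hypothetical. Speedup is taken relative to the run with the fewest cores, and efficiency is that speedup divided by the ratio of core counts. The sample input reuses the two OpenMP rows for the Von Karman vortex street from the tables on this page.

```python
from typing import List, Tuple


def scaling_table(runs: List[Tuple[int, float]]) -> List[Tuple[int, float, float]]:
    """Return (cores, speedup, efficiency) rows, relative to the smallest run."""
    if not runs:
        raise ValueError("need at least one measurement")
    ordered = sorted(runs)
    base_cores, base_rate = ordered[0]
    rows = []
    for cores, rate in ordered:
        speedup = rate / base_rate
        efficiency = speedup / (cores / base_cores)
        rows.append((cores, speedup, efficiency))
    return rows


# (cores, cells x iterations / s) taken from the OpenMP table on this page.
measurements = [(4, 5.40e5), (2, 4.23e5)]

for cores, speedup, efficiency in scaling_table(measurements):
    print(f"{cores:2d} cores  speedup {speedup:.2f}  efficiency {efficiency:.2f}")
```

For these two rows, going from 2 to 4 cores gives a speedup of about 1.28, which corresponds to an efficiency of about 0.64 relative to the 2-core run.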