Getting Started
This guide walks through setting up Mosaic, running the benchmarks yourself, and reading the output. Running locally is also how you reproduce the Results on your own target hardware, which is the most accurate measurement for your setup.
Prerequisites
- Python >= 3.10
- Docker — must be running. All solvers execute inside Docker containers.
- Several dozen GB of disk space for building all solver containers; individual images range from ~1 GB to ~10 GB.
- RAM: 16 GB minimum; 32 GB recommended for 3D fluid problems.
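If you want to check these prerequisites before building anything, a small preflight script can help. The sketch below is not part of Mosaic: the function name check_prerequisites and the 40 GB free-space threshold are assumptions chosen for illustration ("several dozen GB" above), and it checks the filesystem holding the given path, which may differ from where Docker stores its images. RAM is not checked because the standard library has no portable way to query it.

```python
import shutil
import subprocess
import sys


def check_prerequisites(path=".", min_free_gb=40.0):
    """Return a dict of basic host checks. Thresholds are illustrative."""
    report = {}

    # Python >= 3.10 is the documented minimum.
    report["python_ok"] = sys.version_info >= (3, 10)

    # Docker must be installed and the daemon must answer `docker info`.
    docker = shutil.which("docker")
    report["docker_found"] = docker is not None
    report["docker_running"] = False
    if docker is not None:
        try:
            proc = subprocess.run(
                [docker, "info"],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=30,
            )
            report["docker_running"] = proc.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            pass  # treat an unreachable daemon as "not running"

    # Free space on the filesystem that holds `path` (assumed threshold).
    free_gb = shutil.disk_usage(path).free / 1e9
    report["free_gb"] = round(free_gb, 1)
    report["disk_ok"] = free_gb >= min_free_gb
    return report
```

Calling check_prerequisites() and printing the returned dict gives a quick yes/no view per requirement; any False entry is worth fixing before the first build.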
GPU support (optional)
GPU-enabled solvers (marked GPU in the Solver Reference) run efficiently on a CUDA-capable NVIDIA GPU. To enable that path you’ll need:
- An NVIDIA GPU with CUDA support
- NVIDIA Container Toolkit installed and configured
- Docker configured to use the nvidia runtime
Autodiff without gradient checkpointing materialises the entire trajectory in memory, so RAM/VRAM can saturate quickly on longer rollouts — more memory is better.
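To get a feel for how quickly a stored trajectory grows, a back-of-the-envelope estimate is enough. The sketch below assumes one stored state per step with a fixed number of field components; the grid size, step count, and dtype are hypothetical, and the result is a lower bound because autodiff frameworks usually keep additional intermediates.

```python
import math


def trajectory_bytes(n_steps, grid_shape, n_fields=3, bytes_per_value=4):
    """Lower bound on memory needed to keep every state of a rollout."""
    values_per_state = math.prod(grid_shape) * n_fields
    return n_steps * values_per_state * bytes_per_value


# Hypothetical example: 128^3 grid, 3 velocity components, float32, 1000 steps.
total = trajectory_bytes(1000, (128, 128, 128))
print(f"{total / 2**30:.1f} GiB")  # 23.4 GiB
```

Each state in this example is 24 MiB, so a thousand stored steps already exceed the memory of many GPUs.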
Installation
git clone https://github.com/pasteurlabs/mosaic
cd mosaic
# Pick one
uv sync --extra dev # uv (recommended)
pip install -e ".[dev]" # pip
pre-commit install # optional: enables lint checks on commit
Verify the installation:
mosaic --help
mosaic status # shows all problems and solvers (no Docker needed)
Running a single solver (smoke test)
The fastest way to verify everything works is to build one solver and run a quick forward check. The first two commands come straight from tesseract-core (tesseract build, tesseract run) — Mosaic adds nothing on top.
# Build the Exponax solver (small, fast, JAX-only)
tesseract build mosaic/tesseracts/navier-stokes-grid/exponax
# Run a single forward call (uses the built image)
tesseract run exponax_navier_stokes_grid apply '{}'
# Run the forward accuracy suite for just this solver
mosaic run -p ns-3d-grid --suites forward -s Exponax
This builds one container (~2 min), runs a single forward simulation, and produces accuracy plots in mosaic-results/ns-3d-grid/forward/.
Running a benchmark suite
Each suite can target a specific problem and (optionally) a specific solver:
# Forward accuracy for all NS-grid solvers
mosaic run -p ns-grid --suites forward
# Gradient quality check, single experiment only (quick debug mode)
mosaic run -p ns-grid -e gradient/fd_check --debug
# Cost scaling for structural mechanics
mosaic run -p structural-mesh --suites cost
# Full optimization convergence
mosaic run -p thermal-mesh --suites optimization
Useful flags
| Flag | Effect |
|---|---|
| -s <solver,…> | Restrict to listed solvers. A flat CSV is a union set: each problem keeps only the listed solvers that exist there, and problems with zero matches are skipped. Per-problem overrides via <problem>=<csv>;…. |
| -e <suite>/<exp> | Run only one experiment within a suite |
| -e <suite>/<exp>/<ic> | Pin one initial condition (e.g. forward/agreement/tgv) |
| --debug | Small problem size for quick iteration |
| --no-build | Skip container builds (use existing images) |
| --no-plots | Skip plot generation |
| --plots-only | Regenerate plots from existing results |
| --gpus 0,1,2 | Distribute solvers across multiple GPUs |
| --only <state[,…]> | Re-run only cells matching one of failed,anom,missing,stale,excluded (skips fresh-ok). Combinable. |
Running everything
mosaic run # all suites, all problems
mosaic run --problems ns-grid,structural-mesh # filter problems
mosaic run --suites forward,gradient # filter suites
Selecting solvers
-s (alias --solvers) accepts two forms:
# Flat CSV — a union set applied to every problem in -p.
# Each problem keeps only the listed solvers that exist there; problems
# with zero matches are skipped (not silently expanded to "run all").
mosaic run -s OpenFOAM,XLB,deal.II,JAX-FEM
# Per-problem map — explicit picks per domain. Problems not listed in
# the map pass through unchanged (all solvers).
mosaic run -s "ns-grid=XLB,jax-cfd;structural-mesh=Firedrake,JAX-FEM"
Names must match the display form exactly (XLB, OpenFOAM, deal.II, JAX-FEM, …). A name not in any registered problem aborts the run with a “Did you mean…?” hint before any image build.
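The two forms can be easier to reason about as code. The sketch below mirrors the rules described above; it is an illustration, not Mosaic's implementation. The REGISTRY contents, the function names, and the use of difflib for the hint are all assumptions. In this sketch, per-problem picks are taken as given and map entries for unregistered problems are ignored.

```python
import difflib

# Hypothetical registry: problem -> solvers available for it.
REGISTRY = {
    "ns-grid": ["XLB", "jax-cfd", "OpenFOAM"],
    "structural-mesh": ["Firedrake", "JAX-FEM", "deal.II"],
    "thermal-mesh": ["Firedrake", "deal.II"],
}


def _validate(names, registry):
    """Abort on any name that no problem registers, with a close-match hint."""
    known = sorted({s for solvers in registry.values() for s in solvers})
    for name in names:
        if name not in known:
            hint = difflib.get_close_matches(name, known, n=1)
            suffix = f" Did you mean {hint[0]!r}?" if hint else ""
            raise ValueError(f"Unknown solver {name!r}.{suffix}")


def select_solvers(spec, registry=REGISTRY):
    """Resolve a -s style spec into {problem: [solvers]}."""
    if "=" in spec:
        # Per-problem map: "prob=a,b;other=c". Unlisted problems keep all solvers.
        picks = {}
        for part in filter(None, spec.split(";")):
            problem, _, csv = part.partition("=")
            picks[problem.strip()] = [s.strip() for s in csv.split(",") if s.strip()]
        _validate([s for chosen in picks.values() for s in chosen], registry)
        return {p: picks.get(p, list(solvers)) for p, solvers in registry.items()}

    # Flat CSV: keep the listed solvers that exist per problem, skip empty problems.
    wanted = [s.strip() for s in spec.split(",") if s.strip()]
    _validate(wanted, registry)
    selected = {p: [s for s in solvers if s in wanted] for p, solvers in registry.items()}
    return {p: chosen for p, chosen in selected.items() if chosen}
```

With this registry, select_solvers("XLB") returns only the ns-grid entry, because the other problems have no matching solver and are dropped rather than expanded to "run all".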
Re-running a subset by status
After an initial run, --only re-executes only the cells in a given state, leaving fresh-ok cells untouched (the merge-aware result writer keeps their entries intact). Useful for iterating on a single solver or recovering from a partial failure without redoing everything.
mosaic run --only failed # re-run only failed cells
mosaic run --only failed,stale # plus anything stale
mosaic run --only missing # first-time runs only
mosaic run --only failed,missing,stale # everything that isn't fresh-ok
mosaic run -s PhiFlow --only excluded # re-check after dropping an exclusion
States: failed, anom, missing, stale, excluded. Compose multiple with commas. Combinable with -p / --suites / -e / -s for finer scoping.
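Conceptually, --only is a filter over per-cell states. The sketch below shows that idea under stated assumptions: the snapshot dictionary, its (problem, experiment, solver) keys, and both function names are hypothetical, and Mosaic's real status model may be structured differently.

```python
VALID_STATES = {"failed", "anom", "missing", "stale", "excluded"}


def parse_only(value):
    """Split a comma-separated --only value and reject unknown states."""
    states = {s.strip() for s in value.split(",") if s.strip()}
    unknown = states - VALID_STATES
    if unknown:
        raise ValueError(f"Unknown state(s): {sorted(unknown)}")
    return states


def cells_to_rerun(cell_states, only):
    """Keep cells whose state was requested; fresh-ok cells never match."""
    wanted = parse_only(only)
    return [cell for cell, state in cell_states.items() if state in wanted]


# Hypothetical status snapshot keyed by (problem, experiment, solver).
snapshot = {
    ("ns-grid", "forward/baseline", "XLB"): "fresh-ok",
    ("ns-grid", "forward/baseline", "PhiFlow"): "failed",
    ("ns-grid", "gradient/fd_check", "XLB"): "stale",
    ("thermal-mesh", "optimization/recovery", "Firedrake"): "missing",
}
print(cells_to_rerun(snapshot, "failed,stale"))
```

Because fresh-ok is not a selectable state, no combination of values can pull an up-to-date cell back into the run.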
Note that mosaic run with no filters builds ~20 Docker containers and runs the complete evaluation. Expect several hours on a machine with a modern GPU, or much longer on CPU only. The container images add up to dozens of GB on disk. Consider starting with a single problem (mosaic run --problems thermal-mesh) to validate your setup before committing the time and storage.
Resuming after a crash
A long run may get killed mid-experiment by an OOM, host reboot, or SIGKILL. --continue resumes from where the prior invocation left off, at two granularities:
- Per experiment. Experiments whose result.json already exists under the output directory are skipped entirely. Multi-IC experiments count as done once every IC subdir has a result.json.
- Per solver. Within a partially-completed experiment, each harness writes result_partial.json after each solver finishes; on resume, solvers already recorded there are filtered out so the remaining ones can pick up the work.
mosaic run --continue # resume all problems / suites
mosaic run -p ns-grid --continue # scope to one problem
mosaic run --only failed --continue # only retry failed cells, skip ok
--continue composes with --only. This is useful when a long sweep mostly succeeded but a few cells timed out: --only failed --continue re-runs just those without restarting fresh-ok cells.
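The two resume granularities can be sketched as a pair of checks against the output directory. This is an illustration rather than Mosaic's code: the function names are hypothetical, the IC names are passed in explicitly, and the layout of result_partial.json (a JSON object keyed by solver name) is an assumption, since its schema is not described here.

```python
import json
from pathlib import Path


def experiment_done(exp_dir, ic_names=()):
    """True if the experiment, or every one of its IC subdirs, has result.json."""
    exp_dir = Path(exp_dir)
    if ic_names:
        return all((exp_dir / ic / "result.json").is_file() for ic in ic_names)
    return (exp_dir / "result.json").is_file()


def remaining_solvers(exp_dir, solvers):
    """Drop solvers already recorded in result_partial.json, if it exists."""
    partial = Path(exp_dir) / "result_partial.json"
    if not partial.is_file():
        return list(solvers)
    try:
        # ASSUMPTION: the partial file is a JSON object keyed by solver name.
        done = set(json.loads(partial.read_text()))
    except json.JSONDecodeError:
        # A file truncated by the crash is treated as "nothing finished".
        return list(solvers)
    return [s for s in solvers if s not in done]
```

A resume loop would skip any experiment for which experiment_done is true and run only remaining_solvers for the rest.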
Understanding output
Results are organized by problem and suite:
mosaic-results/
ns-grid/
forward/
baseline/result.json # per-solver forward errors
agreement/result.json # cross-solver agreement
*.png # generated plots
gradient/
fd_check/result.json # FD verification results
cost/
result.json # wall-clock scaling data
optimization/
recovery/result.json # optimization convergence
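If you want to post-process results yourself, the directory layout above is enough to enumerate them without knowing the JSON schema. The sketch below only walks the tree; the function name list_results is hypothetical, and it assumes the problem/suite/experiment nesting shown above, with some suites keeping result.json directly at suite level.

```python
from pathlib import Path


def list_results(root="mosaic-results"):
    """Yield (problem, suite, experiment) for every result.json under root."""
    root = Path(root)
    if not root.is_dir():
        return
    for path in sorted(root.rglob("result.json")):
        parts = path.relative_to(root).parts[:-1]  # drop the file name
        if len(parts) < 2:
            continue  # not inside a problem/suite directory
        problem, suite, *rest = parts
        # Suite-level results have no experiment component; report None.
        yield problem, suite, "/".join(rest) or None


for problem, suite, experiment in list_results():
    print(problem, suite, experiment or "-")
```

Multi-IC experiments show up once per initial condition, with the IC name appended to the experiment path.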
Use mosaic status to get a summary table of all completed experiments:
mosaic status # full per-problem tables
mosaic status -p ns-grid -f # single problem with failure reasons
mosaic status --format md > report.md # export as markdown
mosaic status --format json > snap.json # machine-readable snapshot
Troubleshooting
Container start failures
A solver whose container fails to start — broken image, missing CUDA runtime, or an import error inside the API module — surfaces as a failed cell with the underlying exception message in mosaic status -f. (Previously these dropped silently as NOT_RUN, indistinguishable from “wasn’t selected”; the runner now records them via an on_error callback so the status pipeline can classify the cell as FAILED.) Retry the cell with mosaic run --only failed once the container is fixed.
Container build failures
If tesseract build fails:
- Check Docker is running: docker info
- Check disk space: docker system df
- For GPU solvers, verify the NVIDIA runtime: docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
- Try building with verbose output: tesseract build <path> --verbose
- Inspect the build log for missing system dependencies (common with FEniCS/Firedrake)
Solver timeouts
Solvers that exceed the HTTP watchdog timeout (1200s default) are killed automatically. This typically happens with:
- CPU-only solvers at large grid sizes
- Long rollouts on slow solvers
Use --debug for a smaller problem size, or run with -s <solver> to isolate.
Memory issues
3D fluid problems at high resolution can exceed available (V)RAM. Symptoms include:
- OOM kills (check dmesg or docker logs)
- Silent NaN outputs
Reduce resolution or limit to fewer concurrent solvers via --gpus.
Getting help
If you run into problems not covered here, visit the Tesseract Forum for community support, or open an issue on GitHub.