What is Brush?
Brush is an open-source 3D reconstruction engine built in Rust that implements Gaussian Splatting, a technique that represents 3D scenes as collections of 3D Gaussian primitives that can be rendered in real time. With over 4,300 stars on GitHub and growing at 78 stars per day, Brush has rapidly become one of the most exciting projects in the neural rendering space.
What sets Brush apart from other Gaussian Splatting implementations is its commitment to universal accessibility. It runs on macOS, Windows, Linux with NVIDIA, AMD, and Intel GPUs, on Android, and even in a web browser via WebAssembly. This is possible because Brush is built on the Burn machine learning framework and uses WebGPU-compatible technology, eliminating the CUDA dependency that locks most ML tools to NVIDIA hardware.
Key Insight: Brush produces simple, dependency-free binaries that run on nearly all devices without any setup. This is a fundamental shift from most machine learning tools that require complex CUDA installations and are limited to specific GPU vendors.
How Gaussian Splatting Works
Traditional 3D reconstruction approaches like NeRF (Neural Radiance Fields) use neural networks to implicitly represent scenes. Gaussian Splatting takes a different approach: it represents a scene as a set of 3D Gaussian primitives, each with position, covariance (shape), color (via Spherical Harmonics), and opacity. These Gaussians are projected onto 2D image planes and alpha-blended front-to-back to produce photorealistic renderings.
The result is a representation that can be rendered at real-time frame rates (often 100+ FPS) while maintaining high visual quality. Brush implements this entire pipeline in Rust with custom GPU compute shaders, achieving performance that matches or exceeds the reference Python/CUDA implementations.
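The front-to-back blending step described above can be sketched for a single pixel. This is a minimal illustration, not Brush's GPU kernel: the function name and the `(depth, rgb, alpha)` tuple layout are assumptions made for the example, and each `alpha` is taken to already include the Gaussian's opacity multiplied by its 2D falloff at that pixel.

```python
def composite_front_to_back(samples, min_transmittance=1e-4):
    """Blend (depth, rgb, alpha) samples covering one pixel, nearest first.

    Hypothetical helper for illustration; the tuple layout is an assumption.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed by nearer samples
    for _depth, rgb, alpha in sorted(samples, key=lambda s: s[0]):
        weight = alpha * transmittance  # contribution of this sample
        for c in range(3):
            color[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha
        if transmittance < min_transmittance:
            break  # pixel is effectively opaque, later samples are invisible
    return color, transmittance


# A half-transparent red sample in front of an opaque blue one.
pixel, remaining = composite_front_to_back([
    (2.0, (0.0, 0.0, 1.0), 1.0),
    (1.0, (1.0, 0.0, 0.0), 0.5),
])
```

Stopping early once the transmittance is negligible is one reason front-to-back ordering is attractive: samples hidden behind opaque content never need to be evaluated.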
The architecture diagram above illustrates Brush's layered design. At the top, input data flows in through standard formats like COLMAP datasets, Nerfstudio transforms, PLY files, and image masks. The core Rust crates handle dataset loading (brush-dataset), Gaussian Splat rendering (brush-render), MCMC training (brush-train), backward pass autodiff (brush-render-bwd), and loss computation (brush-loss). Supporting crates provide GPU radix sorting, parallel prefix sums, virtual file systems, and COLMAP parsing. The GPU compute layer uses Burn with CubeCL for JIT-compiled compute kernels and WebGPU/wgpu for cross-platform GPU abstraction. The output layer provides desktop, CLI, web, and mobile applications.
Key Features
| Feature | Description |
|---|---|
| Cross-Platform | Runs on Windows, macOS, Linux, Android, and Web (WASM) |
| Multi-GPU Vendor | Supports NVIDIA, AMD, and Intel GPUs via WebGPU |
| MCMC Training | Auto-growing splats with scene exploration and pruning |
| Zero CUDA Dependency | Built on Burn + WebGPU, no CUDA installation needed |
| Real-Time Viewer | Interactive egui-based viewer during training |
| Web Demo | Train and view directly in Chrome/Edge browser |
| CLI Interface | Full command-line control with --with-viewer option |
| Dynamic Splats | Support for sequence playback and delta frame animations |
| Loss Functions | L1, SSIM (separable convolution), and LPIPS perceptual loss |
| Scalable Data | Handles datasets larger than RAM with streaming data loader |
| PLY Export | Standard and compressed PLY format support |
| Rerun Integration | Visualize training dynamics and memory usage |
| NPM Package | brush-js WASM module for JavaScript integration |
| Single Binary | No complex dependencies, just download and run |
The features diagram shows the complete reconstruction pipeline. Step 1 handles data input and loading from COLMAP, Nerfstudio, PLY, and mask formats. Step 2 runs the MCMC training loop with splat initialization, MCMC-like optimization with auto-growing and pruning, and the custom AdamScaled optimizer with exponential learning rate scheduling. Step 3 is the GPU rendering pipeline that projects Gaussians, performs tile-based rasterization with radix sorting and prefix sums, computes Spherical Harmonics for view-dependent color, and alpha-blends the result, with a backward pass for autodiff gradients via Burn. Step 4 produces outputs including PLY export, real-time viewer, web deployment, and Rerun visualization. The feature cards below highlight Brush's cross-platform support, zero CUDA dependency, MCMC training, interactive viewer, web demo, loss functions, dynamic splats, scalable data handling, CLI, Android support, Rerun integration, and NPM package.
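To make "Spherical Harmonics for view-dependent color" concrete, here is a small sketch that evaluates degree 0 and degree 1 harmonics for one Gaussian. The sign convention, the +0.5 offset, and the clamp follow the widely used reference 3D Gaussian Splatting code; whether Brush uses exactly this layout is an assumption, and `sh_to_rgb` is a hypothetical name.

```python
SH_C0 = 0.28209479177387814  # constant for the degree-0 basis function
SH_C1 = 0.4886025119029199   # magnitude of the three degree-1 basis functions


def sh_to_rgb(coeffs, direction):
    """Evaluate view-dependent color from 4 RGB coefficient triples.

    coeffs: [dc, sh1_a, sh1_b, sh1_c], each an (r, g, b) triple.
    direction: unit vector (x, y, z) from the camera to the Gaussian.
    Layout and conventions are assumptions made for this sketch.
    """
    x, y, z = direction
    basis = (SH_C0, -SH_C1 * y, SH_C1 * z, -SH_C1 * x)
    rgb = []
    for c in range(3):
        value = sum(b * coeffs[i][c] for i, b in enumerate(basis))
        rgb.append(max(value + 0.5, 0.0))  # offset, then clamp negatives
    return rgb


# Only the z-aligned degree-1 term is set, so color depends on view direction.
coeffs = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)]
seen_from_front = sh_to_rgb(coeffs, (0.0, 0.0, 1.0))
seen_from_back = sh_to_rgb(coeffs, (0.0, 0.0, -1.0))
```

The same Gaussian comes out bright from one side and dark from the other, which is how splats capture effects such as specular highlights without a neural network.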
Installation
Prerequisites
Brush requires Rust 1.88+ installed on your system. If you don't have Rust yet, install it using rustup:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Building from Source
Clone the repository and build an optimized release binary:
git clone https://github.com/ArthurBrussee/brush.git
cd brush
cargo run --release
For a debug build (faster compilation, slower runtime):
cargo run
Important: Always use --release for training. Debug builds are significantly slower and the CLI will warn you about this.
Running the CLI
Brush provides a full command-line interface. View available commands:
brush --help
Train a model from a dataset:
brush train /path/to/dataset --with-viewer
The --with-viewer flag opens the interactive UI alongside training, which is invaluable for debugging and monitoring progress.
Export a trained model:
brush export /path/to/dataset --export-path ./output
Evaluate model quality:
brush eval /path/to/dataset
Web Build
Brush can be compiled to WebAssembly for browser deployment. You need wasm-pack and Node.js:
# Install wasm-pack
cargo install wasm-pack
# Build and run the web demo
cd brush/apps/brush-app/web
npm install
npm run dev
The web version currently supports Chrome 134+ and Edge on Windows and macOS.
Android Build
For Android deployment, set up the Android SDK and NDK:
# Add Android target
rustup target add aarch64-linux-android
# Install cargo-ndk
cargo install cargo-ndk
# Build native library (release mode for performance)
cargo ndk -t arm64-v8a -o crates/brush-app/app/src/main/jniLibs/ build --release
# Build and install the APK
cd crates/brush-app
./gradlew build
./gradlew installDebug
adb shell am start -n com.splats.app/.MainActivity
Training with Brush
Dataset Preparation
Brush accepts datasets in two primary formats:
- COLMAP format: The standard output from COLMAP photogrammetry software, including sparse/dense reconstructions with camera poses and point clouds.
- Nerfstudio format: A transforms.json file paired with image directories, commonly used in the NeRF ecosystem.
If your dataset includes an init.ply file, Brush will automatically use it as the initial point cloud for training. You can also include a folder of masks to ignore specific image regions, or use images with alpha channels to force the output splat to match transparency.
Training Process
Brush v0.3 uses an MCMC-like training technique with its own variation that still grows splats automatically. This combines the best of both worlds: splats grow where they are needed, while also exploring the scene like in MCMC to improve quality.
# Basic training with viewer
brush train /path/to/colmap/dataset --with-viewer
# Training with maximum splat count
brush train /path/to/dataset --max-splats 1000000
# Export checkpoints every N steps
brush train /path/to/dataset --export-every 1000 --export-path ./checkpoints
# Save evaluation images to disk
brush train /path/to/dataset --eval-save-to-disk --export-path ./eval_results
During training, you can interact with the scene in the viewer, rotate the model with arrow keys, and compare the current rendering against input views as training progresses.
Amazing: Brush can train on datasets larger than RAM. Only a configurable amount of data is cached in memory, while the rest is streamed by the data loader during training. Training also starts instantly, with no lengthy preprocessing step required.
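The idea of keeping only a bounded amount of data in memory and streaming the rest can be sketched with a small cache. This is not Brush's data loader: the class name, the item-count capacity, and the least-recently-used eviction policy are all assumptions chosen to keep the example short.

```python
from collections import OrderedDict


class BoundedImageCache:
    """Keep at most `capacity` decoded items in memory, reload the rest on demand.

    Hypothetical sketch; LRU eviction is an assumption, not Brush's documented policy.
    """

    def __init__(self, capacity, load_fn):
        self.capacity = capacity
        self.load_fn = load_fn      # called on a cache miss, e.g. to decode from disk
        self._items = OrderedDict()
        self.loads = 0              # number of cache misses so far

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return self._items[key]
        value = self.load_fn(key)
        self.loads += 1
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used item
        return value


cache = BoundedImageCache(capacity=2, load_fn=lambda name: name.upper())
for name in ["a", "b", "a", "c", "b"]:
    cache.get(name)
```

With a capacity of two, the second request for `"b"` has to reload it because `"c"` pushed it out, so memory stays bounded regardless of dataset size.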
Loss Functions
Brush supports multiple loss functions that can be combined during training:
- L1 Loss: Basic pixel-wise absolute difference
- SSIM Loss: Structural Similarity Index with separable convolution for efficiency
- LPIPS Loss: Learned Perceptual Image Patch Similarity for perceptual quality
- Alpha Loss: Weighted loss for transparent image regions, controllable via --alpha-loss-weight
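As a rough illustration of how an L1 term and an SSIM term can be combined, the sketch below works on flat lists of pixel intensities in [0, 1]. It is deliberately simplified: SSIM is computed once over the whole patch rather than with the sliding, separable window the real loss uses, and the 0.2 weight is the default from the original 3D Gaussian Splatting paper, assumed here rather than taken from Brush.

```python
def l1_loss(pred, target):
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def global_ssim(pred, target, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM over one whole patch (simplification: no sliding window)."""
    n = len(pred)
    mu_p = sum(pred) / n
    mu_t = sum(target) / n
    var_p = sum((p - mu_p) ** 2 for p in pred) / n
    var_t = sum((t - mu_t) ** 2 for t in target) / n
    cov = sum((p - mu_p) * (t - mu_t) for p, t in zip(pred, target)) / n
    numerator = (2 * mu_p * mu_t + c1) * (2 * cov + c2)
    denominator = (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2)
    return numerator / denominator


def combined_loss(pred, target, ssim_weight=0.2):
    """Weighted sum of L1 and (1 - SSIM); the weight is an assumed default."""
    return ((1 - ssim_weight) * l1_loss(pred, target)
            + ssim_weight * (1 - global_ssim(pred, target)))


target = [0.1, 0.4, 0.7, 0.9]
blurry = [0.5, 0.5, 0.5, 0.6]
loss_value = combined_loss(blurry, target)
```

L1 penalizes per-pixel error while SSIM penalizes lost structure and contrast, so the blurry prediction above is punished by both terms.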
Viewing and Exporting Results
Interactive Viewer
Brush includes a full-featured viewer built with egui, supporting:
- Orbit controls: Rotate around the scene center
- FPS controls: First-person navigation through the scene
- Flythrough controls: Smooth camera path animation
- Panning: Translate the view
- Arrow keys: Rotate model and move up/down
- F key: Toggle fullscreen mode
- FOV slider: Adjust field of view
- Background color picker: Change the scene background
- Splat scale slider: Adjust Gaussian splat sizes
Loading Splats
The viewer can load .ply and .compressed.ply files. For web deployment, you can stream data from a URL:
https://arthurbrussee.github.io/brush-demo/?url=https://example.com/scene.ply&zen=true
The ?zen=true parameter enables fullscreen mode for immersive viewing.
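If you generate these links programmatically, it is safer to percent-encode the splat URL so that any `&` or `?` inside it is not mistaken for part of the viewer's own query string. The helper below is a sketch: `viewer_link` is a hypothetical name, and it assumes the demo page decodes query parameters in the standard way.

```python
from urllib.parse import urlencode

VIEWER_BASE = "https://arthurbrussee.github.io/brush-demo/"


def viewer_link(splat_url, zen=False):
    """Build a demo link using the `url` and `zen` parameters shown above."""
    params = {"url": splat_url}
    if zen:
        params["zen"] = "true"
    # urlencode percent-encodes ':' and '/' inside the nested URL.
    return VIEWER_BASE + "?" + urlencode(params)


link = viewer_link("https://example.com/scene.ply", zen=True)
```

Here `link` is `https://arthurbrussee.github.io/brush-demo/?url=https%3A%2F%2Fexample.com%2Fscene.ply&zen=true`.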
Dynamic Splats
Brush supports animated splat sequences:
- ZIP of PLY files: Load a sequence of splat files as an animation
- Delta frame PLY: Custom format with incremental frames (used by Cat4D and Cap4D)
- Play/pause controls: Control animation playback in the viewer
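For the ZIP option, one way to package per-frame PLY files is sketched below. The `frame_00000.ply` naming scheme is an assumption: the source does not say how Brush orders frames inside the archive, so zero-padded names are used only so that alphabetical order matches frame order.

```python
import io
import zipfile


def pack_frames(frames):
    """Pack a list of PLY file contents (bytes) into an in-memory ZIP archive.

    File names are hypothetical; check Brush's documentation for the expected layout.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for index, ply_bytes in enumerate(frames):
            archive.writestr(f"frame_{index:05d}.ply", ply_bytes)
    return buffer.getvalue()


# Placeholder payloads stand in for real PLY data.
archive_bytes = pack_frames([b"ply frame 0", b"ply frame 1", b"ply frame 2"])
```

Writing `archive_bytes` to a `.zip` file gives a single archive with one PLY file per animation frame.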
Architecture Deep Dive
Brush is organized as a Rust workspace with 20+ crates, each with a focused responsibility:
Core Crates
| Crate | Purpose |
|---|---|
| brush-render | Forward Gaussian Splat rendering with GPU kernels |
| brush-render-bwd | Backward pass for autodiff gradient computation |
| brush-train | MCMC training loop with AdamScaled optimizer |
| brush-dataset | Scene loading, batching, and data management |
| brush-loss | Loss functions: L1, SSIM, LPIPS |
| brush-sort | GPU radix sort for tile-based rendering |
| brush-prefix-sum | Parallel prefix sum for GPU computation |
| brush-vfs | Virtual file system abstraction |
| brush-serde | Serialization layer for splat data |
| brush-async | Async runtime utilities |
| brush-cube | CubeCL compute abstractions |
| colmap-reader | COLMAP format parser |
| lpips | LPIPS perceptual similarity model |
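Radix sort and prefix sum appear together because a radix sort is itself built from prefix sums: each pass counts digits, scans the counts into write offsets, and scatters the keys. The CPU sketch below shows that relationship. It is not the code in brush-sort or brush-prefix-sum, and the key layout that packs a tile index above a depth rank is an assumption made for illustration.

```python
def exclusive_prefix_sum(counts):
    """Return (offsets, total) where offsets[i] is the sum of counts[:i]."""
    offsets, running = [], 0
    for count in counts:
        offsets.append(running)
        running += count
    return offsets, running


def radix_sort(keys, key_bits=32, digit_bits=8):
    """Stable least-significant-digit radix sort of non-negative integers."""
    mask = (1 << digit_bits) - 1
    for shift in range(0, key_bits, digit_bits):
        histogram = [0] * (mask + 1)
        for key in keys:
            histogram[(key >> shift) & mask] += 1
        offsets, _ = exclusive_prefix_sum(histogram)  # where each digit bucket starts
        output = [0] * len(keys)
        for key in keys:
            digit = (key >> shift) & mask
            output[offsets[digit]] = key
            offsets[digit] += 1
        keys = output
    return keys


def tile_depth_key(tile_id, depth_rank, depth_bits=16):
    """Pack tile index into the high bits and depth rank into the low bits (assumed layout)."""
    return (tile_id << depth_bits) | depth_rank


keys = [tile_depth_key(1, 7), tile_depth_key(0, 9), tile_depth_key(1, 2)]
sorted_keys = radix_sort(keys)
```

After sorting, all entries for a tile are contiguous and ordered by depth, which is the arrangement a tile-based rasterizer needs before blending.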
Application Crates
| Crate | Purpose |
|---|---|
| brush-app | Desktop GUI application (egui) |
| brush-cli | Command-line interface |
| brush-js | WebAssembly + NPM package for web |
| brush-c | C FFI bindings |
GPU Compute Stack
Brush's GPU compute stack is what enables its cross-platform capability:
- Burn Framework: Provides the autodiff system and tensor operations
- CubeCL: JIT-compiled compute kernels written in Rust that compile to SPIR-V, WGSL, PTX, and more
- wgpu: Cross-platform GPU abstraction supporting Vulkan, Metal, DX12, and WebGPU
- Custom Forks: Brush maintains forks of wgpu and CubeCL with WebGPU subgroup operations needed for the backward rasterizer
Takeaway: The use of Burn and CubeCL means Brush's compute kernels are written once in Rust and automatically compiled to the optimal GPU instruction set for each platform. This is why a single codebase can run on NVIDIA, AMD, Intel, and even in the browser.
Brush vs Other 3D Reconstruction Tools
| Feature | Brush | gsplat (Python) | Original 3DGS | NeRF Studio |
|---|---|---|---|---|
| Language | Rust | Python/CUDA | Python/CUDA | Python/CUDA |
| GPU Vendors | NVIDIA, AMD, Intel | NVIDIA only | NVIDIA only | NVIDIA only |
| Web Support | Yes (WASM) | No | No | No |
| Mobile Support | Android | No | No | No |
| CUDA Required | No | Yes | Yes | Yes |
| Training Speed | Faster than gsplat | Baseline | Baseline | Varies |
| Quality (PSNR) | Higher than gsplat | Baseline | Baseline | Varies |
| Binary Size | Small, standalone | Large (Python env) | Large (Python env) | Large (Python env) |
| Interactive Training | Yes | Limited | Limited | Yes |
| Dynamic Splats | Yes | No | No | No |
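The "Quality (PSNR)" row refers to peak signal-to-noise ratio, the standard metric for comparing a rendered image against a held-out photo. A minimal sketch, assuming pixel values in [0, 1] stored as flat lists (the function name and data layout are illustrative only):

```python
import math


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in decibels; higher means closer to the target."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


rendered = [0.5, 0.5]
photo = [0.6, 0.4]
score = psnr(rendered, photo)
```

Because the scale is logarithmic, a gain of about 3 dB corresponds to halving the mean squared error, so small differences in the table can reflect visible quality changes.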
Troubleshooting
Common Issues
Build fails with Rust version error:
Brush requires Rust 1.88 or later. Update your toolchain:
rustup update stable
rustup default stable
Web demo only works in Chrome/Edge:
WebGPU is still an emerging standard. Firefox and Safari support is not yet available. Use Chrome 134+ or Edge for the web demo.
Training is slow:
Make sure you are running a release build:
cargo run --release
Debug builds are significantly slower. The CLI will display a warning if you are running in debug mode.
Out of memory on large datasets:
Brush can handle datasets larger than RAM by streaming data. If you still encounter memory issues, try reducing the batch size or limiting the maximum number of splats:
brush train /path/to/dataset --max-splats 500000
Android build fails:
Ensure the following environment variables are set:
- ANDROID_NDK_HOME pointing to your NDK installation
- ANDROID_HOME pointing to your Android SDK
Also verify you have added the correct target:
rustup target add aarch64-linux-android
wgpu panics on startup:
This may indicate your GPU driver does not support the required features. Try updating your GPU drivers to the latest version. On Linux, ensure you have the latest Mesa or proprietary drivers installed.
Getting Started: Your First Reconstruction
Here is a complete walkthrough to create your first 3D reconstruction with Brush:
- Capture images: Take 50-200 photos of an object or scene from different angles. More coverage leads to better results.
- Run COLMAP: Generate camera poses and a sparse point cloud:
colmap automatic_reconstructor \
--workspace_path ./workspace \
--image_path ./images
- Train with Brush: Point Brush at your COLMAP output:
brush train ./workspace/sparse --with-viewer
- Monitor progress: Watch the training in the interactive viewer. You will see the scene gradually appear as splats are created and refined.
- Export the result: Once training converges (typically 7,000-30,000 iterations), export the PLY file:
brush export ./workspace/sparse --export-path ./output
- Share on the web: Upload the PLY file and share via URL:
https://arthurbrussee.github.io/brush-demo/?url=https://your-server.com/scene.ply&zen=true
Important: Brush's MCMC-like training automatically grows splats where they are needed and prunes redundant ones. You do not need to manually tune the number of Gaussians because the algorithm handles this for you. However, you can set a maximum splat count with --max-splats if you want to limit memory usage.
Performance Benchmarks
Brush's rendering and training are generally faster than gsplat, the reference Python implementation. The project includes built-in benchmarks:
cargo bench
Key performance characteristics:
- Clean builds compile in approximately 1.5 minutes on modern hardware
- Training speed exceeds gsplat on comparable hardware
- Memory usage is optimized with streaming data loading for large datasets
- Web training is approaching feature parity with the desktop version
Community and Resources
- GitHub Repository: ArthurBrussee/brush
- Web Demo: arthurbrussee.github.io/brush-demo
- Discord Community: Join the Brush Discord
- License: Apache-2.0
Brush is not an official Google product. It is a forked public version of the google-research repository, significantly extended with cross-platform support, MCMC training, and many other features.
Conclusion
Brush represents a paradigm shift in 3D reconstruction accessibility. By implementing Gaussian Splatting entirely in Rust on top of the Burn ML framework and WebGPU, it eliminates the CUDA dependency that has limited similar tools to NVIDIA GPUs. The result is a tool that runs everywhere, from high-end workstations to mobile phones to web browsers, while delivering training quality that exceeds the reference implementation.
Whether you are a researcher exploring neural rendering, a developer building 3D applications, or a hobbyist creating photorealistic 3D models from photos, Brush provides a powerful yet accessible entry point into the world of Gaussian Splatting.