Why Burn & Polars
Machine learning in Rust: from data to inference
The Problem
Machine learning in production requires two things: data processing and model inference. The standard stack is Python (pandas + PyTorch/TensorFlow), with all the baggage that implies.
Python is slow where it matters.
Data loading, preprocessing, feature engineering—these are CPU-bound tasks where Python's interpreter overhead dominates. Model inference in Python adds latency and memory overhead that scale with request volume.
The Python → Production gap is real.
Train in Python, then rewrite for production in C++. Export ONNX and pray the semantics match. Maintain two codebases that must stay synchronized.
We needed ML infrastructure that:
- Processes data at native speed without leaving Rust
- Runs inference without Python runtime overhead
- Compiles to multiple backends (CPU, GPU, WASM)
- Integrates with our Rust services naturally
Current Options
| Option | Pros | Cons |
|---|---|---|
| Python (pandas + PyTorch) | Industry standard. Maximum ecosystem. | Interpreter overhead on CPU-bound work. GIL contention. Heavyweight deployment (Python runtime in every image). |
| Rust (Polars + Burn) | Native performance for data and inference. | Smaller ecosystem and community. Training less mature than PyTorch. |
| ONNX Runtime | Universal model format with optimized inference. | Export step where semantics may not match. Separate runtime to integrate. |
Future Outlook
Rust ML is reaching production readiness.
Polars is already faster than pandas.
This isn't marginal—it's 10-100x faster on realistic workloads. Lazy evaluation, query optimization, native parallelism. The performance gap will only widen.
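The core idea behind that gap is lazy evaluation: operations are recorded as a plan and only executed on `collect()`, which is what lets a query engine run everything in one pass and optimize the plan before touching data. A minimal stdlib-only sketch of the idea (all names here are hypothetical illustrations, not Polars internals):

```rust
// Minimal lazy pipeline: operations are queued, not run, until collect().
// Hypothetical illustration of the lazy-evaluation idea; not Polars code.
struct LazyVec {
    data: Vec<f64>,
    ops: Vec<Box<dyn Fn(f64) -> Option<f64>>>,
}

impl LazyVec {
    fn new(data: Vec<f64>) -> Self {
        Self { data, ops: Vec::new() }
    }

    // Record a filter; nothing is computed yet.
    fn filter(mut self, pred: impl Fn(f64) -> bool + 'static) -> Self {
        self.ops
            .push(Box::new(move |x| if pred(x) { Some(x) } else { None }));
        self
    }

    // Record a map; nothing is computed yet.
    fn map(mut self, f: impl Fn(f64) -> f64 + 'static) -> Self {
        self.ops.push(Box::new(move |x| Some(f(x))));
        self
    }

    // Execute the whole recorded plan in a single pass over the data.
    fn collect(self) -> Vec<f64> {
        self.data
            .into_iter()
            .filter_map(|mut x| {
                for op in &self.ops {
                    match op(x) {
                        Some(y) => x = y,
                        None => return None,
                    }
                }
                Some(x)
            })
            .collect()
    }
}
```

Polars' real engine goes much further: because the full plan is known before execution, it can push filters into the file scan, prune unused columns, and parallelize stages.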
Burn is maturing rapidly.
Multiple backends (NdArray, Tch, WGPU, Candle), automatic differentiation, ONNX import. You can train models in Burn or import them from PyTorch.
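The multi-backend design boils down to writing the model once, generic over a backend trait, and letting each backend supply the tensor math. A stdlib-only sketch of that pattern (the `Backend` trait, `CpuBackend`, and `Dense` below are hypothetical stand-ins, not Burn's actual API):

```rust
// Sketch of the backend-generic pattern: the model is written once
// against a trait; each backend supplies the tensor operations.
// All names here are hypothetical stand-ins, not Burn's API.
trait Backend {
    type Tensor: Clone;
    fn from_vec(data: Vec<f32>) -> Self::Tensor;
    fn dot(a: &Self::Tensor, b: &Self::Tensor) -> f32;
}

// A CPU backend backed by plain Vec<f32>.
struct CpuBackend;

impl Backend for CpuBackend {
    type Tensor = Vec<f32>;
    fn from_vec(data: Vec<f32>) -> Self::Tensor {
        data
    }
    fn dot(a: &Self::Tensor, b: &Self::Tensor) -> f32 {
        a.iter().zip(b).map(|(x, y)| x * y).sum()
    }
}

// The model is generic over the backend: swapping CPU for GPU or WASM
// never touches this code.
struct Dense<B: Backend> {
    weights: B::Tensor,
    bias: f32,
}

impl<B: Backend> Dense<B> {
    fn forward(&self, input: &B::Tensor) -> f32 {
        B::dot(&self.weights, input) + self.bias
    }
}
```

In Burn, the same role is played by its `Backend` trait, which is why one model definition runs on NdArray, Tch, or WGPU.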
WASM changes the deployment story.
Burn compiles to WASM. ML inference in the browser, on edge devices, anywhere WASM runs. No Python, no runtime dependencies, just a binary.
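Targeting WASM is mostly build configuration. A sketch of what that can look like (the version number and feature names are illustrative; check Burn's documentation for the current flags):

```toml
# Cargo.toml (sketch): build the crate as a WASM-compatible library.
[lib]
crate-type = ["cdylib", "rlib"]

[dependencies]
# Illustrative feature selection: Burn's WASM-capable backends are the
# pure-Rust ones, such as ndarray.
burn = { version = "0.13", default-features = false, features = ["ndarray"] }
```

From there, `cargo build --target wasm32-unknown-unknown --release` produces a `.wasm` binary you can load from JavaScript or an edge runtime.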
The trajectory is clear: Rust becomes a first-class ML language. We are early adopters, not guinea pigs.
Our Decision
✓ Why we chose this
- Native performance: No Python interpreter overhead. Data processing and inference at compiled speed.
- Unified language: Data pipeline, model inference, and service code in one language. No serialization boundaries.
- Memory efficiency: Polars uses the Apache Arrow format. Zero-copy operations, predictable memory usage.
- Multi-backend: Burn compiles to CPU, GPU (CUDA/Metal), and WASM from the same code.
× Trade-offs we accept
- Ecosystem size: Fewer pre-built models than PyTorch Hub. More DIY required.
- Community size: Fewer StackOverflow answers. Documentation improving but not as comprehensive.
- Training limitations: For novel research, PyTorch is still more flexible. Burn is better for inference than training.
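The memory-efficiency point can be illustrated in plain Rust: Arrow-style columnar buffers are shared and windowed by reference counting, so slicing a column never copies data. A stdlib analogy using `Arc` (hypothetical types, not Arrow's actual implementation):

```rust
use std::sync::Arc;

// Stdlib analogy for Arrow-style zero-copy: a "column" is a shared,
// immutable buffer plus an offset/length window. Slicing creates a new
// view over the same allocation; no bytes are copied.
#[derive(Clone)]
struct Column {
    buffer: Arc<Vec<f64>>,
    offset: usize,
    len: usize,
}

impl Column {
    fn new(data: Vec<f64>) -> Self {
        let len = data.len();
        Self { buffer: Arc::new(data), offset: 0, len }
    }

    // O(1) slice: bumps the refcount, adjusts the window, copies nothing.
    fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len);
        Self {
            buffer: Arc::clone(&self.buffer),
            offset: self.offset + offset,
            len,
        }
    }

    fn values(&self) -> &[f64] {
        &self.buffer[self.offset..self.offset + self.len]
    }
}
```

This is why Polars' memory usage is predictable: filters, slices, and column selections are views over shared Arrow buffers rather than fresh allocations.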
Motivation
Our ML workloads are inference-heavy. We load data, preprocess it, run models, return results. This happens on every request.
Python added latency we could not tolerate. pandas loaded data slowly. PyTorch inference had GIL contention. Deployment meant Docker images with Python runtimes.
Polars loads data 10x faster. Burn runs inference without interpreter overhead. Our services are single binaries with no runtime dependencies.
The transition was not trivial—we rebuilt data pipelines and model inference. But the result is faster, simpler, and cheaper to operate.
Recommendation
Use Polars for:
- Data loading and preprocessing
- Feature engineering
- ETL pipelines
- Any task currently using pandas
Use Burn for:
- Model inference in production
- Edge deployment (WASM)
- When you need to integrate with Rust services
Training strategy:
- For standard architectures: train in PyTorch, export to ONNX, import to Burn
- For custom models: evaluate training in Burn (improving but not yet parity)
Start with Polars. The learning curve is low (similar API to pandas), and the performance gains are immediate. Add Burn for inference when Python overhead becomes a bottleneck.
Examples
```rust
// Polars: DataFrame operations at native speed
use polars::prelude::*;

fn preprocess_features(path: &str) -> Result<DataFrame, PolarsError> {
    let df = LazyFrame::scan_parquet(path, Default::default())?
        .filter(col("status").eq(lit("active")))
        .with_column((col("revenue") / col("users")).alias("revenue_per_user"))
        .group_by([col("category")])
        .agg([
            col("revenue_per_user").mean().alias("avg_revenue"),
            col("users").sum().alias("total_users"),
        ])
        .sort(
            ["avg_revenue"],
            SortMultipleOptions::default().with_order_descending(true),
        )
        .collect()?;
    Ok(df)
}
```
```rust
// Burn: model inference without Python
use burn::prelude::*;
use burn::backend::NdArray;

type Backend = NdArray<f32>;

// `Model` is your Burn module, defined elsewhere or imported from ONNX.
fn predict(model: &Model<Backend>, features: Tensor<Backend, 2>) -> Tensor<Backend, 1> {
    model.forward(features)
}
```

Polars handles data with lazy evaluation and query optimization. Burn runs inference on any backend without Python dependencies.