Bagel Labs

Distributed Superintelligence.

Bagel Labs is an artificial intelligence research lab developing novel methods for distributed training of frontier diffusion models on commodity hardware. Our work enables training of state-of-the-art generative models for robotics, video, and world modelling across heterogeneous hardware, unlocking compute capacity that current training architectures cannot reach.


Decentralized Diffusion Models.

Decentralized Diffusion Models (DDM) replace a single large diffusion model with an ensemble of smaller expert models, each trained independently on a partition of the dataset with no gradient synchronization between nodes. At inference, a lightweight router ensembles their outputs. This removes the tight coupling that forces conventional training onto homogeneous GPU superclusters.
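The training side of this decomposition can be sketched in a few lines. This is a toy illustration, not Bagel Labs' actual code: each "expert" here is a linear denoiser fit in closed form on its own data shard, and the round-robin partitioning stands in for whatever dataset partitioning scheme DDM actually uses. The point it demonstrates is structural: no gradients or parameters ever cross shard boundaries.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for decentralized expert training. K experts, each fitting
# a linear denoiser W on its own dataset partition only. The linear model
# and the partitioning scheme are illustrative assumptions.
K, N, D = 4, 512, 8
clean = rng.normal(size=(N, D))
shards = np.array_split(clean, K)  # disjoint dataset partitions, one per node

def train_expert(x_clean, noise_std=0.5):
    """Fit W minimizing ||x_noisy @ W - x_clean||^2 using one shard only."""
    x_noisy = x_clean + noise_std * rng.normal(size=x_clean.shape)
    W, *_ = np.linalg.lstsq(x_noisy, x_clean, rcond=None)
    return W

# Fully independent training: zero communication between experts.
experts = [train_expert(s) for s in shards]
```

Because each call to `train_expert` sees only its shard, the K training jobs could run on K unrelated machines with different hardware and finish at different times, which is what breaks the homogeneous-supercluster requirement.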

DDM decomposition diagram

Paris-1.

Paris is the first publicly released DDM. Trained with 14x less data and 16x less compute than prior decentralized baselines, it nonetheless outperforms a model trained conventionally on a monolithic cluster, improving FID-50K by 24% (22.60 vs 29.64) on standard benchmarks.

Inference Strategy                   FID-50K ↓
Monolithic (single)                  29.64
Top-1                                30.60
Top-2                                22.60
Full Ensemble                        47.89
Improvement (Top-2 vs Monolithic)     7.04
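The table shows that routing to the top-2 experts beats both a single expert and the full ensemble. A minimal sketch of that inference-time routing, under assumed details (the router scores experts by the input's distance to each partition centroid, and the "denoisers" are toy linear maps; neither is taken from the Paris paper):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative top-k routing at inference. Each expert is associated with
# the centroid of its training partition (an assumed router design); the
# router picks the k closest experts and averages their denoised outputs.
D, K = 8, 4
centroids = rng.normal(size=(K, D))                    # one per partition
experts = [rng.normal(size=(D, D)) for _ in range(K)]  # toy denoiser weights

def denoise_top_k(x_noisy, k=2):
    scores = -np.linalg.norm(centroids - x_noisy, axis=1)  # higher = closer
    top = np.argsort(scores)[-k:]                          # top-k experts
    outs = np.stack([x_noisy @ experts[i] for i in top])
    return outs.mean(axis=0)                               # ensemble average

x = rng.normal(size=D)
y = denoise_top_k(x, k=2)
```

The table's pattern is consistent with this design: top-1 loses the benefit of ensembling, while averaging all K experts pulls in models trained on unrelated data partitions, so an intermediate k performs best.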