Distributed Superintelligence.
Bagel Labs is an AI research lab developing novel methods for distributed training of frontier diffusion models on commodity hardware. Our work enables training of state-of-the-art generative models for robotics, video, and world modeling across heterogeneous hardware, unlocking compute capacity that current training architectures can't use.
Decentralized Diffusion Models.
Decentralized Diffusion Models (DDM) replace a single large diffusion model with an ensemble of smaller expert models, each trained independently on its own partition of the dataset with no gradient synchronization between nodes. At inference, a lightweight router combines the experts' outputs. This removes the tight coupling that forces conventional training onto homogeneous GPU superclusters.
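The inference-time routing step can be illustrated with a minimal NumPy sketch. It is not Bagel's implementation: the function names `softmax` and `route_top_k`, the tensor shapes, and the choice to renormalize softmax weights over the selected experts are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def route_top_k(router_logits, expert_outputs, k):
    """Blend expert predictions using the k highest-weighted experts per sample.

    router_logits:  (batch, n_experts) scores from a hypothetical router.
    expert_outputs: (n_experts, batch, dim) one prediction per expert.
    Returns the blended prediction (batch, dim) and the weights (batch, n_experts).
    """
    n_experts = router_logits.shape[1]
    if not 1 <= k <= n_experts:
        raise ValueError("k must be between 1 and the number of experts")

    probs = softmax(router_logits, axis=1)

    # Keep only the k most probable experts for each sample.
    top_idx = np.argsort(-probs, axis=1)[:, :k]
    mask = np.zeros_like(probs)
    np.put_along_axis(mask, top_idx, 1.0, axis=1)

    # Renormalize so the kept weights sum to 1 (an assumed design choice).
    weights = probs * mask
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Weighted sum over experts: sum_e weights[b, e] * expert_outputs[e, b, :].
    combined = np.einsum("be,ebd->bd", weights, expert_outputs)
    return combined, weights
```

Under these assumptions, `k=1` sends each sample to a single expert, `k=2` blends the two best-matching experts, and `k=n_experts` averages every expert by its router weight.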
Paris-1.
Paris is the first publicly released DDM. It uses 14x less data and 16x less compute than prior decentralized baselines, yet with Top-2 routing it outperforms a model trained on a traditional monolithic cluster, achieving a 24% FID-50K improvement (22.60 vs. 29.64) on standard benchmarks.
| Inference Strategy | FID-50K ↓ |
|---|---|
| Monolithic (single) | 29.64 |
| Top-1 | 30.60 |
| Top-2 | 22.60 |
| Full Ensemble | 47.89 |
| Improvement (Top-2 vs. Monolithic) | 7.04 |
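The improvement figures follow directly from the FID-50K values in the table. The short check below recomputes them; the dictionary and the `improvement` helper are illustrative names, and the only inputs are the numbers reported above.

```python
# FID-50K values copied from the table (lower is better).
fid = {
    "Monolithic (single)": 29.64,
    "Top-1": 30.60,
    "Top-2": 22.60,
    "Full Ensemble": 47.89,
}
baseline = fid["Monolithic (single)"]


def improvement(strategy):
    # Absolute gain in FID points and relative gain versus the monolithic baseline.
    absolute = baseline - fid[strategy]
    return absolute, absolute / baseline


abs_gain, rel_gain = improvement("Top-2")
print(f"{abs_gain:.2f} FID points, {rel_gain:.1%} relative")
# prints "7.04 FID points, 23.8% relative"
```

The 23.8% relative gain rounds to the 24% quoted above. Of the three routing strategies listed, only Top-2 beats the monolithic baseline.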
Bagel Labs is an AI research lab and infrastructure company building distributed training systems for diffusion-heavy physical AI.
You will own Bagel Labs' physical-AI go-to-market motion across technical buyer discovery, design partners, written technical evaluations, and paid pilots.
Role Overview
The job is to turn Bagel's distributed-training advantage into a real market wedge with teams building robotics, autonomy, simulation, embodied AI, industrial AI, world-model, and diffusion-heavy physical-AI model stacks.
Key Responsibilities
- Define and prioritize Bagel's physical-AI ideal customer profile (ICP).
- Move the best accounts from first conversation to written technical evaluation, design partnership, or paid pilot.
- Lead technical-commercial conversations with founders, ML leads, infra owners, research leads, and executives.
- Translate DDM into buyer language around training economics, heterogeneous compute, specialization, and model-stack constraints.
- Package buyer requirements back into Paris 3 and distributed-training priorities.
Who You Might Be
You might be an ex-founder, first business hire, founding GTM lead, technical BD lead, infrastructure GTM operator, or product-minded commercial lead from robotics, simulation, AI infrastructure, autonomy, developer platforms, GPU/cloud, or deep-tech markets.
You will help define Bagel's physical-AI research program across diffusion models, world models, latent dynamics, action representations, simulation, embodied generalization, video/multimodal modeling, and decentralized training.
Role Overview
The concrete proof point is Paris 3: showing that Bagel's DDM approach can matter for physical-AI workloads, starting with physical-AI benchmarks and moving toward specialization, compositionality, and action/world-state prediction.
Key Responsibilities
- Develop research directions across diffusion, world models, action representations, latent dynamics, simulation, and embodied policy learning.
- Contribute to decentralized world models, DDM expert ensembles, routing, specialization, and compositional generalization.
- Explore Action RAE / latent action encoder directions as a reusable representation layer for embodied AI.
- Design experiments with clear baselines, datasets, success metrics, and failure modes.
Who You Might Be
You might be a diffusion researcher, robotics/world-model researcher, video generation researcher, simulation researcher, representation-learning researcher, RL/imitation-learning researcher, or applied scientist with strong taste around physical systems.
You will build the systems layer that lets Bagel turn research into credible proof across distributed training, GPU orchestration, observability, benchmark infrastructure, experiment tracking, and data/model pipelines.
Role Overview
The role is broader than data-pipeline work and broader than benchmark maintenance. Physical-AI workloads are the wedge, but the core skill is building high-leverage ML systems.
Key Responsibilities
- Build and operate distributed training infrastructure for diffusion-heavy workloads across heterogeneous compute.
- Improve launchers, configs, checkpointing, logging, metrics, run comparison, and reproducibility.
- Build benchmark and evaluation harnesses for physical-AI research.
- Add observability for GPU utilization, failure modes, data quality, routing behavior, model quality, and training stability.
Who You Might Be
You might come from ML infrastructure, distributed training, research engineering, GPU systems, observability, data infrastructure, benchmark engineering, simulation infrastructure, or model-platform work.