Senior Robot Learning Engineer

Company: Wave Recruitment

Location: Bristol

Posted: April 30th, 2026

This robot learning role is with a seriously exciting scale-up. The platform is mature, the data is flowing, and the team is ready to scale its most promising research directions into production-grade manipulation policies.

They need someone to lead the development and deployment of large behaviour models, taking diffusion transformers, VLAs, and language-conditioned policies from the literature onto a real bi-manual humanoid.

This is not a research-only role. You'll inherit a mature policy training codebase, a VR teleoperation pipeline producing high-frequency multi-modal data, and a Gymnasium environment wrapping a real robot. The work you ship runs on hardware.

The Role

You will architect, train, and deploy end-to-end large behaviour models for bi-manual and mobile manipulation, and lead the maturing of the early-stage RL pipeline.

The key responsibilities

Architect, train, and evaluate end-to-end large behaviour models for bi-manual and mobile manipulation
Advance diffusion transformer policies, mature VLA integration, and develop language conditioning for true multi-task generalisation
Apply RL to refine pre-trained policies: RL token fine-tuning, residual RL, off-policy RL with reference-action regularisation, RL-based fine-tuning of diffusion policies
Build a systematic sim-to-real transfer pipeline, connecting existing simulation infrastructure to training
Deploy and iterate learned policies on physical robot hardware
Mentor junior researchers and engineers, and publish at top-tier venues

What We're Looking For

Essential:

PhD/MSc in ML, Robotics, CS, or related field with 4+ years of equivalent industry research experience
Demonstrated expertise training and deploying learned manipulation policies on real robots
Strong background in at least two of: behaviour cloning, diffusion policies, VLA/VLM architectures, RL for manipulation
PyTorch and large-scale (multi-GPU, distributed) training
Track record of publications at top-tier venues (CoRL, RSS, ICRA, NeurIPS, ICML, ICLR), or equivalent demonstrated research impact through deployed systems, patents, or significant open-source contributions
Strong Python; production-quality research code with proper testing, type hints, and documentation

Useful:

Hands-on experience with humanoid or bi-manual manipulation platforms
Diffusion transformer, ACT, or VLA architectures specifically
Pre-trained vision/language models for robot control (CLIP, DINOv2, PaliGemma)
MuJoCo, Isaac Sim, or ManiSkill for sim-to-real policy training
RL fine-tuning of pre-trained policies (residual RL, DPPO, or similar)
3D perception for policy conditioning (point clouds, keypoints, NeRFs)

Key contribution areas

Policy Architecture & Training

End-to-end large behaviour models for bi-manual and mobile manipulation
Scale and evolve diffusion transformer policies, VLA integration, and language conditioning
Extend the imitation learning pipeline to leverage growing teleoperation datasets
Apply RL to push beyond what imitation alone can reach
Target sub-millimetre precision and contact-rich manipulation

Generalisation & Scaling

Develop policies that generalise across tasks, object categories, and environments
Move from single-task to multi-task and task-conditioned architectures
Design hierarchical behaviour systems for long-horizon manipulation
Investigate data-efficient learning: few-shot adaptation, transfer learning, multi-dataset training
Drive systematic ablations across architectures

Sim-to-Real & Deployment

Build the sim-to-real transfer pipeline: domain randomisation, rendering augmentation, sim-to-real benchmarking
Deploy and iterate learned policies on physical robot hardware
Extend the Gymnasium environment wrapper and integrate with the robot's control stack
Leverage perception team outputs (keypoints, learned features, 3D point clouds) for policy conditioning

Research Leadership

Track the literature and bring relevant advances back to the team
Identify and propose new research directions aligned with the manipulation roadmap
Mentor junior researchers and engineers
Publish at top-tier venues — conference attendance and open-source contributions are actively supported

What's On Offer

Join a team of world-class applied research scientists, ML engineers, and robotics software engineers
A mature platform that ships to physical hardware, not slides
Active support for conference attendance and open-source contributions
Competitive compensation

Apply or send your CV to —
