Staff Cloud SRE: AI/ML Platform & GPU Compute

Company: Deepstreamtech

Location: London

Posted: May 20th, 2026

Deepstreamtech is looking for a Staff Site Reliability Engineer to shape the reliability of large-scale AI systems and GPU compute infrastructure. In this foundational role, you will establish the reliability frameworks and operational standards that keep cloud infrastructure performing well.

Your responsibilities will range from defining SLOs to participating in a 24/7 on-call rotation. Ideal candidates will have strong SRE experience, particularly with GPU environments, Kubernetes, and cloud platforms such as AWS, GCP, or Azure.
