AI Engineer (Fluent Mandarin and English Required)

Company: Chubb

Location: London

Posted: May 9th, 2026

Position Overview

We are seeking an AI Engineer to join our Global Analytics team in London. This role focuses on the end-to-end lifecycle of production‑grade AI, from training and fine‑tuning specialized models to architecting high-performance inference pipelines.

We view AI as a rigorous engineering discipline. Beyond building models, you will write high-quality, maintainable Python code and ensure that every solution—whether a voice agent or a document processor—is built for reliability, low latency, and global scale.

Key Responsibilities

Model Training & Fine‑Tuning: Lead the adaptation of Large Language Models (LLMs) for domain‑specific tasks using parameter‑efficient fine‑tuning (PEFT) techniques such as LoRA and QLoRA to balance performance with resource efficiency.
Inference Optimization: Architect and optimize inference pipelines to minimize Time to First Token and maximize throughput, including quantization, caching strategies, and efficient batching.
Production Engineering: Build and maintain real‑time AI pipelines using WebSockets and Server-Sent Events (SSE), ensuring low‑latency delivery for voice (ASR/TTS) and text applications.
Architecture & MLOps: Deploy and orchestrate models within containerized microservice architectures (Docker/Kubernetes), ensuring robust monitoring, security, and scalability.
Collaborative Delivery: Work closely with Business Analysts and internal stakeholders to bridge commercial requirements and technical implementation.

Qualifications

Technical Requirements

Professional Experience: 5+ years in AI/ML engineering with a documented history of moving complex models from research into production.
Python Mastery: Deep proficiency in Python, with a strong commitment to clean coding standards (SOLID/DRY), modular design, and comprehensive unit/integration testing.
Generative AI Deep Dive: Hands‑on experience with LLM training cycles, parameter‑efficient fine‑tuning (PEFT), and sophisticated prompt engineering.
Inference Stack: Experience with high-performance inference servers (e.g., vLLM, TGI, or Triton) and understanding of how to optimize models for GPU deployment.
Infrastructure: Comfortable working in Linux‑based environments and proficient in managing containerized workloads and automated CI/CD pipelines.
Advanced RAG: Experience building production‑ready Retrieval-Augmented Generation systems, including vector database management and semantic search optimization.

Preferred Qualifications

Experience in the insurance or financial services sector.
Deep knowledge of GPU architecture, CUDA, and hardware‑level performance optimization.
Familiarity with Document Intelligence frameworks (OCR, layout analysis, and multimodal extraction).
Note: fluency in Mandarin is a mandatory requirement for this role, not a preference.

