Location: London / Edinburgh (Hybrid - 2 days per week in the office)
Day Rate: Market rate (Inside IR35)
Duration: 6 months
The Role
We are seeking a highly skilled GenAI Data Engineer to join a forward-thinking team delivering advanced data and AI solutions. This role will focus on designing scalable data platforms and integrating Generative AI capabilities into enterprise systems.
Key Responsibilities
- Design, build and maintain scalable data pipelines using PySpark, Python and distributed computing frameworks
- Architect and optimise AWS-based data and AI infrastructure for secure, high-performance data processing
- Develop, fine-tune, benchmark and evaluate GenAI models and LLMs, including custom training and inference optimisation
- Implement and maintain Retrieval-Augmented Generation (RAG) pipelines, vector databases and document processing workflows
- Build reusable frameworks for prompt management, evaluation and GenAI operations
- Collaborate with cross-functional teams to integrate GenAI solutions into production environments
- Ensure data quality, governance and operational reliability across systems
Skills and Experience
- Strong experience with PySpark and large-scale distributed data processing (ETL/ELT pipelines)
- Advanced SQL expertise, including schema design (star/snowflake), indexing and optimisation techniques
- Proven experience implementing CDC and SCD (Type 1/2/3) in data warehousing environments
- Advanced Python skills for data engineering, automation and AI/ML integration
- Hands‑on experience with AWS services (e.g. S3, Glue, Lambda, EMR, Bedrock or custom model hosting)
- Practical experience developing and evaluating GenAI models and LLMs
- Strong understanding of RAG architectures, embeddings and vector databases
- Experience handling structured and unstructured data (e.g. text, documents, logs, images)
- Knowledge of scalable storage solutions such as Delta Lake, Parquet, Redshift and DynamoDB
- Experience optimising AI models (e.g. quantisation, distillation, inference tuning)
- Strong troubleshooting and performance optimisation skills across distributed systems
- Additional experience across PySpark, Python, SQL, AWS and GenAI technologies
Personal Attributes
- Strong analytical and problem‑solving abilities
- Effective communication skills and ability to work collaboratively
- Experience working in agile, cross‑functional environments