Senior Data Engineer - Healthcare AI - UK/EU Remote

Vamstar

Software Engineering, Data Science
United Kingdom
Posted on Feb 1, 2026

Title: Senior Data Engineer - Healthcare AI

Location: UK or Europe | Remote (±2–3 hrs GMT overlap mandatory)

Reports to: Head of Engineering

Existing Clients: Top 100 Life Sciences, MedTech and Pharma companies

Type: Full-time

Core responsibilities & objectives

  • Design, build, and maintain batch and streaming data pipelines covering ingestion, cleaning, normalisation, enrichment, and deduplication.
  • Build and own ML/LLM pipelines end-to-end: document parsing, chunking, embeddings generation, vector indexing, agentic tool calling, multi-step workflows, retries, fallbacks, and state handling.
  • Write production-grade, well-tested Python that processes large volumes of data and documents reliably.
  • Own pipeline health: if data is stale, broken, or wrong, it's on you.
  • Work autonomously to project deadlines with minimal hand-holding.

Key qualifications & skills (non-negotiable)

  • 7+ years in backend data-heavy development or data engineering.
  • Highly proficient in Python.
  • Hands-on experience with large datasets and high-velocity data streams (Kafka, Flink, Spark).
  • Strong with pipeline orchestration tools (Airflow, MLflow, or equivalent).
  • Solid SQL skills (Postgres, BigQuery, or Snowflake) and NoSQL experience (DynamoDB, OpenSearch, Elastic).
  • Real experience with LLM workflows: RAG architectures, embeddings/vector DBs, prompt engineering, function/tool calling, observability.
  • Deep understanding of ETL/ELT patterns and data processing at scale.

Preferred background (strong signals)

  • Experience with AWS data stack at scale.
  • Exposure to healthcare, life sciences, or regulated industries.
  • Built and shipped data, ML, and LLM-powered pipelines in production.
  • Has debugged a pipeline and knows why observability matters.
  • Worked in a fast-moving startup where "that's not my job" doesn't exist.

What will get you rejected

  • "I set up the pipeline, someone else monitors it" mindset.
  • Tutorials and side projects but no production experience at scale.
  • Can't explain the trade-offs between streaming and batch, or why you chose one vector DB over another.
  • Needs detailed specs before writing a line of code.
  • No curiosity about healthcare or what the data actually means.

Interested? We're a distributed team solving hard problems that will reshape the healthcare industry for a generation. If you want ownership, not just tickets, we'd like to hear from you.