Building an MCP Server for Laboratory Instruments on Top of Vendor SDKs - 4 Lessons From LiquidBridge
Building an MCP server for laboratory instruments on top of a vendor SDK works - but only if the server treats the SDK as a hostile dependency. Across 18 months of building LiquidBridge, our digital twin and MCP layer for liquid handlers, four patterns made the difference between a brittle wrapper and a production-grade server: a hard adapter boundary, pre-tool-call validation, async event channels for long-running commands, and full plan simulation before any physical move. This post walks through each pattern with the code, the failure modes that motivated it, and what we would change next time.
Why Vendor SDKs Don't Map Cleanly to MCP
Liquid-handler vendor SDKs were not designed to be exposed to AI agents. They were designed for protocol authors writing scripts in a vendor IDE.
That mismatch shows up in three predictable ways:
Stateful, not stateless. Most vendor SDKs assume a long-running session. You initialize a deck, claim resources, run steps, and tear down. An MCP tools/call is a stateless request. Mapping one to the other requires the server to own session lifecycle on behalf of the agent.
Side-effect-first APIs. SDK methods like pipette.aspirate(volume, well) move physical hardware on call. Exposing them directly as MCP tools means a single hallucinated argument can corrupt a multi-day cell-culture run, ruin reagents that took weeks to prepare, or trigger a hard mechanical fault on a six-figure instrument. Lab hardware is a much higher-stakes target than a shell.
Vendor-defined error semantics. A Hamilton VENUS error code, a Tecan Fluent Control trace exception, and a PyLabRobot Python exception have nothing in common except that something went wrong. An AI agent needs a uniform error contract - structured, retriable-or-not, with an action recommendation - or it will either retry destructively or escalate everything to a human.
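To make that contract concrete, here is a minimal sketch of the error taxonomy our adapters raise. The constructor shape matches the adapter code in Pattern 1 below; the retriable flag and the to_payload serialization are illustrative assumptions, not the exact LiquidBridge implementation.
# liquidbridge/errors.py - sketch. Everything beyond code/detail/fix_hint
# (retriable, to_payload) is illustrative, not the shipped implementation.
class LiquidBridgeError(Exception):
    """Uniform contract: every vendor failure is translated into one of these."""
    code = "ADAPTER_FAULT"
    retriable = False  # may the agent safely retry the same call?

    def __init__(self, operation_id: str, detail: str, fix_hint: str | None = None):
        super().__init__(detail)
        self.operation_id = operation_id
        self.detail = detail
        self.fix_hint = fix_hint  # machine-readable next action, or None

    def to_payload(self) -> dict:
        # The shape the MCP layer returns to the agent.
        return {
            "code": self.code,
            "retriable": self.retriable,
            "operation_id": self.operation_id,
            "detail": self.detail,
            "fix_hint": self.fix_hint,
        }

class HardwareTimeout(LiquidBridgeError):
    code, retriable = "HARDWARE_TIMEOUT", True

class DeckCollision(LiquidBridgeError):
    code, retriable = "DECK_COLLISION", False

class OutOfTips(LiquidBridgeError):
    code, retriable = "OUT_OF_TIPS", False

class AdapterFault(LiquidBridgeError):
    code, retriable = "ADAPTER_FAULT", False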
The four patterns below address these mismatches directly.
The diagram shows the four layers we landed on. The MCP transport speaks JSON-RPC to the agent. The validation layer rejects malformed or unsafe calls before they reach the SDK. The vendor adapter is the only code that touches vendor types. The event bus and the simulation backend run in parallel and are addressable as MCP resources, not just tool returns.
Pattern 1 - The Vendor Adapter Boundary
The first lesson: keep the vendor SDK out of every other layer.
In LiquidBridge, exactly one module imports the vendor SDK. Everything else - tool handlers, validators, the simulator, the event bus - speaks our internal types: Volume, WellAddress, TipBoxLayout, PipetteOperation. Vendor types stop at the adapter.
The structure is dull on purpose:
# liquidbridge/adapters/base.py
from abc import ABC, abstractmethod
from liquidbridge.types import (
Volume, WellAddress, PipetteOperation, OperationResult, DeckLayout
)
class LiquidHandlerAdapter(ABC):
"""The only contract the rest of the codebase depends on."""
@abstractmethod
async def load_deck(self, layout: DeckLayout) -> None: ...
@abstractmethod
async def aspirate(self, op: PipetteOperation) -> OperationResult: ...
@abstractmethod
async def dispense(self, op: PipetteOperation) -> OperationResult: ...
@abstractmethod
async def tip_pickup(self, channel: int, tip_address: WellAddress) -> OperationResult: ...
@abstractmethod
async def get_state(self) -> dict: ...
A concrete adapter for a Hamilton-class deck looks like this. Note that vendor imports are local to this file and that every vendor exception is translated:
# liquidbridge/adapters/hamilton_adapter.py
from liquidbridge.types import PipetteOperation, OperationResult, Volume
from liquidbridge.adapters.base import LiquidHandlerAdapter
from liquidbridge.errors import (
HardwareTimeout, DeckCollision, OutOfTips, AdapterFault,
)
# Vendor import lives ONLY here. Replace with the real SDK module name.
from vendor_sdk import LiquidHandler as VendorHandler # type: ignore
from vendor_sdk.errors import (
VendorTimeoutError, VendorCollisionError,
VendorTipError, VendorGenericError,
)
class HamiltonAdapter(LiquidHandlerAdapter):
def __init__(self, handler: VendorHandler):
self._h = handler
async def aspirate(self, op: PipetteOperation) -> OperationResult:
try:
await self._h.pipette(
channel=op.channel,
well=str(op.well), # vendor wants strings like "A1"
volume_ul=op.volume.as_microliters(),
liquid_class=op.liquid_class,
mode="aspirate",
)
return OperationResult.ok(operation_id=op.id)
except VendorTimeoutError as e:
raise HardwareTimeout(op.id, str(e)) from e
except VendorCollisionError as e:
raise DeckCollision(op.id, str(e)) from e
except VendorTipError as e:
raise OutOfTips(op.id, str(e)) from e
except VendorGenericError as e:
raise AdapterFault(op.id, str(e)) from e
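    # dispense, tip_pickup, load_deck, and get_state follow the same
    # translate-at-the-boundary pattern and are omitted here for brevity.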
The payoff shows up the day a customer asks "can you also support a Tecan Fluent?" Adding FluentAdapter is a single file. The MCP server, validators, simulator, and event bus do not change.
This is the same separation pattern that PyLabRobot - the open-source hardware-agnostic liquid-handling library - uses with its Backend abstraction, and it is the reason PyLabRobot can target Hamilton STAR, Opentrons, and Tecan from a single user-facing API. If you are building an MCP server for laboratory instruments, copy the boundary even if you are starting with one vendor. You will eventually have two.
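With the boundary in place, vendor selection reduces to a config switch. A sketch, assuming a from_config constructor and a fluent_adapter module that do not appear above:
# liquidbridge/adapters/factory.py - sketch only. from_config and the
# fluent_adapter module are assumptions for illustration.
from liquidbridge.adapters.base import LiquidHandlerAdapter

def make_adapter(vendor: str, **config) -> LiquidHandlerAdapter:
    # Adapter modules are imported lazily, so a missing vendor SDK only
    # breaks the vendor that needs it.
    if vendor == "hamilton":
        from liquidbridge.adapters.hamilton_adapter import HamiltonAdapter
        return HamiltonAdapter.from_config(**config)
    if vendor == "tecan_fluent":
        from liquidbridge.adapters.fluent_adapter import FluentAdapter
        return FluentAdapter.from_config(**config)
    raise ValueError(f"unsupported vendor: {vendor!r}")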
Pattern 2 - Pre-Tool-Call Validation
The second lesson: cheap, deterministic checks must happen before the SDK gets the call.
MCP gives you JSON Schema validation on tool inputs for free. That handles type errors and out-of-range scalars. It does not handle the lab-specific failure modes that actually kill protocols:
- Aspirating from a well that has no liquid loaded in the deck state.
- Dispensing volume that exceeds the destination well's working volume.
- Picking a tip from a tip box that the simulator says is empty.
- Sending a sequence of moves that crosses an obstacle on the deck.
LiquidBridge layers a semantic validator between the MCP transport and the adapter. It runs against the digital-twin deck state, not the physical instrument, so it is fast and free. We reject roughly 12% of agent-generated tool calls at this layer in production - and every rejection costs zero hardware time.
# liquidbridge/validators/pipette_validator.py
from liquidbridge.types import PipetteOperation
from liquidbridge.state import DeckState
from liquidbridge.errors import ValidationError
class PipetteValidator:
def __init__(self, deck: DeckState):
self._deck = deck
def check(self, op: PipetteOperation) -> None:
well = self._deck.well(op.well)
if op.mode == "aspirate":
if well.liquid_volume_ul < op.volume.as_microliters():
raise ValidationError(
code="INSUFFICIENT_VOLUME",
detail=(
f"well {op.well} has {well.liquid_volume_ul} uL, "
f"command requested {op.volume.as_microliters()} uL"
),
fix_hint="aspirate from a well with sufficient volume, "
"or run a transfer step first",
)
if op.mode == "dispense":
headroom = well.max_volume_ul - well.liquid_volume_ul
if op.volume.as_microliters() > headroom:
raise ValidationError(
code="WELL_OVERFLOW",
detail=(
f"well {op.well} has {headroom} uL headroom, "
f"command would dispense {op.volume.as_microliters()} uL"
),
fix_hint="reduce volume or split across wells",
)
if not self._deck.tip_attached(op.channel):
raise ValidationError(
code="NO_TIP",
detail=f"channel {op.channel} has no tip attached",
fix_hint="call tip_pickup before aspirate or dispense",
)
Two design choices that look small but matter:
Errors carry fix_hint. When the agent gets a validation error, it gets a structured, machine-readable suggestion of what to do next. This is critical for autonomous loops - without it the agent retries the same call. With it, the agent reads the hint and emits a tip_pickup first. The 2026 MCP roadmap is moving in the same direction with structured error semantics.
Validation runs against DeckState, not the SDK. That means the validator works in CI with no instrument attached. We have ~600 unit tests that exercise the validator against synthetic deck states. They all run in under 4 seconds.
A validator that runs only on the hardware is a validator that nobody runs.
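A representative test looks like this - a sketch, assuming DeckState helpers (empty, load_liquid, attach_tip) and a Volume.microliters constructor that are not shown above:
# tests/test_pipette_validator.py - sketch of a hardware-free CI test.
import pytest
from liquidbridge.state import DeckState
from liquidbridge.types import PipetteOperation, Volume, WellAddress
from liquidbridge.errors import ValidationError
from liquidbridge.validators.pipette_validator import PipetteValidator

def test_aspirate_rejected_when_well_underfilled():
    deck = DeckState.empty()                           # assumed helper
    deck.load_liquid(WellAddress("A1"), volume_ul=50)  # assumed helper
    deck.attach_tip(channel=0)                         # assumed helper
    op = PipetteOperation(
        id="op-1", mode="aspirate", channel=0,
        well=WellAddress("A1"), volume=Volume.microliters(100),  # assumed constructor
        liquid_class="water",
    )
    with pytest.raises(ValidationError) as err:
        PipetteValidator(deck).check(op)
    assert err.value.code == "INSUFFICIENT_VOLUME"
    assert err.value.fix_hint  # the agent gets a next action, not just a refusal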
Pattern 3 - Fire-and-Poll for Long-Running Operations (with Optional Subscribe)
The third lesson: a "run protocol" tool that returns a single string after 45 minutes is unusable. But the obvious fix - "subscribe the agent to a streaming event channel" - is not as solved in 2026 as the MCP spec makes it look.
MCP tools are request/response by default. Most lab operations are not. A 96-well transfer can take 8 minutes. A full PCR setup with reagent prep, mixing, and plate sealing can take an hour. Agents need to see progress, react to intermediate state, and abort if something looks wrong.
The honest state of MCP for long-running ops in early 2026
- The MCP 2025-06-18 spec gives you notifications/progress (best-effort, not guaranteed delivery) and resource subscriptions (resources/subscribe). Both work over the Streamable HTTP transport.
- Most agent clients today - Claude Code, Claude Desktop, OpenAI Agents SDK, CrewAI, AG2, PydanticAI - do not surface mid-tool-call progress notifications or resource updates to the model. They show them in a UI, drop them at the transport, or wire them into callbacks that never make it back into the agent's reasoning context. Tracker issues: Claude Code #4157, OpenAI Agents SDK #661, PydanticAI #4266.
- LangGraph + langchain-mcp-adapters is the one mainstream stack that exposes onProgress and onResourcesUpdated callbacks today - though delivery over Streamable HTTP has a known open bug.
- MCP 2025-11-25 introduced Tasks (SEP-1686) as the experimental sanctioned long-running pattern: tools/call returns a task object immediately; the client uses tasks/get, tasks/result, and tasks/cancel to interact. FastMCP 2.14+ ships a complete server-side implementation. Client adoption is the gap.
The 2026 MCP roadmap prioritizes Tasks lifecycle hardening. Native streaming tool output is described as "on the horizon" with no maintainer leadership yet. Plan accordingly.
Pragmatic pattern - poll for truth, listen for speed
Implementing both shapes side-by-side is the bet that ages well. The polling layer is the universal fallback every MCP client can use; the subscribe layer is gravy for clients that grow into it.
# liquidbridge/server/tools/run_protocol.py
from fastmcp import FastMCP
from liquidbridge.runtime import ProtocolRuntime, EventLog
server = FastMCP("liquidbridge")
runtime = ProtocolRuntime()
events = EventLog()
@server.tool(task=True) # FastMCP 2.14+ implements MCP Tasks (SEP-1686).
# task-aware clients get tasks/get + tasks/cancel; older
# clients get a clean error if the call would exceed budget.
async def run_protocol(protocol: dict, dry_run: bool = True) -> dict:
    """Start a liquid-handling protocol. Long-running. Simulation-only
    unless dry_run=False is passed explicitly (see Pattern 4)."""
run_id = await runtime.start(protocol, dry_run=dry_run)
final = await runtime.wait(run_id)
return {"run_id": run_id, "status": final.status, "transfers": final.transfers}
@server.tool
async def run_events(run_id: str, since_seq: int = 0, max_events: int = 100) -> dict:
    """Drain typed events for a run. Universal polling fallback for clients
    that do not yet support MCP Tasks or resource subscriptions."""
    return {"events": [e.to_dict() for e in events.read(run_id, since_seq, max_events)]}
@server.resource("liquidbridge://runs/{run_id}")
async def run_resource(run_id: str):
"""Subscribable resource. Update notifications fire on every state
transition. Useful for LangGraph + langchain-mcp-adapters today; will
be useful for more clients as they grow resource-subscription support."""
state = await runtime.snapshot(run_id)
return {"text": state.to_json(), "mimeType": "application/json"}
@server.tool
async def cancel_run(run_id: str) -> dict:
"""Cancel a running protocol. Safe to call multiple times."""
await runtime.cancel(run_id)
return {"status": "cancellation_requested"}
Events are typed. The runtime emits a defined set: step_started, step_completed, liquid_transferred, tip_picked_up, tip_dropped, warning, error, protocol_completed. The agent always knows what shape to expect:
type LiquidBridgeEvent =
  | { type: "step_started"; step_index: number; description: string; eta_seconds: number }
  | { type: "step_completed"; step_index: number; duration_seconds: number }
  | { type: "liquid_transferred"; from: string; to: string; volume_ul: number }
  | { type: "tip_picked_up"; channel: number; tip_address: string }  // fields illustrative
  | { type: "tip_dropped"; channel: number }                         // fields illustrative
  | { type: "warning"; code: string; detail: string; recoverable: boolean }
  | { type: "error"; code: string; detail: string; fix_hint?: string }
  | { type: "protocol_completed"; total_duration_seconds: number; transfers: number };
The structural cost: every long-running call now produces a run_id the agent must thread through subsequent calls. That is a prompt-engineering tax - the agent must remember to poll - but it is the same tax the AWS SDK, Stripe API, and Temporal SDK all charge. Solved problem.
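In practice the tax is one small loop. Here is a sketch of what an agent harness runs after run_protocol returns, where call_tool stands in for whatever MCP client you use and since_seq is treated as an offset into the event log:
# Sketch of the agent-side polling loop. `call_tool` is a placeholder for
# your MCP client's tool-invocation call; it is not a LiquidBridge API.
import asyncio

async def follow_run(call_tool, run_id: str) -> None:
    seq = 0
    while True:
        page = await call_tool("run_events", {"run_id": run_id, "since_seq": seq})
        for event in page["events"]:
            if event["type"] == "error":
                raise RuntimeError(f"{event['code']}: {event['detail']}")
            if event["type"] == "protocol_completed":
                return
        seq += len(page["events"])  # since_seq is an offset into the append-only log
        await asyncio.sleep(2.0)    # polling cadence is a tunable trade-off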
A subtle thing we got wrong on the first attempt: we initially streamed events only as MCP notifications. They worked over a stable connection but were not durable - if the agent disconnected and reconnected, the event history was lost, and most clients did not surface the notifications to the model anyway. The fix was the append-only EventLog backing run_events() - the agent rebuilds full operation context with one polling call after reconnect, regardless of which client it is using. Subscribe-style is the right MCP-native answer; durable poll is the answer that actually works on every client today.
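A minimal in-memory sketch of that log; the production version swaps the dict for durable storage, which is the whole point:
# liquidbridge/runtime/event_log.py - in-memory sketch. The dict here is a
# stand-in for a real durable store.
from collections import defaultdict

class EventLog:
    def __init__(self):
        self._runs: dict[str, list] = defaultdict(list)

    async def emit(self, run_id: str, event) -> None:
        # Append-only: nothing is mutated or dropped, so a reconnecting
        # client can replay from any offset.
        self._runs[run_id].append(event)

    def read(self, run_id: str, since_seq: int = 0, max_events: int = 100) -> list:
        # since_seq is the offset of the first unseen event.
        return self._runs[run_id][since_seq:since_seq + max_events]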
Pattern 4 - Plan Simulation Before Physical Execution
The fourth lesson: never let an agent's first contact with the physical instrument be a real move.
Every protocol that LiquidBridge runs goes through a simulator first. The simulator is a digital twin of the deck: every tip box, every well, every liquid volume, every channel position. It executes the same protocol code path the real adapter executes, against an in-memory DeckState, and produces the same event stream.
This is the same approach that Opentrons ships with opentrons.simulate.get_protocol_api() and opentrons_simulate, and that PyLabRobot ships as SimulatorBackend. The pattern is well-established. The lesson for an MCP server is to make simulation the default, not an opt-in.
In LiquidBridge, every run_protocol call goes through a two-pass execution:
# liquidbridge/runtime/protocol_runtime.py
import asyncio

# Module paths for these imports are assumed; the names mirror usage elsewhere.
from liquidbridge.errors import SimulationFailed
from liquidbridge.events import ErrorEvent, ProtocolCompletedEvent
from liquidbridge.ids import new_operation_id

class ProtocolRuntime:
    def __init__(self, real_adapter, sim_adapter, validator, events):
        self._real = real_adapter
        self._sim = sim_adapter
        self._validator = validator
        self._events = events
        self._tasks: dict[str, asyncio.Task] = {}

    async def start(self, protocol: dict, dry_run: bool = True) -> str:
        op_id = new_operation_id()
        # Pass 1: simulate, validating every step against the digital twin.
        # If anything fails here, the real instrument never moves.
        sim_result = await self._execute(protocol, self._sim, op_id, simulated=True)
        if not sim_result.ok:
            await self._events.emit(op_id, ErrorEvent(
                code="SIMULATION_FAILED",
                detail=sim_result.error,
                fix_hint=sim_result.fix_hint,
            ))
            raise SimulationFailed(sim_result.error)
        if dry_run:
            await self._events.emit(op_id, ProtocolCompletedEvent(
                total_duration_seconds=sim_result.duration,
                transfers=sim_result.transfers,
                simulated=True,
            ))
            return op_id
        # Pass 2: real execution against the vendor adapter, launched as a
        # background task so start() returns immediately and cancel_run()
        # has something to reach.
        self._tasks[op_id] = asyncio.create_task(
            self._execute(protocol, self._real, op_id, simulated=False)
        )
        return op_id
The simulator pays for itself in three ways:
Catches structural errors that schema validation cannot. A protocol that picks 8 tips, uses 6, then asks for 4 more without dropping the existing ones - that is a programming error, not a schema error. The simulator catches it.
Lets agents iterate cheaply. A 45-minute protocol takes ~120ms to simulate. An agent can plan, validate, and reject hundreds of variants before committing to one. This is the main reason a digital twin laboratory approach actually saves time on autonomous workflows.
Provides a safety boundary that customers trust. Lab managers approve "this protocol is allowed to run" by reviewing the simulator output, not raw MCP calls. The simulator's event stream is human-readable; the MCP wire format is not.
The flow diagram traces a single run_protocol call from agent to hardware. Each gate (schema, semantic, simulation) can reject the call before any physical motion. Only after all three pass does the vendor adapter receive the operation, and even then the agent watches progress through the event channel rather than blocking.
What We Would Do Differently
Two things stand out at 18 months in.
Treat the simulator as a first-class MCP resource from day one. We initially exposed simulation results only as part of run_protocol returns. Customers wanted to inspect the deck state, the tip boxes, the liquid volumes - mid-run. We retrofitted liquidbridge://state/deck and liquidbridge://state/tips/{box_id} as MCP resources later. Doing that on day one would have saved a refactor and would have unlocked agent behaviors we did not anticipate.
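The retrofit itself is small - a sketch, assuming a deck_snapshot accessor on the runtime that is not shown above:
# Sketch of the retrofitted state resources. deck_snapshot() and tip_box()
# are assumed accessors; the URIs match the ones named above.
@server.resource("liquidbridge://state/deck")
async def deck_resource():
    deck = runtime.deck_snapshot()
    return {"text": deck.to_json(), "mimeType": "application/json"}

@server.resource("liquidbridge://state/tips/{box_id}")
async def tip_box_resource(box_id: str):
    box = runtime.deck_snapshot().tip_box(box_id)
    return {"text": box.to_json(), "mimeType": "application/json"}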
Make dry_run the default. Our first version executed real moves unless dry_run=true was passed. The first time an agent forgot the flag, it cost a real plate. We flipped the default; now dry_run=false must be passed explicitly, and a config flag at the server level can disable real execution entirely for a given deployment. If you are building an MCP server for laboratory instruments, default to safe.
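The server-level switch is a few lines. A sketch, with the environment variable name as an assumption:
# Deployment-level kill switch: no real motion unless explicitly enabled.
# The variable name is illustrative.
import os

REAL_EXECUTION_ENABLED = os.environ.get("LIQUIDBRIDGE_ALLOW_REAL_EXECUTION") == "1"

async def guarded_start(runtime, protocol: dict, dry_run: bool = True) -> str:
    if not dry_run and not REAL_EXECUTION_ENABLED:
        raise PermissionError(
            "real execution is disabled for this deployment; "
            "set LIQUIDBRIDGE_ALLOW_REAL_EXECUTION=1 to enable"
        )
    return await runtime.start(protocol, dry_run=dry_run)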
Frequently Asked Questions
What is an MCP server for laboratory instruments?
An MCP server for laboratory instruments is a process that exposes an instrument's capabilities as Model Context Protocol tools, so AI agents can discover, validate, and call them through one open standard. It typically wraps a vendor SDK, REST API, or serial protocol and translates between the agent's MCP calls and the instrument's native interface. See our MCP architecture guide for a full code walkthrough.
Why wrap a vendor SDK in MCP instead of calling the SDK directly?
Direct SDK calls couple your AI agent to one vendor, one programming language, and one error model. An MCP server normalizes those into a single protocol, lets the agent discover capabilities at runtime, and adds a validation layer that prevents unsafe calls from reaching hardware. It is also reusable - the same MCP server works with any agent (Claude, GPT, custom) without changes.
How do you handle long-running protocols in an MCP server?
Use the MCP Tasks pattern from spec 2025-11-25 (SEP-1686) - the tool returns a task object immediately, and the agent uses tasks/get and tasks/result to poll. Expose a plain request/response run_events(run_id, since_seq) tool as a universal fallback for clients that do not yet support Tasks. Mirror each running protocol as a subscribable resource at <server>://runs/{run_id} for the one mainstream stack (LangGraph + langchain-mcp-adapters) that surfaces resource updates to the model today. Add a cancel_run tool that propagates to the vendor SDK abort path. Poll for truth, listen for speed.
How do you prevent an AI agent from breaking the instrument?
Run three layers of validation before any physical motion: JSON Schema validation on tool inputs (handled by MCP), semantic validation against a digital-twin deck state (rejects ~12% of calls in our production data), and full plan simulation against an in-memory adapter. Default dry_run=true at the deployment level so a missed flag never causes physical motion.
Should the simulator be a separate service or inside the MCP server?
Inside the same process, sharing the deck state with the validator. Running it as a separate service forces you to synchronize state across the wire and adds a class of consistency bugs. Keeping it in-process means the validator, simulator, and real adapter all see the same DeckState, which is the only sane way to get reliable plan validation.
Key Takeaways
- A production MCP server for laboratory instruments needs four layers: a vendor adapter, semantic validation, an async event channel, and full plan simulation.
- Vendor SDK imports must live in exactly one file. Translate every vendor exception into your own error taxonomy at the boundary.
- Pre-tool-call validation against digital-twin deck state catches the failures that JSON Schema cannot - we reject ~12% of agent calls at this layer before any hardware sees them.
- Long-running operations should use MCP Tasks (SEP-1686) where the client supports them, with a polling run_events(since_seq) tool as the universal fallback and an optional subscribable resource for the LangGraph stack. Subscribe-style is the right MCP-native answer; durable poll is what works on every client today.
- Default to dry_run=true. The first agent that forgets the flag will cost you a real plate.
Written by Iacob Marian, Technical Lead and Co-founder at QPillars. Published April 30, 2026. QPillars builds LiquidBridge, the digital twin and MCP layer for liquid-handling robots, from Zürich, Switzerland.
Iacob builds intelligent software infrastructure for life sciences laboratories, with a focus on Rust for instrument control and agentic AI for lab automation.