A practice in applied machine learning

Rigour at the edge of inference.

We design, build, and audit machine-learning systems for organisations that need their models to be correct, not merely impressive. Our work spans architecture research, production engineering, and the evaluation frameworks that connect them.

§ 01 — Practice

Four practice areas, one method.

Every engagement begins with a derivation. We ask what the system is actually doing — mathematically — before we ask what it should do differently. The four areas below describe where we apply that habit.

01

Architecture & Research

Bespoke model architectures for domain-specific constraints. State-space models, structured attention, hybrid symbolic–neural systems. Where the shelf does not have what you need.

→ /practice
02

Systems Engineering

Retrieval, fine-tuning, agentic workflows, and the production infrastructure to support them. Built to be debugged at three in the morning by engineers who did not write it.

→ /practice
03

Evaluation & Assurance

Evaluation harnesses, red-teaming, numerical and mathematical audits of existing models. We find the failure modes that internal teams stop looking for after the second sprint.

→ /practice
04

Advisory

Pre-publication research review, technical due diligence, bespoke workshops on the mathematics of modern ML. Retainers available for teams that want a standing second opinion.

→ /practice

§ 02 — Method

Derivation before deployment.

The dominant failure mode in production machine learning is not bad engineering. It is engineering done in the absence of a clear account of what the system is supposed to be approximating. Loss functions are chosen by precedent, architectures by familiarity, hyperparameters by grid search.

Our practice begins one step earlier. Before we touch a training run, we write down — explicitly — the mathematical object the model is standing in for. Only then do we choose an architecture, and only then do we know how to evaluate it.
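As a minimal illustration of what "writing down the mathematical object" can mean (the example is ours, not a description of any particular engagement): a supervised model trained under squared-error loss is, at its best, an approximation of a conditional expectation.

```latex
% A model \hat{f} trained with squared-error loss stands in for
% the minimiser of the expected loss, which is the conditional mean:
\hat{f} \;\approx\; f^* \;=\; \arg\min_{f}\; \mathbb{E}\big[(Y - f(X))^2\big],
\qquad f^*(x) \;=\; \mathbb{E}[\,Y \mid X = x\,].
```

Knowing that the target is a conditional mean, rather than a median or a full conditional distribution, already constrains which architectures are suitable and which evaluation metrics are honest.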

This is slower at the start of an engagement. It is dramatically faster by the middle of one. It is the reason we are willing to work in domains where most teams stop: state-space models for long-context inference, hybrid systems where symbolic structure cannot be discarded, architectures where the geometry of the problem dictates the shape of the solution.

→ Read the full methodology

§ 03 — Engage

We take a small number of engagements per quarter.

If your team is shipping models into production and you want a second set of eyes — or a first set of mathematically literate ones — we would like to hear from you. Initial conversations are confidential and unbilled.

Begin a conversation →