§ Methodology

How the work is done.

§ 01

First, the derivation.

Most production machine learning is built without an explicit account of what the model is approximating. Architectures are chosen by precedent, losses by familiarity, hyperparameters by search. The system that results is often workable. It is rarely understood.

We begin every engagement by writing the derivation down. What is the mathematical object the model is standing in for? What invariants must it preserve? Where is the geometry of the problem incompatible with the geometry of the architecture under consideration? These questions are answered before any code is written.
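The invariant question can often be answered executably before any model is trained. A minimal sketch, with an illustrative toy example of our own choosing: if the derivation says the model stands in for a function of a *set*, then permuting its inputs must not change its output, and that is a property you can assert against the architecture directly.

```python
import numpy as np

rng = np.random.default_rng(0)

def set_model(points):
    """Toy set encoder: a per-point transform followed by mean pooling.

    Mean pooling makes the output independent of point order, so the
    architecture matches the permutation invariance of the object it
    stands in for. A sequence model in the same place would not.
    """
    weights = np.array([[0.5, -0.2], [0.1, 0.9]])
    hidden = np.tanh(points @ weights)
    return hidden.mean(axis=0)

# The derivation's claim, stated as an assertion: reordering the set
# must leave the output unchanged.
points = rng.normal(size=(10, 2))
permuted = points[rng.permutation(10)]

assert np.allclose(set_model(points), set_model(permuted))
```

The point is not this particular invariant; it is that the check exists before the training run does, so an architecture incompatible with the problem's geometry is rejected on paper rather than diagnosed in production.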

This is not a stylistic preference. It is the difference between an ML system that fails gracefully and one that fails in ways its authors cannot diagnose.

§ 02

Then, the smallest thing that could work.

Once the problem is specified, we build the smallest system that could plausibly solve it. Not the most impressive. Not the most fashionable. The smallest. This is partly an aesthetic preference and partly a survival strategy: anything you cannot debug will eventually need debugging.

We treat every additional component — every layer of indirection, every extra service, every hyperparameter — as a debt to be justified rather than a feature to be added. The systems we ship are smaller than the ones we are usually asked to ship. They are also faster, cheaper, and easier to hand off.

§ 03

Evaluation that survives contact with the product.

Benchmarks are a starting point. They are not a destination. The evaluation harnesses we build are tied to the actual decisions a model is making in your product, not to the abstractions a paper found convenient. Where there is no good benchmark, we build one.

We treat evaluation as a first-class engineering deliverable. The harness is shipped alongside the model, runs on every change, and is written so that someone who did not build the model can use it to decide whether the next version is better.
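The shape of such a harness can be very small. A sketch under illustrative assumptions (the case structure, metric, and margin here are placeholders, not a real engagement's harness): the cases encode product decisions, and the one exported question is whether the next version is better.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass(frozen=True)
class Case:
    """One product decision the model is responsible for."""
    inputs: dict
    expected_action: str

def evaluate(model: Callable[[dict], str], cases: Sequence[Case]) -> float:
    """Fraction of product decisions the model gets right."""
    hits = sum(model(c.inputs) == c.expected_action for c in cases)
    return hits / len(cases)

def is_better(candidate, baseline, cases, margin: float = 0.01) -> bool:
    """The question the harness exists to answer, runnable by someone
    who did not build either model."""
    return evaluate(candidate, cases) >= evaluate(baseline, cases) + margin

# Illustrative usage with trivial stand-in models.
cases = [
    Case({"score": 0.9}, "approve"),
    Case({"score": 0.2}, "decline"),
]
baseline = lambda x: "approve"
candidate = lambda x: "approve" if x["score"] > 0.5 else "decline"
print(is_better(candidate, baseline, cases))
```

Everything else in a real harness — the case corpus, the metric, the significance threshold — is engagement-specific; what carries over is that the decision procedure is code, versioned next to the model, and runs on every change.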

§ 04

Handover, not lock-in.

Every engagement ends with a handover. Code, data pipelines, evaluation harnesses, internal documentation, a recorded walkthrough for the team inheriting the work. We write our systems to be picked up by other engineers, because that is the only kind of system worth shipping.

We do not retain dependencies on our presence. We do retain a standing offer to be reached when something goes sideways.