
From 5 lab runs to production-scale COGS: how Augur's prediction pipeline works

A technical walkthrough of the four-stage pipeline Augur uses to turn 5-10 lab fermentations into production-scale titer, rate, yield, and COGS/kg predictions. Covers ODE fitting, PINN residual learning, hydrodynamic scale-up, DSP unit operations, and conformal uncertainty.

RenewVerse Research • April 19, 2026 • 10 min read

A founder emailed us last month with the same question we hear from every bioprocess engineer: how can you possibly predict 20,000 L performance from 5 fermentation runs at 4 L? Traditional CFD simulations need geometry, mesh, and hours of compute. Pure machine learning models need hundreds of runs and still fail to extrapolate. We built Augur because we thought there was a third path: embed the physics of fermentation into a neural network so the model only has to learn small corrections on top of the physics. This post walks through the four-stage pipeline that turns a CSV of lab fermentations into a production-scale COGS/kg prediction in under a minute.

The four stages

ODE fit. A Monod-family ordinary differential equation is fit to every uploaded run using multi-start gradient descent. This baseline captures growth, substrate consumption, product formation, and oxygen dynamics to roughly 15-25% MAPE.
PINN residual learning. An ensemble of 5 physics-informed neural networks learns the residual between the ODE baseline and the actual data. Physics constraints (mass balance, product inhibition, Arrhenius temperature response, optional overflow metabolism) are encoded as terms in the loss function, so the network cannot learn solutions that violate conservation laws. Typical post-PINN error is 5-12% MAPE.
Hydrodynamic scale-up. Reduced-order models recompute kLa, mixing time, tip speed, power input, and DO profile at the target production volume and aspect ratio. These new physics inputs go into the PINN ensemble as production anchor conditions.
DSP + economics. A chain of 8-9 downstream unit operations (centrifugation, filtration, chromatography, UF/DF, extraction, precipitation, crystallization, evaporation, foam fractionation) converts titer and broth properties into a product mass balance. Feedstock cost, labor with facility sharing, equipment depreciation, utilities, and consumables roll up to COGS/kg with Monte Carlo confidence intervals.

Stage 1: ODE fit

The ODE is deliberately boring. It has five state variables (biomass, glucose, product, dissolved oxygen, volume) plus an optional sixth variable for overflow metabolites (acetate in E. coli, ethanol in yeast). Parameters include mu_max, Ks, Yxs, Yps, kLa, the product inhibition constant Kp, and Arrhenius Ea/T_opt. Fitting draws 5 starting points within the parameter bounds by Latin hypercube sampling, then runs gradient descent from each one to a local minimum. A quality gate classifies the fit as good, fair, or poor based on per-variable R² and MAPE. Poor fits fall back to absolute (non-residual) PINN mode, which is less accurate but more robust when the ODE family doesn't match the biology.
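To make the baseline concrete, here is a minimal Python sketch of a Monod-type batch model with product inhibition, integrated with a fixed-step fourth-order Runge-Kutta scheme. It is an illustration under stated assumptions, not Augur's implementation: it tracks only biomass, substrate, and product (no dissolved oxygen, volume, or overflow state), it assumes a linear inhibition term (1 - P/Kp), and the names MonodParams, rhs, rk4_step, and simulate are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MonodParams:
    mu_max: float  # maximum specific growth rate, 1/h
    Ks: float      # half-saturation constant, g/L
    Yxs: float     # biomass formed per substrate consumed, g/g
    Yps: float     # product formed per substrate consumed, g/g
    Kp: float      # product concentration at which growth stops, g/L


def rhs(state, p):
    """Time derivatives (dX/dt, dS/dt, dP/dt) for a batch culture."""
    X, S, P = state
    S = max(S, 0.0)  # guard against tiny negative substrate values
    inhibition = max(0.0, 1.0 - P / p.Kp)  # assumed linear inhibition form
    mu = p.mu_max * S / (p.Ks + S) * inhibition
    uptake = mu * X / p.Yxs  # substrate consumption rate, g/L/h
    return (mu * X, -uptake, p.Yps * uptake)


def rk4_step(state, p, dt):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = rhs(state, p)
    k2 = rhs(shifted(state, k1, dt / 2), p)
    k3 = rhs(shifted(state, k2, dt / 2), p)
    k4 = rhs(shifted(state, k3, dt), p)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(p, x0, s0, p0=0.0, hours=24.0, dt=0.05):
    """Return the trajectory as a list of (X, S, P) tuples, one per step."""
    state = (x0, s0, p0)
    trajectory = [state]
    for _ in range(int(round(hours / dt))):
        state = rk4_step(state, p, dt)
        trajectory.append(state)
    return trajectory


# Made-up parameters, for illustration only.
params = MonodParams(mu_max=0.4, Ks=0.5, Yxs=0.5, Yps=0.2, Kp=50.0)
final_X, final_S, final_P = simulate(params, x0=0.1, s0=20.0)[-1]
```

With these made-up parameters the substrate is exhausted well before 24 h and biomass plateaus near X0 + Yxs * S0, about 10.1 g/L. A fit would wrap simulate in a loss against measured data and restart the optimizer from each sampled parameter set.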

Why bother with the ODE if the PINN is the accurate model? Because extrapolating a neural network from lab to production scale is a terrible idea. The ODE gives us a physics-grounded baseline at production scale. The PINN then only has to correct that baseline. If the corrections are small (typical case), the prediction inherits the ODE's stability. If the corrections are large (bad ODE fit), the platform warns the user and widens confidence intervals.

Stage 2: PINN residual learning

Instead of predicting the full time series Y(t) directly, each network predicts delta(t) = Y_actual(t) - Y_ode(t). Three design choices matter:

Residual normalization. Residuals are typically much smaller than the signal itself. We normalize residuals separately from base signals and floor the range at 0.1 to avoid numerical blow-up when the ODE is already near-perfect.
Data weighting. Lab data points are weighted 3x relative to ODE-derived production anchors during training. This keeps the network from over-trusting synthetic anchors in regions where lab data is sparse.
Physics loss with dynamic weighting. Conservation constraints are enforced with a loss weight that ramps from 0.1 to 1.0 over 500 epochs. Without ramping, the physics term dominates early training and the network can't fit the data. Without the loss entirely, the network can learn non-physical solutions at extrapolation points.
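The three choices above can be sketched in a few lines of plain Python. This is a hedged illustration, not the platform's code: it assumes min-max scaling for the residual normalization, a linear ramp for the physics weight, and a weighted mean-squared-error data term. The function names are hypothetical.

```python
def residual_targets(y_actual, y_ode):
    """delta(t) = Y_actual(t) - Y_ode(t), the quantity each network predicts."""
    return [a - o for a, o in zip(y_actual, y_ode)]


def normalize_residuals(delta, floor=0.1):
    """Min-max scale residuals, flooring the range so a near-perfect ODE
    fit (tiny residuals) does not cause a division by almost zero."""
    lo, hi = min(delta), max(delta)
    span = max(hi - lo, floor)
    return [(d - lo) / span for d in delta], lo, span


def physics_weight(epoch, start=0.1, end=1.0, ramp_epochs=500):
    """Physics-loss weight, ramped linearly (assumed shape) from start to end."""
    frac = min(max(epoch / ramp_epochs, 0.0), 1.0)
    return start + (end - start) * frac


def total_loss(data_errors, is_lab, physics_residuals, epoch, lab_weight=3.0):
    """Weighted data MSE plus ramped physics penalty.

    data_errors: prediction minus target at each training point
    is_lab: True for lab measurements, False for ODE-derived anchors
    physics_residuals: violations of the conservation constraints
    """
    weights = [lab_weight if lab else 1.0 for lab in is_lab]
    data_term = sum(w * e * e for w, e in zip(weights, data_errors)) / sum(weights)
    physics_term = sum(r * r for r in physics_residuals) / len(physics_residuals)
    return data_term + physics_weight(epoch) * physics_term
```

Early in training physics_weight returns 0.1, so the data term dominates. From epoch 500 onward the physics penalty counts at full weight.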

The ensemble of 5 independently initialized networks gives us epistemic uncertainty essentially for free. Disagreement across the ensemble flags regions where the model is uncertain, which usually correlates with regions of sparse training data.
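As a rough sketch of how ensemble disagreement becomes an uncertainty signal, the snippet below takes the per-model predictions at each time point and reports their mean and sample standard deviation. The relative-spread threshold used for flagging is a hypothetical choice for illustration, not a documented platform setting.

```python
from statistics import mean, stdev


def ensemble_summary(predictions):
    """predictions: one trajectory (list of floats) per ensemble member.

    Returns per-time-point means and sample standard deviations.
    """
    means, spreads = [], []
    for values in zip(*predictions):  # values = all members at one time point
        means.append(mean(values))
        spreads.append(stdev(values))
    return means, spreads


def uncertain_points(means, spreads, rel_threshold=0.1):
    """Indices where the ensemble spread exceeds a fraction of the mean."""
    return [
        i
        for i, (m, s) in enumerate(zip(means, spreads))
        if m != 0 and s / abs(m) > rel_threshold
    ]


# Five hypothetical ensemble members, two time points each (titer in g/L).
members = [[10.0, 30.0], [10.0, 32.0], [10.0, 28.0], [10.0, 35.0], [10.0, 25.0]]
means, spreads = ensemble_summary(members)
flagged = uncertain_points(means, spreads)
```

In this toy input the members agree at the first time point and diverge at the second, so only the second index is flagged.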

Stage 3: hydrodynamic scale-up

The core claim of Augur is that biology translates, physics doesn't. The organism's kinetics (learned from 5 L lab runs) are the same at 5 L and at 20,000 L. What changes between scales is:

kLa (oxygen transfer): lab vessels run at 150-400 h^-1. Industrial tanks with the same VVM often drop to 50-100 h^-1 because agitator tip speed is capped by cell shear limits. Lower kLa means oxygen limitation at lower cell density, which caps titer.
Mixing time (substrate homogenization): 20-90 seconds at production scale vs. 1-5 seconds in lab. Feed gradients become significant. Cells near the feed port see substrate spikes, while cells far from it starve.
Tip speed and shear: shear-sensitive strains (most mammalian lines, some yeasts) have a ceiling. Exceed it and viability drops regardless of nutrient conditions.
Hydrostatic pressure: a 20,000 L vessel with a 10 m liquid column has ~1 bar of head pressure at the bottom, which affects CO2 dissolution and pH.

We compute these using empirical correlations (Bakker, Van't Riet, Miller) rather than full CFD. The reduced-order approach is accurate to within about 20% for kLa in the 5-20 kL range, which is the relevant band for most customers. The model outputs are fed into the PINN ensemble as production anchor conditions, so the network's prediction reflects scale-appropriate physics, not lab conditions.
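For a feel of what a reduced-order calculation looks like, here is a small Python sketch of three of the quantities above. The kLa function uses the commonly quoted textbook coefficients for the Van't Riet stirred-tank correlation, kLa = C * (P/V)^a * vs^b. Those coefficients, the water-like broth density, and the function names are assumptions for illustration and are not Augur's calibrated values.

```python
import math

RHO_BROTH = 1000.0  # kg/m^3, water-like density (assumption)
G = 9.81            # m/s^2


def hydrostatic_head_bar(liquid_height_m, rho=RHO_BROTH):
    """Gauge pressure at the bottom of the liquid column, in bar."""
    return rho * G * liquid_height_m / 1e5


def tip_speed_m_s(impeller_diameter_m, rpm):
    """Impeller tip speed: circumference times revolutions per second."""
    return math.pi * impeller_diameter_m * rpm / 60.0


def kla_vant_riet_per_h(power_per_volume_w_m3, gas_velocity_m_s, coalescing=True):
    """Van't Riet-style kLa estimate, converted from 1/s to 1/h.

    Coefficients are the commonly quoted literature values for coalescing
    (water-like) and non-coalescing (ionic) media; treat them as placeholders.
    """
    if coalescing:
        c, a, b = 0.026, 0.4, 0.5
    else:
        c, a, b = 0.002, 0.7, 0.2
    kla_per_s = c * power_per_volume_w_m3 ** a * gas_velocity_m_s ** b
    return kla_per_s * 3600.0


head = hydrostatic_head_bar(10.0)        # 10 m liquid column
kla = kla_vant_riet_per_h(1000.0, 0.01)  # 1 kW/m^3, 1 cm/s superficial gas velocity
```

The 10 m column gives about 0.98 bar, matching the "~1 bar" figure above. The kLa value is only as good as the assumed power input and gas velocity, which is why estimates of this kind are usually reported with a tolerance rather than as a single number.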

Stage 4: DSP + economics

DSP is where most of the cost lives. A typical biosurfactant process might spend $25/kg on fermentation and $60/kg on downstream recovery, which puts DSP at roughly 70% of the $85/kg total. Get the DSP model wrong and your COGS estimate can be off by 3x. We chain unit operations through a registry pattern with organism-specific templates:

Monoclonal antibodies (CHO): centrifugation, depth filtration, Protein A chromatography, viral inactivation/filtration, ion exchange, UF/DF. ~60% overall DSP yield.
Recombinant proteins (E. coli): centrifugation, homogenization, inclusion body wash, refolding, chromatography. ~30-50% yield depending on refold efficiency.
Secreted yeast products: centrifugation, UF/DF, precipitation, drying. ~70-85% yield.
Biosurfactants: phase separation (gravity-based for sophorolipids, solvent extraction for MEL and rhamnolipids), foam fractionation as an alternative, UF concentration, spray drying. Industrial solvent recycling at 95% recovery is modeled explicitly.
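A registry of templates like the ones above could be sketched as follows. The per-step yields are invented for illustration and chosen only so that the chain lands inside the ~70-85% band quoted for secreted yeast products. UnitOp, REGISTRY, and the helper functions are hypothetical names, not the platform's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitOp:
    name: str
    step_yield: float  # fraction of product recovered by this step


REGISTRY = {}  # template name -> ordered list of unit operations


def register(template, ops):
    REGISTRY[template] = list(ops)


def chain_yield(template):
    """Overall DSP yield is the product of the step yields in the chain."""
    overall = 1.0
    for op in REGISTRY[template]:
        overall *= op.step_yield
    return overall


def recovered_kg(titer_g_l, volume_l, template):
    """Product mass per batch after downstream recovery."""
    return titer_g_l * volume_l / 1000.0 * chain_yield(template)


# Illustrative step yields only; real values depend on product and equipment.
register("secreted_yeast", [
    UnitOp("centrifugation", 0.95),
    UnitOp("UF/DF", 0.92),
    UnitOp("precipitation", 0.90),
    UnitOp("drying", 0.98),
])

batch_kg = recovered_kg(titer_g_l=50.0, volume_l=20_000.0, template="secreted_yeast")
```

Because yields multiply, four steps that each look respectable still lose almost a quarter of the product, which is one reason DSP assumptions dominate the final COGS.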

Each operation returns yield, solvent use, consumable cost, labor hours, and equipment wear. Aggregated across the chain, these feed a cost model that also includes media, utilities, depreciation, facility sharing (a small company running one 20,000 L tank amortizes fixed costs over far fewer kilograms than a large CMO), and Monte Carlo sampling over the 15-20 most sensitive cost inputs. The output is COGS/kg with an 80% confidence interval and a tornado chart showing which inputs drive the most variance.
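The Monte Carlo step can be illustrated with a deliberately small example: two cost buckets instead of 15-20 inputs, each perturbed by an independent normal multiplier, with the 80% interval read off the 10th and 90th percentiles of the sampled COGS. The cost figures reuse the $25/kg and $60/kg biosurfactant example at a hypothetical 10,000 kg/year. The distributions, relative standard deviations, and function names are assumptions.

```python
import random


def cogs_per_kg(annual_costs, annual_kg):
    """Total annual cost divided by annual product mass."""
    return sum(annual_costs.values()) / annual_kg


def monte_carlo_cogs(base_costs, annual_kg, rel_sd, n=5000, seed=0):
    """Return (p10, median, p90) of COGS/kg under sampled cost inputs.

    rel_sd maps a cost name to its relative standard deviation; each cost is
    multiplied by a normal factor centred on 1 and clipped at zero.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        sampled = {
            name: cost * max(0.0, rng.gauss(1.0, rel_sd.get(name, 0.0)))
            for name, cost in base_costs.items()
        }
        samples.append(cogs_per_kg(sampled, annual_kg))
    samples.sort()
    pick = lambda q: samples[int(q * (n - 1))]
    return pick(0.10), pick(0.50), pick(0.90)


# Hypothetical annual costs in dollars for 10,000 kg/year of product.
base = {"fermentation": 250_000.0, "dsp": 600_000.0}
low, median, high = monte_carlo_cogs(base, 10_000.0, {"fermentation": 0.1, "dsp": 0.2})
```

The point estimate here is $85/kg. A tornado chart would repeat the calculation while varying one input at a time and rank the inputs by the swing they produce.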

Uncertainty quantification

Ensemble spread gives epistemic uncertainty. Conformal prediction adds calibrated coverage. When a customer has enough data (typically 6+ lab runs), we hold out 20% for a calibration set, compute nonconformity scores, and use the empirical quantile to adjust interval widths. This turns "80% confidence" from a nominal claim into one backed by a coverage guarantee under the i.i.d. assumption.
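The calibration step described above follows the standard split-conformal recipe, sketched below with absolute error as the nonconformity score. This is a generic textbook version and may differ in detail from the platform's procedure. The function names are hypothetical.

```python
import math


def conformal_halfwidth(cal_actual, cal_pred, coverage=0.8):
    """Half-width that achieves the target coverage on exchangeable data.

    Uses the ceil((n + 1) * coverage)-th smallest absolute calibration error.
    Returns infinity when the calibration set is too small for the target.
    """
    scores = sorted(abs(a - p) for a, p in zip(cal_actual, cal_pred))
    n = len(scores)
    k = math.ceil((n + 1) * coverage)
    if k > n:
        return math.inf
    return scores[k - 1]


def conformal_interval(prediction, halfwidth):
    """Symmetric interval around a new point prediction."""
    return prediction - halfwidth, prediction + halfwidth


# Hypothetical held-out calibration points (g/L): measured vs. predicted titer.
measured = [41.0, 38.5, 44.0, 36.0, 40.5, 43.0, 39.0, 42.0, 37.5]
predicted = [40.0, 40.0, 41.0, 38.5, 40.0, 41.5, 41.0, 40.0, 39.0]
q = conformal_halfwidth(measured, predicted, coverage=0.8)
interval = conformal_interval(40.0, q)
```

Note the small-sample behavior: with only 3 calibration points, an 80% target cannot be certified and the function returns an infinite half-width, which is one reason conformal calibration needs a minimum amount of data.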

In practice, intervals at production scale run 30-50% wide relative to the point estimate. Tight intervals (say, ±5%) would be a warning sign: no one should claim a 20,000 L process hits 50.0 ± 2.5 g/L from 6 lab runs. The honest answer is a range, and we report the range.

When the pipeline struggles

Three situations reliably produce wider intervals or explicit warnings:

Narrow lab conditions. If every lab run is at the same temperature, pH, and feed strategy, the PINN has no basis to extrapolate to alternative conditions at scale. Solution: vary conditions in the lab.
Huge scale jumps. 1 L to 200,000 L is a 200,000x leap and physics correlations drift. We explicitly flag scale factors above 10,000x as high-severity out-of-distribution.
Poor ODE fit. If the ODE fit has R² below 0.5 on any variable, the quality gate fires and the pipeline falls back to absolute PINN mode with wider intervals and a data-quality warning. This usually means the organism has behavior (e.g., diauxic shifts, multi-product competition) that the default kinetic family can't capture.
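The three checks above amount to a handful of simple rules. The sketch below encodes the two thresholds stated in the text (scale factor above 10,000x, R² below 0.5 on any variable) plus a naive "never varied" test for lab conditions. The function names, message strings, and the exact form of the narrow-conditions test are assumptions for illustration.

```python
def r_squared(actual, predicted):
    """Coefficient of determination; returns 0.0 if the data has no variance."""
    mean_a = sum(actual) / len(actual)
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0


def risk_flags(lab_volume_l, production_volume_l, r2_by_variable, lab_conditions):
    """Collect human-readable warnings for a scale-up request.

    r2_by_variable: ODE-fit R² per state variable, e.g. {"biomass": 0.93}
    lab_conditions: values used across runs, e.g. {"temperature_c": [30, 30]}
    """
    flags = []
    for name, values in lab_conditions.items():
        if len(set(values)) <= 1:
            flags.append(f"narrow lab conditions: {name} never varied")
    scale_factor = production_volume_l / lab_volume_l
    if scale_factor > 10_000:
        flags.append(f"high-severity out-of-distribution: {scale_factor:,.0f}x scale-up")
    poor = sorted(v for v, r2 in r2_by_variable.items() if r2 < 0.5)
    if poor:
        flags.append("poor ODE fit on " + ", ".join(poor) + ": absolute PINN mode")
    return flags


flags = risk_flags(
    lab_volume_l=1.0,
    production_volume_l=200_000.0,
    r2_by_variable={"biomass": 0.93, "product": 0.41},
    lab_conditions={"temperature_c": [30, 30, 30], "ph": [6.8, 7.0, 7.2]},
)
```

For this input all three kinds of warning fire: temperature never varied, the jump is 200,000x, and the product fit is below the R² threshold.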

What this replaces

Traditional routes to a production-scale COGS estimate look like this:

Run lab experiments (weeks).
Run pilot-scale experiments in a 100-1,000 L vessel (2-4 months).
Pay a consultancy or CMO for a techno-economic model (weeks, $50-250k).
Revise after discovering DSP yield is different at scale (repeat).

With Augur, a bioprocess team runs 5-10 lab experiments, uploads the CSV, and sees a production-scale estimate in under a minute. That estimate is not a replacement for pilot-scale validation. It's a tool for choosing which pilot experiments to run, which strains to prioritize, and whether the unit economics work at all before committing capital to pilot runs. Used that way, it collapses months of iteration into hours.

What's next

Upcoming posts in this series cover: why DSP assumptions drive 60-80% of COGS (and how to stress-test them), the Custom Organism flow for novel strains, five sensor failure modes that break predictions, and case studies on sophorolipid (S. bombicola), rhamnolipid (P. putida), and high-titer mAb processes.

If you're working on a fermentation process and want to try the platform with your own data, we're onboarding pilot users this quarter. Request access.

Frequently asked questions

01 Why does Augur only need 5-10 lab runs when pure ML models need hundreds?

The platform embeds the physics of fermentation (mass balance, Monod kinetics, product inhibition, temperature response) directly into the loss function of a physics-informed neural network. The neural network only has to learn the residual corrections on top of a baseline ODE fit. Because the physics already constrains the shape of the solution, the ML component has far less to infer from data, and 5-10 runs is enough to pin down the correction surface.

02 Isn't a neural network overkill if you already have an ODE? Why not just fit the ODE and stop there?

Pure ODE fits plateau around 15-25% MAPE on real fermentation data. Biology has behaviors the Monod family doesn't capture: lag extension under stress, oxygen limitation switches, substrate inhibition at high glucose, product-driven overflow metabolism, subtle strain-specific nonlinearities. The PINN ensemble learns the delta between the ODE and reality, which typically halves the error to 5-12% MAPE without requiring the user to hand-craft a custom kinetic model.

03 How does Augur handle the jump from 5 L lab bioreactors to 20,000 L production vessels?

Scale-up changes physics, not biology. We recompute kLa (oxygen mass transfer), mixing time, tip speed, and DO profiles using reduced-order hydrodynamic models calibrated against published CFD and empirical correlations. The organism kinetics (learned from lab runs) then run against the new physics, so the PINN sees realistic production conditions. The biggest variable at scale is kLa. Lab vessels sit at 150-400 h^-1. Industrial tanks often drop to 50-100 h^-1, and that directly caps achievable titer.

04 What's the uncertainty on a production-scale prediction?

Each prediction reports a confidence interval from a 5-model ensemble, calibrated via split-conformal prediction when enough data is available. Typical intervals are 30-50% wide at production scale (e.g., 36 g/L with a 25-50 g/L band). Intervals widen when lab conditions are narrow, when the production scale-factor exceeds 10,000x, or when input conditions (temperature, pH) sit at the edge of the training range. Any out-of-distribution extrapolation is surfaced as an explicit risk factor.

05 Does the platform model downstream processing (DSP) or only fermentation?

DSP is about 60-80% of COGS for most fermentation products, so modeling it is not optional. Augur chains 8-9 DSP unit operations (centrifugation, filtration, chromatography, UF/DF, extraction, precipitation, crystallization, evaporation, foam fractionation) with organism-specific templates. For biosurfactants specifically, the extraction and phase-separation operations include industrial solvent recycling and gravity-phase-separation modeling. DSP yield, solvent cost, labor, and equipment depreciation all feed the final COGS/kg calculation.

06 What if my organism isn't one of the built-in profiles?

The Custom / Novel Organism flow accepts whatever kinetics you know (typical mu_max, Ks, Yxs, temperature and pH ranges, DSP template, expected duration) and learns the rest from your uploaded runs. When Custom is used, a 'data-driven fit' risk factor appears on the result so reviewers understand the prediction came from your experimental data rather than a matched organism profile. This is how we handle engineered or novel strains, including biosurfactant producers beyond the three built-in glycolipid hosts.

07 How long does a prediction take end-to-end?

Typical run time is 15-45 seconds for a scale-up scenario, from data upload through to COGS output. The PINN ensemble trains in parallel, DSP unit operations run in sequence, and Monte Carlo sampling for economics adds a few seconds at the end. Contrast this with traditional CFD coupled to metabolic models, which takes hours per scenario and is too slow for the rapid iteration needed during early strain and process development.