5 sensor failure modes that silently break your bioprocess predictions
Dissolved oxygen probes drift, pH probes saturate, feed pump readbacks go stale, CO2 offgas analyzers fog up, biomass probes lose calibration. Any one of these can quietly poison a scale-up prediction. Here's how to detect them in your data before uploading.
Every bad prediction we've seen from Augur traces back to one of two things: too few runs, or bad sensor data in the runs. The first is a quantity problem and solvable by running more experiments. The second is a quality problem and much harder to catch, because a broken sensor produces data that looks plausible until the prediction disagrees with pilot-scale reality. This post covers the five sensor failure modes we see most often in customer CSVs, how to detect each one in your own data, and what the platform does to flag them.
1. Dissolved oxygen probe drift
DO probes are the number-one source of corrupted data. Three failure modes stack:
- Calibration drift. The probe reads 30% when actual DO is 5%. This happens when the electrolyte ages, the membrane fouls, or the probe wasn't zeroed properly against a nitrogen-purged reference.
- Response-time lag. New polarographic probes respond in 30-60 seconds. Old probes take 3-5 minutes, which smooths out DO transients. Your data looks stable even though the actual process is oscillating.
- Saturation clipping. Some probes saturate at 100% or 110% and don't report above it. If your early-process saturation period is clipped, the PINN can't learn the true kLa because the driving force is misrepresented.
Detection: plot DO vs time for each run. If DO stays above 20% during heavy feeding at high biomass, your probe is probably drifting high. Cross-check with offgas O2 uptake rate if available. If OUR is high but DO is stable, the probe is off.
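The OUR cross-check above can be sketched as a simple mask over your logged traces. This is a minimal sketch, not Augur's implementation: the function name, units, and thresholds (20% DO floor, OUR above 5 mmol/L/h as "heavy consumption") are illustrative assumptions you should tune to your own process.

```python
import numpy as np

def flag_do_drift(do_pct, our, do_floor=20.0, our_active=5.0):
    """Flag samples where the DO probe reads high while offgas OUR shows
    heavy oxygen consumption -- the signature of a probe drifting high.

    do_pct: DO readings in % air saturation
    our:    oxygen uptake rate (e.g. mmol/L/h) from offgas analysis
    Thresholds are illustrative placeholders, not universal values.
    """
    do_pct = np.asarray(do_pct, dtype=float)
    our = np.asarray(our, dtype=float)
    return (do_pct > do_floor) & (our > our_active)

# Late-process samples: OUR climbs but DO never drops below 30% -> suspicious
do  = [80, 60, 35, 32, 31, 30]   # % air saturation
our = [ 1,  3,  8, 12, 15, 18]   # mmol/L/h from offgas
suspicious = flag_do_drift(do, our)
```

A real kLa-limited fermentation would pull DO toward zero as OUR climbs; samples where both stay high are the ones to inspect by hand.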
2. pH probe saturation and junction clogging
pH probes suffer from two subtle issues. First, the reference junction clogs with protein or salt precipitate over long fermentations, which slows response and causes small DC offsets. Second, very high ionic strength media (common in high-titer biosurfactant fermentations) can drive apparent pH outside the normal 4-8 range for reasons unrelated to process state.
Detection: look for step-change jumps in pH that don't correspond to base addition events. If your pH trace jumps 0.2 units in 60 seconds without a corresponding base pump action, something artifactual is happening at the sensor. Also check that post-batch calibration matches the pre-batch calibration. Any drift of more than 0.1 pH units suggests the probe was reading inaccurately during the run.
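The step-change check can be automated by scanning for jumps that lack a dosing event. A minimal sketch, assuming uniformly logged timestamps and a boolean base-pump column; the function name and the 0.2-unit/60-second thresholds mirror the rule of thumb above and should be adjusted to your controller's deadband.

```python
import numpy as np

def ph_step_artifacts(t_s, ph, base_on, jump=0.2, window_s=60):
    """Return indices of pH jumps >= `jump` units within `window_s` seconds
    that have no base-pump activity to explain them.

    t_s:     timestamps in seconds
    ph:      pH readings
    base_on: boolean array, True where the base pump was active
    """
    t_s, ph = np.asarray(t_s, float), np.asarray(ph, float)
    base_on = np.asarray(base_on, bool)
    flags = []
    for i in range(1, len(ph)):
        dt = t_s[i] - t_s[i - 1]
        if dt <= window_s and abs(ph[i] - ph[i - 1]) >= jump and not base_on[i]:
            flags.append(i)  # jump with no dosing event: likely sensor artifact
    return flags

# 0.3-unit jump at 60s with the base pump idle -> flagged
artifacts = ph_step_artifacts([0, 30, 60, 90], [6.8, 6.8, 7.1, 7.1], [False] * 4)
```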
3. Feed pump readback drift
This is the most dangerous one because it's the hardest to detect. Peristaltic pumps in particular lose calibration as tubing ages (tubing becomes less elastic, bore diameter changes). The pump reports its setpoint value, not the actual delivered mass. A 5% drift between logged and actual feed is common after 50-100 pump hours of the same tubing.
Why it matters: the ODE fitter uses logged feed rate to compute glucose uptake and product yield. If actual feed is 5% lower than logged, Yp/s appears 5% higher than reality. The PINN residual doesn't correct this because it's a systematic bias. At production scale, predicted titer ends up 5-10% optimistic.
Detection: mass balance check. Weight of feed tank at start minus weight at end should equal integrated feed volume times feed density, within 2%. Any bigger discrepancy means the pump readback is off. Alternatively, periodic calibration with a graduated cylinder is the gold standard, and any team running fed-batch processes for prediction uploads should make this a weekly habit.
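The mass balance check reduces to a few lines once your feed log is in hand. A sketch under assumptions: the function name and argument layout are hypothetical, the feed rate is assumed to be logged at a fixed interval, and the 2% tolerance comes from the rule above.

```python
def feed_mass_balance(tank_start_kg, tank_end_kg, rates_lph, dt_h,
                      density_kg_per_l, tol=0.02):
    """Compare delivered feed mass (tank weight loss) against the mass
    implied by the integrated logged feed rate.

    rates_lph: logged feed rates in L/h, sampled every `dt_h` hours
    Returns (relative_error, within_tolerance).
    """
    delivered_kg = tank_start_kg - tank_end_kg
    logged_volume_l = sum(r * dt_h for r in rates_lph)   # rectangle-rule integral
    logged_mass_kg = logged_volume_l * density_kg_per_l
    rel_err = abs(delivered_kg - logged_mass_kg) / delivered_kg
    return rel_err, rel_err <= tol

# Pump logged 5.0 kg but the tank only lost 4.8 kg: ~4.2% off, fails the 2% check
rel_err, ok = feed_mass_balance(10.0, 5.2, [0.5] * 10, 1.0, 1.0)
```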
4. CO2 offgas analyzer fog and sample-line condensation
Offgas analyzers (CO2 and O2) need a conditioned sample stream: warm enough to prevent condensation in the line, dry enough not to fog the optical cell, and at a stable flow rate. Any of these going wrong produces noisy or biased offgas data.
Common mode: sample line condensation in a lab where the bioreactor is at 30°C and the sample line runs through a 20°C ambient room. Condensation in the line absorbs CO2 selectively (CO2 is much more soluble than O2), so reported CO2 comes out 5-15% low. CER (CO2 evolution rate) calculations propagate this error directly.
Detection: reported RQ (respiratory quotient, CER/OUR) outside the physical range 0.5-1.5 is a red flag. Most aerobic fermentations have RQ in the 0.9-1.1 range, with drops during lipid accumulation and spikes during overflow metabolism. Values below 0.5 or above 1.5 almost always mean sensor drift, not real process behavior.
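The RQ sanity check is straightforward to script. A minimal sketch with an illustrative function name; the 0.5–1.5 band is the physical range quoted above, and zero or negative OUR values are flagged as bad data outright.

```python
def rq_flags(cer, our, lo=0.5, hi=1.5):
    """Return indices where RQ = CER/OUR falls outside the physically
    plausible band [lo, hi]. Out-of-band values almost always indicate
    analyzer drift or line condensation, not real metabolism."""
    flags = []
    for i, (c, o) in enumerate(zip(cer, our)):
        if o <= 0:
            flags.append(i)          # zero/negative OUR: unusable sample
            continue
        rq = c / o
        if not (lo <= rq <= hi):
            flags.append(i)
    return flags

# RQ of 0.9 and 1.0 are fine; 0.4 is below the physical floor -> flagged
bad = rq_flags(cer=[9.0, 10.0, 4.0], our=[10.0, 10.0, 10.0])
```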
5. Biomass probe calibration against wrong reference
Optical density probes (NIR, laser turbidity) need a calibration curve against a gold-standard biomass measurement (dry cell weight, OD600 with appropriate dilution). Three common errors:
- The calibration was done at low biomass (up to OD600 10) but the fermentation reaches OD600 60. Extrapolating linearly produces substantial error at high biomass because the Beer-Lambert law breaks down at high optical density.
- The calibration was done with a different strain or genetic background. Cell size and refractive index vary by strain, so a calibration curve from one strain is not automatically valid for another.
- The calibration was done in clean defined media but the fermentation runs in complex media with precipitates (salt, media components, antifoam). The precipitates contribute to turbidity but not to actual biomass.
Detection: cross-check with offline samples. Pull at least 3-5 DCW samples across the batch and compare against the online probe reading. If the online reading drifts from the DCW regression line by more than 10%, the probe calibration has failed.
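The DCW cross-check can be expressed as a linear fit plus a relative-deviation test. A sketch, not a prescribed method: the function name is hypothetical, a linear calibration is assumed to hold over the sampled range, and the 10% threshold matches the rule above.

```python
import numpy as np

def probe_vs_dcw_drift(probe_readings, dcw_g_per_l, max_rel_dev=0.10):
    """Fit a linear calibration between online probe readings and offline
    DCW samples, then report each point's relative deviation from the fit.
    Deviations beyond `max_rel_dev` suggest the probe calibration failed
    over part of the batch."""
    x = np.asarray(probe_readings, dtype=float)
    y = np.asarray(dcw_g_per_l, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)       # least-squares line
    predicted = slope * x + intercept
    rel_dev = np.abs(predicted - y) / y
    return rel_dev, bool(np.all(rel_dev <= max_rel_dev))

# Four offline DCW samples that track the probe linearly -> calibration holds
rel_dev, ok = probe_vs_dcw_drift([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```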
What the platform does about this
The data quality scorer catches obvious issues automatically: flat traces, physically impossible values, gaps, non-monotonic growth, outliers, duplicate timestamps. It doesn't catch subtle drift because that looks exactly like real process variation. The quality score and per-run notes appear on the results page with any flagged issues, so the reviewer sees the caveats explicitly.
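To make the automated checks concrete, here is a minimal sketch of the same categories of test: flat traces, long gaps, and local outliers. This is not Augur's scorer; the function name, window size, and thresholds are illustrative assumptions.

```python
import numpy as np

def basic_quality_flags(t_h, values, flat_std=1e-6, gap_h=2.0, z=3.0, win=5):
    """Toy quality checks: flat-lined traces, gaps longer than `gap_h`
    hours, and points more than `z` standard deviations from a local
    median. Thresholds and windowing are illustrative only."""
    t = np.asarray(t_h, dtype=float)
    v = np.asarray(values, dtype=float)
    flags = {
        "flat": bool(np.std(v) < flat_std),
        "gaps": [i for i in range(1, len(t)) if t[i] - t[i - 1] > gap_h],
        "outliers": [],
    }
    for i in range(len(v)):
        local = v[max(0, i - win): i + win + 1]   # symmetric local window
        med, sd = np.median(local), np.std(local)
        if sd > 0 and abs(v[i] - med) > z * sd:
            flags["outliers"].append(i)
    return flags

# A single 100x spike in an otherwise flat biomass trace gets caught
trace = [1.0] * 10 + [100.0] + [1.0] * 10
report = basic_quality_flags([0.5 * i for i in range(21)], trace)
```

Note what this kind of check cannot do: a probe drifting smoothly over 20 hours produces no flat segment, no gap, and no local outlier, which is exactly why drift passes through automated screens.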
Beyond the automated check, two things help: run multiple replicates and look for cross-run consistency, and include at least one offline data point per batch (DCW, HPLC titer) to anchor the online signals. Any process engineer reading a scale-up prediction should treat online-sensor-only data with appropriate skepticism and demand the offline anchors.
Bottom line
Scale-up predictions are downstream of data quality. A physics-informed prediction pipeline can't invent signal that wasn't in the data. It can catch obviously broken inputs, but subtle sensor drift passes through as systematic bias and propagates to the production estimate. The five failure modes above cover 80% of the cases we see. Checking your own data against them before uploading is the single highest-leverage thing you can do to improve your prediction accuracy.
If you want help interpreting quality-scorer output or diagnosing unusual patterns in your uploaded data, request access. We're helping pilot users work through data quality questions during the onboarding phase.
Frequently asked questions
1. Does Augur detect bad sensor data automatically?
Partially. The data quality scorer catches obvious issues: flat-lined signals, physically impossible values (negative biomass, pH outside 0-14, DO above 100%), gaps longer than 2 hours, non-monotonic growth curves, and outliers more than 3 standard deviations from the local median. It doesn't catch subtle drift unless multiple runs show the same drift pattern. The automated check is a safety net, not a replacement for inspecting your own data before uploading.
2. What's the single most common sensor issue I should check for?
DO probe calibration drift. DO probes lose responsiveness over the course of a long fermentation and often read higher than actual once the electrolyte ages. If your late-process DO values look suspiciously stable at 30-40% while you're feeding aggressively, your probe is likely reading high. Real kLa-limited fermentations drop DO toward zero in the high-feed phase. A probe that stays at 30% is usually broken, not the process being oxygen-sufficient.
3. Do I need to fix sensor issues in the CSV before uploading, or can the platform handle it?
Fix what you can fix. Trim runs to the valid data window if the probe failed mid-run. Flag or remove obvious outlier rows. Don't try to impute: leaving gaps is better than filling in interpolated values, because the ODE fitter and PINN both treat gaps as missing rather than zero, but treat interpolated values as real. If you're uncertain whether a sensor drift is real process behavior or a measurement artifact, include both interpretations as separate scenarios and compare the predictions.
4. Why does sensor quality matter more at scale-up than at lab scale?
Because the PINN learns residuals from lab data and applies them at production scale. A 10% DO drift in lab data gets encoded as a systematic bias in the residual, which then propagates to the production prediction. Lab-scale errors don't self-correct when you extrapolate. Get the calibration right at lab scale or your production estimate inherits the error.
5. Which of these failure modes is the hardest to catch by eye?
Feed pump readback drift. It looks plausible. The process runs, the biomass grows, the titer climbs. But if the pump is delivering 5% less feed than the logged value, your yield calculation is systematically wrong and the ODE fitter compensates by over-estimating Yp/s. That inflated yield then gets applied at production scale and the predicted COGS looks better than reality. Cross-check: weight of feed tank at start and end of batch, integrate the logged feed rate, check the mass balance matches within 2%.