At System1, there is one rule that guides AI production.
“Ship fast, measure faster and never guess in production,” CTO Chris Testa said.
The marketing tech company prides itself on measurable outcomes and thorough testing; unlike at many big data companies, Testa said, nothing is released without guardrails in place.
“Practically, that means no black boxes,” Testa said. “Models have defined objectives, human oversight where it matters and rollback paths if performance or quality drifts.”
Built In spoke with Testa in detail about how and why the company doesn’t sacrifice transparency for speed when it comes to AI — and the impact this has on its teams and the business.
System1 builds desktop and mobile apps, search engines and online publications that empower consumers with information while respecting their privacy.
What’s your rule for fast, safe releases — and what KPI proves it works?
Our rule is simple: ship fast, measure faster and never guess in production. We move quickly, but every AI release is grounded in real-world testing, clear guardrails and measurable outcomes before it scales.
Practically, that means no black boxes. Models have defined objectives, human oversight where it matters and rollback paths if performance or quality drifts. Speed comes from tight feedback loops, not shortcuts — we’d rather run 10 controlled experiments than one big, blind launch.
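A rollback path triggered by quality drift can be sketched in a few lines. This is a hypothetical illustration, not System1’s actual tooling; the 5 percent tolerance and the conversion-rate metric are assumptions chosen for the example.

```python
# Hypothetical drift-triggered rollback check (illustrative only).
# Compares a model's recent conversion rate against its established
# baseline and signals a rollback when the relative drop exceeds a
# tolerance -- the kind of guardrail that lets a team ship fast safely.

def should_roll_back(baseline_rate: float, recent_rate: float,
                     tolerance: float = 0.05) -> bool:
    """Return True if recent performance has drifted more than
    `tolerance` (relative) below the baseline."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    relative_drop = (baseline_rate - recent_rate) / baseline_rate
    return relative_drop > tolerance

# A model converting at 4.5% against a 5.0% baseline is a 10% relative
# drop, which exceeds the 5% tolerance; 4.9% stays within it.
print(should_roll_back(0.050, 0.045))  # True
print(should_roll_back(0.050, 0.049))  # False
```

In practice a check like this would run continuously against live metrics, with the rollback itself handled by the deployment system.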
The KPI that tells us it’s working is incremental performance lift — whether that’s improved user intent matching, higher downstream conversion rates, or reduced cost per acquisition for our partners. If an AI feature doesn’t demonstrably outperform the previous system in live environments, it doesn’t graduate.
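A graduation gate on incremental lift can be made concrete with a standard two-proportion z-test. This is a minimal sketch under assumed numbers, not the company’s experiment framework; the function name, sample sizes and the one-sided 1.96 cutoff are all illustrative.

```python
# Illustrative graduation gate: a feature only "graduates" if its
# conversion lift over the previous system is both positive and
# statistically significant (one-sided two-proportion z-test).
import math

def lift_is_significant(ctrl_conv: int, ctrl_n: int,
                        test_conv: int, test_n: int,
                        z_crit: float = 1.96):
    """Return (relative_lift, significant) for test vs. control."""
    p_c, p_t = ctrl_conv / ctrl_n, test_conv / test_n
    p_pool = (ctrl_conv + test_conv) / (ctrl_n + test_n)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / ctrl_n + 1 / test_n))
    z = (p_t - p_c) / se
    lift = (p_t - p_c) / p_c
    return lift, z > z_crit  # must outperform, not merely differ

# 5.0% control vs. 5.9% test on 10,000 users each: an 18% relative
# lift, well past the significance cutoff -- this feature graduates.
lift, graduates = lift_is_significant(500, 10_000, 590, 10_000)
```

A smaller or noisier lift would fail the gate even if the raw numbers looked better, which is the point: the new system has to demonstrably outperform, not just appear to.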
Fast and safe isn’t a tradeoff for us. When you’re disciplined about measurement, speed actually becomes the safer option.
What standard or metric defines “quality” in your stack?
Quality, for us, is defined by signal, not noise. A system is high-quality if it consistently delivers clear, explainable signals that drive better decisions — for users, partners and our own teams.
From a metrics standpoint, that shows up as incremental lift and durability. We look at whether improvements hold up over time, across markets and through changing conditions — not just whether a model spikes in a short A/B test. If performance degrades the moment the environment shifts, that’s not quality, that’s luck.
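The durability idea — a lift that holds across markets rather than spiking in one — can be expressed as a simple gate. The segment names, lift values and zero-lift threshold below are assumptions for illustration.

```python
# Illustrative durability check: an improvement counts only if its
# relative lift holds in every market segment, not just in aggregate.

def lift_is_durable(segment_lifts: dict, min_lift: float = 0.0) -> bool:
    """True only if every segment shows at least `min_lift` lift."""
    return all(lift >= min_lift for lift in segment_lifts.values())

spiky = {"US": 0.22, "UK": -0.04, "DE": 0.01}   # big aggregate win, UK regressed
steady = {"US": 0.06, "UK": 0.04, "DE": 0.05}   # smaller but consistent

print(lift_is_durable(spiky))   # False -- that's luck, not quality
print(lift_is_durable(steady))  # True
```

The same gate could be applied over time windows instead of markets to catch improvements that decay as conditions shift.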
We also hold ourselves to standards around transparency and controllability. A high-quality system is one we can understand, audit and tune — especially in AI. If we can’t explain why something is working, we don’t consider it production-ready.
Ultimately, quality in our stack means outcomes you can trust, systems you can interrogate and performance that compounds instead of decays.
Name one AI/automation that shipped recently and its impact on your team and/or the business.
One example is our AI-driven intent refinement and traffic routing system. It automatically evaluates user signals in real time and routes traffic to the most relevant experiences — without relying on third-party identifiers.
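The routing idea — scoring first-party signals and sending traffic to the most relevant experience without third-party identifiers — can be sketched as follows. The experience names, keyword sets and overlap scoring are invented for the example and do not reflect System1’s actual system.

```python
# Hypothetical sketch of intent-based traffic routing using only
# first-party signals (the user's own query). Experience names and
# the keyword-overlap scoring are illustrative assumptions.

EXPERIENCES = {
    "comparison_shopping": {"best", "vs", "compare", "review"},
    "how_to_content": {"how", "guide", "tutorial", "fix"},
    "direct_offer": {"buy", "price", "deal", "coupon"},
}

def score_intent(signals: dict) -> dict:
    """Score the query's token overlap with each experience's keywords."""
    tokens = set(signals["query"].lower().split())
    return {name: len(tokens & kws) for name, kws in EXPERIENCES.items()}

def route(signals: dict, default: str = "how_to_content") -> str:
    """Route to the highest-scoring experience, or a default on no signal."""
    scores = score_intent(signals)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default

print(route({"query": "best budget laptop vs macbook review"}))
# comparison_shopping
```

A production system would replace the keyword overlap with a learned intent model and fold in real-time performance feedback, but the routing shape — score, pick the best experience, fall back safely — is the same.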
From a business perspective, the impact has been immediate: higher downstream conversion rates and more consistent performance for partners, especially in environments where signal loss used to create volatility. It’s helped us turn uncertainty into efficiency.
Internally, the biggest win has been focus. Automation took a lot of manual tuning and reactive decision-making off our teams’ plates, so engineers and product managers can spend more time improving the system rather than babysitting it. In other words, less time managing exceptions, more time building the next advantage.
