Technology & Architecture

Why We Chose Lightweight Anomaly Detection Over ML Models

December 18, 2025 · 3 min read

Machine learning fraud detection is the industry-standard pitch. Train models on labeled data, deploy neural networks, let AI catch fraudsters.

We went a different direction: rule-based detection with lightweight anomaly signals. This wasn't because we couldn't build ML systems. It was a deliberate engineering choice.

The ML Fraud Detection Promise

In theory, ML models can:

  • Learn complex patterns humans miss
  • Adapt to evolving fraud techniques
  • Handle high-dimensional feature spaces
  • Provide probabilistic risk scores

This sounds great. The reality is more complicated.

Why ML Fraud Detection Struggles

1. Training Data Problem

ML needs labeled training data: examples of "fraud" and "not fraud." But fraud labels are often:

  • Delayed (you don't know if traffic was fraudulent until downstream metrics appear)
  • Subjective (what's "fraud" varies by buyer, offer, vertical)
  • Imbalanced (fraud is rare, creating class imbalance issues)
  • Adversarial (fraudsters adapt when you deploy models)

Without clean labels, models learn noise rather than signal.

2. Explainability Problem

When a neural network flags traffic as "87% fraud probability," what does that mean? Which signals triggered it? How can users adjust their tolerance?

Black-box models create black-box decisions. Users can't understand, verify, or customize.

3. Latency Problem

Fraud decisions happen at ad-serve time (milliseconds). Complex models add latency. Simple models are faster.

At scale, the difference between a 2 ms and a 50 ms fraud check matters enormously: the check sits inside the ad-serve critical path, so every millisecond comes straight out of the auction and render budget.

4. Adversarial Adaptation

Sophisticated fraud operations probe detection systems. They learn what triggers blocks. ML models trained on historical data struggle against novel attack patterns.

Simple rules are actually more robust to certain attack types because they're based on physical constraints (a human can't click in 5ms) rather than statistical patterns (fraudsters can manipulate).

Our Approach: Explicit Signals

Instead of ML models, we use explicit signals with configurable weights:

Physical Impossibility Signals

  • Click timing < 50ms (physically impossible for humans)
  • Interaction events outside screen bounds
  • Page engagement before page could load

These aren't statistical—they're physical constraints that can't be gamed without changing the fraud technique entirely.
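To make this concrete, here's a minimal sketch of what these checks look like. The event shape and field names are illustrative, not our exact schema; the 50 ms threshold is the one mentioned above.

```typescript
// Illustrative event shape -- not our production schema.
interface ClickEvent {
  pageLoadedAt: number; // ms since epoch
  clickedAt: number;    // ms since epoch
  x: number;            // click coordinates
  y: number;
  screenWidth: number;
  screenHeight: number;
}

function physicalImpossibilitySignals(e: ClickEvent): string[] {
  const signals: string[] = [];
  const dwellMs = e.clickedAt - e.pageLoadedAt;

  if (dwellMs < 0) {
    // Engagement recorded before the page could have loaded.
    signals.push("click_before_load");
  } else if (dwellMs < 50) {
    // No human perceives, aims, and clicks within 50 ms.
    signals.push("click_too_fast");
  }

  // Real pointer events land inside the viewport.
  if (e.x < 0 || e.y < 0 || e.x > e.screenWidth || e.y > e.screenHeight) {
    signals.push("click_outside_bounds");
  }
  return signals;
}
```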

Technical Environment Signals

  • WebDriver flag (Selenium/Puppeteer detection)
  • Headless browser indicators
  • Automation framework artifacts

These detect specific tools rather than inferring from patterns.
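A sketch of how these checks might run in the browser. The WebDriver flag and the HeadlessChrome user-agent marker are standard; the injected-globals list is a small illustrative sample, not the full set we check.

```typescript
function environmentSignals(): string[] {
  const signals: string[] = [];

  // Selenium/Puppeteer set this flag per the WebDriver spec.
  if (navigator.webdriver) signals.push("webdriver_flag");

  // Headless Chrome announces itself in the user agent string.
  if (/HeadlessChrome/.test(navigator.userAgent)) signals.push("headless_ua");

  // Automation frameworks often leave injected globals behind
  // (e.g. PhantomJS and Nightmare artifacts).
  for (const key of ["_phantom", "callPhantom", "__nightmare"]) {
    if (key in window) signals.push(`automation_artifact:${key}`);
  }
  return signals;
}
```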

Behavioral Heuristics

  • No mouse/keyboard/scroll events
  • Untrusted event objects
  • Missing expected browser capabilities

Simple rules with clear meaning.
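In code, these heuristics reduce to a few counters and one property check. A minimal sketch, with illustrative names:

```typescript
// Interaction counters accumulated by listeners registered at page load.
let mouseMoves = 0;
let keyPresses = 0;
let scrolls = 0;

document.addEventListener("mousemove", () => mouseMoves++);
document.addEventListener("keydown", () => keyPresses++);
document.addEventListener("scroll", () => scrolls++);

function behavioralSignals(click: MouseEvent): string[] {
  const signals: string[] = [];

  // Synthetic events dispatched from script have isTrusted === false.
  if (!click.isTrusted) signals.push("untrusted_event");

  // A session with zero interaction events before the click is suspect.
  if (mouseMoves + keyPresses + scrolls === 0) signals.push("no_interaction");

  return signals;
}
```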

The Benefit: User Control

Because every signal is explicit and weighted, users can:

  • Understand exactly why traffic was flagged
  • Adjust weights based on their tolerance
  • Disable signals that don't apply to their use case
  • Add their own rules via IP blacklists

Try doing that with a neural network. Here's roughly what that control surface looks like; the signal names, default weights, and the cap at 1.0 below are illustrative, not our production configuration.
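```typescript
type SignalWeights = Record<string, number>;

// Illustrative defaults -- users can override any of these.
const defaultWeights: SignalWeights = {
  click_too_fast: 0.9,
  webdriver_flag: 1.0,
  headless_ua: 0.8,
  untrusted_event: 0.9,
  no_interaction: 0.4,
};

function scoreTraffic(
  signals: string[],
  weights: SignalWeights = defaultWeights,
  blacklist: Set<string> = new Set(),
  ip?: string,
): { score: number; reasons: string[] } {
  // User-supplied IP blacklists short-circuit everything else.
  if (ip && blacklist.has(ip)) {
    return { score: 1, reasons: ["ip_blacklisted"] };
  }
  // Each fired signal contributes its weight; a weight of 0 disables it.
  const reasons = signals.filter((s) => (weights[s] ?? 0) > 0);
  const score = Math.min(1, reasons.reduce((sum, s) => sum + weights[s], 0));
  return { score, reasons };
}
```

The `reasons` array is the explainability story: every flagged request carries the exact rules that fired, and setting a weight to zero disables a signal outright.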

External ML Integration

For users who want ML-powered detection, we integrate external providers (IPQualityScore, HUMAN, etc.). They have the scale and labeled data to train effective models.

This separates concerns: we handle the platform, they handle specialized fraud ML. Users choose their preferred approach.
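The integration point is just a normalized score behind a shared interface. As a sketch: the interface below is hypothetical, and while the IPQualityScore URL shape follows their public docs, verify the details against current documentation before relying on it.

```typescript
interface FraudProvider {
  // Returns a normalized risk score in [0, 1].
  scoreIp(ip: string): Promise<number>;
}

class IPQualityScoreProvider implements FraudProvider {
  constructor(private apiKey: string) {}

  async scoreIp(ip: string): Promise<number> {
    const res = await fetch(
      `https://ipqualityscore.com/api/json/ip/${this.apiKey}/${ip}`,
    );
    const body = await res.json();
    // IPQS reports fraud_score on a 0-100 scale; normalize to [0, 1].
    return (body.fraud_score ?? 0) / 100;
  }
}
```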

The Tradeoff

Our approach misses some sophisticated fraud that state-of-the-art ML would catch (patterns too subtle for explicit rules). It also produces fewer false positives on legitimate edge cases, which ML models often flag incorrectly.

We chose simplicity, explainability, and user control over maximum detection accuracy. For a platform emphasizing transparency, that's the right tradeoff.
