Technology & Architecture

How PopTrade Detects Ad Fraud: A Technical Overview Without the ML Hype

November 28, 2025 · 5 min read

Every ad platform claims to have world-class fraud detection. Most are exaggerating. This article explains how PopTrade's fraud detection actually works: the signals we use, our scoring approach, and why we chose practical methods over impressive-sounding ML techniques that work poorly for this problem.

The Fraud Detection Reality Check

Let's be honest about what fraud detection can and cannot do:

What We Can Detect

  • Known bot signatures and automation tools
  • Datacenter and proxy traffic
  • Behavioral anomalies that deviate significantly from human patterns
  • Technical inconsistencies in device/browser fingerprints
  • Geographic mismatches and VPN usage

What Nobody Can Reliably Detect

  • Sophisticated residential proxy traffic
  • Human click farms
  • Motivated fraudsters who study detection methods
  • Zero-day fraud techniques

Anyone claiming 100% fraud detection is lying. Our goal is catching the obvious fraud cheaply and making sophisticated fraud expensive enough to be unprofitable.

Signal-Based Scoring

We use a weighted signal approach rather than black-box ML:

Critical Signals (High Weight)

WebDriver Detection

Checks whether navigator.webdriver is true, which indicates Selenium, Puppeteer, or similar automation:

  • Weight: 0.95 (near-certain fraud)
  • Why: No legitimate user has this flag set

Automation Markers

Presence of window.callPhantom, window._phantom, or similar:

  • Weight: 0.90
  • Why: Only exists in headless browser environments

Headless Browser Signatures

User-Agent containing HeadlessChrome or PhantomJS:

  • Weight: 0.85
  • Why: Explicit admission of non-human browser
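
A minimal sketch of what these three critical checks might look like in the page context (TypeScript; the function name and structure are illustrative, not PopTrade's actual tag code):

type Signal = { name: string; weight: number };

function collectCriticalSignals(): Signal[] {
  const signals: Signal[] = [];

  // Selenium, Puppeteer, and Playwright set navigator.webdriver to true.
  if (navigator.webdriver) {
    signals.push({ name: "webdriver", weight: 0.95 });
  }

  // PhantomJS and similar tools leak globals onto window.
  const w = window as unknown as Record<string, unknown>;
  if (w.callPhantom !== undefined || w._phantom !== undefined) {
    signals.push({ name: "automation_marker", weight: 0.90 });
  }

  // Headless browsers often announce themselves in the User-Agent.
  if (/HeadlessChrome|PhantomJS/i.test(navigator.userAgent)) {
    signals.push({ name: "headless_ua", weight: 0.85 });
  }

  return signals;
}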

Behavioral Signals (Medium Weight)

Time-to-Event (TTE)

How quickly did a click happen after page load?

  • Under 50ms: Weight 0.65 (physically impossible for humans)
  • Under 500ms: Weight 0.40 (suspicious but possible)
  • Why: Bots click instantly; humans need time to perceive and react
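
A sketch of the timing check (illustrative, not our production code; performance.now() returns milliseconds since navigation start, so inside a click handler it doubles as time-since-page-load):

function tteWeight(msSincePageLoad: number): number {
  if (msSincePageLoad < 50) return 0.65; // faster than human reaction time
  if (msSincePageLoad < 500) return 0.40; // possible, but suspicious
  return 0; // no signal triggered
}

document.addEventListener("click", () => {
  const weight = tteWeight(performance.now());
  if (weight > 0) {
    // ...record alongside the other signals for aggregation
  }
});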

No User Interactions

Zero mouse movements, keyboard events, or scroll events:

  • Weight: 0.60
  • Why: Real users move the mouse, scroll, and interact with the page

Untrusted Events

Click events where event.isTrusted is false:

  • Weight: 0.80
  • Why: The browser marks programmatically generated clicks as untrusted
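
The two event-based checks above could be sketched like this (again illustrative, not PopTrade's actual tag code):

let sawHumanActivity = false;
for (const type of ["mousemove", "keydown", "scroll"]) {
  // One firing is enough; { once: true } removes the listener afterwards.
  window.addEventListener(type, () => { sawHumanActivity = true; },
    { once: true, passive: true });
}

document.addEventListener("click", (event) => {
  const weights: number[] = [];
  if (!sawHumanActivity) weights.push(0.60); // no mouse/keyboard/scroll seen
  if (!event.isTrusted) weights.push(0.80);  // click was dispatched from script
  // ...feed `weights` into the aggregation step described below
});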

Technical Signals (Variable Weight)

Plugin Count

navigator.plugins.length equals zero on desktop:

  • Weight: 0.30
  • Why: Most real browsers have some plugins; headless often has none

Screen Anomalies

Screen dimensions that don't match any real device:

  • Weight: 0.35
  • Why: Bots often use arbitrary or default screen sizes
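
A sketch of these two checks; note the resolution list here is a tiny illustrative sample, and a real allowlist would cover hundreds of devices:

const KNOWN_RESOLUTIONS = new Set(["1920x1080", "1366x768", "2560x1440", "390x844"]);

function technicalSignals(): number[] {
  const weights: number[] = [];

  // Headless environments frequently report zero plugins on desktop.
  if (navigator.plugins.length === 0) weights.push(0.30);

  // Screen sizes that match no shipping device are suspicious.
  if (!KNOWN_RESOLUTIONS.has(`${screen.width}x${screen.height}`)) weights.push(0.35);

  return weights;
}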

Geographic Mismatch

Declared location doesn't match IP geolocation:

  • Weight: 0.40-0.70 depending on severity
  • Why: VPN/proxy users or spoofed location data
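
One way to sketch this server-side is comparing the timezone the browser declares (readable client-side via Intl.DateTimeFormat().resolvedOptions().timeZone) against the timezone implied by the IP's geolocation. The severity scaling below is an illustrative assumption, not our exact rule:

function geoMismatchWeight(declaredTz: string, ipTz: string): number {
  if (declaredTz === ipTz) return 0;
  // Crude severity proxy: "Europe/Paris" vs "Europe/Berlin" is a milder
  // mismatch than "Europe/Paris" vs "Asia/Shanghai".
  const sameRegion = declaredTz.split("/")[0] === ipTz.split("/")[0];
  return sameRegion ? 0.40 : 0.70;
}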

Score Aggregation

Individual signals combine into a final fraud score:

final_score = 1 - product(1 - signal_weight for each triggered signal)

This means:

  • Single weak signal: low score
  • Multiple weak signals: elevated score
  • Any critical signal: high score
  • Multiple critical signals: near-certain fraud
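
A direct translation of the formula, with the behaviors from the list above shown as usage (illustrative helper, not our production code):

function aggregateScore(weights: number[]): number {
  // Each triggered signal independently "fails to prove fraud" with
  // probability (1 - weight); the score is the complement of all failing.
  return 1 - weights.reduce((acc, w) => acc * (1 - w), 1);
}

aggregateScore([0.30]);        // 0.30  - single weak signal
aggregateScore([0.30, 0.35]);  // 0.545 - weak signals stack
aggregateScore([0.95]);        // 0.95  - one critical signal dominates
aggregateScore([0.95, 0.90]);  // 0.995 - near-certain fraud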

The Soft Reject System

Not everything is black and white. We have three outcomes:

Accept (Score below threshold)

Traffic looks legitimate. Impression served, advertiser charged, publisher paid.

Hard Reject (Score above high threshold)

Traffic is almost certainly fraud. Request blocked entirely. No impression, no charge, no payment.

Soft Reject (Score in middle zone)

Traffic is suspicious but not certain fraud. This is where it gets interesting:

  • Impression may still serve (configurable by buyer)
  • Traffic routed to fallback if configured
  • Logged for analysis but not automatically blocked
  • Publisher not penalized for borderline cases
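
Put together, the three outcomes reduce to a small decision function. The thresholds shown are illustrative placeholders, not our production values:

type Verdict = "accept" | "soft_reject" | "hard_reject";

function decide(score: number, soft = 0.5, hard = 0.85): Verdict {
  if (score >= hard) return "hard_reject"; // blocked: no impression, charge, or payout
  if (score >= soft) return "soft_reject"; // buyer-configurable handling
  return "accept";                         // served normally
}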

Why Soft Reject Matters

The advertising ecosystem has a problem: aggressive fraud detection creates false positives that harm legitimate publishers. Soft reject addresses this:

For Publishers

Borderline traffic isn't immediately rejected. If your users happen to use VPNs for privacy, they might still see ads (depending on buyer settings) rather than being blanket-blocked.

For Advertisers

You control the threshold. Conservative buyers can hard-reject anything suspicious. Others can accept soft-reject traffic at lower bids.

For the Platform

Fewer disputes. When we say traffic is fraud, we mean it. Soft rejects handle the gray area without false accusations.

External Provider Integration

Our built-in detection catches common fraud. For sophisticated threats, we integrate external providers:

Available Integrations

  • IPQualityScore - IP reputation, proxy detection, device fingerprinting
  • Pixalate - MRC-accredited invalid traffic detection
  • Fraudlogix - Real-time bot and click fraud detection
  • HUMAN (White Ops) - Sophisticated bot detection

How Integration Works

  • External score fetched in parallel with internal checks
  • Results cached for 5-15 minutes to reduce API calls
  • External score weighted into final decision
  • Buyers choose which providers to enable
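
A sketch of that request path, assuming a hypothetical fetchScore callback wrapping whichever provider is enabled (the timeout and TTL values are illustrative):

const cache = new Map<string, { score: number; expires: number }>();
const CACHE_TTL_MS = 10 * 60 * 1000; // within the 5-15 minute range above
const TIMEOUT = Symbol("timeout");

async function externalScore(
  ip: string,
  fetchScore: (ip: string) => Promise<number>,
): Promise<number> {
  const hit = cache.get(ip);
  if (hit && hit.expires > Date.now()) return hit.score;

  // Race the provider against a deadline so a slow API can't stall the bid;
  // in the real request path this runs alongside internal checks.
  const deadline = new Promise<typeof TIMEOUT>((res) => setTimeout(() => res(TIMEOUT), 150));
  const result = await Promise.race([fetchScore(ip), deadline]);
  if (result === TIMEOUT) return 0; // fail open with a neutral score; don't cache

  cache.set(ip, { score: result, expires: Date.now() + CACHE_TTL_MS });
  return result;
}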

Why We Don't Use Deep Learning

The honest answer: it doesn't work well for our use case.

ML Requires Massive Labeled Data

You need millions of confirmed fraud/not-fraud examples. Labeling is expensive and often wrong.

Fraud Evolves Faster Than Models

By the time you train a model on last month's fraud patterns, fraudsters have moved on.

Explainability Matters

When we reject traffic, we need to explain why. "The black box says no" isn't acceptable for dispute resolution.

Signal-Based Is Faster

ML inference adds latency. Signal checks take microseconds. In RTB, speed matters.

Our approach: use interpretable signals that catch known fraud patterns, integrate external providers for sophisticated threats, and stay humble about what we can't detect.
