Squoosh's synthetic shoppers behave so similarly to real users that analytics tools, by default, count them right alongside genuine traffic. If those synthetic shoppers show up in a merchant's analytics, three things break at once. The merchant can't read real user behavior from their own dashboards. They can't cleanly verify the impact of changes Squoosh recommended, because the post-change data is mixed with synthetic traffic. And on our side, the behavioral models behind Squoosh's synthetic shoppers end up training on traffic that includes their own outputs, which degrades the realism of every future run.
That's the problem in a nutshell: we need synthetic shoppers that act like real users on the page but are invisible the moment analytics gets involved.
Why the obvious fix doesn't work
The intuitive answer is to install an analytics-blocking extension in each synthetic shopper's browser. We tried it. The extension blocked too much, including requests Shopify's own checkout page needs to function, so our shoppers couldn't complete checkout. A general-purpose extension was the wrong tool for the job: its rules can't distinguish the analytics calls we want silenced from the functional requests a purchase flow depends on.
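To make that failure mode concrete, here is a hypothetical sketch of the kind of coarse pattern matching a generic blocking extension applies. The rules and URLs below are illustrative only, not the actual extension or any real site's endpoints:

```python
import re

# Illustrative coarse rules of the kind a generic blocker might ship with.
COARSE_RULES = [re.compile(p) for p in (r"analytics", r"collect", r"track")]

def coarse_block(url: str) -> bool:
    """Block any request whose URL matches a coarse keyword rule."""
    return any(rule.search(url) for rule in COARSE_RULES)

# The same rule that catches a real analytics beacon...
coarse_block("https://www.google-analytics.com/g/collect")     # blocked, good
# ...also catches a hypothetical functional request, e.g. an
# order-tracking endpoint the storefront actually needs:
coarse_block("https://shop.example/apps/track-order/status")   # blocked, bad
```

Keyword rules have no notion of intent, which is why a blanket extension over-blocks.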
What we built instead
We sent agents to crawl 100 ecommerce sites to identify the most common analytics calls in the wild: Google Analytics, Google Tag Manager, Meta Pixel, Hotjar, Clarity, and similar platforms. From that crawl we built a universal blocklist that every Squoosh customer gets by default, so the bulk of analytics traffic is silenced before a synthetic shopper ever touches a customer site.
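The blocklist itself reduces to an exact-host-or-subdomain match against known analytics domains. A minimal sketch, assuming a hypothetical `is_blocked` helper; the domains listed are common examples of the platforms named above, not Squoosh's actual list:

```python
from urllib.parse import urlparse

# Illustrative entries; the real universal blocklist is larger.
UNIVERSAL_BLOCKLIST = {
    "www.google-analytics.com",
    "www.googletagmanager.com",
    "connect.facebook.net",   # Meta Pixel loader
    "static.hotjar.com",
    "www.clarity.ms",
}

def is_blocked(url: str) -> bool:
    """True if the request host is a blocklisted domain or a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in UNIVERSAL_BLOCKLIST)
```

Matching on the hostname rather than the raw URL string is what keeps this from over-blocking the way keyword rules do.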
A universal blocklist gets us most of the way there, but every site has its own quirks, especially merchants who route analytics through their own domain to evade ad blockers. To catch the rest, we built Sherlock: an agent that walks each new customer's site end-to-end and surfaces anything our default blocking missed, so we can patch it before the customer goes live.
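First-party proxied analytics can't be caught by a domain blocklist, since the host is the merchant's own. A hypothetical sketch of the kind of heuristic a crawler like Sherlock could apply, flagging same-domain requests whose query strings look like analytics beacons; the parameter names and function are illustrative assumptions, not Squoosh's actual detection logic:

```python
from urllib.parse import urlparse, parse_qs

# Parameter names commonly seen in analytics beacons (illustrative set).
BEACON_PARAMS = {"cid", "client_id", "tid", "ev", "event", "sid", "uid"}

def looks_like_beacon(url: str, first_party_host: str) -> bool:
    """Flag a first-party request whose payload resembles an analytics beacon."""
    parsed = urlparse(url)
    if parsed.hostname != first_party_host:
        return False  # third-party hosts are the universal blocklist's job
    params = set(parse_qs(parsed.query))
    # Two or more beacon-style parameters is suspicious enough to surface
    # for a human to review and add to the property-specific blocklist.
    return len(params & BEACON_PARAMS) >= 2
```

A heuristic like this only surfaces candidates; the point of the crawl is that a person verifies each one before it becomes a block rule.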
Clean Analytics from Day 1 with Squoosh
Keeping synthetic traffic out of customer analytics matters enough to us that we built property-level analytics exclusion directly into our onboarding process. Before any customer runs an experiment, we run a four-step verification:
- Sherlock walks the site. The agent completes a full purchase journey on the site, deliberately triggering every analytics call the site might fire. Anything that slips past our default blocking shows up in the customer's analytics during this run.
- Property-specific blocks if needed. If anything did pop through (usually nothing does), we add it to a property-specific blocklist for that customer.
- Functionality check. We send ten synthetic shoppers through the site to confirm the additional blocking didn't break anything on the page.
- Live monitoring. Once the customer completes their analytics setup with us, backend monitoring continuously scans their analytics for synthetic traffic. If anything slips through, our on-call team is alerted.
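The four steps above can be sketched as a simple pipeline. The function names are hypothetical stand-ins for the stages described, not Squoosh's actual API; they are injected as callables so the control flow stands on its own:

```python
from typing import Callable

def verify_property(
    site: str,
    run_sherlock: Callable[[str], list],          # step 1: returns leaked calls
    add_property_blocks: Callable[[str, list], None],  # step 2: patch blocklist
    run_shoppers: Callable[[str, int], bool],     # step 3: functionality check
    enable_live_monitoring: Callable[[str], None],  # step 4: ongoing monitoring
) -> bool:
    """Run the onboarding verification in order; True means live-ready."""
    leaks = run_sherlock(site)
    if leaks:  # usually empty; patch only when something got through
        add_property_blocks(site, leaks)
    if not run_shoppers(site, 10):  # ten shoppers confirm nothing broke
        return False
    enable_live_monitoring(site)
    return True
```

The ordering matters: monitoring is only enabled after the functionality check passes, so a broken block rule never reaches a live property.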
The result: clean analytics from day one, verified before you launch and monitored after. If synthetic traffic ever does get through, our on-call team catches it before you do.