June 3, 2026
Nick Selman
Shoplift Team
Head of Marketing

First Tests vs. Best Tests: What to Run When You Have No Testing History


Google Optimize was officially discontinued in September 2023, taking with it a lot of historical test data for a lot of brands. Now you're staring at a Shopify store, a roadmap with "launch testing program" on it, and a 60-day timeline. What do you actually run first?

This is the position a lot of brands are sitting in right now. Some are coming off Google Optimize's sunset and still evaluating replacements. Others parted ways with a CRO agency and discovered that the test archive wasn't theirs to keep. A few inherited the testing function from someone who left and found a roadmap with no receipts attached to it.

The instinct in all three cases is the same: run something big. Test the checkout. Redesign the PDP. Prove value fast so leadership keeps funding the program. The instinct makes sense. It's also wrong.

The best test you could run is not the first test you should run. With zero historical data, ambition is a guess in a costume. What you need first is a structured entry sequence (homepage, PDP, and navigation tests, run in that order) that builds the baseline that makes every subsequent test sharper, faster, and more likely to win.

Here's how to do it.

Why "Just Run Something Big" Fails When You Have No History

Historical test data does more than tell you what won last time. It quietly shapes every decision a mature CRO program makes, and when it's missing, you feel the absence in four places:

Effect size expectations. Is a 2% lift on your PDP exciting or boring? Without historical data, you have no idea. Mature programs know their normal range, which means they know when to celebrate, when to keep digging, and when to call a result a fluke.

Audience response patterns. Different segments (new vs returning, paid vs organic, discount-driven vs full price) respond differently to the same change. Without historical data, you don't know which patterns hold in your store, and you can't read tests at the segment level to catch what the aggregate hides.

Velocity baseline. How long does your traffic actually take to reach statistical significance on a real test? A calculator gives you an estimate; only running tests shows you the lived reality of your store.

Tool and team fluency. Can your team ship a test in three days or three weeks? Does your tool surface results clearly enough that a marketing manager can read them, or does every readout require an analyst? You learn this by doing it.
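To make the audience response point above concrete, here is a minimal sketch of how an aggregate readout can hide opposite reactions in two segments. All counts are hypothetical and chosen only for illustration; the helper names (`relative_lift`, `pooled`) are not from any particular tool.

```python
def relative_lift(control, variant):
    """Relative change in conversion rate; each arm is (visitors, orders)."""
    control_rate = control[1] / control[0]
    variant_rate = variant[1] / variant[0]
    return (variant_rate - control_rate) / control_rate


def pooled(segments, arm):
    """Sum visitors and orders for one arm across all segments."""
    visitors = sum(seg[arm][0] for seg in segments.values())
    orders = sum(seg[arm][1] for seg in segments.values())
    return visitors, orders


# Hypothetical (visitors, orders) counts, not real store data.
segments = {
    "new": {"control": (8000, 160), "variant": (8000, 200)},
    "returning": {"control": (2000, 120), "variant": (2000, 96)},
}

aggregate_lift = relative_lift(pooled(segments, "control"), pooled(segments, "variant"))
segment_lifts = {
    name: relative_lift(seg["control"], seg["variant"])
    for name, seg in segments.items()
}

print(f"aggregate: {aggregate_lift:+.1%}")
for name, lift in segment_lifts.items():
    print(f"{name}: {lift:+.1%}")
```

With these made-up numbers, the aggregate shows a modest gain of roughly 6%, while new visitors improve by about 25% and returning visitors drop by about 20%. A store with no testing history has no way to know whether a split like that is typical for it.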

The compounding cost of skipping these and jumping to an ambitious test is brutal: an inconclusive or misread early result poisons internal buy-in for months. The CFO who funded the program will remember the inconclusive checkout test long after they've forgotten the wins that followed.

Reframe the first 60 days like this: first tests are infrastructure. Best tests are outcomes. You can't build the second without the first.

The Entry Sequence Framework

Three tests, in order, designed to establish baseline data while still moving the metrics that matter.


Test 1: Homepage hero or value prop test

The homepage goes first for one reason that overrides everything else: it has the most traffic. More traffic means a faster path to significance, which means faster learning. When you're new to a testing program, learning velocity matters more than the absolute size of any single win.
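The link between traffic and time to significance can be sketched with a standard sample-size approximation for a two-sided, two-proportion z-test. This is a rough planning estimate, not a description of how any specific tool computes significance. The traffic figures, the 3% baseline, and the 10% relative lift are assumptions for illustration, and the calculation ignores any multiple-comparison correction for splits with more than two arms.

```python
import math
from statistics import NormalDist


def visitors_per_arm(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant to detect the given relative lift."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def days_to_run(daily_visitors, arms, baseline_rate, relative_lift):
    """Days needed if daily traffic is split evenly across all arms."""
    total_needed = visitors_per_arm(baseline_rate, relative_lift) * arms
    return math.ceil(total_needed / daily_visitors)


# Hypothetical traffic levels for a homepage and a single lower-traffic page.
homepage_days = days_to_run(daily_visitors=6000, arms=2, baseline_rate=0.03, relative_lift=0.10)
low_traffic_days = days_to_run(daily_visitors=1000, arms=2, baseline_rate=0.03, relative_lift=0.10)
four_way_days = days_to_run(daily_visitors=1000, arms=4, baseline_rate=0.03, relative_lift=0.10)

print(homepage_days, low_traffic_days, four_way_days)
```

Under these assumptions the lower-traffic page needs several times as many days as the homepage, and splitting it four ways roughly doubles that again. Treat the output as a planning estimate to compare against what your first real test shows.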

Pick something low-risk. The point of test one is not to find a 15% lift; it's to confirm that your tool fires correctly, your team can ship, and your significance math matches reality. Good candidates:

  • Hero copy or headline rewording
  • Primary CTA wording ("Shop now" vs "Find your fit" vs product-specific language)
  • Above-the-fold layout adjustments
  • Social proof placement (review count, trust badges, press mentions)

What you actually learn from test one goes beyond the winner. You learn your site's day-of-week conversion volatility. You learn whether your traffic estimate was accurate. You learn how your team responds to mid-test stakeholder questions. You learn what your tool's "we have a winner" notification actually means in practice.
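One way to put a number on day-of-week volatility is to pool sessions and orders by weekday and look at the spread of the resulting conversion rates. The sketch below assumes you can export daily sessions and orders from your analytics. The sample figures are invented, and the coefficient of variation is just one reasonable choice of spread measure.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean, pstdev

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def weekday_conversion_rates(daily_rows):
    """daily_rows: iterable of (date, sessions, orders). Returns {weekday: rate}."""
    sessions = defaultdict(int)
    orders = defaultdict(int)
    for day, day_sessions, day_orders in daily_rows:
        name = WEEKDAYS[day.weekday()]  # date.weekday(): Monday is 0
        sessions[name] += day_sessions
        orders[name] += day_orders
    return {name: orders[name] / sessions[name] for name in sessions if sessions[name] > 0}


def weekday_volatility(rates):
    """Coefficient of variation of conversion rate across weekdays."""
    values = list(rates.values())
    return pstdev(values) / mean(values)


# Two hypothetical weeks of daily data, starting on a Monday.
start = date(2026, 5, 4)
sessions = [5000, 4800, 4700, 4900, 5200, 6100, 6300] * 2
orders = [140, 130, 125, 135, 150, 210, 230] * 2
daily_rows = [
    (start + timedelta(days=i), s, o)
    for i, (s, o) in enumerate(zip(sessions, orders))
]

rates = weekday_conversion_rates(daily_rows)
spread = weekday_volatility(rates)
print({name: round(rate, 4) for name, rate in rates.items()}, round(spread, 3))
```

If the spread is large, that is a reason to run tests in whole-week increments so each variant sees the same mix of weekdays.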

If test one is inconclusive, that's still a useful result. You now know that hero copy isn't where your homepage friction lives, and your next test can move down the page.

Test 2: PDP test

The PDP is where revenue happens on most Shopify stores, which is exactly why it's not test one. PDPs are more sensitive than homepages since small changes can swing AOV, return rates, and add-to-cart behavior in ways that take longer to interpret. You want one full homepage cycle behind you before you touch them, because the homepage test will have already given you a feel for how your store behaves under test conditions.

Entry-level PDP tests that tend to produce readable results:

  • Above-the-fold layout (image left vs right, gallery format)
  • Add-to-cart button treatment (color, size, sticky behavior on mobile)
  • Trust badge or guarantee placement
  • Review module position (above the fold vs below the description)
  • Variant selector format (dropdown vs swatches vs buttons)

One thing to call out specifically for Shopify operators: most stores can run PDP tests at the template level, meaning one test deploys across every product using that template. This is enormous. It's the difference between testing one product and learning nothing transferable, versus testing one template and learning something that applies to your entire catalog. If your tool doesn't support template-level testing, you're going to outgrow it fast.
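The reason template-level testing transfers is that every product on a template contributes to one pooled baseline. A minimal sketch, assuming you can export sessions and orders per product along with the template each product uses; the product handles and template names below are hypothetical.

```python
from collections import defaultdict


def baseline_by_template(rows):
    """rows: iterable of (product_handle, template, sessions, orders).

    Returns the pooled conversion rate for each template.
    """
    totals = defaultdict(lambda: [0, 0])  # template -> [sessions, orders]
    for _handle, template, sessions, orders in rows:
        totals[template][0] += sessions
        totals[template][1] += orders
    return {
        template: orders / sessions
        for template, (sessions, orders) in totals.items()
        if sessions > 0
    }


# Hypothetical export: three products across two templates.
rows = [
    ("linen-shirt", "product.default", 1200, 30),
    ("wool-coat", "product.default", 800, 20),
    ("gift-set", "product.bundle", 500, 20),
]

baselines = baseline_by_template(rows)
for template, rate in baselines.items():
    print(f"{template}: {rate:.2%}")
```

Pooling this way also shows how much traffic a template-level test actually has to work with, which is usually far more than any single product page.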

What you learn from test two: which PDP elements your buyers actually weigh, how your AOV behaves under test conditions, and whether your significance timeline holds up on a lower-traffic page than the homepage.

Test 3: Navigation or category-level test

Navigation tests go third because they affect cross-funnel behavior. A change to your mega-menu doesn't just move conversion on the homepage; it moves traffic distribution across the entire site, which changes the inputs to every other test you're running. You want homepage and PDP baselines in place before you start moving traffic around, because otherwise you can't tell whether a downstream change came from the navigation test or from normal variance.

Tests worth running here:

  • Collection grid density (3-up vs 4-up, with/without filters visible)
  • Filter visibility and placement (sidebar vs top bar, collapsed vs expanded by default)
  • Mega-menu vs simple navigation
  • Mobile menu structure (hamburger vs bottom nav vs persistent CTA)

Navigation tests are often the most surprising of the three. Brands frequently discover that their assumption about how buyers move through the catalog is wrong, that the "Shop All" link nobody thought mattered is actually the highest-converting nav element, or that the carefully curated category structure is being bypassed entirely on mobile.

What "Ambitious" Tests Look Like Once You're Ready

Once the entry sequence is behind you, the ambitious tests stop being guesses. You have baseline effect sizes, velocity data, and a team that can read results. Now you can responsibly run:

  • Checkout flow tests (shipping calculator placement, guest checkout prominence, payment method ordering)
  • Pricing and offer tests (anchor pricing, bundle pricing, threshold-based free shipping)
  • Multi-page funnel tests that touch the homepage and PDP together
  • Personalization tests segmented by traffic source, returning vs new, or AOV tier

Every one of these is harder to design, run, and interpret than the entry sequence tests. They require more traffic to reach significance, they touch revenue more directly, and they're more politically expensive when they go sideways. The entry sequence is what earns you both the political capital and the data to run them well.

How to Tell You've Graduated Past First Tests

Below is a checklist for knowing when you're ready to stop running entry sequence tests and start running the ambitious ones:

  • You know your site's typical effect size range (e.g., "wins on our store tend to land between 4% and 9%")
  • You've reached significance on at least two or three tests (winners and losers both count as learning)
  • Your team can ship a new test in under a week
  • Your stakeholder cadence doesn't require re-explaining what significance means every time
  • You can point to a baseline conversion rate by template, not just sitewide
  • You have a documented hypothesis library, even if it started as a Notion page

If you can check four or more of those, your next test can be ambitious. If you can't, run one more entry sequence test before you reach for the checkout.
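The checklist and its four-or-more rule can be written down as a small scorecard, which is handy if you want to track readiness alongside your hypothesis library. This is only a sketch. The field names are invented, and the thresholds (two significant tests, fewer than seven days to ship) are one reading of the list above.

```python
from dataclasses import dataclass


@dataclass
class Readiness:
    knows_effect_size_range: bool
    significant_tests: int
    days_to_ship: int
    stakeholders_fluent: bool
    has_template_baselines: bool
    has_hypothesis_library: bool

    def checks_passed(self):
        """Count how many of the six checklist items are satisfied."""
        return sum([
            self.knows_effect_size_range,
            self.significant_tests >= 2,  # "at least two or three tests"
            self.days_to_ship < 7,        # "under a week"
            self.stakeholders_fluent,
            self.has_template_baselines,
            self.has_hypothesis_library,
        ])

    def ready_for_ambitious_tests(self):
        """Four or more checks means the next test can be ambitious."""
        return self.checks_passed() >= 4


# Hypothetical program state after three entry sequence tests.
program = Readiness(
    knows_effect_size_range=True,
    significant_tests=3,
    days_to_ship=5,
    stakeholders_fluent=False,
    has_template_baselines=True,
    has_hypothesis_library=False,
)
print(program.checks_passed(), program.ready_for_ambitious_tests())
```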

Five Mistakes Brands Make When Restarting from Zero

  1. Testing the checkout first because "that's where the money is." It's also where the variance is highest, the traffic is lowest, and the interpretation is hardest. Save it for round four.
  2. Running four-way splits on low-traffic pages. Splitting traffic four ways on a page that struggles to reach significance with a two-way split means you'll never conclude anything. Two-way splits until you have velocity data.
  3. Ambitious redesigns disguised as A/B tests. If your variant changes seven things at once, you haven't run a test; you've run a launch. You can't learn what worked.
  4. Ignoring template-level testing in favor of one-off page tests. One-off tests are politically satisfying and operationally useless. The whole point of testing is transferable learning.
  5. Pulling the test before it reaches significance because a stakeholder got anxious. This is the most common and most expensive mistake. Document your significance criteria before the test starts, and hold the line.
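Documenting significance criteria before the test starts can be as simple as freezing them in code and refusing to read out a result until they are met. The sketch below uses a pooled two-proportion z-test as the significance check. That choice, the class and function names, and the sample thresholds are all assumptions; your testing tool may use a different statistical method.

```python
import math
from dataclasses import dataclass
from statistics import NormalDist


@dataclass(frozen=True)
class StopCriteria:
    """Agreed before launch; frozen so it cannot be edited mid-test."""
    min_visitors_per_arm: int
    min_days: int
    alpha: float = 0.05


def two_proportion_p_value(c_visitors, c_orders, v_visitors, v_orders):
    """Two-sided p-value for a difference in conversion rate (pooled z-test)."""
    pooled = (c_orders + v_orders) / (c_visitors + v_visitors)
    se = math.sqrt(pooled * (1 - pooled) * (1 / c_visitors + 1 / v_visitors))
    if se == 0:
        return 1.0  # no conversions (or all conversions) in both arms
    z = (v_orders / v_visitors - c_orders / c_visitors) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def readout(criteria, days_run, c_visitors, c_orders, v_visitors, v_orders):
    """Return 'keep running' until the pre-agreed criteria are met."""
    too_short = days_run < criteria.min_days
    too_small = min(c_visitors, v_visitors) < criteria.min_visitors_per_arm
    if too_short or too_small:
        return "keep running"
    p_value = two_proportion_p_value(c_visitors, c_orders, v_visitors, v_orders)
    return "significant" if p_value < criteria.alpha else "inconclusive"


# Hypothetical criteria and counts.
criteria = StopCriteria(min_visitors_per_arm=20000, min_days=14)
print(readout(criteria, 5, 7000, 210, 7000, 260))      # early peek
print(readout(criteria, 14, 20000, 600, 20000, 700))   # criteria met
```

The early peek returns "keep running" no matter how good the variant looks, which gives you something concrete to point to when a stakeholder asks to call the test early.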

Choosing a Tool That Supports the Entry Sequence

Most brands choosing a Google Optimize replacement right now are evaluating tools on price and feature breadth. Those matter, but they're not the criteria that determine whether your first 90 days go well. Look for:

  • Shopify-native instrumentation. A tool that lives inside your Shopify environment, reads your existing product and order data, and doesn't require a developer to hook up basic events. If your team needs engineering tickets to ship a hero copy test, your velocity is dead before it starts.
  • Template-level testing as a default. Not a feature you pay extra for. The Shopify ecosystem runs on templates, and your testing tool should too.
  • Clear significance reporting. A non-analyst should be able to read the dashboard and know whether the test is conclusive. If significance is buried behind a stats glossary, your stakeholder cadence will suffer.
  • No-dev test creation for standard tests. Copy changes, layout swaps, and CTA tests should all be ship-in-an-afternoon work, not engineering sprints.

Shoplift was built specifically for this entry sequence, with Shopify-native instrumentation, template-level testing by default, and a setup designed to get brands from zero to first test in days rather than weeks. But the criteria above hold regardless of which tool you choose; they're what separates a testing program that builds momentum from one that stalls in month two.

FAQs

What should I A/B test first on a Shopify store? 

Start with a low-risk homepage test, such as hero copy, primary CTA wording, or above-the-fold layout. The homepage has the highest traffic on most Shopify stores, which means the fastest path to statistical significance and the fastest learning velocity. Save PDP and checkout tests for after you've confirmed your tool, team, and significance timeline are working.

What's the best Google Optimize alternative for Shopify? 

The best Google Optimize replacement for a Shopify store is one built natively for the Shopify environment, with template-level testing as a default and no-dev test creation for standard changes. Shopify-native tools like Shoplift handle the instrumentation and template behavior automatically, which matters more than feature breadth when you're rebuilding a testing program from scratch.

How do I build a CRO strategy with no prior testing data? 

Run a structured entry sequence of three tests in order: a homepage test, a PDP test, and a navigation test. This sequence builds baseline data, such as typical effect size, velocity, and team fluency, before you commit to ambitious tests like checkout or pricing experiments. Without that baseline, prioritization is guesswork.

How long should my first A/B test run? 

Long enough to reach statistical significance, which depends on your traffic and effect size, not a fixed calendar window. For most Shopify stores, expect two to three weeks for a homepage test. Document your significance criteria before the test starts, and don't pull it early because a stakeholder is anxious.

Can I A/B test my Shopify store without a developer? 

Yes, if you choose a tool with no-dev test creation for standard changes like copy, layout, CTAs, and image swaps. Most Shopify-native testing tools support this out of the box. Developer involvement should be reserved for complex tests like multi-page funnels or custom event tracking.

What's the difference between template-level and page-level A/B testing on Shopify? 

Template-level testing deploys a single test across every product or page using that template, for example, one PDP test running across your entire catalog. Page-level testing runs on individual URLs. Template-level testing produces transferable learning and scales with your catalog; page-level testing produces one-off results that don't generalize.

How many A/B tests should I run per month when starting out? 

One at a time for the first three tests. Running concurrent tests before you have baseline data makes results harder to interpret, because you can't isolate which change moved the metric. After the entry sequence, two to three concurrent tests is a reasonable cadence for most Shopify stores.
