May 8, 2026

Nick Selman

Shoplift Team

Head of Marketing

Shopify Theme Testing During Redesigns: Why Not to Pause

Testing Through a Rebrand: Why Not to Wait for the New Theme

Your new creative director is redesigning the site. The design system is taking shape, and somewhere in a Slack channel, someone has quietly suggested pausing your A/B testing program until the new theme is live.

The instinct to pause makes sense. Why run tests on a design you're about to retire? It feels tidy to close out the old and open fresh with the new.

The problem with that logic is that the window during your redesign is actually one of the most valuable testing periods you'll ever have. Creative decisions are still in motion, the design hasn't hardened, and the behavioral questions your tests answer right now will still need answering after the new theme launches. Except by then, you'll be answering them with developer hours instead of test traffic.

Teams that pause testing during redesigns launch into the unknown. Teams that test during redesigns launch with evidence and a basis for confidence in the path forward.

Rory O'Hara, Senior Ecommerce Account Strategist at By Association Only (BAO), contributed insights throughout this piece. BAO is a Shopify Plus design and development agency that has worked with brands in luxury, fashion, lifestyle, and cosmetics since 2010. Rory has run point on enough redesigns to know where testing programs tend to break down.

Why teams hit pause

The arguments for pausing aren't unreasonable. Test results may not carry over to a new layout, and running experiments on a design the team is about to replace feels wasteful. There's also a practical concern: active tests can create noise for developers building something new alongside them.

These are real friction points, but they're reasons to be selective about what you test, not reasons to stop testing altogether.

There's another layer that rarely gets named: organizational dynamics. When a new creative director is in the picture, there's often an implicit sense that the CRO roadmap should yield to the creative vision. Testing can feel like it's running counter to that momentum, so programs don't just pause for strategic reasons. They pause because nobody wants to create friction with the person driving the new direction.

The result is the same either way. When testing stops, the vacuum gets filled. Stakeholder opinions, designer instincts, and executive preferences move into the space that data should occupy. Decisions still get made. They just get made with less to back them up.

By the time the new theme launches, you've spent weeks building on assumptions you could have tested.

“There are also factors that can be tested whose impact can be carried over between designs. CTA content is a great example here. Decisions such as wording choice or whether to include iconography aren’t necessarily specific to their surrounding design elements, but can have significant impacts on a user’s confidence and overall conversion.” - Rory O’Hara @ By Association Only

What you're losing by waiting

Behavioral data has a shelf life, but the questions it answers are more durable than most teams realize.

Does urgency messaging lift conversion on your product pages? Does your hero copy hold attention above the fold? Does social proof placed near the add-to-cart button outperform social proof lower on the page?

These questions don't expire when your theme changes. The answers might shift slightly with a new design, but directional findings hold. Say your tests show that benefit-led headlines outperform feature-led ones, or that free-shipping framing beats percentage discounts for your audience. If you've validated those findings before the redesign ships, they get baked in from day one rather than discovered six months post-launch.

There's another cost that's harder to quantify: building the wrong thing. Redesigns are expensive, and when teams launch a new theme without validated hypotheses, they often spend the next quarter undoing decisions that a two-week test could have informed before a single line of code was written.

“We’ve seen cases where layout testing for key templates can result in conversion lifts as high as 10% or even 15%. These are changes that tweak or adjust the page’s design rather than take a whole new approach. That presents a huge opportunity to inform the decision making process during redesign projects in a way that optimises performance.” - Rory O’Hara @ By Association Only

Shopify store optimization isn't a post-launch activity. It's an input to the design process itself.

The case for testing during a redesign

The core mistake in the "pause and restart" approach is treating all tests the same.

Design-layer tests are the ones that genuinely don't survive a rebrand. Testing whether a teal button outperforms a navy one on your current theme? That result probably won't carry over. Testing whether a two-column product grid converts better than a three-column one in your existing layout? Also theme-dependent. Hold those.

Behavioral and messaging tests are a different category entirely: CTA copy, headline framing, offer structure, social proof format, urgency logic. These are hypotheses about how your customers think and respond, not about how your current design looks. Those hypotheses apply whether your theme is two years old or two weeks old.

The practical move is to split your test backlog into theme-dependent and theme-independent. Then run the theme-independent tests now, while the redesign is in motion.
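The split described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed workflow: the `BacklogItem` type, the `area` tags, and the `THEME_DEPENDENT_AREAS` set are all hypothetical names invented for this example, and you would swap in whatever categories your own backlog uses.

```python
from dataclasses import dataclass

# Hypothetical tags for illustration: areas whose results depend on the current theme.
THEME_DEPENDENT_AREAS = {"color", "layout"}


@dataclass
class BacklogItem:
    name: str
    area: str  # what the test changes, e.g. "color" or "cta_copy" (assumed labels)


def split_backlog(backlog):
    """Return (run_now, hold): theme-independent tests first, theme-dependent second."""
    run_now, hold = [], []
    for item in backlog:
        if item.area in THEME_DEPENDENT_AREAS:
            hold.append(item)      # design-layer test: wait for the new theme
        else:
            run_now.append(item)   # behavioral or messaging test: run during the redesign
    return run_now, hold


# Example backlog built from the kinds of tests mentioned in this article.
backlog = [
    BacklogItem("Teal vs navy button", "color"),
    BacklogItem("Two- vs three-column product grid", "layout"),
    BacklogItem("Benefit-led vs feature-led headline", "headline_framing"),
    BacklogItem("Free shipping vs percentage discount", "offer_framing"),
    BacklogItem("Review quotes vs star ratings", "social_proof_format"),
]

run_now, hold = split_backlog(backlog)
print("Run now:", [item.name for item in run_now])
print("Hold:", [item.name for item in hold])
```

Tagging each idea by what it changes, rather than by which page it lives on, is the design choice that matters here: a headline test and a grid test can sit on the same template and still land in different buckets.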

It won't always feel clean. New elements won't yet fully fit the brand aesthetic, the site is in flux, and some results will feel inconclusive. That's fine. The goal isn't perfection; it's signal. Even directional findings gathered in an imperfect environment are more useful than launching a new theme with no findings at all.
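To make "directional" concrete, here is a rough Python sketch that computes a variant's relative lift and a two-sided p-value from a standard two-proportion z-test. The visitor and conversion counts are made up for illustration, the function names are hypothetical, and the z-test is one common way to read a result, not necessarily how any particular testing tool reports it.

```python
import math


def relative_lift(control_rate, variant_rate):
    """Relative change in conversion rate, e.g. 0.15 means a 15% lift."""
    return (variant_rate - control_rate) / control_rate


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up numbers: 10,000 visitors per arm, 200 vs 230 conversions.
visitors = 10_000
control_conversions, variant_conversions = 200, 230

lift = relative_lift(control_conversions / visitors, variant_conversions / visitors)
p_value = two_proportion_p_value(
    control_conversions, visitors, variant_conversions, visitors
)

print(f"Relative lift: {lift:.1%}")
print(f"Two-sided p-value: {p_value:.3f}")
```

With these assumed counts the variant shows a sizable relative lift, yet the p-value stays above the conventional 0.05 cutoff. That is the kind of result worth recording as a directional input to the design brief and retesting once the new theme is live.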

When a test proves that benefit-led headlines outperform feature-led ones, that finding belongs in the design brief, not in a post-launch roadmap. When a test shows customers respond better to review quotes than star ratings, the new theme should be built to support that format. These aren't just CRO metrics. They're design inputs.

This is also where native Shopify theme testing earns its keep mid-redesign. Running meaningful experiments in parallel with the redesign process, without heavy developer involvement, means testing can actually happen. Not in spite of the redesign, but alongside it.

What to test right now

If you're mid-redesign and want your testing program to actually feed the design process, focus on tests that answer questions the new theme will still need to answer.

Value proposition clarity. Does your hero copy communicate the right thing quickly enough? Test headline variants that lead with different benefits or framings. Whatever wins should inform your new above-the-fold design.

Social proof format and placement. Star ratings, review counts, pull quotes, press logos. These perform differently depending on your audience and category. Find out what your customers respond to before the new layout locks in where social proof lives.

CTA language. Transactional versus benefit-oriented framing can move the needle, and the winner belongs on every CTA in the new theme.

Offer framing. Free shipping versus percentage discount versus bundle incentive. If you're redesigning around a promotional strategy, know which frame converts before you build around it.

Product page information hierarchy. What do your buyers need to see first to feel confident adding to cart? The answer shapes how your new PDP should be structured.

Elements you've quietly deprioritized. Most sites have things that worked (guarantees, trust signals, loyalty messaging) but got buried during a busy stretch. A redesign is the right moment to retest them before they get locked out of the new theme entirely.

Every one of these results is a direct input into a design decision. Run them now, and your designers have evidence to work from instead of assumptions.

How to bring this to your team

The internal conversation isn't "we should keep testing despite the redesign." That framing sets up a turf conflict between CRO and creative.

The better framing: testing is how we make the redesign better before we build it.

If the creative director sees test findings as inputs that sharpen their vision rather than constraints imposed on it, the dynamic changes completely. Position your testing sprint not as a parallel program, but as research that feeds the brief. Designers who build from validated behavioral data ship fewer revisions. That's a pitch most creative leads will take.

Propose a lightweight pre-launch sprint. A defined window, a focused backlog of theme-independent tests, and one clear output: a findings document that feeds directly into the design brief. Frame the results as answers to questions the new theme will have to answer anyway. You're just answering them earlier, cheaper, and with data instead of instinct.

For broader stakeholders, the pitch is simple: "We'll have data behind the design decisions before we spend developer hours building them." Hard to argue with that.

The redesign window is an opportunity

The conventional wisdom is to treat a major redesign as a reset. Close out everything, start fresh, run new tests once the new theme is live.

The case for the opposite is straightforward. The redesign window is exactly when your findings are most actionable. Creative decisions are still in motion, the design hasn't hardened, and behavioral data gathered now shapes what gets built rather than what gets revised after the fact.

If you're working with a design partner on the build, find one that treats testing findings as design inputs rather than friction. That's how the By Association Only team operates, and it's the dynamic that makes a redesign hold up.

If your A/B testing program was put on hold for a redesign, the first step is to audit your backlog. Separate the theme-dependent tests from the behavioral ones. Then start running the latter.

The redesign is still in progress. There's time to make it better.

Frequently Asked Questions

Can I run A/B tests on Shopify while a redesign is in progress?

Yes, and you should. Focus on theme-independent tests like copy, offer framing, and social proof format. These don't require changes to your current design and generate findings that directly inform your new theme before it's built.

Won't my test results be invalidated once the new theme launches?

Design-specific results (button colors, layout variants) may not carry over. But behavioral findings that show how your customers respond to different messaging, offers, or proof formats tend to be durable. Those results should be treated as inputs into the new design, not discarded with the old one.

How do I know which tests are worth running before a redesign?

Split your backlog into theme-dependent and theme-independent tests. If a test is about visual design or layout, hold it. If it's about copy, logic, offer structure, or messaging, run it now. The new theme will need to answer those same questions anyway.

Does A/B testing during a redesign slow down the dev team?

It doesn't have to. Theme-independent tests can typically run without touching the development roadmap. Tools that support theme A/B testing natively on Shopify make it possible to run experiments in a separate environment from whatever the dev team is building.

How do I get creative buy-in to test during a redesign?

Position test findings as design inputs, not CRO metrics. A creative director who sees behavioral data as something that sharpens their vision, rather than challenges it, becomes an ally. The pitch: we're answering questions your new design will need to answer anyway, just before the build rather than after.

When should I actually pause testing during a redesign?

If site traffic is being significantly disrupted by the build process, or you're in a final QA and staging phase, a short pause is reasonable. But the default shouldn't be "pause until launch." It should be "test what we can, when we can, and feed those findings forward."
