Mida uses a sequential testing methodology — you can monitor your test as data comes in, stop early when there's a clear winner or loser, and keep it running when results are inconclusive, all without losing statistical validity.

Tip: You can also have Claude or ChatGPT summarize results, flag significant winners, and surface insights — see our Mida MCP setup guide.

Under the hood, Mida runs every test through two engines: Bayesian (default) and Frequentist (alternate). Bayesian is what most marketers and agencies will want — it gives you a straight answer in plain English. Frequentist is there for teams that need to report in p-values.

You can switch between the two in Configuration → Analysis Method without resetting the test. The method can only be changed before the test goes live or while it is paused.

What is Sequential Testing?

Unlike traditional fixed-horizon testing that requires a predetermined sample size, sequential testing allows for continuous monitoring of results as data accumulates. This means you can:

Check results at any time during the test
Stop the test early when clear winners or losers emerge
Continue collecting data when results are inconclusive
Maintain statistical validity throughout the monitoring process
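To see why the monitoring method matters, here is a small simulation sketch. It is not Mida's implementation: it runs A/A tests (both arms share the same true conversion rate) through an ordinary fixed-horizon z-test and counts how often a "significant" result appears when you check once versus ten times. The traffic numbers, conversion rate, and function names are assumptions chosen for illustration.

```python
import random
from statistics import NormalDist

def z_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    if se == 0:
        return 1.0  # no conversions (or all conversions): nothing to test
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(peeks, visitors_per_peek=200, rate=0.10,
                        alpha=0.05, trials=1000, seed=7):
    """Share of A/A tests declared significant at ANY of the peeks."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        conv_a = conv_b = n = 0
        for _ in range(peeks):
            # Both arms convert at the same true rate, so any "win" is a false positive.
            conv_a += sum(rng.random() < rate for _ in range(visitors_per_peek))
            conv_b += sum(rng.random() < rate for _ in range(visitors_per_peek))
            n += visitors_per_peek
            if z_test_p_value(conv_a, n, conv_b, n) < alpha:
                hits += 1
                break  # the experimenter stops at the first "significant" peek
    return hits / trials

single_look = false_positive_rate(peeks=1)
ten_looks = false_positive_rate(peeks=10)
print(f"one look: {single_look:.1%}, ten looks: {ten_looks:.1%}")
```

With a naive test, the ten-look false positive rate comes out well above the nominal 5%. Sequential methods are designed so that repeated checks do not inflate the error rate in this way.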

How to read a Bayesian result (default)

Bayesian gives you three numbers. Read them together.

Confidence — the chance this variant is genuinely better than control. 95% confidence = a 95% chance the variant wins. Read it at face value.

Credible Interval — the range your variant's true lift most likely falls within. If the whole range is above 0%, the variant wins. If it crosses 0%, it's still too early. If it's narrow and close to 0%, the variants are basically the same.

Risk — what you'd lose if you ship this variant and turn out to be wrong. Below 1% is Very Safe, below 5% is Safe, above that is risky.
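For readers who want to see where numbers like these can come from, the sketch below estimates all three from raw counts using a Beta-Binomial model with a uniform prior and Monte Carlo sampling. This is a common textbook approach and an assumption on our part, not a description of Mida's internal engine. In particular, treating Risk as the expected relative loss is an assumption, and the function names and sample counts are hypothetical. Only the 1% and 5% risk thresholds come from the text above.

```python
import random

def bayesian_summary(conv_c, n_c, conv_v, n_v, draws=20000, seed=1):
    """Monte Carlo estimate of confidence, credible interval, and risk.

    Assumes a Beta(1, 1) prior on each conversion rate (illustrative choice).
    """
    rng = random.Random(seed)
    wins = 0
    loss_total = 0.0
    lifts = []
    for _ in range(draws):
        p_c = rng.betavariate(1 + conv_c, 1 + n_c - conv_c)
        p_v = rng.betavariate(1 + conv_v, 1 + n_v - conv_v)
        wins += p_v > p_c
        # Relative loss if we ship the variant and control was actually better.
        loss_total += max(p_c - p_v, 0.0) / p_c
        lifts.append((p_v - p_c) / p_c)
    lifts.sort()
    return {
        "confidence": wins / draws,  # chance the variant beats control
        "credible_interval": (lifts[int(0.025 * draws)],
                              lifts[int(0.975 * draws) - 1]),  # central 95% of lift
        "risk": loss_total / draws,  # expected relative loss (assumed definition)
    }

def risk_label(risk):
    """Map risk to the labels described above: <1% Very Safe, <5% Safe."""
    if risk < 0.01:
        return "Very Safe"
    if risk < 0.05:
        return "Safe"
    return "Risky"

# Hypothetical counts: control 500/10,000 conversions, variant 580/10,000.
summary = bayesian_summary(500, 10_000, 580, 10_000)
low, high = summary["credible_interval"]
print(f"confidence {summary['confidence']:.1%}, "
      f"lift {low:+.1%} to {high:+.1%}, risk: {risk_label(summary['risk'])}")
```

In this hypothetical example the whole credible interval sits above 0%, which matches the "variant wins" reading described above.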

What the recommendation banner means

Mida combines all three into a single call at the top of your report:

✅ Very Safe / Safe to call [variant] the winner — Ship it.
✅ [variant] is ahead with low risk. Safe to call it. — Confidence is just under threshold but risk is tiny. Safe to ship if you want to move fast.
⚠️ [variant] is ahead with low risk. Keep running. — Promising. Let it run a bit longer.
⚠️ [variant] is likely the winner, but risky. — Lift looks big but sample is thin. Don't ship yet.
✅ Safe to deploy! No meaningful difference detected — The variants perform the same. Pick whichever you prefer.
📊 Collecting data... / ⚠️ Need more data — Keep the test running.

The Frequentist Framework

When you select the Frequentist method, Mida's statistical engine:

Calculates the probability of observing the test results if there were truly no difference between variants
Controls the false positive rate (Type I error) at your chosen significance level (typically 5%, which corresponds to a 95% confidence level)
Provides confidence intervals to show the range of likely true effect sizes
Makes no assumptions about prior probabilities, relying purely on observed data

This combination of sequential testing and frequentist statistics ensures:

Efficient resource use by enabling early stopping when appropriate
Protection against false conclusions through rigorous statistical controls
Clear, interpretable results based on observed data
Flexible monitoring without compromising statistical validity
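As a rough illustration of the frequentist quantities listed above, the sketch below computes a p-value and a confidence interval for the difference in conversion rates with a standard two-proportion z-test. It is a plain fixed-horizon calculation without any sequential correction, so it should not be read as Mida's actual engine. The function name and the sample counts are hypothetical.

```python
from statistics import NormalDist

def frequentist_summary(conv_c, n_c, conv_v, n_v, confidence_level=0.95):
    """Two-proportion z-test plus a confidence interval for the difference."""
    p_c, p_v = conv_c / n_c, conv_v / n_v
    diff = p_v - p_c

    # p-value: how surprising is this difference if the true rates were equal?
    pooled = (conv_c + conv_v) / (n_c + n_v)
    se_pooled = (pooled * (1 - pooled) * (1 / n_c + 1 / n_v)) ** 0.5
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Confidence interval for the difference (unpooled standard error).
    se_diff = (p_c * (1 - p_c) / n_c + p_v * (1 - p_v) / n_v) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)

    return {
        "p_value": p_value,
        "ci": (diff - z_crit * se_diff, diff + z_crit * se_diff),
        "significant": p_value < 1 - confidence_level,
    }

# Hypothetical counts: control 500/10,000 conversions, variant 580/10,000.
result = frequentist_summary(500, 10_000, 580, 10_000)
print(f"p-value {result['p_value']:.4f}, significant: {result['significant']}")
```

A p-value below 0.05 corresponds to clearing a 95% confidence requirement. The interval shows how large the true difference could plausibly be.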

Test Result Cases

Case 1: Clear Winner

Green: This shows a winning variant. The required confidence level is 95%, and the test's statistical significance of 99.71% exceeds that threshold, so the result is statistically significant. In frequentist terms, this means that if there were truly no difference between the variants, a result at least this extreme would occur less than 0.3% of the time. With an improvement of 233.04%, the lift is also large enough to matter.

Next, look at the confidence interval. At the 95% confidence level, it gives the range of plausible conversion rates for the variant: the lower bound is the “worst case” scenario and the upper bound is the “best case” scenario. Here, the range runs from 0.44% to 1.32%. Since both bounds are above the Control CR (0.26%), you can feel confident about the change.
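The arithmetic behind the improvement figure can be sketched as follows, assuming "improvement" means relative lift over the Control CR (the usual convention, though the report's exact formula is not stated here). The function names are hypothetical, and the Control CR is the rounded 0.26% quoted above, so the implied variant rate is approximate.

```python
def relative_lift(control_cr, variant_cr):
    """Relative improvement of the variant over control, in percent."""
    return (variant_cr - control_cr) / control_cr * 100

def implied_variant_cr(control_cr, lift_percent):
    """Variant conversion rate implied by a control rate and a relative lift."""
    return control_cr * (1 + lift_percent / 100)

control_cr = 0.26    # percent, from the example above
improvement = 233.04  # percent, from the example above

variant_cr = implied_variant_cr(control_cr, improvement)
print(f"implied variant CR: about {variant_cr:.2f}%")
```

The implied variant rate falls inside the 0.44% to 1.32% range quoted above, which is what you would expect if that range describes the variant's conversion rate.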

Case 2: Clear Loser

Red: This shows a losing variant. With a required confidence level of 90%, a variant is called a loser when its statistical significance value falls below 10% (100% - 90%). Variant 1's value is 6.71%, which is below that threshold. In other words, you can be (100 - 6.71) = 93.29% confident that the losing variant, Variant 1, is indeed inferior to the winning variant, Control.

Looking at the confidence interval, we see a range from 4.69% to 5.29% Conversion Rate (CR). In simpler terms, 4.69% represents the “worst case” scenario and 5.29% represents the “best case” scenario for Variant 1's performance. Since both values are below the 5.38% CR of Control, you can be confident that the Control is the winner.

Case 3: Inconclusive Result

Gray: This indicates that the test doesn't have definitive results and hasn't reached statistical significance yet. Depending on what you're trying to achieve with your experiment, here are some options you may want to consider:

Let It Run Longer: In some instances, you might need to allow the experiment more time to gather a larger sample size and achieve more accurate results.
Simplify Variations: If you have too many variations, consider reducing them. For instance, you might bring four variations down to just two or three.
Prioritize Brand Consistency: If the results are similar between two variations, choose the one that aligns best with your brand's guidelines.
Repeat the Test: Running the same test again can be beneficial for confirming your initial findings. Keep in mind that factors like the time of the year, or fluctuations in website traffic, may affect the end results.
Keep It As Is: Occasionally, your original design or strategy may not need any changes and is the most suitable version.
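The interval reading used across the three cases can be summarized in a few lines. This is a simplified sketch for intuition only: the function name is hypothetical, and the color shown in a real report also depends on whether the required statistical significance has been reached, which this sketch ignores.

```python
def classify_interval(ci_low, ci_high, control_cr):
    """Compare a variant's confidence interval with the control conversion rate."""
    if ci_low > control_cr:
        return "winner"        # even the worst case beats control
    if ci_high < control_cr:
        return "loser"         # even the best case trails control
    return "inconclusive"      # the interval still overlaps control

# Case 1 and Case 2 use the figures quoted above; the third call is a made-up example.
print(classify_interval(0.44, 1.32, 0.26))
print(classify_interval(4.69, 5.29, 5.38))
print(classify_interval(2.10, 3.40, 2.80))
```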

A few things to watch out for

Don't stop a test on day one even if confidence already looks high. Mida enforces a minimum duration (7 days) so weekly traffic patterns are captured. Day-of-week swings can flip an early result.
Wide intervals = not enough data. A "20% lift" with a range of -5% to +45% is not a 20% lift you can ship.
Don't compare Bayesian and Frequentist numbers side by side — 92% in one method means something different from 92% in the other. Pick a framework and stick with it for that test.

Interpreting test result in Mida report