I was about to ship what I thought was an obvious improvement.
After three weeks of analyzing BlogSEO's onboarding metrics, the problem seemed clear: too much friction. Users had to navigate 2-3 screens, including a lengthy website analysis with AI-generated headlines, before reaching the free trial. I was confident that removing those barriers would boost conversions.
So I almost rolled it out immediately. But something made me pause. I remembered my first startup experience at Joko and their systematic approach: "If you can't measure it, you can't improve it." They AB-tested almost everything, especially critical flows like onboarding.
I decided to test my conviction instead of trusting it, and set up three variants:
Control: The current flow with full website analysis and generated headlines (high friction, but demonstrates value)
“Obvious” Winner: My "improved" version that skipped straight to the free trial paywall after users entered their URL (minimal friction, maximum speed)
Middle Ground: A compromise with a shortened analysis that showcased some AI magic without too much wait
4 weeks later, I analyzed the results:

Results after 4 weeks of testing
My "obvious winner" tanked free trial conversions by 29%, and checkout start conversions by 14% compared to the control group. If I'd shipped it without testing, I would have killed 29% of my future revenue overnight.
The actual winner was the middle ground: in hindsight, it makes sense. It has just enough friction to create a small "aha moment", demonstrate the power of AI for SEO, and provide a personalized experience, without making users wait too long.

The winning variant: a quick AI analysis that demonstrates value before the free trial paywall.
That new version improved onboarding conversions by 30% over the baseline. And it outperformed the one I thought would be the obvious winner by 82%.
This was not the first AB test I'd run for BlogSEO, but this one taught me that intuitions can sometimes be very expensive when dealing with critical flows.
After nearly tanking my conversions, I rebuilt my entire approach to product development. I needed a system that would catch my bad ideas before they cost me customers. That's when I developed what I call the MIT System.
The MIT System
The MIT System (pronounced “Mighty”) has 3 parts: Measure, Improvise, and Test.
It’s deceptively simple, yet it works very well in practice. Here’s the breakdown:
1. Measure: Define success before you build
Before writing a single line of code for a new feature, I answer one question: "How will I know if this worked?"
Most engineers build backwards: they ship a feature, then scramble to find a metric that makes it look successful. Successful businesses prioritize what truly matters and don't waste resources on features that don't move the needle.
If you’re not able to find KPIs for your feature that tie directly to business revenue or user value, you’re probably not prioritizing the right thing (especially if you’re in the early days of your product).
Pick the right KPIs, not vanity metrics like "page views" or "time on site." Go for real success metrics that tie directly to revenue or user value: feature activation, free trial conversions, ARPU (Average Revenue Per User). Pick metrics that actually move the needle.
For my onboarding experiment, the success metric wasn't "fewer steps" or "faster completion." It was free trial conversion rate, the percentage of users who started a trial after entering their website URL. It's directly correlated to my top-line revenue, so it’s a good metric.
2. Improvise: Ship your best guess
Once you've defined your success metric, ship the feature. But ship it smart. Every feature I deploy goes out with:
Feature flags so I can roll it back instantly if metrics tank (or run experiments, more on this next)
Monitoring on my success metrics, set up before the first user sees the feature
In practice, I usually create funnels that help me see exactly where users are falling off.
3. Test: Challenge your guesses
Set up the experiment:
Create variants with feature flags
Define the success metric upfront (same KPI from step 1)
Decide how long to run it (I usually do 2-4 weeks minimum)
Let it run:
Don't make decisions based on 3 days of data
Wait for some statistical significance
Traffic needs time to normalize (weekday vs. weekend behavior differs)
Ship the winner:
Roll out gradually (10% → 50% → 100%)
Keep monitoring after full rollout
Some changes take time to show results.
Five days into my onboarding test, the results were noisy and inconclusive. By week 4, the trend was clear. If I'd made a snap decision early, I would have picked the wrong variant.
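If you want a rough sense of how long "long enough" is, a back-of-envelope sample-size calculation helps. Here's a sketch in TypeScript using the standard two-proportion formula; the baseline conversion and lift below are illustrative numbers, not BlogSEO's real ones:

```typescript
// Rough sample size per variant for a two-proportion AB test
// (normal approximation, two-sided alpha = 0.05, 80% power).
function sampleSizePerVariant(
  baselineRate: number, // e.g. 0.10 = 10% baseline conversion
  relativeLift: number, // e.g. 0.30 = detect a 30% relative lift
  zAlpha = 1.96, // two-sided 95% confidence
  zBeta = 0.84 // 80% power
): number {
  const p1 = baselineRate;
  const p2 = baselineRate * (1 + relativeLift);
  const pBar = (p1 + p2) / 2;

  const numerator =
    zAlpha * Math.sqrt(2 * pBar * (1 - pBar)) +
    zBeta * Math.sqrt(p1 * (1 - p1) + p2 * (1 - p2));

  return Math.ceil((numerator / (p2 - p1)) ** 2);
}

// Roughly 1,772 users per variant to reliably detect a 10% -> 13% lift.
console.log(sampleSizePerVariant(0.1, 0.3));
```

Divide that number by the weekly traffic each variant gets and you have a floor on how long the test needs to run.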
The same applies to acquisition channels. "I tried ads and spent $200" doesn't mean ads don't work, it means you didn't get through the learning curve. Ad platforms need data to optimize. You need iterations to find messaging that resonates.
My actual setup: PostHog
I use PostHog for all my analytics and testing. It's the best tool I've found for engineers who want to move fast without juggling five different platforms.
Why PostHog works for me:
Built for engineers: No marketing jargon, clean API, great docs
Generous free tier: You won't pay anything for months (I'm still on the free tier and expect to stay there for a while).
All-in-one: Web analytics, AB testing, feature flags, session replay, product analytics, funnels, … everything is in one place and it helps me stay organized
Open source: If you care about data privacy or want to avoid vendor lock-in, you can always self-host it (I use their cloud version though because it’s much more convenient for my use case)
1. Define custom events
I define the events I need to track. For the onboarding flow, this meant:
user_signed_up
website_analysis_form_submitted
checkout_session_started
checkout_session_completed
These events become the building blocks for everything else.
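For reference, here's roughly what capturing one of these looks like with posthog-js on the client; the project key and the website_url property are placeholders, and the server-side approach I actually prefer is covered in the setup tips below:

```typescript
// Minimal client-side capture with posthog-js. The event name matches
// the list above; the key and property values are placeholders.
import posthog from "posthog-js";

posthog.init("<YOUR_POSTHOG_PROJECT_KEY>", {
  api_host: "https://us.i.posthog.com",
});

// Fired when the user submits their URL on the analysis step.
posthog.capture("website_analysis_form_submitted", {
  website_url: "https://example.com", // illustrative property
});
```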
2. Add feature flags
Every feature gets a feature flag, even if I'm not planning to AB test it. Why? Because if something breaks, I can kill it instantly without deploying new code.
For experiments, I create multiple variants (control, variant A, variant B) and let PostHog handle the random assignment.
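Here's a sketch of what the flag check looks like with posthog-js, assuming a hypothetical multivariate flag called onboarding-flow; the variant names and render helpers are stand-ins, not my real code:

```typescript
// Branch on a multivariate feature flag and always fall back to the
// control experience. Flag key, variants, and render helpers are
// hypothetical stand-ins for the real onboarding code.
import posthog from "posthog-js";

const renderControlOnboarding = () => console.log("full analysis");
const renderShortAnalysisOnboarding = () => console.log("short analysis");
const renderSkipToPaywallOnboarding = () => console.log("straight to paywall");

// Flags load asynchronously, so wait for them before branching.
posthog.onFeatureFlags(() => {
  const variant = posthog.getFeatureFlag("onboarding-flow");

  switch (variant) {
    case "short-analysis":
      renderShortAnalysisOnboarding();
      break;
    case "skip-to-paywall":
      renderSkipToPaywallOnboarding();
      break;
    default:
      // Control, flag disabled, or assignment blocked: safe fallback,
      // which also doubles as the instant kill switch.
      renderControlOnboarding();
  }
});
```

Turning the flag off in PostHog sends everyone back down the default path, no deploy needed.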
3. Build funnels
This is where the magic happens. Funnels show you exactly where users drop off.
For my onboarding, the funnel looked like:
Signed up
Submitted website
Checkout session started
Checkout session completed (= started free trial with credit card)

One of my onboarding funnels. 8% of people who signed up have started their free trial with a credit card.
Note: Funnels don’t show how your conversions evolve over time; they’re a static snapshot of where people drop off over the last X days, so it’s also important to track conversion trends over time.
4. Set up the experiment
In PostHog, I create an experiment linked to my feature flag:
Define variants (e.g. control, test 1, test 2, …)
Set the success metric (e.g. free trial conversion rate)
Choose traffic allocation (usually 33/33/33 for 3 variants)

Example of an experiment setup on PostHog. It shows you the statistical significance based on exposure, the delta in conversion and other metrics. Here my new version of my paywall seems to be performing worse.
Then I ship it and wait. No peeking at results for at least a week.
Setup tips
Track server-side: Where you can, always track events server-side rather than on the client. Today, a lot of people have ad blockers, and they tend to block analytics tools like PostHog.
If you track events on the client, they might never reach PostHog’s server, and so you’ll lose a significant chunk of data on some of your users. However, if you send the events on your API routes, the requests cannot be intercepted by ad blockers.
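A minimal sketch of that pattern with the Next.js App Router and posthog-node; the route path, payload shape, and POSTHOG_API_KEY env var are assumptions on my side, not a fixed recipe:

```typescript
// app/api/track/route.ts — a small event proxy. Events captured here
// go out server-to-server, so client ad blockers can't intercept them.
import { NextResponse } from "next/server";
import { PostHog } from "posthog-node";

export async function POST(request: Request) {
  const { distinctId, event, properties } = await request.json();

  // flushAt / flushInterval force immediate sending, which suits
  // short-lived serverless handlers.
  const posthog = new PostHog(process.env.POSTHOG_API_KEY!, {
    host: "https://us.i.posthog.com",
    flushAt: 1,
    flushInterval: 0,
  });

  posthog.capture({ distinctId, event, properties });
  await posthog.shutdown(); // flush before the handler returns

  return NextResponse.json({ ok: true });
}
```

The client then posts its events to /api/track instead of talking to PostHog directly, so an ad blocker has nothing to block.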
Test all your AB test variants: Before running an experiment, verify your events are firing correctly. I usually test with an ad blocker enabled to make sure everything works as intended.
Assign AB test groups on the server if possible: Like for event tracking, ad blockers can block the request which assigns an AB test group to your users. This can break your app for people with ad blockers.
For example, if you have rolled out one variant to 100%, but the default / false-like behavior is different from the rolled-out version, people with the ad blocker will see the wrong version of your feature.
I handle variant assignment in my server-side rendering (NextJS server components), which ensures everyone gets a consistent experience regardless of their ad blocker setup.
If you can’t do that, make sure your default state is not broken (usually, this is the false-like / undefined state of your feature).
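Here's roughly what server-side assignment looks like in a server component; the flag key, variant name, user_id cookie, and stub components are assumptions for the sketch, not BlogSEO's actual code:

```tsx
// app/onboarding/page.tsx — variant assignment in a Next.js server
// component, so ad blockers never see the flag request. Flag key,
// variant names, and the user_id cookie are assumptions.
import { cookies } from "next/headers";
import { PostHog } from "posthog-node";

const posthog = new PostHog(process.env.POSTHOG_API_KEY!, {
  host: "https://us.i.posthog.com",
});

// Minimal stand-ins for the real onboarding variants.
const ControlOnboarding = () => <p>Full website analysis</p>;
const ShortAnalysisOnboarding = () => <p>Quick AI analysis</p>;

export default async function OnboardingPage() {
  // Any stable identifier works; here I assume a user-id cookie.
  const distinctId = (await cookies()).get("user_id")?.value ?? "anonymous";

  // Evaluated server-side. If PostHog is unreachable the result is
  // undefined, so the user falls through to the safe control experience.
  const variant = await posthog.getFeatureFlag("onboarding-flow", distinctId);

  return variant === "short-analysis" ? (
    <ShortAnalysisOnboarding />
  ) : (
    <ControlOnboarding />
  );
}
```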
When NOT to split test
1. The improvement is really obvious
If you're fixing a bug that breaks your checkout flow, don't test "broken vs. working." Just fix it and ship.
Same goes for clear UX improvements, like making your website responsive, or purely additive features: adding a Framer integration to BlogSEO doesn't disrupt existing behavior and clearly helps me serve more users, so there's no need to test whether people want more options.
If something is objectively broken or obviously better, skip the test.
2. Inconsistent experiences will confuse users
When I added keyword research functionality to BlogSEO, I used a feature flag to roll it out gradually, but I didn't AB test it.
Why? Because users talk to each other. If half your users have a feature and half don't, they'll compare notes and get confused. "Wait, why don't I see the keyword tool?" creates support headaches and erodes trust. It also adds complexity to developing your product.
One thing I’m very reluctant to AB test is pricing. If some users see $30/month and others see $78/month, and they notice the difference, you might completely lose their trust. I’d love to have the data point on how well a given price converts, but it’s too risky (and it might even be illegal in some countries).
Don't test features where inconsistency creates a worse experience than just shipping to everyone.
3. The testing overhead isn't worth it
Not every change deserves the complexity of an AB test. Consider:
Traffic volume: If a page gets 50 visitors per month, testing will take forever to reach significance. Just ship and monitor.
Performance impact: AB testing adds complexity: feature flag checks, variant assignments, analytics events. For high-traffic pages where every millisecond counts, ask if the test is worth the performance cost, and engineer it so that it doesn’t hurt performance too much (caching, async requests after page load, etc.).
Engineering time: Setting up proper experiments takes time. Save it for decisions that actually move the needle.
Only test what matters. For everything else, ship smart with feature flags so you can roll back if needed.
Personal Update
In my last newsletter about user acquisition, I talked about how launching on niche directories can drive meaningful growth. Today, I'm putting that advice into practice: BlogSEO is launching on Uneed.
If you haven't heard of Uneed, it's a curated launch platform with a strong community of indie hackers and bootstrapped founders (exactly the audience that benefits most from BlogSEO). Unlike larger platforms, it focuses on quality over vanity metrics, which means launches actually get meaningful traction.
If you've found value in this newsletter or BlogSEO, supporting the launch would be hugely appreciated!
Thanks a lot!!!
Who’s building cool stuff: Vincent Tellene
Vincent is a French entrepreneur from Corsica building TokPortal. If you've ever tried to use TikTok for organic growth internationally, you know the pain: TikTok's algorithm is stubbornly local.
TikTok's algorithm prioritizes content to users based on location. Local accounts get pushed to nearby audiences first, giving you better initial visibility and engagement rates, which then signals to the algorithm that your content is worth showing more broadly.
But creating truly local TikTok accounts is nearly impossible: even if you manage to create the account, TikTok detects mismatches in device fingerprints, posting patterns, and content style, so the algorithm won't push your content to that local audience.
TokPortal solves this. It streamlines the entire process of creating location-optimized TikTok accounts so you can actually reach local audiences organically, without the algorithm flagging you.

I met Vincent while discussing SEO for TokPortal, and what impressed me most is that he's solving a real, hard problem that many marketers face. The product has a genuine moat: it requires deep technical expertise to get device fingerprinting and local optimization right, and Vincent has figured it out.
He's also built a strong Reddit marketing skillset, which is becoming one of the most underrated acquisition channels for B2B SaaS.
The other thing that stood out to me is that Vincent’s academic background would never lead you to guess he’d become a tech entrepreneur. He studied Philosophy and Social & Cognitive Science, and still managed to build a great SaaS. While many people think they need a CS degree to build a SaaS, Vincent proves otherwise: he just built it. 🚀
If you're struggling to go international with TikTok content, check out TokPortal!
Until next time, keep scaling! 🚀
Vince