How to A/B Test Email Subject Lines

Your subject line is the gatekeeper to everything else. You could write the most compelling email copy in existence, design beautiful templates, and craft offers that would make anyone click, but none of it matters if nobody opens the email in the first place. The subject line is the first thing people see, and for most recipients, it's the only thing they'll ever see. That tiny snippet of text in their inbox determines whether your carefully crafted email gets read or sent straight to the trash.
This is why A/B testing subject lines is one of the highest-leverage activities in email marketing. Unlike testing email body content (which requires someone to open the email first), subject line tests directly impact the top of your funnel. A 20% improvement in open rates means 20% more people seeing your message, clicking your links, and taking action. Those gains compound across every email you send.
Why Subject Lines Matter More Than You Think
Most email marketers spend the majority of their time on email content, design, and calls-to-action. Then they spend about thirty seconds writing a subject line right before hitting send. This is completely backwards. The subject line deserves as much attention as everything else combined because it's the single biggest determinant of whether your email gets read at all.
Think about your own inbox behavior. You probably receive dozens or hundreds of emails per day. You don't open most of them. You scan subject lines and sender names, making split-second decisions about what's worth your attention. Your subscribers do exactly the same thing with your emails. They're not carefully considering each message. They're making snap judgments based on 50 characters or less. If your subject line doesn't immediately grab attention or signal relevance, you've lost before the game even started.
Benchmarks for SaaS email marketing show that top performers achieve open rates 20-30% higher than average. Much of that gap comes down to subject line quality. And unlike other optimizations that require extensive redesigns or new features, subject line improvements are fast, free, and immediately testable.
How A/B Testing Actually Works
A/B testing is simple in concept. You take your audience and randomly split them into two groups. Group A sees subject line A. Group B sees subject line B. You measure which group opens at a higher rate, and the winner becomes your new baseline. The beauty of this approach is that it removes guesswork and personal opinion from the equation. You're not debating whether questions or statements work better. You're measuring it directly with real data from real subscribers.
The standard approach is to test on a small portion of your list first, then send the winning version to everyone else. For example, you might send variant A to 15% of your list and variant B to another 15%, wait a few hours to see which performs better, then send the winner to the remaining 70%. This gives you the benefits of testing without risking your entire campaign on an unproven subject line.
Most modern email platforms handle this automatically. You create two subject lines, specify what percentage of your list should be in the test group, set a waiting period, and the platform does the rest. The winner is determined by open rate, and it goes out to everyone else without any manual intervention required.
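If you're curious what's happening under the hood (or ever need to run a split your platform doesn't support natively), the core mechanics are just a random split followed by an open-rate comparison. Here's a rough Python sketch of that workflow; the subscriber list, the open counts, and the 30% test fraction are purely illustrative, and the actual sending and open tracking stay with your email platform.

```python
import random

def split_for_test(subscribers, test_fraction=0.30, seed=42):
    """Randomly assign a test group: half sees variant A, half sees variant B,
    and everyone else waits for the winner. Returns three lists."""
    rng = random.Random(seed)
    shuffled = subscribers[:]
    rng.shuffle(shuffled)
    test_size = int(len(shuffled) * test_fraction)
    half = test_size // 2
    return shuffled[:half], shuffled[half:test_size], shuffled[test_size:]

def pick_winner(opens_a, sent_a, opens_b, sent_b):
    """Compare open rates after the waiting period and return the winning variant.
    A real test should also check statistical significance (covered below)."""
    rate_a, rate_b = opens_a / sent_a, opens_b / sent_b
    return "A" if rate_a >= rate_b else "B"

# Example: 10,000 subscribers, 15% per variant, winner goes to the remaining 70%.
subscribers = [f"user_{i}@example.com" for i in range(10_000)]
group_a, group_b, holdout = split_for_test(subscribers)
print(len(group_a), len(group_b), len(holdout))                      # 1500 1500 7000
print(pick_winner(opens_a=330, sent_a=1500, opens_b=285, sent_b=1500))  # "A"
```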
What to Test in Your Subject Lines
The options for subject line testing are nearly endless, but some variables consistently produce meaningful differences in open rates. Start with these before moving on to more esoteric experiments.
Length is one of the most impactful factors to test. Short subject lines (under 40 characters) work well on mobile devices and create urgency through brevity. Longer subject lines (50-70 characters) give you room to be more specific about what's inside. Neither is universally better. It depends on your audience and content. A subject line like "Quick question" creates curiosity, while "3 ways to reduce your churn rate this month" tells people exactly what to expect. Test both approaches and let your data decide.
Personalization is another high-impact variable. Including the recipient's first name in the subject line can boost open rates significantly, but it can also feel gimmicky if overused. Beyond names, you can personalize based on company name, location, or product usage. "Sarah, your trial ends tomorrow" will outperform "Your trial ends tomorrow" in most cases, but the effect varies. Some audiences respond strongly to personalization while others see it as a manipulation tactic. You won't know until you test.
Questions versus statements is a classic test. Questions create an open loop in the reader's mind that they want to close by opening the email. "Are you making this onboarding mistake?" has a different psychological impact than "The onboarding mistake most SaaS companies make." Both can work well, but they appeal to different mental processes. Questions trigger curiosity while statements promise information.
Urgency and scarcity are powerful motivators, but they can also feel pushy. Test subject lines with time pressure ("Last chance: Sale ends tonight") against those without ("20% off everything in stock"). Be careful with manufactured urgency because overuse will erode trust and train subscribers to ignore your emails entirely.
Emojis are worth testing, but approach them carefully. A single relevant emoji can make your subject line stand out in a crowded inbox and boost open rates by 10-15%. But multiple emojis or irrelevant ones can look spammy and hurt deliverability. Test a tasteful emoji against a plain text version and see what your specific audience prefers. B2B audiences tend to be more skeptical of emojis than B2C, but there are exceptions to every rule.
Sample Size and Statistical Significance
Here's where most people get A/B testing wrong. They run a test on 100 people, see that variant A got a 22% open rate while variant B got 18%, declare A the winner, and move on. The problem is that with 100 people, that difference could easily be random chance. You haven't proven anything.
Statistical significance measures how unlikely your observed difference would be if there were no real underlying difference at all. The industry standard is 95% confidence: if the two subject lines actually performed the same, a gap this large would show up by chance less than 5% of the time. To reach that confidence level, you need a sufficient sample size.
As a rough rule of thumb, you need at least 1,000 recipients per variant to detect a 2-3 percentage point difference in open rates with confidence. For smaller differences (like 1 percentage point), you need 5,000 or more per variant. If your list is smaller than this, you'll need to either accept less certainty in your results or only test dramatic differences that might show up in smaller samples.
Many email platforms now show statistical significance directly in their A/B testing interfaces. They'll tell you when you have enough data to declare a winner with confidence. Pay attention to these indicators. Declaring winners based on insufficient data is worse than not testing at all because it gives you false confidence in changes that might not be real improvements.
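If you want to sanity-check those indicators yourself, the math behind them is typically a two-proportion z-test, which takes only a few lines of Python. This sketch reports confidence the way most testing tools do (one minus the two-sided p-value); the numbers reuse the 100-person example above, plus the same gap at a more realistic sample size.

```python
from math import sqrt, erf

def confidence_level(opens_a, sent_a, opens_b, sent_b):
    """Two-proportion z-test: returns the confidence (as a percentage) that the
    two variants' open rates genuinely differ. 95%+ is the usual bar."""
    p_a, p_b = opens_a / sent_a, opens_b / sent_b
    pooled = (opens_a + opens_b) / (sent_a + sent_b)
    se = sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    z = abs(p_a - p_b) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))  # two-sided p-value
    return (1 - p_value) * 100

# The 100-person example above: 22% vs 18% looks like a win, but proves nothing.
print(round(confidence_level(22, 100, 18, 100)))      # ~52% -- far below 95%
# The same gap with 1,500 recipients per variant is a different story.
print(round(confidence_level(330, 1500, 270, 1500)))  # ~99%
```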
How Long to Run Your Tests
Time is just as important as sample size. Even if you have 10,000 subscribers, sending to all of them at 9 AM and checking results at 10 AM won't give you accurate data. Different people check email at different times. Your early openers are not representative of your entire list.
For most B2B emails, you need to wait at least 4-8 hours before declaring a winner. For B2C emails where engagement is faster, 2-4 hours might be sufficient. The goal is to capture the bulk of opens that will happen for this email. If you cut off too early, you're basing decisions on an unrepresentative subset of your audience.
Some platforms let you run tests for a fixed time period before automatically choosing a winner. Others let you wait until statistical significance is reached. The second approach is better because it adapts to your actual data rather than an arbitrary time limit.
For important campaigns, consider letting the test run overnight or even for a full 24 hours before sending the winner. You'll capture opens from every timezone and get the most accurate picture of which subject line truly performs better.
Setting Up A/B Tests in Your Email Platform
The exact steps vary by platform, but the workflow is similar everywhere. Start by creating your email as normal, then look for an A/B test option when setting the subject line. Most platforms put this near the subject line field itself, often as an "Add variant" or "A/B test" button.
Enter your two subject lines. Try to test one variable at a time. If you're comparing "Quick update on your account" against "John, here's what happened last week", you're testing length, personalization, and framing all at once. If variant B wins, you won't know which factor made the difference. Better to test personalization in one experiment and length in another.
Set your test size. Sending to 20-30% of your list (split between the two variants) is a good starting point. This gives you enough data to detect meaningful differences while reserving the majority of your list for the winning version.
Set your waiting period. Choose based on your typical email engagement patterns. If you know that 80% of your opens happen in the first 4 hours, waiting 4 hours is fine. If your audience is spread across timezones and engagement trickles in over 24 hours, set a longer window.
Choose your success metric. Open rate is the standard for subject line tests since that's what subject lines directly influence. Some platforms also let you choose click rate or conversion rate. These can be useful if you're testing subject lines that set different expectations about email content, but for pure subject line optimization, open rate is what you want.
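Pulled together, the setup amounts to a handful of decisions. Here's what that configuration might look like sketched out as a simple payload; the field names are hypothetical, so check your provider's documentation rather than copying these verbatim.

```python
# A hypothetical A/B test configuration: the kind of settings you'd assemble
# whether you're clicking through a platform's UI or calling its API.
# Field names are illustrative, not any specific provider's schema.
ab_test_config = {
    "variants": [
        {"name": "A", "subject": "Your trial ends tomorrow"},
        {"name": "B", "subject": "Sarah, your trial ends tomorrow"},  # isolates personalization
    ],
    "test_percentage": 30,        # 15% of the list per variant
    "wait_hours": 4,              # matches when ~80% of your opens usually arrive
    "winner_metric": "open_rate", # the metric subject lines directly influence
}
```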
Interpreting Results and Avoiding Common Mistakes
When your test completes, you'll see the performance of each variant. Before declaring a winner, check the statistical significance. A 25% open rate beating a 23% open rate means nothing if your confidence level is only 60%. You need to see 95% or higher confidence to trust that the difference is real.
If your test is inconclusive (the difference between variants never reached statistical significance), that's actually useful information. It means your two subject lines performed similarly with this audience. You can either rerun the test with a larger sample or conclude that this particular variable doesn't matter much and move on to testing something else.
Watch out for the multiple testing problem. Run ten different subject line tests and there's a real chance at least one shows a "significant" result by pure luck (at 95% confidence, you expect roughly one false positive for every twenty tests). Be skeptical of results that contradict your other tests or seem too good to be true. Consider retesting surprising wins to confirm they're real.
Don't over-optimize for open rates at the expense of everything else. A clickbait subject line might boost opens but hurt clicks if recipients feel deceived when they read the email. The subject line should accurately represent what's inside. Measure downstream metrics (clicks, conversions, unsubscribes) in addition to opens to make sure your subject line improvements translate to business results.
Building a Testing Culture
The companies that get the best results from A/B testing don't treat it as an occasional tactic. They build it into their process. Every email is an opportunity to learn something, and every test adds to a growing body of knowledge about what works for their specific audience.
Start by testing one variable per campaign. You don't need elaborate testing matrices or dozens of variants. Pick the most interesting question for each email (Does personalization help? Does a question outperform a statement? Do shorter subject lines work better for this content?) and run a single clean test to answer it.
Document your findings. Create a simple spreadsheet or doc where you record what you tested, what won, by how much, and any relevant context. Over time, this becomes an invaluable reference. You'll notice patterns. Maybe questions consistently outperform statements for your welcome emails. Maybe shorter subject lines work better for your trial expiration emails. These insights should inform how you write subject lines going forward.
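A plain CSV is enough for this. Here's one way you might append a row after each test; the columns and the values below are only an illustration of the shape a log entry could take, not real results.

```python
import csv
from datetime import date

# One row per experiment keeps the history easy to scan, filter, and share.
# The columns are just a suggestion; record whatever context matters to you.
with open("subject_line_tests.csv", "a", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([
        date.today().isoformat(),
        "Trial expiration reminder",        # campaign
        "personalization (first name)",     # variable tested
        "Your trial ends tomorrow",         # variant A
        "Sarah, your trial ends tomorrow",  # variant B
        "B",                                # winner
        "+3 pts open rate",                 # margin (illustrative)
        "97% confidence",                   # significance (illustrative)
    ])
```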
Share learnings with your team. Subject line insights aren't just valuable for email. The principles that make a good subject line (clarity, curiosity, relevance, urgency) apply to push notifications, ad headlines, and landing page titles. What you learn from email testing can improve marketing across channels.
Set a testing cadence. Decide that you'll test subject lines on at least one campaign per week or one per month, whatever makes sense for your volume. Having a regular rhythm prevents testing from falling by the wayside when things get busy.
When Not to Test
Not every email needs an A/B test, and not every situation lends itself to testing.
If your list is too small, testing is a waste of time. With under 1,000 subscribers, you won't reach statistical significance for most tests. You're better off following best practices and making intuitive improvements rather than running tests that won't produce reliable results.
Transactional emails usually shouldn't be tested. When someone is waiting for a password reset or order confirmation, your subject line just needs to be clear about what's inside. "Your password reset link" doesn't need optimization. Trying to make transactional emails cleverer often backfires because users just want the information without any marketing polish.
Urgent or time-sensitive emails sometimes can't wait for a test to complete. If you're sending a flash sale announcement that's only valid for 6 hours, you can't spend 4 hours testing before sending the winner. In these cases, apply what you've learned from previous tests and send immediately.
One-off emails that won't be repeated aren't good candidates for testing either. The value of testing comes from applying learnings to future emails. If you're sending a unique announcement that you'll never send again, the insights won't pay off. Save your testing effort for emails you send regularly.
Putting It All Together
Subject line testing isn't complicated, but it requires discipline. Start every email by drafting two potential subject lines. Set up an A/B test with proper sample sizes and waiting periods. Let the data choose the winner. Document what you learned. Apply those learnings to future emails.
Over time, you'll develop an intuition for what works with your specific audience. You'll know whether they prefer questions or statements, short or long, personalized or generic. But that intuition will be grounded in real data rather than guesswork. You'll have tested your assumptions and refined them based on actual subscriber behavior.
The cumulative impact of consistent testing is substantial. A 10% improvement in open rates, compounded across every email you send for a year, means dramatically more people engaging with your content. More trial users reading your onboarding sequence. More customers seeing your feature announcements. More churning users getting your reactivation outreach. Those incremental improvements add up to real business results.
Start with your next email. Draft two subject lines instead of one. Run the test. See what happens. That's all it takes to begin building a subject line testing practice that will improve your email performance for years to come.