Testers Besters
There is a common theme amongst those who win with direct mail. They test everything – list, media, copy, offer, timing – continuously, methodically. The only time you’re not testing is when you’re either not sending, or you’re sending your first shot – in which case you’re just trying something to test against later. Basically – an opening gamble.
Sure, one or two chancers have landed on a campaign with a positive ROI and stuck with it, but for reasons nobody seems to properly understand, the steam will run out eventually. If you have the stomach or the technology for it, you should be continuously testing direct mail campaigns against your benchmark even when a campaign is successfully running. When you understand what’s going on in the prospect’s mind as a result of your learning, you can get the real lifts you deserve. If your test doesn’t reveal to you something new about your consumer, something you can use to understand why they took the action they did, then it’s of little use.
We test, we learn, we minimise risk.
Since our eventual goal is to create a machine where £1 spent produces £1+N in return, our job will eventually be simply to spend as many £1s as possible at the same rate. Testing is the only thing that can get us there, and keep us there.
Your direct mail testing schedule will be dictated by preference, budget and market size. Tests can be run sequentially or in parallel. Be aware when running in parallel that you’ll still need to send a significant number per test, not just in aggregate.
For example, if testing 4 different headlines against your base headline, it might not be enough to send 250 of each version and then assume that your sample included 1250 in total – it only really included 5 tiny tests.
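To put rough numbers on that, here is a minimal sketch using the standard normal approximation for the margin of error on a response rate. The 2% response rate is an assumption for illustration only, and `margin_of_error` is a hypothetical helper name.

```python
import math

def margin_of_error(response_rate, pieces, z=1.96):
    """Approximate 95% margin of error for an observed response rate
    (normal approximation, so treat it as a rough guide at small volumes)."""
    return z * math.sqrt(response_rate * (1 - response_rate) / pieces)

assumed_rate = 0.02  # hypothetical 2% response rate, not a benchmark

per_headline = margin_of_error(assumed_rate, 250)   # what each headline actually gets
pooled = margin_of_error(assumed_rate, 1250)        # what you might wrongly assume

print(f"250 per headline: {assumed_rate:.0%} +/- {per_headline:.1%}")
print(f"1250 pooled:      {assumed_rate:.0%} +/- {pooled:.1%}")
```

Under that assumption each 250-piece test carries more than twice the uncertainty of a single 1250-piece mailing, which is why the aggregate figure tells you little about any one headline.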
How Many Makes A Good Test?
Million-dollar question with a ten-dollar answer: enough to be statistically significant in your market.
Experience will start to guide you here, but logic can give you a good starting point.
First of all – use 1500-2000 units as a starting point for an A/B test if your market supports it. At that point, basic statistics start to kick in and say you’ve got a good chance at “significance”.
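One way to sanity-check a starting volume is the textbook sample-size approximation for comparing two response rates. The sketch below is only that: the 2% control rate, the 3.5% rate you hope to detect, the 5% significance level and the 80% power are all assumptions, and `pieces_per_version` is a hypothetical name.

```python
import math
from statistics import NormalDist

def pieces_per_version(control_rate, test_rate, alpha=0.05, power=0.80):
    """Approximate pieces needed PER VERSION to tell two response rates apart."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)          # chance of spotting a real difference
    variance = control_rate * (1 - control_rate) + test_rate * (1 - test_rate)
    effect = test_rate - control_rate
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)

# Assumed rates for illustration: 2% control versus a hoped-for 3.5%.
needed = pieces_per_version(0.02, 0.035)
print(f"Pieces needed per version: {needed}")
```

With those assumed rates the figure falls inside the 1500-2000 range. Shrink the difference you are trying to detect and the required volume climbs steeply, so plug in numbers that reflect your own market.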
A mistake we saw our early clients make often was to say “Yeah, we’ll give it a test and see what happens..” They would send about 500 units, get no responses and decide direct mail was a bad channel.
Complete rookie mistake because:
1. Not a big enough sample for direct mail testing
2. Actually only testing one thing – their first mailer
3. Basing future decisions on non-significant past performance
It can go the other way too – they send a tiny number and get a 20% response (only takes two calls from 10 pieces…) and immediately assume that sending 50,000 a month will produce 10,000 responses. It won’t.
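To see how little two calls from ten pieces actually tells you, here is a minimal sketch of a Wilson score interval, one common way to put a plausible range around a response rate from a small sample. The function name is hypothetical, and the 95% level (z = 1.96) is an assumption.

```python
import math

def wilson_interval(responses, pieces, z=1.96):
    """Wilson score interval for a response rate (roughly 95% when z=1.96)."""
    p = responses / pieces
    denom = 1 + z**2 / pieces
    centre = (p + z**2 / (2 * pieces)) / denom
    half_width = z * math.sqrt(p * (1 - p) / pieces + z**2 / (4 * pieces**2)) / denom
    return centre - half_width, centre + half_width

low, high = wilson_interval(2, 10)
print(f"Plausible response rate: {low:.0%} to {high:.0%}")
print(f"On 50,000 pieces: {low * 50_000:,.0f} to {high * 50_000:,.0f} responses")
```

The range is so wide that the headline 20% is close to meaningless as a forecast.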
If your market size is small – let’s say you just want to approach every new florist when they start up – then your testing will be very different and take longer to produce results. In this case, you might consider a multi-step campaign with different mailers at each step, measuring which convert. Beware though – to be fully sure, you need to test if a piece converts because of the one that came before it too…
A/B Tests
A/B testing is where you have two identical pieces, save for ONE thing, and run them against each other.
Let’s make “A” your control and “B” your test. The control piece is the current winning piece according to results or, if just starting, your best thinking. The test piece is the one with a single difference.
You can have more than two things being tested at the same time, but for this type of test don’t have more than a single change in each mailer.
For example:
A: Control piece
B: Different headline to control
C: Same headline as control, but different price
If you go changing loads of things in a letter, you won’t know which change sealed the deal (or killed it) and you’ll end up needing to test all over unless you set out to create some good Multivariate Tests…
Multivariate Tests
Let me contradict that whole test-one-thing-only point for a moment – you could send two radically different pieces, measure results and roll out if the B piece wins, but you’re more likely to do this at the start of a campaign when testing radically different approaches or formats. Also, if your control just isn’t working, there’s no point at all making a small adjustment to a headline and waiting for the results! Home in on the details as the campaign starts to mature.
Some tests might require Multivariate format too – if you test a segment of 50-60 year olds against a segment of 30-40 year olds, you will probably want to change the message as the two demographics have different requirements.
Scaling
By carefully measuring all of your campaigns and responses, documenting each change and all the data you can get your hands on, you will begin to learn quickly what is statistically significant and how much confidence you can have in your numbers going forwards.
Until you’ve reached that point, be very careful with the rollout, scaling or rejection of a test. If you tested 1000, don’t send 100,000 on the back of that data. Equally, if your result wasn’t very different from your control, don’t immediately ditch the test.
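A quick way to check whether a test really beat (or lost to) the control is a two-proportion z-test. This is a minimal sketch with made-up figures: 2,000 pieces of each version, 40 responses to the control and 58 to the test. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_p_value(control_responses, control_sent, test_responses, test_sent):
    """Two-sided p-value for the difference between two response rates."""
    p_control = control_responses / control_sent
    p_test = test_responses / test_sent
    pooled = (control_responses + test_responses) / (control_sent + test_sent)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / control_sent + 1 / test_sent))
    z = (p_test - p_control) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical results: 2.0% for the control versus 2.9% for the test.
p_value = two_proportion_p_value(40, 2000, 58, 2000)
print(f"p-value: {p_value:.3f}")
```

In this made-up case the test looks far better than the control, yet the p-value sits just above the conventional 0.05 threshold. That is a reason to retest at a modestly larger volume, not to bet the budget or bin the piece.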
When scaling up, stick to small multiples of your test and treat them as new tests – if you started on 1K, send 5K next time before ditching your control.
Also, understand that the laws of probability are shot to pieces if you start sending something substantially different to your test. Don’t make the very dumb mistake of introducing a significant change to the scaled-up campaign. “Oh test 7 did well; we’ll 10x it but stick it in the different envelopes when we push it out”
A Note On Costs And Scaling
This might seem obvious, but guaranteed it gets overlooked:
When considering the ROI and success rates of a direct mail piece, run the calculations based on the scaled-up costs, not the test costs.
For example – if you test a thousand pieces at £1 each for a 1% response, your cost per response (CPR) is £100. However, were you to scale to a point where your cost per piece became £0.75, then at the same response rate your CPR drops to £75 – which might mean the difference between loss and substantial profit.
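The arithmetic above can be sketched as a small helper. The name `cost_per_response` and the 100,000-piece scaled volume are assumptions for illustration; the £1, £0.75 and 1% figures come from the example.

```python
def cost_per_response(pieces, cost_per_piece, response_rate):
    """Total spend divided by the number of responses it produced."""
    total_cost = pieces * cost_per_piece
    responses = pieces * response_rate
    return total_cost / responses

test_cpr = cost_per_response(1_000, 1.00, 0.01)
# 100,000 pieces is an assumed volume at which the unit cost falls to £0.75.
scaled_cpr = cost_per_response(100_000, 0.75, 0.01)

print(f"Test CPR:   £{test_cpr:.2f}")
print(f"Scaled CPR: £{scaled_cpr:.2f}")
```

Note that the volume cancels out: CPR is simply cost per piece divided by response rate, so the saving only holds if the response rate survives the scale-up.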
Measuring Responses
Not rocket science – set your goal, count how many of the goals are scored.
Example goals might be:
- Money received in orders
- Clicks to a URL
- Phones ringing
- Response mechanisms received
- Signups to an email list
The most important thing is to find a way to connect Mailing Piece X with Response Y. The basic ways to do this are:
- Personalised URL
- Matching original names to orders
- Specific per campaign landing pages (not personalised)
- Specific phone numbers
- Reference codes on redemption
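As an illustration of the reference-code approach, here is a minimal sketch that tallies responses per mailing piece. Every code, piece name and response in it is hypothetical.

```python
from collections import Counter

# Hypothetical reference codes printed on each version of the mailer.
code_to_piece = {
    "HL-A": "control",
    "HL-B": "new headline",
    "HL-C": "new price",
}

# Hypothetical codes quoted by customers when they order or call.
redeemed_codes = ["HL-A", "HL-B", "HL-B", "HL-C", "HL-B", "XX-9"]

def count_responses(codes, lookup):
    """Count responses per mailing piece; unknown codes go to 'unattributed'."""
    return Counter(lookup.get(code, "unattributed") for code in codes)

responses_by_piece = count_responses(redeemed_codes, code_to_piece)
for piece, count in responses_by_piece.most_common():
    print(f"{piece}: {count}")
```

Keeping an "unattributed" bucket is a deliberate choice here: if it grows, your tracking mechanism is leaking and the per-piece numbers deserve less trust.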
One of the mildly irritating things about direct mail testing, in the beginning, is the time lag involved. If your order to mail goes in on Monday for two-day delivery, most likely your letters will hit Thursday / Friday. They might sit on a desk or in a folder over the weekend and become a todo item in someone’s folder. People might respond after a week or after three months, you don’t know.
In other words – decide a cutoff point for responses and go with it. It will be different per campaign – a time-limited offer with a one-week deadline is obviously different from an insurance ad that won’t be relevant for 6 months. The time of year, the weather, the current news cycle, the offer, the postman – they can all significantly affect the response timings. The biggest book on direct marketing ever suggests that some results are instant, and that the significant dates after that are the first Monday after a first-class campaign, or the third Monday after a bulk campaign. For first-class, half the results are in two weeks after that first Monday; for bulk, three or four weeks later.
But – your data will differ.
Get your own stats and build a model that you’re comfortable with over time. If you see patterns or a curve emerging, use that in future to more accurately plan campaigns and expectations. For example – if 80% of responses come within the first two weeks and the remaining 20% over the following two months, you can be more confident in your early test results while also knowing to expect an uplift over time, which gives you a range of response rates to calculate ROI with.
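Once you have a response curve of your own, projecting a final figure from an early count is simple division. The sketch below assumes the 80%-by-two-weeks pattern from the example; the day-7 share, the 2,000-piece mailing and the 40 early responses are invented for illustration.

```python
# Assumed share of final responses received by each day after mailing.
# Only the 14-day and 60-day points come from the example; day 7 is invented.
cumulative_share = {7: 0.50, 14: 0.80, 60: 1.00}

def project_final_responses(observed, days_since_mailing, curve):
    """Scale an early response count up to an expected final count."""
    return observed / curve[days_since_mailing]

pieces_sent = 2_000       # hypothetical mailing size
observed_by_day_14 = 40   # hypothetical responses counted so far

projected = project_final_responses(observed_by_day_14, 14, cumulative_share)
low_rate = observed_by_day_14 / pieces_sent   # what you have already banked
high_rate = projected / pieces_sent           # what the curve suggests you will reach

print(f"Response rate range for ROI: {low_rate:.1%} to {high_rate:.1%}")
```

The low end is what you have actually seen and the high end is only as good as the curve, so it is worth revisiting the curve after every campaign.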
Be as granular as you can when measuring responses in your direct mail testing. If you’ve mailed the same campaign to two different recipient types (e.g. male/female) you’ll benefit enormously from measuring the responses to each group. If you sent 1000 pieces for 50 responses, that 5% hit rate might be hiding the fact that 40 came from women and 10 from men. Without that data, you’ll miss a powerful optimisation opportunity.
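Here is that example as a minimal sketch. The split of 500 pieces to each group is an assumption (the figures above only give the total of 1000), and the helper name is hypothetical.

```python
def response_rates(sent_by_segment, responses_by_segment):
    """Response rate for each segment that was mailed."""
    return {
        segment: responses_by_segment.get(segment, 0) / sent
        for segment, sent in sent_by_segment.items()
    }

sent = {"women": 500, "men": 500}       # assumed even split of the 1000 pieces
responses = {"women": 40, "men": 10}    # the 50 responses from the example

rates = response_rates(sent, responses)
overall = sum(responses.values()) / sum(sent.values())

print(f"Overall: {overall:.0%}")
for segment, rate in rates.items():
    print(f"{segment}: {rate:.0%}")
```

Under that assumed split, the blended 5% hides one group responding at four times the rate of the other.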
The giant shoulders we stand upon:
- Nash, E. L. (2000). Direct Marketing: Strategy, Planning, Execution. McGraw Hill Professional