AB Testing for Product Managers by Former Optimizely PM

It can be hard to know what to measure in AB testing and how to take action on the results from your AB tests to improve your products. As a former Product Manager at Optimizely, Byron Jones has a lot of firsthand experience with conducting the best AB tests possible and shares best practices.

About Byron

Byron Jones was a Product Manager at Optimizely at the time of this webinar; he’s now VP of Product at Lily AI. Before Optimizely, he spent 10 years at NASA’s Jet Propulsion Laboratory working on Mars rovers, X-ray telescopes, and unmanned spacecraft. He found Optimizely when he decided he needed a break from government work. Like the typical tale of the Product Manager, he started out on a different team at a small company, where the lines between roles were blurred. He started on the Success team but quickly took on responsibilities outside of it, which helped him transition into product.

AB Testing for Product Managers

As a Product Manager, you own the success or failure of your product. There are two primary ways that you can use experimentation and AB testing to be successful.

The first one is really about metrics and moving the metrics that matter to you. Say your goals are around daily active users or new user sign-ups or some sort of conversion rate. These are all metrics that are really important to you and your business and your product. So why not use experimentation to help move those for you? And we’ll talk about an example of that in a minute.

Secondly, something that’s really useful and often overlooked in the AB testing world is phased rollouts: releasing features in a measured and controlled way so that you can get better insights into the product with a control group and get user feedback early on, which sets you up to be more successful over the long term of your product release.
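To make the phased rollout idea concrete, here is a minimal sketch of one common way to gate a feature by percentage. It is an illustration under assumptions, not how Optimizely or any particular tool implements rollouts: the function names, the `new-signup-flow` feature key, and the hashing scheme are all hypothetical.

```python
import hashlib


def rollout_bucket(user_id: str, feature: str) -> float:
    """Map a user to a stable value in [0, 100) for a given feature (hypothetical scheme)."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    # Interpret the first 8 hex digits (32 bits) as an integer and scale to [0, 100).
    return int(digest[:8], 16) / 0x100000000 * 100


def is_enabled(user_id: str, feature: str, rollout_percent: float) -> bool:
    """True if the user falls inside the current rollout percentage."""
    return rollout_bucket(user_id, feature) < rollout_percent


if __name__ == "__main__":
    # Made-up user IDs, purely for illustration.
    users = [f"user-{i}" for i in range(10_000)]
    for percent in (5, 25, 100):
        enabled = sum(is_enabled(u, "new-signup-flow", percent) for u in users)
        print(f"{percent}% rollout -> {enabled} of {len(users)} users enabled")
```

Because each user’s bucket is fixed, anyone who saw the feature at 5% still sees it at 25%, and the users who are not yet enabled act as the control group while the rollout widens.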

Trunk Club Case Study

Those are the two primary ways that I think about using and leveraging AB testing as a Product Manager. I’d love to dive into one example with a customer of ours here. This is an example from Trunk Club. Mike Wolf, product design lead there, was saying that testing fits in almost everywhere at Trunk Club. “We test everything from box sizes and packaging to taglines on our homepage. We’re constantly trying to improve our service and we validate success through qualitative feedback and data.”

I think that’s a really powerful statement. The product design team works very closely with the product team, hand in hand, to make the user experience as great as possible and to drive new business for Trunk Club. Let’s go in and take a look at one of the tests that they ran.

Here’s an example of an original and a variation that they had in their signup flow. Trunk Club matches personal style to consumers who have signed up for the service, and Mike’s key goal here was to have additional sign-ups. Initially he had the sign up flow on the left, which you see is pretty generic. It’s got four fields for name, email, and that kind of stuff.

comparison of original and variation of forms

It’s very basic, but it follows some of the industry best practices around signup forms, where you actually want a really simple form. We’ve used that at Optimizely. No distractions, just get the job done of signing up. You don’t want to distract the user into doing something else.

But Mike’s hypothesis was that that wasn’t fitting with their brand and that users would respond better to something that was more personalized. So they came out with this cool new way of doing it, this variation that they wanted to try where it gave a personalized experience. It was a little bit longer of a signup, but there were more images, it was more interactive.

With this more interactive form, there was a huge increase in new customers. Having that personalized experience clearly worked for Trunk Club, which validated Mike’s hypothesis that a personalized workflow that matched their brand would be a better way to sign up customers.

Results showing a 133% increase in new customers for the variation form

What’s cool about this is that it also challenged some of the industry best practices around signup flows. Most people would say that simpler is better and less distraction is better, but having something that fits with their brand and is more personal and interactive, I think, made that a lot more fun. It was a great example of a way to challenge some of the norms and just run an experiment. The beauty of experimentation is also that it’s not permanent, and you get to prove through data whether or not you’re right.

You can present this to your boss, or to whomever you need to justify an argument or a piece of work to. You can validate these hypotheses through data, and it’s not a permanent change until you actually decide whether something’s winning or losing. That’s a great part about experimentation.
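As an illustration of what “validating through data” can look like, here is a minimal sketch of a classic two-proportion z-test for deciding whether a variation’s conversion rate really differs from the original’s. The function name and the visitor and conversion counts are made up for the example (they are not Trunk Club’s numbers), and commercial testing tools may use different statistical methods.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (relative_lift, p_value) for variation B versus original A.

    Uses a pooled, two-sided z-test. Assumes both groups have visitors and
    that A has at least one conversion (no zero-division handling here).
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    relative_lift = (rate_b - rate_a) / rate_a
    return relative_lift, p_value


if __name__ == "__main__":
    # Hypothetical counts: 200 of 10,000 signups on the original, 260 of 10,000 on the variation.
    lift, p = two_proportion_z_test(200, 10_000, 260, 10_000)
    print(f"relative lift: {lift:.0%}, p-value: {p:.4f}")
    print("call it a winner" if p < 0.05 and lift > 0 else "keep the original for now")
```

A small p-value (0.05 is a common, but arbitrary, threshold) suggests the difference is unlikely to be noise, which is the kind of evidence you can take to your boss.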

The 5 Stages of AB Testing

the 5 stages of AB testing: Analyze-->Hypothesize-->Build-->Run-->Measure, and repeat

Let’s talk a little bit about the experimentation process. These are the five stages of AB testing, and they describe how you go from ideation to implementation of an AB test. The first step is to analyze the data that you have available to you. In Mike’s case, he was looking at the data around signups to try and optimize his signup flow.

You can use any sort of qualitative or quantitative data that you have available about your product or your site, and about your users and customers, to identify opportunities to experiment. Once you’ve narrowed that down to a place where you want to run a test, you come up with your hypothesis.

In Mike’s case, the hypothesis was, “Hey, I think that giving a more personalized sign-up experience that fits our brand will actually drive more signups than the industry-standard, really simple form.” To test that hypothesis, they built out their experiment. I would of course advocate for using Optimizely for running an AB test, but there are definitely a lot of other very effective tools out there.

Whatever tool you choose is fine; the important part is that you’re running tests. So you build your experiment, and as you’re building it, you make sure you’re set up to measure the signup flow, where the drop-offs might be, and whatever other metrics you care about for your test, so that later you can determine a winner.

Next, you’re going to run your test, and that can take a varying amount of time. We have a tool online that you can Google: the Optimizely Test Duration Calculator. It takes in the number of visitors that you typically get to your site and the conversion rate for the metric that you’re trying to move.

Those factor into how long it will take to reach statistical significance for your experiment, and that’s an important piece. You want to make sure that the test you’re running reaches statistical significance. Otherwise, you can end up looking at the data too early and acting on a false positive, which is common for tests that haven’t yet reached statistical significance. Once you’ve reached it, you measure your results and see how things went. In Mike’s case, they found a winner and saw a big increase in the number of new customers signing up for Trunk Club, which was a great thing. But I’ve seen a lot of tests that actually have a negative impact on the metrics that matter, and that’s actually not a bad thing.
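To show how visitor volume and baseline conversion rate drive test duration, here is a rough sketch based on the textbook sample-size formula for comparing two proportions. It is not the formula behind Optimizely’s calculator; the function names, the 5% significance level, the 80% power, and the example traffic numbers are all assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variation(baseline_rate: float, relative_lift: float,
                              alpha: float = 0.05, power: float = 0.8) -> int:
    """Visitors needed in each variation to detect the given relative lift."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def estimated_days(baseline_rate: float, relative_lift: float, daily_visitors: int,
                   variations: int = 2, alpha: float = 0.05, power: float = 0.8) -> int:
    """Days needed if all daily visitors are split evenly across the variations."""
    per_variation = sample_size_per_variation(baseline_rate, relative_lift, alpha, power)
    return ceil(per_variation * variations / daily_visitors)


if __name__ == "__main__":
    # Hypothetical inputs: 3% baseline conversion, hoping to detect a 20% relative lift,
    # with 2,000 visitors per day entering the test.
    print(sample_size_per_variation(0.03, 0.20), "visitors per variation")
    print(estimated_days(0.03, 0.20, daily_visitors=2_000), "days")
```

The takeaway matches the point above: lower traffic, a lower baseline conversion rate, or a smaller expected lift all stretch out the time before a test can be called.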

The important part is that you’re running a test and that you’re learning something. That test was running on a smaller percentage of your overall traffic or overall customer base, so you’re learning and iterating quickly. If that test had lost for Trunk Club, it would have saved them the time of building that full new signup flow permanently into the product.

With Optimizely and some other AB testing tools, you can build something pretty quickly as a prototype, put it into a test, and find out how it does, so that you can learn from it before you invest all the engineering and development resources to put something into code permanently. Those are all things to think about as you’re building your AB test. And then you rinse and repeat: you analyze your data again and think about where else you want to test. It’s an iterative process. It’s something I certainly enjoy, and it’s a huge tool for any Product Manager out there.

Who to Target with AB Tests

One of the powerful tools of Optimizely in AB testing is figuring out what audience you want to include in your experiment. That’s typically dependent on the type of hypothesis that you have and what types of metrics you’re trying to measure. And if you’re combining those two things, you’ll figure out the right way to determine what visitors to include in your test.
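One way to picture audience targeting is as a set of conditions that a visitor must meet before being included in an experiment. The sketch below is a generic illustration, not Optimizely’s audience API; the `Visitor` fields, the `all_of` helper, and the example audience are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Visitor:
    visitor_id: str
    is_new: bool
    device: str        # e.g. "mobile" or "desktop"
    landing_page: str  # e.g. "/signup"


# An audience condition is any function that says whether a visitor qualifies.
Condition = Callable[[Visitor], bool]


def all_of(*conditions: Condition) -> Condition:
    """Combine conditions so a visitor must satisfy every one of them."""
    return lambda visitor: all(condition(visitor) for condition in conditions)


def eligible(visitors: Iterable[Visitor], audience: Condition) -> List[Visitor]:
    """Keep only the visitors who should be included in the experiment."""
    return [v for v in visitors if audience(v)]


if __name__ == "__main__":
    # Hypothetical hypothesis: a new signup form will lift signups among NEW visitors,
    # measured on the signup page, so the audience is new visitors landing on /signup.
    audience = all_of(lambda v: v.is_new, lambda v: v.landing_page == "/signup")
    visitors = [
        Visitor("v1", True, "mobile", "/signup"),
        Visitor("v2", False, "desktop", "/signup"),
        Visitor("v3", True, "desktop", "/pricing"),
    ]
    print([v.visitor_id for v in eligible(visitors, audience)])
```

Tying the conditions to the hypothesis and the metric, as in the comment above, keeps visitors who could never affect the result from diluting the test.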

AB Testing by Industry

AB testing doesn’t change drastically by industry. Industry just shifts the type of metrics that you’re looking for.

We work a lot with retailers, and retail is a very straightforward industry to do AB testing in. You’re looking at how you take users through a signup funnel, going from online advertising into the site to make a conversion. And I’d say you can use that kind of generic template with any industry, whether it’s travel, or insurance, or FinTech.

You’re thinking about what the customer journey is, from how they find out about you and gain that awareness of your company, all the way to some type of conversion that you’re trying to achieve. You map that out and you find the points along that journey where it’s important for you to try and move metrics. In e-commerce, that means going from the initial signup to putting something in the cart to purchasing. And there’s an equivalent of that in all the other industries out there.
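Mapping the journey and finding the points worth testing can be sketched as a simple funnel report. The step names and counts below are invented for illustration, and `funnel_report` and `biggest_drop_off` are hypothetical helper names rather than part of any tool.

```python
from typing import List, NamedTuple, Tuple


class FunnelStep(NamedTuple):
    name: str
    count: int
    step_rate: float     # share of the previous step that reached this step
    overall_rate: float  # share of the first step that reached this step


def funnel_report(steps: List[Tuple[str, int]]) -> List[FunnelStep]:
    """Compute step-to-step and overall conversion for an ordered funnel."""
    report = []
    first = steps[0][1]
    previous = first
    for name, count in steps:
        step_rate = count / previous if previous else 0.0
        overall_rate = count / first if first else 0.0
        report.append(FunnelStep(name, count, step_rate, overall_rate))
        previous = count
    return report


def biggest_drop_off(report: List[FunnelStep]) -> FunnelStep:
    """The step (after the first) with the lowest step-to-step conversion."""
    return min(report[1:], key=lambda step: step.step_rate)


if __name__ == "__main__":
    # Hypothetical e-commerce journey with made-up counts.
    journey = [("visit", 10_000), ("signup", 1_200), ("add to cart", 600), ("purchase", 150)]
    report = funnel_report(journey)
    for step in report:
        print(f"{step.name:12} {step.count:6} step {step.step_rate:.0%} overall {step.overall_rate:.1%}")
    print("largest drop-off at:", biggest_drop_off(report).name)
```

The step with the steepest drop-off is often a reasonable first place to look for a test, though the right point still depends on the metric you are trying to move.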

Prioritizing AB Tests

Prioritization is a really hard thing as a Product Manager in general, and applied to AB testing, I think about it in two ways: what’s the level of effort that’s involved in this type of AB test, and what’s the potential impact of the test if it’s successful? You can plot those two on the wall, give each a score, and then find the test that has the lowest level of effort but the highest potential impact. Then you can start from there. That’s typically how we’ll prioritize our tests here. We use anything from a spreadsheet to JIRA, to lots of other tools out there, to manage the testing process. And if you are using that type of scoring, it can really help you prioritize your work.
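The effort-versus-impact scoring can be sketched in a few lines. The 1-to-5 scales, the impact-divided-by-effort ranking, and the example test ideas are assumptions made for illustration; a spreadsheet with the same two columns does the same job.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ExperimentIdea:
    name: str
    impact: int  # potential impact if the test wins, 1 (low) to 5 (high)
    effort: int  # level of effort to build and run, 1 (low) to 5 (high)

    @property
    def score(self) -> float:
        # Assumed scoring rule: more impact per unit of effort ranks higher.
        return self.impact / self.effort


def prioritize(ideas: Iterable[ExperimentIdea]) -> List[ExperimentIdea]:
    """Order ideas by score, breaking ties in favor of the lower-effort idea."""
    return sorted(ideas, key=lambda idea: (-idea.score, idea.effort))


if __name__ == "__main__":
    # Hypothetical backlog of test ideas.
    backlog = [
        ExperimentIdea("Rebuild the signup flow", impact=5, effort=5),
        ExperimentIdea("New homepage tagline", impact=2, effort=1),
        ExperimentIdea("Shorter checkout form", impact=5, effort=1),
    ]
    for idea in prioritize(backlog):
        print(f"{idea.score:4.1f}  {idea.name}")
```

Dividing impact by effort is only one way to combine the two scores; the point is to make the trade-off explicit so the low-effort, high-impact tests surface first.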
