A/B Testing in Product Management – To Learn or to Ship?

A/B testing in Product Management is invaluable. There is no such thing as a good or bad idea. In the beginning, there are simply ideas. To find out whether those ideas will work, they need to be A/B tested. But what is A/B testing? What is the difference between “to ship” and “to learn” tests? What do Product Managers have to keep in mind when doing A/B testing? Anna Marie Clifton from Yammer answers these questions.



Anna Marie Clifton is the Product Manager for Yammer and co-host of the monthly Clearly Product Book Club Podcast. She started out managing a gallery in New York City but later switched to Product Management. Yammer is a collaboration tool that was acquired by Microsoft in 2012. It is a data-driven company, which is one of the things that has made it strong over the years. Anna Marie has worked on projects for iOS, Android and the web, and is currently leading the notifications initiative and working on some algorithm and search-based projects.

In a recent webinar, she talked about what A/B testing is, how to do it effectively, and the best things you can do as a Product Manager when running tests. “A/B testing, also called split testing, is in its simplest form a statistically valid way of seeing how good or bad future ideas are. In A/B testing, the idea is to release two different versions of a feature to a random set of users and then measure what those users do relative to each other. To complete a test, you need thousands of users doing that particular thing in order to split test the idea.”
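To give a rough sense of why "thousands of users" is a reasonable order of magnitude, here is a minimal sketch of a textbook sample-size estimate for comparing two conversion rates. It is not something presented in the webinar. The function name `users_per_group`, the default significance level of 0.05 and the default power of 0.8 are illustrative assumptions.

```python
import math
from statistics import NormalDist


def users_per_group(baseline_rate, expected_rate, alpha=0.05, power=0.8):
    """Approximate users needed in EACH group to detect a change in a rate.

    Uses the standard normal-approximation formula for two proportions.
    alpha and power are conventional defaults, assumed here for illustration.
    """
    if baseline_rate == expected_rate:
        raise ValueError("rates must differ to estimate a sample size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    effect = expected_rate - baseline_rate
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical example: a rate expected to move from 10% to 12%.
print(users_per_group(0.10, 0.12))
```

With these assumed inputs the estimate comes out at a few thousand users per group, and it grows quickly as the expected change gets smaller.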

“The thing that is extremely important is that the two large groups have alternate versions of the feature at the same time. The reason for this is that a number of things can happen that are out of your control. However, you want to do your best to control those factors. To avoid any bias from events that happen over time, the test has to run at the same time for both groups, with randomly selected users and large sample sizes.”
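One common way to get two random groups that see their versions at the same time is to derive each user's group from a hash of the user ID. The sketch below is only an illustration of that idea, not a description of how Yammer does it. The function names and the experiment name are hypothetical.

```python
import hashlib


def assign_variant(user_id: str, experiment: str) -> str:
    """Deterministically place a user in group 'A' or 'B' for one experiment.

    Hashing the experiment name together with the user ID means the same
    user always gets the same group, and different experiments split the
    user base independently of each other.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    bucket = int.from_bytes(digest[:8], "big") % 2
    return "A" if bucket == 0 else "B"


def split_users(user_ids, experiment):
    """Split a collection of user IDs into the two concurrent groups."""
    groups = {"A": [], "B": []}
    for uid in user_ids:
        groups[assign_variant(uid, experiment)].append(uid)
    return groups


# Hypothetical usage: both groups exist from the first day of the test.
groups = split_users([f"user-{i}" for i in range(10_000)], "new-composer")
print(len(groups["A"]), len(groups["B"]))
```

Because the assignment depends only on the IDs, both groups are exposed over the same period, which is what keeps time-related events from favoring one version.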

She talked about two fundamental types of A/B test: “to ship” and “to learn.” “A Product Manager should try to determine which type of test they’re working on as early as possible in the development process so that they can use that to make the trade-off calls and drive development faster. ‘To ship’ tests usually fix something, or they are done for product completeness.”

On the other hand, the “to learn” type of test might not be a complete experience at all. “It might be only a front-end test without any back-end connected to it. Once the test type has been decided, the baseline for how to make trade-off calls needs to be established. As a Product Manager, you want to make sure that when you’ve decided the test type, you communicate it clearly to your engineers and designers and that you’re making decisions that are backed up by what you’re planning to do.”

The Thing about A/B tests - To Learn or To Ship?


“The best thing a Product Manager can do for their team during A/B testing to make people move faster is to remove ambiguity and help clear the path for the team to work. In A/B testing environments, people are normally testing twice as many things as they ever end up shipping. On average, you can expect four or five of your tests to fail for every four or five that succeed, which is a high percentage of tests that you expect to lose. Many are expected to fail because you are working on new, creative and innovative ideas. However, you can’t always run tests ‘to learn.’ At times you have to run a test, ship it if it wins, and move on.”

She also gives Product Managers a pro tip: keep in mind what your main objective is in the company. “Even though you might feel like you’re doing a lot of unnecessary ‘to learn’ testing or iteration, your engineers and designers don’t think so. They think about, for example, the technical debt and visual inconsistencies, which are what you should think about as well. But don’t forget that the most important thing for you as a Product Manager is to keep the company afloat, and that is really what the team is tasked to do.”


Questions from viewers:


What do you typically do when there are no definitive results from your test?

There’s almost never a very clear-cut case. First, you follow which metrics move on the global level, and those will be your top-line metrics. At the end of the test, you’ll see which metrics moved and which didn’t. Typically they move very little and they’re very hard to affect, which is why they are your top-line metrics.

Secondly, you’re looking at the local metrics. There can also be a mismatch, meaning that the local metrics are doing what the global metrics were expected to do, and if that doesn’t create growth you need to stop it. It grows into a lively conversation trying to work out why the metrics moved in that particular way. From those little changes, the Product Manager has to draw conclusions and decide which way the project will move.
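When a metric moves only a little, a basic significance check can help separate a real change from noise before that conversation starts. The following is a minimal sketch of a standard two-proportion z-test. It is offered as an illustration, not as the analysis used at Yammer, and the function name and example counts are made up.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Compare a rate (e.g. share of users who posted) between two groups.

    Returns (z, p_value) for a two-sided test using the pooled standard error.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: users who performed the action out of users exposed.
z, p = two_proportion_z_test(500, 5000, 510, 5000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A large p-value here means the observed difference is small enough to be plausibly random, which matches the point that results are rarely clear-cut and still need interpretation.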


How do you overcome the challenge of explaining a product that’s not yet built?

I haven’t done a lot of this in the past year and a half, but a Product Manager has to be capable of storytelling about data, to tie the results together and explain how the metrics moved when they changed so little. In the same way, a Product Manager needs to be able to do a little storytelling about the product itself. A good place to start is figuring out who you’re communicating with, and understanding what the product would mean to them and what their needs are.


You mentioned that for A/B testing you should really have thousands of users. Do you have any recommendations for Product Managers with 10 to 30 enterprise clients as opposed to thousands of consumers?

One of the things about Yammer is that we’re an enterprise product. We don’t test on enterprise clients; instead, we test on their users. For example, you may have 10 to 30 enterprises with several hundred or several thousand end-users each, and you can carry out tests on these end-users. You don’t have to do it on the network level.

At Yammer, most of the things that we test can be isolated user-to-user. If you don’t have thousands of end-users that you can A/B test on, then there’s also a lot of customer-driven development that you can do.


What are your thoughts about using session replay tools when doing A/B testing?

We don’t do a lot of session tracking at Yammer, so what I think you’re probably asking is how to map out the number of users that do things and in what order they do them. The reason we don’t do much of it is that session tracking is complicated. At Yammer we’ve built our core metrics to be robust enough that we don’t need to measure sessions to find out why people are doing something.

As mentioned earlier, it’s the Product Manager’s job to tie together the global metrics and the local metrics, but session tracking is a layer between the two and it can confuse them. It’s also technically difficult to perform and I wouldn’t rely on it. There are third-party tools that can do that for you, but they tend to use broad definitions, and you don’t have direct control over what the experience will be like for the user.


How do you prioritize features that should go in an A/B test? I understand it’s mostly about prioritizing and not specific to A/B testing, but what types of features have you found work well in an A/B test?

I’ll give an example. We have a project based on being able to edit posts, and we are testing with end-users in big networks. Imagine that one user created a post and then edited it afterward. If another user who was not in the experiment, and so can’t edit posts or see edited posts, went to see it, they would only ever see the first version of this post. However, the first person would think that everyone can see their edited post. This would break the user experience across the board. This is one of the situations you have to keep in mind, and then decide whether you want the users to be able to interact with each other or whether the feature needs to be available to everyone.
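One way to handle a feature like this, which is an assumption here rather than something described in the answer, is to change the unit of randomization: features that only affect the individual are split user by user, while features that change shared content are split by network so that everyone who can see a post gets the same version. The `User` type, the `shared_content` flag and the experiment names below are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: str
    network_id: str  # the enterprise network the user belongs to


def _bucket(key: str) -> str:
    """Map an arbitrary key to 'treatment' or 'control' deterministically."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return "treatment" if digest[0] % 2 else "control"


def assign(user: User, experiment: str, shared_content: bool) -> str:
    """Pick the randomization unit based on whether users affect each other.

    shared_content=True  -> whole network gets one variant (e.g. editing posts)
    shared_content=False -> each user is assigned independently
    """
    unit = user.network_id if shared_content else user.user_id
    return _bucket(f"{experiment}:{unit}")


# Hypothetical usage: two colleagues always agree on a shared-content feature.
alice = User("alice", "acme")
bob = User("bob", "acme")
print(assign(alice, "edit-posts", shared_content=True),
      assign(bob, "edit-posts", shared_content=True))
```

The trade-off is that randomizing by network leaves far fewer independent units than randomizing by user, so a test set up this way needs more networks or more time to reach a reliable result.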


Can you share your advice or insights for those that are aspiring Product Managers and want to break into the field?

Before I was a Product Manager I was managing an art gallery. I knew that product management was what I wanted to do, but I didn’t have the background to do it. I want to be very encouraging, because when you change careers you’re going to get more rejections than acceptances. Be emotionally prepared to be rejected and don’t take it personally. You have to be able to handle that, because even when you get a job you will get a lot of rejection from customers, engineers and designers. The best way to get into it is to do it on the side, working with someone who is building a little project. You need to be hanging out with people who are making those kinds of things. There are a lot of free tools available that you can use to help you.


Have any comments? Tweet us @ProductSchool
