The Why & How of Enterprise Analytics with Spotify Data Scientist


Enterprise analytics is one of former Spotify Senior Data Scientist Gordon Silvera’s nerdy passions. In this talk, he takes us through an in-depth discussion not only on enterprise analytics practices at Spotify, but also on his personal philosophy on data science and why it’s so valuable.

About Gordon

Former Spotify Senior Data Scientist Gordon Silvera

Gordon had a unique data upbringing. He originally started as a strategist, but his curiosity about data led him to learn more about the subject. He went on to work with some of today’s data visionaries and top data-driven companies, and attributes his growth as a data scientist to working in these environments.

His journey into data started at Caesars Entertainment, where he worked in strategy, pricing hotel rooms. Caesars Entertainment was a pioneer in CRM analytics and gaming, meaning the analytics and business intelligence team there was top class. Gordon became enthusiastic about using data to inform his business decisions, and began asking the data team more and more questions. It got to a point where the business intelligence manager told him that he should learn how to do it himself.

This was a turning point in his career. He started getting into SQL and VBA, learning how to pull information seamlessly to automate many aspects of his job. To learn even more about data, he started working at Dunnhumby, pioneers in CRM, analytics, and retail.

When he worked there, he and his team had full access to data from Kroger’s groceries, a client of Dunnhumby. The data was at an individually identifiable level: every user ID and everything that they bought at Kroger, which was pretty substantial data and Gordon’s first time working with something akin to big data. He eventually moved into their advanced analytics team, before moving on to fantasy sports company FanDuel and eventually to Spotify.

He spent nearly 5 years at Spotify and became Senior Data Scientist. He has since moved on to become Head of Product at Streaks, a sports gaming startup.

The Why and How of Enterprise Analytics

Why Data Analytics Matters

It’s very rare that I actually get to talk about enterprise analytics without people shutting me up very quickly. When people ask me what I do at Spotify, I just tell them that I analyze stuff, because that’s true. But more specifically I work on a team called data SWAT and we are in charge of providing support to a number of the highest priority projects within our company.

For those of you that know a bit about Spotify and the way that we operate our business, we have a system called Bets. These are the top 10 or so highest priority projects within the company, and I work on an internal consulting team that has been thrown on these various Bets to do different types of analysis. They last anywhere between three months and a year, and we need data to help drive the decisions that occur within these Bets. This has been a great opportunity because I get to see the breadth of the company and a variety of analysis methods.

This is a very exciting time at Spotify; we have a great data team with the vision and skillset to transform Spotify into one of the top data analytics companies. But before we discuss Spotify more in-depth, I’m going to talk about analytics generally and why I think that analytics is incredibly important in organizations.

This is a very important moment in the data science industry, because people are starting to lose faith in data. In order to illustrate this, I’m going to use a Google Trends chart. I have three lines here: data analysis system in blue, big data in green, and data science in yellow. And the range of the X-axis is the duration of my overall career.

graph showing Google Trend data for data science and big tech

You’ll notice around 2015, “big data,” according to Google, has started to stall. This could be concerning to me as a data professional, but I’m not so worried, because what this trend reminded me of was Gartner’s Hype Cycle for emerging technology. To me, this resembles the peak of inflated expectations.

Gartner's Hype Cycle for emerging technology

So let’s test this hypothesis. The first point is this Google Trends chart, which is kind of weak evidence. I don’t really know that much about Google Trends, so you could argue one way or the other. Secondly, we see a lot of academic and business-oriented articles from top scholars questioning the value of data. For example, Chris Brahm, who’s the head of advanced analytics at Bain, mentioned that only 4% of companies are able to fully capture the value of big data.

And then the third point here is from Gartner itself. This is their 2016 hype cycle for big data. And what you’ll notice is a number of the principal applications of big data are at the top of the hype cycle. So you have things like predictive analytics, Python, data lakes, Spark, machine learning. Gartner is saying that all of these things are at the top of the hype cycle. This is a bit of a yellow flag, because data isn’t meant to do everything.

The way that I think about it, and this is central to how I think about data, is that we break out of this hype cycle and make analytics valuable for companies on an ongoing basis by connecting data to value. If you only take one thing away from this, it’s this: you have to connect data to underlying business value, ideally quantifiable business value.

Winning at the Margin: Why Companies Need Enterprise Analytics

Before we go into the Why of enterprise analytics, let’s define it. I consider enterprise analytics to be when an analytical organization in a business is considered as important as any other major function within that company, be it marketing, finance, operations, et cetera. And the way that we define “important” is through the financing or the investment that you get, the types of people that you’re bringing in, the prioritization of the hiring, and executive level support.

But beyond the business piece, I have philosophical reasons for believing in analytics. Analytics itself is more than crunching numbers or information; it’s critical to being successful at most anything that is highly competitive or challenging.

As Al Pacino’s character says in Any Given Sunday, “life’s this game of inches. So is football. Because in either game, life or football, the margin for error is so small. I mean one-half a step too late, or too early, and you don’t quite make it. One-half second too slow, too fast, you don’t quite catch it. The inches we need are everywhere around us. They’re in every break of the game, every minute, every second.”

Football is a game of inches, life’s a game of inches. And I really believe that business is a game of inches. Coming from competitive sports and having done business for a while, I think of business and sports as quite similar in many ways. If you want to be successful in business, you have to win at the margin. This is really the concept that Al Pacino’s character is talking about here: winning at the inches. You have to execute the small things well.

The way I define winning at the margin is: consistently executing the profit generating processes of the business better than all competitors. And it’s not only me that thinks this.

These are business people I follow who embody this ideal as well. Jeff Bezos is a highly competitive individual, and the culture at Amazon is competitive; they are really obsessed with optimizing market share and cash flow. You can think about increasing your revenue as offense, but Bezos thinks about this defensively as well, through frugality. At Amazon, frugality is one of their principles and they really live it.

Then there’s Ray Dalio, founder of Bridgewater. I think this guy is an amazing, brilliant dude. The reason that I bring up Ray Dalio is because he has this concept of machines. You have a machine: you know where you want to go, you take an action to get you there, and that causes a result.

You take that result, you learn from it and you take another action that is based off of that initial learning and you progress from there. Based off of that process, he was able to build the largest hedge fund in the world and arguably one of the most successful hedge funds in the world.

And finally, Sam Walton. He says, “you can make a lot of mistakes and can still recover if you have an efficient operation, or you can be brilliant and can still go out of business if you’re too inefficient.”

Analytics is the best path to iterative improvement

Okay. So now you know winning at the margin is important, but how do you actually do this? Here are the lowest common denominators of what is required to win at the margin:

  1. Culture of Discipline
  2. Efficient Execution
  3. Iterative improvement

When I think about business as a data strategist, I break up business into three sections to understand it at a high level. And the way that I break it down is:

  1. Strategy
  2. Operations
  3. Analytics

Strategy could be anything from marketing, your executive team, et cetera. Operations are the people that are executing whatever is necessary to have the business run, and analytics are the nerds that analyze stuff. And these three sections tie pretty neatly into the framework for winning at the margin. Culture of discipline is aligned to strategy, efficient execution is aligned to operations, and iterative improvement is aligned with analytics.

When you think about iterative improvement, at the very basic level it requires two things: 1) You need to know how well you’re doing relative to where you’ve been in the past, relative to competitors, et cetera. 2) You need to know how to get better. Analytics within a business context is critical for both of these things, which is why analytics is the best path to iterative improvement.

The Value of Enterprise Analytics

How do we quantify the value of analytics? Especially in the face of people starting to question the value of big data and data science, it’s very important to be able to quantify the value that we as a profession are bringing to companies.

For those of you that do analysis in optimization or user targeting, it’s quite a bit easier because you can A/B test these things.

If you’re predicting when users are going to churn, you can build a predictive model for it. You split users into two groups. In the A group, you predict each user’s propensity to churn and target the users above a particular propensity. The B group is completely untargeted, and then you compare the two.
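As a rough illustration of the churn test described above, here is a minimal Python sketch. It is not Spotify’s method: the function names, the 0.5 threshold, and the idea of passing churn scores in as a plain dictionary are all assumptions made for the example.

```python
import random

def split_users(user_ids, seed=0):
    """Randomly assign users to group A (eligible for targeting) or group B (untargeted holdout)."""
    rng = random.Random(seed)
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def select_targets(group, churn_score, threshold=0.5):
    """Within group A, pick users whose predicted churn propensity meets the (hypothetical) threshold."""
    return [u for u in group if churn_score[u] >= threshold]

def churn_rate(group, churned):
    """Share of a group that actually churned during the test window."""
    return sum(1 for u in group if u in churned) / len(group)

def absolute_lift(group_a, group_b, churned):
    """Churn reduction credited to targeting: untargeted rate minus targeted-group rate."""
    return churn_rate(group_b, churned) - churn_rate(group_a, churned)
```

A positive `absolute_lift` means the targeted group churned less than the holdout. Whether that difference is more than noise is a separate significance question.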

I consider anything related to optimization or user targeting as operational analytics: analytics where you’re not really gleaning information off of the analysis you’re doing, but you’re improving processes. You’re doing more black box streamlined work.

insight = info point that guides a strategic decision

The other side of that is strategic analytics, where you’re creating insights that people can use to change their decisions. This is a simple definition you can use: an insight is an information point that guides a strategic decision. The strategic side is a bit more difficult to quantify, because how do you quantify an insight?

For example, every time somebody across your entire analytical community sees a PowerPoint presentation and says, “Hey, you know something, that’s interesting. I’m going to do X differently.” Or every time that somebody on a line of production workers sees a data point come up throughout their process and says, “Hm, I think we need to lower the capacity on this machine because of Y.”

Those are both insights, but they are very difficult to tag in practice. If we think about this from a conceptual perspective, it allows us to put this into a more quantitative or economic framework. And one of the frameworks you can use is economies of scale. Economies of scale are a very important concept within enterprise analytics because if you begin as a startup, you generally start small with analytics. You’ve got just a couple business intelligence people, data scientists, analysts, what have you. But this is really limiting.

If you look at the number of insights per analyst, per period of time, it’s clear that building a scalable infrastructure early on will lead to more insights in the long term.

Building Enterprise Analytics at Spotify

So how do we build enterprise analytics at Spotify? Coming into the organization, it was very gratifying for me to see that they’re really thinking about being pioneers in analytics and where they’re trying to go. One of the important things for context about Spotify is that a few years ago Spotify acquired a company called Seed Scientific, which was an analytical consulting company and now makes up the analytics department within Spotify.

They’re the ones now driving a lot of the analytical vision at the company, and the vision the company has in regard to data is incredibly comprehensive and deep. A lot of the tools that they’re building right now are very impressive and interesting, and make my job a lot easier.

At Spotify, we use what we call a Center of Excellence model, which is essentially an org model. You can have embedded analytics teams, you can have a centralized analytics team. This is kind of more of a hub and spoke model, where we have Data Mission at the center of everything. And we almost act like consultants, supporting these embedded teams within their functions.

Center of Excellence model

As an example: we have a marketing team, and then within the marketing team, we have a marketing sciences team. We have Content Insights, which focuses on better understanding our artists and the way that users engage with our artists. We have Product Insights, which looks more deeply at our actual products and how those perform, et cetera.

And then you have Data Mission. They generate a lot of data and develop best practices, processes, and tools to use. They hire some people like me to throw around the business. Data Mission is not focused on any functional objective per se, but generally asks: How can we move the company forward in regard to analytics?

Spotify’s Data Analytics Tools

Now we’re to the good stuff: what Spotify actually does. I use 2 frameworks that give a conceptual understanding of how to approach analytics in an organization. One is the Keys to Excellence, and the other is the Analytics Maturity Pyramid. And I blend these together along with things that Spotify uses, like BigQuery for big data.


Keys to Excellence
Analytics Maturity Pyramid

There’s an article from Harvard Business Review called “Does Your Data Have a Strategy?” It talks about sources and versions of truth. The single sources of truth are the most important data: things that flow through to your financials, things that are shown across a company for company-wide metrics. These have to be highly accurate. At Spotify, we have teams that are dedicated to generating single sources of truth, which we call business-critical data. This is incredible data with its own SLOs (service level objectives).

On the other hand, you have multiple-version data. This is more exploratory work. We use the squad model; even within Product Insights, you’re going to have a bunch of analysts distributed across a bunch of different squads, and they’re going to generate data. So for instance, on Search, every time that you search for an artist or a song on Spotify, we retrieve that data. That data is very important to the search team, maybe less so to a lot of other teams.

So that’s where the multiple versions of truth come in. Search data doesn’t have to be a hundred percent accurate. I would imagine with search data, we could probably get decent insights off of 80-90% accuracy. Other companies use this model as well, especially tech companies that generally operate with the squad model, where it’s important to have multiple versions of truth.

Our A/B testing system is called ABBA, like the band. It has to coordinate across engineers, who do instrumentation on particular aspects of the A/B test, and our product owners and strategic people, who look at it for performance, t-tests, significance testing, et cetera.
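To make the significance-testing step concrete, here is a minimal Python sketch of one common choice, Welch’s unequal-variance t-test, using only the standard library. The talk does not say which test ABBA runs, so treat this as an illustration rather than a description of the tool; the function name and inputs are assumptions.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Variance of each group's mean: sample variance divided by sample size.
    var_mean_a = variance(sample_a) / n_a
    var_mean_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(var_mean_a + var_mean_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (var_mean_a + var_mean_b) ** 2 / (
        var_mean_a ** 2 / (n_a - 1) + var_mean_b ** 2 / (n_b - 1)
    )
    return t, df
```

Turning the statistic into a p-value needs the t distribution, which the standard library does not provide. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` runs the same test and returns the p-value directly.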

We also have what we call System Z. This is a platform that runs and schedules all of our internal products. I specifically use this for something called BigQuery runner, which allows you to schedule queries on an ongoing basis. It handles dependencies, so if you have to run one query and then pull data off of that query and run another one afterwards, it can do all that.
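The dependency handling described above amounts to running queries in topological order. Here is a minimal Python sketch using the standard library’s `graphlib` (Python 3.9+). The job names are hypothetical and this is not how BigQuery runner is implemented; it only shows the ordering idea.

```python
from graphlib import TopologicalSorter

def run_order(dependencies):
    """Return query names ordered so each query comes after the queries it depends on.

    `dependencies` maps a query name to the set of query names it reads from.
    """
    return list(TopologicalSorter(dependencies).static_order())

# Hypothetical job names, for illustration only.
jobs = {
    "daily_streams": set(),
    "artist_totals": {"daily_streams"},
    "top_artists_report": {"artist_totals"},
}
order = run_order(jobs)
print(order)
```

If two queries depend on each other, `static_order()` raises `graphlib.CycleError`, which is a sensible point for a scheduler to reject the configuration.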

My favorite internal tool at Spotify is called Lexicon. It has every single analysis that data scientists have done. And it has a ton of metadata that you can index and search, et cetera. This is invaluable to me, as I’m kind of a rare case where I don’t have a lot of context on the projects that I’m going into; they just throw me around the business on different projects and say, figure it out.

It also helps with exploring data. We have 60,000 data sets. Here you’ll see all the primary data sets related to artists, albums, et cetera. It makes it a hundred times faster for me to find the data that I actually need to use. And it looks pretty!
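One simple way to make analyses and data sets searchable by their metadata is an inverted index from keywords to record ids. The Python sketch below is only an assumption about how such a catalog could work, not a description of Lexicon; the record layout (`title` and `tags` fields) is invented for the example.

```python
from collections import defaultdict

def build_index(records):
    """Map each lowercase keyword to the ids of records whose title or tags mention it."""
    index = defaultdict(set)
    for record_id, meta in records.items():
        words = meta["title"].lower().split() + [tag.lower() for tag in meta["tags"]]
        for word in words:
            index[word].add(record_id)
    return index

def search(index, query):
    """Return ids of records that match every word in the query (AND semantics)."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

For example, after indexing a record titled "Artist streams by country" with the tag "artists", `search(index, "artist country")` returns that record’s id.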

It also helps that Spotify is very specific about certain aspects of their hiring. One of the things that they really look for is T-shaped skillsets. This is somebody who has a lot of depth in one thing, but also has broad knowledge in a variety of other things.

I was talking to this guy the other day who was a product owner for one aspect of the business. He came in and started to look at a piece of information, and just pulled up his command line terminal and started querying a massive set of data. And it turns out he used to be an engineer for six years before going over to the business side. So you have these people with very wide skills, but also very deep skills.

Conclusion

In the Hype Cycle of emerging technology, data science is at the peak of inflated expectations, where people expect too much from data and are becoming disappointed in it. We should break beyond the big data hype, and the way that we can do that is by connecting data to its underlying value.

turn data into value

And also, business is a game of inches, a game of margin. It requires discipline, effective execution, and iterative improvement, and analytics is the best way to make this improvement.

Because Spotify invests in data and puts it at the center of everything it does, it’s well positioned to become one of the top analytics companies in the world.

Note: at the time of this talk, Gordon Silvera was just 9 months into his almost 5-year tenure as a Spotify Data Scientist. He went on to become Senior Data Scientist at Spotify, helping transform it into a world-class analytical company as he hoped for in this talk. He gave this talk at an interesting point for data science, when the initial hype of the possibilities of data was cooling down and the reality emerged: data can’t perform miracles, but it can help us continuously improve to become the best of the best.
