Are You Capitalizing on Conversational Data? What Product Leaders Need to Know

Editor’s note: the following was written by a guest blogger. If you would like to contribute to the blog, please review the Product Blog contribution guidelines and contact [email protected]

Digital transformation was already well underway before 2020, but COVID-19 dealt a massive shock to businesses. The pandemic forced retail storefronts to close and employees across all industries to work remotely. Conversations that previously took place face-to-face moved to the phone, to digital applications, and to hands-free or voice-enabled devices, often spoken from behind a mask. If a business wasn’t already having conversations with employees, customers, or partners over the phone or on digital communication channels, the pandemic surely forced it to.

An Opportunity to Capitalize on Voice

The rapid growth of conversations over digital channels has created a massive data opportunity. Did you know that only 10% of enterprise data is structured, which makes it available for analysis and for triggering time-saving automations? The remaining 90% is unstructured (items such as audio, video, and images), a class of data that is inaccessible to many analytical methods and was growing at 55-60% per year before the pandemic. Now, with remote work the norm, unstructured data is growing even faster.

Among audio, video, and images, audio is growing at a massive rate. To put it in perspective, Zoom reportedly grew 20x in meeting participants over the pandemic. Contact center calls increased by 300% in the early part of the pandemic. If you were to estimate the total number of conversations available to mine in your customer base or target market, the amount of audio information is staggering no matter how you slice it, and it is worth exploring.

Unearthing Actionable Insights

Companies big and small are working to unlock this staggering amount of audio data because they know that this is where they can truly understand the Voice of the Customer (VOC) and implement immediate changes to increase sales, reduce customer churn, improve support, and save costs.  

Large customer experience (CX) solutions like Clarabridge, Genesys, Medallia, and Qualtrics are incorporating voice data into their CX analyses. They realize that mining voice data is how you genuinely understand your customers. Post-transaction surveys normally get a 5-30% response rate, and these surveys are biased toward the “very happy” or “very angry” customers. Companies may never hear from the 70-95% of customers who don’t care or don’t have the time to fill in a survey. Are these customers any less important, or their responses any less useful?

What can you do with this audio data?

  • Detect customers who are likely to churn and win them back
  • Find insights on how to improve your product or service
  • Test if certain sales pitches work
  • Coach salespeople on cold calling
  • Ensure that agents comply with requirements to read regulatory statements
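As a rough illustration of two of the items above (churn detection and compliance checks), here is a minimal Python sketch that scans an already-transcribed call. Everything in it is an assumption made for illustration: the disclosure wording, the churn phrases, and the function names are hypothetical, and a production system would likely use something more robust than exact phrase matching.

```python
import re

# Hypothetical wording, for illustration only; substitute your own
# required disclosure and the churn language you see in your calls.
REQUIRED_DISCLOSURE = "this call may be recorded"
CHURN_PHRASES = ("cancel my account", "switch providers", "too expensive")


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    stripped = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return " ".join(stripped.split())


def review_transcript(transcript: str) -> dict:
    """Flag whether the disclosure was read and which churn phrases appear."""
    clean = normalize(transcript)
    return {
        "disclosure_read": normalize(REQUIRED_DISCLOSURE) in clean,
        "churn_signals": [p for p in CHURN_PHRASES if p in clean],
    }


if __name__ == "__main__":
    call = (
        "Hello, this call may be recorded. "
        "Honestly, I want to cancel my account. It's too expensive."
    )
    print(review_transcript(call))
```

Simple phrase matching like this only works once the audio has been turned into text, which is the step the next section covers.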

There are so many valuable insights left on the table if audio data is not added to your business analysis.  So how do you start?

How to Unlock Voice Data? Speech Recognition

Speech recognition is emerging in nearly every software category as an underlying infrastructure service to unlock unstructured audio data. In some software verticals, such as Conversational AI and Natural Language Processing, product leaders have already incorporated transcription as a foundational part of their product. Using speech-to-text or automatic speech recognition software, in conjunction with other capabilities such as dialog flow software or Text-to-Speech (TTS), allows them to acquire new customers, retain existing customers, or increase add-on sales.


In other verticals, such as Unified Communications as a Service (UCaaS) and Contact Center as a Service (CCaaS), product leaders turn to speech recognition to expedite the creation of new add-on products such as virtual agents or advanced analytics. Product leaders in talent recruiting software companies are using transcriptions and audio metadata (tone, sentiment, pauses, etc.) to screen for the right candidate and analyze their responses in a less biased way. Regardless of the software category or stage of growth, speech recognition improves customer experience and reduces headaches for the innovators bringing voice-enabled experiences to market.

Speech Technologies Are Evolving Rapidly Thanks to AI

In the past 10 years, there has been a renaissance in speech-to-text (STT) technology. Voice technology has moved from command-and-response systems like Apple Siri, Google Home, and Amazon Alexa to solutions that can listen to and parse full conversations in milliseconds. The advent of Artificial Intelligence (AI) allows even the most eccentric of use cases to be solved.

Deep Learning, a subset of AI, has accelerated this revolutionary change in STT. End-to-End Deep Learning platforms have replaced the multi-step, compute-heavy legacy speech recognition process with a single step that runs on fast GPUs: audio in, accurate text out. This End-to-End Deep Learning process creates speech models that are extraordinarily flexible and fast to train. By using an End-to-End Deep Learning approach, organizations are able to continuously train their models, which both reduces error rates and lets the model learn new terms on the fly.


These new approaches for STT have made the technology finally ready for business applications.  Gone are the days of “acceptable” 70% accuracy, trading high accuracy for slower speed, or having to limit audio transcriptions due to high STT costs.  

End-to-End Deep Learning STT solutions can reach 90%+ accuracy, run at millisecond transcription speeds, cost less, use fewer compute resources, and scale better, with hundreds of transcriptions streaming on one GPU. As a result, product leaders are free to re-imagine user experiences and create amazing voice products.
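The accuracy figures above are not defined in this article. One common way to measure STT quality is word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the transcript into a human-verified reference, divided by the number of words in the reference. Assuming "accuracy" here is roughly 1 minus WER (an assumption, not something the article states), a minimal Python sketch for checking a vendor's output against your own reference transcripts could look like this:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the quick brown fox jumps over the lazy dog today"
    hypothesis = "the quick brown fox jumped over the lazy dog today"
    wer = word_error_rate(reference, hypothesis)
    print(f"WER: {wer:.2f}, approximate accuracy: {1 - wer:.0%}")
```

Running a check like this on a sample of your own audio, with your own vocabulary, is usually more informative than a headline accuracy number.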

Our Future is Now, Powered by Voice

Voice is the new frontier for businesses. After being initially developed for consumer-based command-and-response applications, the new breed of speech recognition solutions can now meet more challenging business audio needs with greater accuracy, faster transcriptions, and lower costs.

Speech recognition enables product users to complete tasks more easily, faster, and with greater accuracy. In addition, it allows product leaders to launch new features or add-on products quickly, at a fraction of the cost. Is voice data something you are considering adding to your roadmap? Add any questions or thoughts in the comments, or mention us at @DeepgramAI on Twitter or #Deepgram on LinkedIn.

Check out our whitepaper, How to Make Your Application Voice-Enabled, to learn more.

Meet the Author

Katie Byrne

Katie Byrne has led marketing for enterprise infrastructure, security, and network automation technologies. She has a proven track record of amplifying brands, driving user adoption, and executing programs that deliver business outcomes. She is also an all-around nice human being.
