Why Your Call Center Data Is Probably Lying to You (And How to Fix It)

Building an AI sentiment analyzer for call recordings changed how I think about customer data, and showed me why dashboards alone aren't enough.

Juan David Avellaneda · April 19, 2026 · 4 min read

The Dashboard Trap

I've spent the last five years building dashboards in Looker Studio and Power BI. Beautiful ones. The kind that make executives nod during presentations. Charts that trend upward. KPIs that look healthy. But here's what nobody tells you: a 4.2-star customer satisfaction score tells you almost nothing about why customers are actually angry.

Last month, a client showed me their Q3 metrics. Everything looked stable. Revenue up 3%. Support tickets down 8%. Then I listened to ten random call recordings and heard something entirely different: customers were frustrated with a specific feature rollout, but they weren't escalating their complaints because the IVR system kept disconnecting them. The data wasn't wrong. It was just incomplete. Worse, it pointed in exactly the wrong direction: the drop in tickets read as fewer problems when it really meant customers couldn't get through.

What Changed When I Built This Thing

I started experimenting with Whisper (OpenAI's speech-to-text model) in August of last year because I was tired of reading transcripts and guessing. The workflow is straightforward enough: feed it audio, get text back, then throw that text at BERTopic to surface what customers are actually talking about. Not what your taxonomy thinks they're talking about. What they're actually discussing.

  • Whisper's multilingual models handle many languages with a single checkpoint, which matters if you're dealing with Colombian Spanish mixed with English, like we do in Bogotá tech circles.
  • BERTopic clusters conversations into topics dynamically, so you don't have to predefine categories. That sounds convenient, except now I'm drowning in 47 different micro-topics and trying to figure out which ones matter.
  • Streamlit made the interface fast to prototype. Too fast, honestly. I had something working in a weekend and suddenly everyone wanted real-time analysis.
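
A minimal sketch of the audio-to-topics workflow described above, not the exact pipeline from this project. It assumes the `openai-whisper` and `bertopic` packages; the names `transcribe_calls`, `rank_topics`, `run_pipeline`, and the `min_share` cutoff are hypothetical, introduced only for illustration. The ranking helper is one simple way to tame a long list of micro-topics: drop BERTopic's outlier topic (-1) and keep only topics that cover a meaningful share of the remaining calls.

```python
from collections import Counter
from typing import Iterable, Protocol


class Transcriber(Protocol):
    """Anything with a Whisper-style transcribe(path) -> {"text": ...} method."""

    def transcribe(self, audio: str) -> dict: ...


def transcribe_calls(model: Transcriber, audio_paths: Iterable[str]) -> list[str]:
    """Return one transcript per recording, skipping recordings with no text."""
    transcripts = []
    for path in audio_paths:
        result = model.transcribe(path)  # Whisper returns a dict with a "text" key
        text = result["text"].strip()
        if text:
            transcripts.append(text)
    return transcripts


def rank_topics(topic_ids: Iterable[int], min_share: float = 0.05) -> list[tuple[int, float]]:
    """Rank topics by their share of non-outlier calls, largest first.

    Topic -1 is BERTopic's outlier bucket and is excluded. min_share is a
    hypothetical cutoff; tune it to your own data.
    """
    counts = Counter(t for t in topic_ids if t != -1)
    total = sum(counts.values())
    if total == 0:
        return []
    shares = [(topic, n / total) for topic, n in counts.most_common()]
    return [(topic, share) for topic, share in shares if share >= min_share]


def run_pipeline(audio_paths: list[str]) -> list[tuple[int, float]]:
    """End-to-end sketch; requires openai-whisper and bertopic to be installed."""
    import whisper  # imported lazily so the helpers above work without it
    from bertopic import BERTopic

    model = whisper.load_model("base")  # model size is an arbitrary choice here
    docs = transcribe_calls(model, audio_paths)
    topic_model = BERTopic(language="multilingual")
    topics, _ = topic_model.fit_transform(docs)
    return rank_topics(topics)
```

If the topic list is still too long, BERTopic also offers `reduce_topics` to merge similar topics after fitting; whether merging or filtering suits your data better is a judgment call.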

The Sentiment Part Gets Messy

Here's where I'll be honest: sentiment analysis from call recordings isn't as clean as you want it to be. A customer can sound frustrated but be asking a genuine question. They can sound polite while describing a broken workflow. I'm not sure sentiment scores alone capture what actually matters. Maybe what you really need is to know when a customer's tone shifts—when they go from neutral to frustrated mid-call—because that's when something broke.

I built a component that flags sentiment swings rather than absolute sentiment scores. It probably works better than one score per call. It also requires more interpretation, which means your analysts can't just stare at a red/yellow/green indicator and call it done.
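
One way a swing detector like this could look, as a sketch under stated assumptions rather than the component itself. It assumes you already have one sentiment score per transcript segment, in call order, on a scale from -1 (negative) to 1 (positive); where those scores come from is left open. The names `Swing` and `find_negative_swings`, and the `window` and `min_drop` defaults, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Swing:
    segment_index: int  # first segment of the "after" window
    before: float       # mean sentiment of the window just before this point
    after: float        # mean sentiment of the window starting at this point


def find_negative_swings(scores: list[float], window: int = 3, min_drop: float = 0.5) -> list[Swing]:
    """Flag points where mean sentiment falls by at least min_drop
    between two adjacent windows of `window` segments each."""
    swings = []
    i = window
    while i + window <= len(scores):
        before = sum(scores[i - window:i]) / window
        after = sum(scores[i:i + window]) / window
        if before - after >= min_drop:
            swings.append(Swing(i, before, after))
            i += window  # skip ahead so one drop is not reported repeatedly
        else:
            i += 1
    return swings


# A call that starts neutral and turns sour at segment 3
call = [0.1, 0.0, 0.2, -0.6, -0.7, -0.5, -0.6]
for swing in find_negative_swings(call):
    print(f"tone dropped at segment {swing.segment_index}")
```

Note that a call that is uniformly negative from start to finish produces no flags here. That is deliberate: the sketch looks for the moment something changed, not for calls that were unhappy throughout.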

The bigger challenge: recording compliance. If you're not careful about consent and data retention, you've built a beautiful tool that your legal team will shut down on a Tuesday morning.
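
A small sketch of gating the pipeline on consent and retention before any audio is transcribed. Everything here is an assumption for illustration: the `Recording` fields, the `eligible_for_analysis` helper, and the 365-day default, which stands in for whatever window your legal team actually specifies.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Recording:
    path: str
    recorded_on: date
    consent_given: bool  # hypothetical flag; your telephony system may store this differently


def eligible_for_analysis(recordings: list[Recording], today: date, retention_days: int = 365) -> list[Recording]:
    """Keep only recordings that have consent and fall inside the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return [r for r in recordings if r.consent_given and r.recorded_on >= cutoff]
```

Running this filter as the first step means recordings without consent never reach the transcription or topic-modeling stages at all.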

Why This Matters to People Like Me

I've always believed that better data makes better decisions. What I didn't fully appreciate until I started digging into call recordings is that the gap between "data you have" and "data you need" is where most insights get lost. Your Looker Studio dashboard shows you that churn increased. The call recordings show you why—usually something nobody predicted in the sprint planning meeting three months ago.

Building this forced me to think differently about what a "metric" actually is. In Power BI, metrics are clean. Defined. Queryable. In a call recording, a metric is buried in 47 minutes of conversation between two people who don't know they're being analyzed. You have to extract meaning from ambiguity, and that's not a dashboard problem. It's an AI problem.

What I'm Still Figuring Out

Whether this should be a standalone tool or integrated into your existing stack, I genuinely don't know. Does it live in your CRM? In your data warehouse? As a separate Streamlit app that analysts use on demand? I've seen it work three different ways at three different companies and the answer seems to depend on whether your team is more comfortable with notebooks or SQL queries, which is... not a satisfying answer.

Also, the computational cost keeps me up at night. Running Whisper on 500 hours of calls a month isn't free. Neither is the hosting. Neither is the person who has to babysit the pipeline when a model update breaks everything.
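
A back-of-the-envelope way to reason about the transcription bill. The function name and both rate parameters are hypothetical, and the numbers in the usage line are placeholders, not measurements: `realtime_factor` is how many hours of audio your hardware transcribes per hour of compute, and `gpu_hourly_rate` is whatever your provider charges.

```python
def monthly_transcription_cost(audio_hours: float, realtime_factor: float, gpu_hourly_rate: float) -> float:
    """Estimate compute cost: audio_hours / realtime_factor gives compute hours,
    which are then multiplied by the hourly rate."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    compute_hours = audio_hours / realtime_factor
    return compute_hours * gpu_hourly_rate


# Placeholder inputs: 500 audio hours, 10x real time, 1.0 currency unit per GPU hour
print(monthly_transcription_cost(500, realtime_factor=10, gpu_hourly_rate=1.0))  # 50.0
```

This covers compute only. Hosting and the time of whoever maintains the pipeline sit on top of it and are usually harder to estimate.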

If you're sitting on years of call recordings and you haven't looked at them as data yet, you're probably missing something important. The question isn't whether to build this. It's whether you're ready to act on what you'll find.

Juan David Avellaneda

Innovation Specialist · Bogotá, Colombia