The JSON Overhead Nobody Talks About: Why Your LLM Pipeline is Bleeding Money

You're probably overpaying for structured data in LLM pipelines. Here's what I've learned from years of building data systems.

Juan David Avellaneda · May 9, 2026

The Problem Nobody Names

Last month I was building a dashboard integration that fed customer data into an LLM for automated insights. Standard approach: JSON. I realized halfway through that we were spending about 40% more on API tokens than the data itself required. Not because the LLM was inefficient. Because we were wrapping everything in curly braces and quotation marks.

Here's the thing. JSON is everywhere. It's the lingua franca of web development. But when you're paying per token, every quote, brace, and repeated key name gets tokenized and billed along with the data, and it becomes a tax you didn't budget for. I'm not sure this even registers for most teams building these systems, but once you see it, you can't unsee it.

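  • A simple customer record with 10 fields can spend more characters on key names and punctuation than on the values themselves
  • Nested structures compound the overhead, because every nested object repeats its keys and brackets, and honestly I keep making this mistake
  • Schema definitions, type declarations, and closing brackets all consume tokens without carrying any data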
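
To make the first point concrete, here is a minimal sketch that measures how much of a serialized record is keys and punctuation rather than values. The record is hypothetical, invented for illustration, and the measurement counts characters, which is only a rough stand-in for tokens; a real audit would use your provider's tokenizer.

```python
import json

# Hypothetical 10-field customer record, invented for illustration.
record = {
    "customer_id": 10482,
    "first_name": "Ana",
    "last_name": "Rojas",
    "email": "ana.rojas@example.com",
    "country": "CO",
    "plan": "pro",
    "seats": 12,
    "monthly_spend": 349.0,
    "signup_date": "2023-04-17",
    "active": True,
}


def syntax_share(rec: dict) -> float:
    """Fraction of the serialized JSON characters that are not field values."""
    serialized = json.dumps(rec)
    # Characters that carry actual data (the values, without quotes).
    value_chars = sum(len(str(value)) for value in rec.values())
    return 1 - value_chars / len(serialized)


print(f"{syntax_share(record):.0%} of the JSON string is keys and punctuation")
```

Character share and token share will not match exactly, since tokenizers merge common sequences, so treat the number as a signal that something is worth measuring rather than as a bill.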

What I've Learned From Looker Studio and Power BI

Years of building dashboards taught me something unexpected about data: the format you present it in changes how expensive it is to process. In Looker Studio, we obsess over field density. In Power BI, we compress data before loading. The principle transfers to LLMs, except the cost is immediate and measurable.

When I moved from JSON to delimited formats for certain pipelines—pipe-separated values, tab-delimited, even custom encodings—the token count dropped by 30-35%. The LLM still understood the structure. It just didn't have to decode unnecessary syntax overhead.
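
A minimal sketch of what that switch can look like for flat records, assuming every record shares the same fields and no value contains the delimiter. The function name `to_pipe_delimited` and the sample data are hypothetical, not taken from any of the pipelines described here.

```python
import json


def to_pipe_delimited(records: list[dict]) -> str:
    """Serialize flat records as one header line plus one pipe-separated line per record."""
    if not records:
        return ""
    fields = list(records[0].keys())
    lines = ["|".join(fields)]  # field names appear once, not once per record
    for rec in records:
        values = []
        for field in fields:
            text = "" if rec.get(field) is None else str(rec[field])
            # Refuse values that would break the format instead of corrupting it silently.
            if "|" in text or "\n" in text:
                raise ValueError(f"value for {field!r} contains a delimiter")
            values.append(text)
        lines.append("|".join(values))
    return "\n".join(lines)


# Hypothetical sample data.
records = [
    {"customer_id": 1, "plan": "pro", "seats": 12},
    {"customer_id": 2, "plan": "free", "seats": 1},
]
json_text = json.dumps(records)
pipe_text = to_pipe_delimited(records)
print(pipe_text)
print(len(json_text), "characters as JSON,", len(pipe_text), "as pipe-delimited")
```

Most of the saving comes from stating the field names once in a header instead of repeating them in every record. The delimiter check is there because a stray pipe in a free-text field would shift every column after it.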

But here's where I hesitate. Not every format works for every use case. Complex nested hierarchies still benefit from JSON's clarity, and swapping formats adds cognitive load to the team maintaining the code. I've seen projects where we saved tokens but lost readability, and that's not always a win.

The Real Cost Calculation

Let me be concrete. OpenAI's GPT-4 pricing (as of 2024) runs roughly $0.03 per 1K input tokens. If you're processing thousands of records daily through an LLM pipeline, and JSON overhead is consuming 30-40% of your tokens, you're not talking about theoretical savings. You're talking about actual dollars every single day.

I worked with a mid-size SaaS company last year running customer analysis through Claude. They were spending approximately $4,000 monthly on token usage. We restructured their data format—not even drastically different, just removing redundant schema repetition—and cut it to $2,600. Same output. Same accuracy.

That's real money. That's also $16,800 annually that doesn't have to justify itself to a CFO.
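
The arithmetic behind those figures, as a small sketch. The $4,000 and $2,600 come from the case above and the $0.03 per 1K tokens from the pricing mentioned earlier; the one-million-tokens-per-day volume and the 35% overhead share are assumptions picked only to show the formula.

```python
def overhead_cost(input_tokens: int, price_per_1k: float, overhead_fraction: float) -> float:
    """Dollars spent on the share of input tokens attributed to format overhead."""
    return input_tokens / 1000 * price_per_1k * overhead_fraction


def savings(monthly_before: float, monthly_after: float) -> tuple[float, float, float]:
    """Return (monthly savings, annual savings, fractional reduction)."""
    monthly = monthly_before - monthly_after
    return monthly, monthly * 12, monthly / monthly_before


monthly, annual, reduction = savings(4000, 2600)
print(f"${monthly:,.0f}/month, ${annual:,.0f}/year, {reduction:.0%} reduction")
# prints: $1,400/month, $16,800/year, 35% reduction

# Assumed volume: 1M input tokens per day at $0.03 per 1K, with 35% overhead.
print(f"${overhead_cost(1_000_000, 0.03, 0.35):.2f} per day spent on overhead")
```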

What Format Should You Actually Use

  • CSV or delimited formats for flat, tabular data—simple, lean, LLMs parse them fine
  • Minimal XML or custom schema only when hierarchies genuinely demand it, though I'm honestly uncertain whether most systems need this complexity
  • Column-major layouts (one line per field, listing all of its values) instead of row-major (one line per record) when feeding aggregate data
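
To show the difference between the two layouts in the last bullet, here is a rough sketch with hypothetical monthly aggregates. The function names are made up for this example, and whether column-major actually saves tokens for your data is something to measure, not assume.

```python
def to_row_major(records: list[dict]) -> str:
    """One header line, then one comma-separated line per record."""
    fields = list(records[0].keys())
    lines = [",".join(fields)]
    lines += [",".join(str(rec[field]) for field in fields) for rec in records]
    return "\n".join(lines)


def to_column_major(records: list[dict]) -> str:
    """One line per field, holding that field's value from every record."""
    fields = list(records[0].keys())
    return "\n".join(
        f"{field}: " + ",".join(str(rec[field]) for rec in records)
        for field in fields
    )


# Hypothetical monthly aggregates, invented for illustration.
monthly = [
    {"month": "2024-01", "revenue": 1200, "churned": 3},
    {"month": "2024-02", "revenue": 1350, "churned": 2},
    {"month": "2024-03", "revenue": 1500, "churned": 4},
]
print(to_row_major(monthly))
print()
print(to_column_major(monthly))
```

The column-major version keeps each metric's values next to each other, which suits questions about a trend in a single field. Row-major keeps each record together, which suits questions about individual customers or periods.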

The catch is integration. JSON dominates because it integrates everywhere. Switching formats means updating your entire pipeline—data collection, transformation, LLM input, output parsing. Not trivial. Not impossible, but it requires the kind of refactoring that teams deprioritize until someone actually quantifies the savings.

The Uncomfortable Middle Ground

I want to say there's a clean answer here. Use structured-but-lean formats, measure your token consumption, optimize ruthlessly. But most production systems aren't built that way. They're built incrementally. JSON works. It ships. It integrates. Nobody gets fired for choosing JSON.

What I actually recommend: audit your LLM pipeline's token consumption. Run the same data through two different formats and measure the difference. If you're processing enough volume, the answer becomes obvious. If you're not, the overhead probably doesn't matter and you have bigger problems to solve.
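
A minimal sketch of that audit, using only the standard library. The `rough_token_count` function is a crude proxy I'm assuming for illustration, not any provider's real tokenizer; pass your provider's counting function as `count` to get numbers you can trust. The sample records are hypothetical.

```python
import csv
import io
import json
import re
from typing import Callable


def rough_token_count(text: str) -> int:
    """Crude proxy: count word runs and individual punctuation marks.

    An assumption for illustration only; real tokenizers split text differently.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


def as_json(records: list[dict]) -> str:
    return json.dumps(records)


def as_csv(records: list[dict]) -> str:
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def compare_formats(
    records: list[dict],
    formats: dict[str, Callable[[list[dict]], str]],
    count: Callable[[str], int] = rough_token_count,
) -> dict[str, int]:
    """Encode the same records in each format and count tokens for each."""
    return {name: count(encode(records)) for name, encode in formats.items()}


# Hypothetical sample: 50 flat records.
sample = [{"customer_id": i, "plan": "pro", "seats": i + 1} for i in range(50)]
counts = compare_formats(sample, {"json": as_json, "csv": as_csv})
reduction = 1 - counts["csv"] / counts["json"]
print(counts, f"reduction: {reduction:.0%}")
```

Run it on a representative slice of your own data, not a toy sample, and check that the LLM's answers stay the same across formats before committing to the change.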

The question that haunts me is whether this optimization is even worth discussing in most contexts, or whether we're all just too focused on token economics instead of the actual quality of insights the LLM generates. Maybe the format doesn't matter as much as whether anyone's actually reading the output.

#LLM #JSON #token-optimization #data-engineering #cost-reduction

Juan David Avellaneda

Innovation Specialist · Bogotá, Colombia