Herman Holterman · 17 March 2026

How we used AI to map an entire data infrastructure in hours

AI · Data & BI · data architecture

[Image: AI-driven data infrastructure analysis]

Imagine a data platform that has grown organically over the years: 1,200 tables, 7,000 data mappings, and 150+ reports spread across 10 business domains, fed by over 70 source systems. From finance to production, from supply chain to HR. Everything interconnected through a complex web of data transformations, dependencies, and refresh schedules.

This was the environment we recently encountered. The client’s question was clear: “We want control over our data infrastructure. What do we have, how does it connect, and where are the risks?”

A perfectly valid question. But the traditional answer—manual documentation by consultants spending weeks sifting through tables, queries, and reports—didn’t fit the timeline, the budget, or frankly the current state of technology.

The traditional approach: why it no longer suffices

Let’s be honest about how these projects typically unfold. A team of two to three consultants retreats for weeks with SQL exports, Excel sheets, and Visio diagrams. They interview key users, reverse-engineer ETL logic, and try to piece together a coherent picture of something that was never designed as a whole.

The result is often a stack of documents that is already outdated at the moment of delivery. This isn’t because the consultants did poor work, but because the complexity is simply too great to track manually.

For a platform of this scale, you’re looking at:

  • Source systems: which operational systems feed the platform, and through which integrations?
  • Data transformations: what happens to the data between source and report? Which business rules are applied?
  • Semantic relations: how do tables and fields relate to each other, and to business terminology?
  • Data lineage: if a source field changes, which reports are affected? (A sketch below makes this concrete.)
  • Refresh schedules: when is what refreshed, in what order, and with which dependencies?

Manually untangling this for 1,200 tables isn’t just time-consuming. It’s error-prone, incomplete, and barely repeatable.
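
To see why, take the data lineage question. Treated as a graph, “if this source field changes, which reports are affected?” is a reachability query. A minimal sketch, assuming the lineage edges (source → derived object) have already been extracted; all names here are hypothetical:

```python
# Sketch: impact analysis as graph reachability. Each edge points from an
# object to something derived from it; the names are illustrative only.
from collections import defaultdict, deque

lineage = defaultdict(list)
for src, dst in [
    ("erp.orders.amount", "dwh.fact_sales.amount"),
    ("dwh.fact_sales.amount", "report.monthly_revenue"),
    ("dwh.fact_sales.amount", "report.margin_dashboard"),
]:
    lineage[src].append(dst)

def downstream(field):
    """Return everything reachable from `field` via breadth-first search."""
    seen, queue = set(), deque([field])
    while queue:
        for nxt in lineage[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(downstream("erp.orders.amount"))
# -> the fact table column plus both reports that depend on it
```

Trivial for three edges; for 7,000 mappings it is exactly the kind of bookkeeping that humans get wrong and machines don’t.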

Our approach: AI as analytical accelerator

We chose a different route. Instead of manual documentation, we let AI do the heavy lifting.

Metadata, transformation logic, and report structures are, in the end, just data. And if there’s one thing AI excels at, it’s processing large volumes of structured and semi-structured data.

Here’s what our approach looked like in practice:

Step 1: Extraction and structuring

We exported the complete configuration from both the ETL/Data Warehouse layer and the BI environment. Think table definitions, field descriptions, transformation rules, report metadata, and refresh configurations. We fed this as structured input to Claude Code, Anthropic’s AI-powered development tool.
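
To make that concrete, here is a minimal sketch of the table-and-column part of such an export, assuming a database that exposes the standard information_schema views. The DSN and output path are hypothetical, and the real export also covered transformation rules, report metadata, and refresh configurations:

```python
# Sketch: dump table and column metadata as structured JSON input for analysis.
# The DSN is a placeholder; any DB-API driver works the same way.
import json
import pyodbc

conn = pyodbc.connect("DSN=dwh")  # hypothetical data source name
cursor = conn.cursor()
cursor.execute("""
    SELECT table_schema, table_name, column_name, data_type, is_nullable
    FROM information_schema.columns
    ORDER BY table_schema, table_name, ordinal_position
""")

tables = {}
for schema, table, column, dtype, nullable in cursor.fetchall():
    tables.setdefault(f"{schema}.{table}", []).append(
        {"column": column, "type": dtype, "nullable": nullable == "YES"}
    )

with open("metadata_export.json", "w") as f:
    json.dump(tables, f, indent=2)
```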

Step 2: Analysis with Claude Code

Claude Code analyzed the complete dataset and produced structured overviews per domain:

  • A complete inventory of all source systems and their integrations
  • Transformation chains from source to report, including business rules
  • Dependency graphs between tables, fields, and reports
  • A complete data lineage overview
  • Identification of issues: unused tables, circular dependencies, missing documentation (two of these checks are sketched below)
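
Circular dependencies and unused tables reduce to standard graph queries once the dependency graph exists, and a valid refresh order falls out of the same structure. A minimal illustration using networkx, with hypothetical table names (this shows the kind of check involved, not the actual analysis code):

```python
# Sketch: issue detection on the dependency graph. Edges run from each object
# to whatever is derived from it; the node names are illustrative only.
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([
    ("stg.orders", "dwh.fact_sales"),
    ("dwh.dim_customer", "dwh.fact_sales"),
    ("dwh.fact_sales", "report.revenue"),
    ("stg.legacy_log", "stg.legacy_log_archive"),  # nothing reads the archive
])

# Circular dependencies: any cycle makes a clean refresh order impossible.
cycles = list(nx.simple_cycles(g))

# Unused tables: no downstream consumers and not an end product themselves.
unused = [n for n, deg in g.out_degree() if deg == 0 and not n.startswith("report.")]

# Refresh order: for an acyclic graph, a topological sort is a valid schedule.
if not cycles:
    refresh_order = list(nx.topological_sort(g))
```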

One of the first insights: by far the largest tables on the platform turned out to be logging tables in the database. Tables originally set up for troubleshooting had grown over the years into the dominant storage consumer, without anyone actively managing them.
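
Surfacing that kind of outlier is straightforward once you query the system catalog directly. A sketch for a PostgreSQL-backed platform; the connection details are hypothetical, and other engines expose similar views:

```python
# Sketch: list the ten largest tables, including index and TOAST storage.
# Connection details are placeholders; the SQL below is PostgreSQL-specific.
import psycopg2

conn = psycopg2.connect("dbname=dwh user=analyst")  # hypothetical
with conn.cursor() as cur:
    cur.execute("""
        SELECT c.relname,
               pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
        FROM pg_class c
        JOIN pg_namespace n ON n.oid = c.relnamespace
        WHERE c.relkind = 'r'
          AND n.nspname NOT IN ('pg_catalog', 'information_schema')
        ORDER BY pg_total_relation_size(c.oid) DESC
        LIMIT 10
    """)
    for name, size in cur.fetchall():
        print(f"{name:40} {size}")
```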

What took hours here would have taken weeks manually. And the output wasn’t just faster, but also more complete; AI doesn’t miss tables because it’s Friday afternoon.

Step 3: From technical output to visual communication

This is perhaps the biggest innovation in our approach. Raw technical analysis is valuable for the project team but often indigestible for stakeholders. And it’s precisely those stakeholders who need to make the decisions.

We fed the structured AI output into Google NotebookLM Studio. This allowed us to rapidly generate:

  • Infographics that visually summarize the architecture per domain
  • A short video that walks the client through the complete environment in a few minutes
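
The handoff itself is simple: NotebookLM works from uploaded source documents, so the structured analysis only needs to be rendered into readable per-domain briefings first. A minimal sketch, with hypothetical field names:

```python
# Sketch: render the analysis output as one plain-text briefing per domain,
# ready to upload as NotebookLM source material. Field names are illustrative.
import json
from pathlib import Path

analysis = json.loads(Path("analysis_output.json").read_text())

for domain, info in analysis["domains"].items():
    lines = [
        f"Domain: {domain}",
        f"Source systems: {', '.join(info['sources'])}",
        f"Tables: {len(info['tables'])}, reports: {len(info['reports'])}",
        "Known issues:",
    ]
    lines += [f"  - {issue}" for issue in info["issues"]]
    Path(f"briefing_{domain}.txt").write_text("\n".join(lines))
```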

The result was that we could take the client through the full current state of their data infrastructure in a single session. No death-by-PowerPoint, no 80-page documents that nobody reads. Visual, understandable, and directly usable as a basis for decision-making.

The power of the chain

It’s tempting to reduce this story to “we used AI.” But the real strength isn’t in any single tool. It’s in the chain:

  • Claude Code for rapid analysis: understanding complexity at a scale that’s not achievable manually.
  • NotebookLM Studio for visual communication: translating technical complexity into comprehensible content.
  • Domain expertise as the connecting factor: AI delivers the data, but interpretation and translation into concrete recommendations remain human work.

Without domain knowledge, you don’t know which questions to ask the AI. Without AI, you can’t answer the questions at this scale. Without visual communication, you don’t reach the stakeholders. Every component is essential.

What does this deliver in practice?

For this client, our approach delivered:

  • Time savings: hours instead of weeks for a complete architecture analysis.
  • Completeness: every table, every field, every dependency mapped, no blind spots.
  • Accessibility: stakeholders who actually understand the infrastructure, not just the IT team.
  • Repeatability: the analysis can be re-run at any time when the environment changes.
  • Decision-making: a solid foundation for roadmap choices around modernization, rationalization, or migration.

In closing

The tools available to us today are changing how we look at data architecture. Not as something you document once and file away, but as a living insight you can continuously refresh and communicate.

For those still wondering whether AI has a place in these kinds of projects: we’ve put the proof on the table. In hours, not weeks.
