
Imagine asking questions of your data in plain English and getting fast, useful answers directly from your data warehouse, without writing a single line of SQL. That’s not a dream anymore. It’s AgentHouse, a real-world demonstration of how ClickHouse and Large Language Models (LLMs) come together through the Model Context Protocol (MCP) to make structured analytics conversational.
Let’s walk through what makes AgentHouse a powerful step forward for data professionals, developers, and AI enthusiasts alike.
The Idea Behind AgentHouse
Shortly after Anthropic released the Model Context Protocol (MCP)—a standardized way for LLMs to interact with external tools like databases—the ClickHouse team saw an opportunity. What if an LLM could generate SQL queries, interpret schemas, and provide summaries from actual business data… all within a secure and streamlined setup?
They already had an internal prototype, nicknamed Dwaine, that could answer real business queries like:
- “What’s our revenue for last month?”
- “Show me the most active users from the last 7 days.”
But since Dwaine was tied to proprietary data, the team decided to create AgentHouse, a public playground that replicates the same experience using safe, public datasets.
What Exactly Is AgentHouse?
AgentHouse is a conversational analytics demo where you interact with a fully managed ClickHouse backend through a chat interface powered by LibreChat. It brings together four components:
1. Anthropic Claude Sonnet (LLM)
A general-purpose large language model. It reads the schemas it is given, writes the SQL, and explains the results conversationally.
2. ClickHouse MCP Server
The layer that connects the LLM to the database. It exposes dataset and schema metadata to the model, runs the SQL the model writes against ClickHouse, and hands the results back for the model to summarize in natural language. It also enforces access controls and schema boundaries to ensure safety.
3. LibreChat UI
An open-source frontend that simulates a chat with a data expert. You can ask questions in natural language, preview the generated SQL, and view the results as tables or visualizations.
4. ClickHouse Cloud
The lightning-fast, scalable backend powering the analytics engine behind the scenes. No infrastructure setup needed.
Together, these components create a seamless system where LLMs query live databases intelligently, while still letting users peek under the hood.
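To make "access controls" concrete, here is a minimal sketch of one kind of guard a server in this position could apply before running model-generated SQL: accept only a single read-only statement. The function name and keyword list are hypothetical and are not taken from the ClickHouse MCP server. A production setup would more likely rely on database-level permissions (for example a read-only user) than on string checks alone.

```python
# Hypothetical guard: accept one read-only statement, reject everything else.
READ_ONLY_KEYWORDS = {"select", "show", "describe", "explain"}


def is_read_only(sql: str) -> bool:
    """Return True if `sql` looks like a single read-only statement."""
    # Drop surrounding whitespace and any trailing semicolons.
    statement = sql.strip().rstrip(";").strip()
    if not statement:
        return False
    # Conservative: any remaining semicolon is treated as a second statement,
    # even if it happens to sit inside a string literal.
    if ";" in statement:
        return False
    first_word = statement.split(None, 1)[0].lower()
    return first_word in READ_ONLY_KEYWORDS


if __name__ == "__main__":
    for candidate in ["SELECT 1", "DROP TABLE t", "SELECT 1; DROP TABLE t"]:
        print(candidate, "->", is_read_only(candidate))
```

The check is deliberately strict: it would rather reject a legitimate query than let a write through, which suits a public playground.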
What’s Under the Hood: Demo Datasets
AgentHouse gives access to 37 open datasets, letting users explore real-world analytics scenarios. Some of the popular ones include:
- GitHub activity logs
- PyPI and RubyGems download metrics
- Hacker News, Reddit, and Stack Overflow content archives
- NYC Taxi trip data
- Flight records from OpenSky
- IMDb movie listings
- UK real estate transactions
These datasets provide diverse schema structures and allow you to experiment with both simple and complex queries.
How AgentHouse Works (Step by Step)
Let’s break down what happens when you type a question like:
“Which GitHub repositories had the most stars last month?”
- Your query is received by LibreChat and passed to the Sonnet LLM.
- Sonnet interprets the question, fetches metadata from the MCP server (such as available datasets and schema details), and writes a well-formed SQL query.
- The ClickHouse MCP server runs the query against the correct dataset in ClickHouse Cloud.
- The results are returned to Sonnet, which formats them into a clear, conversational response.
- You see the output—complete with a table, chart (if applicable), and an explanation.
At every step, you can preview the SQL used behind the scenes, enabling transparency and learning for users who want to understand the logic.
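The flow above can be sketched end to end in a few lines. This is an illustration under loud assumptions, not AgentHouse's implementation: an in-memory SQLite database stands in for ClickHouse Cloud, the LLM is a stub that returns a fixed query, and every class, function, table, and value here (FakeMcpServer, fake_llm_write_sql, github_stars, the sample rows) is hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Answer:
    sql: str    # the generated query, kept so the user can preview it
    rows: list  # raw result rows
    text: str   # conversational summary


class FakeMcpServer:
    """Stand-in for the MCP server, backed by in-memory SQLite (not ClickHouse)."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE github_stars (repo_name TEXT, stars INTEGER)")
        self.db.executemany(
            "INSERT INTO github_stars VALUES (?, ?)",
            [("org/alpha", 120), ("org/beta", 340), ("org/gamma", 75)],  # made-up data
        )

    def list_tables(self) -> dict:
        """Return {table: [column, ...]} so the model can see the schema."""
        tables = [
            row[0]
            for row in self.db.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        ]
        return {
            table: [col[1] for col in self.db.execute(f"PRAGMA table_info({table})")]
            for table in tables
        }

    def run_query(self, sql: str) -> list:
        return self.db.execute(sql).fetchall()


def fake_llm_write_sql(question: str, schema: dict) -> str:
    """Stub for the LLM: a real model would derive this from question and schema."""
    return "SELECT repo_name, stars FROM github_stars ORDER BY stars DESC LIMIT 2"


def answer_question(question: str, server: FakeMcpServer) -> Answer:
    schema = server.list_tables()               # fetch metadata from the server
    sql = fake_llm_write_sql(question, schema)  # model writes the SQL
    rows = server.run_query(sql)                # server runs it against the data
    top = ", ".join(f"{name} ({stars})" for name, stars in rows)
    return Answer(sql=sql, rows=rows, text=f"Top repositories by stars: {top}")


if __name__ == "__main__":
    result = answer_question(
        "Which GitHub repositories had the most stars last month?", FakeMcpServer()
    )
    print(result.sql)
    print(result.text)
```

Returning the SQL alongside the summary mirrors the transparency point: the user sees the answer and the query that produced it.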
Why AgentHouse Matters
Here’s why this integration is a big deal:
✅ Natural Language Analytics
Non-technical users no longer need to struggle with SQL. Teams in marketing, finance, operations, or sales can get answers just by asking.
✅ Developer Playground
AgentHouse is a reference implementation that developers can use to understand how to build their own LLM-powered analytics agents using ClickHouse and MCP.
✅ Secure and Scalable
Queries go through the MCP server, which enforces access controls and keeps context available across a conversation. Both matter for real-world production use. All of this runs on ClickHouse Cloud’s infrastructure.
✅ Future of Agentic Systems
AgentHouse hints at the rise of autonomous agents that can not only query data but reason over it. As LLMs become more context-aware and multi-modal, the potential here is massive.
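For developers treating AgentHouse as a reference point, it helps to see what a tool invocation looks like on the wire. MCP messages are JSON-RPC 2.0, and a client invokes a server tool with the `tools/call` method. The sketch below only builds such a request. The tool name `run_select_query` and its `query` argument are assumptions for illustration, so check the tool list your MCP server actually advertises.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Hypothetical tool name and argument key; not confirmed by this article.
    print(build_tool_call(1, "run_select_query", {"query": "SELECT 1"}))
```

In practice an MCP client library handles this framing, so the function is meant for reading the protocol, not for replacing a client.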
Use Case Ideas
AgentHouse sets the foundation for many enterprise use cases. Imagine:
- Customer success teams exploring usage trends per customer with zero SQL skills.
- Product managers asking questions like “Which features had the highest engagement last week?”
- Operations analysts spotting anomalies in real-time system metrics via natural chat.
These experiences go beyond dashboards—they offer interactive data exploration.
What’s Next?
While AgentHouse is a demo, the potential is clear. The future may include:
- Multi-agent systems where different bots handle schema analysis, query optimization, and visualization.
- Live event monitoring with AI-generated alerts and explanations.
- Enterprise-level agent frameworks integrated into internal BI tools, CRMs, and more.
ClickHouse is also encouraging developers to experiment with CopilotKit, LangChain, and other orchestration tools alongside its MCP server to build intelligent, autonomous applications.
Final Thoughts
AgentHouse is more than a cool demo—it’s a real glimpse into how structured analytics can evolve with LLMs. It lowers the barrier for data access, enhances transparency, and showcases how AI can complement human decision-making without replacing the rigor of good data practices.
Whether you’re a data engineer, an analyst, or a curious product manager, AgentHouse gives you a front-row seat to the future of conversational analytics.
Explore the full demo and details here:
https://clickhouse.com/blog/agenthouse-demo-clickhouse-llm-mcp