Is a semantic layer necessary to build kick-ass AI agents, or does it just introduce more complexity? What’s the connection between a semantic layer and LLM-powered agents, anyway? Let’s dive in and find out.
In This Post
- What's a semantic layer? What's an AI agent?
- The anatomy of a semantic layer
- Why semantic layers are integral to AI agents
- Challenges with semantic layers
- Case study: AI agent leveraging a semantic layer for commercial pharmaceutical success
- Alternatives to structured semantic layers for AI agents
- The future of semantic layers
What's a semantic layer? What's an AI agent?
First things first, let’s get our definitions straight.
When we talk about AI agents in the enterprise, we’re not talking about those chatbots or copilots that can barely tell the difference between “Hello” and “Help, the server’s on fire!” We’re talking about sophisticated software entities that can perceive, reason, plan, leverage tools, and act autonomously to solve complex business problems.
And when we say semantic layers, we’re talking about the bridges between raw data and actual understanding. More on that next.
The anatomy of a semantic layer
Semantic layers are like Rosetta Stones for your AI agents—bridging raw data and understanding. Here’s what’s under the hood of a typical semantic layer:
- Data models/relationships map entities and concepts into a coherent, unified model of diverse data sources so AI agents can reason across multiple domains and tackle complex multistep workflows.
- Knowledge graphs are massive, interconnected webs of facts, critical to understanding how one thing impacts another (i.e., multidimensional cause-and-effect).
- Business definitions (metrics, dimensions, ontologies) are the periodic table of your AI agent’s universe. They define the concepts, relationships, and rules that your AI agent uses to make sense of the world.
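To make those three ingredients concrete, here is a minimal sketch of how they might be held in code. Everything in it is an assumption for illustration: the class names, the table names such as crm.accounts, and the total_revenue metric are hypothetical and do not come from any particular semantic layer product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    name: str            # canonical business concept, e.g. "customer"
    source_table: str    # hypothetical physical table backing the entity


@dataclass(frozen=True)
class Relationship:
    subject: str         # entity name the edge starts from
    predicate: str       # e.g. "places"
    obj: str             # entity name the edge points to


@dataclass(frozen=True)
class Metric:
    name: str
    entity: str          # entity the metric is computed over
    expression: str      # SQL-like aggregate, kept as plain text here


@dataclass
class SemanticModel:
    entities: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)
    metrics: dict = field(default_factory=dict)

    def add_entity(self, entity):
        self.entities[entity.name] = entity

    def relate(self, subject, predicate, obj):
        self.relationships.append(Relationship(subject, predicate, obj))

    def add_metric(self, metric):
        self.metrics[metric.name] = metric

    def neighbors(self, entity_name):
        """Entities directly reachable from entity_name via outgoing edges."""
        return sorted({r.obj for r in self.relationships if r.subject == entity_name})


model = SemanticModel()
model.add_entity(Entity("customer", "crm.accounts"))
model.add_entity(Entity("order", "sales.orders"))
model.add_entity(Entity("product", "catalog.items"))
model.relate("customer", "places", "order")
model.relate("order", "contains", "product")
model.add_metric(Metric("total_revenue", "order", "SUM(amount)"))
```

The data model lives in the entities, the knowledge graph in the relationship edges, and the business definitions in the metrics. A real semantic layer would add far more (dimensions, ontologies, access rules), but the shape is similar.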
Architecturally, the interaction between semantic layers and AI agents looks like this:
- APIs: Semantic layers expose APIs that AI agents call to query data, retrieve metadata, and execute operations.
- Query language: Many semantic layers implement a specific query language that AI agents use to interact with the data. This could be SQL-like, or it could be a custom, domain-specific language. The agent then formulates queries in this language to request information or perform operations.
- Service layer: A service layer typically sits between the semantic layer and AI agents. This layer handles request routing, authentication, caching, federated queries, and event-driven updates that AI agents can subscribe to.
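As a rough illustration of the query-language idea, the sketch below shows an agent asking for a metric by name and a tiny compiler turning that request into SQL. The metric and dimension catalogs, the table names, and the SemanticQuery type are all hypothetical; real semantic layers define their own query languages and APIs.

```python
from dataclasses import dataclass

# Hypothetical catalog: metric and dimension names mapped to SQL fragments.
METRICS = {"total_revenue": "SUM(o.amount)", "order_count": "COUNT(*)"}
DIMENSIONS = {"region": "c.region", "order_month": "o.order_month"}
FROM_CLAUSE = "sales.orders AS o JOIN crm.accounts AS c ON o.customer_id = c.id"


@dataclass(frozen=True)
class SemanticQuery:
    metrics: tuple
    dimensions: tuple = ()


def compile_to_sql(query):
    """Translate a semantic query into SQL, rejecting names the layer does not define."""
    unknown = [m for m in query.metrics if m not in METRICS]
    unknown += [d for d in query.dimensions if d not in DIMENSIONS]
    if unknown:
        raise KeyError(f"not defined in the semantic layer: {unknown}")
    select = [f"{DIMENSIONS[d]} AS {d}" for d in query.dimensions]
    select += [f"{METRICS[m]} AS {m}" for m in query.metrics]
    sql = f"SELECT {', '.join(select)} FROM {FROM_CLAUSE}"
    if query.dimensions:
        sql += " GROUP BY " + ", ".join(DIMENSIONS[d] for d in query.dimensions)
    return sql


sql = compile_to_sql(SemanticQuery(metrics=("total_revenue",), dimensions=("region",)))
```

The point of the design is that the agent never writes joins or aggregates itself. It only names concepts, and a request for something the layer does not define fails loudly instead of producing a plausible-looking guess.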
Why semantic layers are integral to AI agents
Now that we’ve peeked under the hood and explored their mechanics, let’s talk about why you might want your AI agents calling upon a semantic layer.
TL;DR: They play a pivotal role by providing the essential context, consistency, and business logic that AI agents need to make intelligent, autonomous decisions; manage complex workflows; and dynamically adapt in real time.
Semantic layers help AI agents:
- Make sense of complex data. Semantic layers abstract away the underlying complexity of data sources and structures, allowing AI systems to focus on high-level reasoning rather than low-level data manipulation. Imagine you have hundreds of data sources spanning on-prem and cloud environments, and within these datasets you have 17 different flavors of “customer” (e.g., “client,” “member,” “shopper”). How is the AI agent to know which one to select? A semantic layer helps your AI agent make sense of disparate data sources so it doesn’t jump to the wrong conclusion.
- Dramatically improve accuracy & consistency. To minimize the risk of hallucinated answers and actions, AI agents need to know the specific business domain, industry-standard terminology, KPIs, and data intricacies of the areas they work in. Semantic layers excel at supplying this context. By encoding domain knowledge and business logic, they ensure that AI agents work with consistent definitions and metrics across data sources, preventing the confusion and errors that inconsistent data representations cause. Just to put some real-world numbers around accuracy:
- A key 2023 study showed a 54% accuracy boost for SQL queries when LLM-powered question-answering systems leveraged knowledge graphs
- dbt claims an 83% accuracy rate for natural language questions answered via AI in the dbt Semantic Layer
- Make smarter decisions and continuously learn. Well-designed semantic layers include rules and axioms that support logical inference, enabling agents to derive new insights and connections that aren’t explicitly stated in the data. They can also represent multi-dimensional relationships—essential for understanding complex real-world scenarios—which allows AI agents to analyze problems from multiple perspectives. Finally, as the semantic layer is updated to incorporate new knowledge and relationships, AI agents can learn and evolve, creating a positive feedback loop where agent insights enhance the semantic layer—which, in turn, improves agent performance.
- Enhance explainability, auditability, and governance. In the age of GDPR and AI ethics committees, being able to explain why your AI agent made a particular decision is not just nice-to-have—it’s a must-have. With a semantic layer, you can trace the reasoning process step by step.
- Increase interoperability. Your AI agents interact with various software systems like SaaS platforms, business planning tools, analytics platforms, web analytics services, and of course, other agents. A common language and shared understanding are critical here: they let an agent identify the appropriate data sources for a user inquiry and execute tasks against them, for example by constructing suitable SQL queries.
- Improve scalability and optimization. As more data and relationships are added to the semantic layer, the agent’s ability to handle complex reasoning tasks scales accordingly, allowing for continuous improvement and adaptation. On the flip side, in a world without a semantic layer, increasing the number of data sources and systems can strain an AI agent’s ability to quickly parse through, plan, and act on stuff. Semantic layers are also helpful for optimizing complex queries, allowing agents to efficiently retrieve and process relevant information for multi-faceted reasoning tasks.
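The “17 flavors of customer” problem and the explainability point can be sketched together. Below, a hypothetical synonym table maps source-specific labels to one canonical entity, and every resolution step is appended to a trace so the reasoning can be audited later. The table contents and the resolve function are assumptions for illustration only.

```python
# Hypothetical synonym table: source-specific labels mapped to one canonical entity.
SYNONYMS = {
    "client": "customer",
    "member": "customer",
    "shopper": "customer",
    "customer": "customer",
}


def resolve(term, trace):
    """Map a raw term to its canonical entity and record the step for auditing."""
    key = term.strip().lower()
    canonical = SYNONYMS.get(key)
    if canonical is None:
        # Unknown terms are surfaced rather than silently matched to something close.
        trace.append(f"'{term}': no canonical entity, escalate instead of guessing")
        return None
    trace.append(f"'{term}' resolved to '{canonical}'")
    return canonical


trace = []
resolved = [resolve(t, trace) for t in ("Client", "shopper", "patron")]
```

Here “Client” and “shopper” both land on the same canonical customer entity, while the unmapped “patron” is flagged in the trace instead of being guessed at.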
Challenges with semantic layers
Semantic layers aren’t without their challenges. The big three are:
- Scalability/performance issues. Remember when you thought having a comprehensive ontology was a good idea? Welcome to scalability hell. Large-scale knowledge bases can become unwieldy, degrading query performance as the ontology grows. Balancing reasoning depth with response time is critical, as is handling concurrent requests without bringing the system to its knees.
- Ontology evolution. Your ontology was perfect… until the business decided to pivot. Updating and maintaining consistency becomes a Herculean task.
- Integration with legacy systems. Having a state-of-the-art AI agent is great! Now make it talk to that mission-critical system written in COBOL back when disco was cool. Bridging the gap between modern AI and legacy data formats is a real struggle, as is dealing with inconsistent or poorly documented legacy systems.
Fortunately, solutions abound: distributed knowledge bases, intelligent caching mechanisms, ontology modularization design patterns, automated consistency checking and repairs (think: semantics quality agents!), ontology versioning and change management systems, and robust API layers and data transformation workflows that interoperate better with legacy systems.
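Automated consistency checking is the easiest of these to picture. The sketch below assumes a deliberately tiny ontology representation (a set of entity names, relationship triples, and a metric-to-entity mapping) and flags anything that still points at an entity that no longer exists after a rename. The function name and data are hypothetical.

```python
def check_consistency(entities, relationships, metrics):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    for subject, predicate, obj in relationships:
        for name in (subject, obj):
            if name not in entities:
                problems.append(
                    f"relationship '{predicate}' references unknown entity '{name}'"
                )
    for metric, entity in metrics.items():
        if entity not in entities:
            problems.append(f"metric '{metric}' references unknown entity '{entity}'")
    return problems


# Version 2 of a hypothetical ontology renamed "customer" to "account"
# but left one relationship and one metric pointing at the old name.
entities_v2 = {"account", "order"}
relationships_v2 = [("customer", "places", "order")]
metrics_v2 = {"active_customers": "customer", "order_count": "order"}

problems = check_consistency(entities_v2, relationships_v2, metrics_v2)
```

Run as a gate before an ontology version is published, a check like this catches the dangling references that a business pivot tends to leave behind.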
Case study: AI agent leveraging a semantic layer for commercial pharmaceutical success
Let’s look at a real-world example. PharmaCorp, a leading global pharmaceutical company, implemented a semantic layer and connected its commercial effectiveness AI agent to it. This enhancement allowed the AI analyst agent, dubbed “CommInsight,” to optimize engagement with healthcare providers (HCPs) for the company’s new diabetes medication, GlucoBalance.
The semantic layer integrated data and nuanced knowledge about:
- Historical HCP interaction logs
- Prescription data
- Regional healthcare system information
- HCP specialization and patient demographics
The results?
- 42% increase in successful first-time prescriptions from targeted HCP engagements via personalized HCP outreach
- 28% reduction in time spent by sales reps on non-productive HCP visits
- 35% increase in HCP-reported satisfaction with the relevance of information provided during engagements
The semantic layer enabled CommInsight to understand the complex context of HCP engagement, leading to recommendations for sales reps to be more effective and efficient in their interactions, ultimately driving better prescription rates for GlucoBalance.
Not too shabby, right?
Alternatives to structured semantic layers for AI agents
Now let’s play devil’s advocate and take the stance that structured semantic layers are just another piece of complexity when building AI agents. Is there an alternative?
Standalone RAG
One idea is to just use plain old RAG (retrieval-augmented generation) alone versus combining it with a semantic layer for AI agents. But you lose out on a lot with this approach:
- Contextual understanding. RAG alone retrieves relevant information based on similarity, but it typically lacks deeper contextual understanding, whereas a semantic layer provides structured relationships and domain-specific knowledge, enabling a more nuanced interpretation of retrieved information.
- Consistency and standardization. AI agents calling upon RAG alone may retrieve inconsistent or conflicting information from different sources, whereas semantic layers ensure consistent definitions, metrics, and business logic across all data sources.
- Complex reasoning. RAG is great for fact retrieval and simple inferences, but without a semantic layer, you’ll have trouble pulling off complex reasoning that leverages predefined relationships and business rules.
- Performance optimization. RAG requires searching through large amounts of unstructured data, whereas a semantic layer can optimize queries and retrieval based on predefined data models.
- Real-time updates. Where absorbing new information would otherwise mean retraining or fine-tuning, semantic layers can integrate updates more easily, without full model retraining.
So, while RAG is powerful, combining it with a semantic layer can provide AI agents with a more comprehensive, consistent, and contextually rich understanding of the data and business environment.
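A toy version of that combination might look like the following. The retriever ranks documents by shared words, which is only a stand-in for real vector similarity search, and the metric definitions, documents, and function names are all hypothetical. What matters is the last step: retrieved snippets are paired with the canonical definition of any metric the question mentions.

```python
# Hypothetical metric definitions owned by the semantic layer.
DEFINITIONS = {
    "churn rate": "customers lost in period / customers at start of period",
    "net revenue": "gross revenue - refunds - discounts",
}

DOCUMENTS = [
    "Q3 memo: churn rate rose after the pricing change",
    "Onboarding guide for new sales reps",
    "Finance note: net revenue excludes refunds",
]


def retrieve(question, documents, k=2):
    """Toy retriever: rank by shared lowercase words (stand-in for vector search)."""
    q_words = set(question.lower().split())
    scored = [(len(q_words & set(d.lower().split())), d) for d in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:k] if score > 0]


def build_context(question):
    """Combine retrieved snippets with canonical definitions of mentioned metrics."""
    snippets = retrieve(question, DOCUMENTS)
    definitions = {
        term: text for term, text in DEFINITIONS.items() if term in question.lower()
    }
    return {"snippets": snippets, "definitions": definitions}


context = build_context("Why did churn rate rise in Q3")
```

With the definition attached, the agent reads the retrieved memo against one agreed meaning of “churn rate” rather than whichever meaning each source happened to use.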
Learning-based approaches to inferring semantic relationships
The idea here is to use large language models or neural networks to automatically extract and infer semantic relationships from vast amounts of raw data to uncover hidden patterns and connections without explicit programming. Basically it’s the architectural equivalent of saying “Ontologies? We don’t need no stinking ontologies!”
The benefit of feeding a system enough data to figure out the semantic connections on its own is that it can handle a wide range of tasks without explicit programming while continually improving with more data. On the flip side, however, explainability is a nightmare with this black-box approach. It also requires massive amounts of data and computational resources to pull off (but hey, the price of model-tuning is dropping every day!). Finally, there’s a potential for semantic hallucinations.
On this last point, imagine the following scenario in a healthcare setting:
- A learning-based semantic layer mistakenly infers a strong correlation between a benign genetic marker and a high risk of heart disease. This false relationship becomes part of the semantic layer’s knowledge base.
- An AI agent relying on this knowledge recommends aggressive preventive treatments for patients with this benign marker, such as invasive procedures and strong medications.
- Now we have unnecessary medical procedures, increased healthcare costs and insurance premiums, potential side effects, etc.
Yikes!
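One way to blunt this failure mode, if you do experiment with learned relationships, is to keep them out of the knowledge base until they clear a gate. The sketch below is an assumption-heavy illustration, not an established method: the 0.9 threshold, the high_stakes flag, and the example relations are all made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferredRelation:
    subject: str
    predicate: str
    obj: str
    confidence: float      # model-reported score in [0, 1]
    high_stakes: bool      # e.g. anything that can drive a treatment decision


def triage(relations, threshold=0.9):
    """Split inferred relations into auto-accepted, needs-review, and rejected."""
    accepted, review, rejected = [], [], []
    for rel in relations:
        if rel.confidence < threshold:
            rejected.append(rel)
        elif rel.high_stakes:
            review.append(rel)      # never auto-publish, however confident
        else:
            accepted.append(rel)
    return accepted, review, rejected


candidates = [
    InferredRelation("marker_x", "increases_risk_of", "heart_disease", 0.97, True),
    InferredRelation("clinic", "located_in", "region", 0.95, False),
    InferredRelation("drug_a", "interacts_with", "drug_b", 0.40, True),
]
accepted, review, rejected = triage(candidates)
```

Under these assumptions, the genetic-marker relation from the scenario above would sit in a human review queue, however confident the model was, instead of flowing straight into an agent’s recommendations.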
Let’s stick with a structured semantic layer approach to our AI agent system design, then, for now. Let’s also now turn our attention to those AI agents.
AI agent autonomy spectrum
AI agent autonomy and automation levels exist on a spectrum. A useful framework to judge AI agents is the five levels of agentic automation below:
(Moran, J. (2024). [The five levels of agentic automation]. Sema4. https://sema4.ai/blog/the-five-levels-of-agentic-automation/)
This spectrum is important to consider because your AI agents will interact with your semantic layer differently, and need different things from it, at each level.
Level 0: Fixed Automation
▪ Minimal/no agent<>semantic interaction
▪ Agents use predefined data paths & simple queries
Level 1: AI Augmented Automation
▪ Agent<>Semantic interaction is basic querying & data retrieval
▪ Agent needs consistent definitions & simple metric calcs
▪ Semantic layer should have standardized way to access & interpret basic data
Level 2: Agentic Assistant
▪ Agent<>Semantic interaction involves more complex querying & basic inference
▪ Agents use contextual information about data relationships, business rules, & logic embedded in semantic layer
▪ Semantic layer acts as a knowledge base for understanding business contexts
Level 3: Plan & Reflect
▪ Agent<>Semantic interaction involves advanced querying, data exploration, & pattern recognition
▪ Agents use historical data, trends, cross-domain data relationships, & metadata about data freshness/reliability embedded in the semantic layer
▪ The semantic layer becomes crucial for adaptive planning, allowing the agent to understand data in a broader context for more informed decision-making
Level 4: Self-Refinement
▪ Agent<>Semantic interaction involves dynamic querying, feedback loops, & learning
▪ Agents have access to usage patterns and performance metrics of queries as well as the ability to suggest updates to the semantic layer itself
▪ The semantic layer needs to be more flexible to accommodate learned improvements, as agents might contribute to evolving the semantic layer over time
Level 5: Autonomy
▪ Agent<>Semantic interaction involves full utilization of the semantic layer’s capabilities
▪ Agents use the deep understanding of business domains encoded in the semantic layer and access to external data sources for context enrichment in order to create new metrics or data relationships
▪ The line between the semantic layer and the agent might blur, as agents dynamically reshape the semantic layer based on evolving needs. This raises questions about governance and control of the semantic layer.
Some general implications across all levels:
- As we move up the levels, the semantic layer needs to handle increasingly complex relationships and contexts.
- Higher levels require more sophisticated access controls and data governance within the semantic layer.
- There’s a growing tension between the need for a stable, consistent semantic layer and the desire for dynamic, AI-driven updates.
- The semantic layer becomes crucial for providing explainable AI, especially at higher levels where decision-making becomes more complex.
- As the autonomy of agents increases, the quality and consistency of data in the semantic layer become even more critical.
- Higher-level agents may require the semantic layer to integrate more seamlessly with external systems and data sources.
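The access-control implication can be sketched as a simple gate that maps each autonomy level to the semantic layer operations it may perform. The operation names and the level each one requires are hypothetical choices made for this illustration, loosely following the level descriptions above.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    FIXED_AUTOMATION = 0
    AI_AUGMENTED = 1
    AGENTIC_ASSISTANT = 2
    PLAN_AND_REFLECT = 3
    SELF_REFINEMENT = 4
    AUTONOMY = 5


# Minimum level required for each (hypothetical) semantic-layer operation.
REQUIRED_LEVEL = {
    "run_predefined_query": AutonomyLevel.FIXED_AUTOMATION,
    "query_metrics": AutonomyLevel.AI_AUGMENTED,
    "read_business_rules": AutonomyLevel.AGENTIC_ASSISTANT,
    "read_freshness_metadata": AutonomyLevel.PLAN_AND_REFLECT,
    "propose_change": AutonomyLevel.SELF_REFINEMENT,
    "create_metric": AutonomyLevel.AUTONOMY,
}


def is_allowed(level, operation):
    """True if an agent at `level` may perform `operation`; unknown ops are denied."""
    required = REQUIRED_LEVEL.get(operation)
    return required is not None and level >= required


def allowed_operations(level):
    """List every operation available to an agent at the given level."""
    return [op for op, required in REQUIRED_LEVEL.items() if level >= required]
```

For example, is_allowed(AutonomyLevel.AGENTIC_ASSISTANT, "propose_change") is False, while a Level 4 agent may propose changes but still cannot create metrics on its own. Keeping the write path behind an explicit gate is one way to manage the tension between a stable semantic layer and AI-driven updates.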
The future of semantic layers
As AI agents become more prevalent, semantic layers will continue to evolve as their trusty sidekicks.
Here is a little crystal-ball gazing of what the future of semantic layers might look like:
- Structured and unstructured data. Gone will be the days when semantic layers were picky eaters, only digesting neatly structured data. Future semantic layers will be omnivorous, spanning structured, semi-structured, and unstructured data, resulting in a Rosetta Stone for your entire data ecosystem.
- Richer. Future semantic layers will beef up, becoming more robust and feature-packed than ever. We’re talking not just understanding relationships between data points, but grasping context, intent, and even the subtle nuances of human communication, inferring meaning across multiple domains.
- Superglue layer. Imagine a future where, instead of stand-alone CRM software requiring sales reps to log in and provide updates, you have AI agents mixing and matching data from various enterprise apps like Gong, Google Calendar, DocuSign, etc., to create a real-time, 360-degree view of your customer and prospect relationships. The semantic layer is the glue and the overarching system of record holding it together.
- Baked in. Currently the semantic layer sits outside the LLM, acting as a bridge between raw data and the AI agent, but the future might see a hybrid approach—with key semantic understanding baked right into the LLM itself—while more dynamic, real-time semantic processing happens externally.
- Adaptive semantic layers. Future semantic layers may become more adaptive, automatically adjusting their structure and relationships based on new data and usage patterns. This could lead to more flexible and resilient data architectures that evolve with an organization’s needs.
The future of semantic layers is bright. Especially in the rapidly evolving world of enterprise AI, semantic layers are emerging as the unsung heroes: versatile, powerful, and absolutely crucial to gaining a competitive edge.