Memory and RAG are the knowledge backbone of multi-agent AI. When they work together correctly, agents have reliable access to both past interactions and domain knowledge with full citation tracking. This live workshop engineers the memory and RAG integration that makes multi-agent systems genuinely knowledgeable.
By Packt Publishing · Refunds up to 10 days before
Episodic memory stores the history of past interactions for future retrieval. Semantic memory (the RAG knowledge base) stores domain knowledge for citation-grounded responses. In a multi-agent system, both must be accessible to multiple agents simultaneously, with appropriate access controls and consistent citation tracking across agent boundaries. This workshop engineers both systems and their integration.
Context engineering is the discipline of designing systems that give AI the right information, in the right format, to reason and act reliably. It goes beyond prompt engineering: the goal is to build structured, deterministic systems that scale in production.
A multi-agent system uses multiple specialised AI agents working together — each with a defined role, context, and tools — to complete complex tasks no single agent could handle reliably. Context engineering makes them predictable.
The Model Context Protocol (MCP) is an open standard, introduced by Anthropic, for connecting AI models to tools, data sources, and other agents. It provides structured agent orchestration with clear context boundaries, which makes systems transparent and debuggable.
Context engineering requires hands-on practice to truly understand. This live workshop lets you build a working system with a world-class instructor answering your questions in real time.
Six modules. Six hours. A production-ready context-engineered AI system by the time you finish.
Understand why prompts fail at scale and how semantic blueprints give AI structured, goal-driven contextual awareness.
Design and orchestrate multi-agent workflows using the Model Context Protocol. Build transparent, traceable agent systems.
Build RAG pipelines that deliver accurate, cited responses. Engineer memory systems that persist context reliably across agents.
Architect a transparent, explainable context engine where every decision is traceable and debuggable in production.
Implement safeguards against prompt injection and data poisoning. Enforce trust boundaries in multi-agent environments.
Deploy your context-engineered system to production. Apply patterns for scaling, monitoring, and reliability.
Concrete working deliverables — not just theory and slides.
A working Glass-Box Context Engine with transparent, traceable reasoning
Multi-agent workflow orchestrated with the Model Context Protocol
High-fidelity RAG pipeline with memory and citations
Safeguards against prompt injection and data poisoning
Reusable architecture patterns for production AI systems
Certificate of completion from Packt Publishing
Denis Rothman brings decades of production AI engineering experience to this live workshop.
Denis Rothman is a bestselling AI author with over 30 years of experience in artificial intelligence, agent systems, and optimization. He has authored multiple cutting-edge AI books published by Packt and is renowned for making complex AI architecture concepts practical and immediately applicable. He guides you step by step through building production-ready context-engineered multi-agent systems — answering your questions live throughout the 6-hour session.
Intermediate to advanced workshop. Solid Python and basic LLM experience required.
Everything you need to know before registering.
Episodic memory provides continuity: it remembers what happened in past interactions, what decisions were made, and what the user's preferences and context are. RAG provides knowledge: it retrieves relevant domain information for the current query from the embedded knowledge base. In a multi-agent system, the context router combines both: episodic memory provides the conversation and user context, RAG provides the domain knowledge, and together they give each agent a complete, grounded context for accurate and consistent responses.
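As a rough illustration of the router idea above, the sketch below merges a user's episodic history with retrieved passages into one context object. Everything in it is an assumption made for illustration: the names `AgentContext` and `route_context` are hypothetical, the stores are plain Python containers, and word overlap stands in for a real embedding search.

```python
from dataclasses import dataclass, field


@dataclass
class AgentContext:
    """What the router hands to an agent: user history plus cited knowledge."""
    query: str
    history: list = field(default_factory=list)
    passages: list = field(default_factory=list)


def overlap(query: str, text: str) -> int:
    """Toy relevance score: number of shared lowercase words (not an embedding search)."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def route_context(query, user_id, episodic_store, knowledge_base, top_k=2):
    # Episodic memory: past interactions recorded for this user.
    history = episodic_store.get(user_id, [])
    # RAG stand-in: rank documents by word overlap and keep the best matches.
    ranked = sorted(knowledge_base, key=lambda d: overlap(query, d["text"]), reverse=True)
    passages = [d for d in ranked[:top_k] if overlap(query, d["text"]) > 0]
    return AgentContext(query=query, history=history, passages=passages)


# Hypothetical data for the example.
episodic_store = {"user-1": ["User prefers concise answers."]}
knowledge_base = [
    {"id": "doc-1", "text": "Refund policy allows returns within 30 days"},
    {"id": "doc-2", "text": "Shipping takes five business days"},
]
ctx = route_context("what is the refund policy", "user-1", episodic_store, knowledge_base)
```

Here `ctx.history` carries the user's preference and `ctx.passages` carries only the refund document, so the agent receives both kinds of context in a single structure.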
Shared RAG access for multiple agents uses a centralized RAG service exposed as an MCP server: all agents invoke the RAG service through typed MCP tool calls rather than accessing the vector store directly. This centralized design handles concurrent access through connection pooling, implements a shared retrieval cache to avoid redundant embedding searches, and maintains consistent citation metadata across all agents that retrieve the same documents. The workshop covers implementing this centralized RAG service pattern.
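A minimal sketch of the centralized pattern might look like the following. It is not the workshop's implementation: `RagService` and `fake_search` are hypothetical names, the MCP transport is left out entirely, and the cache matches normalized query strings exactly rather than matching by semantic similarity.

```python
import threading


class RagService:
    """Single entry point to the vector store, shared by every agent."""

    def __init__(self, search_fn):
        self._search_fn = search_fn    # hypothetical stand-in for a vector store query
        self._cache = {}               # normalized query -> cached results
        self._lock = threading.Lock()  # guards the cache under concurrent calls
        self.backend_calls = 0

    def retrieve(self, agent_id: str, query: str) -> list:
        key = " ".join(query.lower().split())
        # For simplicity the lock is held during the search; a production
        # service would avoid serializing backend calls this way.
        with self._lock:
            if key not in self._cache:
                self.backend_calls += 1
                self._cache[key] = self._search_fn(key)
            results = self._cache[key]
        # Each agent receives a copy with identical citation metadata.
        return [dict(r, requested_by=agent_id) for r in results]


def fake_search(query):
    # Assumed result shape; a real store would return scored matches.
    return [{"doc_id": "kb-42", "text": "MCP tools are typed.", "citation": "kb-42#p1"}]


service = RagService(fake_search)
a = service.retrieve("planner", "How are MCP tools typed?")
b = service.retrieve("writer", "how are mcp tools   typed?")
```

Both agents get the same `citation` value, and `service.backend_calls` stays at 1 because the second query is served from the shared cache.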
When agent A retrieves a document through the RAG service and uses a fact in its output, the citation is attached to the output as structured metadata. When agent B receives agent A's output and passes the fact to the RAG service for verification or extension, the original citation travels with the fact. The RAG service's citation manager tracks this provenance chain, so the final output can trace every factual claim back to the original retrieved source regardless of how many agents processed it.
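One way to picture the provenance chain is a claim object whose citation never changes while the list of agents that handled it grows. The sketch below assumes hypothetical `Citation` and `Claim` types and helper functions; the source does not specify the citation manager's actual data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Citation:
    doc_id: str
    locator: str


@dataclass
class Claim:
    text: str
    citation: Citation
    handled_by: list = field(default_factory=list)


def cite(agent: str, text: str, citation: Citation) -> Claim:
    """An agent states a fact it retrieved and attaches the source."""
    return Claim(text=text, citation=citation, handled_by=[agent])


def pass_along(agent: str, claim: Claim, new_text: Optional[str] = None) -> Claim:
    """A downstream agent may rephrase the claim; the citation is carried over unchanged."""
    return Claim(
        text=new_text or claim.text,
        citation=claim.citation,
        handled_by=claim.handled_by + [agent],
    )


source = Citation("kb-7", "section 2.1")
c1 = cite("researcher", "Version checks prevent lost updates.", source)
c2 = pass_along("writer", c1, "Lost updates are prevented by version checks.")
c3 = pass_along("reviewer", c2)
```

After three agents, `c3.citation` is still the original source and `c3.handled_by` lists every agent in order, which is the trace the final output needs.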
Preventing memory and RAG bottlenecks takes three layers: connection pooling, which lets multiple agents query concurrently without blocking; semantic caching, which serves frequently retrieved content without repeating the vector store lookup; and asynchronous retrieval, which lets agents start on the parts of their task that do not depend on retrieved knowledge while RAG and memory retrieval run concurrently. The workshop covers implementing all three optimizations in the centralized RAG and memory services.
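The asynchronous layer can be sketched with Python's standard `asyncio` module. This is an illustration under assumptions: the function names are hypothetical, `asyncio.sleep` simulates I/O latency, a semaphore stands in for a real connection pool, and the caching layer is omitted.

```python
import asyncio


async def fetch(name: str, pool: asyncio.Semaphore, delay: float) -> str:
    # The semaphore caps in-flight requests, standing in for a connection pool.
    async with pool:
        await asyncio.sleep(delay)  # simulated I/O latency
        return f"{name}-result"


async def local_work() -> str:
    # Work that does not depend on retrieved knowledge, e.g. parsing the request.
    await asyncio.sleep(0.01)
    return "parsed-request"


async def run_agent() -> dict:
    pool = asyncio.Semaphore(4)
    # Start both retrievals, then do independent work while they are in flight.
    rag_task = asyncio.create_task(fetch("rag", pool, 0.05))
    memory_task = asyncio.create_task(fetch("memory", pool, 0.05))
    parsed = await local_work()
    rag, memory = await asyncio.gather(rag_task, memory_task)
    return {"parsed": parsed, "rag": rag, "memory": memory}


result = asyncio.run(run_agent())
```

Because the two retrievals are scheduled as tasks before `local_work` is awaited, the agent's independent work overlaps with both lookups instead of waiting for them.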
Shared episodic memory stays consistent through optimistic concurrency control: each memory record has a version number, every update includes the version the writer expects, and the memory store rejects any update whose expected version does not match the current one, which indicates a concurrent modification. The orchestrator handles a rejected update by re-reading the latest record, reapplying its change on top of that current state, and retrying with the latest version. Glass-Box logging records all memory operations and their version information for consistency auditing.
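The version check and retry loop can be sketched in a few lines. `MemoryStore`, `VersionConflict`, and `append_with_retry` are hypothetical names, the store is an in-memory dictionary, and the example runs single-threaded; a real store would need to make the compare-and-write step atomic.

```python
class VersionConflict(Exception):
    """Raised when a write carries a stale expected version."""


class MemoryStore:
    def __init__(self):
        self._records = {}  # key -> (version, value)
        self.log = []       # audit trail of every attempted write

    def read(self, key):
        return self._records.get(key, (0, []))

    def write(self, key, value, expected_version):
        current_version, _ = self.read(key)
        ok = expected_version == current_version
        # Log accepted and rejected writes alike, with both versions.
        self.log.append((key, expected_version, current_version, ok))
        if not ok:
            raise VersionConflict(key)
        self._records[key] = (current_version + 1, value)
        return current_version + 1


def append_with_retry(store, key, item, max_attempts=3):
    for _ in range(max_attempts):
        # Re-read the latest state, reapply the change on top of it, then retry.
        version, value = store.read(key)
        try:
            return store.write(key, value + [item], expected_version=version)
        except VersionConflict:
            continue
    raise VersionConflict(key)


store = MemoryStore()
stale_version, stale_value = store.read("user-1")       # agent A reads version 0
append_with_retry(store, "user-1", "B: prefers email")  # agent B writes first
try:
    store.write("user-1", stale_value + ["A: timezone EDT"], expected_version=stale_version)
    conflict = False
except VersionConflict:
    conflict = True                                     # agent A's stale write is rejected
final_version = append_with_retry(store, "user-1", "A: timezone EDT")
```

Agent A's stale write is rejected rather than silently overwriting agent B's update, and the retry lands both entries in the record. The `log` list plays the role of the audit trail, recording the expected and actual version of each attempt.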
A long-running multi-agent system's memory and RAG architecture must handle: growing episodic memory stores (managed through TTL eviction and importance-based compression), evolving knowledge bases (managed through incremental RAG indexing), shifting user context (managed through memory relevance decay that reduces the weight of old episodic memories over time), and long-running conversation state (managed through session summarisation that compresses multi-session histories into retrievable summaries). The workshop covers each of these long-term management considerations.
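Two of these mechanisms, TTL eviction and relevance decay, are simple enough to sketch directly. The exponential half-life formula, the field names, and the numbers below are assumptions chosen for illustration; the source does not specify a particular decay function.

```python
def decayed_weight(importance: float, age_seconds: float, half_life_seconds: float) -> float:
    """Halve a memory's weight for every half-life that has elapsed (assumed decay model)."""
    return importance * 0.5 ** (age_seconds / half_life_seconds)


def evict_expired(memories, now, ttl_seconds):
    """TTL eviction: drop memories older than the time-to-live."""
    return [m for m in memories if now - m["created_at"] <= ttl_seconds]


def rank_memories(memories, now, half_life_seconds):
    """Order memories by decayed weight, most relevant first."""
    return sorted(
        memories,
        key=lambda m: decayed_weight(m["importance"], now - m["created_at"], half_life_seconds),
        reverse=True,
    )


# Hypothetical records; timestamps are plain seconds.
now = 1000
memories = [
    {"id": "old", "created_at": 0, "importance": 1.0},
    {"id": "mid", "created_at": 800, "importance": 1.0},
    {"id": "new", "created_at": 990, "importance": 0.5},
]
kept = evict_expired(memories, now, ttl_seconds=500)
ranked_ids = [m["id"] for m in rank_memories(kept, now, half_life_seconds=100)]
```

With these values the oldest record is evicted, and the recent low-importance memory outranks the older high-importance one because two half-lives have reduced the latter's weight to a quarter.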
6 hours. Bestselling AI author. Production context-engineered multi-agent system by the end. Seats are limited.
Register Now → Saturday, April 25 · 9am to 3pm EDT · Online · Packt Publishing · Cohort 2