Build RAG System With Memory for Agents · April 25

Build a RAG System With Memory Designed for Multi-Agent AI

A RAG system for multi-agent AI needs more than a vector store. This live workshop builds a complete memory-augmented RAG system: working memory, episodic memory, and semantic retrieval working together to give agents persistent, reliable knowledge access across long conversations.

Saturday, April 25 · 9am – 3pm EDT
6 Hours · Hands-on coding
Cohort 2 · Intermediate to Advanced

Workshop Details

Date & Time: Saturday, April 25, 2026 · 9:00am – 3:00pm EDT
Duration: 6 Hours · Hands-on
Format: Live Online · Interactive
Level: Intermediate to Advanced
Includes: Certificate of Completion
Register on Eventbrite →

By Packt Publishing · Refunds up to 10 days before

Why Trust Packt

Over 20 Years of Helping Developers Build Real Skills

7,500+ · Books and video courses published for developers worldwide
108 · Live workshops and events hosted on Eventbrite
30+ · Years of AI experience from your instructor Denis Rothman
100% · Hands-on: every session involves real code and live building
About This Workshop

Why Multi-Agent AI Needs a RAG System With Memory, Not Just Search

Basic RAG retrieves documents for each query independently. A memory-augmented RAG system for agents maintains continuity: it knows what was retrieved before, what was important in past interactions, and what the agent needs to remember for future queries. This workshop builds all three memory layers: working, episodic, and semantic.

What is Context Engineering?

Context engineering is the discipline of designing systems that give AI the right information, in the right format, to reason and act reliably. It goes beyond prompt engineering by building structured, deterministic systems that scale in production.

What is a Multi-Agent System?

A multi-agent system uses multiple specialised AI agents working together — each with a defined role, context, and tools — to complete complex tasks no single agent could handle reliably. Context engineering makes them predictable.

What is the Model Context Protocol?

The Model Context Protocol (MCP) is an open standard, introduced by Anthropic, for connecting AI models to tools and data sources. This workshop uses it to structure agent orchestration with clear context boundaries, which makes systems transparent and debuggable.

Why Attend as a Live Workshop?

Context engineering is best understood through hands-on practice. This live workshop lets you build a working system with a world-class instructor answering your questions in real time.

Workshop Curriculum

What This 6-Hour Workshop Covers

Six modules. Six hours. A production-ready context-engineered AI system by the time you finish.

01 · From Prompts to Semantic Blueprints

Understand why prompts fail at scale and how semantic blueprints give AI structured, goal-driven contextual awareness.

02 · Multi-Agent Orchestration With MCP

Design and orchestrate multi-agent workflows using the Model Context Protocol. Build transparent, traceable agent systems.

03 · High-Fidelity RAG With Citations

Build RAG pipelines that deliver accurate, cited responses. Engineer memory systems that persist context reliably across agents.

04 · The Glass-Box Context Engine

Architect a transparent, explainable context engine where every decision is traceable and debuggable in production.

05 · Safeguards and Trust

Implement safeguards against prompt injection and data poisoning. Enforce trust boundaries in multi-agent environments.

06 · Production Deployment and Scaling

Deploy your context-engineered system to production. Apply patterns for scaling, monitoring, and reliability.

What You Walk Away With

By the End of This Workshop You Will Have

Concrete working deliverables — not just theory and slides.

A working Glass-Box Context Engine with transparent, traceable reasoning

Multi-agent workflow orchestrated with the Model Context Protocol

High-fidelity RAG pipeline with memory and citations

Safeguards against prompt injection and data poisoning

Reusable architecture patterns for production AI systems

Certificate of completion from Packt Publishing

Your Instructor

Learn From a Bestselling AI Author With 30+ Years of Experience

Denis Rothman brings decades of production AI engineering experience to this live workshop.

Denis Rothman

Workshop Instructor · April 25, 2026

Denis Rothman is a bestselling AI author with over 30 years of experience in artificial intelligence, agent systems, and optimization. He has authored multiple cutting-edge AI books published by Packt and is renowned for making complex AI architecture concepts practical and immediately applicable. He guides you step by step through building production-ready context-engineered multi-agent systems — answering your questions live throughout the 6-hour session.

Prerequisites

Who Is This Workshop For?

Intermediate to advanced workshop. Solid Python and basic LLM experience required.

Frequently Asked Questions

Common Questions About Building a RAG System With Memory for AI Agents

Everything you need to know before registering.

What are the three memory layers in the RAG system built in this workshop?

The system built in this workshop has three memory layers: working memory, which holds the current context window contents, including recent retrievals; episodic memory, which stores compressed summaries of past interactions and their RAG retrievals for future reference; and semantic memory, which is the embedded knowledge base queried by the RAG pipeline. The memory manager coordinates these three layers so agents have the right knowledge at every point in a conversation.
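
As a rough illustration of how the three layers might sit behind one manager, here is a minimal Python sketch. It is not the workshop's implementation: the class name ThreeLayerMemory, its methods, the string-join "summary", and the keyword match standing in for embedding search are all assumptions made for this example.

```python
from collections import deque


class ThreeLayerMemory:
    """Hypothetical memory manager; names and behaviour are illustrative only."""

    def __init__(self, working_capacity=4):
        # Working memory: bounded buffer standing in for the context window.
        self.working = deque(maxlen=working_capacity)
        # Episodic memory: compressed summaries of past interactions.
        self.episodic = []
        # Semantic memory: stand-in for the embedded knowledge base (doc_id -> text).
        self.semantic = {}

    def remember(self, item):
        """Add an item to working memory; the oldest item drops out when full."""
        self.working.append(item)

    def end_interaction(self):
        """Compress working memory into one episodic summary, then clear it."""
        # Assumption: a real system would summarise with an LLM, not a join.
        summary = " | ".join(self.working)
        self.episodic.append(summary)
        self.working.clear()
        return summary

    def context_for(self, query):
        """Gather what each layer can contribute for a query."""
        words = set(query.lower().split())
        # Assumption: keyword overlap replaces a real vector similarity search.
        hits = [doc_id for doc_id, text in self.semantic.items()
                if words & set(text.lower().split())]
        return {
            "working": list(self.working),
            "episodic": list(self.episodic),
            "semantic": hits,
        }
```

Keeping the working layer as a bounded deque makes the context budget explicit: anything that must survive beyond that budget has to be written to the episodic layer before it is evicted.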

How does episodic memory improve RAG retrieval for multi-agent systems?

Episodic memory improves RAG retrieval by storing the context of past queries alongside their results. When a new query arrives that is related to past interactions, the memory manager can use episodic memory to reformulate the retrieval query with additional context, retrieve episodic summaries alongside new knowledge base results, and avoid redundant retrievals for information that was recently accessed. This creates continuity that plain RAG cannot provide.
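
The sketch below shows one way the three behaviours just described could fit together: query reformulation, returning episodic context alongside new results, and skipping redundant retrievals. EpisodicRetriever, search_fn, and the word-overlap test for "related" queries are hypothetical choices for illustration; a real system would likely compare embeddings and bound how long a past retrieval counts as recent.

```python
class EpisodicRetriever:
    """Hypothetical wrapper that adds episodic context to a search function."""

    def __init__(self, search_fn):
        self.search_fn = search_fn   # assumed callable: query string -> list of results
        self.episodes = []           # (query, results) pairs from past interactions

    def _related(self, query):
        # Assumption: shared words mark a past query as related to the new one.
        words = set(query.lower().split())
        return [ep for ep in self.episodes if words & set(ep[0].lower().split())]

    def retrieve(self, query):
        # 1. Avoid a redundant retrieval when the same query was already answered.
        for past_query, results in self.episodes:
            if past_query == query:
                return {"query_used": query, "results": results,
                        "related": [], "from_episode": True}

        # 2. Reformulate the query with context from related past queries.
        related = self._related(query)
        expanded = " ".join([query] + [past_query for past_query, _ in related])

        # 3. Run the search and record the episode for future queries.
        results = self.search_fn(expanded)
        self.episodes.append((query, results))

        # 4. Return past results alongside the new knowledge base results.
        return {"query_used": expanded, "results": results,
                "related": [past_results for _, past_results in related],
                "from_episode": False}
```

A short usage example, with a fake search function in place of a vector store:

```python
retriever = EpisodicRetriever(lambda q: ["doc:" + q])
retriever.retrieve("pricing tiers")
followup = retriever.retrieve("enterprise pricing")
print(followup["query_used"])
```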

How do I implement memory-augmented RAG in Python?

Memory-augmented RAG in Python involves four components: a vector store client for semantic memory retrieval, an episodic memory store (typically a lightweight database or structured file store) for past interaction summaries, a memory manager that coordinates reads and writes across both stores, and a retrieval orchestrator that combines results from both stores with appropriate citation tracking. The workshop implements all four in Python during the live session.
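
To make the four components concrete, here is a compact sketch that uses only the Python standard library. Everything in it is an assumption for illustration: the bag-of-words embed function stands in for a real embedding model, an in-memory SQLite table plays the lightweight episodic database, and the class and function names are invented for this example rather than taken from the workshop.

```python
import math
import sqlite3
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class VectorStore:
    """Component 1: semantic memory."""

    def __init__(self):
        self.docs = {}

    def add(self, doc_id, text):
        self.docs[doc_id] = (text, embed(text))

    def search(self, query, k=2):
        q = embed(query)
        scored = sorted(((cosine(q, vec), doc_id, text)
                         for doc_id, (text, vec) in self.docs.items()), reverse=True)
        return [(doc_id, text) for score, doc_id, text in scored[:k] if score > 0]


class EpisodicStore:
    """Component 2: past interaction summaries in a lightweight database."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE episodes (id INTEGER PRIMARY KEY, summary TEXT)")

    def add(self, summary):
        cur = self.db.execute("INSERT INTO episodes (summary) VALUES (?)", (summary,))
        self.db.commit()
        return cur.lastrowid

    def search(self, query):
        words = set(query.lower().split())
        rows = self.db.execute("SELECT id, summary FROM episodes ORDER BY id").fetchall()
        return [(f"episode:{i}", s) for i, s in rows if words & set(s.lower().split())]


class MemoryManager:
    """Component 3: coordinates reads and writes across both stores."""

    def __init__(self, vectors, episodes):
        self.vectors, self.episodes = vectors, episodes

    def read(self, query):
        return self.vectors.search(query), self.episodes.search(query)

    def write_episode(self, summary):
        return self.episodes.add(summary)


def orchestrate(manager, query):
    """Component 4: merge results from both stores and track citations."""
    semantic_hits, episodic_hits = manager.read(query)
    passages = semantic_hits + episodic_hits
    # Number each passage so the generated answer can cite it as [n].
    context = "\n".join(f"[{i}] {text}" for i, (_, text) in enumerate(passages, 1))
    citations = {i: source for i, (source, _) in enumerate(passages, 1)}
    return context, citations
```

Numbering passages in the prompt and keeping a parallel map from number to source is one simple way to let an answer cite where each claim came from, whether that is a knowledge base document or an earlier episode.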

How does the memory system decide what to store in episodic memory?

The memory system uses an importance scoring function to decide what to store in episodic memory. Items that score highly include facts that were cited multiple times in a conversation, user preferences and constraints that were explicitly stated, task outcomes and decisions that may be relevant in future sessions, and any information the agent flagged as important to remember. The workshop covers implementing this importance scoring as a lightweight classifier.
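
A rule-based scorer is one simple way to approximate that decision before training any classifier. The sketch below is an assumption-heavy illustration: the Candidate fields, the integer weights, and the threshold are invented for this example and would need tuning against real conversations.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A piece of information being considered for episodic memory."""
    text: str
    citation_count: int = 0          # times the fact was cited in the conversation
    is_user_constraint: bool = False  # explicitly stated preference or constraint
    is_task_outcome: bool = False     # decision or result that may matter later
    agent_flagged: bool = False       # the agent asked to remember this


# Hypothetical integer weights and threshold, chosen only for illustration.
WEIGHTS = {"cited_multiple": 3, "user_constraint": 4, "task_outcome": 3, "agent_flagged": 5}
THRESHOLD = 3


def importance(candidate):
    """Sum the weights of every signal the candidate exhibits."""
    score = 0
    if candidate.citation_count >= 2:
        score += WEIGHTS["cited_multiple"]
    if candidate.is_user_constraint:
        score += WEIGHTS["user_constraint"]
    if candidate.is_task_outcome:
        score += WEIGHTS["task_outcome"]
    if candidate.agent_flagged:
        score += WEIGHTS["agent_flagged"]
    return score


def should_store(candidate):
    """Store the candidate only when its score reaches the threshold."""
    return importance(candidate) >= THRESHOLD
```

With these example weights, any single signal is enough to store an item, while small talk that was cited once and never flagged is dropped. Swapping importance for a trained model keeps the same should_store interface.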

Can the RAG memory system be shared across multiple agents safely?

Yes. The workshop covers implementing shared episodic memory with appropriate access controls: read access is open to all agents in the system, while write access uses distributed locking to prevent concurrent write conflicts. The Glass-Box logging layer records all memory reads and writes so you can audit which agent accessed which information and detect any memory consistency issues.
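
The access pattern can be sketched in a single process as follows. This is an illustration under stated assumptions, not the workshop's code: threading.Lock stands in for a distributed lock (a multi-process deployment would need a lock service), and SharedEpisodicMemory and its audit log format are hypothetical names for this example.

```python
import threading
import time


class SharedEpisodicMemory:
    """Hypothetical shared store: open reads, locked writes, audited access."""

    def __init__(self):
        self._entries = {}
        # Assumption: an in-process lock stands in for a distributed lock.
        self._write_lock = threading.Lock()
        self._log_lock = threading.Lock()
        self.audit_log = []

    def _audit(self, agent, op, key):
        # Record who touched which key, so access can be reviewed later.
        with self._log_lock:
            self.audit_log.append(
                {"ts": time.time(), "agent": agent, "op": op, "key": key})

    def read(self, agent, key):
        """Reads are open to every agent but are still logged."""
        self._audit(agent, "read", key)
        return self._entries.get(key)

    def append(self, agent, key, item):
        """Read-modify-write under the lock so concurrent writers do not clash."""
        with self._write_lock:
            self._entries[key] = self._entries.get(key, []) + [item]
            self._audit(agent, "write", key)
```

A short usage example with several agents writing concurrently:

```python
memory = SharedEpisodicMemory()

def agent_task(name):
    for i in range(50):
        memory.append(name, "notes", f"{name}-{i}")

threads = [threading.Thread(target=agent_task, args=(f"agent{n}",)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(memory.read("auditor", "notes")))
```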

How do I clear or update stale information in the RAG memory system?

Stale memory management involves two mechanisms: time-to-live settings on episodic memory entries that automatically expire information after a configured period, and explicit invalidation when the source documents in semantic memory are updated. The workshop covers implementing a memory maintenance job that runs periodically to remove expired episodic memories and re-index changed semantic memory content.
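
Both mechanisms, plus the periodic maintenance job, can be sketched in a few lines. The MemoryStore class, its field names, and the token-list "index" are assumptions for illustration only; a real job would re-embed changed documents and would typically run on a scheduler.

```python
import time


class MemoryStore:
    """Hypothetical store with TTL expiry and source-based invalidation."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable clock, so expiry can be tested
        self.episodes = {}      # key -> {"summary", "created", "sources"}
        self.documents = {}     # doc_id -> current text (semantic memory source)
        self.index = {}         # doc_id -> tokens (toy stand-in for embeddings)
        self.dirty = set()      # doc_ids changed since the last re-index

    def add_episode(self, key, summary, sources=()):
        self.episodes[key] = {"summary": summary, "created": self.clock(),
                              "sources": set(sources)}

    def update_document(self, doc_id, text):
        """Explicit invalidation: drop episodes derived from the changed document."""
        self.documents[doc_id] = text
        self.dirty.add(doc_id)
        stale = [k for k, e in self.episodes.items() if doc_id in e["sources"]]
        for key in stale:
            del self.episodes[key]
        return stale

    def run_maintenance(self):
        """Periodic job: expire old episodes and re-index changed documents."""
        now = self.clock()
        expired = [k for k, e in self.episodes.items()
                   if now - e["created"] >= self.ttl]
        for key in expired:
            del self.episodes[key]
        reindexed = sorted(self.dirty)
        for doc_id in reindexed:
            self.index[doc_id] = self.documents[doc_id].lower().split()
        self.dirty.clear()
        return {"expired": expired, "reindexed": reindexed}
```

Recording which source documents each episodic summary was derived from is what makes explicit invalidation possible; without that link, only the time-to-live can catch stale summaries.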

Context Engineering for Multi-Agent Systems · Cohort 2 · April 25, 2026

Ready to Build Production AI With Context Engineering?

6 hours. Bestselling AI author. Production context-engineered multi-agent system by the end. Seats are limited.

Register Now →