Memory management is the most underestimated engineering challenge in production AI agents. This live workshop teaches the complete memory management stack: working memory budgets, episodic memory compression, semantic memory retrieval, and the coordination patterns that keep memory consistent across a multi-agent system.
By Packt Publishing · Refunds available up to 10 days before the event
AI agent memory sounds simple: give the agent what it needs to remember. In production it is a complex engineering challenge: managing finite context window budgets across multiple simultaneous agents, compressing episodic memory without losing critical context, and synchronising memory reads and writes across a distributed agent system. This workshop solves all three.
Context engineering is the discipline of designing systems that give AI the right information, in the right format, to reason and act reliably. It goes beyond prompt engineering — building structured, deterministic systems that scale in production.
A multi-agent system uses multiple specialised AI agents working together — each with a defined role, context, and tools — to complete complex tasks no single agent could handle reliably. Context engineering makes them predictable.
MCP is Anthropic's open standard for connecting AI models to tools, data sources, and other agents. It provides structured agent orchestration with clear context boundaries — making systems transparent and debuggable.
Context engineering requires hands-on practice to truly understand. This live workshop lets you build a working system with a world-class instructor answering your questions in real time.
Six modules. Six hours. A production-ready context-engineered AI system by the time you finish.
Understand why prompts fail at scale and how semantic blueprints give AI structured, goal-driven contextual awareness.
Design and orchestrate multi-agent workflows using the Model Context Protocol. Build transparent, traceable agent systems.
Build RAG pipelines that deliver accurate, cited responses. Engineer memory systems that persist context reliably across agents.
Architect a transparent, explainable context engine where every decision is traceable and debuggable in production.
Implement safeguards against prompt injection and data poisoning. Enforce trust boundaries in multi-agent environments.
Deploy your context-engineered system to production. Apply patterns for scaling, monitoring, and reliability.
Concrete working deliverables — not just theory and slides.
A working Glass-Box Context Engine with transparent, traceable reasoning
Multi-agent workflow orchestrated with the Model Context Protocol
High-fidelity RAG pipeline with memory and citations
Safeguards against prompt injection and data poisoning
Reusable architecture patterns for production AI systems
Certificate of completion from Packt Publishing
Denis Rothman brings decades of production AI engineering experience to this live workshop.
Denis Rothman is a bestselling AI author with over 30 years of experience in artificial intelligence, agent systems, and optimization. He has authored multiple cutting-edge AI books published by Packt and is renowned for making complex AI architecture concepts practical and immediately applicable. He guides you step by step through building production-ready context-engineered multi-agent systems — answering your questions live throughout the 6-hour session.
Intermediate to advanced workshop. Solid Python and basic LLM experience required.
Everything you need to know before registering.
Production AI agents need three memory types: working memory (the active context window for the current invocation, managed through semantic blueprint templates and context budgets), episodic memory (a compressed, retrievable record of past interactions, managed through a persistent store with importance scoring and TTL-based eviction), and semantic memory (the embedded knowledge base accessed through the RAG pipeline, managed through incremental indexing and retrieval confidence scoring). Each type requires different management techniques.
Working memory management for AI agents uses a context budget allocator that divides the available context window tokens among content categories: a fixed allocation for the semantic blueprint, a dynamic allocation for RAG-retrieved knowledge (sized based on query complexity), a capped allocation for conversation history (the most recent N turns), and a reserve for the agent's response. The allocator assembles the context package and truncates or compresses any category that exceeds its allocation before agent invocation.
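A context budget allocator along these lines might look like the sketch below. The split ratios and the naive word-count truncation are illustrative assumptions; a production allocator would use a real tokenizer and compression rather than hard truncation.

```python
def allocate_context(window_tokens: int, blueprint_tokens: int,
                     response_reserve: int, query_complexity: float) -> dict:
    """Divide the context window among content categories.

    query_complexity in [0, 1] grows the dynamic RAG share from 30% to 70%
    of the space left after the fixed blueprint and response reserve.
    """
    available = window_tokens - blueprint_tokens - response_reserve
    rag_tokens = int(available * (0.3 + 0.4 * query_complexity))
    history_tokens = available - rag_tokens        # capped conversation history
    return {"blueprint": blueprint_tokens, "rag": rag_tokens,
            "history": history_tokens, "response": response_reserve}

def fit_to_budget(text: str, budget_tokens: int) -> str:
    """Truncate a category that exceeds its allocation (naive word split)."""
    tokens = text.split()
    if len(tokens) <= budget_tokens:
        return text
    return " ".join(tokens[:budget_tokens])
```

For example, `allocate_context(8000, 1200, 1000, query_complexity=0.5)` leaves 5,800 tokens to share, giving RAG content half of it for a mid-complexity query.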
Episodic memory compression uses an importance-weighted summarisation approach: the memory manager scores each past interaction segment by recency, explicit importance flags (facts, decisions, user preferences), and relevance to the current session topic, then summarises the lower-importance segments while retaining the higher-importance segments at full detail. The compressed memories are stored with their importance scores, allowing the retrieval layer to surface the most relevant episodic context at any future invocation.
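The scoring-then-summarising step can be sketched as follows. The weight values, the 0.6 retention threshold, and the stand-in `summarise` function are assumptions for illustration; a real pipeline would summarise with an LLM call rather than truncation.

```python
def compress_episodes(episodes: list, keep_full_threshold: float = 0.6) -> list:
    """Score each segment by recency, importance flags, and relevance;
    keep high scorers at full detail, summarise the rest.

    Each episode dict carries: text, flagged (bool), relevance (0-1).
    """
    def summarise(text: str) -> str:
        # Stand-in for an LLM summarisation call.
        return text[:80] + "…" if len(text) > 80 else text

    compressed, n = [], len(episodes)
    for i, ep in enumerate(episodes):
        recency = (i + 1) / n                       # later segments score higher
        score = 0.5 * recency + 0.3 * ep["flagged"] + 0.2 * ep["relevance"]
        text = ep["text"] if score >= keep_full_threshold else summarise(ep["text"])
        compressed.append({"text": text, "score": round(score, 3)})
    return compressed
```

Storing the score alongside each compressed entry is what lets the retrieval layer later rank episodic context by importance rather than by recency alone.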
Memory consistency across multiple agents uses a shared episodic memory store with distributed locking for writes and read-through caching for reads. Each agent's memory writes go through a version controller that detects conflicts (two agents trying to update the same memory record simultaneously) and applies a merge strategy. The Glass-Box logging layer records all memory operations with agent identifiers, making memory consistency issues detectable and diagnosable.
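A minimal single-process sketch of version-checked writes with conflict merging and glass-box logging is shown below. The class name, the tuple-based record layout, and the string-concatenating default merge are all illustrative assumptions; a distributed deployment would use an external lock service rather than a local mutex.

```python
import threading

class SharedEpisodicStore:
    """Shared memory store with versioned, conflict-detecting writes."""

    def __init__(self):
        self._lock = threading.Lock()
        self._records = {}          # key -> (version, value)
        self.log = []               # glass-box trail: every op with agent id

    def read(self, agent_id: str, key: str):
        with self._lock:
            self.log.append(("read", agent_id, key))
            return self._records.get(key, (0, None))

    def write(self, agent_id: str, key: str, value, expected_version: int,
              merge=lambda old, new: f"{old} | {new}"):
        with self._lock:
            version, current = self._records.get(key, (0, None))
            if version != expected_version:
                # Another agent updated this record since our read: merge.
                value = merge(current, value)
            self._records[key] = (version + 1, value)
            self.log.append(("write", agent_id, key, version + 1))
            return version + 1
```

A write carrying a stale `expected_version` is detected by the version check and resolved by the merge strategy instead of silently overwriting the other agent's update, and the log makes the conflict visible after the fact.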
Memory eviction policy determines when episodic memory entries are removed to stay within storage limits. The workshop covers three eviction strategies: TTL-based eviction (entries expire after a configured time period), LRU eviction (least recently accessed entries are evicted first), and importance-based eviction (entries below an importance threshold are evicted first, preserving high-importance memories regardless of age). The right strategy depends on your use case: TTL for session-specific contexts, importance-based for long-running agent relationships.
Testing AI agent memory management covers: unit tests for each memory operation (store, retrieve, compress, evict) with known inputs and expected outputs, integration tests that verify memory consistency across multiple agent invocations in a complete workflow, long-conversation tests that verify important context is retained through multiple compression cycles, and concurrent access tests that verify the locking mechanism prevents memory corruption under parallel agent load.
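Two of these test categories, the operation round-trip and concurrent access, can be sketched with plain asserts. The tiny `MemoryStore` below exists only so the tests are self-contained; it stands in for whatever store your system actually uses.

```python
import threading

class MemoryStore:
    """Minimal thread-safe store used only to illustrate the test shapes."""
    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def store(self, key, value):
        with self._lock:
            self._data[key] = value

    def retrieve(self, key):
        with self._lock:
            return self._data.get(key)

def test_store_retrieve_roundtrip():
    # Unit test: known input, expected output.
    s = MemoryStore()
    s.store("pref", "dark mode")
    assert s.retrieve("pref") == "dark mode"
    assert s.retrieve("missing") is None

def test_concurrent_writes_do_not_corrupt():
    # Concurrent access test: parallel writers must not corrupt the store.
    s = MemoryStore()
    def writer(i):
        for _ in range(100):
            s.store(f"k{i}", i)
    threads = [threading.Thread(target=writer, args=(i,)) for i in range(8)]
    for t in threads: t.start()
    for t in threads: t.join()
    assert all(s.retrieve(f"k{i}") == i for i in range(8))
```

Integration and long-conversation tests follow the same pattern at a larger scope: drive a full multi-agent workflow or many compression cycles, then assert that the flagged facts are still retrievable at the end.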
6 hours. Bestselling AI author. Production context-engineered multi-agent system by the end. Seats are limited.
Register Now → Saturday April 25 · 9am to 3pm EDT · Online · Packt Publishing · Cohort 2