DeepAgents — Multi-Agent Orchestration Research

Contributing to DeepAgents, a research framework exploring how multiple LLM-powered agents can collaborate on complex tasks through hierarchical planning and shared memory.

Contributions

Memory module — implemented persistent vector memory for cross-session agent context
Tool registry — built a dynamic tool discovery and registration system
Evaluation harness — added benchmarks for multi-agent task completion on SWE-bench
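
As an illustration of the tool registry idea, here is a minimal sketch of decorator-based registration with keyword discovery. It is not the DeepAgents implementation: the names ToolSpec, ToolRegistry, register, discover, and call are hypothetical, and only the Python standard library is assumed.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ToolSpec:
    """Metadata an agent could inspect before choosing a tool (hypothetical shape)."""
    name: str
    description: str
    parameters: Tuple[str, ...]
    func: Callable[..., Any]


class ToolRegistry:
    """Hypothetical registry: tools are added at runtime and looked up by keyword."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolSpec] = {}

    def register(self, name: Optional[str] = None) -> Callable:
        """Return a decorator that records the function under `name` (or its own name)."""
        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            tool_name = name or func.__name__
            if tool_name in self._tools:
                raise ValueError(f"tool already registered: {tool_name}")
            self._tools[tool_name] = ToolSpec(
                name=tool_name,
                description=inspect.getdoc(func) or "",
                parameters=tuple(inspect.signature(func).parameters),
                func=func,
            )
            return func
        return decorator

    def discover(self, keyword: str) -> List[str]:
        """Return sorted names of tools whose name or docstring mentions the keyword."""
        kw = keyword.lower()
        return sorted(
            spec.name
            for spec in self._tools.values()
            if kw in spec.name.lower() or kw in spec.description.lower()
        )

    def call(self, name: str, **kwargs: Any) -> Any:
        """Invoke a registered tool by name; raises KeyError if it is unknown."""
        return self._tools[name].func(**kwargs)


registry = ToolRegistry()


@registry.register()
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b
```

With this sketch, an agent could call registry.discover("integers") to find candidate tools and then registry.call("add", a=2, b=3) to run one. Taking descriptions from docstrings keeps the tool metadata next to the code it describes.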

Research Questions

How do agents decompose ambiguous tasks into sub-plans?
When should agents delegate vs. execute directly?
What memory architectures minimize hallucination in long-horizon tasks?

Learnings

The biggest insight: agent reliability scales better with structured state machines (such as graphs built with LangGraph) than with pure prompt-driven autonomy. Explicit control flow, with LLM reasoning at decision nodes, beats end-to-end agent prompting.
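
To make the contrast concrete, here is a minimal sketch of explicit control flow with a model consulted only at branch points. It uses plain Python rather than the LangGraph API. The node names, the TRANSITIONS table, run, and stub_decide are all hypothetical, and stub_decide stands in for a real LLM call.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]

# Allowed transitions live in code, so the model can never invent a new path.
TRANSITIONS: Dict[str, List[str]] = {
    "plan": ["execute", "delegate"],  # decision node: the model picks one
    "execute": ["done"],
    "delegate": ["done"],
}


def run(
    decide: Callable[[str, List[str], State], str],
    state: State,
    start: str = "plan",
    max_steps: int = 10,
) -> List[str]:
    """Walk the graph from `start` to "done" and return the visited nodes."""
    node = start
    path = [node]
    for _ in range(max_steps):
        if node == "done":
            return path
        options = TRANSITIONS[node]
        # Only ask the model when there is a genuine choice to make.
        choice = options[0] if len(options) == 1 else decide(node, options, state)
        if choice not in options:
            # Reject answers outside the graph and fall back to the first option.
            choice = options[0]
        node = choice
        path.append(node)
    raise RuntimeError("step budget exhausted before reaching 'done'")


def stub_decide(node: str, options: List[str], state: State) -> str:
    """Stand-in for an LLM call: delegate when there is more than one subtask."""
    return "delegate" if state.get("subtasks", 0) > 1 else "execute"
```

Calling run(stub_decide, {"subtasks": 3}) visits plan, delegate, done, while a decision function that returns an unknown node is silently routed to the first allowed option. The transition table and the step budget bound what the agent can do regardless of what the model outputs.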