Architecture¶

The framework concentrates custom engineering on the three things no framework ships off the shelf — self-learning memory, a dynamic swarm composer, and a versioned, MCP-compatible tool registry — and leans on LangGraph for durable execution, checkpointing, and HITL interrupts.

Execution graph¶

flowchart TD
    START([START]) --> GI[guard_input: block injection / redact PII]
    GI -->|blocked| EN([END])
    GI -->|ok| RC[recall: inject past lessons]
    RC --> OR{orchestrator: cost-aware composer}
    OR -->|single| WK[worker: on-demand tools]
    OR -->|swarm| SW[swarm_worker: dependency waves + blackboard]
    WK -->|side-effecting tool| HA[human_approval: interrupt]
    WK -->|more subtasks| WK
    WK -->|done| FZ[finalize]
    HA --> WK
    SW --> FZ
    FZ --> RF[reflect: distill lesson + episodic]
    RF --> GO[guard_output: redact PII]
    GO --> EN

Every node is optional and additive: with no memory/guardrails/composer configured the graph collapses to the Stage-1 spine (orchestrator → worker → finalize). recall/reflect appear with memory, guard_input/guard_output with guardrails, and swarm_worker when the composer chooses a swarm.

Layers — every one is a swappable seam¶

Layer	Implementation	Later-stage seam
Model gateway	`LiteLLMGateway` (API-first, OpenAI-compatible) + `DemoGateway`	local vLLM endpoint
Agent core	thin `Agent` over the gateway	typed agent core
Orchestration	LangGraph orchestrator-worker graph + `SqliteSaver`	richer graphs
Memory	`JsonFileMemory` (persistent) + `LLMReflector`; BM25+RRF recall, distilled lessons	Letta/Mem0 + pgvector at scale
Swarm composer	`HeuristicSwarmComposer` — cost-aware single-vs-swarm gate + parallel execution	LLM-driven team formation
Tool registry	`StaticToolRegistry` — versioned, on-demand BM25 retrieval	MCP interop adapter
HITL	LangGraph `interrupt()` approval gate	escalation queues
Guardrails	`GuardrailPipeline` — block prompt-injection, redact PII (input + output)	LlamaFirewall / LLM Guard / NeMo
Multi-tenancy	tenant-isolated memory namespaces + per-tenant `CostTracker` dashboard	per-tenant rate limits / quotas
Observability	Langfuse via OTEL + own graph spans	eval/regression gates
Durability	LangGraph `SqliteSaver` checkpointer	Temporal for multi-day workflows

Repository layout¶

Riptide-Watergraph/
├── pyproject.toml               # setuptools build, src layout
└── src/riptide_watergraph/
    ├── interfaces/              # ABCs = the swappable seams (incl. Reflector)
    ├── gateway/                 # LiteLLMGateway + DemoGateway (offline)
    ├── memory/                  # JsonFileMemory, ranking, reflection, types
    ├── tools/                   # StaticToolRegistry (versioned, on-demand) + tools
    ├── swarm/                   # HeuristicSwarmComposer + cost model
    ├── guardrails/              # PII redaction, injection blocking, pipeline
    ├── mcp/                     # MCP tool interop (client, adapter, stdio)
    ├── graph/                   # state, nodes (recall/reflect/swarm/guard), builder
    ├── observability/           # OTEL + Langfuse tracing + per-tenant CostTracker
    ├── server/                  # FastAPI app + the dependency-free Studio (static/)
    ├── evaluation/              # offline task suite + scoring runner
    ├── config.py                # pydantic-settings
    └── cli.py                   # riptide run | costs | eval | serve

The retrieval core (BM25 lexical scoring + Reciprocal Rank Fusion, k=60) lives in memory/ranking.py behind a small, stable signature — if it ever shows up as a hot path it can be swapped for a native implementation without touching the framework.