Architecture

The framework concentrates custom engineering on the three things no framework ships off the shelf — self-learning memory, a dynamic swarm composer, and a versioned, MCP-compatible tool registry — and leans on LangGraph for durable execution, checkpointing, and HITL interrupts.

Execution graph

flowchart TD
    START([START]) --> GI[guard_input: block injection / redact PII]
    GI -->|blocked| EN([END])
    GI -->|ok| RC[recall: inject past lessons]
    RC --> OR{orchestrator: cost-aware composer}
    OR -->|single| WK[worker: on-demand tools]
    OR -->|swarm| SW[swarm_worker: dependency waves + blackboard]
    WK -->|side-effecting tool| HA[human_approval: interrupt]
    WK -->|more subtasks| WK
    WK -->|done| FZ[finalize]
    HA --> WK
    SW --> FZ
    FZ --> RF[reflect: distill lesson + episodic]
    RF --> GO[guard_output: redact PII]
    GO --> EN

Every node outside the Stage-1 spine is optional and additive: with no memory, guardrails, or composer configured, the graph collapses to that spine (orchestrator → worker → finalize). recall and reflect appear when memory is configured, guard_input and guard_output when guardrails are configured, and swarm_worker runs when the composer chooses a swarm.
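
The additive rule above can be sketched as a small planning function. This is a minimal illustration, not the framework's builder: `GraphConfig` and `plan_nodes` are hypothetical names, the three boolean switches are an assumption about how configuration is expressed, and `human_approval` is left out because the text does not say when it is wired in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphConfig:
    """Hypothetical feature switches; the real config fields may differ."""
    memory: bool = False
    guardrails: bool = False
    composer: bool = False


def plan_nodes(cfg: GraphConfig) -> list[str]:
    """Return node names in execution order for a given configuration."""
    nodes: list[str] = []
    if cfg.guardrails:
        nodes.append("guard_input")
    if cfg.memory:
        nodes.append("recall")
    # The Stage-1 spine is always present.
    nodes.append("orchestrator")
    nodes.append("worker")
    if cfg.composer:
        # Registered with the composer, reached only when it picks a swarm.
        nodes.append("swarm_worker")
    nodes.append("finalize")
    if cfg.memory:
        nodes.append("reflect")
    if cfg.guardrails:
        nodes.append("guard_output")
    return nodes
```

With the default `GraphConfig()` the plan is just the spine; turning on all three switches yields the eight nodes shown in the flowchart, minus `human_approval`.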

Layers — every one is a swappable seam

| Layer | Implementation | Later-stage seam |
| --- | --- | --- |
| Model gateway | LiteLLMGateway (API-first, OpenAI-compatible) + DemoGateway | local vLLM endpoint |
| Agent core | thin Agent over the gateway | typed agent core |
| Orchestration | LangGraph orchestrator-worker graph + SqliteSaver | richer graphs |
| Memory | JsonFileMemory (persistent) + LLMReflector; BM25+RRF recall, distilled lessons | Letta/Mem0 + pgvector at scale |
| Swarm composer | HeuristicSwarmComposer: cost-aware single-vs-swarm gate + parallel execution | LLM-driven team formation |
| Tool registry | StaticToolRegistry: versioned, on-demand BM25 retrieval | MCP interop adapter |
| HITL | LangGraph interrupt() approval gate | escalation queues |
| Guardrails | GuardrailPipeline: block prompt-injection, redact PII (input + output) | LlamaFirewall / LLM Guard / NeMo |
| Multi-tenancy | tenant-isolated memory namespaces + per-tenant CostTracker dashboard | per-tenant rate limits / quotas |
| Observability | Langfuse via OTEL + own graph spans | eval/regression gates |
| Durability | LangGraph SqliteSaver checkpointer | Temporal for multi-day workflows |
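
To make the "swappable seam" idea concrete, here is a minimal sketch of one seam, the swarm composer, as an abstract base class with a heuristic implementation behind it. Everything in it is an assumption for illustration: the `SwarmComposer` interface, the `Subtask` fields, the `choose` signature, and the cost model (token estimates plus a fixed per-agent overhead) are hypothetical and not taken from the framework's actual `interfaces/` or `swarm/` code.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtask:
    name: str
    est_tokens: int                    # rough work estimate (assumed unit)
    depends_on: tuple[str, ...] = ()   # names of prerequisite subtasks


class SwarmComposer(ABC):
    """Hypothetical seam: decides between one worker and a swarm."""

    @abstractmethod
    def choose(self, subtasks: list[Subtask]) -> str:
        """Return "single" or "swarm"."""


class SimpleHeuristicComposer(SwarmComposer):
    """Illustrative gate: pick a swarm only when parallelism outweighs overhead."""

    def __init__(self, overhead_tokens: int = 500, min_parallel: int = 2) -> None:
        self.overhead_tokens = overhead_tokens  # assumed per-agent coordination cost
        self.min_parallel = min_parallel

    def choose(self, subtasks: list[Subtask]) -> str:
        # Only subtasks with no prerequisites can start in parallel.
        independent = [t for t in subtasks if not t.depends_on]
        if len(independent) < self.min_parallel:
            return "single"
        # Work that overlaps when the independent subtasks run side by side.
        saved = sum(t.est_tokens for t in independent) - max(
            t.est_tokens for t in independent
        )
        # Extra cost of spinning up one agent per independent subtask.
        extra = self.overhead_tokens * len(independent)
        return "swarm" if saved > extra else "single"
```

Because callers would depend only on the abstract `SwarmComposer`, an LLM-driven composer could later replace the heuristic one without changes to the graph code, which is the property the table's third column relies on.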

Repository layout

Riptide-Watergraph/
├── pyproject.toml               # setuptools build, src layout
└── src/riptide_watergraph/
    ├── interfaces/              # ABCs = the swappable seams (incl. Reflector)
    ├── gateway/                 # LiteLLMGateway + DemoGateway (offline)
    ├── memory/                  # JsonFileMemory, ranking, reflection, types
    ├── tools/                   # StaticToolRegistry (versioned, on-demand) + tools
    ├── swarm/                   # HeuristicSwarmComposer + cost model
    ├── guardrails/              # PII redaction, injection blocking, pipeline
    ├── mcp/                     # MCP tool interop (client, adapter, stdio)
    ├── graph/                   # state, nodes (recall/reflect/swarm/guard), builder
    ├── observability/           # OTEL + Langfuse tracing + per-tenant CostTracker
    ├── server/                  # FastAPI app + the dependency-free Studio (static/)
    ├── evaluation/              # offline task suite + scoring runner
    ├── config.py                # pydantic-settings
    └── cli.py                   # riptide run | costs | eval | serve

The retrieval core (BM25 lexical scoring + Reciprocal Rank Fusion, k=60) lives in memory/ranking.py behind a small, stable signature — if it ever shows up as a hot path it can be swapped for a native implementation without touching the framework.
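
For readers unfamiliar with the two techniques named above, the following is a minimal, dependency-free sketch of BM25 scoring and Reciprocal Rank Fusion with k=60. It is not the contents of memory/ranking.py: the function names, signatures, and the BM25 parameters `k1=1.5` and `b=0.75` are assumptions, and which rankings the framework actually fuses is not stated here.

```python
import math
from collections import Counter


def bm25_scores(
    query: list[str],
    docs: list[list[str]],
    k1: float = 1.5,   # assumed term-frequency saturation parameter
    b: float = 0.75,   # assumed length-normalization parameter
) -> list[float]:
    """Score each tokenized document against the tokenized query."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n if n else 0.0
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores: list[float] = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF variant that is never negative.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def rrf(rankings: list[list[str]], k: int = 60) -> dict[str, float]:
    """Fuse ranked lists of ids: each list adds 1 / (k + rank), ranks from 1."""
    fused: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return fused
```

A typical use would be to rank the same memory items under two or more scorers (for example, BM25 over lesson text and BM25 over tags, both hypothetical fields), pass the resulting id lists to `rrf`, and sort by fused score. Because RRF uses only ranks, the scorers do not need comparable score scales.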