From experiment runner → product surface.
A pragmatic sequence: stabilize the engine, ship a clean UI for iteration, then expand capabilities (reflection, better evaluation, and multimodal toolchains).
Near-term
Foundation and repeatability.
• CLI as installable package
• YAML schema validation
• Experiment viewer (top entries + scores)
• Archive logging defaults (.jsonl)
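As a rough illustration of the last two near-term items, the sketch below appends experiment entries to a `.jsonl` archive and reads back the top-scoring ones. The function names (`append_entry`, `top_entries`) and the `id` and `score` fields are hypothetical, chosen for this example only; the roadmap does not specify the archive format beyond the `.jsonl` extension.

```python
import json
from pathlib import Path


def append_entry(archive_path, entry):
    """Append one experiment entry as a single JSON line (hypothetical helper)."""
    path = Path(archive_path)
    # Create the archive directory on first use so logging works by default.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")


def top_entries(archive_path, n=3):
    """Return the n highest-scoring entries; assumes each entry has a 'score' field."""
    with Path(archive_path).open(encoding="utf-8") as f:
        # Skip blank lines so a trailing newline does not break parsing.
        entries = [json.loads(line) for line in f if line.strip()]
    return sorted(entries, key=lambda e: e["score"], reverse=True)[:n]
```

One JSON object per line keeps writes append-only, so a viewer can read the archive while an experiment is still running.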
Mid-term
Better iteration loops.
• Reflection viewer + search/filter/export
• Plugin registry (pre/post hooks)
• Self-nomination profiles per agent
• Promotion workflows into Codex
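One way a plugin registry with pre/post hooks could look is sketched below. This is an assumption about the shape of the feature, not the project's actual design: `PluginRegistry`, `register`, and `run` are hypothetical names, and hooks are assumed to be plain functions that take a dict and return a dict.

```python
from collections import defaultdict


class PluginRegistry:
    """Hypothetical registry: 'pre' hooks transform the input, 'post' hooks the result."""

    STAGES = ("pre", "post")

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, stage):
        """Decorator that registers a function under the 'pre' or 'post' stage."""
        if stage not in self.STAGES:
            raise ValueError(f"unknown stage: {stage!r}")

        def decorator(fn):
            self._hooks[stage].append(fn)
            return fn

        return decorator

    def run(self, step, payload):
        """Apply pre hooks in registration order, run the step, then apply post hooks."""
        for hook in self._hooks["pre"]:
            payload = hook(payload)
        result = step(payload)
        for hook in self._hooks["post"]:
            result = hook(result)
        return result
```

A plugin would then decorate a function with `@registry.register("pre")` or `@registry.register("post")`, and the runner would wrap each experiment step in `registry.run(step, payload)`.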
Product surface
GUI-driven workflow.
• Drag-and-drop experiment builder
• Two-way YAML ↔ Canvas sync
• Monitor tab with “Add to Archive”
• Codex viewer with archetypes + entries
Longer-term
Multimodal + world interaction.
• Vision & video toolchains
• Outcome-constrained learning loops
• Domain-specific agent swarms (e.g., sports, research)
• Stronger governance over memory
How to use this roadmap
Treat each milestone as a test: “Can a new user define an experiment, run it, understand the result, and promote a learned pattern into the Codex?”