MiroFish - AI Swarm Intelligence Engine for Predicting the Future

What if you could simulate thousands of AI agents with unique personalities, memories, and behaviors to predict future outcomes? MiroFish makes this possible - it’s an open-source multi-agent prediction engine that creates high-fidelity digital worlds for scenario simulation.

What is MiroFish?

MiroFish is a next-generation AI prediction engine powered by multi-agent technology. By extracting seed information from the real world (such as breaking news, policy drafts, or financial signals), it automatically constructs a parallel digital world where thousands of intelligent agents interact and evolve.

Key Capabilities

Upload seed materials (data analysis reports, news articles, or even novels) and describe your prediction requirements in natural language. MiroFish returns:

  • A detailed prediction report
  • A deeply interactive high-fidelity digital world

How It Works

Complete Workflow

Step | Component           | Description
0    | Text Processing     | Document parsing, preprocessing, and chunking
1    | Ontology Generation | LLM-based entity and relationship type definition
2    | Graph Building      | Zep Cloud knowledge graph construction with GraphRAG
3    | Entity Extraction   | Filter and enrich entities from knowledge graph
4    | Profile Generation  | LLM + Zep search for detailed agent personas
5    | Simulation Config   | LLM-based simulation parameter generation
6    | OASIS Simulation    | Dual-platform parallel simulation (Twitter + Reddit)
7    | Report Generation   | ReACT-based ReportAgent with Zep tools
8    | Deep Interaction    | Chat with simulated agents and ReportAgent

Functional Flow Diagrams

The MiroFish workflow is divided into two main phases for better visualization:

Part 1: Data Preparation & Knowledge Graph Construction

[Figure: MiroFish Flow Part 1 - Data Preparation]

Phase 1 Components:

  1. Input Layer - Users upload documents (news, reports, novels) and describe prediction requirements in natural language
  2. Text Processor - FileParser extracts text, preprocessing cleans and normalizes, chunking splits into 500-character segments
  3. Ontology Generator - LLM analyzes content to define 10 entity types (Person, Organization, etc.) and 6-10 edge types
  4. Zep Cloud - Creates unique graph ID, sets ontology, uploads episodes in batches, performs GraphRAG extraction
  5. Entity Reader - Filters entities by defined types, enriches with edges and relationships, outputs filtered entities

Part 2: Simulation & Output Generation

[Figure: MiroFish Flow Part 2 - Simulation & Output]

Phase 2 Components:

  1. Profile Generator - Uses Zep hybrid search + LLM to generate detailed personas (age, MBTI, country, interests)
  2. Config Generator - LLM generates simulation parameters (rounds, active hours, posting frequency)
  3. OASIS Engine - CAMEL-AI powered social simulation on Twitter and Reddit platforms
  4. Report Agent - ReACT pattern with InsightForge, Panorama, and Interview tools
  5. Chat Interface - Interact with simulated agents or ask ReportAgent questions

Technical Architecture

Backend Services (Python + Flask)

backend/
├── app/
│   ├── api/                    # REST API endpoints
│   │   ├── graph.py           # Graph building API
│   │   ├── simulation.py      # Simulation management API
│   │   └── report.py          # Report generation API
│   ├── models/                 # Data models
│   │   ├── project.py         # Project model
│   │   └── task.py            # Task management
│   ├── services/               # Core services
│   │   ├── text_processor.py          # Document parsing & chunking
│   │   ├── ontology_generator.py      # LLM-based ontology definition
│   │   ├── graph_builder.py           # Zep Cloud graph construction
│   │   ├── zep_entity_reader.py       # Entity extraction & filtering
│   │   ├── oasis_profile_generator.py # Agent persona generation
│   │   ├── simulation_config_generator.py # LLM-based config
│   │   ├── simulation_manager.py      # Simulation orchestration
│   │   ├── simulation_runner.py       # Script execution
│   │   ├── report_agent.py            # ReACT report generation
│   │   └── zep_tools.py               # Zep search tools
│   └── utils/                  # Utilities
│       ├── file_parser.py      # Multi-format file parsing
│       ├── llm_client.py       # LLM API client
│       └── logger.py           # Logging utilities

Frontend Components (Vue.js + Vite)

frontend/
├── src/
│   ├── components/             # Vue components
│   │   ├── Step1GraphBuild.vue    # Document upload & graph building
│   │   ├── Step2EnvSetup.vue      # Entity extraction & profile generation
│   │   ├── Step3Simulation.vue    # Simulation configuration & execution
│   │   ├── Step4Report.vue        # Report viewing & download
│   │   └── Step5Interaction.vue   # Chat interface
│   ├── views/                  # Page views
│   │   ├── MainView.vue        # Main workflow view
│   │   ├── SimulationView.vue  # Simulation monitoring
│   │   └── InteractionView.vue # Chat interface
│   └── api/                    # API clients
│       ├── graph.js            # Graph API
│       ├── simulation.js       # Simulation API
│       └── report.js           # Report API

Key Dependencies

Package     | Purpose                            | Version
camel-oasis | Social media simulation engine     | Latest
camel-ai    | Multi-agent framework              | Latest
zep-cloud   | Long-term memory & knowledge graph | Latest
openai      | LLM API integration                | Latest
langchain   | ReACT agent framework              | Latest
flask       | Backend web framework              | 3.x
vue.js      | Frontend framework                 | 3.x
Core Components Deep Dive

1. Text Processor (text_processor.py)

Handles document ingestion and preparation:

  • FileParser: Extracts text from PDF, DOCX, TXT, MD files
  • Preprocessing: Removes excess whitespace, normalizes line endings
  • Chunking: Splits text into 500-character chunks with 50-character overlap

from typing import List

class TextProcessor:
    @staticmethod
    def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
        # split_text_into_chunks is a helper defined elsewhere in the codebase (not shown here)
        return split_text_into_chunks(text, chunk_size, overlap)
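
The article does not show the body of the chunking helper, so the following is only a minimal sketch of fixed-size chunking with overlap, using the 500/50 defaults mentioned above. The function name `chunk_with_overlap` is hypothetical and is not taken from the MiroFish codebase.

```python
from typing import List


def chunk_with_overlap(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into character windows where neighbours share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # distance the window advances each time
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


sample = "".join(str(i % 10) for i in range(1200))
pieces = chunk_with_overlap(sample)
print(len(pieces), [len(p) for p in pieces])  # 3 [500, 500, 300]
```

With these defaults each window starts 450 characters after the previous one, so the last 50 characters of one chunk reappear at the start of the next. That keeps sentences that straddle a boundary visible in both chunks.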

2. Ontology Generator (ontology_generator.py)

LLM-powered entity and relationship type definition:

  • Entity Types: Exactly 10 types including Person and Organization fallbacks
  • Edge Types: 6-10 relationship types (WORKS_FOR, STUDIES_AT, etc.)
  • Validation: Ensures Zep API compatibility (max 10 types each)

from typing import Any, Dict, List

class OntologyGenerator:
    def generate(self, document_texts: List[str], simulation_requirement: str) -> Dict[str, Any]:
        # Returns entity_types, edge_types, analysis_summary
        ...
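
To make the validation rule concrete, here is a rough sketch of how an LLM-proposed ontology could be trimmed to the limits described above. The function `normalize_ontology` and its behaviour are assumptions for illustration, not MiroFish's actual validator. In particular, this sketch caps the lists at 10 and keeps the two fallback types, but it does not pad a short list up to exactly 10 entity types.

```python
from typing import Dict, List

MAX_TYPES = 10  # limit stated above for both entity and edge types
FALLBACK_ENTITY_TYPES = ["Person", "Organization"]  # fallbacks named above


def normalize_ontology(entity_types: List[str], edge_types: List[str]) -> Dict[str, List[str]]:
    """Deduplicate, keep the fallback entity types, and cap both lists at MAX_TYPES."""
    # dict.fromkeys removes duplicates while preserving first-seen order.
    unique_entities = list(dict.fromkeys(entity_types))
    specific = [t for t in unique_entities if t not in FALLBACK_ENTITY_TYPES]
    # Reserve room for the fallbacks so truncation never drops them.
    room = MAX_TYPES - len(FALLBACK_ENTITY_TYPES)
    entities = specific[:room] + FALLBACK_ENTITY_TYPES
    edges = list(dict.fromkeys(edge_types))[:MAX_TYPES]
    return {"entity_types": entities, "edge_types": edges}


proposed = [f"Type{i}" for i in range(12)] + ["Person"]
result = normalize_ontology(proposed, ["WORKS_FOR", "STUDIES_AT", "WORKS_FOR"])
print(len(result["entity_types"]), result["edge_types"])
```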

3. Graph Builder (graph_builder.py)

Zep Cloud knowledge graph construction:

  • Create Graph: Generates unique graph ID (mirofish_<uuid>)
  • Set Ontology: Dynamically creates Pydantic models for entities/edges
  • Add Episodes: Batch uploads text chunks (default 3 per batch)
  • Wait for Processing: Monitors episode processing status
  • GraphRAG: Automatic entity and relationship extraction

class GraphBuilderService:
    def build_graph_async(self, text: str, ontology: Dict, ...) -> str:
        # Returns task_id for async processing
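
Two of the steps above, generating a graph ID and uploading chunks in batches of three, can be sketched without touching the Zep SDK. The helper names below are hypothetical, and the exact UUID formatting (hex without dashes) is an assumption, since the article only gives the `mirofish_<uuid>` pattern.

```python
import uuid
from typing import Iterator, List


def new_graph_id() -> str:
    """Build an ID in the mirofish_<uuid> shape (hex form is an assumption)."""
    return f"mirofish_{uuid.uuid4().hex}"


def batched(chunks: List[str], batch_size: int = 3) -> Iterator[List[str]]:
    """Yield consecutive groups of at most batch_size chunks."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for i in range(0, len(chunks), batch_size):
        yield chunks[i:i + batch_size]


graph_id = new_graph_id()
for number, batch in enumerate(batched(["c1", "c2", "c3", "c4", "c5", "c6", "c7"]), start=1):
    # A real implementation would send each batch to the graph service here.
    print(graph_id[:9], number, batch)
```

Small batches keep each upload request short and make it easier to report progress between batches, at the cost of more round trips.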

4. Entity Reader (zep_entity_reader.py)

Extracts and enriches entities from knowledge graph:

  • Filter by Type: Only includes entities matching defined types
  • Enrich with Edges: Adds related facts and relationships
  • Pagination: Handles large graphs with paging

class ZepEntityReader:
    def filter_defined_entities(self, graph_id: str, defined_entity_types: List[str]) -> FilteredEntities:
        # Returns filtered entities with enriched context
        ...
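
The filter-and-paginate idea can be illustrated with an in-memory stand-in for the graph. Everything here is hypothetical: the `Node` dataclass, the cursor-based `PageFetcher` signature, and the sample pages are assumptions for the sketch and do not mirror the Zep SDK's types or paging scheme.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional, Tuple


@dataclass
class Node:
    # Hypothetical stand-in for a graph node; not the Zep SDK's node type.
    uuid: str
    name: str
    labels: List[str] = field(default_factory=list)


# Takes a cursor (None for the first page) and returns (nodes, next_cursor).
PageFetcher = Callable[[Optional[str]], Tuple[List[Node], Optional[str]]]


def iter_nodes(fetch_page: PageFetcher) -> Iterator[Node]:
    """Walk every page until the fetcher reports no further cursor."""
    cursor: Optional[str] = None
    while True:
        nodes, cursor = fetch_page(cursor)
        yield from nodes
        if cursor is None:
            break


def filter_by_type(fetch_page: PageFetcher, defined_types: List[str]) -> List[Node]:
    """Keep only nodes carrying at least one of the defined entity types."""
    wanted = set(defined_types)
    return [n for n in iter_nodes(fetch_page) if wanted.intersection(n.labels)]


PAGES = {
    None: ([Node("1", "Alice", ["Entity", "Person"]), Node("2", "Rain", ["Entity"])], "p2"),
    "p2": ([Node("3", "Acme", ["Entity", "Organization"])], None),
}
kept = filter_by_type(lambda cursor: PAGES[cursor], ["Person", "Organization"])
print([n.name for n in kept])  # ['Alice', 'Acme']
```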

5. Profile Generator (oasis_profile_generator.py)

Creates detailed agent personas:

  • Zep Hybrid Search: Searches nodes and edges for context
  • LLM Persona: Generates age, MBTI, country, profession, interests
  • Individual vs Group: Different prompts for persons vs organizations
  • Output Formats: Twitter CSV and Reddit JSON

class OasisProfileGenerator:
    def generate_profile_from_entity(self, entity: EntityNode, user_id: int, use_llm: bool = True) -> OasisAgentProfile:
        # Returns detailed agent profile
        ...
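
As an illustration of the two output formats, the sketch below serializes the same persona to CSV and to JSON using only the standard library. The `AgentProfile` fields follow the attributes listed above (age, MBTI, country, profession, interests), but the exact column names and the `;` separator for interests are assumptions and are not the schema OASIS expects.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, fields
from typing import List


@dataclass
class AgentProfile:
    # Illustrative fields only; not the actual OASIS profile schema.
    user_id: int
    name: str
    age: int
    mbti: str
    country: str
    profession: str
    interests: List[str]


def to_csv(profiles: List[AgentProfile]) -> str:
    """Serialize profiles to CSV, joining the interests list into one cell."""
    buffer = io.StringIO()
    columns = [f.name for f in fields(AgentProfile)]
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for profile in profiles:
        row = asdict(profile)
        row["interests"] = ";".join(profile.interests)
        writer.writerow(row)
    return buffer.getvalue()


def to_json(profiles: List[AgentProfile]) -> str:
    """Serialize profiles to a JSON array, keeping interests as a list."""
    return json.dumps([asdict(p) for p in profiles], indent=2)


alice = AgentProfile(1, "Alice", 34, "INTJ", "Canada", "Journalist", ["policy", "tech"])
print(to_csv([alice]))
print(to_json([alice]))
```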

6. Simulation Manager (simulation_manager.py)

Orchestrates the entire simulation:

  • State Management: Tracks simulation status (created, preparing, ready, running, completed, failed)
  • Platform Support: Twitter and Reddit dual-platform simulation
  • Progress Callbacks: Real-time progress updates

from enum import Enum

class SimulationStatus(str, Enum):
    CREATED = "created"
    PREPARING = "preparing"
    READY = "ready"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
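
The article lists the states but not which moves between them are legal. The sketch below assumes a simple linear lifecycle in which any non-terminal state may also fail. The `ALLOWED` table and the `advance` helper are hypothetical and only show how such an enum could guard state changes.

```python
from enum import Enum


class SimulationStatus(str, Enum):
    CREATED = "created"
    PREPARING = "preparing"
    READY = "ready"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


S = SimulationStatus

# Assumed transition table; not taken from the MiroFish source.
ALLOWED = {
    S.CREATED: {S.PREPARING, S.FAILED},
    S.PREPARING: {S.READY, S.FAILED},
    S.READY: {S.RUNNING, S.FAILED},
    S.RUNNING: {S.COMPLETED, S.FAILED},
    S.COMPLETED: set(),
    S.FAILED: set(),
}


def advance(current: SimulationStatus, target: SimulationStatus) -> SimulationStatus:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


status = S.CREATED
for nxt in (S.PREPARING, S.READY, S.RUNNING, S.COMPLETED):
    status = advance(status, nxt)
print(status.value)  # completed
```

Because the enum also subclasses `str`, its members compare equal to their string values, which is convenient when statuses are written to or read from JSON.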

7. Report Agent (report_agent.py)

ReACT-based report generation with Zep tools:

  • InsightForge: Deep insight retrieval with multi-dimensional analysis
  • Panorama Search: Broad overview of simulation results
  • Quick Search: Simple fact lookup
  • Interview Agents: Real interviews with simulated agents

class ReportAgent:
    def generate_report(self, simulation_id: str, graph_id: str, simulation_requirement: str) -> Report:
        # Returns structured prediction report
        ...
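
For readers unfamiliar with the ReACT pattern, the loop alternates between choosing an action, running a tool, and feeding the observation back into the next decision. The sketch below shows that control flow with a scripted policy standing in for the LLM. The `react_loop` function, the `quick_search` stub, and the policy are all hypothetical and are not how ReportAgent is implemented.

```python
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]
# A step is ("act", tool_name, tool_input) or ("finish", answer, "").
Step = Tuple[str, str, str]


def react_loop(question: str, tools: Dict[str, Tool],
               policy: Callable[[List[str]], Step], max_steps: int = 5) -> str:
    """Alternate between acting and observing until the policy finishes."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        kind, name, arg = policy(transcript)
        if kind == "finish":
            return name  # for a finish step, the second slot holds the answer
        observation = tools[name](arg) if name in tools else f"unknown tool: {name}"
        transcript.append(f"Action: {name}[{arg}]")
        transcript.append(f"Observation: {observation}")
    return "No answer within the step budget."


def scripted_policy(transcript: List[str]) -> Step:
    """Stand-in for an LLM: search once, then answer from the observation."""
    observations = [t for t in transcript if t.startswith("Observation: ")]
    if not observations:
        return ("act", "quick_search", "sentiment on policy X")
    return ("finish", "Draft based on: " + observations[-1][len("Observation: "):], "")


tools = {"quick_search": lambda query: f"3 posts matched '{query}'"}
print(react_loop("How did agents react?", tools, scripted_policy))
```

The `max_steps` budget is the usual safeguard in this pattern: it bounds cost and prevents a model that never emits a final answer from looping forever.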

Use Cases

1. Public Opinion Prediction

Upload news articles or social media data to simulate how public sentiment might evolve:

  • Viral content spread patterns
  • Crisis communication outcomes
  • Brand reputation trajectories

2. Financial Market Simulation

Feed financial reports and market signals for agent-based market simulations:

  • Investor behavior modeling
  • Market sentiment analysis
  • Risk scenario testing

3. Creative Writing

Upload the first 80 chapters of a novel and let MiroFish predict the lost ending based on character personalities and plot dynamics.

4. Policy Impact Assessment

Test policy drafts in a zero-risk digital sandbox:

  • Public reaction simulation
  • Stakeholder behavior prediction
  • Unintended consequence discovery

Quick Start Guide

Prerequisites

Tool    | Version   | Purpose
Node.js | 18+       | Frontend runtime
Python  | 3.11-3.12 | Backend runtime
uv      | Latest    | Python package manager

Installation

# Clone the repository
git clone https://github.com/666ghj/MiroFish.git
cd MiroFish

# Copy environment configuration
cp .env.example .env

# Configure API keys
# LLM_API_KEY - Your LLM API key (OpenAI SDK compatible)
# ZEP_API_KEY - Zep Cloud API key for memory management

Environment Variables

# LLM API Configuration (supports any OpenAI SDK compatible API)
LLM_API_KEY=your_api_key
LLM_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
LLM_MODEL_NAME=qwen-plus

# Zep Cloud Configuration (free tier available)
ZEP_API_KEY=your_zep_api_key
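
If you want to check your configuration before starting the services, a small script can confirm that the four variables above are set. This is a sketch only: the `Settings` class and `load_settings` function are hypothetical, and treating all four variables as required is an assumption (the backend may supply defaults for some of them).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    llm_api_key: str
    llm_base_url: str
    llm_model_name: str
    zep_api_key: str


REQUIRED = ["LLM_API_KEY", "LLM_BASE_URL", "LLM_MODEL_NAME", "ZEP_API_KEY"]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the four variables and fail early if any is missing or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(
        llm_api_key=env["LLM_API_KEY"],
        llm_base_url=env["LLM_BASE_URL"],
        llm_model_name=env["LLM_MODEL_NAME"],
        zep_api_key=env["ZEP_API_KEY"],
    )


demo_env = {
    "LLM_API_KEY": "your_api_key",
    "LLM_BASE_URL": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "LLM_MODEL_NAME": "qwen-plus",
    "ZEP_API_KEY": "your_zep_api_key",
}
print(load_settings(demo_env).llm_model_name)  # qwen-plus
```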

Run the Application

# Install all dependencies
npm run setup:all

# Start both frontend and backend
npm run dev

Service URLs:

  • Frontend: http://localhost:3000
  • Backend API: http://localhost:5001

Docker Deployment

# Configure environment
cp .env.example .env

# Start with Docker Compose
docker compose up -d

API Reference

Graph Building API

POST /api/graph/build
Content-Type: application/json

{
  "documents": ["file1.pdf", "file2.docx"],
  "simulation_requirement": "Predict public reaction to policy X",
  "chunk_size": 500,
  "chunk_overlap": 50
}

Simulation API

POST /api/simulation/create
Content-Type: application/json

{
  "project_id": "proj_123",
  "graph_id": "mirofish_abc123",
  "enable_twitter": true,
  "enable_reddit": true
}

Report API

POST /api/report/generate
Content-Type: application/json

{
  "simulation_id": "sim_456",
  "graph_id": "mirofish_abc123",
  "simulation_requirement": "Analyze public sentiment trends"
}
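
The three endpoints above all accept a JSON body over POST, so a single helper can serve them. Below is a minimal standard-library sketch. The base URL comes from the Service URLs listed earlier, while the helper names and the assumption that responses are JSON are mine, since the article does not document response shapes.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "http://localhost:5001"  # backend URL from the Quick Start section


def build_post(path: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Create (but do not send) a JSON POST request for one of the endpoints."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: Dict[str, Any], timeout: float = 30.0) -> Any:
    """Send the request and decode the body, assuming the server replies with JSON."""
    with urllib.request.urlopen(build_post(path, payload), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_post(
    "/api/simulation/create",
    {
        "project_id": "proj_123",
        "graph_id": "mirofish_abc123",
        "enable_twitter": True,
        "enable_reddit": True,
    },
)
print(request.get_method(), request.full_url)
```

Separating `build_post` from `post_json` lets you inspect or log a request without a running backend. Call `post_json` with the same arguments once the services are up.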

Why MiroFish Matters

For Decision Makers

  • Zero-risk testing - Try policies in simulation before real-world implementation
  • Scenario exploration - Test multiple what-if scenarios
  • Stakeholder mapping - Understand how different groups might react

For Researchers

  • Multi-agent systems - Study emergent behaviors
  • Social simulation - Model complex social dynamics
  • LLM applications - Explore large-scale agent coordination

For Developers

  • Open source - Full code access for customization
  • Modular architecture - Easy to extend and modify
  • Modern stack - Vue.js frontend, Flask backend, Python agents

Acknowledgments

MiroFish is incubated by Shanda Group and powered by OASIS (Open Agent Social Interaction Simulations) from the CAMEL-AI team.

Conclusion

MiroFish represents a fascinating convergence of multi-agent systems, large language models, and social simulation. Whether you’re a researcher studying emergent behaviors, a decision-maker testing scenarios, or a developer exploring AI applications, MiroFish offers a powerful platform for predicting the future through simulation.

The ability to create thousands of unique agents with persistent memories and let them interact in simulated social environments opens up possibilities that were previously confined to science fiction. As LLMs continue to improve, systems like MiroFish are likely to model complex social dynamics with increasing fidelity.

Try it yourself - clone the repository, set up your API keys, and start simulating your own scenarios!


Have questions or want to share your MiroFish experiments? Join the Discord community or check out the GitHub repository for more details.
