DeepCritical / docs /implementation /13_phase_modal_integration.md
VibecoderMcSwaggins's picture
docs: add Phase 14 Demo Video & Hackathon Submission specification
e16b9e6
|
raw
history blame
28.7 kB

Phase 13 Implementation Spec: Modal Pipeline Integration

Goal: Wire existing Modal code execution into the agent pipeline. Philosophy: "Sandboxed execution makes AI-generated code trustworthy." Prerequisite: Phase 12 complete (MCP server working) Priority: P1 - HIGH VALUE ($2,500 Modal Innovation Award) Estimated Time: 2-3 hours


1. Why Modal Integration?

Current State Analysis

Mario already implemented src/tools/code_execution.py:

Component Status Notes
ModalCodeExecutor class Built Executes Python in Modal sandbox
SANDBOX_LIBRARIES Defined pandas, numpy, scipy, etc.
execute() method Implemented Stdout/stderr capture
execute_with_return() Implemented Returns result variable
AnalysisAgent Built Uses Modal for statistical analysis
Pipeline Integration MISSING Not wired into main orchestrator

What's Missing

Current Flow:
  User Query β†’ Orchestrator β†’ Search β†’ Judge β†’ [Report] β†’ Done

With Modal:
  User Query β†’ Orchestrator β†’ Search β†’ Judge β†’ [Hypothesis] β†’ [Analysis*] β†’ Report β†’ Done
                                                                    ↓
                                                          Modal Sandbox Execution

*The AnalysisAgent exists but is NOT called by either orchestrator.


2. Prize Opportunity

Modal Innovation Award: $2,500

Judging Criteria:

  1. Sandbox Isolation - Code runs in container, not local
  2. Scientific Computing - Real pandas/scipy analysis
  3. Safety - Can't access local filesystem
  4. Speed - Modal's fast cold starts

What We Need to Show

# LLM generates analysis code
code = """
import pandas as pd
import scipy.stats as stats

# Analyze extracted metrics from evidence
data = pd.DataFrame({
    'study': ['Study1', 'Study2', 'Study3'],
    'effect_size': [0.45, 0.52, 0.38],
    'sample_size': [120, 85, 200]
})

# Meta-analysis statistics
weighted_mean = (data['effect_size'] * data['sample_size']).sum() / data['sample_size'].sum()
t_stat, p_value = stats.ttest_1samp(data['effect_size'], 0)

print(f"Weighted Effect Size: {weighted_mean:.3f}")
print(f"P-value: {p_value:.4f}")

if p_value < 0.05:
    result = "SUPPORTED"
else:
    result = "INCONCLUSIVE"
"""

# Executed SAFELY in Modal sandbox
executor = get_code_executor()
output = executor.execute(code)  # Runs in isolated container!

3. Technical Specification

3.1 Dependencies (Already Present)

# pyproject.toml - already has Modal
dependencies = [
    "modal>=0.63.0",
    # ...
]

3.2 Environment Variables

# .env
MODAL_TOKEN_ID=your-token-id
MODAL_TOKEN_SECRET=your-token-secret

3.3 Integration Points

Integration Point File Change Required
Simple Orchestrator src/orchestrator.py Add AnalysisAgent call
Magentic Orchestrator src/orchestrator_magentic.py Add AnalysisAgent participant
Gradio UI src/app.py Add toggle for analysis mode
Config src/utils/config.py Add enable_modal_analysis setting

4. Implementation

4.1 Configuration Update (src/utils/config.py)

class Settings(BaseSettings):
    # ... existing settings ...

    # Modal Configuration
    modal_token_id: str | None = None
    modal_token_secret: str | None = None
    enable_modal_analysis: bool = False  # Opt-in for hackathon demo

    @property
    def modal_available(self) -> bool:
        """Check if Modal credentials are configured."""
        return bool(self.modal_token_id and self.modal_token_secret)

4.2 Simple Orchestrator Update (src/orchestrator.py)

"""Main orchestrator with optional Modal analysis."""

from src.utils.config import settings

# ... existing imports ...


class Orchestrator:
    """Search-Judge-Analyze orchestration loop."""

    def __init__(
        self,
        search_handler: SearchHandlerProtocol,
        judge_handler: JudgeHandlerProtocol,
        config: OrchestratorConfig | None = None,
        enable_analysis: bool = False,  # New parameter
    ) -> None:
        self.search = search_handler
        self.judge = judge_handler
        self.config = config or OrchestratorConfig()
        self.history: list[dict[str, Any]] = []
        self._enable_analysis = enable_analysis and settings.modal_available

        # Lazy-load analysis components
        self._hypothesis_agent: Any = None
        self._analysis_agent: Any = None

    async def _get_hypothesis_agent(self) -> Any:
        """Lazy initialization of HypothesisAgent."""
        if self._hypothesis_agent is None:
            from src.agents.hypothesis_agent import HypothesisAgent

            self._hypothesis_agent = HypothesisAgent(
                evidence_store={"current": []},
            )
        return self._hypothesis_agent

    async def _get_analysis_agent(self) -> Any:
        """Lazy initialization of AnalysisAgent."""
        if self._analysis_agent is None:
            from src.agents.analysis_agent import AnalysisAgent

            self._analysis_agent = AnalysisAgent(
                evidence_store={"current": [], "hypotheses": []},
            )
        return self._analysis_agent

    async def run(self, query: str) -> AsyncGenerator[AgentEvent, None]:
        """Main orchestration loop with optional Modal analysis."""
        # ... existing search/judge loop ...

        # After judge says "synthesize", optionally run analysis
        if self._enable_analysis and assessment.recommendation == "synthesize":
            yield AgentEvent(
                type="analyzing",
                message="Running statistical analysis in Modal sandbox...",
                data={},
                iteration=iteration,
            )

            try:
                # Generate hypotheses first
                hypothesis_agent = await self._get_hypothesis_agent()
                hypothesis_agent._evidence_store["current"] = all_evidence

                hypothesis_result = await hypothesis_agent.run(query)
                hypotheses = hypothesis_agent._evidence_store.get("hypotheses", [])

                # Run Modal analysis
                analysis_agent = await self._get_analysis_agent()
                analysis_agent._evidence_store["current"] = all_evidence
                analysis_agent._evidence_store["hypotheses"] = hypotheses

                analysis_result = await analysis_agent.run(query)

                yield AgentEvent(
                    type="analysis_complete",
                    message="Modal analysis complete",
                    data=analysis_agent._evidence_store.get("analysis", {}),
                    iteration=iteration,
                )

            except Exception as e:
                yield AgentEvent(
                    type="error",
                    message=f"Modal analysis failed: {e}",
                    data={"error": str(e)},
                    iteration=iteration,
                )

        # Continue to synthesis...

4.3 MCP Tool for Modal Analysis (src/mcp_tools.py)

Add a new MCP tool for direct Modal analysis:

async def analyze_hypothesis(
    drug: str,
    condition: str,
    evidence_summary: str,
) -> str:
    """Perform statistical analysis of drug repurposing hypothesis using Modal.

    Executes AI-generated Python code in a secure Modal sandbox to analyze
    the statistical evidence for a drug repurposing hypothesis.

    Args:
        drug: The drug being evaluated (e.g., "metformin")
        condition: The target condition (e.g., "Alzheimer's disease")
        evidence_summary: Summary of evidence to analyze

    Returns:
        Analysis result with verdict (SUPPORTED/REFUTED/INCONCLUSIVE) and statistics
    """
    from src.tools.code_execution import get_code_executor, CodeExecutionError
    from src.agent_factory.judges import get_model
    from pydantic_ai import Agent

    # Check Modal availability
    from src.utils.config import settings
    if not settings.modal_available:
        return "Error: Modal credentials not configured. Set MODAL_TOKEN_ID and MODAL_TOKEN_SECRET."

    # Generate analysis code using LLM
    code_agent = Agent(
        model=get_model(),
        output_type=str,
        system_prompt="""Generate Python code to analyze drug repurposing evidence.
Use pandas, numpy, scipy.stats. Output executable code only.
Set 'result' variable to SUPPORTED, REFUTED, or INCONCLUSIVE.
Print key statistics and p-values.""",
    )

    prompt = f"""Analyze this hypothesis:
Drug: {drug}
Condition: {condition}

Evidence:
{evidence_summary}

Generate statistical analysis code."""

    try:
        code_result = await code_agent.run(prompt)
        generated_code = code_result.output

        # Execute in Modal sandbox
        executor = get_code_executor()
        import asyncio
        loop = asyncio.get_running_loop()
        from functools import partial
        execution = await loop.run_in_executor(
            None, partial(executor.execute, generated_code, timeout=60)
        )

        if not execution["success"]:
            return f"## Analysis Failed\n\nError: {execution['error']}"

        # Format output
        return f"""## Statistical Analysis: {drug} for {condition}

### Execution Output

{execution['stdout']}


### Generated Code
```python
{generated_code}

Executed in Modal Sandbox - Isolated, secure, reproducible. """

except CodeExecutionError as e:
    return f"## Analysis Error\n\n{e}"
except Exception as e:
    return f"## Unexpected Error\n\n{e}"

### 4.4 Demo Script (`examples/modal_demo/run_analysis.py`)

```python
#!/usr/bin/env python3
"""Demo: Modal-powered statistical analysis of drug repurposing evidence.

This script demonstrates:
1. Gathering evidence from PubMed
2. Generating analysis code with LLM
3. Executing in Modal sandbox
4. Returning statistical insights

Usage:
    export OPENAI_API_KEY=...
    export MODAL_TOKEN_ID=...
    export MODAL_TOKEN_SECRET=...
    uv run python examples/modal_demo/run_analysis.py "metformin alzheimer"
"""

import argparse
import asyncio
import os
import sys

from src.agents.analysis_agent import AnalysisAgent
from src.agents.hypothesis_agent import HypothesisAgent
from src.tools.pubmed import PubMedTool
from src.utils.config import settings


async def main() -> None:
    """Run the Modal analysis demo."""
    parser = argparse.ArgumentParser(description="Modal Analysis Demo")
    parser.add_argument("query", help="Research query (e.g., 'metformin alzheimer')")
    args = parser.parse_args()

    # Check credentials
    if not settings.modal_available:
        print("Error: Modal credentials not configured.")
        print("Set MODAL_TOKEN_ID and MODAL_TOKEN_SECRET in .env")
        sys.exit(1)

    if not (os.getenv("OPENAI_API_KEY") or os.getenv("ANTHROPIC_API_KEY")):
        print("Error: No LLM API key found.")
        sys.exit(1)

    print(f"\n{'='*60}")
    print("DeepCritical Modal Analysis Demo")
    print(f"Query: {args.query}")
    print(f"{'='*60}\n")

    # Step 1: Gather Evidence
    print("Step 1: Gathering evidence from PubMed...")
    pubmed = PubMedTool()
    evidence = await pubmed.search(args.query, max_results=5)
    print(f"  Found {len(evidence)} papers\n")

    # Step 2: Generate Hypotheses
    print("Step 2: Generating mechanistic hypotheses...")
    evidence_store: dict = {"current": evidence, "hypotheses": []}
    hypothesis_agent = HypothesisAgent(evidence_store=evidence_store)
    await hypothesis_agent.run(args.query)
    hypotheses = evidence_store.get("hypotheses", [])
    print(f"  Generated {len(hypotheses)} hypotheses\n")

    if hypotheses:
        print(f"  Primary: {hypotheses[0].drug} β†’ {hypotheses[0].target}")

    # Step 3: Run Modal Analysis
    print("\nStep 3: Running statistical analysis in Modal sandbox...")
    print("  (This executes LLM-generated code in an isolated container)\n")

    analysis_agent = AnalysisAgent(evidence_store=evidence_store)
    result = await analysis_agent.run(args.query)

    # Step 4: Display Results
    print("\n" + "="*60)
    print("ANALYSIS RESULTS")
    print("="*60)

    if result.messages:
        print(result.messages[0].text)

    analysis = evidence_store.get("analysis", {})
    if analysis:
        print(f"\nVerdict: {analysis.get('verdict', 'N/A')}")
        print(f"Confidence: {analysis.get('confidence', 0):.0%}")

    print("\n[Demo Complete - Code was executed in Modal, not locally]")


if __name__ == "__main__":
    asyncio.run(main())

4.5 Verification Script (examples/modal_demo/verify_sandbox.py)

#!/usr/bin/env python3
"""Verify that Modal sandbox is properly isolated.

This script proves to judges that code runs in Modal, not locally.
It attempts operations that would succeed locally but fail in sandbox.

Usage:
    uv run python examples/modal_demo/verify_sandbox.py
"""

import asyncio
from functools import partial

from src.tools.code_execution import get_code_executor
from src.utils.config import settings


async def main() -> None:
    """Verify Modal sandbox isolation."""
    if not settings.modal_available:
        print("Error: Modal credentials not configured.")
        return

    executor = get_code_executor()
    loop = asyncio.get_running_loop()

    print("="*60)
    print("Modal Sandbox Isolation Verification")
    print("="*60 + "\n")

    # Test 1: Prove it's not running locally
    print("Test 1: Check hostname (should NOT be your machine)")
    code1 = """
import socket
print(f"Hostname: {socket.gethostname()}")
"""
    result1 = await loop.run_in_executor(None, partial(executor.execute, code1))
    print(f"  Result: {result1['stdout'].strip()}")
    print(f"  (Your local hostname would be different)\n")

    # Test 2: Verify scientific libraries available
    print("Test 2: Verify scientific libraries")
    code2 = """
import pandas as pd
import numpy as np
import scipy
print(f"pandas: {pd.__version__}")
print(f"numpy: {np.__version__}")
print(f"scipy: {scipy.__version__}")
"""
    result2 = await loop.run_in_executor(None, partial(executor.execute, code2))
    print(f"  {result2['stdout'].strip()}\n")

    # Test 3: Verify network is blocked (security)
    print("Test 3: Verify network isolation (should fail)")
    code3 = """
import urllib.request
try:
    urllib.request.urlopen("https://google.com", timeout=2)
    print("Network: ALLOWED (unexpected)")
except Exception as e:
    print(f"Network: BLOCKED (as expected)")
"""
    result3 = await loop.run_in_executor(None, partial(executor.execute, code3))
    print(f"  {result3['stdout'].strip()}\n")

    # Test 4: Run actual statistical analysis
    print("Test 4: Execute real statistical analysis")
    code4 = """
import pandas as pd
import scipy.stats as stats

data = pd.DataFrame({
    'drug': ['Metformin'] * 3,
    'effect': [0.42, 0.38, 0.51],
    'n': [100, 150, 80]
})

mean_effect = data['effect'].mean()
sem = data['effect'].sem()
t_stat, p_val = stats.ttest_1samp(data['effect'], 0)

print(f"Mean Effect: {mean_effect:.3f} (SE: {sem:.3f})")
print(f"t-statistic: {t_stat:.2f}, p-value: {p_val:.4f}")
print(f"Verdict: {'SUPPORTED' if p_val < 0.05 else 'INCONCLUSIVE'}")
"""
    result4 = await loop.run_in_executor(None, partial(executor.execute, code4))
    print(f"  {result4['stdout'].strip()}\n")

    print("="*60)
    print("All tests complete - Modal sandbox verified!")
    print("="*60)


if __name__ == "__main__":
    asyncio.run(main())

5. TDD Test Suite

5.1 Unit Tests (tests/unit/tools/test_modal_integration.py)

"""Unit tests for Modal pipeline integration."""

from unittest.mock import AsyncMock, MagicMock, patch

import pytest

from src.utils.models import Evidence, Citation


@pytest.fixture
def sample_evidence() -> list[Evidence]:
    """Sample evidence for testing."""
    return [
        Evidence(
            content="Metformin shows effect size of 0.45 in Alzheimer's model.",
            citation=Citation(
                source="pubmed",
                title="Metformin Study",
                url="https://pubmed.ncbi.nlm.nih.gov/12345/",
                date="2024-01-15",
                authors=["Smith J"],
            ),
            relevance=0.9,
        )
    ]


class TestAnalysisAgentIntegration:
    """Tests for AnalysisAgent integration."""

    @pytest.mark.asyncio
    async def test_analysis_agent_generates_code(
        self, sample_evidence: list[Evidence]
    ) -> None:
        """AnalysisAgent should generate Python code for analysis."""
        from src.agents.analysis_agent import AnalysisAgent

        evidence_store = {
            "current": sample_evidence,
            "hypotheses": [
                MagicMock(
                    drug="metformin",
                    target="AMPK",
                    pathway="autophagy",
                    effect="neuroprotection",
                    confidence=0.8,
                )
            ],
        }

        with patch("src.agents.analysis_agent.get_code_executor") as mock_executor, \
             patch("src.agents.analysis_agent.get_model") as mock_model:

            # Mock LLM to return code
            mock_agent = AsyncMock()
            mock_agent.run = AsyncMock(return_value=MagicMock(
                output="import pandas as pd\nresult = 'SUPPORTED'"
            ))

            # Mock Modal execution
            mock_executor.return_value.execute.return_value = {
                "stdout": "SUPPORTED",
                "stderr": "",
                "success": True,
                "error": None,
            }

            agent = AnalysisAgent(evidence_store=evidence_store)
            agent._agent = mock_agent

            result = await agent.run("metformin alzheimer")

            assert result.messages[0].text is not None
            assert "analysis" in evidence_store


class TestModalExecutorUnit:
    """Unit tests for ModalCodeExecutor."""

    def test_executor_checks_credentials(self) -> None:
        """Executor should warn if credentials missing."""
        import os
        from unittest.mock import patch

        with patch.dict(os.environ, {}, clear=True):
            from src.tools.code_execution import ModalCodeExecutor

            # Should not raise, but should log warning
            executor = ModalCodeExecutor()
            assert executor.modal_token_id is None

    def test_get_sandbox_library_list(self) -> None:
        """Should return list of library==version strings."""
        from src.tools.code_execution import get_sandbox_library_list

        libs = get_sandbox_library_list()

        assert isinstance(libs, list)
        assert "pandas==2.2.0" in libs
        assert "numpy==1.26.4" in libs


class TestOrchestratorWithAnalysis:
    """Tests for orchestrator with Modal analysis enabled."""

    @pytest.mark.asyncio
    async def test_orchestrator_calls_analysis_when_enabled(self) -> None:
        """Orchestrator should call AnalysisAgent when enabled and Modal available."""
        from src.orchestrator import Orchestrator
        from src.utils.models import OrchestratorConfig

        with patch("src.orchestrator.settings") as mock_settings:
            mock_settings.modal_available = True

            mock_search = AsyncMock()
            mock_search.search.return_value = MagicMock(
                evidence=[],
                errors=[],
            )

            mock_judge = AsyncMock()
            mock_judge.assess.return_value = MagicMock(
                sufficient=True,
                recommendation="synthesize",
                next_search_queries=[],
            )

            config = OrchestratorConfig(max_iterations=1)
            orchestrator = Orchestrator(
                search_handler=mock_search,
                judge_handler=mock_judge,
                config=config,
                enable_analysis=True,
            )

            # Collect events
            events = []
            async for event in orchestrator.run("test query"):
                events.append(event)

            # Should have analyzing event if Modal enabled
            event_types = [e.type for e in events]
            # Note: This test verifies the flow, actual Modal call is mocked

5.2 Integration Test (tests/integration/test_modal.py)

"""Integration tests for Modal code execution (requires Modal credentials)."""

import pytest

from src.utils.config import settings


@pytest.mark.integration
@pytest.mark.skipif(
    not settings.modal_available,
    reason="Modal credentials not configured"
)
class TestModalIntegration:
    """Integration tests for Modal (requires credentials)."""

    @pytest.mark.asyncio
    async def test_modal_executes_real_code(self) -> None:
        """Test actual code execution in Modal sandbox."""
        import asyncio
        from functools import partial

        from src.tools.code_execution import get_code_executor

        executor = get_code_executor()
        code = """
import pandas as pd
result = pd.DataFrame({'a': [1,2,3]})['a'].sum()
print(f"Sum: {result}")
"""

        loop = asyncio.get_running_loop()
        result = await loop.run_in_executor(
            None, partial(executor.execute, code, timeout=30)
        )

        assert result["success"]
        assert "Sum: 6" in result["stdout"]

    @pytest.mark.asyncio
    async def test_modal_blocks_network(self) -> None:
        """Verify network is blocked in sandbox."""
        import asyncio
        from functools import partial

        from src.tools.code_execution import get_code_executor

        executor = get_code_executor()
        code = """
import urllib.request
try:
    urllib.request.urlopen("https://google.com", timeout=2)
    print("NETWORK_ALLOWED")
except Exception:
    print("NETWORK_BLOCKED")
"""

        loop = asyncio.get_running_loop()
        result = await loop.run_in_executor(
            None, partial(executor.execute, code, timeout=30)
        )

        assert "NETWORK_BLOCKED" in result["stdout"]

6. Verification Commands

# 1. Set Modal credentials
export MODAL_TOKEN_ID=your-token-id
export MODAL_TOKEN_SECRET=your-token-secret

# Or via modal CLI
modal setup

# 2. Run unit tests
uv run pytest tests/unit/tools/test_modal_integration.py -v

# 3. Run verification script (proves sandbox works)
uv run python examples/modal_demo/verify_sandbox.py

# 4. Run full demo
uv run python examples/modal_demo/run_analysis.py "metformin alzheimer"

# 5. Run integration tests (requires Modal creds)
uv run pytest tests/integration/test_modal.py -v -m integration

# 6. Run full test suite
make check

7. Definition of Done

Phase 13 is COMPLETE when:

  • src/utils/config.py updated with enable_modal_analysis setting
  • src/orchestrator.py optionally calls AnalysisAgent
  • src/mcp_tools.py has analyze_hypothesis MCP tool
  • examples/modal_demo/run_analysis.py working demo
  • examples/modal_demo/verify_sandbox.py verification script
  • Unit tests in tests/unit/tools/test_modal_integration.py
  • Integration tests in tests/integration/test_modal.py
  • Verification script proves sandbox isolation
  • All unit tests pass
  • Lints pass

8. Demo Script for Judges

Show Modal Innovation

  1. Run verification script (proves sandbox):

    uv run python examples/modal_demo/verify_sandbox.py
    
    • Shows hostname is NOT local machine
    • Shows scientific libraries available
    • Shows network is BLOCKED (security)
    • Shows real statistics execution
  2. Run analysis demo:

    uv run python examples/modal_demo/run_analysis.py "metformin cancer"
    
    • Shows evidence gathering
    • Shows hypothesis generation
    • Shows code execution in Modal
    • Shows statistical verdict
  3. Show the key differentiator:

    "LLM-generated code executes in an isolated Modal container. This is enterprise-grade safety for AI-powered scientific computing."


9. Value Delivered

Before After
Code execution exists but unused Integrated into pipeline
No demo of sandbox isolation Verification script proves it
No MCP tool for analysis analyze_hypothesis MCP tool
No judge-friendly demo Clear demo script

Prize Impact:

  • With Modal Integration: Eligible for $2,500 Modal Innovation Award

10. Files to Create/Modify

File Action Purpose
src/utils/config.py MODIFY Add enable_modal_analysis
src/orchestrator.py MODIFY Add optional AnalysisAgent call
src/mcp_tools.py MODIFY Add analyze_hypothesis MCP tool
examples/modal_demo/run_analysis.py CREATE Demo script
examples/modal_demo/verify_sandbox.py CREATE Verification script
tests/unit/tools/test_modal_integration.py CREATE Unit tests
tests/integration/test_modal.py CREATE Integration tests

11. Architecture After Phase 13

User Query
    ↓
Orchestrator
    ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Search Phase                                                    β”‚
β”‚   PubMedTool β†’ ClinicalTrialsTool β†’ BioRxivTool                β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Judge Phase                                                     β”‚
β”‚   JudgeHandler β†’ "sufficient" β†’ continue to synthesis          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    ↓ (if enable_modal_analysis=True)
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Analysis Phase (NEW)                                            β”‚
β”‚   HypothesisAgent β†’ Generate mechanistic hypotheses             β”‚
β”‚        ↓                                                        β”‚
β”‚   AnalysisAgent β†’ Generate Python code                          β”‚
β”‚        ↓                                                        β”‚
β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”              β”‚
β”‚   β”‚         Modal Sandbox Container               β”‚              β”‚
β”‚   β”‚  - pandas, numpy, scipy, sklearn             β”‚              β”‚
β”‚   β”‚  - Network BLOCKED                           β”‚              β”‚
β”‚   β”‚  - Filesystem ISOLATED                       β”‚              β”‚
β”‚   β”‚  - Execute β†’ Return stdout                   β”‚              β”‚
β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜              β”‚
β”‚        ↓                                                        β”‚
β”‚   AnalysisResult β†’ SUPPORTED/REFUTED/INCONCLUSIVE              β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Report Phase                                                    β”‚
β”‚   ReportAgent β†’ Structured scientific report                    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

This is the Modal-powered analytics stack.