VibecoderMcSwaggins committed
Commit 5264b25 · unverified · 2 parents: 8625ded f9cb2b7

Merge pull request #20 from The-Obstacle-Is-The-Way/feat/phase12-mcp-server

README.md CHANGED
@@ -30,11 +30,35 @@ uv sync
30
 
31
  ```bash
32
  # Start the Gradio app
33
- uv run python -m src.app
34
  ```
35
 
36
  Open your browser to `http://localhost:7860`.
37
 
38
  ## Development
39
 
40
  ### Run Tests
@@ -53,13 +77,12 @@ make check
53
 
54
  DeepCritical uses a Vertical Slice Architecture:
55
 
56
- 1. **Search Slice**: Retrieving evidence from PubMed and the Web.
57
  2. **Judge Slice**: Evaluating evidence quality using LLMs.
58
  3. **Orchestrator Slice**: Managing the research loop and UI.
59
 
60
  Built with:
61
  - **PydanticAI**: For robust agent interactions.
62
  - **Gradio**: For the streaming user interface.
63
- - **PubMed**: For biomedical literature.
64
- - **DuckDuckGo**: For general web search.
65
-
 
30
 
31
  ```bash
32
  # Start the Gradio app
33
+ uv run python src/app.py
34
  ```
35
 
36
  Open your browser to `http://localhost:7860`.
37
 
38
+ ### 3. Connect via MCP
39
+
40
+ This application exposes a Model Context Protocol (MCP) server, allowing you to use its search tools directly from Claude Desktop or other MCP clients.
41
+
42
+ **MCP Server URL**: `http://localhost:7860/gradio_api/mcp/`
43
+
44
+ **Claude Desktop Configuration**:
45
+ Add this to your `claude_desktop_config.json`:
46
+ ```json
47
+ {
48
+ "mcpServers": {
49
+ "deepcritical": {
50
+ "url": "http://localhost:7860/gradio_api/mcp/"
51
+ }
52
+ }
53
+ }
54
+ ```
55
+
56
+ **Available Tools**:
57
+ - `search_pubmed`: Search peer-reviewed biomedical literature.
58
+ - `search_clinical_trials`: Search ClinicalTrials.gov.
59
+ - `search_biorxiv`: Search bioRxiv/medRxiv preprints.
60
+ - `search_all`: Search all sources simultaneously.
61
+
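+ **Programmatic access**: the same endpoints can be exercised from Python. A minimal sketch using the `gradio_client` package against a locally running app (the endpoint name assumes the `search_pubmed` tool listed above):
+
+ ```python
+ # Sketch: call the MCP-exposed PubMed search through the Gradio client.
+ # Assumes the app is running locally on port 7860.
+ from gradio_client import Client
+
+ client = Client("http://localhost:7860/")
+ result = client.predict(
+     "metformin alzheimer",  # query
+     10,                     # max_results
+     api_name="/search_pubmed",
+ )
+ print(result)
+ ```
+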
62
  ## Development
63
 
64
  ### Run Tests
 
77
 
78
  DeepCritical uses a Vertical Slice Architecture:
79
 
80
+ 1. **Search Slice**: Retrieving evidence from PubMed, ClinicalTrials.gov, and bioRxiv.
81
  2. **Judge Slice**: Evaluating evidence quality using LLMs.
82
  3. **Orchestrator Slice**: Managing the research loop and UI.
83
 
84
  Built with:
85
  - **PydanticAI**: For robust agent interactions.
86
  - **Gradio**: For the streaming user interface.
87
+ - **PubMed, ClinicalTrials.gov, bioRxiv**: For biomedical data.
88
+ - **MCP**: For universal tool access.
 
docs/implementation/12_phase_mcp_server.md ADDED
@@ -0,0 +1,832 @@
1
+ # Phase 12 Implementation Spec: MCP Server Integration
2
+
3
+ **Goal**: Expose DeepCritical search tools via an MCP server for Track 2 compliance.
4
+ **Philosophy**: "MCP is the bridge between tools and LLMs."
5
+ **Prerequisite**: Phase 11 complete (all search tools working)
6
+ **Priority**: P0 - REQUIRED FOR HACKATHON TRACK 2
7
+ **Estimated Time**: 2-3 hours
8
+
9
+ ---
10
+
11
+ ## 1. Why MCP Server?
12
+
13
+ ### Hackathon Requirement
14
+
15
+ | Requirement | Status Before | Status After |
16
+ |-------------|---------------|--------------|
17
+ | Must use MCP servers as tools | **MISSING** | **COMPLIANT** |
18
+ | Autonomous Agent behavior | **Have it** | Have it |
19
+ | Must be Gradio app | **Have it** | Have it |
20
+ | Planning/reasoning/execution | **Have it** | Have it |
21
+
22
+ **Bottom Line**: Without an MCP server, we're disqualified from Track 2.
23
+
24
+ ### What MCP Enables
25
+
26
+ ```text
27
+ Current State:
28
+ Our Tools → Called directly by Python code → Only our app can use them
29
+
30
+ After MCP:
31
+ Our Tools → Exposed via MCP protocol → Claude Desktop, Cursor, ANY MCP client
32
+ ```
33
+
34
+ ---
35
+
36
+ ## 2. Implementation Options Analysis
37
+
38
+ ### Option A: Gradio MCP (Recommended)
39
+
40
+ **Pros:**
41
+ - Single parameter: `demo.launch(mcp_server=True)`
42
+ - Already have Gradio app
43
+ - Automatic tool schema generation from docstrings
44
+ - Built into Gradio 5.0+
45
+
46
+ **Cons:**
47
+ - Requires Gradio 5.0+ with MCP extras
48
+ - Must follow strict docstring format
49
+
50
+ ### Option B: Native MCP SDK (FastMCP)
51
+
52
+ **Pros:**
53
+ - More control over tool definitions
54
+ - Explicit server configuration
55
+ - Separate from UI concerns
56
+
57
+ **Cons:**
58
+ - Separate server process
59
+ - More code to maintain
60
+ - Additional dependency
61
+
62
+ ### Decision: **Gradio MCP (Option A)**
63
+
64
+ Rationale:
65
+ 1. Already have Gradio app (`src/app.py`)
66
+ 2. Minimal code changes
67
+ 3. Judges will appreciate simplicity
68
+ 4. Follows the hackathon's official Gradio guide
69
+
70
+ ---
71
+
72
+ ## 3. Technical Specification
73
+
74
+ ### 3.1 Dependencies
75
+
76
+ ```toml
77
+ # pyproject.toml - add MCP extras
78
+ dependencies = [
79
+ "gradio[mcp]>=5.0.0", # Updated from gradio>=4.0
80
+ # ... existing deps
81
+ ]
82
+ ```
83
+
84
+ ### 3.2 MCP Tool Functions
85
+
86
+ Each tool needs:
87
+ 1. **Type hints** on all parameters
88
+ 2. **Docstring** with Args section (Google style)
89
+ 3. **Return type** annotation
90
+ 4. **`api_name`** parameter for explicit endpoint naming
91
+
92
+ ```python
93
+ async def search_pubmed(query: str, max_results: int = 10) -> str:
94
+ """Search PubMed for biomedical literature.
95
+
96
+ Args:
97
+ query: Search query for PubMed (e.g., "metformin alzheimer")
98
+ max_results: Maximum number of results to return (1-50)
99
+
100
+ Returns:
101
+ Formatted search results with titles, citations, and abstracts
102
+ """
103
+ ```
104
+
105
+ ### 3.3 MCP Server URL
106
+
107
+ Once launched:
108
+ ```text
109
+ http://localhost:7860/gradio_api/mcp/
110
+ ```
111
+
112
+ Or on HuggingFace Spaces:
113
+ ```text
114
+ https://[space-id].hf.space/gradio_api/mcp/
115
+ ```
116
+
117
+ ---
118
+
119
+ ## 4. Implementation
120
+
121
+ ### 4.1 MCP Tool Wrappers (`src/mcp_tools.py`)
122
+
123
+ ```python
124
+ """MCP tool wrappers for DeepCritical search tools.
125
+
126
+ These functions expose our search tools via MCP protocol.
127
+ Each function follows the MCP tool contract:
128
+ - Full type hints
129
+ - Google-style docstrings with Args section
130
+ - Formatted string returns
131
+ """
132
+
133
+ from src.tools.biorxiv import BioRxivTool
134
+ from src.tools.clinicaltrials import ClinicalTrialsTool
135
+ from src.tools.pubmed import PubMedTool
136
+
137
+
138
+ # Singleton instances (avoid recreating on each call)
139
+ _pubmed = PubMedTool()
140
+ _trials = ClinicalTrialsTool()
141
+ _biorxiv = BioRxivTool()
142
+
143
+
144
+ async def search_pubmed(query: str, max_results: int = 10) -> str:
145
+ """Search PubMed for peer-reviewed biomedical literature.
146
+
147
+ Searches NCBI PubMed database for scientific papers matching your query.
148
+ Returns titles, authors, abstracts, and citation information.
149
+
150
+ Args:
151
+ query: Search query (e.g., "metformin alzheimer", "drug repurposing cancer")
152
+ max_results: Maximum results to return (1-50, default 10)
153
+
154
+ Returns:
155
+ Formatted search results with paper titles, authors, dates, and abstracts
156
+ """
157
+ max_results = max(1, min(50, max_results)) # Clamp to valid range
158
+
159
+ results = await _pubmed.search(query, max_results)
160
+
161
+ if not results:
162
+ return f"No PubMed results found for: {query}"
163
+
164
+ formatted = [f"## PubMed Results for: {query}\n"]
165
+ for i, evidence in enumerate(results, 1):
166
+ formatted.append(f"### {i}. {evidence.citation.title}")
167
+ formatted.append(f"**Authors**: {', '.join(evidence.citation.authors[:3])}")
168
+ formatted.append(f"**Date**: {evidence.citation.date}")
169
+ formatted.append(f"**URL**: {evidence.citation.url}")
170
+ formatted.append(f"\n{evidence.content}\n")
171
+
172
+ return "\n".join(formatted)
173
+
174
+
175
+ async def search_clinical_trials(query: str, max_results: int = 10) -> str:
176
+ """Search ClinicalTrials.gov for clinical trial data.
177
+
178
+ Searches the ClinicalTrials.gov database for trials matching your query.
179
+ Returns trial titles, phases, status, conditions, and interventions.
180
+
181
+ Args:
182
+ query: Search query (e.g., "metformin alzheimer", "diabetes phase 3")
183
+ max_results: Maximum results to return (1-50, default 10)
184
+
185
+ Returns:
186
+ Formatted clinical trial information with NCT IDs, phases, and status
187
+ """
188
+ max_results = max(1, min(50, max_results))
189
+
190
+ results = await _trials.search(query, max_results)
191
+
192
+ if not results:
193
+ return f"No clinical trials found for: {query}"
194
+
195
+ formatted = [f"## Clinical Trials for: {query}\n"]
196
+ for i, evidence in enumerate(results, 1):
197
+ formatted.append(f"### {i}. {evidence.citation.title}")
198
+ formatted.append(f"**URL**: {evidence.citation.url}")
199
+ formatted.append(f"**Date**: {evidence.citation.date}")
200
+ formatted.append(f"\n{evidence.content}\n")
201
+
202
+ return "\n".join(formatted)
203
+
204
+
205
+ async def search_biorxiv(query: str, max_results: int = 10) -> str:
206
+ """Search bioRxiv/medRxiv for preprint research.
207
+
208
+ Searches bioRxiv and medRxiv preprint servers for cutting-edge research.
209
+ Note: Preprints are NOT peer-reviewed but contain the latest findings.
210
+
211
+ Args:
212
+ query: Search query (e.g., "metformin neuroprotection", "long covid treatment")
213
+ max_results: Maximum results to return (1-50, default 10)
214
+
215
+ Returns:
216
+ Formatted preprint results with titles, authors, and abstracts
217
+ """
218
+ max_results = max(1, min(50, max_results))
219
+
220
+ results = await _biorxiv.search(query, max_results)
221
+
222
+ if not results:
223
+ return f"No bioRxiv/medRxiv preprints found for: {query}"
224
+
225
+ formatted = [f"## Preprint Results for: {query}\n"]
226
+ for i, evidence in enumerate(results, 1):
227
+ formatted.append(f"### {i}. {evidence.citation.title}")
228
+ formatted.append(f"**Authors**: {', '.join(evidence.citation.authors[:3])}")
229
+ formatted.append(f"**Date**: {evidence.citation.date}")
230
+ formatted.append(f"**URL**: {evidence.citation.url}")
231
+ formatted.append(f"\n{evidence.content}\n")
232
+
233
+ return "\n".join(formatted)
234
+
235
+
236
+ async def search_all_sources(query: str, max_per_source: int = 5) -> str:
237
+ """Search all biomedical sources simultaneously.
238
+
239
+ Performs parallel search across PubMed, ClinicalTrials.gov, and bioRxiv.
240
+ This is the most comprehensive search option for drug repurposing research.
241
+
242
+ Args:
243
+ query: Search query (e.g., "metformin alzheimer", "aspirin cancer prevention")
244
+ max_per_source: Maximum results per source (1-20, default 5)
245
+
246
+ Returns:
247
+ Combined results from all sources with source labels
248
+ """
249
+ import asyncio
250
+
251
+ max_per_source = max(1, min(20, max_per_source))
252
+
253
+ # Run all searches in parallel
254
+ pubmed_task = search_pubmed(query, max_per_source)
255
+ trials_task = search_clinical_trials(query, max_per_source)
256
+ biorxiv_task = search_biorxiv(query, max_per_source)
257
+
258
+ pubmed_results, trials_results, biorxiv_results = await asyncio.gather(
259
+ pubmed_task, trials_task, biorxiv_task, return_exceptions=True
260
+ )
261
+
262
+ formatted = [f"# Comprehensive Search: {query}\n"]
263
+
264
+ # Add each result section (handle exceptions gracefully)
265
+ if isinstance(pubmed_results, str):
266
+ formatted.append(pubmed_results)
267
+ else:
268
+ formatted.append(f"## PubMed\n*Error: {pubmed_results}*\n")
269
+
270
+ if isinstance(trials_results, str):
271
+ formatted.append(trials_results)
272
+ else:
273
+ formatted.append(f"## Clinical Trials\n*Error: {trials_results}*\n")
274
+
275
+ if isinstance(biorxiv_results, str):
276
+ formatted.append(biorxiv_results)
277
+ else:
278
+ formatted.append(f"## Preprints\n*Error: {biorxiv_results}*\n")
279
+
280
+ return "\n---\n".join(formatted)
281
+ ```
282
+
283
+ ### 4.2 Update Gradio App (`src/app.py`)
284
+
285
+ ```python
286
+ """Gradio UI for DeepCritical agent with MCP server support."""
287
+
288
+ import os
289
+ from collections.abc import AsyncGenerator
290
+ from typing import Any
291
+
292
+ import gradio as gr
293
+
294
+ from src.agent_factory.judges import JudgeHandler, MockJudgeHandler
295
+ from src.mcp_tools import (
296
+ search_all_sources,
297
+ search_biorxiv,
298
+ search_clinical_trials,
299
+ search_pubmed,
300
+ )
301
+ from src.orchestrator_factory import create_orchestrator
302
+ from src.tools.biorxiv import BioRxivTool
303
+ from src.tools.clinicaltrials import ClinicalTrialsTool
304
+ from src.tools.pubmed import PubMedTool
305
+ from src.tools.search_handler import SearchHandler
306
+ from src.utils.models import OrchestratorConfig
307
+
308
+
309
+ # ... (existing configure_orchestrator and research_agent functions unchanged)
310
+
311
+
312
+ def create_demo() -> Any:
313
+ """
314
+ Create the Gradio demo interface with MCP support.
315
+
316
+ Returns:
317
+ Configured Gradio Blocks interface with MCP server enabled
318
+ """
319
+ with gr.Blocks(
320
+ title="DeepCritical - Drug Repurposing Research Agent",
321
+ theme=gr.themes.Soft(),
322
+ ) as demo:
323
+ gr.Markdown("""
324
+ # DeepCritical
325
+ ## AI-Powered Drug Repurposing Research Agent
326
+
327
+ Ask questions about potential drug repurposing opportunities.
328
+ The agent searches PubMed, ClinicalTrials.gov, and bioRxiv/medRxiv preprints.
329
+
330
+ **Example questions:**
331
+ - "What drugs could be repurposed for Alzheimer's disease?"
332
+ - "Is metformin effective for cancer treatment?"
333
+ - "What existing medications show promise for Long COVID?"
334
+ """)
335
+
336
+ # Main chat interface (existing)
337
+ gr.ChatInterface(
338
+ fn=research_agent,
339
+ type="messages",
340
+ title="",
341
+ examples=[
342
+ "What drugs could be repurposed for Alzheimer's disease?",
343
+ "Is metformin effective for treating cancer?",
344
+ "What medications show promise for Long COVID treatment?",
345
+ "Can statins be repurposed for neurological conditions?",
346
+ ],
347
+ additional_inputs=[
348
+ gr.Radio(
349
+ choices=["simple", "magentic"],
350
+ value="simple",
351
+ label="Orchestrator Mode",
352
+ info="Simple: Linear (OpenAI/Anthropic) | Magentic: Multi-Agent (OpenAI)",
353
+ )
354
+ ],
355
+ )
356
+
357
+ # MCP Tool Interfaces (exposed via MCP protocol)
358
+ gr.Markdown("---\n## MCP Tools (Also Available via Claude Desktop)")
359
+
360
+ with gr.Tab("PubMed Search"):
361
+ gr.Interface(
362
+ fn=search_pubmed,
363
+ inputs=[
364
+ gr.Textbox(label="Query", placeholder="metformin alzheimer"),
365
+ gr.Slider(1, 50, value=10, step=1, label="Max Results"),
366
+ ],
367
+ outputs=gr.Markdown(label="Results"),
368
+ api_name="search_pubmed",
369
+ )
370
+
371
+ with gr.Tab("Clinical Trials"):
372
+ gr.Interface(
373
+ fn=search_clinical_trials,
374
+ inputs=[
375
+ gr.Textbox(label="Query", placeholder="diabetes phase 3"),
376
+ gr.Slider(1, 50, value=10, step=1, label="Max Results"),
377
+ ],
378
+ outputs=gr.Markdown(label="Results"),
379
+ api_name="search_clinical_trials",
380
+ )
381
+
382
+ with gr.Tab("Preprints"):
383
+ gr.Interface(
384
+ fn=search_biorxiv,
385
+ inputs=[
386
+ gr.Textbox(label="Query", placeholder="long covid treatment"),
387
+ gr.Slider(1, 50, value=10, step=1, label="Max Results"),
388
+ ],
389
+ outputs=gr.Markdown(label="Results"),
390
+ api_name="search_biorxiv",
391
+ )
392
+
393
+ with gr.Tab("Search All"):
394
+ gr.Interface(
395
+ fn=search_all_sources,
396
+ inputs=[
397
+ gr.Textbox(label="Query", placeholder="metformin cancer"),
398
+ gr.Slider(1, 20, value=5, step=1, label="Max Per Source"),
399
+ ],
400
+ outputs=gr.Markdown(label="Results"),
401
+ api_name="search_all",
402
+ )
403
+
404
+ gr.Markdown("""
405
+ ---
406
+ **Note**: This is a research tool and should not be used for medical decisions.
407
+ Always consult healthcare professionals for medical advice.
408
+
409
+ Built with PydanticAI + PubMed, ClinicalTrials.gov & bioRxiv
410
+
411
+ **MCP Server**: Available at `/gradio_api/mcp/` for Claude Desktop integration
412
+ """)
413
+
414
+ return demo
415
+
416
+
417
+ def main() -> None:
418
+ """Run the Gradio app with MCP server enabled."""
419
+ demo = create_demo()
420
+ demo.launch(
421
+ server_name="0.0.0.0",
422
+ server_port=7860,
423
+ share=False,
424
+ mcp_server=True, # Enable MCP server
425
+ )
426
+
427
+
428
+ if __name__ == "__main__":
429
+ main()
430
+ ```
431
+
432
+ ---
433
+
434
+ ## 5. TDD Test Suite
435
+
436
+ ### 5.1 Unit Tests (`tests/unit/test_mcp_tools.py`)
437
+
438
+ ```python
439
+ """Unit tests for MCP tool wrappers."""
440
+
441
+ from unittest.mock import AsyncMock, patch
442
+
443
+ import pytest
444
+
445
+ from src.mcp_tools import (
446
+ search_all_sources,
447
+ search_biorxiv,
448
+ search_clinical_trials,
449
+ search_pubmed,
450
+ )
451
+ from src.utils.models import Citation, Evidence
452
+
453
+
454
+ @pytest.fixture
455
+ def mock_evidence() -> Evidence:
456
+ """Sample evidence for testing."""
457
+ return Evidence(
458
+ content="Metformin shows neuroprotective effects in preclinical models.",
459
+ citation=Citation(
460
+ source="pubmed",
461
+ title="Metformin and Alzheimer's Disease",
462
+ url="https://pubmed.ncbi.nlm.nih.gov/12345678/",
463
+ date="2024-01-15",
464
+ authors=["Smith J", "Jones M", "Brown K"],
465
+ ),
466
+ relevance=0.85,
467
+ )
468
+
469
+
470
+ class TestSearchPubMed:
471
+ """Tests for search_pubmed MCP tool."""
472
+
473
+ @pytest.mark.asyncio
474
+ async def test_returns_formatted_string(self, mock_evidence: Evidence) -> None:
475
+ """Should return formatted markdown string."""
476
+ with patch("src.mcp_tools._pubmed") as mock_tool:
477
+ mock_tool.search = AsyncMock(return_value=[mock_evidence])
478
+
479
+ result = await search_pubmed("metformin alzheimer", 10)
480
+
481
+ assert isinstance(result, str)
482
+ assert "PubMed Results" in result
483
+ assert "Metformin and Alzheimer's Disease" in result
484
+ assert "Smith J" in result
485
+
486
+ @pytest.mark.asyncio
487
+ async def test_clamps_max_results(self) -> None:
488
+ """Should clamp max_results to valid range (1-50)."""
489
+ with patch("src.mcp_tools._pubmed") as mock_tool:
490
+ mock_tool.search = AsyncMock(return_value=[])
491
+
492
+ # Test lower bound
493
+ await search_pubmed("test", 0)
494
+ mock_tool.search.assert_called_with("test", 1)
495
+
496
+ # Test upper bound
497
+ await search_pubmed("test", 100)
498
+ mock_tool.search.assert_called_with("test", 50)
499
+
500
+ @pytest.mark.asyncio
501
+ async def test_handles_no_results(self) -> None:
502
+ """Should return appropriate message when no results."""
503
+ with patch("src.mcp_tools._pubmed") as mock_tool:
504
+ mock_tool.search = AsyncMock(return_value=[])
505
+
506
+ result = await search_pubmed("xyznonexistent", 10)
507
+
508
+ assert "No PubMed results found" in result
509
+
510
+
511
+ class TestSearchClinicalTrials:
512
+ """Tests for search_clinical_trials MCP tool."""
513
+
514
+ @pytest.mark.asyncio
515
+ async def test_returns_formatted_string(self, mock_evidence: Evidence) -> None:
516
+ """Should return formatted markdown string."""
517
+ mock_evidence.citation.source = "clinicaltrials" # type: ignore
518
+
519
+ with patch("src.mcp_tools._trials") as mock_tool:
520
+ mock_tool.search = AsyncMock(return_value=[mock_evidence])
521
+
522
+ result = await search_clinical_trials("diabetes", 10)
523
+
524
+ assert isinstance(result, str)
525
+ assert "Clinical Trials" in result
526
+
527
+
528
+ class TestSearchBiorxiv:
529
+ """Tests for search_biorxiv MCP tool."""
530
+
531
+ @pytest.mark.asyncio
532
+ async def test_returns_formatted_string(self, mock_evidence: Evidence) -> None:
533
+ """Should return formatted markdown string."""
534
+ mock_evidence.citation.source = "biorxiv" # type: ignore
535
+
536
+ with patch("src.mcp_tools._biorxiv") as mock_tool:
537
+ mock_tool.search = AsyncMock(return_value=[mock_evidence])
538
+
539
+ result = await search_biorxiv("preprint search", 10)
540
+
541
+ assert isinstance(result, str)
542
+ assert "Preprint Results" in result
543
+
544
+
545
+ class TestSearchAllSources:
546
+ """Tests for search_all_sources MCP tool."""
547
+
548
+ @pytest.mark.asyncio
549
+ async def test_combines_all_sources(self, mock_evidence: Evidence) -> None:
550
+ """Should combine results from all sources."""
551
+ with patch("src.mcp_tools.search_pubmed", new_callable=AsyncMock) as mock_pubmed, \
552
+ patch("src.mcp_tools.search_clinical_trials", new_callable=AsyncMock) as mock_trials, \
553
+ patch("src.mcp_tools.search_biorxiv", new_callable=AsyncMock) as mock_biorxiv:
554
+
555
+ mock_pubmed.return_value = "## PubMed Results"
556
+ mock_trials.return_value = "## Clinical Trials"
557
+ mock_biorxiv.return_value = "## Preprints"
558
+
559
+ result = await search_all_sources("metformin", 5)
560
+
561
+ assert "Comprehensive Search" in result
562
+ assert "PubMed" in result
563
+ assert "Clinical Trials" in result
564
+ assert "Preprints" in result
565
+
566
+ @pytest.mark.asyncio
567
+ async def test_handles_partial_failures(self) -> None:
568
+ """Should handle partial failures gracefully."""
569
+ with patch("src.mcp_tools.search_pubmed", new_callable=AsyncMock) as mock_pubmed, \
570
+ patch("src.mcp_tools.search_clinical_trials", new_callable=AsyncMock) as mock_trials, \
571
+ patch("src.mcp_tools.search_biorxiv", new_callable=AsyncMock) as mock_biorxiv:
572
+
573
+ mock_pubmed.return_value = "## PubMed Results"
574
+ mock_trials.side_effect = Exception("API Error")
575
+ mock_biorxiv.return_value = "## Preprints"
576
+
577
+ result = await search_all_sources("metformin", 5)
578
+
579
+ # Should still contain working sources
580
+ assert "PubMed" in result
581
+ assert "Preprints" in result
582
+ # Should show error for failed source
583
+ assert "Error" in result
584
+
585
+
586
+ class TestMCPDocstrings:
587
+ """Tests that docstrings follow MCP format."""
588
+
589
+ def test_search_pubmed_has_args_section(self) -> None:
590
+ """Docstring must have Args section for MCP schema generation."""
591
+ assert search_pubmed.__doc__ is not None
592
+ assert "Args:" in search_pubmed.__doc__
593
+ assert "query:" in search_pubmed.__doc__
594
+ assert "max_results:" in search_pubmed.__doc__
595
+ assert "Returns:" in search_pubmed.__doc__
596
+
597
+ def test_search_clinical_trials_has_args_section(self) -> None:
598
+ """Docstring must have Args section for MCP schema generation."""
599
+ assert search_clinical_trials.__doc__ is not None
600
+ assert "Args:" in search_clinical_trials.__doc__
601
+
602
+ def test_search_biorxiv_has_args_section(self) -> None:
603
+ """Docstring must have Args section for MCP schema generation."""
604
+ assert search_biorxiv.__doc__ is not None
605
+ assert "Args:" in search_biorxiv.__doc__
606
+
607
+ def test_search_all_sources_has_args_section(self) -> None:
608
+ """Docstring must have Args section for MCP schema generation."""
609
+ assert search_all_sources.__doc__ is not None
610
+ assert "Args:" in search_all_sources.__doc__
611
+
612
+
613
+ class TestMCPTypeHints:
614
+ """Tests that type hints are complete for MCP."""
615
+
616
+ def test_search_pubmed_type_hints(self) -> None:
617
+ """All parameters and return must have type hints."""
618
+ import inspect
619
+
620
+ sig = inspect.signature(search_pubmed)
621
+
622
+ # Check parameter hints
623
+ assert sig.parameters["query"].annotation == str
624
+ assert sig.parameters["max_results"].annotation == int
625
+
626
+ # Check return hint
627
+ assert sig.return_annotation == str
628
+
629
+ def test_search_clinical_trials_type_hints(self) -> None:
630
+ """All parameters and return must have type hints."""
631
+ import inspect
632
+
633
+ sig = inspect.signature(search_clinical_trials)
634
+ assert sig.parameters["query"].annotation == str
635
+ assert sig.parameters["max_results"].annotation == int
636
+ assert sig.return_annotation == str
637
+ ```
638
+
639
+ ### 5.2 Integration Test (`tests/integration/test_mcp_server.py`)
640
+
641
+ ```python
642
+ """Integration tests for MCP server functionality."""
643
+
644
+ import pytest
645
+
646
+
647
+ class TestMCPServerIntegration:
648
+ """Integration tests for MCP server (requires running app)."""
649
+
650
+ @pytest.mark.integration
651
+ @pytest.mark.asyncio
652
+ async def test_mcp_tools_work_end_to_end(self) -> None:
653
+ """Test that MCP tools execute real searches."""
654
+ from src.mcp_tools import search_pubmed
655
+
656
+ result = await search_pubmed("metformin diabetes", 3)
657
+
658
+ assert isinstance(result, str)
659
+ assert "PubMed Results" in result
660
+ # Should have actual content (not just "no results")
661
+ assert len(result) > 100
662
+ ```
663
+
664
+ ---
665
+
666
+ ## 6. Claude Desktop Configuration
667
+
668
+ ### 6.1 Local Development
669
+
670
+ ```json
671
+ // ~/.config/claude/claude_desktop_config.json (Linux/Mac)
672
+ // %APPDATA%\Claude\claude_desktop_config.json (Windows)
673
+ {
674
+ "mcpServers": {
675
+ "deepcritical": {
676
+ "url": "http://localhost:7860/gradio_api/mcp/"
677
+ }
678
+ }
679
+ }
680
+ ```
681
+
682
+ ### 6.2 HuggingFace Spaces
683
+
684
+ ```json
685
+ {
686
+ "mcpServers": {
687
+ "deepcritical": {
688
+ "url": "https://MCP-1st-Birthday-deepcritical.hf.space/gradio_api/mcp/"
689
+ }
690
+ }
691
+ }
692
+ ```
693
+
694
+ ### 6.3 Private Spaces (with auth)
695
+
696
+ ```json
697
+ {
698
+ "mcpServers": {
699
+ "deepcritical": {
700
+ "url": "https://your-space.hf.space/gradio_api/mcp/",
701
+ "headers": {
702
+ "Authorization": "Bearer hf_xxxxxxxxxxxxx"
703
+ }
704
+ }
705
+ }
706
+ }
707
+ ```
708
+
709
+ ---
710
+
711
+ ## 7. Verification Commands
712
+
713
+ ```bash
714
+ # 1. Install MCP extras
715
+ uv add "gradio[mcp]>=5.0.0"
716
+
717
+ # 2. Run unit tests
718
+ uv run pytest tests/unit/test_mcp_tools.py -v
719
+
720
+ # 3. Run full test suite
721
+ make check
722
+
723
+ # 4. Start server with MCP
724
+ uv run python src/app.py
725
+
726
+ # 5. Verify MCP schema (in another terminal)
727
+ curl http://localhost:7860/gradio_api/mcp/schema
728
+
729
+ # 6. Test with MCP Inspector
730
+ npx @modelcontextprotocol/inspector http://localhost:7860/gradio_api/mcp/
731
+
732
+ # 7. Integration test (requires network access)
733
+ uv run pytest tests/integration/test_mcp_server.py -v -m integration
734
+ ```
735
+
736
+ ---
737
+
738
+ ## 8. Definition of Done
739
+
740
+ Phase 12 is **COMPLETE** when:
741
+
742
+ - [ ] `src/mcp_tools.py` created with all 4 MCP tools
743
+ - [ ] `src/app.py` updated with `mcp_server=True`
744
+ - [ ] Unit tests in `tests/unit/test_mcp_tools.py`
745
+ - [ ] Integration test in `tests/integration/test_mcp_server.py`
746
+ - [ ] `pyproject.toml` updated with `gradio[mcp]`
747
+ - [ ] MCP schema accessible at `/gradio_api/mcp/schema`
748
+ - [ ] Claude Desktop can connect and use tools
749
+ - [ ] All unit tests pass
750
+ - [ ] Lints pass
751
+
752
+ ---
753
+
754
+ ## 9. Demo Script for Judges
755
+
756
+ ### Show MCP Integration Works
757
+
758
+ 1. **Start the server**:
759
+ ```bash
760
+ uv run python src/app.py
761
+ ```
762
+
763
+ 2. **Show Claude Desktop using our tools**:
764
+ - Open Claude Desktop with DeepCritical MCP configured
765
+ - Ask: "Search PubMed for metformin Alzheimer's"
766
+ - Show real results appearing
767
+ - Ask: "Now search clinical trials for the same"
768
+ - Show combined analysis
769
+
770
+ 3. **Show MCP Inspector**:
771
+ ```bash
772
+ npx @modelcontextprotocol/inspector http://localhost:7860/gradio_api/mcp/
773
+ ```
774
+ - Show all 4 tools listed
775
+ - Execute `search_pubmed` from inspector
776
+ - Show results
777
+
778
+ ---
779
+
780
+ ## 10. Value Delivered
781
+
782
+ | Before | After |
783
+ |--------|-------|
784
+ | Tools only usable in our app | Tools usable by ANY MCP client |
785
+ | Not Track 2 compliant | **FULLY TRACK 2 COMPLIANT** |
786
+ | Can't use with Claude Desktop | Full Claude Desktop integration |
787
+
788
+ **Prize Impact**:
789
+ - Without MCP: **Disqualified from Track 2**
790
+ - With MCP: **Eligible for $2,500 1st place**
791
+
792
+ ---
793
+
794
+ ## 11. Files to Create/Modify
795
+
796
+ | File | Action | Purpose |
797
+ |------|--------|---------|
798
+ | `src/mcp_tools.py` | CREATE | MCP tool wrapper functions |
799
+ | `src/app.py` | MODIFY | Add `mcp_server=True`, add tool tabs |
800
+ | `pyproject.toml` | MODIFY | Add `gradio[mcp]>=5.0.0` |
801
+ | `tests/unit/test_mcp_tools.py` | CREATE | Unit tests for MCP tools |
802
+ | `tests/integration/test_mcp_server.py` | CREATE | Integration tests |
803
+ | `README.md` | MODIFY | Add MCP usage instructions |
804
+
805
+ ---
806
+
807
+ ## 12. Architecture After Phase 12
808
+
809
+ ```text
810
+ ┌─────────────────────────────────────────────────────────────────┐
+ │                     Claude Desktop / Cursor                     │
+ │                          (MCP Client)                           │
+ └─────────────────────────────┬───────────────────────────────────┘
+                               │ MCP Protocol
+                               ▼
+ ┌─────────────────────────────────────────────────────────────────┐
+ │                        Gradio MCP Server                        │
+ │                        /gradio_api/mcp/                         │
+ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌─────────┐  │
+ │ │search_pubmed │ │search_trials │ │search_biorxiv│ │search_  │  │
+ │ │              │ │              │ │              │ │all      │  │
+ │ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └────┬────┘  │
+ └────────┼────────────────┼────────────────┼──────────────┼───────┘
+          │                │                │              │
+          ▼                ▼                ▼              ▼
+    ┌──────────┐     ┌──────────┐     ┌──────────┐   (calls all)
+    │PubMedTool│     │Trials    │     │BioRxiv   │
+    │          │     │Tool      │     │Tool      │
+    └──────────┘     └──────────┘     └──────────┘
830
+ ```
831
+
832
+ **This is the MCP compliance stack.**
docs/implementation/13_phase_modal_integration.md ADDED
@@ -0,0 +1,1195 @@
1
+ # Phase 13 Implementation Spec: Modal Pipeline Integration
2
+
3
+ **Goal**: Wire existing Modal code execution into the agent pipeline.
4
+ **Philosophy**: "Sandboxed execution makes AI-generated code trustworthy."
5
+ **Prerequisite**: Phase 12 complete (MCP server working)
6
+ **Priority**: P1 - HIGH VALUE ($2,500 Modal Innovation Award)
7
+ **Estimated Time**: 2-3 hours
8
+
9
+ ---
10
+
11
+ ## 1. Why Modal Integration?
12
+
13
+ ### Current State Analysis
14
+
15
+ Mario already implemented `src/tools/code_execution.py`:
16
+
17
+ | Component | Status | Notes |
18
+ |-----------|--------|-------|
19
+ | `ModalCodeExecutor` class | Built | Executes Python in Modal sandbox |
20
+ | `SANDBOX_LIBRARIES` | Defined | pandas, numpy, scipy, etc. |
21
+ | `execute()` method | Implemented | Stdout/stderr capture |
22
+ | `execute_with_return()` | Implemented | Returns `result` variable |
23
+ | `AnalysisAgent` | Built | Uses Modal for statistical analysis |
24
+ | **Pipeline Integration** | **MISSING** | Not wired into main orchestrator |
25
+
26
+ ### What's Missing
27
+
28
+ ```text
29
+ Current Flow:
30
+ User Query → Orchestrator → Search → Judge → [Report] → Done
31
+
32
+ With Modal:
33
+ User Query → Orchestrator → Search → Judge → [Analysis*] → Report → Done
34
+                                                    ↓
+                                        Modal Sandbox Execution
36
+ ```
37
+
38
+ *The AnalysisAgent exists but is NOT called by either orchestrator.
39
+
40
+ ---
41
+
42
+ ## 2. Critical Dependency Analysis
43
+
44
+ ### The Problem (Senior Feedback)
45
+
46
+ ```python
47
+ # src/agents/analysis_agent.py - Line 8
48
+ from agent_framework import (
49
+ AgentRunResponse,
50
+ BaseAgent,
51
+ ...
52
+ )
53
+ ```
54
+
55
+ ```toml
56
+ # pyproject.toml - agent-framework is OPTIONAL
57
+ [project.optional-dependencies]
58
+ magentic = [
59
+ "agent-framework-core",
60
+ ]
61
+ ```
62
+
63
+ **If we import `AnalysisAgent` in the simple orchestrator without the `magentic` extra installed, the app CRASHES on startup.**
64
+
65
+ ### The SOLID Solution
66
+
67
+ **Single Responsibility Principle**: Decouple Modal execution logic from `agent_framework`.
68
+
69
+ ```text
70
+ BEFORE (Coupled):
71
+ AnalysisAgent (requires agent_framework)
72
+         ↓
73
+ ModalCodeExecutor
74
+
75
+ AFTER (Decoupled):
76
+ StatisticalAnalyzer (no agent_framework dependency) ← Simple mode uses this
77
+         ↓
78
+ ModalCodeExecutor
79
+
80
+ AnalysisAgent (wraps StatisticalAnalyzer) ← Magentic mode uses this
81
+ ```
82
+
83
+ **Key insight**: Create `src/services/statistical_analyzer.py` with ZERO agent_framework imports.
84
+
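+ A minimal sketch of the resulting import discipline (module paths as in this spec; the two helper functions are illustrative, not part of the codebase):
+
+ ```python
+ # Sketch: lazy imports keep agent_framework out of the simple path.
+ from typing import Any
+
+
+ def get_simple_analyzer() -> Any:
+     # Safe without the magentic extra: nothing under src.services
+     # imports agent_framework.
+     from src.services.statistical_analyzer import get_statistical_analyzer
+
+     return get_statistical_analyzer()
+
+
+ def get_magentic_analysis_agent(evidence_store: dict[str, Any]) -> Any:
+     # Only reached in magentic mode, where agent-framework-core is installed.
+     from src.agents.analysis_agent import AnalysisAgent
+
+     return AnalysisAgent(evidence_store)
+ ```
+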
85
+ ---
86
+
87
+ ## 3. Prize Opportunity
88
+
89
+ ### Modal Innovation Award: $2,500
90
+
91
+ **Judging Criteria**:
92
+ 1. **Sandbox Isolation** - Code runs in container, not local
93
+ 2. **Scientific Computing** - Real pandas/scipy analysis
94
+ 3. **Safety** - Can't access local filesystem
95
+ 4. **Speed** - Modal's fast cold starts
96
+
97
+ ### What We Need to Show
98
+
99
+ ```python
100
+ # LLM generates analysis code
101
+ code = """
102
+ import pandas as pd
103
+ import scipy.stats as stats
104
+
105
+ data = pd.DataFrame({
106
+ 'study': ['Study1', 'Study2', 'Study3'],
107
+ 'effect_size': [0.45, 0.52, 0.38],
108
+ 'sample_size': [120, 85, 200]
109
+ })
110
+
111
+ weighted_mean = (data['effect_size'] * data['sample_size']).sum() / data['sample_size'].sum()
112
+ t_stat, p_value = stats.ttest_1samp(data['effect_size'], 0)
113
+
114
+ print(f"Weighted Effect Size: {weighted_mean:.3f}")
115
+ print(f"P-value: {p_value:.4f}")
116
+
117
+ result = "SUPPORTED" if p_value < 0.05 else "INCONCLUSIVE"
118
+ """
119
+
120
+ # Executed SAFELY in Modal sandbox
121
+ executor = get_code_executor()
122
+ output = executor.execute(code) # Runs in isolated container!
123
+ ```
124
+
125
+ ---
126
+
127
+ ## 4. Technical Specification
128
+
129
+ ### 4.1 Dependencies
130
+
131
+ ```toml
132
+ # pyproject.toml - NO CHANGES to dependencies
133
+ # StatisticalAnalyzer uses only:
134
+ # - pydantic-ai (already in main deps)
135
+ # - modal (already in main deps)
136
+ # - src.tools.code_execution (no agent_framework)
137
+ ```
138
+
139
+ ### 4.2 Environment Variables
140
+
141
+ ```bash
142
+ # .env
143
+ MODAL_TOKEN_ID=your-token-id
144
+ MODAL_TOKEN_SECRET=your-token-secret
145
+ ```
146
+
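+ The same credentials can also be persisted to `~/.modal.toml` with the Modal CLI (a sketch; substitute the token values from your Modal dashboard):
+
+ ```bash
+ # Alternative to .env: store credentials via the Modal CLI.
+ modal token set --token-id <token-id> --token-secret <token-secret>
+ ```
+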
147
+ ### 4.3 Integration Points
148
+
149
+ | Integration Point | File | Change Required |
150
+ |-------------------|------|-----------------|
151
+ | New Service | `src/services/statistical_analyzer.py` | CREATE (no agent_framework) |
152
+ | Simple Orchestrator | `src/orchestrator.py` | Use `StatisticalAnalyzer` |
153
+ | Config | `src/utils/config.py` | Add `enable_modal_analysis` setting |
154
+ | AnalysisAgent | `src/agents/analysis_agent.py` | Refactor to wrap `StatisticalAnalyzer` |
155
+ | MCP Tool | `src/mcp_tools.py` | Add `analyze_hypothesis` tool |
156
+
157
+ ---
158
+
159
+ ## 5. Implementation
160
+
161
+ ### 5.1 Configuration Update (`src/utils/config.py`)
162
+
163
+ ```python
164
+ class Settings(BaseSettings):
165
+ # ... existing settings ...
166
+
167
+ # Modal Configuration
168
+ modal_token_id: str | None = None
169
+ modal_token_secret: str | None = None
170
+ enable_modal_analysis: bool = False # Opt-in for hackathon demo
171
+
172
+ @property
173
+ def modal_available(self) -> bool:
174
+ """Check if Modal credentials are configured."""
175
+ return bool(self.modal_token_id and self.modal_token_secret)
176
+ ```
177
+
178
+ ### 5.2 StatisticalAnalyzer Service (`src/services/statistical_analyzer.py`)
179
+
180
+ **This is the key fix - NO agent_framework imports.**
181
+
182
+ ```python
183
+ """Statistical analysis service using Modal code execution.
184
+
185
+ This module provides Modal-based statistical analysis WITHOUT depending on
186
+ agent_framework. This allows it to be used in the simple orchestrator mode
187
+ without requiring the magentic optional dependency.
188
+
189
+ The AnalysisAgent (in src/agents/) wraps this service for magentic mode.
190
+ """
191
+
192
+ import asyncio
193
+ import re
194
+ from functools import partial
195
+ from typing import Any
196
+
197
+ from pydantic import BaseModel, Field
198
+ from pydantic_ai import Agent
199
+
200
+ from src.agent_factory.judges import get_model
201
+ from src.tools.code_execution import (
202
+ CodeExecutionError,
203
+ get_code_executor,
204
+ get_sandbox_library_prompt,
205
+ )
206
+ from src.utils.models import Evidence
207
+
208
+
209
+ class AnalysisResult(BaseModel):
210
+ """Result of statistical analysis."""
211
+
212
+ verdict: str = Field(
213
+ description="SUPPORTED, REFUTED, or INCONCLUSIVE",
214
+ )
215
+ confidence: float = Field(ge=0.0, le=1.0, description="Confidence in verdict (0-1)")
216
+ statistical_evidence: str = Field(
217
+ description="Summary of statistical findings from code execution"
218
+ )
219
+ code_generated: str = Field(description="Python code that was executed")
220
+ execution_output: str = Field(description="Output from code execution")
221
+ key_findings: list[str] = Field(default_factory=list, description="Key takeaways")
222
+ limitations: list[str] = Field(default_factory=list, description="Limitations")
223
+
224
+
225
+ class StatisticalAnalyzer:
226
+ """Performs statistical analysis using Modal code execution.
227
+
228
+ This service:
229
+ 1. Generates Python code for statistical analysis using LLM
230
+ 2. Executes code in Modal sandbox
231
+ 3. Interprets results
232
+ 4. Returns verdict (SUPPORTED/REFUTED/INCONCLUSIVE)
233
+
234
+ Note: This class has NO agent_framework dependency, making it safe
235
+ to use in the simple orchestrator without the magentic extra.
236
+ """
237
+
238
+ def __init__(self) -> None:
239
+ """Initialize the analyzer."""
240
+ self._code_executor: Any = None
241
+ self._agent: Agent[None, str] | None = None
242
+
243
+ def _get_code_executor(self) -> Any:
244
+ """Lazy initialization of code executor."""
245
+ if self._code_executor is None:
246
+ self._code_executor = get_code_executor()
247
+ return self._code_executor
248
+
249
+ def _get_agent(self) -> Agent[None, str]:
250
+ """Lazy initialization of LLM agent for code generation."""
251
+ if self._agent is None:
252
+ library_versions = get_sandbox_library_prompt()
253
+ self._agent = Agent(
254
+ model=get_model(),
255
+ output_type=str,
256
+ system_prompt=f"""You are a biomedical data scientist.
257
+
258
+ Generate Python code to analyze research evidence and test hypotheses.
259
+
260
+ Guidelines:
261
+ 1. Use pandas, numpy, scipy.stats for analysis
262
+ 2. Print clear, interpretable results
263
+ 3. Include statistical tests (t-tests, chi-square, etc.)
264
+ 4. Calculate effect sizes and confidence intervals
265
+ 5. Keep code concise (<50 lines)
266
+ 6. Set 'result' variable to SUPPORTED, REFUTED, or INCONCLUSIVE
267
+
268
+ Available libraries:
269
+ {library_versions}
270
+
271
+ Output format: Return ONLY executable Python code, no explanations.""",
272
+ )
273
+ return self._agent
274
+
275
+ async def analyze(
276
+ self,
277
+ query: str,
278
+ evidence: list[Evidence],
279
+ hypothesis: dict[str, Any] | None = None,
280
+ ) -> AnalysisResult:
281
+ """Run statistical analysis on evidence.
282
+
283
+ Args:
284
+ query: The research question
285
+ evidence: List of Evidence objects to analyze
286
+ hypothesis: Optional hypothesis dict with drug, target, pathway, effect
287
+
288
+ Returns:
289
+ AnalysisResult with verdict and statistics
290
+ """
291
+ # Build analysis prompt
292
+ evidence_summary = self._summarize_evidence(evidence[:10])
293
+ hypothesis_text = ""
294
+ if hypothesis:
295
+ hypothesis_text = f"""
296
+ Hypothesis: {hypothesis.get('drug', 'Unknown')} → {hypothesis.get('target', '?')} → {hypothesis.get('pathway', '?')} → {hypothesis.get('effect', '?')}
297
+ Confidence: {hypothesis.get('confidence', 0.5):.0%}
298
+ """
299
+
300
+ prompt = f"""Generate Python code to statistically analyze:
301
+
302
+ **Research Question**: {query}
303
+ {hypothesis_text}
304
+
305
+ **Evidence Summary**:
306
+ {evidence_summary}
307
+
308
+ Generate executable Python code to analyze this evidence."""
309
+
310
+ try:
311
+ # Generate code
312
+ agent = self._get_agent()
313
+ code_result = await agent.run(prompt)
314
+ generated_code = code_result.output
315
+
316
+ # Execute in Modal sandbox
317
+ loop = asyncio.get_running_loop()
318
+ executor = self._get_code_executor()
319
+ execution = await loop.run_in_executor(
320
+ None, partial(executor.execute, generated_code, timeout=120)
321
+ )
322
+
323
+ if not execution["success"]:
324
+ return AnalysisResult(
325
+ verdict="INCONCLUSIVE",
326
+ confidence=0.0,
327
+ statistical_evidence=f"Execution failed: {execution['error']}",
328
+ code_generated=generated_code,
329
+ execution_output=execution.get("stderr", ""),
330
+ key_findings=[],
331
+ limitations=["Code execution failed"],
332
+ )
333
+
334
+ # Interpret results
335
+ return self._interpret_results(generated_code, execution)
336
+
337
+ except CodeExecutionError as e:
338
+ return AnalysisResult(
339
+ verdict="INCONCLUSIVE",
340
+ confidence=0.0,
341
+ statistical_evidence=str(e),
342
+ code_generated="",
343
+ execution_output="",
344
+ key_findings=[],
345
+ limitations=[f"Analysis error: {e}"],
346
+ )
347
+
348
+ def _summarize_evidence(self, evidence: list[Evidence]) -> str:
349
+ """Summarize evidence for code generation prompt."""
350
+ if not evidence:
351
+ return "No evidence available."
352
+
353
+ lines = []
354
+ for i, ev in enumerate(evidence[:5], 1):
355
+ lines.append(f"{i}. {ev.content[:200]}...")
356
+ lines.append(f" Source: {ev.citation.title}")
357
+ lines.append(f" Relevance: {ev.relevance:.0%}\n")
358
+
359
+ return "\n".join(lines)
360
+
361
+ def _interpret_results(
362
+ self,
363
+ code: str,
364
+ execution: dict[str, Any],
365
+ ) -> AnalysisResult:
366
+ """Interpret code execution results."""
367
+ stdout = execution["stdout"]
368
+ stdout_upper = stdout.upper()
369
+
370
+ # Extract verdict with robust word-boundary matching
371
+ verdict = "INCONCLUSIVE"
372
+ if re.search(r"\bSUPPORTED\b", stdout_upper) and not re.search(
373
+ r"\b(?:NOT|UN)SUPPORTED\b", stdout_upper
374
+ ):
375
+ verdict = "SUPPORTED"
376
+ elif re.search(r"\bREFUTED\b", stdout_upper):
377
+ verdict = "REFUTED"
378
+
379
+ # Extract key findings
380
+ key_findings = []
381
+ for line in stdout.split("\n"):
382
+ line_lower = line.lower()
383
+ if any(kw in line_lower for kw in ["p-value", "significant", "effect", "mean"]):
384
+ key_findings.append(line.strip())
385
+
386
+ # Calculate confidence from p-values
387
+ confidence = self._calculate_confidence(stdout)
388
+
389
+ return AnalysisResult(
390
+ verdict=verdict,
391
+ confidence=confidence,
392
+ statistical_evidence=stdout.strip(),
393
+ code_generated=code,
394
+ execution_output=stdout,
395
+ key_findings=key_findings[:5],
396
+ limitations=[
397
+ "Analysis based on summary data only",
398
+ "Limited to available evidence",
399
+ "Statistical tests assume data independence",
400
+ ],
401
+ )
402
+
403
+ def _calculate_confidence(self, output: str) -> float:
404
+ """Calculate confidence based on statistical results."""
405
+ p_values = re.findall(r"p[-\s]?value[:\s]+(\d+\.?\d*)", output.lower())
406
+
407
+ if p_values:
408
+ try:
409
+ min_p = min(float(p) for p in p_values)
410
+ if min_p < 0.001:
411
+ return 0.95
412
+ elif min_p < 0.01:
413
+ return 0.90
414
+ elif min_p < 0.05:
415
+ return 0.80
416
+ else:
417
+ return 0.60
418
+ except ValueError:
419
+ pass
420
+
421
+ return 0.70 # Default
422
+
423
+
424
+ # Singleton for reuse
425
+ _analyzer: StatisticalAnalyzer | None = None
426
+
427
+
428
+ def get_statistical_analyzer() -> StatisticalAnalyzer:
429
+ """Get or create singleton StatisticalAnalyzer instance."""
430
+ global _analyzer
431
+ if _analyzer is None:
432
+ _analyzer = StatisticalAnalyzer()
433
+ return _analyzer
434
+ ```
435
+
436
+ ### 5.3 Simple Orchestrator Update (`src/orchestrator.py`)
437
+
438
+ **Uses `StatisticalAnalyzer` directly - NO agent_framework import.**
439
+
440
+ ```python
441
+ """Main orchestrator with optional Modal analysis."""
442
+
443
+ from src.utils.config import settings
444
+
445
+ # ... existing imports ...
446
+
447
+
448
+ class Orchestrator:
449
+ """Search-Judge-Analyze orchestration loop."""
450
+
451
+ def __init__(
452
+ self,
453
+ search_handler: SearchHandlerProtocol,
454
+ judge_handler: JudgeHandlerProtocol,
455
+ config: OrchestratorConfig | None = None,
456
+ enable_analysis: bool = False, # New parameter
457
+ ) -> None:
458
+ self.search = search_handler
459
+ self.judge = judge_handler
460
+ self.config = config or OrchestratorConfig()
461
+ self.history: list[dict[str, Any]] = []
462
+ self._enable_analysis = enable_analysis and settings.modal_available
463
+
464
+ # Lazy-load analysis (NO agent_framework dependency!)
465
+ self._analyzer: Any = None
466
+
467
+ def _get_analyzer(self) -> Any:
468
+ """Lazy initialization of StatisticalAnalyzer.
469
+
470
+ Note: This imports from src.services, NOT src.agents,
471
+ so it works without the magentic optional dependency.
472
+ """
473
+ if self._analyzer is None:
474
+ from src.services.statistical_analyzer import get_statistical_analyzer
475
+
476
+ self._analyzer = get_statistical_analyzer()
477
+ return self._analyzer
478
+
479
+ async def run(self, query: str) -> AsyncGenerator[AgentEvent, None]:
480
+ """Main orchestration loop with optional Modal analysis."""
481
+ # ... existing search/judge loop ...
482
+
483
+ # After judge says "synthesize", optionally run analysis
484
+ if self._enable_analysis and assessment.recommendation == "synthesize":
485
+ yield AgentEvent(
486
+ type="analyzing",
487
+ message="Running statistical analysis in Modal sandbox...",
488
+ data={},
489
+ iteration=iteration,
490
+ )
491
+
492
+ try:
493
+ analyzer = self._get_analyzer()
494
+
495
+ # Run Modal analysis (no agent_framework needed!)
496
+ analysis_result = await analyzer.analyze(
497
+ query=query,
498
+ evidence=all_evidence,
499
+ hypothesis=None, # Could add hypothesis generation later
500
+ )
501
+
502
+ yield AgentEvent(
503
+ type="analysis_complete",
504
+ message=f"Analysis verdict: {analysis_result.verdict}",
505
+ data=analysis_result.model_dump(),
506
+ iteration=iteration,
507
+ )
508
+
509
+ except Exception as e:
510
+ yield AgentEvent(
511
+ type="error",
512
+ message=f"Modal analysis failed: {e}",
513
+ data={"error": str(e)},
514
+ iteration=iteration,
515
+ )
516
+
517
+ # Continue to synthesis...
518
+ ```
519
+
520
+ ### 5.4 Refactor AnalysisAgent (`src/agents/analysis_agent.py`)
521
+
522
+ **Wrap `StatisticalAnalyzer` for magentic mode.**
523
+
524
+ ```python
525
+ """Analysis agent for statistical analysis using Modal code execution.
526
+
527
+ This agent wraps StatisticalAnalyzer for use in magentic multi-agent mode.
528
+ The core logic is in src/services/statistical_analyzer.py to avoid
529
+ coupling agent_framework to the simple orchestrator.
530
+ """
531
+
532
+ from collections.abc import AsyncIterable
533
+ from typing import TYPE_CHECKING, Any
534
+
535
+ from agent_framework import (
536
+ AgentRunResponse,
537
+ AgentRunResponseUpdate,
538
+ AgentThread,
539
+ BaseAgent,
540
+ ChatMessage,
541
+ Role,
542
+ )
543
+
544
+ from src.services.statistical_analyzer import (
545
+ AnalysisResult,
546
+ get_statistical_analyzer,
547
+ )
548
+ from src.utils.models import Evidence
549
+
550
+ if TYPE_CHECKING:
551
+ from src.services.embeddings import EmbeddingService
552
+
553
+
554
+ class AnalysisAgent(BaseAgent): # type: ignore[misc]
555
+ """Wraps StatisticalAnalyzer for magentic multi-agent mode."""
556
+
557
+ def __init__(
558
+ self,
559
+ evidence_store: dict[str, Any],
560
+ embedding_service: "EmbeddingService | None" = None,
561
+ ) -> None:
562
+ super().__init__(
563
+ name="AnalysisAgent",
564
+ description="Performs statistical analysis using Modal sandbox",
565
+ )
566
+ self._evidence_store = evidence_store
567
+ self._embeddings = embedding_service
568
+ self._analyzer = get_statistical_analyzer()
569
+
570
+ async def run(
571
+ self,
572
+ messages: str | ChatMessage | list[str] | list[ChatMessage] | None = None,
573
+ *,
574
+ thread: AgentThread | None = None,
575
+ **kwargs: Any,
576
+ ) -> AgentRunResponse:
577
+ """Analyze evidence and return verdict."""
578
+ query = self._extract_query(messages)
579
+ hypotheses = self._evidence_store.get("hypotheses", [])
580
+ evidence = self._evidence_store.get("current", [])
581
+
582
+ if not evidence:
583
+ return self._error_response("No evidence available.")
584
+
585
+ # Get primary hypothesis if available
586
+ hypothesis_dict = None
587
+ if hypotheses:
588
+ h = hypotheses[0]
589
+ hypothesis_dict = {
590
+ "drug": getattr(h, "drug", "Unknown"),
591
+ "target": getattr(h, "target", "?"),
592
+ "pathway": getattr(h, "pathway", "?"),
593
+ "effect": getattr(h, "effect", "?"),
594
+ "confidence": getattr(h, "confidence", 0.5),
595
+ }
596
+
597
+ # Delegate to StatisticalAnalyzer
598
+ result = await self._analyzer.analyze(
599
+ query=query,
600
+ evidence=evidence,
601
+ hypothesis=hypothesis_dict,
602
+ )
603
+
604
+ # Store in shared context
605
+ self._evidence_store["analysis"] = result.model_dump()
606
+
607
+ # Format response
608
+ response_text = self._format_response(result)
609
+
610
+ return AgentRunResponse(
611
+ messages=[ChatMessage(role=Role.ASSISTANT, text=response_text)],
612
+ response_id=f"analysis-{result.verdict.lower()}",
613
+ additional_properties={"analysis": result.model_dump()},
614
+ )
615
+
616
+ def _format_response(self, result: AnalysisResult) -> str:
617
+ """Format analysis result as markdown."""
618
+ lines = [
619
+ "## Statistical Analysis Complete\n",
620
+ f"### Verdict: **{result.verdict}**",
621
+ f"**Confidence**: {result.confidence:.0%}\n",
622
+ "### Key Findings",
623
+ ]
624
+ for finding in result.key_findings:
625
+ lines.append(f"- {finding}")
626
+
627
+ lines.extend([
628
+ "\n### Statistical Evidence",
629
+ "```",
630
+ result.statistical_evidence,
631
+ "```",
632
+ ])
633
+ return "\n".join(lines)
634
+
635
+ def _error_response(self, message: str) -> AgentRunResponse:
636
+ """Create error response."""
637
+ return AgentRunResponse(
638
+ messages=[ChatMessage(role=Role.ASSISTANT, text=f"**Error**: {message}")],
639
+ response_id="analysis-error",
640
+ )
641
+
642
+ def _extract_query(
643
+ self, messages: str | ChatMessage | list[str] | list[ChatMessage] | None
644
+ ) -> str:
645
+ """Extract query from messages."""
646
+ if isinstance(messages, str):
647
+ return messages
648
+ elif isinstance(messages, ChatMessage):
649
+ return messages.text or ""
650
+ elif isinstance(messages, list):
651
+ for msg in reversed(messages):
652
+ if isinstance(msg, ChatMessage) and msg.role == Role.USER:
653
+ return msg.text or ""
654
+ elif isinstance(msg, str):
655
+ return msg
656
+ return ""
657
+
658
+ async def run_stream(
659
+ self,
660
+ messages: str | ChatMessage | list[str] | list[ChatMessage] | None = None,
661
+ *,
662
+ thread: AgentThread | None = None,
663
+ **kwargs: Any,
664
+ ) -> AsyncIterable[AgentRunResponseUpdate]:
665
+ """Streaming wrapper."""
666
+ result = await self.run(messages, thread=thread, **kwargs)
667
+ yield AgentRunResponseUpdate(messages=result.messages, response_id=result.response_id)
668
+ ```
669
+
670
+ ### 5.5 MCP Tool for Modal Analysis (`src/mcp_tools.py`)
671
+
672
+ Add to existing MCP tools:
673
+
674
+ ```python
675
+ async def analyze_hypothesis(
676
+ drug: str,
677
+ condition: str,
678
+ evidence_summary: str,
679
+ ) -> str:
680
+ """Perform statistical analysis of drug repurposing hypothesis using Modal.
681
+
682
+ Executes AI-generated Python code in a secure Modal sandbox to analyze
683
+ the statistical evidence for a drug repurposing hypothesis.
684
+
685
+ Args:
686
+ drug: The drug being evaluated (e.g., "metformin")
687
+ condition: The target condition (e.g., "Alzheimer's disease")
688
+ evidence_summary: Summary of evidence to analyze
689
+
690
+ Returns:
691
+ Analysis result with verdict (SUPPORTED/REFUTED/INCONCLUSIVE) and statistics
692
+ """
693
+ from src.services.statistical_analyzer import get_statistical_analyzer
694
+ from src.utils.config import settings
695
+ from src.utils.models import Citation, Evidence
696
+
697
+ if not settings.modal_available:
698
+ return "Error: Modal credentials not configured. Set MODAL_TOKEN_ID and MODAL_TOKEN_SECRET."
699
+
700
+ # Create evidence from summary
701
+ evidence = [
702
+ Evidence(
703
+ content=evidence_summary,
704
+ citation=Citation(
705
+ source="pubmed",
706
+ title=f"Evidence for {drug} in {condition}",
707
+ url="https://example.com",
708
+ date="2024-01-01",
709
+ authors=["User Provided"],
710
+ ),
711
+ relevance=0.9,
712
+ )
713
+ ]
714
+
715
+ analyzer = get_statistical_analyzer()
716
+ result = await analyzer.analyze(
717
+ query=f"Can {drug} treat {condition}?",
718
+ evidence=evidence,
719
+ hypothesis={"drug": drug, "target": "unknown", "pathway": "unknown", "effect": condition},
720
+ )
721
+
722
+ return f"""## Statistical Analysis: {drug} for {condition}
723
+
724
+ ### Verdict: **{result.verdict}**
725
+ **Confidence**: {result.confidence:.0%}
726
+
727
+ ### Key Findings
728
+ {chr(10).join(f"- {f}" for f in result.key_findings) or "- No specific findings extracted"}
729
+
730
+ ### Execution Output
731
+ ```
732
+ {result.execution_output}
733
+ ```
734
+
735
+ ### Generated Code
736
+ ```python
737
+ {result.code_generated}
738
+ ```
739
+
740
+ **Executed in Modal Sandbox** - Isolated, secure, reproducible.
741
+ """
742
+ ```
743
+
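+ Once registered, the same function can also be exercised directly in Python as a quick sanity check (illustrative invocation; the argument values are made up):
+ 
+ ```python
+ import asyncio
+ 
+ from src.mcp_tools import analyze_hypothesis
+ 
+ # Direct call to the MCP tool function, bypassing the MCP transport
+ report = asyncio.run(
+     analyze_hypothesis(
+         drug="metformin",
+         condition="Alzheimer's disease",
+         evidence_summary="Observational studies report reduced dementia incidence.",
+     )
+ )
+ print(report)
+ ```
+ 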
744
+ ### 5.6 Demo Scripts
745
+
746
+ #### `examples/modal_demo/verify_sandbox.py`
747
+
748
+ ```python
749
+ #!/usr/bin/env python3
750
+ """Verify that Modal sandbox is properly isolated.
751
+
752
+ This script proves to judges that code runs in Modal, not locally.
753
+ NO agent_framework dependency - uses only src.tools.code_execution.
754
+
755
+ Usage:
756
+ uv run python examples/modal_demo/verify_sandbox.py
757
+ """
758
+
759
+ import asyncio
760
+ from functools import partial
761
+
762
+ from src.tools.code_execution import get_code_executor
763
+ from src.utils.config import settings
764
+
765
+
766
+ async def main() -> None:
767
+ """Verify Modal sandbox isolation."""
768
+ if not settings.modal_available:
769
+ print("Error: Modal credentials not configured.")
770
+ print("Set MODAL_TOKEN_ID and MODAL_TOKEN_SECRET in .env")
771
+ return
772
+
773
+ executor = get_code_executor()
774
+ loop = asyncio.get_running_loop()
775
+
776
+ print("=" * 60)
777
+ print("Modal Sandbox Isolation Verification")
778
+ print("=" * 60 + "\n")
779
+
780
+ # Test 1: Hostname
781
+ print("Test 1: Check hostname (should NOT be your machine)")
782
+ code1 = "import socket; print(f'Hostname: {socket.gethostname()}')"
783
+ result1 = await loop.run_in_executor(None, partial(executor.execute, code1))
784
+ print(f" {result1['stdout'].strip()}\n")
785
+
786
+ # Test 2: Scientific libraries
787
+ print("Test 2: Verify scientific libraries")
788
+ code2 = """
789
+ import pandas as pd
790
+ import numpy as np
791
+ import scipy
792
+ print(f"pandas: {pd.__version__}")
793
+ print(f"numpy: {np.__version__}")
794
+ print(f"scipy: {scipy.__version__}")
795
+ """
796
+ result2 = await loop.run_in_executor(None, partial(executor.execute, code2))
797
+ print(f" {result2['stdout'].strip()}\n")
798
+
799
+ # Test 3: Network blocked
800
+ print("Test 3: Verify network isolation")
801
+ code3 = """
802
+ import urllib.request
803
+ try:
804
+ urllib.request.urlopen("https://google.com", timeout=2)
805
+ print("Network: ALLOWED (unexpected!)")
806
+ except Exception:
807
+ print("Network: BLOCKED (as expected)")
808
+ """
809
+ result3 = await loop.run_in_executor(None, partial(executor.execute, code3))
810
+ print(f" {result3['stdout'].strip()}\n")
811
+
812
+ # Test 4: Real statistics
813
+ print("Test 4: Execute statistical analysis")
814
+ code4 = """
815
+ import pandas as pd
816
+ import scipy.stats as stats
817
+
818
+ data = pd.DataFrame({'effect': [0.42, 0.38, 0.51]})
819
+ mean = data['effect'].mean()
820
+ t_stat, p_val = stats.ttest_1samp(data['effect'], 0)
821
+
822
+ print(f"Mean Effect: {mean:.3f}")
823
+ print(f"P-value: {p_val:.4f}")
824
+ print(f"Verdict: {'SUPPORTED' if p_val < 0.05 else 'INCONCLUSIVE'}")
825
+ """
826
+ result4 = await loop.run_in_executor(None, partial(executor.execute, code4))
827
+ print(f" {result4['stdout'].strip()}\n")
828
+
829
+ print("=" * 60)
830
+ print("All tests complete - Modal sandbox verified!")
831
+ print("=" * 60)
832
+
833
+
834
+ if __name__ == "__main__":
835
+ asyncio.run(main())
836
+ ```
837
+
838
+ #### `examples/modal_demo/run_analysis.py`
839
+
840
+ ```python
841
+ #!/usr/bin/env python3
842
+ """Demo: Modal-powered statistical analysis.
843
+
844
+ This script uses StatisticalAnalyzer directly (NO agent_framework dependency).
845
+
846
+ Usage:
847
+ uv run python examples/modal_demo/run_analysis.py "metformin alzheimer"
848
+ """
849
+
850
+ import argparse
851
+ import asyncio
852
+ import os
853
+ import sys
854
+
855
+ from src.services.statistical_analyzer import get_statistical_analyzer
856
+ from src.tools.pubmed import PubMedTool
857
+ from src.utils.config import settings
858
+
859
+
860
+ async def main() -> None:
861
+ """Run the Modal analysis demo."""
862
+ parser = argparse.ArgumentParser(description="Modal Analysis Demo")
863
+ parser.add_argument("query", help="Research query")
864
+ args = parser.parse_args()
865
+
866
+ if not settings.modal_available:
867
+ print("Error: Modal credentials not configured.")
868
+ sys.exit(1)
869
+
870
+ if not (os.getenv("OPENAI_API_KEY") or os.getenv("ANTHROPIC_API_KEY")):
871
+ print("Error: No LLM API key found.")
872
+ sys.exit(1)
873
+
874
+ print(f"\n{'=' * 60}")
875
+ print("DeepCritical Modal Analysis Demo")
876
+ print(f"Query: {args.query}")
877
+ print(f"{'=' * 60}\n")
878
+
879
+ # Step 1: Gather Evidence
880
+ print("Step 1: Gathering evidence from PubMed...")
881
+ pubmed = PubMedTool()
882
+ evidence = await pubmed.search(args.query, max_results=5)
883
+ print(f" Found {len(evidence)} papers\n")
884
+
885
+ # Step 2: Run Modal Analysis
886
+ print("Step 2: Running statistical analysis in Modal sandbox...")
887
+ analyzer = get_statistical_analyzer()
888
+ result = await analyzer.analyze(query=args.query, evidence=evidence)
889
+
890
+ # Step 3: Display Results
891
+ print("\n" + "=" * 60)
892
+ print("ANALYSIS RESULTS")
893
+ print("=" * 60)
894
+ print(f"\nVerdict: {result.verdict}")
895
+ print(f"Confidence: {result.confidence:.0%}")
896
+ print("\nKey Findings:")
897
+ for finding in result.key_findings:
898
+ print(f" - {finding}")
899
+
900
+ print("\n[Demo Complete - Code executed in Modal, not locally]")
901
+
902
+
903
+ if __name__ == "__main__":
904
+ asyncio.run(main())
905
+ ```
906
+
907
+ ---
908
+
909
+ ## 6. TDD Test Suite
910
+
911
+ ### 6.1 Unit Tests (`tests/unit/services/test_statistical_analyzer.py`)
912
+
913
+ ```python
914
+ """Unit tests for StatisticalAnalyzer service."""
915
+
916
+ from unittest.mock import AsyncMock, MagicMock, patch
917
+
918
+ import pytest
919
+
920
+ from src.services.statistical_analyzer import (
921
+ AnalysisResult,
922
+ StatisticalAnalyzer,
923
+ get_statistical_analyzer,
924
+ )
925
+ from src.utils.models import Citation, Evidence
926
+
927
+
928
+ @pytest.fixture
929
+ def sample_evidence() -> list[Evidence]:
930
+ """Sample evidence for testing."""
931
+ return [
932
+ Evidence(
933
+ content="Metformin shows effect size of 0.45.",
934
+ citation=Citation(
935
+ source="pubmed",
936
+ title="Metformin Study",
937
+ url="https://pubmed.ncbi.nlm.nih.gov/12345/",
938
+ date="2024-01-15",
939
+ authors=["Smith J"],
940
+ ),
941
+ relevance=0.9,
942
+ )
943
+ ]
944
+
945
+
946
+ class TestStatisticalAnalyzer:
947
+ """Tests for StatisticalAnalyzer (no agent_framework dependency)."""
948
+
949
+ def test_no_agent_framework_import(self) -> None:
950
+ """StatisticalAnalyzer must NOT import agent_framework."""
951
+ import src.services.statistical_analyzer as module
952
+
953
+ # Check module doesn't import agent_framework
954
+         with open(module.__file__) as fh:
+             source = fh.read()
955
+ assert "agent_framework" not in source
956
+ assert "BaseAgent" not in source
957
+
958
+ @pytest.mark.asyncio
959
+ async def test_analyze_returns_result(
960
+ self, sample_evidence: list[Evidence]
961
+ ) -> None:
962
+ """analyze() should return AnalysisResult."""
963
+ analyzer = StatisticalAnalyzer()
964
+
965
+ with patch.object(analyzer, "_get_agent") as mock_agent, \
966
+ patch.object(analyzer, "_get_code_executor") as mock_executor:
967
+
968
+ # Mock LLM
969
+ mock_agent.return_value.run = AsyncMock(
970
+ return_value=MagicMock(output="print('SUPPORTED')")
971
+ )
972
+
973
+ # Mock Modal
974
+ mock_executor.return_value.execute.return_value = {
975
+ "stdout": "SUPPORTED\np-value: 0.01",
976
+ "stderr": "",
977
+ "success": True,
978
+ }
979
+
980
+ result = await analyzer.analyze("test query", sample_evidence)
981
+
982
+ assert isinstance(result, AnalysisResult)
983
+ assert result.verdict == "SUPPORTED"
984
+
985
+ def test_singleton(self) -> None:
986
+ """get_statistical_analyzer should return singleton."""
987
+ a1 = get_statistical_analyzer()
988
+ a2 = get_statistical_analyzer()
989
+ assert a1 is a2
990
+
991
+
992
+ class TestAnalysisResult:
993
+ """Tests for AnalysisResult model."""
994
+
995
+ def test_verdict_values(self) -> None:
996
+ """Verdict should be one of the expected values."""
997
+ for verdict in ["SUPPORTED", "REFUTED", "INCONCLUSIVE"]:
998
+ result = AnalysisResult(
999
+ verdict=verdict,
1000
+ confidence=0.8,
1001
+ statistical_evidence="test",
1002
+ code_generated="print('test')",
1003
+ execution_output="test",
1004
+ )
1005
+ assert result.verdict == verdict
1006
+
1007
+ def test_confidence_bounds(self) -> None:
1008
+ """Confidence must be 0.0-1.0."""
1009
+ with pytest.raises(ValueError):
1010
+ AnalysisResult(
1011
+ verdict="SUPPORTED",
1012
+ confidence=1.5, # Invalid
1013
+ statistical_evidence="test",
1014
+ code_generated="test",
1015
+ execution_output="test",
1016
+ )
1017
+ ```
1018
+
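+ For reference, a sketch of the `AnalysisResult` model these tests exercise (field set inferred from its usage throughout this spec; the authoritative definition lives in `src/services/statistical_analyzer.py`):
+ 
+ ```python
+ from pydantic import BaseModel, Field
+ 
+ 
+ class AnalysisResult(BaseModel):
+     """Inferred sketch - not the canonical definition."""
+ 
+     verdict: str  # "SUPPORTED" | "REFUTED" | "INCONCLUSIVE"
+     confidence: float = Field(ge=0.0, le=1.0)  # bounds checked by test_confidence_bounds
+     statistical_evidence: str
+     code_generated: str
+     execution_output: str
+     key_findings: list[str] = Field(default_factory=list)
+ ```
+ 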
1019
+ ### 6.2 Integration Test (`tests/integration/test_modal.py`)
1020
+
1021
+ ```python
1022
+ """Integration tests for Modal (requires credentials)."""
1023
+
1024
+ import pytest
1025
+
1026
+ from src.utils.config import settings
1027
+
1028
+
1029
+ @pytest.mark.integration
1030
+ @pytest.mark.skipif(not settings.modal_available, reason="Modal not configured")
1031
+ class TestModalIntegration:
1032
+ """Integration tests requiring Modal credentials."""
1033
+
1034
+ @pytest.mark.asyncio
1035
+ async def test_sandbox_executes_code(self) -> None:
1036
+ """Modal sandbox should execute Python code."""
1037
+ import asyncio
1038
+ from functools import partial
1039
+
1040
+ from src.tools.code_execution import get_code_executor
1041
+
1042
+ executor = get_code_executor()
1043
+ code = "import pandas as pd; print(pd.DataFrame({'a': [1,2,3]})['a'].sum())"
1044
+
1045
+ loop = asyncio.get_running_loop()
1046
+ result = await loop.run_in_executor(
1047
+ None, partial(executor.execute, code, timeout=30)
1048
+ )
1049
+
1050
+ assert result["success"]
1051
+ assert "6" in result["stdout"]
1052
+
1053
+ @pytest.mark.asyncio
1054
+ async def test_statistical_analyzer_works(self) -> None:
1055
+ """StatisticalAnalyzer should work end-to-end."""
1056
+ from src.services.statistical_analyzer import get_statistical_analyzer
1057
+ from src.utils.models import Citation, Evidence
1058
+
1059
+ evidence = [
1060
+ Evidence(
1061
+ content="Drug shows 40% improvement in trial.",
1062
+ citation=Citation(
1063
+ source="pubmed",
1064
+ title="Test",
1065
+ url="https://test.com",
1066
+ date="2024-01-01",
1067
+ authors=["Test"],
1068
+ ),
1069
+ relevance=0.9,
1070
+ )
1071
+ ]
1072
+
1073
+ analyzer = get_statistical_analyzer()
1074
+ result = await analyzer.analyze("test drug efficacy", evidence)
1075
+
1076
+ assert result.verdict in ["SUPPORTED", "REFUTED", "INCONCLUSIVE"]
1077
+ assert 0.0 <= result.confidence <= 1.0
1078
+ ```
1079
+
1080
+ ---
1081
+
1082
+ ## 7. Verification Commands
1083
+
1084
+ ```bash
1085
+ # 1. Verify NO agent_framework in StatisticalAnalyzer
1086
+ grep -r "agent_framework" src/services/statistical_analyzer.py
1087
+ # Should return nothing!
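+ # Caveat: grep exits non-zero when nothing matches, so if you wire this
+ # check into a script running under `set -e`, invert it:
+ # ! grep -r "agent_framework" src/services/statistical_analyzer.py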
1088
+
1089
+ # 2. Run unit tests (no Modal needed)
1090
+ uv run pytest tests/unit/services/test_statistical_analyzer.py -v
1091
+
1092
+ # 3. Run verification script (requires Modal)
1093
+ uv run python examples/modal_demo/verify_sandbox.py
1094
+
1095
+ # 4. Run analysis demo (requires Modal + LLM)
1096
+ uv run python examples/modal_demo/run_analysis.py "metformin alzheimer"
1097
+
1098
+ # 5. Run integration tests
1099
+ uv run pytest tests/integration/test_modal.py -v -m integration
1100
+
1101
+ # 6. Full test suite
1102
+ make check
1103
+ ```
1104
+
1105
+ ---
1106
+
1107
+ ## 8. Definition of Done
1108
+
1109
+ Phase 13 is **COMPLETE** when:
1110
+
1111
+ - [ ] `src/services/statistical_analyzer.py` created (NO agent_framework)
1112
+ - [ ] `src/utils/config.py` has `enable_modal_analysis` setting
1113
+ - [ ] `src/orchestrator.py` uses `StatisticalAnalyzer` directly
1114
+ - [ ] `src/agents/analysis_agent.py` refactored to wrap `StatisticalAnalyzer`
1115
+ - [ ] `src/mcp_tools.py` has `analyze_hypothesis` tool
1116
+ - [ ] `examples/modal_demo/verify_sandbox.py` working
1117
+ - [ ] `examples/modal_demo/run_analysis.py` working
1118
+ - [ ] Unit tests pass WITHOUT magentic extra installed
1119
+ - [ ] Integration tests pass WITH Modal credentials
1120
+ - [ ] All lints pass
1121
+
1122
+ ---
1123
+
1124
+ ## 9. Architecture After Phase 13
1125
+
1126
+ ```text
1127
+ ┌─────────────────────────────────────────────────────────────────┐
1128
+ │ MCP Clients │
1129
+ │ (Claude Desktop, Cursor, etc.) │
1130
+ └───────────────────────────┬─────────────────────────────────────┘
1131
+ │ MCP Protocol
1132
+
1133
+ ┌─────────────────────────────────────────────────────────────────┐
1134
+ │ Gradio App + MCP Server │
1135
+ │ ┌──────────────────────────────────────────────────────────┐ │
1136
+ │ │ MCP Tools: search_pubmed, search_trials, search_biorxiv │ │
1137
+ │ │ search_all, analyze_hypothesis │ │
1138
+ │ └──────────────────────────────────────────────────────────┘ │
1139
+ └───────────────────────────┬─────────────────────────────────────┘
1140
+
1141
+ ┌───────────────────┴───────────────────┐
1142
+ │ │
1143
+ ▼ ▼
1144
+ ┌───────────────────────┐ ┌───────────────────────────┐
1145
+ │ Simple Orchestrator │ │ Magentic Orchestrator │
1146
+ │ (no agent_framework) │ │ (with agent_framework) │
1147
+ │ │ │ │
1148
+ │ SearchHandler │ │ SearchAgent │
1149
+ │ JudgeHandler │ │ JudgeAgent │
1150
+ │ StatisticalAnalyzer ─┼────────────┼→ AnalysisAgent ───────────┤
1151
+ │ │ │ (wraps StatisticalAnalyzer)
1152
+ └───────────┬───────────┘ └───────────────────────────┘
1153
+
1154
+
1155
+ ┌──────────────────────────────────────────────────────────────────┐
1156
+ │ StatisticalAnalyzer │
1157
+ │ (src/services/statistical_analyzer.py) │
1158
+ │ NO agent_framework dependency │
1159
+ │ │
1160
+ │ 1. Generate code with pydantic-ai │
1161
+ │ 2. Execute in Modal sandbox │
1162
+ │ 3. Return AnalysisResult │
1163
+ └───────────────────────────┬──────────────────────────────────────┘
1164
+
1165
+
1166
+ ┌─────────────────────────────────────────────────────────────────┐
1167
+ │ Modal Sandbox │
1168
+ │ ┌─────────────────────────────────────────────────────────┐ │
1169
+ │ │ - pandas, numpy, scipy, sklearn, statsmodels │ │
1170
+ │ │ - Network: BLOCKED │ │
1171
+ │ │ - Filesystem: ISOLATED │ │
1172
+ │ │ - Timeout: ENFORCED │ │
1173
+ │ └─────────────────────────────────────────────────────────┘ │
1174
+ └─────────────────────────────────────────────────────────────────┘
1175
+ ```
1176
+
1177
+ **This is the dependency-safe Modal stack.**
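+ 
+ A condensed sketch of that three-step flow (the helper names `_generate_analysis_code` and `_parse_result` are assumptions for illustration; the real implementation lives in `src/services/statistical_analyzer.py`):
+ 
+ ```python
+ import asyncio
+ from functools import partial
+ 
+ from src.tools.code_execution import get_code_executor
+ 
+ 
+ class StatisticalAnalyzer:  # sketch only
+     async def analyze(self, query: str, evidence: list) -> "AnalysisResult":
+         code = await self._generate_analysis_code(query, evidence)  # 1. pydantic-ai codegen
+         loop = asyncio.get_running_loop()
+         output = await loop.run_in_executor(  # 2. run the sync Modal executor off the event loop
+             None, partial(get_code_executor().execute, code)
+         )
+         return self._parse_result(code, output)  # 3. structured AnalysisResult
+ ```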
1178
+
1179
+ ---
1180
+
1181
+ ## 10. Files Summary
1182
+
1183
+ | File | Action | Purpose |
1184
+ |------|--------|---------|
1185
+ | `src/services/statistical_analyzer.py` | **CREATE** | Core analysis (no agent_framework) |
1186
+ | `src/utils/config.py` | MODIFY | Add `enable_modal_analysis` |
1187
+ | `src/orchestrator.py` | MODIFY | Use `StatisticalAnalyzer` |
1188
+ | `src/agents/analysis_agent.py` | MODIFY | Wrap `StatisticalAnalyzer` |
1189
+ | `src/mcp_tools.py` | MODIFY | Add `analyze_hypothesis` |
1190
+ | `examples/modal_demo/verify_sandbox.py` | CREATE | Sandbox verification |
1191
+ | `examples/modal_demo/run_analysis.py` | CREATE | Demo script |
1192
+ | `tests/unit/services/test_statistical_analyzer.py` | CREATE | Unit tests |
1193
+ | `tests/integration/test_modal.py` | CREATE | Integration tests |
1194
+
1195
+ **Key Fix**: `StatisticalAnalyzer` has ZERO agent_framework imports, making it safe for the simple orchestrator.
docs/implementation/14_phase_demo_submission.md ADDED
@@ -0,0 +1,464 @@
1
+ # Phase 14 Implementation Spec: Demo Video & Hackathon Submission
2
+
3
+ **Goal**: Create compelling demo video and complete hackathon submission.
4
+ **Philosophy**: "Ship it with style."
5
+ **Prerequisite**: Phases 12-13 complete (MCP + Modal working)
6
+ **Priority**: P0 - REQUIRED FOR SUBMISSION
7
+ **Deadline**: November 30, 2025 11:59 PM UTC
8
+ **Estimated Time**: 2-3 hours
9
+
10
+ ---
11
+
12
+ ## 1. Submission Requirements
13
+
14
+ ### MCP's 1st Birthday Hackathon Checklist
15
+
16
+ | Requirement | Status | Action |
17
+ |-------------|--------|--------|
18
+ | HuggingFace Space in `MCP-1st-Birthday` org | Pending | Transfer or create |
19
+ | Track tag in README.md | Pending | Add tag |
20
+ | Social media post link | Pending | Create post |
21
+ | Demo video (1-5 min) | Pending | Record |
22
+ | Team members registered | Pending | Verify |
23
+ | Original work (Nov 14-30) | **DONE** | All commits in range |
24
+
25
+ ### Track 2: MCP in Action - Tags
26
+
27
+ ```yaml
28
+ # Add to HuggingFace Space README.md
29
+ tags:
30
+ - mcp-in-action-track-enterprise # Healthcare/enterprise focus
31
+ ```
32
+
33
+ ---
34
+
35
+ ## 2. Prize Eligibility Summary
36
+
37
+ ### After Phases 12-13
38
+
39
+ | Award | Amount | Eligible | Requirements Met |
40
+ |-------|--------|----------|------------------|
41
+ | Track 2: MCP in Action (1st) | $2,500 | **YES** | MCP server working |
42
+ | Modal Innovation | $2,500 | **YES** | Sandbox demo ready |
43
+ | LlamaIndex | $1,000 | **YES** | Using RAG |
44
+ | Community Choice | $1,000 | Possible | Need great demo |
45
+ | **Total Potential** | **$7,000** | | |
46
+
47
+ ---
48
+
49
+ ## 3. Demo Video Specification
50
+
51
+ ### 3.1 Duration & Format
52
+
53
+ - **Length**: 3-4 minutes (sweet spot)
54
+ - **Format**: Screen recording + voice-over
55
+ - **Resolution**: 1080p minimum
56
+ - **Audio**: Clear narration, no background music
57
+
58
+ ### 3.2 Recommended Tools
59
+
60
+ | Tool | Purpose | Notes |
61
+ |------|---------|-------|
62
+ | OBS Studio | Screen recording | Free, cross-platform |
63
+ | Loom | Quick recording | Good for demos |
64
+ | QuickTime | Mac screen recording | Built-in |
65
+ | DaVinci Resolve | Editing | Free, professional |
66
+
67
+ ### 3.3 Demo Script (4 minutes)
68
+
69
+ ```markdown
70
+ ## Section 1: Hook (30 seconds)
71
+
72
+ [Show Gradio UI]
73
+
74
+ "DeepCritical is an AI-powered drug repurposing research agent.
75
+ It searches peer-reviewed literature, clinical trials, and cutting-edge preprints
76
+ to find new uses for existing drugs."
77
+
78
+ "Let me show you how it works."
79
+
80
+ ---
81
+
82
+ ## Section 2: Core Functionality (60 seconds)
83
+
84
+ [Type query: "Can metformin treat Alzheimer's disease?"]
85
+
86
+ "When I ask about metformin for Alzheimer's, DeepCritical:
87
+ 1. Searches PubMed for peer-reviewed papers
88
+ 2. Queries ClinicalTrials.gov for active trials
89
+ 3. Scans bioRxiv for the latest preprints"
90
+
91
+ [Show search results streaming]
92
+
93
+ "It then uses an LLM to assess the evidence quality and
94
+ synthesize findings into a structured research report."
95
+
96
+ [Show final report]
97
+
98
+ ---
99
+
100
+ ## Section 3: MCP Integration (60 seconds)
101
+
102
+ [Switch to Claude Desktop]
103
+
104
+ "What makes DeepCritical unique is full MCP integration.
105
+ These same tools are available to any MCP client."
106
+
107
+ [Show Claude Desktop with DeepCritical tools]
108
+
109
+ "I can ask Claude: 'Search PubMed for aspirin cancer prevention'"
110
+
111
+ [Show results appearing in Claude Desktop]
112
+
113
+ "The agent uses our MCP server to search real biomedical databases."
114
+
115
+ [Show MCP Inspector briefly]
116
+
117
+ "Here's the MCP schema - four tools exposed for any AI to use."
118
+
119
+ ---
120
+
121
+ ## Section 4: Modal Innovation (45 seconds)
122
+
123
+ [Run verify_sandbox.py]
124
+
125
+ "For statistical analysis, we use Modal for secure code execution."
126
+
127
+ [Show sandbox verification output]
128
+
129
+ "Notice the hostname is NOT my machine - code runs in an isolated container.
130
+ Network is blocked. The AI can't reach the internet from the sandbox."
131
+
132
+ [Run analysis demo]
133
+
134
+ "Modal executes LLM-generated statistical code safely,
135
+ returning verdicts like SUPPORTED, REFUTED, or INCONCLUSIVE."
136
+
137
+ ---
138
+
139
+ ## Section 5: Close (45 seconds)
140
+
141
+ [Return to Gradio UI]
142
+
143
+ "DeepCritical brings together:
144
+ - Three biomedical data sources
145
+ - MCP protocol for universal tool access
146
+ - Modal sandboxes for safe code execution
147
+ - LlamaIndex for semantic search
148
+
149
+ All in a beautiful Gradio interface."
150
+
151
+ "Check out the code on GitHub, try it on HuggingFace Spaces,
152
+ and let us know what you think."
153
+
154
+ "Thanks for watching!"
155
+
156
+ [Show links: GitHub, HuggingFace, Team names]
157
+ ```
158
+
159
+ ---
160
+
161
+ ## 4. HuggingFace Space Configuration
162
+
163
+ ### 4.1 Space README.md
164
+
165
+ ```markdown
166
+ ---
167
+ title: DeepCritical
168
+ emoji: 🧬
169
+ colorFrom: blue
170
+ colorTo: purple
171
+ sdk: gradio
172
+ sdk_version: "5.0.0"
173
+ app_file: src/app.py
174
+ pinned: false
175
+ license: mit
176
+ tags:
177
+ - mcp-in-action-track-enterprise
178
+ - mcp-hackathon
179
+ - drug-repurposing
180
+ - biomedical-ai
181
+ - pydantic-ai
182
+ - llamaindex
183
+ - modal
184
+ ---
185
+
186
+ # DeepCritical
187
+
188
+ AI-Powered Drug Repurposing Research Agent
189
+
190
+ ## Features
191
+
192
+ - **Multi-Source Search**: PubMed, ClinicalTrials.gov, bioRxiv/medRxiv
193
+ - **MCP Integration**: Use our tools from Claude Desktop or any MCP client
194
+ - **Modal Sandbox**: Secure execution of AI-generated statistical code
195
+ - **LlamaIndex RAG**: Semantic search and evidence synthesis
196
+
197
+ ## MCP Tools
198
+
199
+ Connect to our MCP server at:
+ `https://MCP-1st-Birthday-deepcritical.hf.space/gradio_api/mcp/`
203
+
204
+ Available tools:
205
+ - `search_pubmed` - Search peer-reviewed biomedical literature
206
+ - `search_clinical_trials` - Search ClinicalTrials.gov
207
+ - `search_biorxiv` - Search bioRxiv/medRxiv preprints
208
+ - `search_all` - Search all sources simultaneously
209
+
210
+ ## Team
211
+
212
+ - The-Obstacle-Is-The-Way
213
+ - MarioAderman
214
+
215
+ ## Links
216
+
217
+ - [GitHub Repository](https://github.com/The-Obstacle-Is-The-Way/DeepCritical-1)
218
+ - [Demo Video](link-to-video)
219
+ ```
220
+
221
+ ### 4.2 Environment Variables (Secrets)
222
+
223
+ Set in HuggingFace Space settings:
224
+
225
+ ```
226
+ OPENAI_API_KEY=sk-...
227
+ ANTHROPIC_API_KEY=sk-ant-...
228
+ NCBI_API_KEY=...
229
+ MODAL_TOKEN_ID=...
230
+ MODAL_TOKEN_SECRET=...
231
+ ```
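+ 
+ These secrets feed the settings object in `src/utils/config.py`. A minimal sketch of the Modal-related fields (the exact shape is an assumption - consult the real config module):
+ 
+ ```python
+ from pydantic_settings import BaseSettings
+ 
+ 
+ class Settings(BaseSettings):
+     modal_token_id: str | None = None
+     modal_token_secret: str | None = None
+ 
+     @property
+     def modal_available(self) -> bool:
+         # Both Modal credentials must be present for sandbox execution
+         return bool(self.modal_token_id and self.modal_token_secret)
+ ```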
232
+
233
+ ---
234
+
235
+ ## 5. Social Media Post
236
+
237
+ ### Twitter/X Template
238
+
239
+ ```
240
+ 🧬 Excited to submit DeepCritical to MCP's 1st Birthday Hackathon!
241
+
242
+ An AI agent that:
243
+ ✅ Searches PubMed, ClinicalTrials.gov & bioRxiv
244
+ ✅ Exposes tools via MCP protocol
245
+ ✅ Runs statistical code in Modal sandboxes
246
+ ✅ Uses LlamaIndex for semantic search
247
+
248
+ Try it: [HuggingFace link]
249
+ Demo: [Video link]
250
+
251
+ #MCPHackathon #AIAgents #DrugRepurposing @huggingface @AnthropicAI
252
+ ```
253
+
254
+ ### LinkedIn Template
255
+
256
+ ```
257
+ Thrilled to share DeepCritical - our submission to MCP's 1st Birthday Hackathon!
258
+
259
+ 🔬 What it does:
260
+ DeepCritical is an AI-powered drug repurposing research agent that searches
261
+ peer-reviewed literature, clinical trials, and preprints to find new uses
262
+ for existing drugs.
263
+
264
+ 🛠️ Technical highlights:
265
+ • Full MCP integration - tools work with Claude Desktop
266
+ • Modal sandboxes for secure AI-generated code execution
267
+ • LlamaIndex RAG for semantic evidence search
268
+ • Three biomedical data sources in parallel
269
+
270
+ Built with PydanticAI, Gradio, and deployed on HuggingFace Spaces.
271
+
272
+ Try it: [link]
273
+ Watch the demo: [link]
274
+
275
+ #ArtificialIntelligence #Healthcare #DrugDiscovery #MCP #Hackathon
276
+ ```
277
+
278
+ ---
279
+
280
+ ## 6. Pre-Submission Checklist
281
+
282
+ ### 6.1 Code Quality
283
+
284
+ ```bash
285
+ # Run all checks
286
+ make check
287
+
288
+ # Expected output:
289
+ # ✅ Linting passed (ruff)
290
+ # ✅ Type checking passed (mypy)
291
+ # ✅ All 80+ tests passed (pytest)
292
+ ```
293
+
294
+ ### 6.2 Documentation
295
+
296
+ - [ ] README.md updated with MCP instructions
297
+ - [ ] All demo scripts have docstrings
298
+ - [ ] Example files work end-to-end
299
+ - [ ] CLAUDE.md is current
300
+
301
+ ### 6.3 Deployment Verification
302
+
303
+ ```bash
304
+ # Test locally
305
+ uv run python src/app.py
306
+ # Visit http://localhost:7860
307
+
308
+ # Test MCP schema
309
+ curl http://localhost:7860/gradio_api/mcp/schema
310
+
311
+ # Test Modal (if configured)
312
+ uv run python examples/modal_demo/verify_sandbox.py
313
+ ```
314
+
315
+ ### 6.4 HuggingFace Space
316
+
317
+ - [ ] Space created in `MCP-1st-Birthday` organization
318
+ - [ ] Secrets configured (API keys)
319
+ - [ ] App starts without errors
320
+ - [ ] MCP endpoint accessible
321
+ - [ ] Track tag in README
322
+
323
+ ---
324
+
325
+ ## 7. Recording Checklist
326
+
327
+ ### Before Recording
328
+
329
+ - [ ] Close unnecessary apps/notifications
330
+ - [ ] Clear browser history/tabs
331
+ - [ ] Test all demos work
332
+ - [ ] Prepare terminal windows
333
+ - [ ] Write down talking points
334
+
335
+ ### During Recording
336
+
337
+ - [ ] Speak clearly and at moderate pace
338
+ - [ ] Pause briefly between sections
339
+ - [ ] Optionally show your face on camera (adds personality)
340
+ - [ ] Don't rush - 3-4 min is enough time
341
+
342
+ ### After Recording
343
+
344
+ - [ ] Watch playback for errors
345
+ - [ ] Trim dead air at start/end
346
+ - [ ] Add title/end cards
347
+ - [ ] Export at 1080p
348
+ - [ ] Upload to YouTube/Loom
349
+
350
+ ---
351
+
352
+ ## 8. Submission Steps
353
+
354
+ ### Step 1: Finalize Code
355
+
356
+ ```bash
357
+ # Ensure clean state
358
+ git status
359
+ make check
360
+
361
+ # Push to GitHub
362
+ git push origin main
363
+
364
+ # Sync to HuggingFace
365
+ git push huggingface-upstream main
366
+ ```
367
+
368
+ ### Step 2: Verify HuggingFace Space
369
+
370
+ 1. Visit Space URL
371
+ 2. Test the chat interface
372
+ 3. Test MCP endpoint: `/gradio_api/mcp/schema`
373
+ 4. Verify README has track tag
374
+
375
+ ### Step 3: Record Demo Video
376
+
377
+ 1. Follow script from Section 3.3
378
+ 2. Edit and export
379
+ 3. Upload to YouTube (unlisted) or Loom
380
+ 4. Copy shareable link
381
+
382
+ ### Step 4: Create Social Post
383
+
384
+ 1. Write post (see templates)
385
+ 2. Include video link
386
+ 3. Tag relevant accounts
387
+ 4. Post and copy link
388
+
389
+ ### Step 5: Submit
390
+
391
+ 1. Ensure Space is in `MCP-1st-Birthday` org
392
+ 2. Verify track tag in README
393
+ 3. Submit entry (check hackathon page for form)
394
+ 4. Include all links
395
+
396
+ ---
397
+
398
+ ## 9. Verification Commands
399
+
400
+ ```bash
401
+ # 1. Full test suite
402
+ make check
403
+
404
+ # 2. Start local server
405
+ uv run python src/app.py
406
+
407
+ # 3. Verify MCP works
408
+ curl http://localhost:7860/gradio_api/mcp/schema | jq
409
+
410
+ # 4. Test with MCP Inspector
411
+ npx @modelcontextprotocol/inspector http://localhost:7860/gradio_api/mcp/
412
+
413
+ # 5. Run Modal verification
414
+ uv run python examples/modal_demo/verify_sandbox.py
415
+
416
+ # 6. Run full demo
417
+ uv run python examples/orchestrator_demo/run_agent.py "metformin alzheimer"
418
+ ```
419
+
420
+ ---
421
+
422
+ ## 10. Definition of Done
423
+
424
+ Phase 14 is **COMPLETE** when:
425
+
426
+ - [ ] Demo video recorded (3-4 min)
427
+ - [ ] Video uploaded (YouTube/Loom)
428
+ - [ ] Social media post created with link
429
+ - [ ] HuggingFace Space in `MCP-1st-Birthday` org
430
+ - [ ] Track tag in Space README
431
+ - [ ] All team members registered
432
+ - [ ] Entry submitted before deadline
433
+ - [ ] Confirmation received
434
+
435
+ ---
436
+
437
+ ## 11. Timeline
438
+
439
+ | Task | Time | Deadline |
440
+ |------|------|----------|
441
+ | Phase 12: MCP Server | 2-3 hours | Nov 28 |
442
+ | Phase 13: Modal Integration | 2-3 hours | Nov 29 |
443
+ | Phase 14: Demo & Submit | 2-3 hours | Nov 30 |
444
+ | **Buffer** | ~24 hours | Before 11:59 PM UTC |
445
+
446
+ ---
447
+
448
+ ## 12. Contact & Support
449
+
450
+ ### Hackathon Resources
451
+
452
+ - Discord: `#agents-mcp-hackathon-winter25`
453
+ - HuggingFace: [MCP-1st-Birthday org](https://huggingface.co/MCP-1st-Birthday)
454
+ - MCP Docs: [modelcontextprotocol.io](https://modelcontextprotocol.io/)
455
+
456
+ ### Team Communication
457
+
458
+ - Coordinate on final review
459
+ - Agree on who submits
460
+ - Celebrate when done! 🎉
461
+
462
+ ---
463
+
464
+ **Good luck! Ship it with confidence.**
docs/implementation/roadmap.md CHANGED
@@ -41,7 +41,9 @@ src/
41
  ├── tools/ # Search tools
42
  │ ├── __init__.py
43
  │ ├── pubmed.py # PubMed E-utilities tool
44
- │ ├── websearch.py # DuckDuckGo search tool
 
 
45
  │ └── search_handler.py # Orchestrates multiple tools
46
  ├── prompts/ # Prompt templates
47
  │ ├── __init__.py
@@ -61,7 +63,8 @@ tests/
61
  ├── unit/
62
  │ ├── tools/
63
  │ │ ├── test_pubmed.py
64
- │ │ ├── test_websearch.py
 
65
  │ │ └── test_search_handler.py
66
  │ ├── agent_factory/
67
  │ │ └── test_judges.py
@@ -183,6 +186,8 @@ Structured Research Report
183
 
184
  ## Spec Documents
185
 
 
 
186
  1. **[Phase 1 Spec: Foundation](01_phase_foundation.md)** ✅
187
  2. **[Phase 2 Spec: Search Slice](02_phase_search.md)** ✅
188
  3. **[Phase 3 Spec: Judge Slice](03_phase_judge.md)** ✅
@@ -191,9 +196,18 @@ Structured Research Report
191
  6. **[Phase 6 Spec: Embeddings & Semantic Search](06_phase_embeddings.md)** ✅
192
  7. **[Phase 7 Spec: Hypothesis Agent](07_phase_hypothesis.md)** ✅
193
  8. **[Phase 8 Spec: Report Agent](08_phase_report.md)** ✅
194
- 9. **[Phase 9 Spec: Remove DuckDuckGo](09_phase_source_cleanup.md)** 📝
195
- 10. **[Phase 10 Spec: ClinicalTrials.gov](10_phase_clinicaltrials.md)** 📝
196
- 11. **[Phase 11 Spec: bioRxiv Preprints](11_phase_biorxiv.md)** 📝
197
 
198
  ---
199
 
@@ -209,8 +223,25 @@ Structured Research Report
209
  | Phase 6: Embeddings | ✅ COMPLETE | Semantic search + ChromaDB |
210
  | Phase 7: Hypothesis | ✅ COMPLETE | Mechanistic reasoning chains |
211
  | Phase 8: Report | ✅ COMPLETE | Structured scientific reports |
212
- | Phase 9: Source Cleanup | 📝 SPEC READY | Remove DuckDuckGo |
213
- | Phase 10: ClinicalTrials | 📝 SPEC READY | ClinicalTrials.gov API |
214
- | Phase 11: bioRxiv | 📝 SPEC READY | Preprint search |
215
 
216
- *Phases 1-8 COMPLETE. Phases 9-11 will add multi-source credibility.*
 
41
  ├── tools/ # Search tools
42
  │ ├── __init__.py
43
  │ ├── pubmed.py # PubMed E-utilities tool
44
+ │ ├── clinicaltrials.py # ClinicalTrials.gov API
45
+ │ ├── biorxiv.py # bioRxiv/medRxiv preprints
46
+ │ ├── code_execution.py # Modal sandbox execution
47
  │ └── search_handler.py # Orchestrates multiple tools
48
  ├── prompts/ # Prompt templates
49
  │ ├── __init__.py
 
63
  ├── unit/
64
  │ ├── tools/
65
  │ │ ├── test_pubmed.py
66
+ │ │ ├── test_clinicaltrials.py
67
+ │ │ ├── test_biorxiv.py
68
  │ │ └── test_search_handler.py
69
  │ ├── agent_factory/
70
  │ │ └── test_judges.py
 
186
 
187
  ## Spec Documents
188
 
189
+ ### Core Platform (Phases 1-8)
190
+
191
  1. **[Phase 1 Spec: Foundation](01_phase_foundation.md)** ✅
192
  2. **[Phase 2 Spec: Search Slice](02_phase_search.md)** ✅
193
  3. **[Phase 3 Spec: Judge Slice](03_phase_judge.md)** ✅
 
196
  6. **[Phase 6 Spec: Embeddings & Semantic Search](06_phase_embeddings.md)** ✅
197
  7. **[Phase 7 Spec: Hypothesis Agent](07_phase_hypothesis.md)** ✅
198
  8. **[Phase 8 Spec: Report Agent](08_phase_report.md)** ✅
199
+
200
+ ### Multi-Source Search (Phases 9-11)
201
+
202
+ 9. **[Phase 9 Spec: Remove DuckDuckGo](09_phase_source_cleanup.md)** ✅
203
+ 10. **[Phase 10 Spec: ClinicalTrials.gov](10_phase_clinicaltrials.md)** ✅
204
+ 11. **[Phase 11 Spec: bioRxiv Preprints](11_phase_biorxiv.md)** ✅
205
+
206
+ ### Hackathon Integration (Phases 12-14)
207
+
208
+ 12. **[Phase 12 Spec: MCP Server](12_phase_mcp_server.md)** ✅ COMPLETE
209
+ 13. **[Phase 13 Spec: Modal Pipeline](13_phase_modal_integration.md)** 📝 P1 - $2,500
210
+ 14. **[Phase 14 Spec: Demo & Submission](14_phase_demo_submission.md)** 📝 P0 - REQUIRED
211
 
212
  ---
213
 
 
223
  | Phase 6: Embeddings | ✅ COMPLETE | Semantic search + ChromaDB |
224
  | Phase 7: Hypothesis | ✅ COMPLETE | Mechanistic reasoning chains |
225
  | Phase 8: Report | ✅ COMPLETE | Structured scientific reports |
226
+ | Phase 9: Source Cleanup | ✅ COMPLETE | Remove DuckDuckGo |
+ | Phase 10: ClinicalTrials | ✅ COMPLETE | ClinicalTrials.gov API |
+ | Phase 11: bioRxiv | ✅ COMPLETE | Preprint search |
229
+ | Phase 12: MCP Server | ✅ COMPLETE | MCP protocol integration |
230
+ | Phase 13: Modal Pipeline | 📝 SPEC READY | Sandboxed code execution |
231
+ | Phase 14: Demo & Submit | 📝 SPEC READY | Hackathon submission |
232
+
233
+ *Phases 1-12 COMPLETE. Phases 13-14 for hackathon prizes.*
234
+
235
+ ---
236
+
237
+ ## Hackathon Prize Potential
238
+
239
+ | Award | Amount | Requirement | Phase |
240
+ |-------|--------|-------------|-------|
241
+ | Track 2: MCP in Action (1st) | $2,500 | MCP server working | 12 |
242
+ | Modal Innovation | $2,500 | Sandbox demo ready | 13 |
243
+ | LlamaIndex | $1,000 | Using RAG | ✅ Done |
244
+ | Community Choice | $1,000 | Great demo video | 14 |
245
+ | **Total Potential** | **$7,000** | | |
246
 
247
+ **Deadline: November 30, 2025 11:59 PM UTC**
docs/pending/00_priority_summary.md ADDED
@@ -0,0 +1,111 @@
1
+ # DeepCritical Hackathon Priority Summary
2
+
3
+ ## 4 Days Left (Deadline: Nov 30, 2025 11:59 PM UTC)
4
+
5
+ ---
6
+
7
+ ## Git Contribution Analysis
8
+
9
+ ```text
10
+ The-Obstacle-Is-The-Way: 20+ commits (Phases 1-11, all demos, all fixes)
11
+ MarioAderman: 3 commits (Modal, LlamaIndex, PubMed fix)
12
+ JJ (Maintainer): 0 code commits (merge button only)
13
+ ```
14
+
15
+ **Conclusion:** You built 90%+ of this codebase.
16
+
17
+ ---
18
+
19
+ ## Current Stack (What We Have)
20
+
21
+ | Component | Status | Files |
22
+ |-----------|--------|-------|
23
+ | PubMed Search | ✅ Working | `src/tools/pubmed.py` |
24
+ | ClinicalTrials Search | ✅ Working | `src/tools/clinicaltrials.py` |
25
+ | bioRxiv Search | ✅ Working | `src/tools/biorxiv.py` |
26
+ | Search Handler | ✅ Working | `src/tools/search_handler.py` |
27
+ | Embeddings/ChromaDB | ✅ Working | `src/services/embeddings.py` |
28
+ | LlamaIndex RAG | ✅ Working | `src/services/llamaindex_rag.py` |
29
+ | Hypothesis Agent | ✅ Working | `src/agents/hypothesis_agent.py` |
30
+ | Report Agent | ✅ Working | `src/agents/report_agent.py` |
31
+ | Judge Agent | ✅ Working | `src/agents/judge_agent.py` |
32
+ | Orchestrator | ✅ Working | `src/orchestrator.py` |
33
+ | Gradio UI | ✅ Working | `src/app.py` |
34
+ | Modal Code Execution | ⚠️ Built, not wired | `src/tools/code_execution.py` |
35
+ | **MCP Server** | ✅ **Working** | `src/mcp_tools.py`, `src/app.py` |
36
+
37
+ ---
38
+
39
+ ## What's Required for Track 2 (MCP in Action)
40
+
41
+ | Requirement | Have It? | Priority |
42
+ |-------------|----------|----------|
43
+ | Autonomous agent behavior | ✅ Yes | - |
44
+ | Must use MCP servers as tools | ✅ **YES** | Done (Phase 12) |
45
+ | Must be Gradio app | ✅ Yes | - |
46
+ | Planning/reasoning/execution | ✅ Yes | - |
47
+
48
+ **Bottom Line:** ✅ MCP server implemented in Phase 12. Track 2 compliant.
49
+
50
+ ---
51
+
52
+ ## 3 Things To Do (In Order)
53
+
54
+ ### 1. MCP Server (P0 - Required) ✅ DONE
55
+
56
+ - **Files:** `src/mcp_tools.py`, `src/app.py`
57
+ - **Status:** Implemented in Phase 12
58
+ - **Doc:** `02_mcp_server_integration.md`
59
+ - **Endpoint:** `/gradio_api/mcp/`
60
+
61
+ ### 2. Modal Wiring (P1 - $2,500 Prize)
62
+ - **File:** Update `src/agents/analysis_agent.py`
63
+ - **Time:** 2-3 hours
64
+ - **Doc:** `03_modal_integration.md`
65
+ - **Why:** Modal Innovation Award is $2,500
66
+
67
+ ### 3. Demo Video + Submission (P0 - Required)
68
+ - **Time:** 1-2 hours
69
+ - **Why:** Required for all submissions
70
+
71
+ ---
72
+
73
+ ## Submission Checklist
74
+
75
+ - [ ] Space in MCP-1st-Birthday org
76
+ - [ ] Tag: `mcp-in-action-track-enterprise`
77
+ - [ ] Social media post link
78
+ - [ ] Demo video (1-5 min)
79
+ - [ ] MCP server working
80
+ - [ ] All tests passing
81
+
82
+ ---
83
+
84
+ ## Prize Math
85
+
86
+ | Award | Amount | Eligible? |
87
+ |-------|--------|-----------|
88
+ | Track 2 1st Place | $2,500 | If MCP works |
89
+ | Modal Innovation | $2,500 | If Modal wired |
90
+ | LlamaIndex | $1,000 | Yes (have it) |
91
+ | Community Choice | $1,000 | Maybe |
92
+ | **Total Potential** | **$7,000** | With MCP + Modal |
93
+
94
+ ---
95
+
96
+ ## Next Actions
97
+
98
+ ```bash
99
+ # 1. MCP Server - DONE ✅
100
+ uv run python src/app.py # Starts Gradio with MCP at /gradio_api/mcp/
101
+
102
+ # 2. Test MCP works
103
+ curl http://localhost:7860/gradio_api/mcp/schema | jq
104
+
105
+ # 3. Wire Modal into pipeline
106
+ # (see 03_modal_integration.md)
107
+
108
+ # 4. Record demo video
109
+
110
+ # 5. Submit to MCP-1st-Birthday org
111
+ ```
docs/pending/01_hackathon_requirements.md ADDED
@@ -0,0 +1,99 @@
1
+ # MCP's 1st Birthday Hackathon - Requirements Analysis
2
+
3
+ > **✅ MCP Server implemented in Phase 12** - Track 2 compliant
4
+
5
+ ## Deadline: November 30, 2025 11:59 PM UTC
6
+
7
+ ---
8
+
9
+ ## Track Selection: MCP in Action (Track 2)
10
+
11
+ DeepCritical fits **Track 2: MCP in Action** - AI agent applications.
12
+
13
+ ### Required Tags (pick one)
14
+ ```yaml
15
+ tags:
16
+ - mcp-in-action-track-enterprise # Drug repurposing = enterprise/healthcare
17
+ # OR
18
+ - mcp-in-action-track-consumer # If targeting patients/consumers
19
+ ```
20
+
21
+ ### Track 2 Requirements
22
+
23
+ | Requirement | DeepCritical Status | Action Needed |
24
+ |-------------|---------------------|---------------|
25
+ | Autonomous Agent behavior | ✅ Have it | Search-Judge-Synthesize loop |
26
+ | Must use MCP servers as tools | ✅ **DONE** | `src/mcp_tools.py` |
27
+ | Must be a Gradio app | ✅ Have it | `src/app.py` |
28
+ | Planning, reasoning, execution | ✅ Have it | Orchestrator + Judge |
29
+ | Context Engineering / RAG | ✅ Have it | LlamaIndex + ChromaDB |
30
+
31
+ ---
32
+
33
+ ## Prize Opportunities
34
+
35
+ ### Current Eligibility vs With MCP Integration
36
+
37
+ | Award | Prize | Current | With MCP |
38
+ |-------|-------|---------|----------|
39
+ | MCP in Action (1st) | $2,500 | ✅ Eligible | ✅ STRONGER |
40
+ | Modal Innovation | $2,500 | ❌ Not using | ✅ ELIGIBLE (code execution) |
41
+ | Blaxel Choice | $2,500 | ❌ Not using | ⚠️ Could integrate |
42
+ | LlamaIndex | $1,000 | ✅ Using (Mario's code) | ✅ ELIGIBLE |
43
+ | Google Gemini | $10K credits | ❌ Not using | ⚠️ Could add |
44
+ | Community Choice | $1,000 | ⚠️ Possible | ✅ Better demo helps |
45
+ | **TOTAL POTENTIAL** | | ~$2,500 | **$8,500+** |
46
+
47
+ ---
48
+
49
+ ## Submission Checklist
50
+
51
+ - [ ] HuggingFace Space in `MCP-1st-Birthday` organization
52
+ - [ ] Track tags in Space README.md
53
+ - [ ] Social media post link (X, LinkedIn)
54
+ - [ ] Demo video (1-5 minutes)
55
+ - [ ] All team members registered
56
+ - [ ] Original work (Nov 14-30)
57
+
58
+ ---
59
+
60
+ ## Priority Integration Order
61
+
62
+ ### P0 - MUST HAVE (Required for Track 2)
63
+ 1. **MCP Server Wrapper** - Expose search tools as MCP servers
64
+ - See: `02_mcp_server_integration.md`
65
+
66
+ ### P1 - HIGH VALUE ($2,500 each)
67
+ 2. **Modal Integration** - Already have code, need to wire up
68
+ - See: `03_modal_integration.md`
69
+
70
+ ### P2 - NICE TO HAVE
71
+ 3. **Blaxel** - MCP hosting platform (if time permits)
72
+ 4. **Gemini API** - Add as LLM option for Google prize
73
+
74
+ ---
75
+
76
+ ## What MCP Actually Means for Us
77
+
78
+ MCP (Model Context Protocol) is Anthropic's standard for connecting AI to tools.
79
+
80
+ **Current state:**
81
+ - We have `PubMedTool`, `ClinicalTrialsTool`, `BioRxivTool`
82
+ - They're Python classes with `search()` methods
83
+
84
+ **What we need:**
85
+ - Wrap these as MCP servers
86
+ - So Claude Desktop, Cursor, or any MCP client can use them
87
+
88
+ **Why this matters:**
89
+ - Judges will test if our tools work with Claude Desktop
90
+ - No MCP = disqualified from Track 2
91
+
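+ To make the current state concrete, this is roughly how the tools are called directly today (a sketch; the actual class lives in `src/tools/pubmed.py`):
+ 
+ ```python
+ import asyncio
+ 
+ from src.tools.pubmed import PubMedTool
+ 
+ 
+ async def main() -> None:
+     # Plain Python call - no MCP involved yet
+     evidence = await PubMedTool().search("metformin alzheimer", max_results=5)
+     for item in evidence:
+         print(item.citation.title)
+ 
+ 
+ asyncio.run(main())
+ ```
+ 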
92
+ ---
93
+
94
+ ## Reference Links
95
+
96
+ - [Hackathon Page](https://huggingface.co/MCP-1st-Birthday)
97
+ - [MCP Documentation](https://modelcontextprotocol.io/)
98
+ - [Gradio MCP Guide](https://www.gradio.app/guides/building-mcp-server-with-gradio)
99
+ - [Discord: #agents-mcp-hackathon-winter25](https://discord.gg/huggingface)
docs/pending/02_mcp_server_integration.md ADDED
@@ -0,0 +1,177 @@
1
+ # MCP Server Integration
2
+
3
+ ## Priority: P0 - REQUIRED FOR TRACK 2
4
+
5
+ > **✅ STATUS: IMPLEMENTED** - See `src/mcp_tools.py` and `src/app.py`
6
+ > MCP endpoint: `/gradio_api/mcp/`
7
+
8
+ ---
9
+
10
+ ## What We Need
11
+
12
+ Expose our search tools as MCP servers so Claude Desktop/Cursor can use them.
13
+
14
+ ### Current Tools to Expose
15
+
16
+ | Tool | File | MCP Tool Name |
17
+ |------|------|---------------|
18
+ | PubMed Search | `src/tools/pubmed.py` | `search_pubmed` |
19
+ | ClinicalTrials Search | `src/tools/clinicaltrials.py` | `search_clinical_trials` |
20
+ | bioRxiv Search | `src/tools/biorxiv.py` | `search_biorxiv` |
21
+ | Combined Search | `src/tools/search_handler.py` | `search_all_sources` |
22
+
23
+ ---
24
+
25
+ ## Implementation Options
26
+
27
+ ### Option 1: Gradio MCP (Recommended)
28
+
29
+ Gradio 5.0+ can expose any Gradio app as an MCP server automatically.
30
+
31
+ ```python
32
+ # src/mcp_server.py
33
+ import gradio as gr
34
+ from src.tools.pubmed import PubMedTool
35
+ from src.tools.clinicaltrials import ClinicalTrialsTool
36
+ from src.tools.biorxiv import BioRxivTool
37
+
38
+ pubmed = PubMedTool()
39
+ trials = ClinicalTrialsTool()
40
+ biorxiv = BioRxivTool()
41
+
42
+ async def search_pubmed(query: str, max_results: int = 10) -> str:
43
+ """Search PubMed for biomedical literature."""
44
+ results = await pubmed.search(query, max_results)
45
+ return "\n\n".join([f"**{e.citation.title}**\n{e.content}" for e in results])
46
+
47
+ async def search_clinical_trials(query: str, max_results: int = 10) -> str:
48
+ """Search ClinicalTrials.gov for clinical trial data."""
49
+ results = await trials.search(query, max_results)
50
+ return "\n\n".join([f"**{e.citation.title}**\n{e.content}" for e in results])
51
+
52
+ async def search_biorxiv(query: str, max_results: int = 10) -> str:
53
+ """Search bioRxiv/medRxiv for preprints."""
54
+ results = await biorxiv.search(query, max_results)
55
+ return "\n\n".join([f"**{e.citation.title}**\n{e.content}" for e in results])
56
+
57
+ # Create one Gradio interface per tool; gr.Interface takes a single fn,
+ # so the three tools are grouped with gr.TabbedInterface.
+ demo = gr.TabbedInterface(
+     [
+         gr.Interface(
+             fn=search_pubmed,
+             inputs=[gr.Textbox(label="Query"), gr.Number(label="Max Results", value=10)],
+             outputs=gr.Textbox(label="Results"),
+         ),
+         gr.Interface(
+             fn=search_clinical_trials,
+             inputs=[gr.Textbox(label="Query"), gr.Number(label="Max Results", value=10)],
+             outputs=gr.Textbox(label="Results"),
+         ),
+         gr.Interface(
+             fn=search_biorxiv,
+             inputs=[gr.Textbox(label="Query"), gr.Number(label="Max Results", value=10)],
+             outputs=gr.Textbox(label="Results"),
+         ),
+     ],
+     tab_names=["PubMed", "Clinical Trials", "bioRxiv"],
+ )
63
+
64
+ # Launch as MCP server
65
+ if __name__ == "__main__":
66
+ demo.launch(mcp_server=True) # Gradio 5.0+ feature
67
+ ```
68
+
69
+ ### Option 2: Native MCP SDK
70
+
71
+ Use the official MCP Python SDK:
72
+
73
+ ```bash
74
+ uv add mcp
75
+ ```
76
+
77
+ ```python
78
+ # src/mcp_server.py
+ # The @tool() decorator pattern comes from FastMCP in the official MCP Python SDK.
+ from mcp.server.fastmcp import FastMCP
+ 
+ from src.tools.pubmed import PubMedTool
+ from src.tools.clinicaltrials import ClinicalTrialsTool
+ from src.tools.biorxiv import BioRxivTool
+ 
+ mcp = FastMCP("deepcritical-research")
+ 
+ @mcp.tool()
+ async def search_pubmed(query: str, max_results: int = 10) -> str:
+     """Search PubMed for biomedical literature on drug repurposing."""
+     tool = PubMedTool()
+     results = await tool.search(query, max_results)
+     return "\n\n".join(e.content for e in results)
+ 
+ @mcp.tool()
+ async def search_clinical_trials(query: str, max_results: int = 10) -> str:
+     """Search ClinicalTrials.gov for clinical trials."""
+     tool = ClinicalTrialsTool()
+     results = await tool.search(query, max_results)
+     return "\n\n".join(e.content for e in results)
+ 
+ @mcp.tool()
+ async def search_biorxiv(query: str, max_results: int = 10) -> str:
+     """Search bioRxiv/medRxiv for preprints (not peer-reviewed)."""
+     tool = BioRxivTool()
+     results = await tool.search(query, max_results)
+     return "\n\n".join(e.content for e in results)
+ 
+ if __name__ == "__main__":
+     mcp.run()
111
+ ```
112
+
113
+ ---
114
+
115
+ ## Claude Desktop Configuration
116
+
117
+ After implementing, users add to `claude_desktop_config.json`:
118
+
119
+ ```json
120
+ {
121
+ "mcpServers": {
122
+ "deepcritical": {
123
+ "command": "uv",
124
+ "args": ["run", "python", "src/mcp_server.py"],
125
+ "cwd": "/path/to/DeepCritical-1"
126
+ }
127
+ }
128
+ }
129
+ ```
130
+
131
+ ---
132
+
133
+ ## Testing MCP Server
134
+
135
+ 1. Start the MCP server (via Gradio app):
136
+
137
+ ```bash
138
+ uv run python src/app.py
139
+ ```
140
+
141
+ 2. Check MCP schema:
142
+
143
+ ```bash
144
+ curl http://localhost:7860/gradio_api/mcp/schema | jq
145
+ ```
146
+
147
+ 3. Test with MCP Inspector:
148
+
149
+ ```bash
150
+ npx @modelcontextprotocol/inspector http://localhost:7860/gradio_api/mcp/sse
151
+ ```
152
+
153
+ 4. Verify tools appear and work
154
+
155
+ ---
156
+
157
+ ## Demo Video Script
158
+
159
+ For the hackathon submission video:
160
+
161
+ 1. Show Claude Desktop with DeepCritical MCP tools
162
+ 2. Ask: "Search PubMed for metformin Alzheimer's"
163
+ 3. Show real results appearing
164
+ 4. Ask: "Now search clinical trials for the same"
165
+ 5. Show combined analysis
166
+
167
+ This proves MCP integration works.
168
+
169
+ ---
170
+
171
+ ## Files Created
172
+
173
+ - [x] `src/mcp_tools.py` - MCP tool wrapper functions
174
+ - [x] `src/app.py` - Gradio app with `mcp_server=True`
175
+ - [x] `tests/unit/test_mcp_tools.py` - Unit tests
176
+ - [x] `tests/integration/test_mcp_tools_live.py` - Integration tests
177
+ - [x] `README.md` - Updated with MCP usage instructions
docs/pending/03_modal_integration.md ADDED
@@ -0,0 +1,158 @@
1
+ # Modal Integration
2
+
3
+ ## Priority: P1 - HIGH VALUE ($2,500 Modal Innovation Award)
4
+
5
+ ---
6
+
7
+ ## What Modal Is For
8
+
9
+ Modal provides serverless GPU/CPU compute. For DeepCritical:
10
+
11
+ ### Current Use Case (Mario's Code)
12
+ - `src/tools/code_execution.py` - Run LLM-generated analysis code in sandboxes
13
+ - Scientific computing (pandas, scipy, numpy) in isolated containers
14
+
15
+ ### Potential Additional Use Cases
16
+
17
+ | Use Case | Benefit | Complexity |
18
+ |----------|---------|------------|
19
+ | Code Execution Sandbox | Run statistical analysis safely | ✅ Already built |
20
+ | LLM Inference | Run local models (no API costs) | Medium |
21
+ | Batch Processing | Process many papers in parallel | Medium |
22
+ | Embedding Generation | GPU-accelerated embeddings | Low |
23
+
24
+ ---
25
+
26
+ ## Current State
27
+
28
+ Mario implemented `src/tools/code_execution.py`:
29
+
30
+ ```python
31
+ # Already exists - ModalCodeExecutor
32
+ executor = get_code_executor()
33
+ result = executor.execute("""
34
+ import pandas as pd
35
+ import numpy as np
36
+ # LLM-generated statistical analysis
37
+ """)
38
+ ```
39
+
40
+ ### What's Missing
41
+
42
+ 1. **Not wired into the main pipeline** - The executor exists but isn't used
43
+ 2. **No Modal tokens configured** - Needs MODAL_TOKEN_ID/MODAL_TOKEN_SECRET
44
+ 3. **No demo showing it works** - Judges need to see it
45
+
46
+ ---
47
+
48
+ ## Integration Plan
49
+
50
+ ### Step 1: Wire Into Agent Pipeline
51
+
52
+ Add a `StatisticalAnalyzer` service that uses Modal:
53
+
54
+ ```python
55
+ # src/services/statistical_analyzer.py
56
+ import asyncio
+ 
+ from src.tools.code_execution import get_code_executor
+ from src.utils.models import Evidence
+ 
+ 
+ class StatisticalAnalyzer:
+     """Run statistical analysis on evidence in a Modal sandbox."""
+ 
+     async def analyze(self, evidence: list[Evidence], query: str) -> str:
+         # 1. LLM generates analysis code
+         code = await self._generate_analysis_code(evidence, query)
+ 
+         # 2. Execute in Modal sandbox (run the sync executor in a thread pool)
+         executor = get_code_executor()
+         loop = asyncio.get_running_loop()
+         result = await loop.run_in_executor(None, executor.execute, code)
+ 
+         # 3. Return results
+         return result["stdout"]
73
+ ```
74
+
75
+ ### Step 2: Add to Orchestrator
76
+
77
+ ```python
78
+ # In the simple orchestrator, after gathering evidence
+ # (use StatisticalAnalyzer directly - it has no agent_framework dependency):
+ if settings.enable_modal_analysis:
+     analyzer = StatisticalAnalyzer()
+     stats_results = await analyzer.analyze(evidence, query)
82
+ ```
83
+
84
+ ### Step 3: Create Demo
85
+
86
+ ```python
87
+ # examples/modal_demo/run_analysis.py
88
+ """Demo: Modal-powered statistical analysis of drug evidence."""
89
+
90
+ # Show:
91
+ # 1. Gather evidence from PubMed
92
+ # 2. Generate analysis code with LLM
93
+ # 3. Execute in Modal sandbox
94
+ # 4. Return statistical insights
95
+ ```
96
+
97
+ ---
98
+
99
+ ## Modal Setup
100
+
101
+ ### 1. Install Modal CLI
102
+ ```bash
103
+ pip install modal
104
+ modal setup # Authenticates with Modal
105
+ ```
106
+
107
+ ### 2. Set Environment Variables
108
+ ```bash
109
+ # In .env
110
+ MODAL_TOKEN_ID=your-token-id
111
+ MODAL_TOKEN_SECRET=your-token-secret
112
+ ```
113
+
114
+ ### 3. Deploy (Optional)
115
+ ```bash
116
+ modal deploy src/tools/code_execution.py
117
+ ```
118
+
119
+ ---
120
+
121
+ ## What to Show Judges
122
+
123
+ For the Modal Innovation Award ($2,500):
124
+
125
+ 1. **Sandbox Isolation** - Code runs in container, not local
126
+ 2. **Scientific Computing** - Real pandas/scipy analysis
127
+ 3. **Safety** - Can't access local filesystem
128
+ 4. **Speed** - Modal's fast cold starts
129
+
130
+ ### Demo Script
131
+
132
+ ```bash
133
+ # Run the Modal verification script
134
+ uv run python examples/modal_demo/verify_sandbox.py
135
+ ```
136
+
137
+ This proves code runs in Modal, not locally.
138
+
139
+ ---
140
+
141
+ ## Files to Update
142
+
143
+ - [ ] Wire `code_execution.py` into pipeline
144
+ - [ ] Create `src/agents/analysis_agent.py`
145
+ - [ ] Update `examples/modal_demo/` with working demo
146
+ - [ ] Add Modal setup to README
147
+ - [ ] Test with real Modal account
148
+
149
+ ---
150
+
151
+ ## Cost Estimate
152
+
153
+ Modal pricing for our use case:
154
+ - CPU sandbox: ~$0.0001 per execution
155
+ - For demo/judging: < $1 total
156
+ - Free tier: 30 hours/month
157
+
158
+ Not a cost concern.
pyproject.toml CHANGED
@@ -17,7 +17,7 @@ dependencies = [
17
  "beautifulsoup4>=4.12", # HTML parsing
18
  "xmltodict>=0.13", # PubMed XML -> dict
19
  # UI
20
- "gradio>=5.0", # Chat interface
21
  # Utils
22
  "python-dotenv>=1.0", # .env loading
23
  "tenacity>=8.2", # Retry logic
 
17
  "beautifulsoup4>=4.12", # HTML parsing
18
  "xmltodict>=0.13", # PubMed XML -> dict
19
  # UI
20
+ "gradio[mcp]>=5.0.0", # Chat interface
21
  # Utils
22
  "python-dotenv>=1.0", # .env loading
23
  "tenacity>=8.2", # Retry logic
src/app.py CHANGED
@@ -1,4 +1,4 @@
1
- """Gradio UI for DeepCritical agent."""
2
 
3
  import os
4
  from collections.abc import AsyncGenerator
@@ -7,6 +7,12 @@ from typing import Any
7
  import gradio as gr
8
 
9
  from src.agent_factory.judges import JudgeHandler, MockJudgeHandler
 
 
 
 
 
 
10
  from src.orchestrator_factory import create_orchestrator
11
  from src.tools.biorxiv import BioRxivTool
12
  from src.tools.clinicaltrials import ClinicalTrialsTool
@@ -115,10 +121,10 @@ async def research_agent(
115
 
116
  def create_demo() -> Any:
117
  """
118
- Create the Gradio demo interface.
119
 
120
  Returns:
121
- Configured Gradio Blocks interface
122
  """
123
  with gr.Blocks(
124
  title="DeepCritical - Drug Repurposing Research Agent",
@@ -137,9 +143,10 @@ def create_demo() -> Any:
137
  - "What existing medications show promise for Long COVID?"
138
  """)
139
 
 
140
  gr.ChatInterface(
141
  fn=research_agent,
142
- type="messages",
143
  title="",
144
  examples=[
145
  "What drugs could be repurposed for Alzheimer's disease?",
@@ -157,24 +164,74 @@ def create_demo() -> Any:
157
  ],
158
  )
159
 
160
  gr.Markdown("""
161
  ---
162
  **Note**: This is a research tool and should not be used for medical decisions.
163
  Always consult healthcare professionals for medical advice.
164
 
165
- Built with 🤖 PydanticAI + 🔬 PubMed, ClinicalTrials.gov & bioRxiv
 
 
166
  """)
167
 
168
  return demo
169
 
170
 
171
  def main() -> None:
172
- """Run the Gradio app."""
173
  demo = create_demo()
174
  demo.launch(
175
  server_name="0.0.0.0",
176
  server_port=7860,
177
  share=False,
 
178
  )
179
 
180
 
 
1
+ """Gradio UI for DeepCritical agent with MCP server support."""
2
 
3
  import os
4
  from collections.abc import AsyncGenerator
 
7
  import gradio as gr
8
 
9
  from src.agent_factory.judges import JudgeHandler, MockJudgeHandler
10
+ from src.mcp_tools import (
11
+ search_all_sources,
12
+ search_biorxiv,
13
+ search_clinical_trials,
14
+ search_pubmed,
15
+ )
16
  from src.orchestrator_factory import create_orchestrator
17
  from src.tools.biorxiv import BioRxivTool
18
  from src.tools.clinicaltrials import ClinicalTrialsTool
 
121
 
122
  def create_demo() -> Any:
123
  """
124
+ Create the Gradio demo interface with MCP support.
125
 
126
  Returns:
127
+ Configured Gradio Blocks interface with MCP server enabled
128
  """
129
  with gr.Blocks(
130
  title="DeepCritical - Drug Repurposing Research Agent",
 
143
  - "What existing medications show promise for Long COVID?"
144
  """)
145
 
146
+ # Main chat interface (existing)
147
  gr.ChatInterface(
148
  fn=research_agent,
149
+ type="messages", # type: ignore
150
  title="",
151
  examples=[
152
  "What drugs could be repurposed for Alzheimer's disease?",
 
164
  ],
165
  )
166
 
167
+ # MCP Tool Interfaces (exposed via MCP protocol)
168
+ gr.Markdown("---\n## MCP Tools (Also Available via Claude Desktop)")
169
+
170
+ with gr.Tab("PubMed Search"):
171
+ gr.Interface(
172
+ fn=search_pubmed,
173
+ inputs=[
174
+ gr.Textbox(label="Query", placeholder="metformin alzheimer"),
175
+ gr.Slider(1, 50, value=10, step=1, label="Max Results"),
176
+ ],
177
+ outputs=gr.Markdown(label="Results"),
178
+ api_name="search_pubmed",
179
+ )
180
+
181
+ with gr.Tab("Clinical Trials"):
182
+ gr.Interface(
183
+ fn=search_clinical_trials,
184
+ inputs=[
185
+ gr.Textbox(label="Query", placeholder="diabetes phase 3"),
186
+ gr.Slider(1, 50, value=10, step=1, label="Max Results"),
187
+ ],
188
+ outputs=gr.Markdown(label="Results"),
189
+ api_name="search_clinical_trials",
190
+ )
191
+
192
+ with gr.Tab("Preprints"):
193
+ gr.Interface(
194
+ fn=search_biorxiv,
195
+ inputs=[
196
+ gr.Textbox(label="Query", placeholder="long covid treatment"),
197
+ gr.Slider(1, 50, value=10, step=1, label="Max Results"),
198
+ ],
199
+ outputs=gr.Markdown(label="Results"),
200
+ api_name="search_biorxiv",
201
+ )
202
+
203
+ with gr.Tab("Search All"):
204
+ gr.Interface(
205
+ fn=search_all_sources,
206
+ inputs=[
207
+ gr.Textbox(label="Query", placeholder="metformin cancer"),
208
+ gr.Slider(1, 20, value=5, step=1, label="Max Per Source"),
209
+ ],
210
+ outputs=gr.Markdown(label="Results"),
211
+ api_name="search_all",
212
+ )
213
+
214
  gr.Markdown("""
215
  ---
216
  **Note**: This is a research tool and should not be used for medical decisions.
217
  Always consult healthcare professionals for medical advice.
218
 
219
+ Built with PydanticAI + PubMed, ClinicalTrials.gov & bioRxiv
220
+
221
+ **MCP Server**: Available at `/gradio_api/mcp/` for Claude Desktop integration
222
  """)
223
 
224
  return demo
225
 
226
 
227
  def main() -> None:
228
+ """Run the Gradio app with MCP server enabled."""
229
  demo = create_demo()
230
  demo.launch(
231
  server_name="0.0.0.0",
232
  server_port=7860,
233
  share=False,
234
+ mcp_server=True, # Enable MCP server
235
  )
236
 
237
 
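The pattern this diff relies on: any function wired into a `gr.Interface` with an `api_name` becomes an MCP tool once `demo.launch(mcp_server=True)` is called, with the tool schema derived from the function's type hints and docstring. A minimal standalone sketch (the `echo` function is illustrative, not from this codebase):

```python
# Minimal standalone sketch of the pattern applied above (illustrative).
# With mcp_server=True, Gradio exposes every api_name'd function as an
# MCP tool, deriving the schema from type hints and the docstring.
import gradio as gr


def echo(text: str) -> str:
    """Echo the input text back.

    Args:
        text: Any string to echo.

    Returns:
        The same string, unchanged.
    """
    return text


with gr.Blocks() as demo:
    gr.Interface(fn=echo, inputs=gr.Textbox(), outputs=gr.Textbox(), api_name="echo")

if __name__ == "__main__":
    demo.launch(mcp_server=True)  # tools served under /gradio_api/mcp/
```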
src/mcp_tools.py ADDED
@@ -0,0 +1,156 @@
1
+ """MCP tool wrappers for DeepCritical search tools.
2
+
3
+ These functions expose our search tools via the MCP protocol.
4
+ Each function follows the MCP tool contract:
5
+ - Full type hints
6
+ - Google-style docstrings with Args section
7
+ - Formatted string returns
8
+ """
9
+
10
+ from src.tools.biorxiv import BioRxivTool
11
+ from src.tools.clinicaltrials import ClinicalTrialsTool
12
+ from src.tools.pubmed import PubMedTool
13
+
14
+ # Singleton instances (avoid recreating on each call)
15
+ _pubmed = PubMedTool()
16
+ _trials = ClinicalTrialsTool()
17
+ _biorxiv = BioRxivTool()
18
+
19
+
20
+ async def search_pubmed(query: str, max_results: int = 10) -> str:
21
+ """Search PubMed for peer-reviewed biomedical literature.
22
+
23
+ Searches NCBI PubMed database for scientific papers matching your query.
24
+ Returns titles, authors, abstracts, and citation information.
25
+
26
+ Args:
27
+ query: Search query (e.g., "metformin alzheimer", "drug repurposing cancer")
28
+ max_results: Maximum results to return (1-50, default 10)
29
+
30
+ Returns:
31
+ Formatted search results with paper titles, authors, dates, and abstracts
32
+ """
33
+ max_results = max(1, min(50, max_results)) # Clamp to valid range
34
+
35
+ results = await _pubmed.search(query, max_results)
36
+
37
+ if not results:
38
+ return f"No PubMed results found for: {query}"
39
+
40
+ formatted = [f"## PubMed Results for: {query}\n"]
41
+ for i, evidence in enumerate(results, 1):
42
+ formatted.append(f"### {i}. {evidence.citation.title}")
43
+ formatted.append(f"**Authors**: {', '.join(evidence.citation.authors[:3])}")
44
+ formatted.append(f"**Date**: {evidence.citation.date}")
45
+ formatted.append(f"**URL**: {evidence.citation.url}")
46
+ formatted.append(f"\n{evidence.content}\n")
47
+
48
+ return "\n".join(formatted)
49
+
50
+
51
+ async def search_clinical_trials(query: str, max_results: int = 10) -> str:
52
+ """Search ClinicalTrials.gov for clinical trial data.
53
+
54
+ Searches the ClinicalTrials.gov database for trials matching your query.
55
+ Returns trial titles, phases, status, conditions, and interventions.
56
+
57
+ Args:
58
+ query: Search query (e.g., "metformin alzheimer", "diabetes phase 3")
59
+ max_results: Maximum results to return (1-50, default 10)
60
+
61
+ Returns:
62
+ Formatted clinical trial information with NCT IDs, phases, and status
63
+ """
64
+ max_results = max(1, min(50, max_results))
65
+
66
+ results = await _trials.search(query, max_results)
67
+
68
+ if not results:
69
+ return f"No clinical trials found for: {query}"
70
+
71
+ formatted = [f"## Clinical Trials for: {query}\n"]
72
+ for i, evidence in enumerate(results, 1):
73
+ formatted.append(f"### {i}. {evidence.citation.title}")
74
+ formatted.append(f"**URL**: {evidence.citation.url}")
75
+ formatted.append(f"**Date**: {evidence.citation.date}")
76
+ formatted.append(f"\n{evidence.content}\n")
77
+
78
+ return "\n".join(formatted)
79
+
80
+
81
+ async def search_biorxiv(query: str, max_results: int = 10) -> str:
82
+ """Search bioRxiv/medRxiv for preprint research.
83
+
84
+ Searches bioRxiv and medRxiv preprint servers for cutting-edge research.
85
+ Note: Preprints are NOT peer-reviewed but contain the latest findings.
86
+
87
+ Args:
88
+ query: Search query (e.g., "metformin neuroprotection", "long covid treatment")
89
+ max_results: Maximum results to return (1-50, default 10)
90
+
91
+ Returns:
92
+ Formatted preprint results with titles, authors, and abstracts
93
+ """
94
+ max_results = max(1, min(50, max_results))
95
+
96
+ results = await _biorxiv.search(query, max_results)
97
+
98
+ if not results:
99
+ return f"No bioRxiv/medRxiv preprints found for: {query}"
100
+
101
+ formatted = [f"## Preprint Results for: {query}\n"]
102
+ for i, evidence in enumerate(results, 1):
103
+ formatted.append(f"### {i}. {evidence.citation.title}")
104
+ formatted.append(f"**Authors**: {', '.join(evidence.citation.authors[:3])}")
105
+ formatted.append(f"**Date**: {evidence.citation.date}")
106
+ formatted.append(f"**URL**: {evidence.citation.url}")
107
+ formatted.append(f"\n{evidence.content}\n")
108
+
109
+ return "\n".join(formatted)
110
+
111
+
112
+ async def search_all_sources(query: str, max_per_source: int = 5) -> str:
113
+ """Search all biomedical sources simultaneously.
114
+
115
+ Performs parallel search across PubMed, ClinicalTrials.gov, and bioRxiv.
116
+ This is the most comprehensive search option for drug repurposing research.
117
+
118
+ Args:
119
+ query: Search query (e.g., "metformin alzheimer", "aspirin cancer prevention")
120
+ max_per_source: Maximum results per source (1-20, default 5)
121
+
122
+ Returns:
123
+ Combined results from all sources with source labels
124
+ """
125
+ import asyncio
126
+
127
+ max_per_source = max(1, min(20, max_per_source))
128
+
129
+ # Run all searches in parallel
130
+ pubmed_task = search_pubmed(query, max_per_source)
131
+ trials_task = search_clinical_trials(query, max_per_source)
132
+ biorxiv_task = search_biorxiv(query, max_per_source)
133
+
134
+ pubmed_results, trials_results, biorxiv_results = await asyncio.gather(
135
+ pubmed_task, trials_task, biorxiv_task, return_exceptions=True
136
+ )
137
+
138
+ formatted = [f"# Comprehensive Search: {query}\n"]
139
+
140
+ # Add each result section (handle exceptions gracefully)
141
+ if isinstance(pubmed_results, str):
142
+ formatted.append(pubmed_results)
143
+ else:
144
+ formatted.append(f"## PubMed\n*Error: {pubmed_results}*\n")
145
+
146
+ if isinstance(trials_results, str):
147
+ formatted.append(trials_results)
148
+ else:
149
+ formatted.append(f"## Clinical Trials\n*Error: {trials_results}*\n")
150
+
151
+ if isinstance(biorxiv_results, str):
152
+ formatted.append(biorxiv_results)
153
+ else:
154
+ formatted.append(f"## Preprints\n*Error: {biorxiv_results}*\n")
155
+
156
+ return "\n---\n".join(formatted)
tests/integration/test_mcp_tools_live.py ADDED
@@ -0,0 +1,24 @@
1
+ """Integration tests for MCP tool wrappers with live API calls."""
2
+
3
+ import pytest
4
+
5
+
6
+ class TestMCPToolsLive:
7
+ """Integration tests for MCP tools against live APIs (PubMed, etc.)."""
8
+
9
+ @pytest.mark.integration
10
+ @pytest.mark.asyncio
11
+ async def test_mcp_tools_work_end_to_end(self) -> None:
12
+ """Test that MCP tools execute real searches."""
13
+ from src.mcp_tools import search_pubmed
14
+
15
+ result = await search_pubmed("metformin diabetes", 3)
16
+
17
+ assert isinstance(result, str)
18
+ assert "PubMed Results" in result
19
+ # Should have actual content (not just "no results")
20
+ # Typical queries should return something.
21
+ # The wrapper returns "No PubMed results found" string if empty.
22
+
23
+ if "No PubMed results found" not in result:
24
+ assert len(result) > 10
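The live test above is gated behind the `integration` marker, so it only runs when explicitly selected. A sketch of selecting just that subset programmatically (assuming the marker is registered in the project's pytest configuration):

```python
# Run only integration-marked tests (assumes the marker is registered).
import sys

import pytest

sys.exit(pytest.main(["-m", "integration", "tests/integration/"]))
```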
tests/unit/test_mcp_tools.py ADDED
@@ -0,0 +1,200 @@
1
+ """Unit tests for MCP tool wrappers."""
2
+
3
+ from unittest.mock import AsyncMock, patch
4
+
5
+ import pytest
6
+
7
+ from src.mcp_tools import (
8
+ search_all_sources,
9
+ search_biorxiv,
10
+ search_clinical_trials,
11
+ search_pubmed,
12
+ )
13
+ from src.utils.models import Citation, Evidence
14
+
15
+
16
+ @pytest.fixture
17
+ def mock_evidence() -> Evidence:
18
+ """Sample evidence for testing."""
19
+ return Evidence(
20
+ content="Metformin shows neuroprotective effects in preclinical models.",
21
+ citation=Citation(
22
+ source="pubmed",
23
+ title="Metformin and Alzheimer's Disease",
24
+ url="https://pubmed.ncbi.nlm.nih.gov/12345678/",
25
+ date="2024-01-15",
26
+ authors=["Smith J", "Jones M", "Brown K"],
27
+ ),
28
+ relevance=0.85,
29
+ )
30
+
31
+
32
+ class TestSearchPubMed:
33
+ """Tests for search_pubmed MCP tool."""
34
+
35
+ @pytest.mark.asyncio
36
+ async def test_returns_formatted_string(self, mock_evidence: Evidence) -> None:
37
+ """Should return formatted markdown string."""
38
+ with patch("src.mcp_tools._pubmed") as mock_tool:
39
+ mock_tool.search = AsyncMock(return_value=[mock_evidence])
40
+
41
+ result = await search_pubmed("metformin alzheimer", 10)
42
+
43
+ assert isinstance(result, str)
44
+ assert "PubMed Results" in result
45
+ assert "Metformin and Alzheimer's Disease" in result
46
+ assert "Smith J" in result
47
+
48
+ @pytest.mark.asyncio
49
+ async def test_clamps_max_results(self) -> None:
50
+ """Should clamp max_results to valid range (1-50)."""
51
+ with patch("src.mcp_tools._pubmed") as mock_tool:
52
+ mock_tool.search = AsyncMock(return_value=[])
53
+
54
+ # Test lower bound
55
+ await search_pubmed("test", 0)
56
+ mock_tool.search.assert_called_with("test", 1)
57
+
58
+ # Test upper bound
59
+ await search_pubmed("test", 100)
60
+ mock_tool.search.assert_called_with("test", 50)
61
+
62
+ @pytest.mark.asyncio
63
+ async def test_handles_no_results(self) -> None:
64
+ """Should return appropriate message when no results."""
65
+ with patch("src.mcp_tools._pubmed") as mock_tool:
66
+ mock_tool.search = AsyncMock(return_value=[])
67
+
68
+ result = await search_pubmed("xyznonexistent", 10)
69
+
70
+ assert "No PubMed results found" in result
71
+
72
+
73
+ class TestSearchClinicalTrials:
74
+ """Tests for search_clinical_trials MCP tool."""
75
+
76
+ @pytest.mark.asyncio
77
+ async def test_returns_formatted_string(self, mock_evidence: Evidence) -> None:
78
+ """Should return formatted markdown string."""
79
+ mock_evidence.citation.source = "clinicaltrials" # type: ignore
80
+
81
+ with patch("src.mcp_tools._trials") as mock_tool:
82
+ mock_tool.search = AsyncMock(return_value=[mock_evidence])
83
+
84
+ result = await search_clinical_trials("diabetes", 10)
85
+
86
+ assert isinstance(result, str)
87
+ assert "Clinical Trials" in result
88
+
89
+
90
+ class TestSearchBiorxiv:
91
+ """Tests for search_biorxiv MCP tool."""
92
+
93
+ @pytest.mark.asyncio
94
+ async def test_returns_formatted_string(self, mock_evidence: Evidence) -> None:
95
+ """Should return formatted markdown string."""
96
+ mock_evidence.citation.source = "biorxiv" # type: ignore
97
+
98
+ with patch("src.mcp_tools._biorxiv") as mock_tool:
99
+ mock_tool.search = AsyncMock(return_value=[mock_evidence])
100
+
101
+ result = await search_biorxiv("preprint search", 10)
102
+
103
+ assert isinstance(result, str)
104
+ assert "Preprint Results" in result
105
+
106
+
107
+ class TestSearchAllSources:
108
+ """Tests for search_all_sources MCP tool."""
109
+
110
+ @pytest.mark.asyncio
111
+ async def test_combines_all_sources(self, mock_evidence: Evidence) -> None:
112
+ """Should combine results from all sources."""
113
+ with (
114
+ patch("src.mcp_tools.search_pubmed", new_callable=AsyncMock) as mock_pubmed,
115
+ patch("src.mcp_tools.search_clinical_trials", new_callable=AsyncMock) as mock_trials,
116
+ patch("src.mcp_tools.search_biorxiv", new_callable=AsyncMock) as mock_biorxiv,
117
+ ):
118
+ mock_pubmed.return_value = "## PubMed Results"
119
+ mock_trials.return_value = "## Clinical Trials"
120
+ mock_biorxiv.return_value = "## Preprints"
121
+
122
+ result = await search_all_sources("metformin", 5)
123
+
124
+ assert "Comprehensive Search" in result
125
+ assert "PubMed" in result
126
+ assert "Clinical Trials" in result
127
+ assert "Preprints" in result
128
+
129
+ @pytest.mark.asyncio
130
+ async def test_handles_partial_failures(self) -> None:
131
+ """Should handle partial failures gracefully."""
132
+ with (
133
+ patch("src.mcp_tools.search_pubmed", new_callable=AsyncMock) as mock_pubmed,
134
+ patch("src.mcp_tools.search_clinical_trials", new_callable=AsyncMock) as mock_trials,
135
+ patch("src.mcp_tools.search_biorxiv", new_callable=AsyncMock) as mock_biorxiv,
136
+ ):
137
+ mock_pubmed.return_value = "## PubMed Results"
138
+ mock_trials.side_effect = Exception("API Error")
139
+ mock_biorxiv.return_value = "## Preprints"
140
+
141
+ result = await search_all_sources("metformin", 5)
142
+
143
+ # Should still contain working sources
144
+ assert "PubMed" in result
145
+ assert "Preprints" in result
146
+ # Should show error for failed source
147
+ assert "Error" in result
148
+
149
+
150
+ class TestMCPDocstrings:
151
+ """Tests that docstrings follow MCP format."""
152
+
153
+ def test_search_pubmed_has_args_section(self) -> None:
154
+ """Docstring must have Args section for MCP schema generation."""
155
+ assert search_pubmed.__doc__ is not None
156
+ assert "Args:" in search_pubmed.__doc__
157
+ assert "query:" in search_pubmed.__doc__
158
+ assert "max_results:" in search_pubmed.__doc__
159
+ assert "Returns:" in search_pubmed.__doc__
160
+
161
+ def test_search_clinical_trials_has_args_section(self) -> None:
162
+ """Docstring must have Args section for MCP schema generation."""
163
+ assert search_clinical_trials.__doc__ is not None
164
+ assert "Args:" in search_clinical_trials.__doc__
165
+
166
+ def test_search_biorxiv_has_args_section(self) -> None:
167
+ """Docstring must have Args section for MCP schema generation."""
168
+ assert search_biorxiv.__doc__ is not None
169
+ assert "Args:" in search_biorxiv.__doc__
170
+
171
+ def test_search_all_sources_has_args_section(self) -> None:
172
+ """Docstring must have Args section for MCP schema generation."""
173
+ assert search_all_sources.__doc__ is not None
174
+ assert "Args:" in search_all_sources.__doc__
175
+
176
+
177
+ class TestMCPTypeHints:
178
+ """Tests that type hints are complete for MCP."""
179
+
180
+ def test_search_pubmed_type_hints(self) -> None:
181
+ """All parameters and return must have type hints."""
182
+ import inspect
183
+
184
+ sig = inspect.signature(search_pubmed)
185
+
186
+ # Check parameter hints
187
+ assert sig.parameters["query"].annotation is str
188
+ assert sig.parameters["max_results"].annotation is int
189
+
190
+ # Check return hint
191
+ assert sig.return_annotation is str
192
+
193
+ def test_search_clinical_trials_type_hints(self) -> None:
194
+ """All parameters and return must have type hints."""
195
+ import inspect
196
+
197
+ sig = inspect.signature(search_clinical_trials)
198
+ assert sig.parameters["query"].annotation is str
199
+ assert sig.parameters["max_results"].annotation is int
200
+ assert sig.return_annotation is str
uv.lock CHANGED
@@ -1063,7 +1063,7 @@ source = { editable = "." }
1063
  dependencies = [
1064
  { name = "anthropic" },
1065
  { name = "beautifulsoup4" },
1066
- { name = "gradio" },
1067
  { name = "httpx" },
1068
  { name = "openai" },
1069
  { name = "pydantic" },
@@ -1111,7 +1111,7 @@ requires-dist = [
1111
  { name = "beautifulsoup4", specifier = ">=4.12" },
1112
  { name = "chromadb", marker = "extra == 'embeddings'", specifier = ">=0.4.0" },
1113
  { name = "chromadb", marker = "extra == 'modal'", specifier = ">=0.4.0" },
1114
- { name = "gradio", specifier = ">=5.0" },
1115
  { name = "httpx", specifier = ">=0.27" },
1116
  { name = "llama-index", marker = "extra == 'modal'", specifier = ">=0.11.0" },
1117
  { name = "llama-index-embeddings-openai", marker = "extra == 'modal'" },
@@ -1568,7 +1568,7 @@ wheels = [
1568
 
1569
  [[package]]
1570
  name = "gradio"
1571
- version = "5.50.0"
1572
  source = { registry = "https://pypi.org/simple" }
1573
  dependencies = [
1574
  { name = "aiofiles" },
@@ -1592,7 +1592,6 @@ dependencies = [
1592
  { name = "pydub" },
1593
  { name = "python-multipart" },
1594
  { name = "pyyaml" },
1595
- { name = "ruff" },
1596
  { name = "safehttpx" },
1597
  { name = "semantic-version" },
1598
  { name = "starlette" },
@@ -1601,13 +1600,20 @@ dependencies = [
1601
  { name = "typing-extensions" },
1602
  { name = "uvicorn" },
1603
  ]
 
1604
  wheels = [
1605
- { url = "https://files.pythonhosted.org/packages/22/04/8daf96bd6d2470f03e2a15a9fc900c7ecf6549619173f16c5944c7ec15a7/gradio-5.50.0-py3-none-any.whl", hash = "sha256:d06770d57cdda9b703ef9cf767ac93a890a0e12d82679a310eef74203a3673f4", size = 63530991 },
1606
  ]
1607
 
1608
  [[package]]
1609
  name = "gradio-client"
1610
- version = "1.14.0"
1611
  source = { registry = "https://pypi.org/simple" }
1612
  dependencies = [
1613
  { name = "fsspec" },
@@ -1615,10 +1621,10 @@ dependencies = [
1615
  { name = "huggingface-hub" },
1616
  { name = "packaging" },
1617
  { name = "typing-extensions" },
1618
- { name = "websockets" },
1619
  ]
 
1620
  wheels = [
1621
- { url = "https://files.pythonhosted.org/packages/be/8a/f2a47134c5b5a7f3bad27eae749589a80d81efaaad8f59af47c136712bf6/gradio_client-1.14.0-py3-none-any.whl", hash = "sha256:9a2f5151978411e0f8b55a2d38cddd0a94491851149d14db4af96f5a09774825", size = 325555 },
1622
  ]
1623
 
1624
  [[package]]
 
1063
  dependencies = [
1064
  { name = "anthropic" },
1065
  { name = "beautifulsoup4" },
1066
+ { name = "gradio", extra = ["mcp"] },
1067
  { name = "httpx" },
1068
  { name = "openai" },
1069
  { name = "pydantic" },
 
1111
  { name = "beautifulsoup4", specifier = ">=4.12" },
1112
  { name = "chromadb", marker = "extra == 'embeddings'", specifier = ">=0.4.0" },
1113
  { name = "chromadb", marker = "extra == 'modal'", specifier = ">=0.4.0" },
1114
+ { name = "gradio", extras = ["mcp"], specifier = ">=5.0.0" },
1115
  { name = "httpx", specifier = ">=0.27" },
1116
  { name = "llama-index", marker = "extra == 'modal'", specifier = ">=0.11.0" },
1117
  { name = "llama-index-embeddings-openai", marker = "extra == 'modal'" },
 
1568
 
1569
  [[package]]
1570
  name = "gradio"
1571
+ version = "6.0.1"
1572
  source = { registry = "https://pypi.org/simple" }
1573
  dependencies = [
1574
  { name = "aiofiles" },
 
1592
  { name = "pydub" },
1593
  { name = "python-multipart" },
1594
  { name = "pyyaml" },
 
1595
  { name = "safehttpx" },
1596
  { name = "semantic-version" },
1597
  { name = "starlette" },
 
1600
  { name = "typing-extensions" },
1601
  { name = "uvicorn" },
1602
  ]
1603
+ sdist = { url = "https://files.pythonhosted.org/packages/65/13/f2bfe1237b8700f63e21c5e39f2843ac8346f7ba4525b582f30f40249863/gradio-6.0.1.tar.gz", hash = "sha256:5d02e6ac34c67aea26b938b8628c8f9f504871392e71f2db559ab8d6799bdf69", size = 36440914 }
1604
  wheels = [
1605
+ { url = "https://files.pythonhosted.org/packages/09/21/27ae5f4b2191a5d58707fc610e67453781a2b948a675a7cf06c99497ffa1/gradio-6.0.1-py3-none-any.whl", hash = "sha256:0f98dc8b414a3f3773cbf3caf5a354507c8ae309ed8266e2f30ca9fa53f379b8", size = 21559963 },
1606
+ ]
1607
+
1608
+ [package.optional-dependencies]
1609
+ mcp = [
1610
+ { name = "mcp" },
1611
+ { name = "pydantic" },
1612
  ]
1613
 
1614
  [[package]]
1615
  name = "gradio-client"
1616
+ version = "2.0.0"
1617
  source = { registry = "https://pypi.org/simple" }
1618
  dependencies = [
1619
  { name = "fsspec" },
 
1621
  { name = "huggingface-hub" },
1622
  { name = "packaging" },
1623
  { name = "typing-extensions" },
 
1624
  ]
1625
+ sdist = { url = "https://files.pythonhosted.org/packages/cf/0a/906062fe0577c62ea6e14044ba74268ff9266fdc75d0e69257bddb7400b3/gradio_client-2.0.0.tar.gz", hash = "sha256:56b462183cb8741bd3e69b21db7d3b62c5abb03c2c2bb925223f1eb18f950e89", size = 315906 }
1626
  wheels = [
1627
+ { url = "https://files.pythonhosted.org/packages/07/5b/789403564754f1eba0273400c1cea2c155f984d82458279154977a088509/gradio_client-2.0.0-py3-none-any.whl", hash = "sha256:77bedf20edcc232d8e7986c1a22165b2bbca1c7c7df10ba808a093d5180dae18", size = 315180 },
1628
  ]
1629
 
1630
  [[package]]