# Deployment Guide
## Launching DeepCritical: Gradio, MCP, & Modal

---

## Overview

DeepCritical is designed for a multi-platform deployment strategy to maximize hackathon impact:

1. **HuggingFace Spaces**: Host the public Gradio UI.
2. **MCP Server**: Expose research tools to Claude Desktop/Agents.
3. **Modal (Optional)**: Run heavy inference or self-hosted open-source LLMs if API costs are prohibitive.

---

## 1. HuggingFace Spaces (Gradio UI)

**Goal**: A public URL where judges/users can try the research agent.

### Prerequisites
- HuggingFace Account
- `gradio` installed (`uv add gradio`)

### Steps

1. **Create Space**:
   - Go to HF Spaces -> Create New Space.
   - SDK: **Gradio**.
   - Hardware: **CPU Basic** (Free) is sufficient (since we use APIs).

2. **Prepare Files**:
   - Ensure `app.py` builds and launches the Gradio interface (Spaces runs `app.py` automatically).
   - Ensure `requirements.txt` lists every dependency; the Gradio SDK installs from `requirements.txt`, not `pyproject.toml`, so keep the two in sync.

3. **Secrets**:
   - Go to Space Settings -> **Repository secrets**.
   - Add `ANTHROPIC_API_KEY` (or your chosen LLM provider key).
   - Add `BRAVE_API_KEY` (for web search).
   - Secrets are injected into the running Space as environment variables; see the sketch after these steps.

4. **Deploy**:
   - Push code to the Space's git repo.
   - Watch "Build" logs.
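
The secrets added in step 3 surface inside the running Space as environment variables. A minimal sketch of reading them at startup (the variable names match the secrets above):

```python
# app.py (near the top) – read the API keys injected by HF Spaces.
import os

ANTHROPIC_API_KEY = os.environ["ANTHROPIC_API_KEY"]  # fail fast if the secret is missing
BRAVE_API_KEY = os.getenv("BRAVE_API_KEY", "")        # web search key
```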

### Streaming Optimization
Use a generator for the chat handler so partial results stream to the UI and long research runs don't hit request timeouts:
```python
# app.py
def predict(message, history):
    agent = ResearchAgent()
    for update in agent.research_stream(message):
        yield update
```
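
This generator plugs directly into `gr.ChatInterface`, which streams every yielded string to the browser. A minimal sketch of the wiring, placed below `predict` in `app.py` (the title is illustrative):

```python
import gradio as gr

# ChatInterface calls `predict` and streams each yielded value to the chat window.
demo = gr.ChatInterface(fn=predict, title="DeepCritical Research Agent")

if __name__ == "__main__":
    demo.launch()
```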

---

## 2. MCP Server Deployment

**Goal**: Allow other agents (like Claude Desktop) to use our PubMed/Research tools directly.
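
The configurations below assume the server is built with FastMCP. For orientation, a minimal sketch of what `src/mcp_servers/pubmed_server.py` might look like (the tool name and the E-utilities call are illustrative, not the project's actual implementation):

```python
# src/mcp_servers/pubmed_server.py – illustrative sketch only.
import httpx
from fastmcp import FastMCP

mcp = FastMCP("deepcritical-pubmed")


@mcp.tool()
def search_pubmed(query: str, max_results: int = 5) -> str:
    """Return PubMed IDs matching `query` via the NCBI E-utilities API."""
    resp = httpx.get(
        "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
        params={"db": "pubmed", "term": query, "retmax": max_results, "retmode": "json"},
        timeout=30.0,
    )
    resp.raise_for_status()
    ids = resp.json()["esearchresult"]["idlist"]
    return ", ".join(ids) if ids else "No results found."


if __name__ == "__main__":
    mcp.run()  # stdio transport by default, which is what Claude Desktop expects
```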

### Local Usage (Claude Desktop)

1. **Install**:
   ```bash
   uv sync
   ```

2. **Configure Claude Desktop**:
   Edit `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS; on Windows the file is `%APPDATA%\Claude\claude_desktop_config.json`):
   ```json
   {
     "mcpServers": {
       "deepcritical": {
         "command": "uv",
         "args": [
           "--directory", "/absolute/path/to/DeepCritical",
           "run", "fastmcp", "run", "src/mcp_servers/pubmed_server.py"
         ]
       }
     }
   }
   ```
   Passing the project path via `uv --directory` keeps the relative server path working regardless of Claude Desktop's working directory.

3. **Restart Claude**: You should see a 🔌 icon indicating connected tools.

### Remote Deployment (Smithery/Glama)
*Target for "MCP Track" bonus points.*

1. **Dockerize**: Create a `Dockerfile` for the MCP server.
   ```dockerfile
   FROM python:3.11-slim
   WORKDIR /app
   COPY . /app
   RUN pip install fastmcp httpx
   CMD ["fastmcp", "run", "src/mcp_servers/pubmed_server.py", "--transport", "sse"]
   ```
   *Note: Use SSE transport for remote/HTTP servers, either via the `--transport` flag above or in code (see the sketch after this list). `WORKDIR /app` is needed so the relative path in `CMD` resolves.*

2. **Deploy**: Host on Fly.io or Railway.
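
If you prefer to select the transport in the server code rather than on the CLI, the entry point can look like this (same `mcp` object as in the earlier server sketch):

```python
# Bottom of src/mcp_servers/pubmed_server.py when hosting remotely.
if __name__ == "__main__":
    # SSE exposes the MCP tools over HTTP so remote clients can connect;
    # stdio only works for local clients such as Claude Desktop.
    mcp.run(transport="sse")
```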

---

## 3. Modal (GPU/Heavy Compute)

**Goal**: Run an open-source LLM (e.g., Llama-3-70B) on rented GPUs, or fan out massive parallel searches, if hosted APIs are too slow or expensive.

### Setup
1. **Install**: `uv add modal`
2. **Auth**: `modal token new`

### Logic
Instead of calling the Anthropic API, the agent calls a Modal function:

```python
# src/llm/modal_client.py
import modal

# Note: newer Modal releases rename Stub to App (`modal.App` / `@app.function`).
stub = modal.Stub("deepcritical-inference")

@stub.function(gpu="A100")
def generate_text(prompt: str) -> str:
    # Load vLLM or a similar inference engine and generate a completion on the GPU.
    ...
```
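
Calling it from the agent code then looks roughly like this (a hedged sketch; the exact call style depends on your Modal version, and older releases use `.call()` instead of `.remote()`):

```python
# Illustrative caller; `stub` and `generate_text` come from the module above.
with stub.run():
    completion = generate_text.remote("Summarize recent CRISPR delivery papers.")
    print(completion)
```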

### When to use?
- **Hackathon Demo**: Stick to Anthropic/OpenAI APIs for speed/reliability.
- **Production/Stretch**: Use Modal if you hit rate limits or want to show off "Open Source Models" capability.

---

## Deployment Checklist

### Pre-Flight
- [ ] Run `pytest -m unit` to ensure logic is sound.
- [ ] Run `pytest -m e2e` (one pass) to verify APIs connect.
- [ ] Check that `requirements.txt` is in sync with `pyproject.toml` (e.g., regenerate it with `uv export`).

### Secrets Management
- [ ] **NEVER** commit `.env` files.
- [ ] Verify keys are added to HF Space settings.

### Post-Launch
- [ ] Test the live URL.
- [ ] Verify "Stop" button in Gradio works (interrupts the agent).
- [ ] Record a walkthrough video (crucial for hackathon submission).