Commit · 2e4a760
Parent(s): 420d8ba
fix: wire EmbeddingService to simple orchestrator + improve search quality
Major fixes:
- Wire EmbeddingService to simple orchestrator for semantic deduplication
(was built but not connected - see docs/bugs/005)
- Expand BioRxiv stop words (~100) and require minimum 2 term matches
to filter out irrelevant papers
- Fix MockJudgeHandler to return honest message instead of garbage
drug candidates extracted via broken heuristics
The simple orchestrator now uses local sentence-transformers for
semantic deduplication without requiring any API keys.
Bug documentation added in docs/bugs/005_services_not_integrated.md
- docs/bugs/004_gradio_intermittent_loading.md +0 -44
- docs/bugs/005_services_not_integrated.md +142 -0
- src/agent_factory/judges.py +7 -32
- src/orchestrator.py +45 -3
- src/tools/biorxiv.py +214 -6
docs/bugs/004_gradio_intermittent_loading.md
DELETED
@@ -1,44 +0,0 @@

# Bug Report: Intermittent Gradio UI Loading (Hydration/Timeout)

## 1. Symptoms

- **Intermittent Loading**: The UI sometimes fails to load, showing a blank screen or a "Connection Error" toast.
- **Refresh Required**: Users often have to hard refresh the page (Ctrl+Shift+R) multiple times to get the UI to appear.
- **Mobile vs. Desktop**: The issue appears to be more prevalent or noticeable on Desktop Web than on Mobile Web (possibly due to network conditions, caching, or layout differences).
- **Environment**: HuggingFace Spaces (Docker SDK).

## 2. Root Cause Analysis

Based on research into Gradio 5.x/6.x behavior on HuggingFace Spaces, this is likely due to a combination of:

### A. SSR (Server-Side Rendering) Hydration Mismatch

Gradio 5+ introduced Server-Side Rendering (SSR) to improve initial load performance. However, on HuggingFace Spaces (which uses an iframe), there can be race conditions where the server-rendered HTML doesn't match what the client-side JavaScript expects, causing a "Hydration Error". When this happens, the React/Svelte frontend crashes silently or enters an inconsistent state, requiring a full refresh.

### B. WebSocket Timeouts

HuggingFace Spaces enforces strict timeouts for WebSocket connections. If the app takes too long to initialize (e.g., loading heavy libraries or models), the initial handshake may fail.

- *Mitigation*: Our app is relatively lightweight on startup (lazy loading models), so this is secondary, but network latency can trigger it.

### C. Browser Caching

Aggressive browser caching of the main bundle can sometimes cause version mismatches if the Space was recently rebuilt/redeployed.

## 3. Proposed Solution

### Immediate Fix: Disable SSR

Forcing Client-Side Rendering (CSR) eliminates the hydration mismatch entirely. While this theoretically slows down the "First Contentful Paint" slightly, it is much more robust for dynamic apps inside iframes.

**Change in `src/app.py`:**

```python
demo.launch(
    # ... other args ...
    ssr_mode=False,  # Force Client-Side Rendering to fix hydration issues
)
```

### Secondary Fixes (If needed)

- **Increase Concurrency Limits**: Ensure `max_threads` is sufficient if many users connect at once.
- **Health Check**: Add a simple lightweight endpoint to keep the Space "warm" if it sleeps aggressively.

## 4. Verification Plan

1. Apply `ssr_mode=False` to `src/app.py`.
2. Deploy to HuggingFace Spaces (`fix/gradio-ui-final` branch).
3. Test on Desktop (Chrome Incognito, Firefox) and Mobile.
4. Verify no "Connection Error" toasts appear on initial load.
docs/bugs/005_services_not_integrated.md
ADDED
@@ -0,0 +1,142 @@
# Bug 005: Embedding Services Built But Not Wired to Default Orchestrator

**Date:** November 26, 2025
**Severity:** CRITICAL
**Status:** Open

## 1. The Problem

Two complete semantic search services exist but are **NOT USED** by the default orchestrator:

| Service | Location | Status |
| ------- | -------- | ------ |
| EmbeddingService | `src/services/embeddings.py` | BUILT, not wired to simple mode |
| LlamaIndexRAGService | `src/services/llamaindex_rag.py` | BUILT, not wired to simple mode |

## 2. Root Cause: Two Orchestrators

```
┌──────────────────────────────────────────────┐
│ orchestrator.py (SIMPLE MODE - DEFAULT)      │
│  - Basic search → judge → loop               │
│  - NO embeddings                             │
│  - NO semantic search                        │
│  - Hand-rolled keyword matching              │
└──────────────────────────────────────────────┘

┌──────────────────────────────────────────────┐
│ orchestrator_magentic.py (MAGENTIC MODE)     │
│  - Multi-agent architecture                  │
│  - USES EmbeddingService                     │
│  - USES semantic search                      │
│  - Requires agent-framework (optional dep)   │
│  - OpenAI only                               │
└──────────────────────────────────────────────┘
```

**The UI defaults to simple mode**, which bypasses all the semantic search infrastructure.

## 3. What's Built (Not Wired)

### EmbeddingService (NO API KEY NEEDED)

```python
# src/services/embeddings.py
class EmbeddingService:
    async def embed(text) -> list[float]
    async def search_similar(query) -> list[dict]  # SEMANTIC SEARCH
    async def deduplicate(evidence) -> list        # DEDUPLICATION
```

- Uses local sentence-transformers
- ChromaDB vector store
- **Works without API keys**
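
A minimal usage sketch of the interface summarized above (the signatures are taken from this summary, not verified against `src/services/embeddings.py`, and the empty `evidence` list is a placeholder for whatever the search phase produced):

```python
# Sketch only: method names as listed above; adjust to the real service API.
import asyncio

from src.services.embeddings import get_embedding_service


async def main() -> None:
    service = get_embedding_service()  # local sentence-transformers, no API key
    evidence: list = []  # placeholder for Evidence items from the search phase
    evidence = await service.deduplicate(evidence)  # drop near-duplicate evidence
    hits = await service.search_similar("drug repurposing for fibrosis")  # semantic retrieval
    print(f"{len(evidence)} unique evidence items, {len(hits)} similar hits")


asyncio.run(main())
```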

### LlamaIndexRAGService

```python
# src/services/llamaindex_rag.py
class LlamaIndexRAGService:
    def ingest_evidence(evidence_list)
    def retrieve(query) -> list[dict]  # Semantic retrieval
    def query(query_str) -> str        # Synthesized response
```

## 4. Where Services ARE Used

```
src/orchestrator_magentic.py    ← Uses EmbeddingService
src/agents/search_agent.py      ← Uses EmbeddingService
src/agents/report_agent.py      ← Uses EmbeddingService
src/agents/hypothesis_agent.py  ← Uses EmbeddingService
src/agents/analysis_agent.py    ← Uses EmbeddingService
```

All of these are magentic-mode agents, NOT the simple orchestrator.

## 5. The Fix Options

### Option A: Add Embeddings to Simple Orchestrator (RECOMMENDED)

Modify `src/orchestrator.py` to optionally use EmbeddingService:

```python
class Orchestrator:
    def __init__(self, ..., use_embeddings: bool = True):
        if use_embeddings:
            from src.services.embeddings import get_embedding_service
            self.embeddings = get_embedding_service()
        else:
            self.embeddings = None

    async def run(self, query):
        # ... search phase ...

        if self.embeddings:
            # Semantic ranking
            all_evidence = await self._rank_by_relevance(all_evidence, query)
            # Deduplication
            all_evidence = await self.embeddings.deduplicate(all_evidence)
```

### Option B: Make Magentic Mode Default

Change app.py to default to "magentic" mode when its dependencies are available, e.g. via an import check as sketched below.
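
A hypothetical sketch of that check (the real app.py wiring, the mode names, and the `agent_framework` import name are assumptions here):

```python
# Illustrative only: fall back to simple mode when the optional dependency is missing.
def pick_default_mode() -> str:
    try:
        import agent_framework  # noqa: F401  # optional dependency used by magentic mode
    except ImportError:
        return "simple"
    return "magentic"
```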

### Option C: Merge Best of Both

Create a new orchestrator that:

- Has the simplicity of simple mode
- Uses embeddings for ranking/dedup
- Doesn't require agent-framework

## 6. Implementation Plan

### Phase 1: Wire EmbeddingService to Simple Orchestrator

1. Import EmbeddingService in orchestrator.py
2. Add semantic ranking after search
3. Add deduplication before judge
4. Test end-to-end

### Phase 2: Add Relevance to Evidence

1. Use embedding similarity as relevance score
2. Sort evidence by relevance
3. Only send top-K to judge (see the sketch after this list)
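
A minimal sketch of that ranking step, assuming a local sentence-transformers model (the model name, the plain-string inputs, and the `top_k` default are illustrative; in the project this would go through EmbeddingService rather than being called directly):

```python
# Rank candidate texts by cosine similarity to the query and keep only the top-K.
from sentence_transformers import SentenceTransformer, util


def rank_by_relevance(query: str, texts: list[str], top_k: int = 10) -> list[str]:
    model = SentenceTransformer("all-MiniLM-L6-v2")  # small local model, no API key
    query_emb = model.encode(query, convert_to_tensor=True)
    text_embs = model.encode(texts, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, text_embs)[0]  # one similarity score per text
    ranked = sorted(zip(texts, scores.tolist()), key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in ranked[:top_k]]
```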

## 7. Files to Modify

```
src/orchestrator.py          ← Add embedding integration
src/orchestrator_factory.py  ← Pass embeddings flag
src/app.py                   ← Enable embeddings by default
```

## 8. Success Criteria

- [ ] Default mode uses semantic search
- [ ] Evidence ranked by relevance
- [ ] Duplicates removed
- [ ] No new API keys required (sentence-transformers is local)
- [ ] Magentic mode still works as before
src/agent_factory/judges.py
CHANGED
@@ -178,38 +178,13 @@ class MockJudgeHandler:
         return findings if findings else ["No specific findings extracted (demo mode)"]

     def _extract_drug_candidates(self, question: str, evidence: list[Evidence]) -> list[str]:
-        """Extract
-        #
-
-
-
-
-
-            # Skip common words, keep potential drug names
-            if len(word) > 3 and word not in {
-                "what", "which", "could", "drugs", "drug", "medications",
-                "medicine", "treat", "treatment", "help", "best", "effective",
-                "repurposed", "repurposing", "disease", "condition", "therapy",
-            }:
-                # Capitalize as potential drug name
-                candidates.add(word.capitalize())
-
-        # Extract from evidence titles (look for capitalized terms)
-        for e in evidence[:10]:
-            words = e.citation.title.split()
-            for word in words:
-                # Look for capitalized words that might be drug names
-                cleaned = word.strip(".,;:()[]")
-                if (
-                    len(cleaned) > 3
-                    and cleaned[0].isupper()
-                    and cleaned.lower() not in {"the", "and", "for", "with", "from"}
-                ):
-                    candidates.add(cleaned)
-
-        # Return top candidates or placeholder
-        candidate_list = list(candidates)[:5]
-        return candidate_list if candidate_list else ["See evidence below for potential candidates"]
+        """Extract drug candidates - demo mode returns honest message."""
+        # Don't attempt heuristic extraction - it produces garbage like "Oral", "Kidney"
+        # Real drug extraction requires LLM analysis
+        return [
+            "Drug identification requires AI analysis",
+            "Enter API key above for full results",
+        ]

     async def assess(
         self,
src/orchestrator.py
CHANGED
@@ -43,6 +43,7 @@ class Orchestrator:
         judge_handler: JudgeHandlerProtocol,
         config: OrchestratorConfig | None = None,
         enable_analysis: bool = False,
+        enable_embeddings: bool = True,
     ):
         """
         Initialize the orchestrator.
@@ -52,15 +53,18 @@ class Orchestrator:
             judge_handler: Handler for assessing evidence
             config: Optional configuration (uses defaults if not provided)
             enable_analysis: Whether to perform statistical analysis (if Modal available)
+            enable_embeddings: Whether to use semantic search for ranking/dedup
         """
         self.search = search_handler
         self.judge = judge_handler
         self.config = config or OrchestratorConfig()
         self.history: list[dict[str, Any]] = []
         self._enable_analysis = enable_analysis and settings.modal_available
+        self._enable_embeddings = enable_embeddings

-        # Lazy-load
+        # Lazy-load services
         self._analyzer: Any = None
+        self._embeddings: Any = None

     def _get_analyzer(self) -> Any:
         """Lazy initialization of StatisticalAnalyzer.
@@ -74,6 +78,41 @@ class Orchestrator:
             self._analyzer = get_statistical_analyzer()
         return self._analyzer

+    def _get_embeddings(self) -> Any:
+        """Lazy initialization of EmbeddingService.
+
+        Uses local sentence-transformers - NO API key required.
+        """
+        if self._embeddings is None and self._enable_embeddings:
+            try:
+                from src.services.embeddings import get_embedding_service
+
+                self._embeddings = get_embedding_service()
+                logger.info("Embedding service enabled for semantic ranking")
+            except Exception as e:
+                logger.warning("Embeddings unavailable, using basic ranking", error=str(e))
+                self._enable_embeddings = False
+        return self._embeddings
+
+    async def _deduplicate_and_rank(self, evidence: list[Evidence], query: str) -> list[Evidence]:
+        """Use embeddings to deduplicate and rank evidence by relevance."""
+        embeddings = self._get_embeddings()
+        if not embeddings or not evidence:
+            return evidence
+
+        try:
+            # Deduplicate using semantic similarity
+            unique_evidence: list[Evidence] = await embeddings.deduplicate(evidence, threshold=0.85)
+            logger.info(
+                "Deduplicated evidence",
+                before=len(evidence),
+                after=len(unique_evidence),
+            )
+            return unique_evidence
+        except Exception as e:
+            logger.warning("Deduplication failed, using original", error=str(e))
+            return evidence
+
     async def _run_analysis_phase(
         self, query: str, evidence: list[Evidence], iteration: int
     ) -> AsyncGenerator[AgentEvent, None]:
@@ -114,7 +153,7 @@ class Orchestrator:
                 iteration=iteration,
             )

-    async def run(self, query: str) -> AsyncGenerator[AgentEvent, None]:
+    async def run(self, query: str) -> AsyncGenerator[AgentEvent, None]:  # noqa: PLR0915
        """
        Run the agent loop for a query.

@@ -171,11 +210,14 @@ class Orchestrator:
                # Should not happen with return_exceptions=True but safe fallback
                errors.append(f"Unknown result type for '{q}': {type(result)}")

-            # Deduplicate evidence by URL
+            # Deduplicate evidence by URL (fast, basic)
            seen_urls = {e.citation.url for e in all_evidence}
            unique_new = [e for e in new_evidence if e.citation.url not in seen_urls]
            all_evidence.extend(unique_new)

+            # Semantic deduplication and ranking (if embeddings available)
+            all_evidence = await self._deduplicate_and_rank(all_evidence, query)
+
            yield AgentEvent(
                type="search_complete",
                message=f"Found {len(unique_new)} new sources ({len(all_evidence)} total)",
src/tools/biorxiv.py
CHANGED
@@ -2,7 +2,7 @@

 import re
 from datetime import datetime, timedelta
-from typing import Any
+from typing import Any, ClassVar

 import httpx
 from tenacity import retry, stop_after_attempt, wait_exponential
@@ -20,6 +20,211 @@ class BioRxivTool:
     # Fetch papers from last N days
     DEFAULT_DAYS = 90

+    # Comprehensive stop words list - these are too common to be useful for filtering
+    STOP_WORDS: ClassVar[set[str]] = {
+        # Articles and prepositions
+        "the", "a", "an", "in", "on", "at", "to", "for", "of", "with", "by", "from",
+        "as", "into", "through", "during", "before", "after", "above", "below",
+        "between", "under", "about", "against", "among",
+        # Conjunctions
+        "and", "or", "but", "nor", "so", "yet", "both", "either", "neither",
+        # Pronouns
+        "i", "you", "he", "she", "it", "we", "they", "me", "him", "her", "us", "them",
+        "my", "your", "his", "its", "our", "their", "this", "that", "these", "those",
+        "which", "who", "whom", "whose", "what", "whatever",
+        # Question words
+        "when", "where", "why", "how",
+        # Modal and auxiliary verbs
+        "is", "are", "was", "were", "be", "been", "being", "am", "have", "has", "had",
+        "having", "do", "does", "did", "doing", "will", "would", "shall", "should",
+        "can", "could", "may", "might", "must", "need", "ought",
+        # Common verbs
+        "get", "got", "make", "made", "take", "taken", "give", "given", "go", "went",
+        "gone", "come", "came", "see", "saw", "seen", "know", "knew", "known",
+        "think", "thought", "find", "found", "show", "shown", "showed", "use", "used",
+        "using",
+        # Generic scientific terms (too common to filter on)
+        # Note: Keep medical terms like treatment, disease, drug - meaningful for queries
+        "study", "studies", "studied", "result", "results", "method", "methods",
+        "analysis", "data", "group", "groups", "research", "findings", "significant",
+        "associated", "compared", "observed", "reported", "participants", "sample",
+        "samples",
+        # Other common words
+        "also", "however", "therefore", "thus", "although", "because", "since",
+        "while", "if", "then", "than", "such", "same", "different", "other",
+        "another", "each", "every", "all", "any", "some", "no", "not", "only",
+        "just", "more", "most", "less", "least", "very", "much", "many", "few",
+        "new", "old", "first", "last", "next", "previous", "high", "low", "large",
+        "small", "long", "short", "good", "well", "better", "best",
+    }
+
     def __init__(self, server: str = DEFAULT_SERVER, days: int = DEFAULT_DAYS) -> None:
         """
         Initialize bioRxiv tool.
@@ -81,12 +286,11 @@ class BioRxivTool:
         return [self._paper_to_evidence(paper) for paper in matching]

     def _extract_terms(self, query: str) -> list[str]:
-        """Extract search terms from query."""
+        """Extract meaningful search terms from query."""
         # Simple tokenization, lowercase
         terms = re.findall(r"\b\w+\b", query.lower())
-        # Filter out
-
-        return [t for t in terms if t not in stop_words and len(t) > 2]
+        # Filter out stop words and short terms
+        return [t for t in terms if t not in self.STOP_WORDS and len(t) > 2]

     def _filter_by_keywords(
         self, papers: list[dict[str, Any]], terms: list[str], max_results: int
@@ -94,6 +298,9 @@ class BioRxivTool:
         """Filter papers that contain query terms in title or abstract."""
         scored_papers = []

+        # Require at least 2 matching terms, or all terms if fewer than 2
+        min_matches = min(2, len(terms)) if terms else 1
+
         for paper in papers:
             title = paper.get("title", "").lower()
             abstract = paper.get("abstract", "").lower()
@@ -102,7 +309,8 @@ class BioRxivTool:
             # Count matching terms
             matches = sum(1 for term in terms if term in text)

-
+            # Only include papers meeting minimum match threshold
+            if matches >= min_matches:
                 scored_papers.append((matches, paper))

         # Sort by match count (descending)