Commit · 7e1184a
1 Parent(s): 65d200f
docs: P1 bug for uninterpretable chain-of-thought events (#106)
Documented issue where Advanced Mode exposes raw internal framework
events from agent-framework-core to users:
- Manager (user_task), (task_ledger), (instruction) are internal
- Hard truncation at 200 chars makes messages uninterpretable
- All mapped to "judging" type incorrectly
Root cause: _process_event() in advanced.py doesn't filter or
transform MagenticOrchestratorMessageEvent events.
Fixes: Filter internal events or transform to user-friendly messages.
docs/bugs/ACTIVE_BUGS.md
CHANGED
@@ -1,6 +1,6 @@
 # Active Bugs
 
-> Last updated: 2025-12-01 (
+> Last updated: 2025-12-01 (02:50 PST)
 >
 > **Note:** Completed bug docs archived to `docs/bugs/archive/`
 > **See also:** [Code Quality Audit Findings (2025-11-30)](AUDIT_FINDINGS_2025_11_30.md)
@@ -13,6 +13,29 @@ _No active P0 bugs._
 
 ## P1 - Important
 
+### P1 - Advanced Mode Exposes Uninterpretable Chain-of-Thought
+**Issue:** [#106](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/106)
+**File:** [P1_ADVANCED_MODE_UNINTERPRETABLE_CHAIN_OF_THOUGHT.md](P1_ADVANCED_MODE_UNINTERPRETABLE_CHAIN_OF_THOUGHT.md)
+**Found:** 2025-12-01 (Manual Testing)
+
+**Problem:** Advanced orchestrator exposes raw internal framework events to users:
+- `Manager (user_task): Research sexual health and wellness interventions for...`
+- `Manager (task_ledger): We are working to address...`
+- `Manager (instruction): Conduct targeted searches on PubMed...`
+
+These are framework-internal bookkeeping truncated at 200 chars, making them uninterpretable.
+
+**Root Cause:** `_process_event()` in `advanced.py` doesn't filter or transform `MagenticOrchestratorMessageEvent` events from `agent-framework-core`.
+
+**Solution Options:**
+1. Filter internal events (`user_task`, `task_ledger`, `instruction`)
+2. Transform to user-friendly messages ("Manager assigning search task...")
+3. Add verbose mode for debugging
+
+**Status:** Open
+
+---
+
 ### P1 - Memory Layer Not Integrated (Post-Hackathon)
 **Issue:** [#73](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/73)
 **Spec:** [SPEC_08_INTEGRATE_MEMORY_LAYER.md](../specs/SPEC_08_INTEGRATE_MEMORY_LAYER.md)
docs/bugs/P1_ADVANCED_MODE_UNINTERPRETABLE_CHAIN_OF_THOUGHT.md
ADDED
@@ -0,0 +1,172 @@
# P1: Advanced Mode Exposes Uninterpretable Chain-of-Thought Events

**Priority**: P1 (UX Degradation)
**Component**: `src/orchestrators/advanced.py`
**Status**: Open
**Issue**: [#106](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/106)
**Created**: 2025-12-01

## Summary

The Advanced orchestrator exposes raw internal framework events from `agent-framework-core` directly to users. These events carry internal manager bookkeeping (task assignments, ledgers, instructions) and are:

1. Truncated mid-sentence at 200 characters
2. Written in internal framework terminology (`user_task`, `task_ledger`, `instruction`)
3. Shown with a misleading "JUDGING" event type
4. Not meaningful to end users

## Example of Bad Output

```
🧠 **JUDGING**: Manager (user_task): Research sexual health and wellness interventions for: sildenafil mechanism ##...

🧠 **JUDGING**: Manager (task_ledger): We are working to address the following user request: Research sexual healt...

🧠 **JUDGING**: Manager (instruction): Conduct targeted searches on PubMed, ClinicalTrials.gov, and Europe PMC to ga...
```

Users see:
- Raw internal prompts being passed between manager and agents
- Truncated text that cuts off mid-word ("healt...", "ga...")
- Technical jargon ("task_ledger") with no context
- All events labeled as "JUDGING" even when they're task assignments

## Root Cause Analysis

### The Chain of Issues

| Location | Issue |
|----------|-------|
| `src/orchestrators/advanced.py:363-370` | `MagenticOrchestratorMessageEvent` raw events exposed without filtering |
| `src/orchestrators/advanced.py:368` | `event.kind` values (`user_task`, `task_ledger`, `instruction`) are internal framework concepts |
| `src/orchestrators/advanced.py:368` | Hard truncation: `text[:200]...` breaks mid-sentence |
| `src/orchestrators/advanced.py:367` | All manager events mapped to `type="judging"` regardless of actual purpose |
| `src/orchestrators/advanced.py:380` | Agent messages also truncated at 200 chars |
| `src/utils/models.py:136` | `"judging": "🧠"` icon shown for all these internal events |
| `src/app.py:248` | Events displayed verbatim via `event.to_markdown()` |

### Code Path

```
agent-framework-core (Microsoft)
    ↓
MagenticOrchestratorMessageEvent(kind="task_ledger", message="...")
    ↓
advanced.py:_process_event() - NO FILTERING
    ↓
AgentEvent(type="judging", message=f"Manager ({event.kind}): {text[:200]}...")
    ↓
models.py:to_markdown() → "🧠 **JUDGING**: Manager (task_ledger): ..."
    ↓
app.py → Displayed to user verbatim
```

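The path above can be reproduced in isolation. The sketch below is only an illustration: `FakeManagerEvent` and `FakeAgentEvent` are hypothetical stand-ins, not the real `agent-framework-core` or `src/utils/models.py` types, and the formatting is assumed from the table above (`text[:200]` plus `...`, `type="judging"`).

```python
from dataclasses import dataclass


@dataclass
class FakeManagerEvent:
    """Hypothetical stand-in for MagenticOrchestratorMessageEvent."""
    kind: str
    text: str


@dataclass
class FakeAgentEvent:
    """Hypothetical stand-in for AgentEvent."""
    type: str
    message: str

    def to_markdown(self) -> str:
        # Assumed rendering: brain icon plus the upper-cased event type.
        return f"🧠 **{self.type.upper()}**: {self.message}"


def process_event_unfiltered(event: FakeManagerEvent) -> FakeAgentEvent:
    # Mirrors the behavior described above: every manager event becomes
    # "judging" and its text is cut at a fixed 200 characters.
    return FakeAgentEvent(
        type="judging",
        message=f"Manager ({event.kind}): {event.text[:200]}...",
    )


ledger = FakeManagerEvent(
    kind="task_ledger",
    text="We are working to address the following user request: " + "x" * 300,
)
rendered = process_event_unfiltered(ledger).to_markdown()
print(rendered)
```

Whatever the event kind, the rendered line carries the "JUDGING" label and ends in `...` after a cut that ignores word and sentence boundaries.
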
## Impact

1. **User Confusion**: Users see internal framework bookkeeping, not meaningful progress
2. **Truncated Gibberish**: 200-char limit cuts prompts mid-sentence, making them uninterpretable
3. **Misleading Labels**: "JUDGING" event type is wrong - these are task assignments
4. **No Actionable Info**: Users can't understand what the system is actually doing

## Proposed Solutions

### Option A: Filter Internal Events (Minimal)

Skip internal manager events entirely - they're framework bookkeeping:

```python
def _process_event(self, event: Any, iteration: int) -> AgentEvent | None:
    if isinstance(event, MagenticOrchestratorMessageEvent):
        # Skip internal framework bookkeeping events
        if event.kind in ("user_task", "task_ledger", "instruction"):
            return None  # Don't expose to users
    # ... rest of handling
```

**Pros**: Simple, removes noise
**Cons**: Users lose visibility into manager activity

### Option B: Transform to User-Friendly Messages (Better UX)

Map internal events to meaningful user messages:

```python
MANAGER_EVENT_MESSAGES = {
    "user_task": "Manager received research task",
    "task_ledger": "Manager tracking task progress",
    "instruction": "Manager assigning work to agent",
}

def _process_event(self, event: Any, iteration: int) -> AgentEvent | None:
    if isinstance(event, MagenticOrchestratorMessageEvent):
        if event.kind in MANAGER_EVENT_MESSAGES:
            return AgentEvent(
                type="progress",  # Not "judging"!
                message=MANAGER_EVENT_MESSAGES[event.kind],
                iteration=iteration,
            )
```

**Pros**: Users see meaningful progress, correct event types
**Cons**: More code, loses raw detail for debugging

### Option C: Smart Truncation + Verbose Mode

1. Truncate at sentence boundaries, not hard character limit
2. Add `verbose_mode` setting that shows full internal events for debugging
3. Use appropriate event types based on `event.kind`

```python
def _smart_truncate(self, text: str, max_len: int = 200) -> str:
    """Truncate at sentence boundary."""
    if len(text) <= max_len:
        return text
    # Find last sentence boundary before limit
    truncated = text[:max_len]
    last_period = truncated.rfind(". ")
    if last_period > max_len // 2:
        return truncated[:last_period + 1]
    return truncated.rsplit(" ", 1)[0] + "..."
```

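Points 2 and 3 of Option C have no code in this report, so the following is only a rough sketch of how they might fit together with the truncation helper. The `KIND_TO_EVENT_TYPE` mapping, the `format_manager_event` helper, and the fallback to `"judging"` for unknown kinds are all assumptions made for illustration; `smart_truncate` is a free-function copy of the method above.

```python
# Hypothetical mapping from framework event kinds to UI event types.
# "progress" is the type suggested in Option B; the mapping is an assumption.
KIND_TO_EVENT_TYPE = {
    "user_task": "progress",
    "task_ledger": "progress",
    "instruction": "progress",
}


def smart_truncate(text: str, max_len: int = 200) -> str:
    """Truncate at a sentence boundary, falling back to a word boundary."""
    if len(text) <= max_len:
        return text
    truncated = text[:max_len]
    last_period = truncated.rfind(". ")
    if last_period > max_len // 2:
        return truncated[: last_period + 1]
    return truncated.rsplit(" ", 1)[0] + "..."


def format_manager_event(
    kind: str, text: str, verbose_mode: bool = False
) -> tuple[str, str]:
    """Return (event_type, message) for a manager event."""
    # Unknown kinds keep the current "judging" type (an assumption).
    event_type = KIND_TO_EVENT_TYPE.get(kind, "judging")
    # Verbose mode shows the full text; otherwise cut at a sentence end.
    body = text if verbose_mode else smart_truncate(text)
    return event_type, f"Manager ({kind}): {body}"


long_text = "Search PubMed for recent trials. " * 12
print(format_manager_event("instruction", long_text))
print(format_manager_event("instruction", long_text, verbose_mode=True))
```

In the default mode the message ends on a complete sentence; with `verbose_mode=True` the full text passes through untouched for debugging.
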
### Recommended Approach

**Combine Option A + B**:

1. **Default**: Filter out `task_ledger` and `instruction` events (pure bookkeeping)
2. **Transform**: `user_task` → "Assigning research task to agents"
3. **Proper Types**: Use `"progress"` not `"judging"` for manager events
4. **Future**: Add verbose mode for debugging

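The first three steps of the recommended approach could be sketched roughly as follows. `StubManagerEvent` and `StubAgentEvent` are hypothetical stand-ins for the framework event and the project's `AgentEvent`, the function name is invented for the example, and the generic label used for unknown kinds is an assumption that this report does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubManagerEvent:
    """Hypothetical stand-in for MagenticOrchestratorMessageEvent."""
    kind: str
    text: str


@dataclass
class StubAgentEvent:
    """Hypothetical stand-in for AgentEvent."""
    type: str
    message: str
    iteration: int


# Step 1: pure bookkeeping kinds that are dropped by default.
FILTERED_KINDS = frozenset({"task_ledger", "instruction"})

# Step 2: kinds replaced by a fixed, user-friendly message.
TRANSFORMED_KINDS = {"user_task": "Assigning research task to agents"}


def process_manager_event(
    event: StubManagerEvent, iteration: int
) -> Optional[StubAgentEvent]:
    if event.kind in FILTERED_KINDS:
        return None  # Hidden from users
    # Assumption: unknown kinds get a generic label, never the raw text.
    message = TRANSFORMED_KINDS.get(event.kind, "Manager update")
    # Step 3: manager events use "progress", not "judging".
    return StubAgentEvent(type="progress", message=message, iteration=iteration)


print(process_manager_event(StubManagerEvent("task_ledger", "We are..."), 1))
print(process_manager_event(StubManagerEvent("user_task", "Research..."), 1))
```

Returning `None` for filtered kinds matches the `AgentEvent | None` return type used in Options A and B, so the caller can skip those events.
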
## Files to Modify

1. `src/orchestrators/advanced.py:361-410` - `_process_event()` method
2. `src/utils/models.py:107-123` - Add new event types if needed
3. `tests/unit/orchestrators/test_advanced_timeout.py` - Update assertions

## Related Issues

- P0: Advanced Mode Timeout No Synthesis (FIXED in PR #104)
- This P1 was discovered while testing the P0 fix

## Testing the Bug

```python
import asyncio
from src.orchestrators.advanced import AdvancedOrchestrator

async def test():
    orch = AdvancedOrchestrator(max_rounds=3)
    async for event in orch.run("sildenafil mechanism"):
        if "Manager" in event.message:
            print(f"[{event.type}] {event.message}")
            # You'll see uninterpretable output

asyncio.run(test())
```

## References

- Microsoft Agent Framework: https://github.com/microsoft/agent-framework
- AgentEvent model: `src/utils/models.py:104`
- Advanced orchestrator: `src/orchestrators/advanced.py`