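# Wearable Anomaly Detector API Reference

This guide explains how to call the core modules in the `hf_release` directory, covering input and output formats, common methods, and examples.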

---

## 1. Initialization

```python
from wearable_anomaly_detector import WearableAnomalyDetector

detector = WearableAnomalyDetector(
    model_dir="checkpoints/phase2/exp_factor_balanced",
    device="cpu",          # optional; auto-detected by default
    threshold=None         # optional; falls back to the configured or default threshold
)
```

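| Parameter | Description |
| --- | --- |
| `model_dir` | Directory holding the best Phase2 weights; must contain `best_model.pt` |
| `device` | `"cpu"`, `"cuda"`, `"cuda:0"`, etc. |
| `threshold` | Anomaly threshold (float) set manually; if omitted, the configured value or the default 0.53 is used |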

---

## 2. Data Structures

### 2.1 Single data point (`dict`)

```python
{
  "timestamp": "2025-01-01T08:00:00",
  "deviceId": "demo_user",
  "features": {
    "hr": 72.0,
    "hrv_rmssd": 35.0,
    "time_period_primary": "day",
    "data_quality": "high",
    "...": "..."
  },
  "static_features": {
    "age_group": 2,
    "sex": 0,
    "exercise": 1
  }
}
```

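- A window consists of 12 data points sampled at 5-minute intervals, ordered by ascending timestamp.
- Missing fields automatically fall back to the default values or categorical mappings in `configs/features_config.json`.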
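
The window layout described above can be sketched as follows. `build_window` and `is_valid_window` are hypothetical helpers written for this guide, not part of the package; the feature values are placeholders, and the validity check only covers the two rules stated above (12 points, ascending timestamps).

```python
from datetime import datetime, timedelta

WINDOW_SIZE = 12                 # points per window, as stated above
STEP = timedelta(minutes=5)      # sampling interval, as stated above


def build_window(start=datetime(2025, 1, 1, 8, 0, 0), device_id="demo_user"):
    """Return 12 synthetic data points spaced 5 minutes apart (hypothetical helper)."""
    window = []
    for i in range(WINDOW_SIZE):
        window.append({
            "timestamp": (start + i * STEP).isoformat(),
            "deviceId": device_id,
            # Placeholder values; a real window carries the full feature set.
            "features": {"hr": 72.0, "hrv_rmssd": 35.0},
            "static_features": {"age_group": 2, "sex": 0, "exercise": 1},
        })
    return window


def is_valid_window(window):
    """Check the point count and that timestamps are strictly increasing."""
    if len(window) != WINDOW_SIZE:
        return False
    times = [datetime.fromisoformat(p["timestamp"]) for p in window]
    return all(a < b for a, b in zip(times, times[1:]))


window = build_window()
```

Running a check like `is_valid_window` before calling the detector catches truncated or out-of-order windows early.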

### 2.2 `data_points` / `windows`

- **Real-time detection** takes a single window (`List[Dict]`).
- **Pattern aggregation** accepts `List[List[Dict]]`, where each inner list represents one day.

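### 2.3 Missing fields and low-quality data

- If a feature is missing, simply omit its key; inference falls back to the default value.
- If static features are missing, set `static_features` to an empty dict.
- For sensor packet loss, set numeric fields such as `hr` to `float("nan")`; the model ignores that value.
- The repository ships `test_data/example_window.json`, a complete 12-point window that can be used directly to verify API behavior.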

```python
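# build_window() stands in for your own code that returns one 12-point window (see 2.1)
window = build_window()
for point in window:
    point["features"].pop("hr_resting", None)        # drop an optional feature
    point["features"]["data_quality"] = "low"        # flag the data quality
window[0]["static_features"] = {}                   # no static information
window[3]["features"]["hr"] = float("nan")          # no heart rate at this time point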

result = detector.detect_realtime(window, update_baseline=False)
```

```python
import json

# Load the bundled sample window (12 complete data points)
with open("test_data/example_window.json", "r", encoding="utf-8") as f:
    sample_window = json.load(f)
result = detector.detect_realtime(sample_window, update_baseline=False)
```

### 2.4 Official test script

If all you need is to read one JSON file and get the model output, run:

```bash
python run_official_inference.py \
  --window-file test_data/example_window.json \
  --model-dir checkpoints/phase2/exp_factor_balanced
```

The script prints:

1. The raw JSON result from the model
2. Markdown text generated by `AnomalyFormatter`

Point `--window-file` at your own window data to simulate a production API call.

---

## 3. `WearableAnomalyDetector` Methods

### 3.1 `predict(data_points, return_score=True, return_details=False)`

Runs inference directly, with no additional logic.

```python
result = detector.predict(window, return_score=True, return_details=True)
```

**Example return value**

```python
{
  "is_anomaly": False,
  "threshold": 0.53,
  "anomaly_score": 0.47,
  "details": {
    "window_size": 12,
    "model_output": 0.47,
    "prediction_confidence": 0.06
  }
}
```
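
The numbers in the example above are consistent with a simple decision rule: the window is flagged when the score exceeds the threshold, and `prediction_confidence` equals the distance between the two (|0.47 - 0.53| = 0.06). The sketch below reproduces that reading. It is an assumption inferred from the sample values, not the library's confirmed logic; `summarize_score` is a hypothetical name, and whether the real comparison is strict is not documented.

```python
def summarize_score(anomaly_score, threshold=0.53):
    """Rebuild the result fields from a raw score (assumed rule, see lead-in)."""
    return {
        "is_anomaly": anomaly_score > threshold,   # assumed strict comparison
        "threshold": threshold,
        "anomaly_score": anomaly_score,
        "details": {
            "model_output": anomaly_score,
            # Assumed: confidence is the margin between score and threshold.
            "prediction_confidence": round(abs(anomaly_score - threshold), 2),
        },
    }


example = summarize_score(0.47)
```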

### 3.2 `detect_realtime(data_points, update_baseline=True, ...)`

Builds on `predict` and adds logic such as baseline updates; suitable for plugging directly into a real-time service.

```python
result = detector.detect_realtime(window, update_baseline=False)
```

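| Parameter | Default | Description |
| --- | --- | --- |
| `data_points` | required | The most recent window of data |
| `update_baseline` | `True` | Whether to update the baseline after inference |
| `return_score` | `True` | Whether to return the anomaly score |
| `return_details` | `False` | Whether to return the detail fields |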

### 3.3 `detect_pattern(data_points, days=None, min_duration_days=3, format_for_llm=False)`

Aggregates anomaly patterns over multiple days and returns a pattern summary, optionally with LLM-ready text.

```python
pattern_result = detector.detect_pattern(daily_data, days=7, format_for_llm=True)
```

**Example return value**

```python
{
  "anomaly_pattern": {
    "has_pattern": True,
    "duration_days": 3,
    "trend": "stable",
    "anomaly_type": "continuous_anomaly"
  },
  "formatted_for_llm": "...structured Markdown text..."
}
```
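
How `detect_pattern` derives `has_pattern` and `duration_days` is not documented here. One plausible reading of `min_duration_days`, sketched below purely as an assumption, is that a pattern exists when the longest run of consecutive anomalous days reaches that minimum. Both function names are hypothetical.

```python
def longest_anomalous_run(daily_flags):
    """Length of the longest streak of consecutive True values."""
    longest = current = 0
    for is_anomalous in daily_flags:
        current = current + 1 if is_anomalous else 0
        longest = max(longest, current)
    return longest


def has_pattern(daily_flags, min_duration_days=3):
    """Assumed rule: a pattern needs at least min_duration_days in a row."""
    return longest_anomalous_run(daily_flags) >= min_duration_days


# One flag per day, e.g. derived from each day's is_anomaly result.
week = [False, True, True, True, False, True, False]
```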

---

## 4. `AnomalyFormatter`

Converts detection results, baseline information, historical trends, and similar data into text suitable for an LLM.

```python
from utils.formatter import AnomalyFormatter

formatter = AnomalyFormatter()  # optionally pass config_path to point at a custom format
text = formatter.format_for_llm(
    anomaly_result=result,
    baseline_info={
        "baseline_mean": 75.0,
        "baseline_std": 5.0,
        "current_value": 68.0,
        "deviation_pct": -9.3
    },
    daily_results=None
)
print(text)
```

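**Common parameters**

| Parameter | Type | Description |
| --- | --- | --- |
| `anomaly_result` | `dict` | Result returned by `predict` or `detect_realtime` |
| `baseline_info` | `dict` | Baseline mean and standard deviation, current value, percentage deviation, etc. |
| `related_indicators` | `dict` | Sleep, activity, stress, and similar indicators; optional |
| `daily_results` | `List[dict]` | Multi-day trend (date plus HRV/score); optional |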
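
The `deviation_pct` of -9.3 in the example call matches the relative deviation of the current value from the baseline mean: (68 - 75) / 75 × 100 ≈ -9.3. A minimal sketch for building `baseline_info` under that assumption follows; `build_baseline_info` is a hypothetical helper, and the rounding to one decimal simply mirrors the example.

```python
def build_baseline_info(baseline_mean, baseline_std, current_value):
    """Assemble a baseline_info dict; deviation is relative to the mean."""
    if baseline_mean == 0:
        raise ValueError("baseline_mean must be non-zero to compute a percentage")
    deviation_pct = (current_value - baseline_mean) / baseline_mean * 100.0
    return {
        "baseline_mean": baseline_mean,
        "baseline_std": baseline_std,
        "current_value": current_value,
        "deviation_pct": round(deviation_pct, 1),
    }


baseline_info = build_baseline_info(75.0, 5.0, 68.0)
```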

---

## 5. `BaselineStorage` (optional)

Path: `utils/baseline_storage.py`

```python
from utils.baseline_storage import BaselineStorage

storage = BaselineStorage(
    storage_type="file",
    file_path="data_storage/baselines.json",
    import_from_csv=False
)

storage.save_baseline({
    "device_id": "demo_user",
    "feature_name": "hrv_rmssd",
    "baseline_type": "personal",
    "baseline_mean": 75.0,
    "baseline_std": 5.0,
    "data_count": 30
})

baseline = storage.get_baseline("demo_user", "hrv_rmssd")
storage.update_baseline_incremental(
    "demo_user",
    "hrv_rmssd",
    new_value=70.0,
    data_count=baseline["data_count"] + 1,
)
```
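
The internals of `update_baseline_incremental` are not shown in this guide. For intuition, one common way to fold a new value into a stored mean, standard deviation, and count without keeping the raw history is a Welford-style update. The sketch below is a stand-in under that assumption, using the population standard deviation; `update_incremental` is a hypothetical function, not the storage class's method.

```python
import math


def update_incremental(baseline, new_value):
    """Return a new baseline dict with new_value folded in (Welford-style)."""
    n = baseline["data_count"]
    mean = baseline["baseline_mean"]
    m2 = baseline["baseline_std"] ** 2 * n      # sum of squared deviations so far

    n_new = n + 1
    mean_new = mean + (new_value - mean) / n_new
    m2_new = m2 + (new_value - mean) * (new_value - mean_new)

    updated = dict(baseline)
    updated.update({
        "baseline_mean": mean_new,
        "baseline_std": math.sqrt(m2_new / n_new),   # population std (assumption)
        "data_count": n_new,
    })
    return updated


# Baseline over the values [1, 2, 3], then fold in 4.
start = {"baseline_mean": 2.0, "baseline_std": math.sqrt(2.0 / 3.0), "data_count": 3}
updated = update_incremental(start, 4.0)
```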

---

## 6. Quick Scripts

| Scenario | File | Description |
| --- | --- | --- |
| Official inference (matches production) | `run_official_inference.py` | `python run_official_inference.py --window-file test_data/example_window.json` |
| Multi-scenario demo (random noise / missing data / sustained anomalies) | `test_quickstart.py` | `python test_quickstart.py` (the demo temporarily lowers the threshold) |
| PatchTrAD → build_case dual-mode demo | `simulate_patchad_case_pipeline.py` | `python simulate_patchad_case_pipeline.py --mode all` (prints pre-screening results, the case, and validation info) |
| Interactive demo (pick a sample and inspect the output) | `gradio_app.py` | `python gradio_app.py`, or deploy to a Hugging Face Space |

### 6.1 PatchTrAD + build_case demo

```bash
python simulate_patchad_case_pipeline.py --mode all
```

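The output covers:
- Mode A: the platform runs its own PatchTrAD and calls `POST /api/build_case` directly
- Mode B: the official two-step exchange, `precheck` followed by `build_case`
- A validation-failure example (missing `history_windows`)

Use `--mode platform` or `--mode official` to run a single flow, and pass your own JSONL file via `--data-file`.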

---

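## 7. FAQ

| Question | Resolution |
| --- | --- |
| No config file | `_load_config` falls back to default values automatically; no extra setup is needed |
| No static features | `FeatureCalculator` uses the defaults from the config |
| Changing the window size | Edit `detection.window_size` in `configs/detector_config.json` |
| Changing the feature list | Edit `configs/features_config.json`; no code changes are needed |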
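
The fallback behaviour described in the table can be pictured with a small loader that starts from built-in defaults and overlays whatever the JSON file provides. This is an illustrative sketch, not the code of `_load_config`: `load_config` and `DEFAULTS` are hypothetical names, the default values 12 and 0.53 come from this guide, and placing `threshold` under `detection` is an assumption.

```python
import json
from pathlib import Path

# Defaults quoted in this guide; the location of "threshold" is an assumption.
DEFAULTS = {"detection": {"window_size": 12, "threshold": 0.53}}


def load_config(path, defaults=DEFAULTS):
    """Return defaults overlaid with the JSON file at path, if it exists."""
    config = {section: dict(values) for section, values in defaults.items()}
    config_file = Path(path)
    if not config_file.is_file():
        return config                      # no file: defaults only
    with config_file.open("r", encoding="utf-8") as f:
        user_config = json.load(f)
    for section, values in user_config.items():
        if isinstance(values, dict):
            config.setdefault(section, {}).update(values)
        else:
            config[section] = values
    return config


config = load_config("configs/detector_config.json")
window_size = config["detection"]["window_size"]
```

With this shape, a config file only needs to list the keys it overrides, for example `{"detection": {"window_size": 24}}`.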

---

For more examples or extensions, see the "真实数据测试" (real-data testing) section of `README.md` or open an Issue. Happy detecting!