---
title: JSON Semantic Validator
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
short_description: Hybrid JSON validation with rules + ML auto-fixing
models:
  - thearnabsarkar/json-semval-minilm-v1
datasets:
  - thearnabsarkar/json-semval-synth-v1
---

# JSON Semantic Validator

A hybrid JSON validator that combines deterministic JSON Schema validation with ML-powered semantic error detection and auto-fixing.

## Quick Start

1. **Select an example** from the dropdown, or paste your own JSON schema and payload.
2. **Choose a backend**:
   - `rules-only`: Fast, deterministic validation only
   - `local`: Rules + ML predictions with heuristics
   - `onnx`: Rules + ML with ONNX inference (the fastest ML option)
3. **Click "Run Validation"** to see errors and suggested fixes.
4. **Enable "Apply minimal fixes"** to auto-correct issues.
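
To give a feel for what the deterministic `rules-only` pass reports, here is a minimal sketch of a rule check over a flat object schema. It is an illustration only: `check_payload` is a hypothetical helper, it covers just `required`, `type`, and `enum`, and it is not the app's code nor a JSON Schema Draft 2020-12 implementation. The `done` enum value is made up for the example.

```python
import json

# Maps JSON Schema type names to Python types (small subset, for illustration).
_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_payload(schema: dict, payload: dict) -> list[str]:
    """Return readable errors for a flat object schema (hypothetical helper)."""
    errors = []
    for key in schema.get("required", []):
        if key not in payload:
            errors.append(f"{key}: missing required property")
    for key, rule in schema.get("properties", {}).items():
        if key not in payload:
            continue
        value = payload[key]
        expected = _TYPES.get(rule.get("type"))
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(value, bool) and rule.get("type") in ("integer", "number")
        if expected and (not isinstance(value, expected) or bool_as_number):
            errors.append(f"{key}: expected {rule['type']}, got {type(value).__name__}")
        elif "enum" in rule and value not in rule["enum"]:
            errors.append(f"{key}: {value!r} is not one of {rule['enum']}")
    return errors


schema = json.loads("""
{
  "type": "object",
  "required": ["age", "status"],
  "properties": {
    "age": {"type": "integer"},
    "status": {"type": "string", "enum": ["pending", "done"]}
  }
}
""")
payload = json.loads('{"age": "25", "status": "pendng"}')

for error in check_payload(schema, payload):
    print(error)
```

Both fields are flagged here: `age` arrives as a string and `status` is not a listed enum value. These are the kinds of errors the ML backends then try to repair.
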

## ✨ Features

- **Real-time validation** against JSON Schema Draft 2020-12
- **Format checking** for dates, emails, URIs, etc.
- **Smart error detection** using a fine-tuned MiniLM model
- **Auto-fixing** with 8 fix actions:
  - Type casting (number, boolean)
  - Date parsing and normalization
  - Enum fuzzy matching
  - Key renaming for aliases
  - And more!
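
Two of the fix actions listed above, enum fuzzy matching and key renaming, can be approximated with the standard library. The sketch below is an assumption about how such fixes might work, not the app's implementation: `fix_enum`, `rename_aliases`, the `KEY_ALIASES` table, the `0.8` similarity cutoff, and the `active`/`closed` enum values are all hypothetical.

```python
from difflib import get_close_matches

# Hypothetical alias table; the aliases the app recognizes are not listed here.
KEY_ALIASES = {"e-mail": "email", "mail": "email"}


def fix_enum(value: str, allowed: list[str], cutoff: float = 0.8):
    """Return the closest allowed value, or None if nothing is similar enough."""
    matches = get_close_matches(value, allowed, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def rename_aliases(payload: dict, aliases: dict) -> dict:
    """Rename alias keys to their canonical names without overwriting data."""
    fixed = {}
    for key, value in payload.items():
        target = aliases.get(key, key)
        # If the canonical key is already present, leave the alias untouched.
        if target != key and target in payload:
            target = key
        fixed[target] = value
    return fixed


print(fix_enum("pendng", ["pending", "active", "closed"]))
print(rename_aliases({"e-mail": "user@example.com"}, KEY_ALIASES))
```

A similarity cutoff keeps the fix conservative: a value with no close match returns `None` and is left for the user to correct rather than being guessed.
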

## Performance

- **Rules-only**: Detects schema violations
- **Hybrid (Rules + ML)**: 60-80% auto-fix success rate on synthetic data

## Related Resources

- **Model**: [thearnabsarkar/json-semval-minilm-v1](https://huggingface.co/thearnabsarkar/json-semval-minilm-v1)
- **Dataset**: [thearnabsarkar/json-semval-synth-v1](https://huggingface.co/datasets/thearnabsarkar/json-semval-synth-v1)
- **GitHub**: [json-semantic-validator](https://github.com/thearnabsarkar/json-semantic-validator) (if applicable)

## Examples

The app includes pre-loaded examples demonstrating:

- Type mismatches (`"25"` instead of `25`)
- Invalid dates (`"15 Jan 2024"` instead of `"2024-01-15"`)
- Enum typos (`"pendng"` instead of `"pending"`)
- Boolean text (`"yes"` instead of `true`)

Try the examples to see the hybrid validator in action!
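
The type, date, and boolean cases above map onto small deterministic conversions. The following is a rough sketch of such conversions, assuming the helper names `to_number`, `to_iso_date`, and `to_boolean`, the accepted date formats, and the yes/no word lists, none of which come from the app itself.

```python
from datetime import datetime

# Assumed input formats; the formats the app actually accepts are not documented here.
# Note: %b matches month abbreviations in the current locale ("Jan" in English/C locales).
DATE_FORMATS = ("%d %b %Y", "%d/%m/%Y", "%Y-%m-%d")
TRUE_WORDS = {"yes", "true", "y", "1"}
FALSE_WORDS = {"no", "false", "n", "0"}


def to_number(text: str):
    """Cast numeric text to int or float; return None if it is not numeric."""
    try:
        return int(text)
    except ValueError:
        try:
            return float(text)
        except ValueError:
            return None


def to_iso_date(text: str):
    """Normalize a date string to YYYY-MM-DD, or return None if no format fits."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def to_boolean(text: str):
    """Map common yes/no words to a boolean; return None when ambiguous."""
    word = text.strip().lower()
    if word in TRUE_WORDS:
        return True
    if word in FALSE_WORDS:
        return False
    return None


print(to_number("25"), to_iso_date("15 Jan 2024"), to_boolean("yes"))
```

Each helper returns `None` instead of raising when it cannot convert a value, so a caller can fall back to reporting the original error.
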

## 🛠️ Technical Details

- **Base Model**: nreimers/MiniLM-L6-H384-uncased
- **Error Types**: 8 semantic error categories
- **Fix Actions**: 7 deterministic fix operations
- **Inference**: PyTorch or ONNX for fast CPU inference

## License

MIT License