---
title: JSON Semantic Validator
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
short_description: Hybrid JSON validation with rules + ML auto-fixing
models:
- thearnabsarkar/json-semval-minilm-v1
datasets:
- thearnabsarkar/json-semval-synth-v1
---
# JSON Semantic Validator
A hybrid JSON validator combining deterministic JSON Schema validation with ML-powered semantic error detection and auto-fixing.
## Quick Start
1. **Select an example** from the dropdown or paste your own JSON schema and payload
2. **Choose a backend**:
   - `rules-only`: Fast deterministic validation only
   - `local`: Rules + ML predictions with heuristics
   - `onnx`: Rules + ML with ONNX inference (fastest)
3. **Click "Run Validation"** to see errors and suggested fixes
4. **Enable "Apply minimal fixes"** to auto-correct issues
## ✨ Features
- **Real-time validation** against JSON Schema Draft 2020-12
- **Format checking** for dates, emails, URIs, etc.
- **Smart error detection** using a fine-tuned MiniLM model
- **Auto-fixing** with 8 fix actions:
  - Type casting (number, boolean)
  - Date parsing and normalization
  - Enum fuzzy matching
  - Key renaming for aliases
  - And more!
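
The fix actions named above can be sketched with the standard library alone. This is a minimal illustration, not the Space's actual code: every function name, the word lists, the accepted date formats, and the fuzzy-match cutoff are assumptions chosen for the example.

```python
from datetime import datetime
from difflib import get_close_matches

# Hypothetical helpers: names and thresholds are illustrative assumptions.
TRUE_WORDS = {"yes", "true", "y", "on"}
FALSE_WORDS = {"no", "false", "n", "off"}
DATE_FORMATS = ("%Y-%m-%d", "%d %b %Y", "%d %B %Y", "%d/%m/%Y")


def cast_number(value):
    """Turn a numeric string such as "25" into an int or float."""
    if not isinstance(value, str):
        return value
    text = value.strip()
    for caster in (int, float):
        try:
            return caster(text)
        except ValueError:
            continue
    return value  # leave anything unparseable untouched


def cast_boolean(value):
    """Map common boolean words ("yes", "no", ...) to True or False."""
    if isinstance(value, str):
        word = value.strip().lower()
        if word in TRUE_WORDS:
            return True
        if word in FALSE_WORDS:
            return False
    return value


def normalize_date(value):
    """Rewrite a recognised date string as ISO 8601 (YYYY-MM-DD)."""
    if not isinstance(value, str):
        return value
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return value


def fuzzy_enum(value, allowed, cutoff=0.8):
    """Snap a near-miss string to the closest allowed enum member."""
    if not isinstance(value, str):
        return value
    matches = get_close_matches(value, allowed, n=1, cutoff=cutoff)
    return matches[0] if matches else value


def rename_keys(payload, aliases):
    """Rename aliased keys (e.g. "e-mail" to "email"); other keys pass through."""
    return {aliases.get(key, key): val for key, val in payload.items()}


payload = {"age": "25", "signup": "15 Jan 2024", "status": "pendng",
           "active": "yes", "e-mail": "a@example.com"}
fixed = rename_keys(payload, {"e-mail": "email"})
fixed["age"] = cast_number(fixed["age"])
fixed["signup"] = normalize_date(fixed["signup"])
fixed["status"] = fuzzy_enum(fixed["status"], ["pending", "shipped", "cancelled"])
fixed["active"] = cast_boolean(fixed["active"])
print(fixed)
```

Each helper returns its input unchanged when it cannot produce a confident fix, which is one plausible way to keep fixes "minimal": a value is only rewritten when the conversion is unambiguous.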
## Performance
- **Rules-only**: Detects schema violations
- **Hybrid (Rules + ML)**: 60-80% auto-fix success rate on synthetic data
## Related Resources
- **Model**: [thearnabsarkar/json-semval-minilm-v1](https://huggingface.co/thearnabsarkar/json-semval-minilm-v1)
- **Dataset**: [thearnabsarkar/json-semval-synth-v1](https://huggingface.co/datasets/thearnabsarkar/json-semval-synth-v1)
- **GitHub**: [json-semantic-validator](https://github.com/thearnabsarkar/json-semantic-validator) (if applicable)
## Examples
The app includes pre-loaded examples demonstrating:
- Type mismatches (`"25"` instead of `25`)
- Invalid dates (`"15 Jan 2024"` instead of `"2024-01-15"`)
- Enum typos (`"pendng"` instead of `"pending"`)
- Boolean text (`"yes"` instead of `true`)
Try the examples to see the hybrid validator in action!
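
To make the four error classes concrete, here is a small stand-in for the rules-only pass, applied to a payload that contains each of them. It is a sketch under stated assumptions: the schema, the payload, and `find_errors` are invented for illustration, and the check covers only `type`, `enum`, and `format: date`, not the full JSON Schema Draft 2020-12 vocabulary the app validates against.

```python
from datetime import date

# Hypothetical schema and payload mirroring the pre-loaded example categories.
SCHEMA = {
    "type": "object",
    "properties": {
        "age": {"type": "integer"},
        "signup": {"type": "string", "format": "date"},
        "status": {"type": "string", "enum": ["pending", "shipped", "cancelled"]},
        "active": {"type": "boolean"},
    },
}
PAYLOAD = {"age": "25", "signup": "15 Jan 2024", "status": "pendng", "active": "yes"}

# JSON Schema type names mapped to Python types (subset, for illustration).
PY_TYPES = {"integer": int, "number": (int, float), "string": str, "boolean": bool}


def is_iso_date(text):
    """Return True if text parses as an ISO 8601 calendar date."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


def find_errors(schema, payload):
    """Return (key, violated_keyword) pairs for a tiny subset of JSON Schema."""
    errors = []
    for key, rule in schema["properties"].items():
        if key not in payload:
            continue
        value = payload[key]
        expected = rule.get("type")
        if expected in PY_TYPES:
            ok = isinstance(value, PY_TYPES[expected])
            # bool is a subclass of int in Python, so reject it for numeric types.
            if expected in ("integer", "number") and isinstance(value, bool):
                ok = False
            if not ok:
                errors.append((key, "type"))
                continue
        if "enum" in rule and value not in rule["enum"]:
            errors.append((key, "enum"))
        # "format" applies only to strings, as in JSON Schema.
        if rule.get("format") == "date" and isinstance(value, str) and not is_iso_date(value):
            errors.append((key, "format"))
    return errors


for key, keyword in find_errors(SCHEMA, PAYLOAD):
    print(f"{key}: violates '{keyword}' (got {PAYLOAD[key]!r})")
```

A deterministic pass like this reports which keyword failed and where; deciding that `"pendng"` was meant to be `"pending"` is the kind of judgment the README attributes to the ML and heuristic backends.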
## 🛠️ Technical Details
- **Base Model**: nreimers/MiniLM-L6-H384-uncased
- **Error Types**: 8 semantic error categories
- **Fix Actions**: 7 deterministic fix operations
- **Inference**: PyTorch or ONNX for fast CPU inference
## License
MIT License