thearnabsarkar committed
Commit 573f67b · verified · 1 Parent(s): 6e5f15c

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +68 -6
README.md CHANGED
@@ -1,12 +1,74 @@
  ---
- title: Json Semval Validator
- emoji: 🐨
- colorFrom: red
- colorTo: green
  sdk: gradio
- sdk_version: 5.49.0
  app_file: app.py
  pinned: false
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

  ---
+ title: JSON Semantic Validator
+ emoji: 🔍
+ colorFrom: blue
+ colorTo: purple
  sdk: gradio
+ sdk_version: 4.44.0
  app_file: app.py
  pinned: false
+ license: mit
+ short_description: Hybrid JSON validation with rules + ML auto-fixing
+ models:
+ - thearnabsarkar/json-semval-minilm-v1
+ datasets:
+ - thearnabsarkar/json-semval-synth-v1
  ---
 
+ # JSON Semantic Validator
+
+ A hybrid JSON validator combining deterministic JSON Schema validation with ML-powered semantic error detection and auto-fixing.
+
+ ## 🚀 Quick Start
+
+ 1. **Select an example** from the dropdown or paste your own JSON schema and payload
+ 2. **Choose backend**:
+    - `rules-only`: Fast deterministic validation only
+    - `local`: Rules + ML predictions with heuristics
+    - `onnx`: Rules + ML with ONNX inference (fastest)
+ 3. **Click "Run Validation"** to see errors and suggested fixes
+ 4. **Enable "Apply minimal fixes"** to auto-correct issues
+
+ ## ✨ Features
+
+ - **Real-time validation** against JSON Schema Draft 2020-12
+ - **Format checking** for dates, emails, URIs, etc.
+ - **Smart error detection** using a fine-tuned MiniLM model
+ - **Auto-fixing** with 8 fix actions:
+   - Type casting (number, boolean)
+   - Date parsing and normalization
+   - Enum fuzzy matching
+   - Key renaming for aliases
+   - And more!
+
+ ## 📊 Performance
+
+ - **Rules-only**: Detects schema violations
+ - **Hybrid (Rules + ML)**: 60-80% auto-fix success rate on synthetic data
+
+ ## 🔗 Related Resources
+
+ - **Model**: [thearnabsarkar/json-semval-minilm-v1](https://huggingface.co/thearnabsarkar/json-semval-minilm-v1)
+ - **Dataset**: [thearnabsarkar/json-semval-synth-v1](https://huggingface.co/datasets/thearnabsarkar/json-semval-synth-v1)
+ - **GitHub**: [json-semantic-validator](https://github.com/thearnabsarkar/json-semantic-validator) (if applicable)
+
+ ## 📝 Examples
+
+ The app includes pre-loaded examples demonstrating:
+ - Type mismatches (`"25"` instead of `25`)
+ - Invalid dates (`"15 Jan 2024"` instead of `"2024-01-15"`)
+ - Enum typos (`"pendng"` instead of `"pending"`)
+ - Boolean text (`"yes"` instead of `true`)
+
+ Try the examples to see the hybrid validator in action!
+
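
The README lists the kinds of fixes but does not show how they are implemented. The following is a minimal sketch of how the four example fixes could be done with the Python standard library alone. Every function name, the word lists, the date formats, and the `0.8` similarity cutoff are assumptions made for illustration; none of this is taken from the Space's code or the model.

```python
import difflib
from datetime import datetime

# Hypothetical word lists; the real validator may accept different spellings.
TRUE_WORDS = {"yes", "true", "y", "1"}
FALSE_WORDS = {"no", "false", "n", "0"}

# Hypothetical set of input formats to try, in order.
DATE_FORMATS = ("%Y-%m-%d", "%d %b %Y", "%d %B %Y", "%d/%m/%Y")


def cast_number(value):
    """Turn a numeric string such as "25" into an int or float; leave anything else unchanged."""
    if not isinstance(value, str):
        return value
    text = value.strip()
    for parser in (int, float):
        try:
            return parser(text)
        except ValueError:
            continue
    return value


def cast_boolean(value):
    """Map boolean-like text such as "yes" to True/False; leave anything else unchanged."""
    if isinstance(value, str):
        text = value.strip().lower()
        if text in TRUE_WORDS:
            return True
        if text in FALSE_WORDS:
            return False
    return value


def normalize_date(value):
    """Rewrite a recognized date string as ISO 8601 (YYYY-MM-DD); leave anything else unchanged."""
    if not isinstance(value, str):
        return value
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return value


def match_enum(value, allowed, cutoff=0.8):
    """Replace a near-miss string with the closest allowed enum value, if one is similar enough."""
    matches = difflib.get_close_matches(value, allowed, n=1, cutoff=cutoff)
    return matches[0] if matches else value


if __name__ == "__main__":
    print(cast_number("25"))
    print(normalize_date("15 Jan 2024"))
    print(match_enum("pendng", ["pending", "shipped", "cancelled"]))
    print(cast_boolean("yes"))
```

Each helper returns its input unchanged when it cannot make a confident fix, which keeps the sketch in line with the "minimal fixes" idea: a value is only rewritten when the target form is unambiguous.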
+ ## 🛠️ Technical Details
+
+ - **Base Model**: nreimers/MiniLM-L6-H384-uncased
+ - **Error Types**: 8 semantic error categories
+ - **Fix Actions**: 7 deterministic fix operations
+ - **Inference**: PyTorch or ONNX for fast CPU inference
+
+ ## License
+
+ MIT License