# Legal AI Risk Analyzer – V7

Multi-task Legal-BERT for clause classification and risk scoring.
## Performance
| Metric | Score |
|---|---|
| Classification Accuracy | 0.8793 (87.93%) |
| F1 Score (weighted) | 0.8750 |
| R² Score (risk scoring) | 0.8692 |
| MAE (risk scoring) | 3.57 points |
| Risk Category Accuracy | 0.9336 (93.36%) |
## Architecture

- Base model: `nlpaueb/legal-bert-base-uncased`
- Task 1: 100-class clause classification (LEDGAR)
- Task 2: Risk score regression (0–100)
- Heads: Linear(768→384→N) + ReLU + Dropout(0.25)
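The head layout above can be sanity-checked in isolation. This is a minimal sketch of just the two heads (the real model feeds them Legal-BERT's pooled `[CLS]` embedding); the variable names here are illustrative, not the model's own.

```python
import torch
import torch.nn as nn

hidden, num_labels = 768, 100  # 768-dim [CLS] vector, 100 LEDGAR classes

# Classification head: 768 → 384 → num_labels
classifier = nn.Sequential(
    nn.Linear(hidden, hidden // 2), nn.ReLU(), nn.Dropout(0.25),
    nn.Linear(hidden // 2, num_labels),
)
# Regression head: 768 → 384 → 1, squashed to (0, 1) by the sigmoid
regressor = nn.Sequential(
    nn.Linear(hidden, hidden // 2), nn.ReLU(), nn.Dropout(0.25),
    nn.Linear(hidden // 2, 1), nn.Sigmoid(),
)

x = torch.randn(4, hidden)   # a batch of 4 stand-in [CLS] vectors
print(classifier(x).shape)   # torch.Size([4, 100])
print(regressor(x).shape)    # torch.Size([4, 1]), values in (0, 1)
```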
## Usage
```python
import torch, json
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel


class MultiTaskLegalModel(nn.Module):
    def __init__(self, model_name, num_labels,
                 hidden_size=768, dropout_rate=0.25):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        self.dropout = nn.Dropout(dropout_rate)
        self.classifier = nn.Sequential(
            nn.Linear(hidden_size, hidden_size // 2),
            nn.ReLU(), nn.Dropout(dropout_rate),
            nn.Linear(hidden_size // 2, num_labels)
        )
        self.regressor = nn.Sequential(
            nn.Linear(hidden_size, hidden_size // 2),
            nn.ReLU(), nn.Dropout(dropout_rate),
            nn.Linear(hidden_size // 2, 1),
            nn.Sigmoid()  # risk in (0, 1); scaled to 0-100 at inference
        )

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask,
                        return_dict=True)
        cls = self.dropout(out.last_hidden_state[:, 0, :])  # [CLS] embedding
        return self.classifier(cls), self.regressor(cls).squeeze(-1)


# ── Load ──────────────────────────────────────────────────────────────────
model_dir = "path/to/legal_ai_hf_upload"  # or HF repo id after upload
with open(f"{model_dir}/metadata.json") as f:
    meta = json.load(f)

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = MultiTaskLegalModel(model_dir, meta["num_labels"])
model.load_state_dict(
    torch.load(f"{model_dir}/full_model.pt", map_location="cpu")
)
model.eval()


# ── Inference ─────────────────────────────────────────────────────────────
def analyse_clause(text: str) -> dict:
    inputs = tokenizer(text, return_tensors="pt",
                       truncation=True, max_length=256, padding=True)
    with torch.no_grad():
        logits, risk = model(**inputs)
    label = meta["label_names"][logits.argmax().item()]
    score = round(float(risk.item()) * 100, 1)
    category = "Low" if score < 40 else ("Medium" if score < 70 else "High")
    confidence = round(float(torch.softmax(logits, dim=1).max()), 3)
    return {
        "clause_type": label,
        "confidence": confidence,
        "risk_score": score,
        "category": category,
    }


# Test
result = analyse_clause(
    "The Company shall indemnify and hold harmless from all claims "
    "without limitation whatsoever."
)
print(result)
# {'clause_type': 'Indemnifications', 'confidence': 0.91,
#  'risk_score': 85.0, 'category': 'High'}
```
## Training details

- Dataset: LEDGAR – 60k train / 10k val / 10k test
- Hardware: 2× NVIDIA T4 (DataParallel)
- Batch size: 256 (128 per GPU) | FP16 mixed precision
- Optimizer: AdamW + layer-wise LR decay (γ=0.88)
- Scheduler: Cosine annealing + warmup (6%)
- Loss: CrossEntropyLoss (label_smoothing=0.1) + HuberLoss (δ=0.1)
- Epochs: 15
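Two of the choices above are easy to get wrong when reproducing: combining the two task losses, and layer-wise LR decay (LLRD). This is a hedged sketch of both, not the exact training script; `base_lr` and the layer grouping are illustrative assumptions.

```python
import torch
import torch.nn as nn

def combined_loss(logits, labels, risk_pred, risk_target):
    # Sum of the two task losses with the hyperparameters listed above.
    ce = nn.CrossEntropyLoss(label_smoothing=0.1)(logits, labels)
    huber = nn.HuberLoss(delta=0.1)(risk_pred, risk_target)
    return ce + huber

def llrd_param_groups(layers, base_lr=2e-5, gamma=0.88):
    # The last layer keeps the base LR; each earlier layer's LR is
    # multiplied by gamma once more (earliest layer learns slowest).
    n = len(layers)
    return [{"params": layer.parameters(),
             "lr": base_lr * gamma ** (n - 1 - i)}
            for i, layer in enumerate(layers)]

# Tiny 3-layer stand-in "encoder" to show the resulting LR ladder:
layers = nn.ModuleList([nn.Linear(8, 8) for _ in range(3)])
opt = torch.optim.AdamW(llrd_param_groups(layers, base_lr=1e-3, gamma=0.88))
print([round(g["lr"], 6) for g in opt.param_groups])
# [0.000774, 0.00088, 0.001]
```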