Legal AI Risk Analyzer – V7

Multi-task Legal-BERT for clause classification and risk scoring.

Performance

Metric                     Score
Classification Accuracy    0.8793 (87.93%)
F1 Score (weighted)        0.8750
R² Score (risk scoring)    0.8692
MAE (risk scoring)         3.57 points
Risk Category Accuracy     0.9336 (93.36%)
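The risk-category accuracy above is the regression output bucketed into Low/Medium/High (thresholds <40 and <70, the same ones used in the inference helper under Usage) and compared against the bucketed ground-truth scores. A minimal sketch, with illustrative score values:

```python
def risk_category(score: float) -> str:
    # Same thresholds as the inference code in the Usage section
    if score < 40:
        return "Low"
    return "Medium" if score < 70 else "High"

def category_accuracy(y_true, y_pred) -> float:
    # Fraction of clauses whose predicted score lands in the correct bucket
    pairs = zip(y_true, y_pred)
    matches = sum(risk_category(t) == risk_category(p) for t, p in pairs)
    return matches / len(y_true)

# Illustrative values only
y_true = [12.0, 55.0, 88.0, 41.0]
y_pred = [15.5, 52.0, 91.0, 38.0]
print(category_accuracy(y_true, y_pred))  # 0.75 (last pair crosses the 40 boundary)
```

Because the buckets are coarse, category accuracy (93.36%) can be much higher than exact-score agreement (MAE 3.57 points).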

Architecture

  • Base model: nlpaueb/legal-bert-base-uncased
  • Task 1: 100-class clause classification (LEDGAR)
  • Task 2: Risk score regression (0–100)
  • Heads: Linear(768→384→N) + ReLU + Dropout(0.25)
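As a standalone sanity check, the two-head layout above can be sketched and shape-checked in isolation (dimensions follow the bullets; the dummy batch is illustrative):

```python
import torch
import torch.nn as nn

hidden, num_labels = 768, 100

def head(out_dim: int) -> nn.Sequential:
    # Linear(768 -> 384 -> out_dim) with ReLU + Dropout(0.25), as described above
    return nn.Sequential(
        nn.Linear(hidden, hidden // 2), nn.ReLU(), nn.Dropout(0.25),
        nn.Linear(hidden // 2, out_dim))

classifier = head(num_labels)
regressor = nn.Sequential(head(1), nn.Sigmoid())  # sigmoid keeps risk in [0, 1]

cls = torch.randn(4, hidden)               # stand-in [CLS] embeddings, batch of 4
print(classifier(cls).shape)               # torch.Size([4, 100])
print(regressor(cls).squeeze(-1).shape)    # torch.Size([4])
```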

Usage

import torch, json
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

class MultiTaskLegalModel(nn.Module):
    def __init__(self, model_name, num_labels,
                 hidden_size=768, dropout_rate=0.25):
        super().__init__()
        self.bert       = AutoModel.from_pretrained(model_name)
        self.dropout    = nn.Dropout(dropout_rate)
        self.classifier = nn.Sequential(
            nn.Linear(hidden_size, hidden_size // 2),
            nn.ReLU(), nn.Dropout(dropout_rate),
            nn.Linear(hidden_size // 2, num_labels)
        )
        self.regressor  = nn.Sequential(
            nn.Linear(hidden_size, hidden_size // 2),
            nn.ReLU(), nn.Dropout(dropout_rate),
            nn.Linear(hidden_size // 2, 1),
            nn.Sigmoid()
        )
    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask,
                        return_dict=True)
        cls = self.dropout(out.last_hidden_state[:, 0, :])
        return self.classifier(cls), self.regressor(cls).squeeze(-1)


# ── Load ──────────────────────────────────────────────────────────────────
model_dir = "path/to/legal_ai_hf_upload"   # or HF repo id after upload

with open(f"{model_dir}/metadata.json") as f:
    meta = json.load(f)

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model     = MultiTaskLegalModel(model_dir, meta['num_labels'])
model.load_state_dict(
    torch.load(f"{model_dir}/full_model.pt", map_location="cpu")
)
model.eval()

# ── Inference ─────────────────────────────────────────────────────────────
def analyse_clause(text: str) -> dict:
    inputs = tokenizer(text, return_tensors="pt",
                       truncation=True, max_length=256, padding=True)
    with torch.no_grad():
        logits, risk = model(**inputs)
    label    = meta["label_names"][logits.argmax().item()]
    score    = round(float(risk.item()) * 100, 1)
    category = "Low" if score < 40 else ("Medium" if score < 70 else "High")
    confidence = round(float(torch.softmax(logits, dim=1).max()), 3)
    return {
        "clause_type": label,
        "confidence":  confidence,
        "risk_score":  score,
        "category":    category
    }

# Test
result = analyse_clause(
    "The Company shall indemnify and hold harmless from all claims "
    "without limitation whatsoever."
)
print(result)
# {'clause_type': 'Indemnifications', 'confidence': 0.91,
#   'risk_score': 85.0, 'category': 'High'}

Training details

  • Dataset: LEDGAR (60k train / 10k val / 10k test)
  • Hardware: 2× NVIDIA T4 (DataParallel)
  • Batch size: 256 (128 per GPU) | FP16 mixed precision
  • Optimizer: AdamW + Layer-wise LR Decay (γ=0.88)
  • Scheduler: Cosine annealing + warmup (6%)
  • Loss: CrossEntropyLoss (label_smoothing=0.1) + HuberLoss (δ=0.1)
  • Epochs: 15
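Layer-wise LR decay scales each encoder layer's learning rate by γ per step of depth, so the task heads train at the full rate while the embeddings move least. A minimal sketch of the AdamW parameter grouping, assuming the 12-layer encoder of legal-bert-base-uncased (the name-matching rules and `base_lr` are illustrative, not the exact training script):

```python
def llrd_param_groups(model, base_lr=3e-5, gamma=0.88, num_layers=12):
    """Build per-parameter groups with layer-wise learning-rate decay."""
    groups = []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        if "embeddings" in name:
            depth = num_layers + 1          # embeddings decay the most
        elif ".layer." in name:
            layer_idx = int(name.split(".layer.")[1].split(".")[0])
            depth = num_layers - layer_idx  # layer 11 -> depth 1, layer 0 -> depth 12
        else:
            depth = 0                       # pooler + task heads: full base_lr
        groups.append({"params": [param], "lr": base_lr * gamma ** depth})
    return groups

# optimizer = torch.optim.AdamW(llrd_param_groups(model), weight_decay=0.01)
```

With γ=0.88, layer 0 of the encoder trains at roughly 0.88¹² ≈ 0.22 of the base rate.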