Payment Extraction Model (Llama 3.2-1B)

Fine-tuned Llama 3.2-1B-Instruct for extracting payment information from multilingual text (English, Uzbek, Russian).

Model Details

  • Base Model: meta-llama/Llama-3.2-1B-Instruct
  • Training Data: 4,082 examples
  • Training Duration: 5 epochs
  • Method: LoRA (Low-Rank Adaptation)
  • Best Checkpoint: Step 900 (validation loss: 0.384)
  • Trainable Parameters: 0.9% (11.27M / 1.24B)
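
The trainable-parameter figure above can be sanity-checked with a little arithmetic. The sketch below is an illustration, not the author's training configuration: this card does not state the LoRA rank or target modules, so rank 16 applied to all seven attention and MLP projections is an assumption, and the layer dimensions are taken from the public Llama 3.2-1B config. A LoRA adapter on a linear layer with `in` inputs and `out` outputs adds `rank * (in + out)` parameters.

```python
# Llama 3.2-1B dimensions (assumed from the public base-model config)
HIDDEN = 2048        # hidden_size
INTERMEDIATE = 8192  # MLP intermediate_size
KV_DIM = 512         # 8 key/value heads x head_dim 64
NUM_LAYERS = 16

# (in_features, out_features) of each linear layer ASSUMED to carry a LoRA adapter
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(rank: int) -> int:
    """Total LoRA parameters: rank * (in + out) per adapted layer, over all blocks."""
    per_block = sum(rank * (n_in + n_out) for n_in, n_out in TARGETS.values())
    return per_block * NUM_LAYERS


trainable = lora_param_count(rank=16)  # rank 16 is an assumption
print(f"{trainable:,} trainable parameters ({trainable / 1.24e9:.2%} of 1.24B)")
```

Under these assumptions the count comes to 11,272,192, which is consistent with the 11.27M reported above; other rank and target-module combinations would give different totals.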

Capabilities

Extracts structured payment information:

  • amount: Payment amount
  • receiver_name: Recipient name
  • receiver_inn: Tax identification number
  • receiver_account: Bank account number
  • mfo: Bank code
  • payment_purpose: Purpose of payment
  • purpose_code: Payment purpose code
  • intent: Classification (create_transaction, partial_create_transaction, list_transaction)
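
The field list above maps naturally onto a small typed container. The following is a minimal sketch, assuming the model returns a flat JSON object keyed by these field names; the class name `PaymentExtraction`, the field types, and treating every field except `intent` as optional are assumptions made for illustration, not part of the model's documented contract.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional

# Intent labels listed in this card
INTENTS = {"create_transaction", "partial_create_transaction", "list_transaction"}


@dataclass
class PaymentExtraction:
    """Hypothetical container for one extraction result (types are assumed)."""
    intent: str
    amount: Optional[float] = None
    receiver_name: Optional[str] = None
    receiver_inn: Optional[str] = None
    receiver_account: Optional[str] = None
    mfo: Optional[str] = None
    payment_purpose: Optional[str] = None
    purpose_code: Optional[str] = None

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "PaymentExtraction":
        """Build from parsed JSON, rejecting unknown intents and ignoring extra keys."""
        if data.get("intent") not in INTENTS:
            raise ValueError(f"unexpected intent: {data.get('intent')!r}")
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})

    def missing_fields(self) -> list[str]:
        """Names of fields the model left empty, in declaration order."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]
```

Checking `missing_fields()` is one way an application could decide whether to ask the user for more details before creating a transaction.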

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load model
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    device_map="auto",
    torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base_model, "primel/aibama")
tokenizer = AutoTokenizer.from_pretrained("primel/aibama")

# Extract payment info
text = "Transfer 500000 to LLC Technopark, INN 123456789"
prompt = f"""<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a payment extraction assistant. Extract payment information from text and return ONLY valid JSON.<|eot_id|><|start_header_id|>user<|end_header_id|>

{text}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

"""

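# The prompt already starts with <|begin_of_text|>, so skip the tokenizer's own BOS token
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=300, do_sample=True, temperature=0.1)
# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)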
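
The system prompt asks for JSON only, but a small model can still emit surrounding text, so it may help to parse the `response` string defensively. The helper below is a sketch under that assumption; `extract_json` is a hypothetical name, and the sample string is hand-written for illustration rather than real model output.

```python
import json
from typing import Any, Optional


def extract_json(response: str) -> Optional[dict[str, Any]]:
    """Return the first JSON object found in `response`, or None if there is none."""
    decoder = json.JSONDecoder()
    start = response.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores the rest
            obj, _ = decoder.raw_decode(response, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = response.find("{", start + 1)
    return None


# Hand-written example string (NOT actual model output)
sample = 'Here is the result: {"amount": 500000, "receiver_inn": "123456789"} Done.'
print(extract_json(sample))
```

Returning `None` instead of raising lets the caller decide whether to retry generation or report a failure.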

Training Data Distribution

  • create_transaction: 36.7% (1,500 examples)
  • partial_create_transaction: 52.6% (2,148 examples)
  • list_transaction: 10.6% (434 examples)
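
The percentages above follow directly from the per-class counts. The short sketch below recomputes them and also derives inverse-frequency class weights, which is one common way to quantify the imbalance; the card does not say whether any reweighting was used in training, so the weights are purely illustrative.

```python
# Per-intent example counts as reported in this card
counts = {
    "create_transaction": 1500,
    "partial_create_transaction": 2148,
    "list_transaction": 434,
}

total = sum(counts.values())

# Share of each intent in percent, rounded to one decimal place
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Inverse-frequency weights (illustrative only): rarer intents get larger weights
weights = {name: total / (len(counts) * n) for name, n in counts.items()}

for name in counts:
    print(f"{name}: {counts[name]} examples, {shares[name]}%, weight {weights[name]:.2f}")
```

Because each share is rounded independently, the three percentages sum to 99.9% rather than exactly 100%.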

Performance

Metric               Value
Training Loss        0.3785
Validation Loss      0.3844
Mean Token Accuracy  92.59%
Entropy              0.424
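
For readers who prefer perplexity, the losses above can be converted with a one-line formula. This sketch assumes the reported losses are mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated explicitly in this card.

```python
import math

# Loss values as reported in this card
train_loss = 0.3785
val_loss = 0.3844


def perplexity(loss: float) -> float:
    """Perplexity = exp(mean per-token cross-entropy in nats)."""
    return math.exp(loss)


train_ppl = perplexity(train_loss)
val_ppl = perplexity(val_loss)
gap = val_loss - train_loss  # a small gap suggests limited overfitting

print(f"train perplexity: {train_ppl:.3f}")
print(f"validation perplexity: {val_ppl:.3f}")
print(f"validation - training loss: {gap:.4f}")
```

The small difference between training and validation loss is what one would expect from a checkpoint selected on validation loss.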

Limitations

  • Optimized for payment-related text in English, Uzbek, and Russian
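  • Requires access to the base model meta-llama/Llama-3.2-1B-Instruct (Llama 3.2 license), because this repository is loaded as a LoRA adapter on top of it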
  • Best performance on structured payment instructions

Citation

@misc{payment-extractor-llama32,
  author = {Your Name},
  title = {Payment Extraction Model},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/primel/aibama}
}