	---
	library_name: transformers
	tags: []
	---

	# Model Card

	<!-- Provide a quick summary of what the model is/does. -->



	### Model Description

	<!-- Provide a longer summary of what this model is. -->
	This is a fine-tuned version of DeepSeek-R1-Distill-Llama-8B, optimized for telecom-related queries. The model has been fine-tuned to provide concise and factual answers without role-playing as a customer service agent.

	- Developed by: Mohamed Abdulaziz
	- Model type: Fine-tune of DeepSeek-R1-Distill-Llama-8B
	- Frameworks used: Unsloth for fine-tuning and Weights & Biases (wandb) for performance monitoring
	- License: MIT License


	## Uses

	<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
	This model is designed for customer support automation in the telecom industry. It assists in:
	- Answering common user queries about 5G, network issues, billing, and services.
	- Providing concise and factually correct responses.
	- Reducing workload on human support agents by handling routine inquiries.

	### Who can use this model?
	- Telecom companies: Automate customer service via chatbots.
	- Developers & researchers: Fine-tune and adapt for different use cases.
	- Call centers: Support agents in handling user requests efficiently.

	### Who might be affected?
	- End-users interacting with telecom chatbots.
	- Support agents using AI-assisted tools.
	- Developers & data scientists fine-tuning and deploying the model.


	## How to Get Started with the Model


	### 1️⃣ Import necessary libraries
	```python
	import torch
	from unsloth import FastLanguageModel
	from transformers import AutoTokenizer
	```

	### 2️⃣ Define model path
	```python
	model_path = "moo100/DeepSeek-R1-telecom-chatbot"
	```

	### 3️⃣ Load the model and tokenizer
	```python
	model, tokenizer = FastLanguageModel.from_pretrained(
	    model_path,
	    max_seq_length=1024,  # Training used 2048; a smaller value reduces memory use and helps avoid OOM
	    dtype=None,  # None lets Unsloth pick the precision automatically
	)
	```

	### 4️⃣ Optimize model for fast inference with Unsloth
	```python
	FastLanguageModel.for_inference(model)  # Patches the model in place, so no reassignment is needed
	```

	### 5️⃣ Move model to GPU if available, otherwise use CPU
	```python
	device = "cuda" if torch.cuda.is_available() else "cpu"
	model.to(device)
	```

	### 6️⃣ Define system instruction to guide model responses
	```python
	system_instruction = """You are an AI assistant. Answer user questions concisely and factually.
	Do NOT role-play as a customer service agent. Only answer the user's query."""
	```

	### 7️⃣ Define user input (Replace with any query)
	```python
	user_input = "What are the benefits of 5G?"
	```

	### 8️⃣ Construct full prompt with instructions and user query
	```python
	full_prompt = f"{system_instruction}\n\nUser: {user_input}\nAssistant:"
	```

	### 9️⃣ Tokenize input prompt
	```python
	inputs = tokenizer(full_prompt, return_tensors="pt").to(device)
	```

	### 🔟 Generate model response with controlled stopping criteria
	```python
	outputs = model.generate(
	    input_ids=inputs.input_ids,  # Encoded input tokens
	    attention_mask=inputs.attention_mask,  # Marks which input positions are real tokens
	    max_new_tokens=100,  # Limits response length
	    do_sample=True,  # Enables sampling for variability
	    temperature=0.5,  # Lower values make sampling less random
	    top_k=50,  # Samples only from the 50 most probable tokens
	    eos_token_id=tokenizer.eos_token_id,  # Stops at the end-of-sequence token
	)
	```
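
To make the `temperature` and `top_k` settings above more concrete, here is a conceptual sketch of top-k sampling with temperature scaling in plain Python. It is not the code path that `model.generate` runs internally; the helper names and the toy logits are made up for illustration.

```python
import math
import random


def top_k_temperature_probs(logits, temperature=0.5, top_k=50):
    """Return {token_index: probability} after top-k filtering and temperature scaling."""
    # Keep only the indices of the top_k highest-scoring tokens.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Divide by the temperature; values below 1 sharpen the distribution.
    scaled = [logits[i] / temperature for i in top]
    # Numerically stable softmax over the surviving tokens.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return {i: w / total for i, w in zip(top, weights)}


def sample_token(logits, temperature=0.5, top_k=50, rng=random):
    """Draw one token index from the filtered, rescaled distribution."""
    probs = top_k_temperature_probs(logits, temperature, top_k)
    indices = list(probs)
    return rng.choices(indices, weights=[probs[i] for i in indices], k=1)[0]


toy_logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for a 4-token vocabulary
cool = top_k_temperature_probs(toy_logits, temperature=0.5, top_k=2)
warm = top_k_temperature_probs(toy_logits, temperature=1.0, top_k=2)
print(cool)
print(warm)
print(sample_token(toy_logits, temperature=0.5, top_k=2))
```

With `top_k=2`, only the two highest-scoring tokens can ever be sampled, and the lower temperature puts more of the probability mass on the top token.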

	### 1️⃣1️⃣ Decode and extract only the newly generated response
	```python
	response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
	```

	### 1️⃣2️⃣ Print the AI-generated response
	```python
	print(response.split("\n")[0].strip())
	```
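
The prompt construction (step 8) and the post-processing (step 12) can also be wrapped in small helpers so they are easy to reuse and check without loading the model. This is a minimal sketch; `build_prompt` and `first_line` are hypothetical names, and the sample decoded text is made up. One deliberate difference from step 12: leading whitespace is stripped before splitting, so an output that starts with a newline does not collapse to an empty string.

```python
def build_prompt(system_instruction: str, user_input: str) -> str:
    """Assemble the plain-text prompt in the same layout as step 8."""
    return f"{system_instruction}\n\nUser: {user_input}\nAssistant:"


def first_line(generated_text: str) -> str:
    """Keep only the first non-empty line of the decoded output."""
    # Strip first so a leading newline does not yield an empty result.
    return generated_text.strip().split("\n")[0].strip()


system_instruction = (
    "You are an AI assistant. Answer user questions concisely and factually.\n"
    "Do NOT role-play as a customer service agent. Only answer the user's query."
)

prompt = build_prompt(system_instruction, "What are the benefits of 5G?")
print(prompt)

# Made-up decoded text, standing in for the output of tokenizer.decode(...)
sample_output = "\n 5G offers faster speeds and lower latency.\nUser: Thanks!"
print(first_line(sample_output))
```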



	## Training Details

	### Training Data

	<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

	The model was fine-tuned on the `talkmap/telecom-conversation-corpus` dataset.

	### Training Procedure

	<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

	- Loss curve: declined steadily, indicating model convergence.
	- Learning rate schedule: linear decay.
	- Gradient norm: increased slightly but remained stable.
	- Global steps and epochs: logged to track training progress.

	Below are the training metrics recorded during fine-tuning:
	https://drive.google.com/file/d/1-SOfG8K3Qt2WSEuyj3kFhGYOYMB5Gk2r/view?usp=sharing
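
For readers unfamiliar with a linear-decay learning rate schedule, the sketch below shows the general shape: the rate falls in a straight line from its starting value to zero over training. The base learning rate, step count, and optional warmup are placeholder assumptions, not the values used for this model.

```python
def linear_decay_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup (optional) followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr (at the end of warmup) to 0 (at total_steps).
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Placeholder values chosen only for illustration.
BASE_LR = 2e-4
TOTAL_STEPS = 100

for step in (0, 25, 50, 75, 100):
    print(step, linear_decay_lr(step, TOTAL_STEPS, BASE_LR))
```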



	## Evaluation

	### Methodology

	The chatbot's responses were scored by Meta-Llama-3.3-70B-Instruct, used as a judge model, on relevance, correctness, and fluency.

	### Results

	Meta-Llama-3.3-70B-Instruct evaluation of the response to the query about 5G benefits:

	- Relevance: 9/10. The response is highly relevant to the user’s query about 5G benefits, providing a concise and informative summary.
	- Correctness: 10/10. The response is factually accurate, highlighting key advantages such as faster data speeds, lower latency, increased capacity, and broader device compatibility.
	- Fluency: 9/10. The response is well-structured, grammatically sound, and easy to understand. Minor refinements could further enhance readability.
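
If the judge model returns scores in the `Name: N/10` form shown above, they can be extracted and averaged with a few lines of Python. This is a minimal sketch: `parse_scores` is a hypothetical helper, the judge text is abbreviated from the results above, and the mean is simple arithmetic on those three scores rather than a figure reported by the evaluation.

```python
import re

# Abbreviated judge output, using the three scores listed above.
judge_output = """Relevance: 9/10
Correctness: 10/10
Fluency: 9/10"""


def parse_scores(text: str) -> dict:
    """Extract {criterion: score} pairs from lines such as 'Relevance: 9/10'."""
    # Allows an optional leading '- ' so bulleted lines are matched too.
    pattern = re.compile(r"^\s*(?:-\s*)?(\w+):\s*(\d+)\s*/\s*10", re.MULTILINE)
    return {name.lower(): int(score) for name, score in pattern.findall(text)}


scores = parse_scores(judge_output)
mean_score = sum(scores.values()) / len(scores)
print(scores)
print(round(mean_score, 2))
```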