findthehead committed on
Commit
67af3de
·
1 Parent(s): b3bbb65

initial commit with LFS for GGUF model

Files changed (3)
  1. .gitattributes +1 -0
  2. README.md +233 -0
  3. cveparrot.gguf +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ *.gguf filter=lfs diff=lfs merge=lfs -text
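
The added rule sets `filter`, `diff`, and `merge` to `lfs` and unsets `text` for every `*.gguf` file. As a rough illustration of how such a line breaks down, here is a minimal Python sketch. The helper names `parse_gitattributes_line` and `is_lfs_tracked` are hypothetical, and the matching is only an approximation: it handles slash-free patterns through `fnmatch` on the file name and ignores quoting, comments, blank lines, and directory patterns such as `saved_model/**/*`.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def parse_gitattributes_line(line):
    """Split one non-empty .gitattributes line into (pattern, attributes)."""
    pattern, *tokens = line.split()
    attrs = {}
    for token in tokens:
        if token.startswith("-"):
            attrs[token[1:]] = False   # "-text": attribute explicitly unset
        elif "=" in token:
            key, value = token.split("=", 1)
            attrs[key] = value         # "filter=lfs": attribute set to a value
        else:
            attrs[token] = True        # bare name: attribute set
    return pattern, attrs


def is_lfs_tracked(path, lines):
    """Approximate check using slash-free patterns only; the last match wins."""
    name = PurePosixPath(path).name
    tracked = False
    for line in lines:
        pattern, attrs = parse_gitattributes_line(line)
        if "/" not in pattern and fnmatch(name, pattern) and "filter" in attrs:
            tracked = attrs["filter"] == "lfs"
    return tracked


rule = "*.gguf filter=lfs diff=lfs merge=lfs -text"
print(parse_gitattributes_line(rule))
print(is_lfs_tracked("cveparrot.gguf", [rule]))
print(is_lfs_tracked("README.md", [rule]))
```

With this rule in place, `cveparrot.gguf` is committed as a small LFS pointer instead of the binary itself, which is what the last hunk of this commit shows.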
README.md CHANGED
@@ -1,3 +1,236 @@
1
  ---
2
  license: apache-2.0
3
+ language:
4
+ - en
5
+ tags:
6
+ - security
7
+ - cve
8
+ - vulnerability
9
+ - t5
10
+ - text-generation
11
+ base_model: google-t5/t5-small
12
  ---
13
+
14
+ # CVEParrot 🦜
15
+
16
+ CVEParrot is a Google T5 model fine-tuned on the CVE (Common Vulnerabilities and Exposures) database to understand and generate security vulnerability information.
17
+
18
+ ## Model Description
19
+
20
+ - **Developed by:** findthehead
21
+ - **Base Model:** Google T5 (`google-t5/t5-small`)
22
+ - **Training Data:** CVE Database
23
+ - **Language:** English
24
+ - **License:** Apache 2.0
25
+
26
+ This model has been specifically trained to understand and generate content related to cybersecurity vulnerabilities, CVE descriptions, and security intelligence.
27
+
28
+ ## Use Cases
29
+
30
+ - Generate CVE descriptions
31
+ - Analyze vulnerability information
32
+ - Security research and analysis
33
+ - Automated vulnerability documentation
34
+ - CVE information extraction and summarization
35
+
36
+ ## How to Use
37
+
38
+ ### Option 1: Using Hugging Face Transformers (Safetensors)
39
+
40
+ Install the required dependencies:
41
+
42
+ ```bash
43
+ pip install transformers torch sentencepiece
44
+ ```
45
+
46
+ **Inference Code:**
47
+
48
+ ```python
49
+ from transformers import T5Tokenizer, T5ForConditionalGeneration
50
+
51
+ # Load model and tokenizer
52
+ model_name = "Prachir-AI/cveparrot"
53
+ tokenizer = T5Tokenizer.from_pretrained(model_name)
54
+ model = T5ForConditionalGeneration.from_pretrained(model_name)
55
+
56
+ # Prepare input
57
+ input_text = "Describe CVE-2024-1234"
58
+ input_ids = tokenizer(input_text, return_tensors="pt").input_ids
59
+
60
+ # Generate output
61
+ outputs = model.generate(
62
+ input_ids,
63
+ max_length=512,
64
+ num_beams=4,
65
+ early_stopping=True,
66
+ temperature=0.7,
67
+ do_sample=True
68
+ )
69
+
70
+ # Decode and print result
71
+ generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
72
+ print(generated_text)
73
+ ```
74
+
75
+ **Advanced Usage with Custom Parameters:**
76
+
77
+ ```python
78
+ from transformers import T5Tokenizer, T5ForConditionalGeneration
79
+
80
+ # Load model and tokenizer
81
+ model_name = "Prachir-AI/cveparrot"
82
+ tokenizer = T5Tokenizer.from_pretrained(model_name)
83
+ model = T5ForConditionalGeneration.from_pretrained(model_name)
84
+
85
+ # Move to GPU if available
86
+ import torch
87
+ device = "cuda" if torch.cuda.is_available() else "cpu"
88
+ model = model.to(device)
89
+
90
+ # Example prompts
91
+ prompts = [
92
+ "Explain the security vulnerability:",
93
+ "Describe the CVE:",
94
+ "What is the impact of:",
95
+ ]
96
+
97
+ input_text = prompts[0] + " CVE-2024-1234"
98
+ input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)
99
+
100
+ # Generate with custom parameters
101
+ outputs = model.generate(
102
+ input_ids,
103
+ max_length=256,
104
+ min_length=50,
105
+ num_beams=5,
106
+ no_repeat_ngram_size=2,
107
+ early_stopping=True,
108
+ temperature=0.8,
109
+ top_k=50,
110
+ top_p=0.95,
111
+ do_sample=True
112
+ )
113
+
114
+ generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
115
+ print(generated_text)
116
+ ```
117
+
118
+ ### Option 2: Using GGUF Model with Ollama (Local Inference)
119
+
120
+ The model is available in GGUF format for efficient local inference using Ollama.
121
+
122
+ **Step 1: Install Ollama**
123
+
124
+ ```bash
125
+ # Linux
126
+ curl -fsSL https://ollama.com/install.sh | sh
127
+
128
+ # macOS
129
+ brew install ollama
130
+
131
+ # Or download from https://ollama.com
132
+ ```
133
+
134
+ **Step 2: Pull and Run the Model**
135
+
136
+ ```bash
137
+ # Pull the model
138
+ ollama pull Prachir-AI/cveparrot
139
+
140
+ # Interactive mode
141
+ ollama run Prachir-AI/cveparrot
142
+
143
+ # Single query
144
+ ollama run Prachir-AI/cveparrot "Describe CVE-2024-1234"
145
+ ```
146
+
147
+ **Using Ollama API (Python):**
148
+
149
+ ```bash
150
+ pip install ollama
151
+ ```
152
+
153
+ ```python
154
+ import ollama
155
+
156
+ # Generate response
157
+ response = ollama.generate(
158
+ model='Prachir-AI/cveparrot',  # same name as used with `ollama pull` above
159
+ prompt='Describe the security vulnerability CVE-2024-1234',
160
+ )
161
+
162
+ print(response['response'])
163
+ ```
164
+
165
+ **Using Ollama API (curl):**
166
+
167
+ ```bash
168
+ curl http://localhost:11434/api/generate -d '{
169
+ "model": "Prachir-AI/cveparrot",
170
+ "prompt": "Describe CVE-2024-1234",
171
+ "stream": false
172
+ }'
173
+ ```
174
+
175
+ ## Model Files
176
+
177
+ - `model.safetensors`: PyTorch model weights in Safetensors format
178
+ - `cveparrot.gguf`: Quantized GGUF model for efficient inference
179
+ - `tokenizer_config.json`: Tokenizer configuration
180
+ - `config.json`: Model configuration
181
+ - `spiece.model`: SentencePiece tokenizer model
182
+
183
+ ## Training Details
184
+
185
+ This model was fine-tuned on CVE database entries to understand and generate security vulnerability information. The training focused on:
186
+
187
+ - CVE descriptions and technical details
188
+ - Vulnerability severity and impact analysis
189
+ - Security patches and mitigation strategies
190
+ - Affected software and version information
191
+
192
+ ## Limitations
193
+
194
+ - The model is trained on historical CVE data and may not have information about very recent vulnerabilities
195
+ - Generated content should be verified against official CVE databases
196
+ - The model may occasionally generate plausible but incorrect security information
197
+ - Not a replacement for professional security analysis
198
+
199
+ ## Ethical Considerations
200
+
201
+ This model is designed for:
202
+ - ✅ Security research and education
203
+ - ✅ Vulnerability analysis and documentation
204
+ - ✅ Automated security intelligence gathering
205
+ - ✅ Assisting security professionals
206
+
207
+ This model should NOT be used for:
208
+ - ❌ Creating or exploiting vulnerabilities
209
+ - ❌ Malicious hacking activities
210
+ - ❌ Unauthorized security testing
211
+
212
+ ## Citation
213
+
214
+ If you use this model in your research or applications, please cite:
215
+
216
+ ```bibtex
217
+ @misc{cveparrot2024,
218
+ author = {findthehead},
219
+ title = {CVEParrot: A T5 Model for CVE Analysis},
220
+ year = {2024},
221
+ publisher = {HuggingFace},
222
+ url = {https://huggingface.co/Prachir-AI/cveparrot}
223
+ }
224
+ ```
225
+
226
+ ## Developer
227
+
228
+ - **HuggingFace:** [findthehead](https://huggingface.co/findthehead)
229
+
230
+ ## Feedback and Contributions
231
+
232
+ For issues, questions, or contributions, please visit the model repository on HuggingFace.
233
+
234
+ ## License
235
+
236
+ This model is released under the Apache 2.0 License. See the LICENSE file for details.
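
The README's curl example posts a JSON body with `model`, `prompt`, and `stream` to `http://localhost:11434/api/generate`. The same request can be sketched with Python's standard library alone. The function names `build_generate_request` and `generate` are hypothetical. The sketch assumes an Ollama server is listening on the default port and that the model name passed in matches the name the model has locally (`ollama list` shows it).

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt, url=OLLAMA_URL):
    """Build the same POST request that the README's curl example sends."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, timeout=120):
    """Send the request and return the generated text (needs a running server)."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Building the request does not touch the network, so it is safe to inspect.
req = build_generate_request("Prachir-AI/cveparrot", "Describe CVE-2024-1234")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Calling `generate("Prachir-AI/cveparrot", "Describe CVE-2024-1234")` would then return the `response` field, the same field the README's `ollama` Python example prints.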
cveparrot.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:78e3dd311ecfd17e9854a5b536176973d0bafb4f747f3147755c1ed5789a46c4
3
+ size 122073984
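
The three added lines are a Git LFS pointer: the repository stores only the SHA-256 digest and byte size of the model file. As a minimal sketch, a downloaded copy could be checked against that pointer as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local path `cveparrot.gguf` is an assumption about where the file was saved.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file has the size and SHA-256 recorded in the pointer."""
    if pointer["algo"] != "sha256" or os.path.getsize(path) != pointer["size"]:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in chunks so a file of about 122 MB is never fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer["digest"]


POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:78e3dd311ecfd17e9854a5b536176973d0bafb4f747f3147755c1ed5789a46c4
size 122073984
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
if os.path.exists("cveparrot.gguf"):  # only runs against a local download
    print(matches_pointer("cveparrot.gguf", pointer))
```

Checking the size first is a cheap early exit; the hash is only computed when the byte count already agrees.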