Arif committed on
Commit ef17ebc · 1 Parent(s): 5c19816

Updated readme

Files changed (1)
  1. README.md +99 -175
README.md CHANGED
@@ -1,3 +1,4 @@
 title: LLM Data Analyzer
 emoji: πŸ“Š
 colorFrom: blue
@@ -6,50 +7,50 @@ sdk: docker
 sdk_version: latest
 app_file: app.py
 pinned: false
-πŸ“Š LLM Data Analyzer
-An AI-powered tool for analyzing data and having conversations with an intelligent assistant powered by Llama 2.
-
-Features
-πŸ“€ Upload & Analyze: Upload CSV or Excel files and get instant analysis
-
-πŸ’¬ Chat: Have conversations with Llama 2 AI assistant
 
-πŸ“Š Data Statistics: View comprehensive data summaries and insights
 
-πŸš€ Fast: Runs on free Hugging Face CPU tier
 
-How to Use
-Upload Data - Start by uploading a CSV or Excel file
 
-Preview - Review your data and statistics
 
-Ask Questions - Get AI-powered analysis and insights
 
-Chat - Have follow-up conversations with the AI
 
-Technology Stack
-Model: Llama 2 7B (quantized to 4-bit)
 
-Framework: Streamlit
 
-Inference Engine: Llama.cpp
 
-Hosting: Hugging Face Spaces
 
-Language: Python 3.10+
 
-Performance
-Metric Value
-Speed ~5-10 tokens/second (free CPU)
-Model Size 4GB (quantized)
-Context Window 2048 tokens
-First Load ~30 seconds (model download)
-Subsequent Responses ~5-15 seconds
-Hardware Free Hugging Face CPU
-Local Development (Faster)
  For faster local development with GPU acceleration on Apple Silicon Mac:
 
-bash
 # Clone the repository
 git clone https://github.com/Arif-Badhon/LLM-Data-Analyzer
 cd LLM-Data-Analyzer
@@ -62,176 +63,99 @@ pip install -r requirements.txt
 
 # Run with MLX (Apple Silicon GPU - ~70 tokens/second)
 streamlit run app.py
-Deployment Options
-Option 1: Hugging Face Space (Free)
-CPU-based inference
-
-Speed: 5-10 tokens/second
-
-Cost: Free
-
-URL: https://huggingface.co/spaces/Arif-Badhon/llm-data-analyzer
-
-Option 2: Local with MLX (Fastest)
-GPU-accelerated on Apple Silicon
-
-Speed: 70+ tokens/second
-
-Cost: Free (uses your Mac)
-
-Perfect for development and portfolio showcase
-
-Option 3: Hugging Face PRO (Fast)
-GPU-accelerated inference
-
-Speed: 50+ tokens/second
-
-Cost: $9/month
-
-Best for production
-
-Project Structure
-text
-LLM-Data-Analyzer/
-β”œβ”€β”€ app.py               # HF deployment app (self-contained)
-β”œβ”€β”€ requirements.txt     # HF dependencies
-β”œβ”€β”€ README.md            # This file
-β”œβ”€β”€ frontend/            # Local Streamlit app
-β”‚   β”œβ”€β”€ app.py           # Multi-page local app
-β”‚   β”œβ”€β”€ pages/           # Streamlit pages
-β”‚   └── components/      # UI components
-β”œβ”€β”€ backend/             # FastAPI backend
-β”‚   β”œβ”€β”€ main.py
-β”‚   β”œβ”€β”€ routes/
-β”‚   └── services/
-β”œβ”€β”€ docker-compose.yml   # Local Docker setup
-└── .env.local           # Environment variables
-Environment Variables
-Create a .env.local file:
-
-bash
-# LLM Configuration
-DEBUG=true
-LLM_MODE=mlx # or llama_cpp
-LLM_MODEL_NAME_MLX=mlx-community/Llama-3.2-1B-Instruct
-LLM_MAX_TOKENS=512
-LLM_TEMPERATURE=0.7
-LLM_DEVICE=auto
-
-# Backend
-BACKEND_HOST=0.0.0.0
-BACKEND_PORT=8000
-
-# Frontend
-STREAMLIT_SERVER_PORT=8501
-Getting Started
-Quick Start (3 minutes)
-bash
-# 1. Install Python 3.10+
-# 2. Clone repo
-git clone https://github.com/Arif-Badhon/LLM-Data-Analyzer
-cd LLM-Data-Analyzer
-
-# 3. Install dependencies
-pip install -r frontend/requirements.txt
-
-# 4. Run Streamlit app
-streamlit run frontend/app.py
-With Docker (Local Development)
-bash
-# Make sure Docker Desktop is running
-docker-compose up --build
-
-# Access at http://localhost:8501
-Troubleshooting
-"Model download failed"
-Check internet connection
-
-HF Spaces need internet to download models from Hugging Face Hub
-
-Wait and refresh the page
-
-"App takes too long to load"
-Normal on first request (10-30 seconds)
-
-Model is being downloaded and cached
 
-Subsequent requests are much faster
 
-"Out of memory"
-Free tier CPU is limited
 
-Unlikely with quantized 4GB model
 
-If it happens, upgrade to HF PRO
 
-"Slow responses"
-Free tier CPU is slower than GPU
 
-Expected: 5-10 tokens/second
 
-For faster responses: use local MLX (70 t/s) or upgrade HF tier
-
-Future Improvements
-Add data visualization with Plotly charts
-
-Support for more file formats (JSON, Parquet, etc.)
-
-Database integration for conversation history
 
-User authentication and saved sessions
 
-Advanced analytics and statistical tests
 
-Export analysis reports as PDF
 
-Technologies Used
-Python - Core language
 
-Streamlit - Web UI framework
 
-FastAPI - Backend API framework
 
-Llama 2 - Large language model
 
-Llama.cpp - CPU inference
 
-MLX - Apple Silicon GPU inference
 
-Pandas - Data processing
 
-Plotly - Data visualization
 
-Docker - Containerization
 
-Hugging Face Hub - Model hosting
 
-License
-MIT License - feel free to use this project for personal or commercial purposes.
 
-Author
-Arif Badhon
 
-GitHub: @Arif-Badhon
 
-Portfolio: [Your Portfolio URL]
 
-Support
 If you encounter any issues:
 
-Check the Troubleshooting section
-
-Review Hugging Face Spaces Docs
-
-Open an issue on GitHub
-
-Acknowledgments
-Hugging Face - Model hosting and Spaces
-
-Streamlit - Web framework
-
-Meta AI - Llama models
-
-MLX Team - Apple Silicon support
 
-Happy analyzing! πŸš€
 
+---
 title: LLM Data Analyzer
 emoji: πŸ“Š
 colorFrom: blue
 
 sdk_version: latest
 app_file: app.py
 pinned: false
+---
 
+# πŸ“Š LLM Data Analyzer
 
+An AI-powered tool for analyzing data and having conversations with an intelligent assistant powered by Llama 2.
 
+## Features
 
+- **πŸ“€ Upload & Analyze**: Upload CSV or Excel files and get instant analysis
+- **πŸ’¬ Chat**: Have conversations with Llama 2 AI assistant
+- **πŸ“Š Data Statistics**: View comprehensive data summaries and insights
+- **πŸš€ Fast**: Runs on free Hugging Face CPU tier
 
+## How to Use
 
+1. **Upload Data** - Start by uploading a CSV or Excel file
+2. **Preview** - Review your data and statistics (see the sketch below)
+3. **Ask Questions** - Get AI-powered analysis and insights
+4. **Chat** - Have follow-up conversations with the AI
 
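Steps 1 and 2 are plain pandas under the hood of any such app. As a rough illustration (the file name and exact calls here are assumptions for this sketch, not this repo's code), the Preview stage might boil down to:

```python
import pandas as pd

# Hypothetical uploaded file; the app accepts CSV or Excel (step 1).
df = pd.read_csv("sales.csv")  # or: pd.read_excel("sales.xlsx")

# The kind of summary the Preview step (step 2) would show.
print(df.shape)                    # rows x columns
print(df.dtypes)                   # column types
print(df.describe(include="all"))  # per-column statistics
print(df.isna().sum())             # missing values per column
```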
+## Technology Stack
 
+- **Model**: Llama 2 7B (quantized to 4-bit)
+- **Framework**: Streamlit
+- **Inference Engine**: Llama.cpp (see the sketch below)
+- **Hosting**: Hugging Face Spaces
+- **Language**: Python 3.10+
 
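Since the stack pairs a 4-bit Llama 2 7B with Llama.cpp, a minimal sketch of that combination via the llama-cpp-python bindings looks roughly like this; the model filename and prompt are illustrative assumptions, not the app's actual code:

```python
from llama_cpp import Llama

# Load a 4-bit GGUF build of Llama 2 7B; the path is an assumption.
# n_ctx matches the 2048-token context window listed under Performance.
llm = Llama(model_path="llama-2-7b-chat.Q4_K_M.gguf", n_ctx=2048)

out = llm(
    "Q: What does a high standard deviation in a column suggest?\nA:",
    max_tokens=128,
)
print(out["choices"][0]["text"])
```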
+## Performance
 
+| Metric | Value |
+|--------|-------|
+| Speed | ~5-10 tokens/second (free CPU) |
+| Model Size | 4GB (quantized) |
+| Context Window | 2048 tokens |
+| First Load | ~30 seconds (model download) |
+| Subsequent Responses | ~5-15 seconds |
+| Hardware | Free Hugging Face CPU |
 
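As a sanity check on how the table's rows relate, assuming a typical answer of around 80 tokens at the free-tier rate:

```python
# Rough latency estimate from the table above (answer length is an assumption).
tokens_per_second = 7   # midpoint of the 5-10 t/s free-CPU range
answer_tokens = 80      # hypothetical typical response length

print(f"~{answer_tokens / tokens_per_second:.0f} s per answer")  # ~11 s, inside the 5-15 s row
```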
+## Local Development (Faster)
 
 For faster local development with GPU acceleration on Apple Silicon Mac:
 
+```bash
 # Clone the repository
 git clone https://github.com/Arif-Badhon/LLM-Data-Analyzer
 cd LLM-Data-Analyzer
 
 # Run with MLX (Apple Silicon GPU - ~70 tokens/second)
 streamlit run app.py
+```
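For reference, the MLX path on Apple Silicon might look like the sketch below using the mlx-lm package; the model id is borrowed from the removed `.env.local` example, and the wiring is an assumption rather than this repo's actual backend:

```python
from mlx_lm import load, generate

# Model id taken from the old .env.local example (an assumption for this sketch).
model, tokenizer = load("mlx-community/Llama-3.2-1B-Instruct")

# Generation runs on the Apple Silicon GPU; this is the ~70 tokens/second path.
reply = generate(
    model,
    tokenizer,
    prompt="Summarize: revenue rose 12% quarter over quarter.",
    max_tokens=100,
)
print(reply)
```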
 
+## Deployment Options
 
+### Option 1: Hugging Face Space (Free)
+- CPU-based inference
+- Speed: 5-10 tokens/second
+- Cost: Free
 
+### Option 2: Local with MLX (Fastest)
+- GPU-accelerated on Apple Silicon
+- Speed: 70+ tokens/second
+- Cost: Free (uses your Mac)
 
+### Option 3: Hugging Face PRO (Fast)
+- GPU-accelerated inference
+- Speed: 50+ tokens/second
+- Cost: $9/month
 
+## Getting Started
 
+### Quick Start (3 minutes)
 
+```bash
+# 1. Install Python 3.10+
+# 2. Clone repo
+git clone https://github.com/Arif-Badhon/LLM-Data-Analyzer
+cd LLM-Data-Analyzer
+
+# 3. Install dependencies
+pip install -r requirements.txt
+
+# 4. Run Streamlit app
+streamlit run app.py
+```
 
+### With Docker (Local Development)
 
+```bash
+# Make sure Docker Desktop is running
+docker-compose up --build
+
+# Access at http://localhost:8501
+```
 
+## Troubleshooting
 
+### "Model download failed"
+- Check internet connection
+- HF Spaces need internet to download models from Hugging Face Hub (pre-fetch sketch below)
+- Wait and refresh the page
 
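One way to sidestep flaky downloads is to pre-fetch the weights into the local Hugging Face cache before the app starts; a sketch with `huggingface_hub`, where the repo id and filename are assumptions (substitute whatever GGUF build `app.py` expects):

```python
from huggingface_hub import hf_hub_download

# Pre-download once; later loads hit the local cache instead of the network.
path = hf_hub_download(
    repo_id="TheBloke/Llama-2-7B-Chat-GGUF",  # assumed repo
    filename="llama-2-7b-chat.Q4_K_M.gguf",   # assumed 4-bit build
)
print(f"Cached at: {path}")
```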
+ ### "App takes too long to load"
119
+ - Normal on first request (10-30 seconds)
120
+ - Model is being downloaded and cached
121
+ - Subsequent requests are much faster
122
 
123
+ ### "Out of memory"
124
+ - Free tier CPU is limited
125
+ - Unlikely with quantized 4GB model
126
+ - If it happens, upgrade to HF PRO
127
 
128
+ ### "Slow responses"
129
+ - Free tier CPU is slower than GPU
130
+ - Expected: 5-10 tokens/second
131
+ - For faster responses: use local MLX (70 t/s) or upgrade HF tier
132
 
133
+## Technologies Used
 
+- **Python** - Core language
+- **Streamlit** - Web UI framework
+- **Llama 2** - Large language model
+- **Llama.cpp** - CPU inference
+- **MLX** - Apple Silicon GPU inference
+- **Pandas** - Data processing
+- **Docker** - Containerization
+- **Hugging Face Hub** - Model hosting
 
+## License
 
+MIT License
 
+## Author
 
+**Arif Badhon**
 
+## Support
 
 If you encounter any issues:
+1. Check the Troubleshooting section above
+2. Review Hugging Face Spaces Docs
+3. Open an issue on GitHub
 
+---
 
+**Happy analyzing! πŸš€**