Arif committed on
Commit ef17ebc · 1 Parent(s): 5c19816

Updated readme

Files changed (1)
  1. README.md +99 -175
README.md CHANGED
@@ -1,3 +1,4 @@
 title: LLM Data Analyzer
 emoji: πŸ“Š
 colorFrom: blue
@@ -6,50 +7,50 @@ sdk: docker
 sdk_version: latest
 app_file: app.py
 pinned: false
-πŸ“Š LLM Data Analyzer
-An AI-powered tool for analyzing data and having conversations with an intelligent assistant powered by Llama 2.
-
-Features
-πŸ“€ Upload & Analyze: Upload CSV or Excel files and get instant analysis
-
-πŸ’¬ Chat: Have conversations with Llama 2 AI assistant
 
-πŸ“Š Data Statistics: View comprehensive data summaries and insights
 
-πŸš€ Fast: Runs on free Hugging Face CPU tier
 
-How to Use
-Upload Data - Start by uploading a CSV or Excel file
 
-Preview - Review your data and statistics
 
-Ask Questions - Get AI-powered analysis and insights
 
-Chat - Have follow-up conversations with the AI
 
-Technology Stack
-Model: Llama 2 7B (quantized to 4-bit)
 
-Framework: Streamlit
 
-Inference Engine: Llama.cpp
 
-Hosting: Hugging Face Spaces
 
-Language: Python 3.10+
 
-Performance
-Metric Value
-Speed ~5-10 tokens/second (free CPU)
-Model Size 4GB (quantized)
-Context Window 2048 tokens
-First Load ~30 seconds (model download)
-Subsequent Responses ~5-15 seconds
-Hardware Free Hugging Face CPU
-Local Development (Faster)
  For faster local development with GPU acceleration on Apple Silicon Mac:
 
-bash
 # Clone the repository
 git clone https://github.com/Arif-Badhon/LLM-Data-Analyzer
 cd LLM-Data-Analyzer
@@ -62,176 +63,99 @@ pip install -r requirements.txt
 
 # Run with MLX (Apple Silicon GPU - ~70 tokens/second)
 streamlit run app.py
-Deployment Options
-Option 1: Hugging Face Space (Free)
-CPU-based inference
-
-Speed: 5-10 tokens/second
-
-Cost: Free
-
-URL: https://huggingface.co/spaces/Arif-Badhon/llm-data-analyzer
-
-Option 2: Local with MLX (Fastest)
-GPU-accelerated on Apple Silicon
-
-Speed: 70+ tokens/second
-
-Cost: Free (uses your Mac)
-
-Perfect for development and portfolio showcase
-
-Option 3: Hugging Face PRO (Fast)
-GPU-accelerated inference
-
-Speed: 50+ tokens/second
-
-Cost: $9/month
-
-Best for production
-
-Project Structure
-text
-LLM-Data-Analyzer/
-β”œβ”€β”€ app.py               # HF deployment app (self-contained)
-β”œβ”€β”€ requirements.txt     # HF dependencies
-β”œβ”€β”€ README.md            # This file
-β”œβ”€β”€ frontend/            # Local Streamlit app
-β”‚   β”œβ”€β”€ app.py           # Multi-page local app
-β”‚   β”œβ”€β”€ pages/           # Streamlit pages
-β”‚   └── components/      # UI components
-β”œβ”€β”€ backend/             # FastAPI backend
-β”‚   β”œβ”€β”€ main.py
-β”‚   β”œβ”€β”€ routes/
-β”‚   └── services/
-β”œβ”€β”€ docker-compose.yml   # Local Docker setup
-└── .env.local           # Environment variables
-Environment Variables
-Create a .env.local file:
-
-bash
-# LLM Configuration
-DEBUG=true
-LLM_MODE=mlx # or llama_cpp
-LLM_MODEL_NAME_MLX=mlx-community/Llama-3.2-1B-Instruct
-LLM_MAX_TOKENS=512
-LLM_TEMPERATURE=0.7
-LLM_DEVICE=auto
-
-# Backend
-BACKEND_HOST=0.0.0.0
-BACKEND_PORT=8000
-
-# Frontend
-STREAMLIT_SERVER_PORT=8501
-Getting Started
-Quick Start (3 minutes)
-bash
-# 1. Install Python 3.10+
-# 2. Clone repo
-git clone https://github.com/Arif-Badhon/LLM-Data-Analyzer
-cd LLM-Data-Analyzer
-
-# 3. Install dependencies
-pip install -r frontend/requirements.txt
-
-# 4. Run Streamlit app
-streamlit run frontend/app.py
-With Docker (Local Development)
-bash
-# Make sure Docker Desktop is running
-docker-compose up --build
-
-# Access at http://localhost:8501
-Troubleshooting
-"Model download failed"
-Check internet connection
-
-HF Spaces need internet to download models from Hugging Face Hub
-
-Wait and refresh the page
-
-"App takes too long to load"
-Normal on first request (10-30 seconds)
-
-Model is being downloaded and cached
 
-Subsequent requests are much faster
 
-"Out of memory"
-Free tier CPU is limited
 
-Unlikely with quantized 4GB model
 
-If it happens, upgrade to HF PRO
 
-"Slow responses"
-Free tier CPU is slower than GPU
 
-Expected: 5-10 tokens/second
 
-For faster responses: use local MLX (70 t/s) or upgrade HF tier
-
-Future Improvements
-Add data visualization with Plotly charts
-
-Support for more file formats (JSON, Parquet, etc.)
-
-Database integration for conversation history
 
-User authentication and saved sessions
 
-Advanced analytics and statistical tests
 
-Export analysis reports as PDF
 
-Technologies Used
-Python - Core language
 
-Streamlit - Web UI framework
 
-FastAPI - Backend API framework
 
-Llama 2 - Large language model
 
-Llama.cpp - CPU inference
 
-MLX - Apple Silicon GPU inference
 
-Pandas - Data processing
 
-Plotly - Data visualization
 
-Docker - Containerization
 
-Hugging Face Hub - Model hosting
 
-License
-MIT License - feel free to use this project for personal or commercial purposes.
 
-Author
-Arif Badhon
 
-GitHub: @Arif-Badhon
 
-Portfolio: [Your Portfolio URL]
 
-Support
 If you encounter any issues:
 
-Check the Troubleshooting section
-
-Review Hugging Face Spaces Docs
-
-Open an issue on GitHub
-
-Acknowledgments
-Hugging Face - Model hosting and Spaces
-
-Streamlit - Web framework
-
-Meta AI - Llama models
-
-MLX Team - Apple Silicon support
 
-Happy analyzing! πŸš€
 
+---
 title: LLM Data Analyzer
 emoji: πŸ“Š
 colorFrom: blue
 
 sdk_version: latest
 app_file: app.py
 pinned: false
+---
 
+# πŸ“Š LLM Data Analyzer
 
+An AI-powered tool for analyzing data and having conversations with an intelligent assistant powered by Llama 2.
 
+## Features
 
+- **πŸ“€ Upload & Analyze**: Upload CSV or Excel files and get instant analysis
+- **πŸ’¬ Chat**: Have conversations with Llama 2 AI assistant
+- **πŸ“Š Data Statistics**: View comprehensive data summaries and insights
+- **πŸš€ Fast**: Runs on free Hugging Face CPU tier
 
+## How to Use
 
+1. **Upload Data** - Start by uploading a CSV or Excel file
+2. **Preview** - Review your data and statistics (see the sketch below)
+3. **Ask Questions** - Get AI-powered analysis and insights
+4. **Chat** - Have follow-up conversations with the AI
 
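Steps 1 and 2 are plain pandas under the hood of any such app. As a rough illustration (the file name and exact calls here are assumptions for this sketch, not this repo's code), the Preview stage might boil down to:

```python
import pandas as pd

# Hypothetical uploaded file; the app accepts CSV or Excel (step 1).
df = pd.read_csv("sales.csv")  # or: pd.read_excel("sales.xlsx")

# The kind of summary the Preview step (step 2) would show.
print(df.shape)                    # rows x columns
print(df.dtypes)                   # column types
print(df.describe(include="all"))  # per-column statistics
print(df.isna().sum())             # missing values per column
```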
+## Technology Stack
 
+- **Model**: Llama 2 7B (quantized to 4-bit)
+- **Framework**: Streamlit
+- **Inference Engine**: Llama.cpp (see the sketch below)
+- **Hosting**: Hugging Face Spaces
+- **Language**: Python 3.10+
 
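Since the stack pairs a 4-bit Llama 2 7B with Llama.cpp, a minimal sketch of that combination via the llama-cpp-python bindings looks roughly like this; the model filename and prompt are illustrative assumptions, not the app's actual code:

```python
from llama_cpp import Llama

# Load a 4-bit GGUF build of Llama 2 7B; the path is an assumption.
# n_ctx matches the 2048-token context window listed under Performance.
llm = Llama(model_path="llama-2-7b-chat.Q4_K_M.gguf", n_ctx=2048)

out = llm(
    "Q: What does a high standard deviation in a column suggest?\nA:",
    max_tokens=128,
)
print(out["choices"][0]["text"])
```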
+## Performance
 
+| Metric | Value |
+|--------|-------|
+| Speed | ~5-10 tokens/second (free CPU) |
+| Model Size | 4GB (quantized) |
+| Context Window | 2048 tokens |
+| First Load | ~30 seconds (model download) |
+| Subsequent Responses | ~5-15 seconds |
+| Hardware | Free Hugging Face CPU |
 
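As a sanity check on how the table's rows relate, assuming a typical answer of around 80 tokens at the free-tier rate:

```python
# Rough latency estimate from the table above (answer length is an assumption).
tokens_per_second = 7   # midpoint of the 5-10 t/s free-CPU range
answer_tokens = 80      # hypothetical typical response length

print(f"~{answer_tokens / tokens_per_second:.0f} s per answer")  # ~11 s, inside the 5-15 s row
```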
+## Local Development (Faster)
 
 For faster local development with GPU acceleration on Apple Silicon Mac:
 
+```bash
 # Clone the repository
 git clone https://github.com/Arif-Badhon/LLM-Data-Analyzer
 cd LLM-Data-Analyzer
 
 # Run with MLX (Apple Silicon GPU - ~70 tokens/second)
 streamlit run app.py
+```
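For reference, the MLX path on Apple Silicon might look like the sketch below using the mlx-lm package; the model id is borrowed from the removed `.env.local` example, and the wiring is an assumption rather than this repo's actual backend:

```python
from mlx_lm import load, generate

# Model id taken from the old .env.local example (an assumption for this sketch).
model, tokenizer = load("mlx-community/Llama-3.2-1B-Instruct")

# Generation runs on the Apple Silicon GPU; this is the ~70 tokens/second path.
reply = generate(
    model,
    tokenizer,
    prompt="Summarize: revenue rose 12% quarter over quarter.",
    max_tokens=100,
)
print(reply)
```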
 
+## Deployment Options
 
+### Option 1: Hugging Face Space (Free)
+- CPU-based inference
+- Speed: 5-10 tokens/second
+- Cost: Free
 
+### Option 2: Local with MLX (Fastest)
+- GPU-accelerated on Apple Silicon
+- Speed: 70+ tokens/second
+- Cost: Free (uses your Mac)
 
+### Option 3: Hugging Face PRO (Fast)
+- GPU-accelerated inference
+- Speed: 50+ tokens/second
+- Cost: $9/month
 
+## Getting Started
 
+### Quick Start (3 minutes)
 
+```bash
+# 1. Install Python 3.10+
+# 2. Clone repo
+git clone https://github.com/Arif-Badhon/LLM-Data-Analyzer
+cd LLM-Data-Analyzer
+
+# 3. Install dependencies
+pip install -r requirements.txt
+
+# 4. Run Streamlit app
+streamlit run app.py
+```
 
+### With Docker (Local Development)
 
+```bash
+# Make sure Docker Desktop is running
+docker-compose up --build
+
+# Access at http://localhost:8501
+```
 
+## Troubleshooting
 
+### "Model download failed"
+- Check internet connection
+- HF Spaces need internet to download models from Hugging Face Hub (pre-fetch sketch below)
+- Wait and refresh the page
 
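One way to sidestep flaky downloads is to pre-fetch the weights into the local Hugging Face cache before the app starts; a sketch with `huggingface_hub`, where the repo id and filename are assumptions (substitute whatever GGUF build `app.py` expects):

```python
from huggingface_hub import hf_hub_download

# Pre-download once; later loads hit the local cache instead of the network.
path = hf_hub_download(
    repo_id="TheBloke/Llama-2-7B-Chat-GGUF",  # assumed repo
    filename="llama-2-7b-chat.Q4_K_M.gguf",   # assumed 4-bit build
)
print(f"Cached at: {path}")
```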
+ ### "App takes too long to load"
119
+ - Normal on first request (10-30 seconds)
120
+ - Model is being downloaded and cached
121
+ - Subsequent requests are much faster
122
 
123
+ ### "Out of memory"
124
+ - Free tier CPU is limited
125
+ - Unlikely with quantized 4GB model
126
+ - If it happens, upgrade to HF PRO
127
 
128
+ ### "Slow responses"
129
+ - Free tier CPU is slower than GPU
130
+ - Expected: 5-10 tokens/second
131
+ - For faster responses: use local MLX (70 t/s) or upgrade HF tier
132
 
133
+## Technologies Used
 
+- **Python** - Core language
+- **Streamlit** - Web UI framework
+- **Llama 2** - Large language model
+- **Llama.cpp** - CPU inference
+- **MLX** - Apple Silicon GPU inference
+- **Pandas** - Data processing
+- **Docker** - Containerization
+- **Hugging Face Hub** - Model hosting
 
+## License
 
+MIT License
 
+## Author
 
+**Arif Badhon**
 
+## Support
 
 If you encounter any issues:
+1. Check the Troubleshooting section above
+2. Review Hugging Face Spaces Docs
+3. Open an issue on GitHub
 
+---
 
+**Happy analyzing! πŸš€**