# LLM Data Analyzer

> **Local LLM-powered Data Analysis on Mac M4**

A powerful, dual-mode data analysis platform that leverages local LLMs (via MLX on Apple Silicon) or containerized models (via Docker) to provide intelligent insights from your data. Built with FastAPI and designed for privacy and performance.

## 🚀 Features

- **Dual-Mode LLM Support**:
  - **MLX Mode (Default)**: Runs optimized local LLMs directly on Apple Silicon (M1/M2/M3/M4) using `mlx-lm`.
  - **Docker Model Runner**: Connects to OpenAI-compatible model runners (like `llama.cpp` server) running in Docker containers.
- **Intelligent Data Analysis**:
  - Upload CSV or Excel files.
  - Perform statistical analysis, trend detection, and outlier identification.
  - Get ML-driven suggestions for data improvement.
- **Interactive Chat**: Chat with your data using the integrated LLM.
- **Modern API**: Robust FastAPI backend with comprehensive documentation.

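To give a flavor of what outlier identification involves, here is a minimal z-score check in pandas — an illustrative sketch, not this project's actual implementation:

```python
import pandas as pd

def zscore_outliers(series: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Flag values more than `threshold` standard deviations from the mean."""
    z = (series - series.mean()) / series.std(ddof=0)
    return z.abs() > threshold

s = pd.Series([10, 11, 9, 10, 12, 11, 10, 95])
print(s[zscore_outliers(s, threshold=2.0)])  # only the value 95 is flagged
```

The analyzer service combines checks like this with trend detection over the uploaded dataset.
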
## 🛠️ Tech Stack

- **Backend**: FastAPI, Uvicorn
- **LLM Engine**: MLX (Apple Silicon), Docker (Containerized)
- **Data Processing**: Pandas, NumPy, Scikit-learn
- **Package Management**: `uv`

## 📋 Prerequisites

- **Python**: 3.11 or higher
- **Package Manager**: `uv` (recommended) or `pip`
- **Hardware**: Mac with Apple Silicon (for MLX mode) or any system with Docker (for Docker mode)

## ⚡ Quick Start

### 1. Clone the Repository

```bash
git clone <repository-url>
cd llm-data-analyzer
```

### 2. Install Dependencies

Using `uv` (recommended):

```bash
uv sync
```

Or using `pip`:

```bash
pip install -r requirements.txt
```

### 3. Configuration

Copy the example environment file:

```bash
cp .env.example .env.local
```

Edit `.env.local` to configure your settings:

- **`FASTAPI_ENV`**: Set to `development` for hot-reloading.
- **`DEBUG`**: Set to `true` to use **MLX (Local)** mode. Set to `false` to use **Docker Model Runner**.
- **`LLM_MODEL_NAME`**: The Hugging Face model ID for MLX (e.g., `mlx-community/Llama-3.2-3B-Instruct-4bit`).
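Putting those together, a minimal `.env.local` for MLX mode might look like this (the model ID is only an example — any MLX-compatible model works):

```shell
# Example .env.local (illustrative values)
FASTAPI_ENV=development
DEBUG=true  # true = MLX (local) mode, false = Docker Model Runner
LLM_MODEL_NAME=mlx-community/Llama-3.2-3B-Instruct-4bit
```
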

### 4. Run the Backend

```bash
# Activate the virtual environment
source .venv/bin/activate

# Run the server
python -m backend.app.main
```

The API will be available at `http://localhost:8000`.

## 📖 API Documentation

Once the server is running, access the interactive API docs:

- **Swagger UI**: [http://localhost:8000/docs](http://localhost:8000/docs)
- **ReDoc**: [http://localhost:8000/redoc](http://localhost:8000/redoc)

### Key Endpoints

- `POST /api/v1/chat`: Chat with the LLM.
- `POST /api/v1/upload`: Upload a dataset (CSV/Excel).
- `POST /api/v1/analyze`: Perform specific analysis on uploaded data.
- `POST /api/v1/suggestions`: Get ML-driven data improvement suggestions.
- `GET /api/v1/health`: Check system health and current LLM mode.
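As a quick way to exercise the upload and analyze endpoints, you can generate a small sample dataset first (the filename and column names here are arbitrary):

```python
import numpy as np
import pandas as pd

# Build a small sample dataset with an obvious outlier for the analyzer to find
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=30, freq="D"),
    "sales": rng.normal(100, 10, 30).round(2),
})
df.loc[15, "sales"] = 500.0  # inject an outlier
df.to_csv("sample_sales.csv", index=False)
```

Then upload it via the Swagger UI, or with `curl -F "file=@sample_sales.csv" http://localhost:8000/api/v1/upload` — the exact multipart field name depends on the endpoint's schema, so check `/docs` first.
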

## 🏗️ Project Structure

```
.
├── backend/
│   └── app/
│       ├── api/          # API Routes
│       ├── services/     # Business Logic (LLM, Analyzer, etc.)
│       ├── models/       # Pydantic Models
│       └── main.py       # Application Entry Point
├── frontend/             # (Under Development) Streamlit Frontend
├── pyproject.toml        # Project Dependencies
└── README.md             # Project Documentation
```

## ⚠️ Frontend Status

The Streamlit frontend is currently under active development. In the meantime, please use the backend API directly, or via the Swagger UI, for testing and interaction.

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.