---
title: "Data Analysis App"
emoji: "📊"
colorFrom: "indigo"
colorTo: "blue"
sdk: "streamlit"
sdk_version: "1.39.0"
app_file: src/streamlit_app.py
pinned: false
license: "mit"
---

# 📊 Streamlit Data Analysis App (Gemini + Open-Source)

This Streamlit app lets you **upload CSV or Excel datasets**, automatically clean and preprocess them, create **quick visualizations**, and even get **AI-generated insights** powered by Gemini or open-source models.

---

## 🚀 Features

✅ Upload `.csv` or `.xlsx` datasets
✅ Automatic data cleaning & standardization
✅ Preprocessing pipeline (imputation, encoding, scaling)
✅ Quick visualizations (histogram, boxplot, correlation heatmap, etc.)
✅ Smart dataset summary and preview
✅ Optional **Gemini AI insights** for dataset interpretation

---

## 🧠 LLM Integration (Optional)

You can enable AI-generated insights with **Gemini 2.0 Flash** or your own Hugging Face model.

### 🔑 To configure:

1. Go to your Space’s **Settings → Secrets** tab.
2. Add the following:
   - `GEMINI_API_KEY` = your_gemini_api_key
   - `HF_TOKEN` = your_huggingface_token (optional)
3. Save, then **Restart your Space**.

If you don’t add an API key, the app will still work for data cleaning and visualization.

---

## 🛠️ Deployment Notes

- **Runtime:** Python SDK
- **SDK:** Streamlit
- **File formats supported:** `.csv`, `.xlsx`
- **Maximum file size:** 100 MB
- **Recommended visibility:** Public (for full file upload support)

---

## ⚙️ Troubleshooting

### ❌ AxiosError: Request failed with status code 403

If you encounter this:

- Ensure your Space is **Public** (not Private).
- Ensure `sdk: streamlit` and `app_file:` are correctly declared in the YAML metadata above.
- Check that your **runtime** is “Python SDK”.
- Recheck your **Gemini API Key** or token secrets.
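
To recheck the secrets, it can help to confirm that they are actually visible to the running app. Hugging Face Spaces expose secrets as environment variables, so a minimal sketch could look like the following. The helper name `llm_config_status` is hypothetical and not part of this app; only the secret names `GEMINI_API_KEY` and `HF_TOKEN` come from the configuration steps above.

```python
import os
from typing import Mapping, Optional

# Secret names taken from the configuration steps in this README.
GEMINI_SECRET = "GEMINI_API_KEY"
HF_SECRET = "HF_TOKEN"  # optional


def llm_config_status(env: Optional[Mapping[str, str]] = None) -> dict:
    """Report which LLM secrets are set, without exposing their values.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    Blank or whitespace-only values are treated as missing.
    """
    env = os.environ if env is None else env
    gemini_key = (env.get(GEMINI_SECRET) or "").strip()
    hf_token = (env.get(HF_SECRET) or "").strip()
    return {
        "gemini_enabled": bool(gemini_key),
        "hf_enabled": bool(hf_token),
    }


if __name__ == "__main__":
    status = llm_config_status()
    if status["gemini_enabled"]:
        print("GEMINI_API_KEY is set.")
    else:
        print("GEMINI_API_KEY is missing; cleaning and visualization still work.")
    print("HF_TOKEN set:", status["hf_enabled"])
```

The function only reports whether each secret is present, so its output is safe to print in the Space logs. It does not validate the key against the Gemini API.
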
### ✅ Fix Checklist

| Issue | Fix |
|-------|------|
| App fails to start | Verify `app_file` matches your actual Python filename |
| 403 Error | Make the Space public |
| API not found | Add key to **Settings → Secrets** |
| File upload broken | Ensure `sdk: streamlit` and `runtime: python` |

---

## 💡 Example Workflow

1. Upload your dataset (e.g., `global_freelancers_raw.csv`).
2. View the raw preview and cleaned data table.
3. Generate preprocessing pipelines (e.g., median imputation + one-hot encoding).
4. Visualize trends with histograms, boxplots, or heatmaps.
5. (Optional) Ask Gemini for AI insights about correlations, patterns, or recommendations.

---

## 🧩 Tech Stack

- **Frontend:** Streamlit
- **Backend:** Python (Pandas, NumPy, Scikit-learn)
- **AI Models:** Gemini 2.0 Flash / open-source LLMs (Qwen, Mistral, etc.)
- **Visualization:** Matplotlib, Seaborn

---

## 🧾 License

MIT License © 2025
You are free to use, modify, and share this app with attribution.

---