{"id":29437417,"url":"https://github.com/shaundann/autosight","last_synced_at":"2026-05-09T09:06:19.321Z","repository":{"id":304030741,"uuid":"1017555026","full_name":"shaundann/autosight","owner":"shaundann","description":"AutoSight is an AI-powered multi-agent data analysis pipeline built on Google Cloud. From ingesting raw CSVs to generating visualizations and natural language summaries — all results are displayed live in a Streamlit dashboard.","archived":false,"fork":false,"pushed_at":"2025-07-10T18:11:54.000Z","size":27,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-11T00:44:08.822Z","etag":null,"topics":["ai-agents","automated-data-analysis","bigquery","data-pipeline","gcp","google-cloud","llm","multi-agent-systems","python","streamlit","vertex-ai"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shaundann.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-10T18:01:58.000Z","updated_at":"2025-07-10T18:14:55.000Z","dependencies_parsed_at":"2025-07-11T00:45:10.151Z","dependency_job_id":"c312f9a2-4aee-45ac-bd37-6cd651ad6683","html_url":"https://github.com/shaundann/autosight","commit_stats":null,"previous_names":["shaundann/autosight"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/shaundann/autosight","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shaundann%2Fautosight","tags_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/repositories/shaundann%2Fautosight/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shaundann%2Fautosight/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shaundann%2Fautosight/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shaundann","download_url":"https://codeload.github.com/shaundann/autosight/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shaundann%2Fautosight/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265096818,"owners_count":23710794,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","automated-data-analysis","bigquery","data-pipeline","gcp","google-cloud","llm","multi-agent-systems","python","streamlit","vertex-ai"],"created_at":"2025-07-13T06:02:10.404Z","updated_at":"2026-05-09T09:06:14.285Z","avatar_url":"https://github.com/shaundann.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🤖 AutoSight – Automated AI Data Analysis Pipeline on Google Cloud\n\n**AutoSight** is a multi-agent AI system that automates the entire data analytics pipeline — from dataset ingestion to AI-powered insights — using Google Cloud Platform (GCP).\n\nWith just a CSV URL, AutoSight:\n- 📥 Crawls and uploads the dataset to Google Cloud Storage\n- 📊 Cleans and analyzes trends using pandas and seaborn\n- 🧠 Generates summaries using Gemini or GPT-like models\n- 🗃️ Loads data into BigQuery for structured querying\n- 📈 Displays all 
results in a modern Streamlit dashboard\n\n---\n\n## 🚀 Features\n\n- 🌐 Web-based CSV ingestion from any public URL  \n- 📈 Trend analysis + visual plots  \n- 🤖 Natural language summaries with LLMs  \n- ☁️ Cloud-native: Cloud Storage + BigQuery  \n- 🖥️ Streamlit-powered live dashboard  \n- 💡 Modular agent architecture for easy extensibility  \n\n---\n\n## 🛠️ Tech Stack\n\n| Layer         | Technology                      |\n|---------------|----------------------------------|\n| 💻 Backend     | Python 3.12                      |\n| ☁️ Cloud       | Google Cloud Storage, BigQuery, Vertex AI |\n| 🤖 AI          | Gemini Pro (or OpenAI GPT)       |\n| 📊 Visualization | matplotlib, seaborn, Streamlit  |\n| 🧱 Architecture | Multi-Agent System               |\n\n---\n\n## 📁 Folder Structure\n\n```\nAutoSight/\n├── main.py                      # Orchestrates entire pipeline\n├── dashboard.py                 # Streamlit frontend\n├── requirements.txt             # Python dependencies\n├── README.md                    # This file\n│\n├── agents/\n│   ├── data_crawler_agent/\n│   │   └── agent.py             # Downloads CSV to GCS\n│   ├── analyzer_agent/\n│   │   └── agent.py             # Analyzes data + generates summary\n│   └── bigquery_writer_agent/\n│       └── agent.py             # Loads dataset into BigQuery\n```\n\n\n---\n\n## ⚙️ Setup Guide\n\n### 1. Clone the repository\n\n```bash\ngit clone https://github.com/your-username/AutoSight.git\ncd AutoSight\n```\n\n### 2. Create a virtual environment\n\n```bash\npython3 -m venv venv\nsource venv/bin/activate\n```\n\n### 3. Install dependencies\n\n```bash\npip install -r requirements.txt\n```\n\n### 4. Configure GCP\n\n1. Enable these APIs:\n\n   * Vertex AI\n   * Cloud Storage\n   * BigQuery\n2. Create a service account with roles:\n\n   * `Storage Admin`\n   * `BigQuery Data Editor`\n   * `Vertex AI User`\n3. Download its credentials file as:\n\n   ```\n   autosight-agent-key.json\n   ```\n\n### 5. 
Run the pipeline\n\n```bash\npython main.py\n```\n\nThe script will:\n\n* Download and store the dataset in GCS\n* Analyze and generate a plot + summary\n* Load structured data into BigQuery\n* Launch the interactive dashboard automatically\n\n---\n\n## 📊 Example Dataset\n\nDefault dataset:\n\n\u003e [Air Travel (1958–1960)](https://people.sc.fsu.edu/~jburkardt/data/csv/airtravel.csv)\n\n---\n\n## 🧠 Future Enhancements\n\n* [ ] Upload custom CSVs via dashboard\n* [ ] Deploy Streamlit app to GCP App Engine\n* [ ] Add forecasting and clustering agents\n* [ ] Multi-dataset support + scheduling\n\n---\n\n## 🙅‍♂️ .gitignore\n\nCreate a `.gitignore` with the following:\n\n```\nautosight-agent-key.json\noutput.png\nsummary.txt\n__pycache__/\n*.pyc\nvenv/\n.env\n```\n\n---\n\n## 💬 Contact\n\nBuilt by **Shaun Danny**\n\n📧 [shaundanny2007@gmail.com](mailto:shaundanny2007@gmail.com)  \n💼 [LinkedIn](https://www.linkedin.com/in/shaundanny/)\n\n---\n\n## 🏁 Acknowledgements\n\nInspired by the [Google Cloud Multi-Agent Hackathon](https://googlecloudmultiagents.devpost.com) and the [original InsightAgents](https://github.com/Soulfullmens/insightagents) project.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshaundann%2Fautosight","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshaundann%2Fautosight","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshaundann%2Fautosight/lists"}