{"id":31753757,"url":"https://github.com/servicenow/insight-bench","last_synced_at":"2025-10-09T17:54:11.735Z","repository":{"id":244180883,"uuid":"813817183","full_name":"ServiceNow/insight-bench","owner":"ServiceNow","description":null,"archived":false,"fork":false,"pushed_at":"2025-08-05T18:49:37.000Z","size":22165,"stargazers_count":51,"open_issues_count":2,"forks_count":13,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-08-05T20:36:55.043Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ServiceNow.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-06-11T19:55:37.000Z","updated_at":"2025-08-05T18:49:41.000Z","dependencies_parsed_at":"2024-06-13T09:41:21.386Z","dependency_job_id":"2675937b-8909-43b8-af35-1eb3e6aea62f","html_url":"https://github.com/ServiceNow/insight-bench","commit_stats":null,"previous_names":["servicenow/insight-bench"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ServiceNow/insight-bench","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ServiceNow%2Finsight-bench","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ServiceNow%2Finsight-bench/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ServiceNow%2Finsight-bench/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ServiceNow%2Finsight-bench/manifests","owner_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/owners/ServiceNow","download_url":"https://codeload.github.com/ServiceNow/insight-bench/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ServiceNow%2Finsight-bench/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279001804,"owners_count":26083197,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-09T02:00:07.460Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-09T17:54:04.646Z","updated_at":"2025-10-09T17:54:11.730Z","avatar_url":"https://github.com/ServiceNow.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Insight-Bench\n\n![Banner](data/banner.jpg)\n\n## Evaluating Data Analytics Agents Through Multi-Step Insight Generation\n[[Paper]](https://arxiv.org/pdf/2407.06423)[[Website]](https://insightbench.github.io/)[[Dataset]](https://huggingface.co/datasets/ServiceNow/insight_bench)\n\n\nInsight-Bench is a benchmark dataset for evaluating end-to-end data analytics: it measures agents' ability to perform comprehensive data analysis across diverse use cases. It features carefully curated insights, an evaluation mechanism based on LLaMA-3-Eval or G-EVAL, and a data analytics agent, AgentPoirot.\n\n## Data\n\nAll ground-truth notebooks are in `data/notebooks`. 
\n\nAn example notebook can be found here: `data/notebooks/flag-1.ipynb`\n\n## 1. Install the Python libraries\n\n```\npip install --upgrade git+https://github.com/ServiceNow/insight-bench\n```\n\n## 2. Usage\n\nEvaluate an agent on a single notebook:\n\n```python\nimport os\nfrom insightbench import benchmarks, agents\n\n# Set OpenAI API Key\n# os.environ[\"OPENAI_API_KEY\"] = \"\u003copenai_api_key\u003e\"\n\n\n# Get Dataset\ndataset_dict = benchmarks.load_dataset_dict(\"data/notebooks/flag-1.json\")\n\n# Run an Agent\nagent = agents.Agent(\n    model_name=\"gpt-4o-mini\",\n    max_questions=2,\n    branch_depth=1,\n    n_retries=2,\n    savedir=\"results/sample\",\n)\npred_insights, pred_summary = agent.get_insights(\n    dataset_csv_path=dataset_dict[\"dataset_csv_path\"], return_summary=True\n)\n\n\n# Evaluate\nscore_insights = benchmarks.evaluate_insights(\n    pred_insights=pred_insights,\n    gt_insights=dataset_dict[\"insights\"],\n    score_name=\"rouge1\",\n)\nscore_summary = benchmarks.evaluate_summary(\n    pred=pred_summary, gt=dataset_dict[\"summary\"], score_name=\"rouge1\"\n)\n\n# Print Score\nprint(\"score_insights: \", score_insights)\nprint(\"score_summary: \", score_summary)\n```\n\n## 3. 
Evaluate Agent on Multiple Insights\n\n```bash\npython main.py --openai_api_key \u003copenai_api_key\u003e \\\n               --savedir_base \u003csavedir_base\u003e\n```\n\n\n## Citation\n\n```bibtex\n@article{sahu2024insightbench,\n  title={InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation},\n  author={Sahu, Gaurav and Puri, Abhay and Rodriguez, Juan and Abaskohi, Amirhossein and Chegini, Mohammad and Drouin, Alexandre and Taslakian, Perouz and Zantedeschi, Valentina and Lacoste, Alexandre and Vazquez, David and Chapados, Nicolas and Pal, Christopher and others},\n  journal={arXiv preprint arXiv:2407.06423},\n  year={2024}\n}\n\n```\n\n## 🤝 Contributing\n- Please check the outstanding issues and feel free to open a pull request.\n- Please share any feedback, suggestions, or feature requests in the issues section.\n- You are welcome to contribute to the codebase and add new datasets and flags.\n\n\n### Thank you!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fservicenow%2Finsight-bench","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fservicenow%2Finsight-bench","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fservicenow%2Finsight-bench/lists"}