{"id":29136353,"url":"https://github.com/lamm-mit/sparks","last_synced_at":"2025-10-24T09:28:05.175Z","repository":{"id":301124433,"uuid":"972559621","full_name":"lamm-mit/Sparks","owner":"lamm-mit","description":"Multi-modal, multi-agent AI system capable of independently conducting research by formulating hypotheses, performing experiments, and adapting its strategy ","archived":false,"fork":false,"pushed_at":"2025-06-25T08:28:22.000Z","size":1282,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-25T09:35:37.246Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lamm-mit.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-25T09:22:29.000Z","updated_at":"2025-06-25T08:28:25.000Z","dependencies_parsed_at":"2025-06-25T09:47:35.949Z","dependency_job_id":null,"html_url":"https://github.com/lamm-mit/Sparks","commit_stats":null,"previous_names":["lamm-mit/sparks"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/lamm-mit/Sparks","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lamm-mit%2FSparks","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lamm-mit%2FSparks/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lamm-mit%2FSparks/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lamm-mit%2FSparks/manifests","o
wner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lamm-mit","download_url":"https://codeload.github.com/lamm-mit/Sparks/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lamm-mit%2FSparks/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279002620,"owners_count":26083425,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-10T02:00:06.843Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-30T11:07:58.397Z","updated_at":"2025-10-10T03:34:33.370Z","avatar_url":"https://github.com/lamm-mit.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"https://github.com/user-attachments/assets/beb49b33-82d2-4421-bcbe-0a604588fc0d\" width=\"300\" alt=\"SPARKS\"/\u003e\n\u003c/div\u003e\n\n\u003ch1 align=\"center\"\u003eSPARKS\u003c/h1\u003e\n\u003ch2 align=\"center\"\u003eMulti-Agent Artificial Intelligence Model for End-to-End Scientific Discovery\u003c/h2\u003e\n\nA. Ghafarollahi, M.J. Buehler*\n\nMassachusetts Institute of Technology\n\n*mbuehler@MIT.EDU\n\nAdvances in artificial intelligence (AI) promise autonomous discovery, yet most systems still resurface knowledge latent in their training data. 
We present **Sparks, a multi-modal multi-agent AI model** that executes the entire discovery cycle, from hypothesis generation and experiment design through iterative refinement, to develop generalizable principles and a final report without human intervention.\n\nApplied to protein science, Sparks uncovered two previously unknown phenomena: (i) a length-dependent mechanical crossover whereby beta-sheet-biased peptides surpass alpha-helical ones in unfolding force beyond ~80 residues, establishing a new design principle for peptide mechanics; and (ii) a chain-length/secondary-structure stability map revealing unexpectedly robust beta-sheet-rich architectures and a \"frustration zone\" of high variance in mixed alpha/beta folds.\n\nThese findings emerged from fully self-directed reasoning cycles that combined generative sequence design, high-accuracy structure prediction and physics-aware property models, with paired generation-and-reflection agents enforcing self-correction and reproducibility. The key result is that Sparks can independently conduct rigorous scientific inquiry and identify previously unknown scientific principles.\n\n## Model Overview\n\n![Fig 1](https://github.com/user-attachments/assets/cfab1fe2-f8df-4d32-9c5a-dcd11b157d9a)\n\nFigure 1. **Overview of Sparks, a multi-agent AI model for automated scientific discovery.**\n\n**Panel a**: Contemporary AI systems excel at statistical generalization within known domains, but they rarely generate or validate hypotheses that extend beyond prior data, and they typically cannot identify shared principles across distinct phenomena. This is because powerful models tend to memorize physics without discovering shared concepts. For scientific discovery, however, elucidating more general, shared foundational concepts (such as a scaling law, design principle, or crossover) is critical for achieving significantly higher extrapolation capacity. 
**Panel b**: Sparks automates the end-to-end scientific process through four interconnected modules: 1) hypothesis generation, 2) testing, 3) refinement, and 4) documentation. The system begins with a user-defined query, which includes research goals, tools to test the hypothesis, and constraints to guide the experimentation. It then formulates an innovative research idea with a testable hypothesis, followed by rigorous experimentation and refinement cycles. All findings are synthesized into a final document that captures the research objective, methodology, results, and directions for future work, as well as a shared principle (in the examples presented here, a scaling law or mechanistic rule). Each module is operated by specialized AI agents with clearly defined, synergistic roles.\n\n### Installation\n\n```\nconda create -n Sparks python=3.10\nconda activate Sparks\n\n# Install PyPI requirements\npip install -r requirements.txt\n```\n\n### Launching Sparks\nThis repository provides code for running Sparks, a system for automated scientific discovery.\n\nTo launch Sparks, open and run the notebook in the main directory:\n\n```\nlaunch_Sparks.ipynb\n```\nThis notebook takes the following inputs:\n\n- ```query```: A user-defined research question or goal.\n\n- ```tools```: Custom Python functions (defined by the user) that Sparks can call to test ideas.\n\n- ```constraints```: Optional conditions (defined by the user) that should be respected when Sparks evaluates hypotheses.\n\n\n### Defining Custom Tools\nSparks relies on user-defined tools to validate research ideas. Here's how to define and connect them:\n\n1. **Define your tools** as Python functions in the file:\n  ```\n  functions.py\n  ```\n\n2. 
**Describe each tool** in the ```launch_Sparks.ipynb``` notebook, including:\n\n- The tool's name\n\n- A brief description of its purpose\n\n- Its input parameters\n\n- Its expected output\n\nThis description is how Sparks \"understands\" what each tool does.\n\n**To adapt Sparks for your use case, just update ```functions.py``` with your tools and modify ```launch_Sparks.ipynb``` to include their descriptions.**\n\n### Original paper\n\nPlease cite this work as:\n```\n@article{ghafarollahi2025sparksmultiagentartificialintelligence,\n      title={Sparks, a Multi-Agent AI Model for Automated End-to-End Scientific Discovery}, \n      author={Alireza Ghafarollahi and Markus J. Buehler},\n      year={2025},\n      eprint={2504.19017},\n      archivePrefix={arXiv},\n      primaryClass={cs.AI},\n      url={https://arxiv.org/abs/2504.19017}, \n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flamm-mit%2Fsparks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flamm-mit%2Fsparks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flamm-mit%2Fsparks/lists"}