{"id":21426722,"url":"https://github.com/j535d165/scisort","last_synced_at":"2025-07-14T10:30:43.484Z","repository":{"id":66129870,"uuid":"540602039","full_name":"J535D165/scisort","owner":"J535D165","description":"Sort files in research project folders in a scientific order","archived":false,"fork":false,"pushed_at":"2022-10-30T22:34:41.000Z","size":55,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-10-08T14:12:16.478Z","etag":null,"topics":["directory-lister","directory-listing","file-sorting","python","research","research-data-management","science","sorting","utrecht-university"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/J535D165.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2022-09-23T20:10:59.000Z","updated_at":"2024-09-15T09:01:21.000Z","dependencies_parsed_at":"2023-03-10T23:39:29.696Z","dependency_job_id":null,"html_url":"https://github.com/J535D165/scisort","commit_stats":{"total_commits":48,"total_committers":1,"mean_commits":48.0,"dds":0.0,"last_synced_commit":"7a81d998d6f0c4c86c35fa2f5493b9f02782533d"},"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/J535D165%2Fscisort","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/J535D165%2Fscisort/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/J535D165%2Fscisort/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/J535D165%2Fscisort/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/J535D165","download_url":"https://codeload.github.com/J535D165/scisort/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225970178,"owners_count":17553395,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["directory-lister","directory-listing","file-sorting","python","research","research-data-management","science","sorting","utrecht-university"],"created_at":"2024-11-22T21:43:25.539Z","updated_at":"2024-11-22T21:43:25.991Z","avatar_url":"https://github.com/J535D165.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg alt=\"Scisort - Sort files in science projects.\" src=\"https://github.com/J535D165/scisort/raw/main/scisort_repocard.svg\"\u003e\n\u003c/p\u003e\n\n# Scisort - Sort files in research projects\n\n![PyPI](https://img.shields.io/pypi/v/scisort) [![DOI](https://zenodo.org/badge/540602039.svg)](https://zenodo.org/badge/latestdoi/540602039)\n\nScisort is a Python package for sorting files in research projects and\nscientific (data) repositories. Files and folders are sorted in such a way\nthat inspecting folders in research projects is more intuitive. 
See the\n[philosophy of scisort](#philosophy-of-scisort) to understand the sorting algorithm.\n\n---\n\nSince scisort is a low-level API, most researchers, developers, and data\nscientists may be more interested in [`scitree`](https://github.com/J535D165/scitree).\nScitree is a smart recursive directory listing program that makes use of scisort.\n\n---\n\n## Philosophy of scisort\n\nPhilosophy of scisort and [scitree](https://github.com/J535D165/scitree):\n\n- Read the README first, therefore I'm on top\n- Before I install or use the content, I open the [LICENSE](https://choosealicense.com/).\n- Files first, folders second\n- Numbered files are [naturally sorted](https://en.wikipedia.org/wiki/Natural_sort_order)\n- I love [intuitive and reproducible project structures](https://doi.org/10.1371/journal.pcbi.1005510)\n- Follow the order of execution where possible\n- I ignore what git ignores\\*\n\n*\\* Only for [`scitree`](https://github.com/J535D165/scitree).*\n\nFor more information about the structure, see [scisort/scisort/keygen.py](https://github.com/J535D165/scisort/blob/main/scisort/keygen.py).\n\n## Installation\n\nScisort requires Python 3.6 or later.\n\n```sh\npip install scisort\n```\n\n## Getting started\n\n### Traditional sorting\n\nConsider the following project folder structure. It's a mixture of files and\nfolders. The listing is sorted alphabetically by file or folder name. 
This order is not\nintuitive: `LICENSE.txt` precedes `README.md`, `output` precedes `scripts`, and\nnumbered files sort as text, so `metrics_sim_Bos_2018_559.json` lands after\n`metrics_sim_Bos_2018_4699.json`.\n\n```python\nfiles = ['LICENSE.txt',\n 'README.md',\n 'data',\n 'data/Bos_2018.csv',\n 'jobs.sh',\n 'output',\n 'output/simulation',\n 'output/simulation/Bos_2018',\n 'output/simulation/Bos_2018/descriptives',\n 'output/simulation/Bos_2018/descriptives/data_stats_Bos_2018.json',\n 'output/simulation/Bos_2018/descriptives/wordcloud_Bos_2018.png',\n 'output/simulation/Bos_2018/descriptives/wordcloud_irrelevant_Bos_2018.png',\n 'output/simulation/Bos_2018/descriptives/wordcloud_relevant_Bos_2018.png',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_0.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_1640.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_3154.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_3518.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_3519.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_3721.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_4612.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_4699.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_559.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_5673.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_6.json',\n 'output/simulation/Bos_2018/plot_recall_sim_Bos_2018.png',\n 'output/simulation/Bos_2018/state_files',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_0.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_1640.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_3154.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_3518.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_3519.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_3721.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_4612.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_4699.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_559.asreview',\n 
'output/simulation/Bos_2018/state_files/sim_Bos_2018_5673.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_6.asreview',\n 'output/tables',\n 'output/tables/data_descriptives.csv',\n 'output/tables/data_descriptives.xlsx',\n 'output/tables/data_metrics.csv',\n 'output/tables/data_metrics.xlsx',\n 'scripts',\n 'scripts/get_plot.py',\n 'scripts/merge_descriptives.py',\n 'scripts/merge_metrics.py']\n```\n\nThe files and folders are real research output created with `ASReview-makita`\n(see [examples](examples)).\n\n### Scisort sorting\n\nScisort integrates with Python's `sorted` by supplying the sort key.\n\n```python\nfrom scisort import scisort_keygen\n\nsorted(files, key=scisort_keygen())\n```\n\n```python\n['README.md',\n 'LICENSE.txt',\n 'jobs.sh',\n 'data',\n 'data/Bos_2018.csv',\n 'scripts',\n 'scripts/get_plot.py',\n 'scripts/merge_descriptives.py',\n 'scripts/merge_metrics.py',\n 'output',\n 'output/simulation',\n 'output/simulation/Bos_2018',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_0.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_6.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_559.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_1640.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_3154.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_3518.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_3519.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_3721.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_4612.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_4699.json',\n 'output/simulation/Bos_2018/metrics_sim_Bos_2018_5673.json',\n 'output/simulation/Bos_2018/plot_recall_sim_Bos_2018.png',\n 'output/simulation/Bos_2018/descriptives',\n 'output/simulation/Bos_2018/descriptives/wordcloud_Bos_2018.png',\n 'output/simulation/Bos_2018/descriptives/wordcloud_irrelevant_Bos_2018.png',\n 'output/simulation/Bos_2018/descriptives/wordcloud_relevant_Bos_2018.png',\n 
'output/simulation/Bos_2018/descriptives/data_stats_Bos_2018.json',\n 'output/simulation/Bos_2018/state_files',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_0.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_6.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_559.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_1640.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_3154.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_3518.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_3519.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_3721.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_4612.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_4699.asreview',\n 'output/simulation/Bos_2018/state_files/sim_Bos_2018_5673.asreview',\n 'output/tables',\n 'output/tables/data_descriptives.csv',\n 'output/tables/data_descriptives.xlsx',\n 'output/tables/data_metrics.csv',\n 'output/tables/data_metrics.xlsx']\n```\n\n### Third-party support\n\nScisort also integrates with other libraries that sort by a key function.\n\n#### Pandas\n\n```python\nimport pandas as pd\n\nfrom scisort import scisort_keygen_pandas\n\npd.Series(files).sort_values(key=scisort_keygen_pandas())\n```\n\n#### Natsort\n\n```python\nimport natsort as ns\n\nfrom scisort import scisort_keygen\n\nns.natsorted(files, key=scisort_keygen())\n```\n\n## License\n\n[MIT](/LICENSE)\n\n## Contact\n\nFeel free to reach out with questions, remarks, and suggestions. The\n[issue tracker](/issues) is a good starting point. You can also email me at\n[jonathandebruinos@gmail.com](mailto:jonathandebruinos@gmail.com).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fj535d165%2Fscisort","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fj535d165%2Fscisort","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fj535d165%2Fscisort/lists"}