{"id":21279002,"url":"https://github.com/dagshub/fds","last_synced_at":"2025-04-12T21:23:41.308Z","repository":{"id":36976527,"uuid":"353587745","full_name":"DagsHub/fds","owner":"DagsHub","description":"Fast Data Science, AKA fds, is a CLI for Data Scientists to version control data and code at once, by conveniently wrapping git and dvc","archived":false,"fork":false,"pushed_at":"2024-06-30T23:54:20.000Z","size":228,"stargazers_count":387,"open_issues_count":15,"forks_count":22,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-04-04T00:11:29.486Z","etag":null,"topics":["data-science","dvc","git"],"latest_commit_sha":null,"homepage":"http://fastds.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DagsHub.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-04-01T05:43:43.000Z","updated_at":"2025-03-22T11:40:15.000Z","dependencies_parsed_at":"2024-06-21T13:00:05.635Z","dependency_job_id":"cf53b0d2-5503-49fa-aa01-34ff05d0b7b0","html_url":"https://github.com/DagsHub/fds","commit_stats":{"total_commits":227,"total_committers":12,"mean_commits":"18.916666666666668","dds":0.3876651982378855,"last_synced_commit":"c3eb10a2e096c9eee5e57487af447f269a5739a4"},"previous_names":[],"tags_count":23,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DagsHub%2Ffds","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DagsHub%2Ffds/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DagsHub%2Ffd
s/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DagsHub%2Ffds/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DagsHub","download_url":"https://codeload.github.com/DagsHub/fds/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248632988,"owners_count":21136785,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","dvc","git"],"created_at":"2024-11-21T10:17:59.006Z","updated_at":"2025-04-12T21:23:41.274Z","avatar_url":"https://github.com/DagsHub.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# [![Fast Data Science](https://user-images.githubusercontent.com/18662887/122681354-821f8680-d1fc-11eb-9c72-575d66ff0c3b.png) aka `fds`](http://fastds.io)\n\n[![Discord](https://img.shields.io/discord/698874030052212737)](https://discord.com/invite/9gU36Y6)\n[![Tests](https://github.com/dagshub/fds/actions/workflows/test.yml/badge.svg?branch=main)](https://github.com/DAGsHub/fds/actions/workflows/test.yml)\n[![PyPI](https://img.shields.io/pypi/v/fastds.svg)](https://pypi.python.org/pypi/fastds/)\n[![MIT license](https://img.shields.io/badge/License-MIT-blue.svg)](https://lbesson.mit-license.org/)\n\u003ca href=\"https://twitter.com/TheRealDAGsHub\" title=\"DagsHub on Twitter\"\u003e\u003cimg src=\"https://img.shields.io/twitter/follow/TheRealDAGsHub.svg?style=social\"\u003e\u003c/a\u003e\n\n---\n\n`fds` is a tool for Data Scientists made by [DagsHub](https://dagshub.com/) to version control data and code at 
once.\n\nAt a high level, `fds` is a command line wrapper around Git and [DVC](https://dvc.org), meant to minimize the chances of human error, automate repetitive tasks, and provide a smoother landing for new users.\n\n[See the launch blog](https://dagshub.com/blog/fds-fast-data-science-with-git-and-dvc) for more information about the motivation behind this project.\n\n## Installation\n\n- Install `fds` using pip: `pip3 install fastds`\n- Once installed successfully, you can start using `fds`\n- e.g. `fds init` should trigger the init command\n- You can also use `sdf` instead of `fds` - it's identical, but might be more fun to type 🤓 \n\n\n## Commands Supported\n\n```\n$ fds -h\nusage: fds [-h] [-v] {init,status,add,commit,push,save} ...\n\nOne command for all your git and dvc needs\n\npositional arguments:\n  {init,status,add,commit,push,save}\n                        command (refer commands section in documentation)\n    init                initialize a git and dvc repository\n    status              get status of your git and dvc repository\n    add                 add files/folders to git and dvc repository\n    commit              commits added changes to git and dvc repository\n    clone               Clones from git repository and pulls from dvc remote\n    push                push commits to remote git and dvc repository\n    save                saves all project files to a new version and pushes\n                        them to your remote\n```\n\n## Examples\n\n### `fds status` = `dvc status` + `git status`\n`fds status` lets us quickly check the full status of the repo - both DVC and git at the same time, to make sure we don't forget anything.\n\n![image](https://user-images.githubusercontent.com/611655/121861591-9d712a00-cd02-11eb-9a8f-a9579f773889.png)\n\nHere, we can see that we have a small, normal text file - `.gitignore`, plus a `bigfile.txt` and `data` folder which we would want to add to DVC and not to git. 
`fds add` makes that easy!\n\n### `fds add` = `dvc add` + `git add` wizard 🧙‍♂️\n\nYou're probably used to the convenience of using `git add .` to just track everything. Unfortunately, you have to be careful doing this when working with large files - one wrong move, and you might fry your hard drive by accidentally telling git to track a huge dataset!  \nWe wanted to retain the convenience of just typing one command that means \"just track all changes, I'll do a `git commit` in one second\", while being smart enough to avoid the pitfalls of large data files.  \n`fds add` does exactly that, while interactively asking the user how to handle files. You can add to DVC, or git, recursively step into large folders, skip or ignore files, etc.\n\n![image](https://user-images.githubusercontent.com/611655/121861680-aeba3680-cd02-11eb-866e-d6a752fdc920.png)\n\nHere's the file tree of the repo I used above, with file sizes included. Note how `bigfile.txt` and `data/` were automatically added to DVC and not git:\n\n![image](https://user-images.githubusercontent.com/611655/121862659-b201f200-cd03-11eb-9710-8ce1a603d953.png)\n\n### `fds commit` = `dvc commit` + `git commit`\n\nFinally, to close the loop of a real workflow, what happens when I change existing DVC tracked files? Without FDS, you'd have to remember to separately run `dvc repro` or `dvc commit`, then `git add tracked_file.dvc`, and only then `git commit`.  \n`fds commit` does all that for you - commits changes to DVC first, then adds the `.dvc` files with the updated hashes to git, then immediately commits these changes (plus any other staged changes) to a new git commit. Voila!\n\n![image](https://user-images.githubusercontent.com/611655/121862710-c219d180-cd03-11eb-8ad1-b672b4817aee.png)\n\n### Important note on using FDS inside Jupyter notebooks and Google Colab\n\nFDS is designed for interactive use via prompts that require user input.  
\nWhile this is possible to do inside Jupyter notebooks, **it won't work with the `%%bash` magic**.  \nYou have to use `!fds` since `%%bash` prevents user input.  \n[See Colab example here](https://colab.research.google.com/drive/1Jiu2aaLGFbxHEFG4tvAN3qznTqT6n7QI?usp=sharing)\n\n## Contributing\n\nWe would love for you to try out FDS yourself, and to give us feedback. It would really help us to prioritize future features, so please [vote on or create issues](https://github.com/dagshub/fds/issues)!  \nIf you'd like to take a more active part, we have some [good first issues](https://github.com/DAGsHub/fds/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22) that you can start with. We'll be happy to provide guidance on the best way to do so.\n\nAnd of course, we're always happy to have you on the DagsHub discord, where you can ask questions or give feedback on FDS:\n[![Discord](https://img.shields.io/discord/698874030052212737)](https://discord.com/invite/9gU36Y6)\n\n----\n\u003cdiv style=\"\n    display: flex;\n    align-items: center;\n\"\u003e\n  \u003cspan\u003eMade with ❤️ \u0026nbsp; by \u003c/span\u003e \u003ca href=\"https://dagshub.com\"\u003e\u003cimg src=\"https://raw.githubusercontent.com/DAGsHub/client/master/dagshub_github.png\" width=300 alt=\"dagshub-logo\"/\u003e\u003c/a\u003e\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdagshub%2Ffds","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdagshub%2Ffds","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdagshub%2Ffds/lists"}