{"id":15292577,"url":"https://github.com/noahgift/devml","last_synced_at":"2026-03-01T22:38:15.826Z","repository":{"id":62567884,"uuid":"106223403","full_name":"noahgift/devml","owner":"noahgift","description":"Product of Pragmatic AI Labs: Machine Learning, Statistics and Utilities around Developer Productivity, Company Productivity and Project Productivity","archived":false,"fork":false,"pushed_at":"2023-06-15T18:08:14.000Z","size":7814,"stargazers_count":28,"open_issues_count":19,"forks_count":23,"subscribers_count":4,"default_branch":"master","last_synced_at":"2026-01-09T10:50:07.133Z","etag":null,"topics":["ai","churn-statistics","data-science","defects","git","github","jupyter-notebook","machine-intelligence","machine-learning","pandas","productivity","python","seaborn","visualization"],"latest_commit_sha":null,"homepage":"https://paiml.com/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/noahgift.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2017-10-09T01:26:46.000Z","updated_at":"2024-10-12T06:45:49.000Z","dependencies_parsed_at":"2025-04-13T10:11:33.825Z","dependency_job_id":"ff87d623-4433-4837-8eab-3fae76e7c702","html_url":"https://github.com/noahgift/devml","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/noahgift/devml","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noahgift%2Fdevml","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposito
ries/noahgift%2Fdevml/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noahgift%2Fdevml/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noahgift%2Fdevml/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/noahgift","download_url":"https://codeload.github.com/noahgift/devml/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noahgift%2Fdevml/sbom","scorecard":{"id":691361,"data":{"date":"2025-08-11","repo":{"name":"github.com/noahgift/devml","commit":"079aefd83b63c7c387846a8e1a8f8ecb16cc9a6c"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":1.9,"checks":[{"name":"Code-Review","score":0,"reason":"Found 2/27 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the 
project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file 
detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Vulnerabilities","score":1,"reason":"9 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: PYSEC-2024-4 / GHSA-2mqj-m65w-jghx","Warn: Project is vulnerable to: PYSEC-2023-165 / GHSA-cwvm-v4w8-q58c","Warn: Project is vulnerable to: PYSEC-2022-42992 / GHSA-hcpj-qp55-gfph","Warn: Project is vulnerable to: PYSEC-2023-137 / GHSA-pr76-5cm5-w9cj","Warn: Project is vulnerable to: PYSEC-2023-161 / GHSA-wfm5-v35h-vwf4","Warn: Project is vulnerable to: PYSEC-2020-73","Warn: Project is vulnerable to: PYSEC-2019-124 / GHSA-38fc-9xqv-7f7q","Warn: Project is vulnerable to: PYSEC-2019-123 / 
GHSA-887w-45rq-vxgf","Warn: Project is vulnerable to: PYSEC-2012-9 / GHSA-hfg2-wf6j-x53p"],"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 6 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-22T02:21:54.691Z","repository_id":62567884,"created_at":"2025-08-22T02:21:54.691Z","updated_at":"2025-08-22T02:21:54.691Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29987409,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-01T21:06:37.093Z","status":"ssl_error","status_checked_at":"2026-03-01T21:05:45.052Z","response_time":124,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","churn-statistics","data-science","defects","git","github","jupyter-notebook","machine-intelligence","machine-learning","pandas","productivity","python","seaborn","visualization"],"created_at":"2024-09-30T16:18:54.657Z","updated_at":"2026-03-01T22:38:15.795Z","avatar_url":"https://github.com/noahgift.png","language":"Jupyter Notebook","readme":"[![Codacy Badge](https://api.codacy.com/project/badge/Grade/3e382eedf6424c1282aab4dd13e54c26)](https://www.codacy.com/app/noahgift/devml?utm_source=github.com\u0026utm_medium=referral\u0026utm_content=noahgift/devml\u0026utm_campaign=badger)\n[![CircleCI](https://circleci.com/gh/noahgift/devml.svg?style=svg)](https://circleci.com/gh/noahgift/devml)\n\n# devml\nMachine Learning, Statistics and Utilities around Developer Productivity\n\nThis is an open source project sponsored by [Pragmatic AI Labs](http://paiml.com).\n\nKey functions:\n* Can checkout all repositories in Github\n* Converts a tree of checked out repositories on disk into a pandas dataframe\n* Statistics on combined DataFrames\n\n\n## Pragmatic AI Labs\n![alt text](https://paiml.com/images/logo_with_slogan_white_background.png)\n\nYou can continue learning about these topics by:\n\n*   Buying a copy of [Pragmatic AI: An Introduction to Cloud-Based Machine Learning](https://amzn.to/2LFLVEg)\n*   Viewing more content at [noahgift.com](https://noahgift.com/)\n*   Viewing more content at [Pragmatic AI Labs](https://paiml.com/)\n\n## Pragmatic AI Book\n\nThis material is covered in 
[Chapter 8 of Pragmatic AI Book](https://amzn.to/2LFLVEg)\n\n## Related News about Workplace Analytics\n\nThis project is in the field of workplace analytics and is talked about in the Harvard Business Review.\n[People and Workplace Analytics](https://hbr.org/2018/05/how-people-analytics-can-help-you-change-process-culture-and-strategy)\n\n## Related IBM Developerworks Articles\n* Article about project on IBM Developerworks Part 1:  https://www.ibm.com/developerworks/library/ba-github-analytics-1/index.html\n\n* Article about project on IBM Developerworks Part 2:  https://www.ibm.com/developerworks/opensource/library/ba-github-analytics-2/index.html\n\n## Installation\n\n```\npip install devml\n```\n\nThis pip install installs a command-line tool:  dml (which is referenced in the documentation below). And also library devml, which is referenced below as well.\n\n## Get environment setup\n\nCode is written to support Python 3.6 or greater.  You can get that here:  https://www.python.org/downloads/release/python-360/.\n\nAn easy way to run the project locally is to check the repo out and in the root of the repo run:\n\n```\nmake setup\n```\n\n\nThen create a virtualenv in  ~/.devml:\n\n```\n$ python3 -m venv ~/.devml\n```\n\n### Next, source that virtualenv:\n\n```\nsource ~/.devml/bin/activate\n```\n\n#### Run Make All (installs, lints and tests)\n```\nmake all\n\n# #Example output\n#(.devml) ➜  devml git:(master) make all\n#pip install -r requirements.txt\n#Requirement already satisfied: pytest in /Users/noahgift/.devml/lib/python3.6/site-packages (from -r requirements.txt (line #1)\n---------- coverage: platform darwin, python 3.6.2-final-0 -----------\nName                       Stmts   Miss  Cover\n----------------------------------------------\ndevml/__init__.py              1      0   100%\ndevml/author_stats.py          6      6     0%\ndevml/fetch_repo.py           54     42    22%\ndevml/mkdata.py               84     21    75%\ndevml/org_stats.py           
 76     55    28%\ndevml/post_processing.py      50     35    30%\ndevml/state.py                29      9    69%\ndevml/stats.py                55     43    22%\ndevml/ts.py                   29     14    52%\ndevml/util.py                 12      4    67%\ndml.py                       111     66    41%\n----------------------------------------------\nTOTAL                        507    295    42%\n...\n```\n\nIf you don't use virtualenv or don't want to use it, no problem, just run `make all` it should probably work if you have python 3.6 installed:\n\n```\nmake all\n```\n\n## Explore Jupyter Notebooks on Github Organizations\n\nYou can explore combined datasets here using this example as a starter:\n\nhttps://github.com/noahgift/devml/blob/master/notebooks/github_data_exploration.ipynb\n\n![Pallets Project](https://user-images.githubusercontent.com/58792/31581904-66ee7fc0-b12a-11e7-804a-7b0f1728f30a.png)\n\n## Explore Jupyter Notebooks on Repository Churn\n\nYou can explore File Metadata exploration example here:\n\nhttps://github.com/noahgift/devml/blob/master/notebooks/repo_file_exploration.ipynb\n\n#### All Files Churned by type:\n![Pallets Project Relative Churn by file type](https://user-images.githubusercontent.com/58792/31587879-59d9724e-b19e-11e7-942e-999c02d7b566.png)\n\n#### Summary Churn Statistics by type:\n\n![Pallets Project by file type Churn statistics](https://user-images.githubusercontent.com/58792/31587931-5d79199e-b19f-11e7-89c2-98185fdef909.png)\n\n## Expected Configuration\n\nThe command-line tools expects for you to create a project directory with a config.json file.\nInside the config.json file, you will need to provide an oath token.  
You can find information about how to do that here:  https://help.github.com/articles/creating-a-personal-access-token-for-the-command-line/.\n\nAlternately, you can pass these values in via the python API or via the command-line as options.\nThey stand for the following:\n\n* org:  Github Organization (To clone entire tree of repos)\n* checkout_dir:  place to checkout\n* oath:  personal oath token generated from Github\n\n```\n➜  devml git:(master) ✗ cat project/config.json\n{\n    \"project\" :\n        {\n            \"org\":\"pallets\",\n            \"checkout_dir\": \"/tmp/checkout\",\n            \"oath\": \"\u003ckeygenerated from Github\u003e\"\n        }\n\n}\n```\n\nNote: The jupyter notebooks use git@github.com to access GitHub repos. This is using SSH as the protocol, and is expecting an SSH key to be created, and added to your GitHub repo. See [Generating an SSH Key](https://help.github.com/en/articles/generating-an-ssh-key) for instructions.\n\n## Basic command-line Usage\n\nYou can find out stats for a checkout or a directory full of checkout as follows\n\n```text\n\ndml gstats author --path ~/src/mycompanyrepo(s)\nTop Commits By Author:                     author_name  commits\n0                     John Smith     3059\n1                      Sally Joe     2995\n2                   Greg Mathews     2194\n3                 Jim Mayflower      1448\n```\n\n## Basic API Usage (Converting a tree of repo(s) into a pandas DataFrame)\n\n```\nIn [1]: from devml import (mkdata, stats)\n\nIn [2]: org_df = mkdata.create_org_df(path=\"/src/mycompanyrepo(s)\")\nIn [3]: author_counts = stats.author_commit_count(org_df)\n\nIn [4]: author_counts.head()\nOut[4]:\n      author_name  commits\n0       John Smith     3059\n1        Sally Joe     2995\n2     Greg Mathews     2194\n3    Jim Mayflower     1448\n4   Truck Pritter      1441\n\n```\n\nTo analyze information in an IBM Engineering Workflow Manager (formerly Rational Team Concert or RTC) project area, use for 
example:\n\n```\nIn [1]: from devml import (mkdata, stats)\n\nIn [2]: projectarea_df = mkdata.create_projectarea_df(ccmServer=\"https://server:9443/ccm\", projectArea=\"Your Project Area\", userId=\"yourId\", password=\"yourPassword\")\nIn [3]: author_counts = stats.author_commit_count(projectarea_df)\n\nIn [4]: author_counts.head()\nOut[4]:\n                   author_name  commits\n0                Carol Newbold       35\n1  Jose de Jesus Herrera Ledon       11\n2               ATTAULLAH SYED        4\n3                 Patricia Der        2\n4                 Eric Solomon        1\n\n```\n## Clone all repos in Github using API\n\n```\nIn [1]: from devml import (mkdata, stats, state, fetch_repo)\n\nIn [2]: dest, token, org = state.get_project_metadata(\"../project/config.json\")\nIn [3]: fetch_repo.clone_org_repos(token, org,\n        dest, branch=\"master\")\n017-10-14 17:11:36,590 - devml - INFO - Creating Checkout Root:  /tmp/checkout\n2017-10-14 17:11:37,346 - devml - INFO - Found Repo # 1 REPO NAME: flask , URL: git@github.com:pallets/flask.git\n2017-10-14 17:11:37,347 - devml - INFO - Found Repo # 2 REPO NAME: pallets-sphinx-themes , URL: git@github.com:pallets/pallets-sphinx-themes.git\n2017-10-14 17:11:37,347 - devml - INFO - Found Repo # 3 REPO NAME: markupsafe , URL: git@github.com:pallets/markupsafe.git\n2017-10-14 17:11:37,348 - devml - INFO - Found Repo # 4 REPO NAME: jinja , URL: git@github.com:pallets/jinja.git\n2017-10-14 17:11:37,349 - devml - INFO - Found Repo # 5 REPO NAME: werkzeug , URL: git@githu\nIn [4]: !ls -l /tmp/checkout\ntotal 0\ndrwxr-xr-x  21 noahgift  wheel  672 Oct 14 17:11 click\ndrwxr-xr-x  25 noahgift  wheel  800 Oct 14 17:11 flask\ndrwxr-xr-x  11 noahgift  wheel  352 Oct 14 17:11 flask-docs\ndrwxr-xr-x  12 noahgift  wheel  384 Oct 14 17:11 flask-ext-migrate\ndrwxr-xr-x   8 noahgift  wheel  256 Oct 14 17:11 flask-snippets\ndrwxr-xr-x  14 noahgift  wheel  448 Oct 14 17:11 flask-website\ndrwxr-xr-x  18 noahgift  wheel  576 Oct 14 
17:11 itsdangerous\ndrwxr-xr-x  23 noahgift  wheel  736 Oct 14 17:11 jinja\ndrwxr-xr-x  18 noahgift  wheel  576 Oct 14 17:11 markupsafe\ndrwxr-xr-x   4 noahgift  wheel  128 Oct 14 17:11 meta\ndrwxr-xr-x  10 noahgift  wheel  320 Oct 14 17:11 pallets-sphinx-themes\ndrwxr-xr-x   9 noahgift  wheel  288 Oct 14 17:11 pocoo-sphinx-themes\ndrwxr-xr-x  15 noahgift  wheel  480 Oct 14 17:11 website\ndrwxr-xr-x  25 noahgift  wheel  800 Oct 14 17:11 werkzeug\n```\n\n## Advanced CLI-Author:  Get Activity Statistics for a Tree of Checkouts or a Checkout and sort\n\n```\n ➜  devml git:(master) ✗ dml gstats activity --path /tmp/checkout --sort active_days\n\nTop Unique Active Days:               author_name  active_days active_duration  active_ratio\n86         Armin Ronacher          989       3817 days      0.260000\n501  Markus Unterwaditzer          342       1820 days      0.190000\n216            David Lord          129        712 days      0.180000\n664           Ron DuPlain           78        854 days      0.090000\n444         Kenneth Reitz           68       2566 days      0.030000\n197      Daniel Neuhäuser           42       1457 days      0.030000\n297          Georg Brandl           41       1337 days      0.030000\n196     Daniel Neuhäuser           36        435 days      0.080000\n450      Keyan Pishdadian           28        885 days      0.030000\n169     Christopher Grebs           28       1515 days      0.020000\n666    Ronny Pfannschmidt           27       3060 days      0.010000\n712           Simon Sapin           22        793 days      0.030000\n372           Jeff Widman           19        840 days      0.020000\n427    Julen Ruiz Aizpuru           16         36 days      0.440000\n21                 Adrian           16       1935 days      0.010000\n569        Nicholas Wiles           14        197 days      0.070000\n912                lord63           14        692 days      0.020000\n756           ThiefMaster           12       1287 days      
0.010000\n763       Thomas Waldmann           11       1560 days      0.010000\n628            Priit Laes           10       1567 days      0.010000\n23        Adrian Moennich           10        521 days      0.020000\n391  Jochen Kupperschmidt           10       3060 days      0.000000\n```\n\n## Advanced CLI-Churn:  Get churn by file type\n\n#### Get the top ten files sorted by churn count with the extension .py:\n\n```\n✗ dml gstats churn --path /Users/noahgift/src/flask --limit 10 --ext .py\n2017-10-15 12:10:55,783 - devml.post_processing - INFO - Running churn cmd: [git log --name-only --pretty=format:] at path [/Users/noahgift/src/flask]\n                       files  churn_count  line_count extension  \\\n1            b'flask/app.py'          316      2183.0       .py\n3        b'flask/helpers.py'          176      1019.0       .py\n5    b'tests/flask_tests.py'          127         NaN       .py\n7                b'flask.py'          104         NaN       .py\n8                b'setup.py'           80       112.0       .py\n10           b'flask/cli.py'           75       759.0       .py\n11      b'flask/wrappers.py'           70       194.0       .py\n12      b'flask/__init__.py'           65        49.0       .py\n13           b'flask/ctx.py'           62       415.0       .py\n14  b'tests/test_helpers.py'           62       888.0       .py\n\n    relative_churn\n1             0.14\n3             0.17\n5              NaN\n7              NaN\n8             0.71\n10            0.10\n11            0.36\n12            1.33\n13            0.15\n14            0.07\n```\n#### Get descriptive statistics for extension .py and compare to another repository\n\nIn this example, flask, this repo and cpython are all compared to see how the median churn is.\n\n```\n(.devml) ➜  devml git:(master) dml gstats metachurn --path /Users/noahgift/src/flask --ext .py --statistic median\n2017-10-15 12:39:44,781 - devml.post_processing - INFO - Running churn cmd: [git log 
--name-only --pretty=format:] at path [/Users/noahgift/src/flask]\nMEDIAN Statistics:\n\n           churn_count  line_count  relative_churn\nextension\n.py                  2        85.0            0.13\n(.devml) ➜  devml git:(master) dml gstats metachurn --path /Users/noahgift/src/devml --ext .py --statistic median\n2017-10-15 12:40:10,999 - devml.post_processing - INFO - Running churn cmd: [git log --name-only --pretty=format:] at path [/Users/noahgift/src/devml]\nMEDIAN Statistics:\n\n           churn_count  line_count  relative_churn\nextension\n.py                  1        62.5            0.02\n\n(.devml) ➜  devml git:(master) dml gstats metachurn --path /Users/noahgift/src/cpython --ext .py --statistic median\n2017-10-15 12:42:19,260 - devml.post_processing - INFO - Running churn cmd: [git log --name-only --pretty=format:] at path [/Users/noahgift/src/cpython]\nMEDIAN Statistics:\n\n           churn_count  line_count  relative_churn\nextension\n.py                  7       169.5             0.1\n\n```\n\n#### Get Relative Churn for an Author\n\n\n\n```\n\ndml gstats authorchurnmeta --author \"Armin Ronacher\" --path /tmp/checkout/flask --ext .py\n\n#He has 6.5% median relative churn...very good.\n\ncount    193.000000\nmean       0.331860\nstd        0.625431\nmin        0.001000\n25%        0.030000\n50%        0.065000\n75%        0.250000\nmax        3.000000\nName: author_rel_churn, dtype: float64\n```\n\n#### Compare CPython Active Ratio with Linux Active Ratio\n\n```\n# Linux Development Active Ratio\ndml gstats activity --path /Users/noahgift/src/linux --sort active_days\n\n                       author_name  active_days active_duration  active_ratio\n14541                 Takashi Iwai         1677       4590 days      0.370000\n4382                  Eric Dumazet         1460       4504 days      0.320000\n3641               David S. 
Miller         1428       4513 days      0.320000\n7216                 Johannes Berg         1329       4328 days      0.310000\n8717                Linus Torvalds         1281       4565 days      0.280000\n275                        Al Viro         1249       4562 days      0.270000\n9915         Mauro Carvalho Chehab         1227       4464 days      0.270000\n9375                    Mark Brown         1198       4187 days      0.290000\n3172                 Dan Carpenter         1158       3972 days      0.290000\n12979                 Russell King         1141       4602 days      0.250000\n1683                      Axel Lin         1040       2720 days      0.380000\n400                   Alex Deucher         1036       3497 days      0.300000\n\n\n# CPython Development Active Ratio\n\n            author_name  active_days active_duration  active_ratio\n146    Guido van Rossum         2256       9673 days      0.230000\n301   Raymond Hettinger         1361       5635 days      0.240000\n128          Fred Drake         1239       5335 days      0.230000\n47    Benjamin Peterson         1234       3494 days      0.350000\n132        Georg Brandl         1080       4091 days      0.260000\n375      Victor Stinner          980       2818 days      0.350000\n235     Martin v. Löwis          958       5266 days      0.180000\n36       Antoine Pitrou          883       3376 days      0.260000\n362          Tim Peters          869       5060 days      0.170000\n164         Jack Jansen          800       4998 days      0.160000\n24   Andrew M. 
Kuchling          743       4632 days      0.160000\n330    Serhiy Storchaka          720       1759 days      0.410000\n44         Barry Warsaw          696       8485 days      0.080000\n52         Brett Cannon          681       5278 days      0.130000\n262        Neal Norwitz          559       2573 days      0.220000\n\nIn this analysis, Guido of Python has a 23% probability of working on a given day, and Linux has a 28% chance.\n\n```\n\n\n## Deletion Statistics\n\n#### Find all delete files from repository\n\n```\ndml gstats deleted --path /Users/noahgift/src/flask\n\nDELETION STATISTICS\n\n                                                 files          ext\n0                        b'tests/test_deprecations.py'          .py\n1                       b'scripts/flask-07-upgrade.py'          .py\n2                             b'flask/ext/__init__.py'          .py\n3                                  b'flask/exthook.py'          .py\n4                        b'scripts/flaskext_compat.py'          .py\n5                                 b'tests/test_ext.py'          .py\n\n```\n\n## FAQ\n\n#### What is Churn and Why Do I Care?\n\nCode churn is the amount of times a file has been modified.  Relative churn is the amount of times it has been modified relative to lines of code.  
Research into defects in software has shown that relative code churn is highly predictive of defects, i.e., the greater the relative churn, the higher the number of defects.\n\n\"Increase in relative code churn measures is\naccompanied by an increase in system defect\ndensity; \"\n\nYou can read the entire study here:  https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/icse05churn.pdf\n\n\n\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnoahgift%2Fdevml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnoahgift%2Fdevml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnoahgift%2Fdevml/lists"}