Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/karlicoss/ghexport
Export your Github activity: events, repositories, stars, etc.
https://github.com/karlicoss/ghexport
backup data-liberation export github github-api takeout
Last synced: 5 days ago
JSON representation
Export your Github activity: events, repositories, stars, etc.
- Host: GitHub
- URL: https://github.com/karlicoss/ghexport
- Owner: karlicoss
- License: mit
- Created: 2017-01-03T19:47:07.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2023-10-20T18:57:20.000Z (about 1 year ago)
- Last Synced: 2024-08-02T12:44:51.395Z (3 months ago)
- Topics: backup, data-liberation, export, github, github-api, takeout
- Language: Python
- Homepage:
- Size: 45.9 KB
- Stars: 45
- Watchers: 5
- Forks: 5
- Open Issues: 1
-
Metadata Files:
- Readme: README.org
- License: LICENSE
Awesome Lists containing this project
- project-awesome - karlicoss/ghexport - Export your Github activity: events, repositories, stars, etc. (Python)
- awesome-starred - karlicoss/ghexport - Export your Github activity: events, repositories, stars, etc. (github-api)
README
#+begin_src python :dir src :results drawer :exports results
import ghexport.export as E; return E.make_parser().prog
#+end_src#+RESULTS:
:results:
Export your Github personal data: issues, PRs, comments, followers and followings, etc.*Note*: this only deals with metadata. If you want a download of actual git repositories, I recommend using [[https://github.com/josegonzalez/python-github-backup][python-github-backup]].
:end:* Setting up
1. The easiest way is =pip3 install --user git+https://github.com/karlicoss/ghexport=.Alternatively, use =git clone --recursive=, or =git pull && git submodule update --init=. After that, you can use =pip3 install --editable=.
2. To use the API, you need to get a [[https://github.com/settings/tokens][personal access token]] from settings. Note that you need to use =repo= scope.
* Exporting#+begin_src python :dir src :results drawer :exports results
import ghexport.export as E; return E.make_parser().epilog
#+end_src#+RESULTS:
:results:Usage:
*Recommended*: create =secrets.py= keeping your api parameters, e.g.:
: token = "TOKEN"
After that, use:
: python3 -m ghexport.export --secrets /path/to/secrets.py
That way you type less and have control over where you keep your plaintext secrets.
*Alternatively*, you can pass parameters directly, e.g.
: python3 -m ghexport.export --token
However, this is verbose and prone to leaking your keys/tokens/passwords in shell history.
You can also import ~ghexport.export~ as a module and call ~get_json~ function directly to get raw JSON.
I *highly* recommend checking exported files at least once just to make sure they contain everything you expect from your export. If not, please feel free to ask or raise an issue!
:end:
** Extra export options
- you can control specific data you want to export via ~--include~ option (see ~--help~ for available fields)By default, all data will be included in the export.
- you can include or exclude [[https://docs.github.com/en/rest/metrics/traffic][repository traffic]] data via ~--include-repos-traffic~ or ~--exclude-repos-traffic~.
Currently it's included by default.
You might want to exclude it if you have some issues with traffic API endpoint (it tends to be flakier than other endpoints).
* API limitations
*WARNING*: github API limits extent to which you can retrieve certain data, e.g. [[https://developer.github.com/v3/activity/events][events]] you can only get events from the past 90 days, and not more than 300 events.
I *highly* recommend to export regularly and keep old exports. Easy way to achieve it is command like this:
: python3 -m ghexport.export --secrets /path/to/secrets.py >"export-$(date -I).json"
Or, you can use [[https://github.com/karlicoss/arctee][arctee]] that automates this.
To get your older data past 90 days, you can request a [[https://github.com/settings/admin][manual export]] in your account settings.
# TODO hmm, mention that dal.py can handle this?
* Known Issues
The =requests= (and therefore =PyGithub=) modules on which this depends seems to sometimes fail to login if a =~/.netrc= file is present, see [[https://github.com/psf/requests/issues/5801#issuecomment-901610012][here]] for context.
* Using the data
#+begin_src python :dir src :results drawer :exports results
import ghexport.exporthelpers.dal_helper as D; return D.make_parser().epilog
#+end_src#+RESULTS:
:results:You can use =ghexport.dal= (stands for "Data Access/Abstraction Layer") to access your exported data, even offline.
I elaborate on motivation behind it [[https://beepb00p.xyz/exports.html#dal][here]].- main usecase is to be imported as python module to allow for *programmatic access* to your data.
You can find some inspiration in [[https://beepb00p.xyz/mypkg.html][=my.=]] package that I'm using as an API to all my personal data.
- to test it against your export, simply run: ~python3 -m ghexport.dal --source /path/to/export~
- you can also try it interactively: ~python3 -m ghexport.dal --source /path/to/export --interactive~
:end:
Example output:
: Your events:
: Counter({'PushEvent': 181,
: 'WatchEvent': 27,
: 'CreateEvent': 22,
: 'IssueCommentEvent': 20,
: 'PullRequestEvent': 15,
: 'IssuesEvent': 5,
: 'DeleteEvent': 5,
: 'ForkEvent': 3,
: 'PullRequestReviewCommentEvent': 1})