https://github.com/tomashubelbauer/github-tarball
Example of extracting a GitHub tarball with Node
- Host: GitHub
- URL: https://github.com/tomashubelbauer/github-tarball
- Owner: TomasHubelbauer
- Created: 2020-08-04T19:07:29.000Z (almost 6 years ago)
- Default Branch: main
- Last Pushed: 2022-04-14T20:23:28.000Z (about 4 years ago)
- Last Synced: 2025-06-01T16:41:42.162Z (about 1 year ago)
- Topics: github, tar, tarball
- Language: JavaScript
- Homepage:
- Size: 1.95 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
README
# GitHub Tarball
This is example code for downloading a GitHub repository's `.tar.gz` archive
from the GitHub API (which provides ZIP and TAR archive downloads), decompressing
it using Node's native `zlib` module, and parsing the entries out of the
resulting stream.
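The download and decompression steps might look roughly like the sketch below. It is not the repository's actual code: the function names `tarballUrl` and `fetchTarball`, the redirect limit, and the `User-Agent` value are assumptions made for illustration. It relies on the GitHub REST API tarball endpoint, which answers with a redirect to the actual archive, and it uses only Node's built-in `https` and `zlib` modules.

```javascript
const https = require('https');
const zlib = require('zlib');

// Build the GitHub API tarball URL; `ref` (branch, tag, or commit) is optional.
function tarballUrl(userName, repositoryName, ref = '') {
  const base = `https://api.github.com/repos/${userName}/${repositoryName}/tarball`;
  return ref ? `${base}/${ref}` : base;
}

// Request the archive, follow redirects, and resolve with a stream of TAR bytes.
// `redirectsLeft` is a hypothetical safety limit, not something GitHub requires.
function fetchTarball(url, redirectsLeft = 5) {
  return new Promise((resolve, reject) => {
    // The GitHub API rejects requests that lack a User-Agent header.
    const options = { headers: { 'User-Agent': 'github-tarball-example' } };
    https.get(url, options, response => {
      const { statusCode, headers } = response;
      if (statusCode >= 300 && statusCode < 400 && headers.location) {
        response.resume(); // discard the body of the redirect response
        if (redirectsLeft === 0) return reject(new Error('Too many redirects'));
        return resolve(fetchTarball(headers.location, redirectsLeft - 1));
      }
      if (statusCode !== 200) {
        response.resume();
        return reject(new Error(`Unexpected status ${statusCode}`));
      }
      // Pipe the gzipped response through zlib to get the plain TAR stream.
      resolve(response.pipe(zlib.createGunzip()));
    }).on('error', reject);
  });
}
```

Calling `fetchTarball(tarballUrl(userName, repositoryName))` resolves with a readable stream whose chunks can be fed to a TAR parser.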
## Configuring
Replace `userName` and `repositoryName` in the code prior to running it if you
wish to try this on your own repository.
## Running
`node .`
## Testing
No tests yet.
## Purpose
This repository exists as a mere proof of concept. I wanted to see if I could get
from pure Node (no NPM or ESM dependencies) to a fully cloned GitHub repository.
There are Git clients built in pure JavaScript (check out Isomorphic-Git), but I
wanted to see if it would be possible without using any dependencies at all.
## To-Do
### Decide how to handle files which do not contain their relative directory
Most files are okay and come out with paths like `repo/file.ext`. However,
some files, probably those whose full relative path would exceed the 100 bytes
available for the file name in TAR, have only their file name and no path stored
in the archive. I think the safe and probably expected thing to do here is to
take the last known path entry and prepend its directory to the file name. A
reliable way to detect these files is probably to check whether their file name
lacks a slash.
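One possible explanation, not confirmed by this repository, is the ustar `prefix` field. In the ustar variant of TAR, a path that does not fit in the 100-byte `name` field (offset 0) is split, and its leading directories are stored in a separate 155-byte `prefix` field at offset 345. The sketch below joins the two fields. The function names `readField`, `readPath`, and `resolvePath` are hypothetical, and `resolvePath` implements the fallback proposed above only for cases where no prefix is available.

```javascript
// Read a NUL-terminated string field from a 512-byte TAR header block.
function readField(header, offset, length) {
  const field = header.subarray(offset, offset + length);
  const end = field.indexOf(0);
  return field.toString('utf8', 0, end === -1 ? length : end);
}

// Join the ustar `prefix` and `name` fields into the full entry path.
function readPath(header) {
  const name = readField(header, 0, 100);
  // Only POSIX ustar headers (magic "ustar" followed by NUL) define `prefix`.
  const isUstar = readField(header, 257, 6) === 'ustar';
  const prefix = isUstar ? readField(header, 345, 155) : '';
  return prefix ? `${prefix}/${name}` : name;
}

// Fallback from the To-Do item: if the path has no slash, prepend the
// directory of the last entry that did have a full path.
function resolvePath(path, lastDirectory) {
  return path.includes('/') ? path : `${lastDirectory}/${path}`;
}
```

If the affected entries turn out to carry their directories in `prefix`, reading that field removes the need to guess from previous entries.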
### Fix padding NUL byte detection at the end of the archive
Right now, after each entry, I check whether the remaining bytes are all NUL. The
check isn't particularly efficient, and even if it were, scanning the rest of the
archive after every entry is just plain wasteful. I don't have a better way to
detect and throw away the end-of-file padding, so this works for now, but it
should be improved.
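A possible improvement relies on how TAR marks the end of an archive: with consecutive 512-byte blocks filled entirely with NUL bytes. Each header stores its entry's size, so the parser always knows where the next header starts and only needs to test that one block. The sketch below assumes the whole decompressed archive is held in a single `Buffer`, which may differ from how this repository streams its data. The names `isZeroBlock` and `listEntries` are hypothetical.

```javascript
const BLOCK_SIZE = 512;

// A block is end-of-archive padding when every byte in it is NUL.
function isZeroBlock(archive, offset) {
  for (let index = offset; index < offset + BLOCK_SIZE; index++) {
    if (archive[index] !== 0) return false;
  }
  return true;
}

// List entry names and sizes from a fully buffered, already gunzipped archive.
function listEntries(archive) {
  const entries = [];
  let offset = 0;
  while (offset + BLOCK_SIZE <= archive.length) {
    // An all-NUL header block marks the end; stop without scanning the rest.
    if (isZeroBlock(archive, offset)) break;
    const nameField = archive.subarray(offset, offset + 100);
    const nameEnd = nameField.indexOf(0);
    const name = nameField.toString('utf8', 0, nameEnd === -1 ? 100 : nameEnd);
    // The size field is 12 bytes of octal ASCII digits at offset 124.
    const size = parseInt(archive.toString('ascii', offset + 124, offset + 136), 8) || 0;
    entries.push({ name, size });
    // Skip the header plus the content rounded up to whole blocks.
    offset += BLOCK_SIZE + Math.ceil(size / BLOCK_SIZE) * BLOCK_SIZE;
  }
  return entries;
}
```

With this approach the zero check touches at most one 512-byte block per entry instead of the whole remainder of the archive.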