{"id":13693649,"url":"https://github.com/go-shiori/obelisk","last_synced_at":"2025-05-16T03:05:22.612Z","repository":{"id":38314967,"uuid":"250925391","full_name":"go-shiori/obelisk","owner":"go-shiori","description":"Go package and CLI tool for saving web page as single HTML file","archived":false,"fork":false,"pushed_at":"2025-01-01T07:55:12.000Z","size":323,"stargazers_count":278,"open_issues_count":9,"forks_count":23,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-04-08T13:13:12.066Z","etag":null,"topics":["archive","cli","go","golang","hacktoberfest"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/go-shiori.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-03-29T00:53:06.000Z","updated_at":"2025-04-07T22:06:02.000Z","dependencies_parsed_at":"2024-01-09T18:04:44.186Z","dependency_job_id":"dcf8f986-3c5a-47f2-a431-361adb114a6f","html_url":"https://github.com/go-shiori/obelisk","commit_stats":{"total_commits":83,"total_committers":7,"mean_commits":"11.857142857142858","dds":"0.45783132530120485","last_synced_commit":"61fdf00f94d39851328b07e231ee0d23ed23f07d"},"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/go-shiori%2Fobelisk","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/go-shiori%2Fobelisk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/go-shiori%2Fobelisk/releases","manifests_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/go-shiori%2Fobelisk/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/go-shiori","download_url":"https://codeload.github.com/go-shiori/obelisk/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254459088,"owners_count":22074605,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["archive","cli","go","golang","hacktoberfest"],"created_at":"2024-08-02T17:01:14.692Z","updated_at":"2025-05-16T03:05:17.603Z","avatar_url":"https://github.com/go-shiori.png","language":"Go","funding_links":["https://www.paypal.me/RadhiFadlillah","https://ko-fi.com/radhifadlillah"],"categories":["Tools \u0026 Software","开源类库","Open source library","cli"],"sub_categories":["Acquisition","文本处理","Word Processing"],"readme":"\u003cp align=\"center\"\u003e\n\t\u003cimg src=\"https://raw.githubusercontent.com/go-shiori/obelisk/master/docs/readme/logo.png\" alt=\"Obelisk\" width=\"450\"\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003eGo packages and CLI tool for saving web page as single HTML file\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n\t\u003ca href=\"https://choosealicense.com/licenses/mit\"\u003e\u003cimg src=\"https://img.shields.io/static/v1?label=license\u0026message=MIT\u0026color=5fa6b0\"\u003e\u003c/a\u003e\n\t\u003ca href=\"https://goreportcard.com/report/github.com/go-shiori/obelisk\"\u003e\u003cimg src=\"https://goreportcard.com/badge/github.com/go-shiori/obelisk\"\u003e\u003c/a\u003e\n\t\u003ca 
href=\"https://godoc.org/github.com/go-shiori/obelisk\"\u003e\u003cimg src=\"https://img.shields.io/static/v1?label=godoc\u0026message=reference\u0026color=5272B4\u0026logo=go\"\u003e\u003c/a\u003e\n\t\u003ca href=\"https://www.paypal.me/RadhiFadlillah\"\u003e\u003cimg src=\"https://img.shields.io/static/v1?label=donate\u0026message=PayPal\u0026color=00457C\u0026logo=paypal\"\u003e\u003c/a\u003e\n\t\u003ca href=\"https://ko-fi.com/radhifadlillah\"\u003e\u003cimg src=\"https://img.shields.io/static/v1?label=donate\u0026message=Ko-fi\u0026color=F16061\u0026logo=ko-fi\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n---\n\nObelisk is a Go package and CLI tool for saving a web page as a single HTML file, with all of its assets embedded. It's inspired by the great [Monolith](https://github.com/Y2Z/monolith) and is intended as an improvement on my old [WARC](https://github.com/go-shiori/warc) package.\n\n## Features\n\n- Embeds all resources (e.g. CSS, images, JavaScript), producing a single HTML5 document that is easy to store and share.\n- If the submitted URL is not HTML (for example, a PDF), Obelisk still saves it as is.\n- Assets are downloaded concurrently, which makes archiving a web page quite fast.\n- Accepts cookies, which is useful for pages that require a login or articles behind a paywall.\n\n## As Go package\n\nRun the following command inside your Go project:\n\n```shell\ngo get -u -v github.com/go-shiori/obelisk\n```\n\nNext, import Obelisk in your application:\n\n```go\nimport \"github.com/go-shiori/obelisk\"\n```\n\nNow you can use Obelisk's archival feature in your application. For basic usage, see the [example](https://github.com/go-shiori/obelisk/blob/master/examples/basic.go).\n\n## As CLI application\n\nYou can download the latest version of Obelisk from the [release page](https://github.com/go-shiori/obelisk/releases).
To build from source, make sure you use `go \u003e= 1.13`, then run the following command:\n\n```shell\ngo get -u -v github.com/go-shiori/obelisk/cmd/obelisk\n```\n\nNow you can use it from your terminal:\n\n```shell\n$ obelisk -h\n\nCLI tool for saving web page as single HTML file\n\nUsage:\n  obelisk [url1] [url2] ... [urlN] [flags]\n\nFlags:\n  -z, --gzip                          gzip archival result\n  -h, --help                          help for obelisk\n  -i, --input string                  path to file which contains URLs\n      --insecure                      skip X.509 (TLS) certificate verification\n  -c, --load-cookies string           path to Netscape cookie file\n      --max-concurrent-download int   max concurrent download at a time (default 10)\n      --no-css                        disable CSS styling\n      --no-embeds                     remove embedded elements (e.g iframe)\n      --no-js                         disable JavaScript\n      --no-medias                     remove media elements (e.g img, audio)\n  -o, --output string                 path to save archival result\n  -q, --quiet                         disable logging\n      --skip-resource-url-error       skip process resource url error\n  -t, --timeout int                   maximum time (in second) before request timeout (default 60)\n  -u, --user-agent string             set custom user agent\n      --verbose                       more verbose logging\n```\n\nSome CLI behaviors need further explanation:\n\n- The `--input` flag accepts a text file containing a list of URLs, like this:\n\n    ```plain\n\thttp://www.domain1.com/some/path\n\thttp://www.domain2.com/some/path\n\thttp://www.domain3.com/some/path\n\t```\n\n- The `--load-cookies` flag accepts a Netscape cookie file, which usually looks like this:\n\n    ```plain\n\t# Netscape HTTP Cookie File\n\t# https://curl.haxx.se/rfc/cookie_spec.html\n\t# This is a generated file!
Do not edit.\n\t\n\t#HttpOnly_.google.com\tTRUE\t/\tFALSE\t1631153524\tKEY\tVALUE\n\t#HttpOnly_.google.com\tTRUE\t/ads\tTRUE\t1621062000\tKEY\tVALUE\n\t.developers.google.com\tTRUE\t/\tFALSE\t1642167486\tKEY\tVALUE\n\t```\n\n- If the `--output` flag is not specified, Obelisk generates a file name for the archive and saves it in the current working directory.\n- If the `--output` flag is set to `-` and there is only one URL to process (from either the input file or the CLI arguments), the archive is written to `stdout`.\n- If the `--output` flag is specified but there is more than one URL to process, Obelisk generates a file name for each archive but keeps using the directory from the specified output path.\n- If the `--output` flag is set to an existing directory, Obelisk also generates a file name for the archive.\n\n## F.A.Q\n\n**Why is it named Obelisk?**\n\nIt's inspired by Monolith, therefore it's Obelisk.\n\n**How does it compare to WARC?**\n\nMy WARC package stores archival results in a `bolt` database, which makes them hard to share and view. I also think my code in WARC is not easy to understand, so I often get confused when I try to add a feature or refactor it.\n\n**How does it compare to Monolith?**\n\n- Both embed all resources in the HTML file, mostly as base64 data URLs. The difference is that Obelisk uses inline `\u003cscript\u003e` and `\u003cstyle\u003e` elements for external JavaScript and CSS files. This is done because on many pages the browser struggles to load JavaScript that is encoded as a data URL. Inlining scripts and styles also makes the archival result smaller, since they are not base64-encoded.\n- In Obelisk, all requests to external URLs are disabled by default using Content Security Policy, while in Monolith this must be specified manually. This is done because, in my opinion, an archive shouldn't need, and shouldn't be able, to send requests to external resources.\n- In Obelisk, assets are downloaded concurrently.
Thanks to this, Obelisk is usually faster than Monolith when archiving a web page.\n\n**Why not just contribute to Monolith?**\n\n- I don't have any knowledge of Rust. I do want to learn it though.\n- I plan to update [Shiori](https://github.com/go-shiori/shiori), so I need a Go package for archiving web pages.\n\n## Attributions\n\nThe original logo was created by [Freepik](https://www.flaticon.com/authors/freepik) in their [egypt](https://www.flaticon.com/packs/egypt-23) and [desert](https://www.flaticon.com/packs/desert-7) packs, which can be downloaded from [www.flaticon.com](https://www.flaticon.com/).\n\n## License\n\nObelisk is distributed under the [MIT license](https://choosealicense.com/licenses/mit/), which means you can use and modify it however you want. However, if you make an enhancement to it, please send a pull request if possible. If you like this project, please consider donating to me via either [PayPal](https://www.paypal.me/RadhiFadlillah) or [Ko-Fi](https://ko-fi.com/radhifadlillah).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgo-shiori%2Fobelisk","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgo-shiori%2Fobelisk","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgo-shiori%2Fobelisk/lists"}