{"id":17274005,"url":"https://github.com/owulveryck/repocketable","last_synced_at":"2025-03-22T18:34:34.575Z","repository":{"id":45636193,"uuid":"383369906","full_name":"owulveryck/rePocketable","owner":"owulveryck","description":"Tool to fetch articles from (getPocket|the web) and turn them into epub","archived":false,"fork":false,"pushed_at":"2022-03-30T06:36:52.000Z","size":336,"stargazers_count":53,"open_issues_count":3,"forks_count":4,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-05-01T15:55:01.603Z","etag":null,"topics":["epub","epub-generation","getpocket","hacktoberfest","hacktoberfest2021","readability"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/owulveryck.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null},"funding":{"github":["owulveryck"],"patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"custom":null}},"created_at":"2021-07-06T06:54:06.000Z","updated_at":"2024-03-22T14:08:21.000Z","dependencies_parsed_at":"2022-09-17T10:24:28.848Z","dependency_job_id":null,"html_url":"https://github.com/owulveryck/rePocketable","commit_stats":null,"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/owulveryck%2FrePocketable","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/owulveryck%2FrePocketable/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/owulveryck%2FrePocketable/releases","manifests_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/owulveryck%2FrePocketable/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/owulveryck","download_url":"https://codeload.github.com/owulveryck/rePocketable/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245003687,"owners_count":20545646,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["epub","epub-generation","getpocket","hacktoberfest","hacktoberfest2021","readability"],"created_at":"2024-10-15T08:52:51.199Z","updated_at":"2025-03-22T18:34:34.264Z","avatar_url":"https://github.com/owulveryck.png","language":"Go","readme":"# rePocketable\n\n[![GitHub go.mod Go version of a Go module](https://img.shields.io/github/go-mod/go-version/gomods/athens.svg)](https://github.com/gomods/athens)\n[![Linux](https://svgshare.com/i/Zhy.svg)](https://svgshare.com/i/Zhy.svg)\n[![macOS](https://svgshare.com/i/ZjP.svg)](https://svgshare.com/i/ZjP.svg)\n[![Windows](https://svgshare.com/i/ZhY.svg)](https://svgshare.com/i/ZhY.svg)\n[![Build](https://github.com/owulveryck/rePocketable/actions/workflows/go.yml/badge.svg)](https://github.com/owulveryck/rePocketable/actions/workflows/go.yml)\n\nThis tool and its webpage are under construction.\n\nBest possible option if you want to see what it will eventually do is to run a cli tool such as to epub:\n\n```shell\ngo run cmd/toEpub/*.go https://whateverpageyouwanttoread/\n```\n\nThis utility takes optional `-H` arguments to pass headers to the http downloader.\nThis option can be used several 
times to be compatible with the curl command.\n\nExample:\n\n```shell\n toEpub -H 'sec-ch-ua: \"Chromium\";v=\"94\", \"Google Chrome\";v=\"94\", \";Not A Brand\";v=\"99\"' \\\n  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36' \\\n  -H 'sec-ch-ua-platform: \"macOS\"'  https://thewebsite/thepage.html\n```\n\nI wrote an explanation of the concept in a [blog post](https://blog.owulveryck.info/2021/10/07/reading-from-the-web-offline-and-distraction-free.html).\n\n## Hacktoberfest\n\nThis is a toy project, but I rely on it more and more. I think that Hacktoberfest is a good opportunity to turn this project into a product.\nI will write a contributing guide soon; meanwhile, if you want to participate, the urgent matters are:\n\n- Writing a proper vision: discussing it in an issue and submitting a PR to mention it in the README\n- Writing autonomous end-to-end tests: grabbing a sample page, running an `httptest` server, running the toEpub code and analysing the result\n- Writing proper documentation\n- Adding a contribution guide\n- The sky is the limit: discuss in issues and submit a PR once the issues are discussed :D \n\n## Features\n\nThe internal libraries (used by the CLI) implement these features:\n\n- Webpage fetching and pre-processing\n  - preprocessing and sanitization of figures to fetch the correct image from responsive and/or javascript tags (Medium and Towards Data Science)\n  - experimental feature to turn LaTeX figures into pictures ([github.com/go-latex/latex](https://github.com/go-latex/latex))\n  - extraction of the content based on the Arc90 Readability project ([github.com/cixtor/readability](https://github.com/cixtor/readability))\n- Opengraph processing to extract meta information ([github.com/dyatlov/go-opengraph](https://github.com/dyatlov/go-opengraph))\n  - Generation of a cover picture with the front image of the website, the title and the author of the 
article\n  - Generation of a first chapter with metadata such as the publication date\n- epub generation ([github.com/bmaupin/go-epub](https://github.com/bmaupin/go-epub))\n- experimental getpocket integration\n  - reading the article list and generating epubs from the list\n  - a daemon mode that will eventually run on an e-reader device to sync the list (heavy WIP)\n\n## Configurations\n\nThese configuration settings may influence various internal libraries.\n\n| KEY                             | TYPE        | DEFAULT    | REQUIRED    | DESCRIPTION    |\n|---------------------------------|-------------|------------|-------------|----------------|\n| DOWNLOADER_LIVENESS_CHECK       | Duration    | 5m         | true        |                |\n| DOWNLOADER_PROBE_TIMEOUT        | Duration    | 60m        | true        |                |\n| DOWNLOADER_HTTP_TIMEOUT         | Duration    | 10s        | true        |                |\n| DOWNLOADER_TRANSPORT_TIMEOUT    | Duration    | 5s         | true        |                |\n\nThese configuration settings are used by the CLI tools that use the Pocket integration.\n\n| KEY                        | TYPE        | DEFAULT                         | REQUIRED    | DESCRIPTION                                                        |\n|----------------------------|-------------|---------------------------------|-------------|--------------------------------------------------------------------|\n| POCKET_UPDATE_FREQUENCY    | Duration    | 1h                              | true        | How often to query getPocket                                       |\n| POCKET_HEALTH_CHECK        | Duration    | 30s                             | true        |                                                                    |\n| POCKET_POCKET_URL          | String      | https://getpocket.com/v3/get    | true        |                                                                    |\n| POCKET_CONSUMER_KEY        | String      |                                 | true 
       | See https://getpocket.com/developer/apps/ to get a consumer key    |\n| POCKET_USERNAME            | String      |                                 |             | The pocket username (will try to fetch it if not found)            |\n| POCKET_TOKEN               | String      |                                 |             | The access token, will try to fetch it if not found or invalid     |\n","funding_links":["https://github.com/sponsors/owulveryck"],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fowulveryck%2Frepocketable","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fowulveryck%2Frepocketable","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fowulveryck%2Frepocketable/lists"}