{"id":15502380,"url":"https://github.com/matsubara0507/scrapbook","last_synced_at":"2026-03-02T07:33:24.599Z","repository":{"id":29998397,"uuid":"123379061","full_name":"matsubara0507/scrapbook","owner":"matsubara0507","description":"this is cli that collect posts of site that is wrote in config yaml using feed or scraping.","archived":false,"fork":false,"pushed_at":"2022-06-05T13:55:11.000Z","size":287,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-09-22T06:37:31.256Z","etag":null,"topics":["haskell","haskell-application"],"latest_commit_sha":null,"homepage":null,"language":"Haskell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/matsubara0507.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-03-01T03:49:52.000Z","updated_at":"2022-06-05T13:55:12.000Z","dependencies_parsed_at":"2022-08-18T06:36:14.620Z","dependency_job_id":null,"html_url":"https://github.com/matsubara0507/scrapbook","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/matsubara0507/scrapbook","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matsubara0507%2Fscrapbook","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matsubara0507%2Fscrapbook/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matsubara0507%2Fscrapbook/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matsubara0507%2Fscrapbook/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/matsubara0507","d
ownload_url":"https://codeload.github.com/matsubara0507/scrapbook/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matsubara0507%2Fscrapbook/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29995040,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-02T01:47:34.672Z","status":"online","status_checked_at":"2026-03-02T02:00:07.342Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["haskell","haskell-application"],"created_at":"2024-10-02T09:09:37.151Z","updated_at":"2026-03-02T07:33:24.582Z","avatar_url":"https://github.com/matsubara0507.png","language":"Haskell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# scrapbook\n\n[![Hackage](https://img.shields.io/hackage/v/scrapbook.svg?style=flat)](https://hackage.haskell.org/package/scrapbook)\n![Build Application](https://github.com/matsubara0507/scrapbook/workflows/Build%20Application/badge.svg)\n[![](https://images.microbadger.com/badges/image/matsubara0507/scrapbook.svg)](https://microbadger.com/images/matsubara0507/scrapbook \"Get your own image badge on microbadger.com\")\n\nThis is a CLI tool that collects posts from the sites listed in a YAML config file, using feeds or scraping.\n\n## Usage\n\n1. Clone this repository or add the `scrapbook` package to `extra-deps` in `stack.yaml`\n2. 
run `stack install`\n\ne.g.\n\n```\n$ stack exec -- scrapbook -o \"example\" example/sites.yaml\n```\n\n### Docker\n\nBuild the Docker image:\n\n```\n$ stack --docker build -j 1 Cabal # only needed if the build runs out of memory in Docker\n$ stack --docker --local-bin-path=./bin install\n$ docker build -t matsubara0507/scrapbook . --build-arg local_bin_path=./bin\n```\n\n### Command\n\n```\nscrapbook [options] [input-file]\n  -o DIR                --output=DIR                 Write output to DIR instead of stdout.\n  -t FORMAT, -w FORMAT  --to=FORMAT, --write=FORMAT  Specify output format. default is `feed`.\n                        --version                    Show version\n```\n\n### GHCi\n\n```haskell\n\u003e\u003e import Control.Lens ((^.))\n\u003e\u003e import Data.Maybe\n\u003e\u003e conf \u003c- fromJust \u003c$\u003e readConfig \"example/sites.yaml\"\n\u003e\u003e (Right posts) \u003c- collect . fmap concat $ mapM (fetch . toSite) (conf ^. #sites)\n\u003e\u003e collect $ writeFeed \"example\" (fromJust $ conf ^. #feed) posts\nRight ()\n```\n\n## Example\n\nSee [matsubara0507/scrapbook-example](https://github.com/matsubara0507/scrapbook-example)\n\n## Documentation\n\nHow to write the YAML config file:\n\n```yaml\n# configuration for generating Atom feed (Optional)\nfeed:\n  ## written to the Atom feed as the site title\n  title: \"Sample Site Posts\"\n  ## written to the Atom feed as the site URL\n  baseUrl: \"https://example.com\"\n  ## file name (Optional)\n  ### if omitted, the input file's name is used\n  name: atom.xml\n\n# configuration for Haskellers' sites\nsites:\n    ## Title of site\n  - title: \"ひげメモ\"\n    ## Author of site\n    author: matsubara0507\n    ## URL of site\n    url: https://matsubara0507.github.io\n    ## Feed URL of site\n    ### there are several fields for setting the feed URL\n    ### `feed` is the basic field. 
This field auto-detects whether the feed is Atom or RSS 2.0.\n    feed: https://matsubara0507.github.io/feed\n  - title: \"Kuro's Blog\"\n    author: \"Hiroyuki Kurokawa\"\n    url: http://kurokawh.blogspot.com/\n    ### `atom` is for an Atom feed.\n    atom:\n      ### URL of the Atom feed\n      url: http://kurokawh.blogspot.com/feeds/posts/default\n      ### attributes that constrain which link is used from each Atom entry (Optional)\n      ### if omitted, the first link is chosen; if several attributes are set, all must match.\n      linkAttrs:\n        rel: alternate\n  - title: \"あどけない話\"\n    author: \"kazu-yamamoto\"\n    url: http://d.hatena.ne.jp/kazu-yamamoto\n    ### `rss` is for an RSS 2.0 feed.\n    ### set it to the feed URL.\n    rss: http://d.hatena.ne.jp/kazu-yamamoto/rss2\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmatsubara0507%2Fscrapbook","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmatsubara0507%2Fscrapbook","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmatsubara0507%2Fscrapbook/lists"}