{"id":16715075,"url":"https://github.com/gusty/scrapem","last_synced_at":"2025-04-10T06:14:37.382Z","repository":{"id":142921070,"uuid":"75875612","full_name":"gusty/ScrapeM","owner":"gusty","description":"A monadic web scraping library","archived":false,"fork":false,"pushed_at":"2018-10-10T13:15:45.000Z","size":91,"stargazers_count":17,"open_issues_count":1,"forks_count":2,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-03-24T07:26:23.975Z","etag":null,"topics":["computation-expressions","extract","fsharp","monad","scraper","scrapping"],"latest_commit_sha":null,"homepage":null,"language":"F#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gusty.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-12-07T21:06:06.000Z","updated_at":"2022-05-02T09:00:25.000Z","dependencies_parsed_at":"2023-04-01T13:36:59.691Z","dependency_job_id":null,"html_url":"https://github.com/gusty/ScrapeM","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gusty%2FScrapeM","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gusty%2FScrapeM/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gusty%2FScrapeM/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gusty%2FScrapeM/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gusty","download_url":"https://codeload.github.com/gusty/ScrapeM/tar.gz/refs/hea
ds/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248166908,"owners_count":21058481,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computation-expressions","extract","fsharp","monad","scraper","scrapping"],"created_at":"2024-10-12T21:08:17.002Z","updated_at":"2025-04-10T06:14:37.363Z","avatar_url":"https://github.com/gusty.png","language":"F#","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ScrapeM\nA monadic web scraping library\n\nThis library makes [web scraping](https://en.wikipedia.org/wiki/Web_scraping) easier by automatically maintaining state across requests, handling cookies, form submission and HTTP headers.\n\n*One function to scrap'em all*\n\nThis is essentially a single-function library which integrates many existing libraries and presents several ways to approach web scraping by using different monads.\n\nAll other common functions used here come from different libraries like [FSharp.Data](http://fsharp.github.io/FSharp.Data/), [Http.fs](https://github.com/haf/Http.fs) and [F#+](https://github.com/gusty/FSharpPlus).\n\n*Scrapes the web with category*\n\nIt's possible to create stateful LINQ-style queries that simulate basic user interaction with form submission by using different flavours of State monads. 
Sequence expressions are also available to combine the data extracted from multiple web pages in the same query.\n\n## Getting started\n\nImportant: At the moment this library is at the 'Prototype' stage.\n\nRecommended: Visual Studio 2017, to avoid slow compile times for generic code.\n\nTo try the examples, first run the build script:\n\n    \u003e build.cmd // on Windows\n    $ ./build.sh  // on Unix\n    \nNow you can try the sample files:\n\n\n* [Basic query with state handling](Sample-State-1.fsx) - Extracts text from a website that requires a login.\n* [Basic query with multiple results](Sample-Seq-1.fsx) - Extracts many tables from a website, using a type provider.\n* [Advanced query with state handling and multiple results](Sample-StateT-Seq-1.fsx) - Extracts many texts from a website by using different logins.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgusty%2Fscrapem","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgusty%2Fscrapem","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgusty%2Fscrapem/lists"}