{"id":13726065,"url":"https://github.com/yannham/mechaml","last_synced_at":"2025-03-16T14:31:19.842Z","repository":{"id":52176695,"uuid":"57646540","full_name":"yannham/mechaml","owner":"yannham","description":"OCaml functional web scraping library","archived":false,"fork":false,"pushed_at":"2021-05-06T11:12:53.000Z","size":334,"stargazers_count":91,"open_issues_count":1,"forks_count":6,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-02-27T10:59:08.224Z","etag":null,"topics":["html","ocaml","scraping","web"],"latest_commit_sha":null,"homepage":"","language":"OCaml","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"lgpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yannham.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGES.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-05-01T16:20:06.000Z","updated_at":"2024-10-21T20:04:00.000Z","dependencies_parsed_at":"2022-08-24T04:00:38.971Z","dependency_job_id":null,"html_url":"https://github.com/yannham/mechaml","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yannham%2Fmechaml","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yannham%2Fmechaml/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yannham%2Fmechaml/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yannham%2Fmechaml/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yannham","download_url":"https://codeload.github.com/yannham/mechaml/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":2
43819037,"owners_count":20352807,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["html","ocaml","scraping","web"],"created_at":"2024-08-03T01:02:51.395Z","updated_at":"2025-03-16T14:31:19.136Z","avatar_url":"https://github.com/yannham.png","language":"OCaml","funding_links":[],"categories":["OCaml"],"sub_categories":[],"readme":"# Mechaml [![Build Status](https://travis-ci.org/yannham/mechaml.svg?branch=master)](https://travis-ci.org/yannham/mechaml)\r\n\r\n## Description\r\n\r\nMechaml is a functional web scraping library that lets you:\r\n* Fetch web content\r\n* Analyze, fill and submit HTML forms\r\n* Handle cookies, headers and redirects\r\n\r\nMechaml is built on top of existing libraries that provide low-level features: [Cohttp](https://github.com/mirage/ocaml-cohttp) and\r\n[Lwt](https://github.com/ocsigen/lwt) for asynchronous I/O and HTTP handling, and\r\n[Lambdasoup](https://github.com/aantron/lambda-soup) to parse HTML. It provides\r\nan interface that handles the interactions between these and adds a few\r\nother features.\r\n\r\n## Overview\r\n\r\nThe library is divided into three main modules:\r\n* Agent: User-agent features. Perform requests, get back content, headers, status code, ...\r\n* Cookiejar: Cookie handling\r\n* Page: HTML parsing and form handling\r\n\r\nThe Format module provides helpers to manage formatted content in forms, such\r\nas dates, colors, etc. 
For more details, see the [documentation](https://yannham.github.io/mechaml/).\r\n\r\n## Installation\r\n\r\n### From opam\r\n```\r\nopam install mechaml\r\n```\r\n\r\n### From source\r\nMechaml uses the dune build system, which can be installed through opam. Then,\r\njust run\r\n```\r\ndune build\r\n```\r\nto build the library.\r\n\r\nUse `dune build @doc` to generate the documentation, `dune runtest` to build and\r\nexecute tests, and `dune build examples/XXX.exe` to compile example XXX.\r\n\r\n## Usage\r\n\r\nHere is a sample of code that fetches a web page, fills in a login form and submits\r\nit in the monadic style:\r\n\r\n```ocaml\r\nopen Mechaml\r\nmodule M = Agent.Monad\r\nopen M.Infix\r\n\r\nlet require msg = function\r\n  | Some a -\u003e a\r\n  | None -\u003e failwith msg\r\n\r\nlet action_login =\r\n  Agent.get \"http://www.somewebsite.com\"\r\n  \u003e|= Agent.HttpResponse.page\r\n  \u003e|= (function page -\u003e\r\n    page\r\n    |\u003e Page.form_with \"[name=login]\"\r\n    |\u003e require \"Can't find the login form !\"\r\n    |\u003e Page.Form.set \"username\" \"mynick\"\r\n    |\u003e Page.Form.set \"password\" \"@xlz43\")\r\n  \u003e\u003e= Agent.submit\r\n\r\nlet _ =\r\n  M.run (Agent.init ()) action_login\r\n```\r\n\r\nMore examples are available in the dedicated folder.\r\n\r\n## License\r\n\r\nGNU LGPL v3\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyannham%2Fmechaml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyannham%2Fmechaml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyannham%2Fmechaml/lists"}