{"id":19881547,"url":"https://github.com/chifisource/toolipscrawl.jl","last_synced_at":"2025-10-13T22:45:01.488Z","repository":{"id":107336369,"uuid":"608906785","full_name":"ChifiSource/ToolipsCrawl.jl","owner":"ChifiSource","description":"toolips-based web-crawlers for julia","archived":false,"fork":false,"pushed_at":"2025-06-22T20:29:33.000Z","size":64,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-30T23:47:52.283Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Julia","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ChifiSource.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":["emmaccode","UnformalPenguin"]}},"created_at":"2023-03-03T01:08:32.000Z","updated_at":"2025-06-22T20:29:07.000Z","dependencies_parsed_at":"2023-11-11T20:20:58.881Z","dependency_job_id":"76589e43-165c-4928-b70f-3d8a734e57c0","html_url":"https://github.com/ChifiSource/ToolipsCrawl.jl","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ChifiSource/ToolipsCrawl.jl","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChifiSource%2FToolipsCrawl.jl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChifiSource%2FToolipsCrawl.jl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChifiSource%2FToolipsCrawl.jl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Chif
iSource%2FToolipsCrawl.jl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ChifiSource","download_url":"https://codeload.github.com/ChifiSource/ToolipsCrawl.jl/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChifiSource%2FToolipsCrawl.jl/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279017140,"owners_count":26085984,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-13T02:00:06.723Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T17:14:35.339Z","updated_at":"2025-10-13T22:45:01.443Z","avatar_url":"https://github.com/ChifiSource.png","language":"Julia","funding_links":["https://github.com/sponsors/emmaccode","https://github.com/sponsors/UnformalPenguin"],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n \u003cimg id=\"mainimage\" src=\"https://github.com/ChifiSource/image_dump/blob/main/toolips/toolipscrawl.png\"\u003e\u003c/img\u003e\n\n[documentation](https://chifidocs.com/toolips/ToolipsCrawl.jl)\n \n \u003ch6 id=\"crawlsub\"\u003etoolips crawl provides web-crawling for all!\u003c/h6\u003e\n \u003c/div\u003e\n\nThis package builds a web-scraping and web-crawling library atop the [toolips](https://github.com/ChifiSource/Toolips.jl) web-development framework. 
This package prominently features high-level syntax atop the `Toolips` `Component` structure.\n- [get started](#get-started)\n  - [installation](#installation)\n  - [documentation](#documentation)\n  - [overview](#overview)\n- [contributing guidelines](#contributing)\n### get started\n- To get started with `ToolipsCrawl`, you will need [julia.](https://julialang.org).\n###### installation\nAfter installing Julia, `ToolipsCrawl` may be installed with `Pkg`\n```julia\nusing Pkg; Pkg.add(\"ToolipsCrawl\")\n```\nAlternatively, `Unstable` may be added for the latest (sometimes broken) changes.\n```julia\nusing Pkg; Pkg.add(name = \"ToolipsCrawl\", rev = \"Unstable\")\n```\n##### documentation\nDocumentation for `ToolipsCrawl` is available on [chifidocs](https://chifidocs.com/toolips/ToolipsCrawl)\n## overview\n`ToolipsCrawl` usage centers around the `Crawler` type. This constructor is never called directly in conventional usage of the package, **instead** we use the high-level methods for `scrape` and `crawl`.\n- `scrape(f::Function, address::String)` -\u003e `::Crawler`\n- `scrape(f::Function, address::String, components::String ...)` -\u003e `::Crawler`\n- `crawl(f::Function, address::String)` -\u003e `::Crawler`\n- `crawl(f::Function, addresses::String ...)` -\u003e `::Crawler`\n\nAs of right now, there are two main functions for grabbing components...\n- `get_by_name(crawler::Crawler, name::String)` and `get_by_tag(crawler::Crawler, tag::String)`.\n\nThese *getters* are used on the `Crawler` within a scraping function provided to `crawl` or `scrape`.\n```julia\nusing ToolipsCrawl\nrows = []\nscrape(\"https://github.com/ChifiSource\") do c::Crawler\n    current_rows = get_by_tag(c, \"td\")\n    for row::ToolipsCrawl.Component{:td} in current_rows\n        push!(rows, row[:text])\n    end\nend\n```\n```julia\nusing ToolipsCrawl\ntitles = []\ncrawl(\"https://chifidocs.com\") do crawler::Crawler\n    title_comps = get_by_tag(crawler, \"title\")\n    if 
length(title_comps) \u003e 0\n        @info \"scraped title from \" * crawler.address\n        push!(titles, title_comps[1][:text])\n    end\nend\n```\n### contributing\n`chifi` tries to be quite lenient in accepting pull requests, but following these guidelines will help speed up our processes and make merging your pull request easier. Please consider the following guidelines:\n- Ensure the issue or the upgrade is applicable to the current version of the project on the `Unstable` branch.\n- **please pull-request to `Unstable`**\n- Open a unique issue for each problem; please do not group multiple problems into a single issue.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchifisource%2Ftoolipscrawl.jl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchifisource%2Ftoolipscrawl.jl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchifisource%2Ftoolipscrawl.jl/lists"}