{"id":30950740,"url":"https://github.com/chipscoco/oceanmonkey","last_synced_at":"2025-09-11T05:18:32.005Z","repository":{"id":57447703,"uuid":"432937935","full_name":"chipscoco/OceanMonkey","owner":"chipscoco","description":"OceanMonkey is a High-Level Distributed Web Crawling and Web Scraping framework base on multi-process and multi-coroutines, used to crawl websites and extract structured data from their pages like the classical scrapy framework. ","archived":false,"fork":false,"pushed_at":"2022-03-27T23:37:02.000Z","size":89,"stargazers_count":7,"open_issues_count":0,"forks_count":5,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-07T21:06:59.254Z","etag":null,"topics":["coroutines","crawler","multiprocessing","python","python3","scraper","scraping","spider"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/chipscoco.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-11-29T07:04:14.000Z","updated_at":"2025-07-03T14:04:37.000Z","dependencies_parsed_at":"2022-09-15T22:20:51.307Z","dependency_job_id":null,"html_url":"https://github.com/chipscoco/OceanMonkey","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/chipscoco/OceanMonkey","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chipscoco%2FOceanMonkey","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chipscoco%2FOceanMonkey/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chipscoco%2FOceanMonkey/releases","manifests_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/chipscoco%2FOceanMonkey/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/chipscoco","download_url":"https://codeload.github.com/chipscoco/OceanMonkey/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chipscoco%2FOceanMonkey/sbom","scorecard":{"id":278092,"data":{"date":"2025-08-11","repo":{"name":"github.com/chipscoco/OceanMonkey","commit":"bffd0c9cd3fca7822466f721c2c5308a96a33d1d"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3,"checks":[{"name":"Code-Review","score":0,"reason":"Found 0/27 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"SAST","score":0,"reason":"no SAST tool detected","details":["Warn: no pull requests merged into dev branch"],"documentation":{"short":"Determines if the project uses static code 
analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to 
analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'main'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection 
settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}}]},"last_synced_at":"2025-08-17T15:01:33.755Z","repository_id":57447703,"created_at":"2025-08-17T15:01:33.755Z","updated_at":"2025-08-17T15:01:33.755Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274580767,"owners_count":25311233,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-11T02:00:13.660Z","response_time":74,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["coroutines","crawler","multiprocessing","python","python3","scraper","scraping","spider"],"created_at":"2025-09-11T05:18:31.000Z","updated_at":"2025-09-11T05:18:31.987Z","avatar_url":"https://github.com/chipscoco.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cimg src=\"https://github.com/chipscoco/OceanMonkey/blob/main/artwork/logo.jpg\"\u003e\n   \n\nOverview\n========\n\nOceanMonkey is a high-level distributed web crawling and web scraping framework based on multiple processes and coroutines, used to\ncrawl websites and extract structured data from their pages, like the classic Scrapy framework.\n\n## Installation guide\n\n### Supported OS\n\n    OceanMonkey works on Linux, Windows and macOS.\n\n### Supported Python versions\n\n    OceanMonkey requires Python 3.5+ (CPython implementation).\n\n### 
Installing\nIf you’re already familiar with installing Python packages, you can install OceanMonkey and its dependencies from PyPI with:\n\n    pip install oceanmonkey\n\nYou can also install OceanMonkey by downloading the project's source code and installing it through setup.py:\n    \n    python setup.py install\n\n## Quick Start\n\n### Create a Monkey Project\nUse the monkeys command to create an OceanMonkey project:\n  \n    monkeys startproject BeBe\nor:\n\n    monkeys startproject  D:\\BeBe\n    \n### Write the scraping logic\nWhen you execute the startproject command, it generates two Python script files under the monkeys' directory,\nnamely **gibbons.py** and **orangutans.py**. Write your scraping logic in gibbons.py.\n\n### Write the store logic\nWrite the logic for cleaning and storing items extracted from the page source in **orangutans.py**.\n\n### Run the project\nTo run the project, execute the run command under the project's directory:\n\n    cd BeBe\n    monkeys run\n    \n## Sample code\n```\nfrom oceanmonkey import Gibbon\nfrom oceanmonkey import Request\nfrom oceanmonkey import Signal,SignalValue\n\n\nclass WuKong(Gibbon):\n    handle_httpstatus_list = [404, 500]\n    allowed_domains = ['www.chipscoco.com']\n    start_id = 9\n\n    def parse(self, response):\n        if response.status_code in self.handle_httpstatus_list or response.repeated:\n            self.start_id += 1\n            next_url = \"http://www.chipscoco.com/?id={}\".format(self.start_id)\n            yield Request(url=next_url, callback=self.parse)\n        else:\n            item = {}\n            item['author'] = response.xpath('//span[@class=\"mr20\"]/text()').extract_first()\n            item['title'] = response.xpath('//h1[@class=\"f-22 mb15\"]/text()').extract_first()\n            yield item\n            self.start_id += 1\n            next_url = \"http://www.chipscoco.com/?id={}\".format(self.start_id)\n            yield 
Request(url=next_url, callback=self.parse)\n            yield Signal(value=SignalValue.SAY_GOODBYE)\n```\nFor detailed usage of OceanMonkey, see [https://github.com/chipscoco/OceanMonkey/tree/main/docs](https://github.com/chipscoco/OceanMonkey/tree/main/docs).\n\n## Contact\n\n|Author          | Email            | WeChat      |\n| ---------------|:----------------:| -----------:|\n| chenzhengqiang | chenzhengqiang@chipscoco.com | Pretty-Style |\n\n**Notice: Any comments and suggestions are welcome.**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchipscoco%2Foceanmonkey","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchipscoco%2Foceanmonkey","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchipscoco%2Foceanmonkey/lists"}