{"id":19655990,"url":"https://github.com/sandrewtx08/gearbest_scraper","last_synced_at":"2025-07-23T05:33:33.593Z","repository":{"id":138692492,"uuid":"422988613","full_name":"sandrewTx08/Gearbest_Scraper","owner":"sandrewTx08","description":"Seeks catalog ads from Gearbest web page, scraping catalogs information then it's storing by a sequence of SQL commands through a relational database.","archived":false,"fork":false,"pushed_at":"2021-11-25T13:03:10.000Z","size":71,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-27T02:18:37.334Z","etag":null,"topics":["crawler","gearbest","lxml","python","scraper","scraping","sqlite3"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sandrewTx08.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-10-30T20:53:57.000Z","updated_at":"2021-11-25T13:03:13.000Z","dependencies_parsed_at":null,"dependency_job_id":"3c213c35-38e9-46be-8b4e-fb267c2cc25e","html_url":"https://github.com/sandrewTx08/Gearbest_Scraper","commit_stats":null,"previous_names":[],"tags_count":17,"template":false,"template_full_name":null,"purl":"pkg:github/sandrewTx08/Gearbest_Scraper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sandrewTx08%2FGearbest_Scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sandrewTx08%2FGearbest_Scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sandrewTx08%2FGearbest_
Scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sandrewTx08%2FGearbest_Scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sandrewTx08","download_url":"https://codeload.github.com/sandrewTx08/Gearbest_Scraper/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sandrewTx08%2FGearbest_Scraper/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266624728,"owners_count":23958299,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-23T02:00:09.312Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","gearbest","lxml","python","scraper","scraping","sqlite3"],"created_at":"2024-11-11T15:25:27.257Z","updated_at":"2025-07-23T05:33:33.562Z","avatar_url":"https://github.com/sandrewTx08.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Gearbest_Scraper\r\n\r\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\r\n![GitHub tag (latest by date)](https://img.shields.io/github/v/tag/sandrewTx08/Gearbest_Scraper)\r\n\r\n# Overview\r\n\r\nGearbest_Scraper is simple, intuitive and manageable. It collects catalog ads from the Gearbest web page, scrapes their information, and stores it in a relational database through a sequence of SQL commands.\r\n\r\n# 
Support\r\n\r\n| \u003cimg width=100px src=\"https://www.iconsdb.com/icons/preview/green/csv-xxl.png\"\u003e | \u003cimg width=100px src=\"https://logodownload.org/wp-content/uploads/2016/10/mysql-logo.png\"\u003e | \u003cimg width=100px src=\"https://logodownload.org/wp-content/uploads/2018/05/sqlite-logo.png\"\u003e | \u003cimg width=100px src=\"https://upload.wikimedia.org/wikipedia/commons/2/29/Postgresql_elephant.svg\"\u003e | \u003cimg width=100px src=\"https://logodownload.org/wp-content/uploads/2016/10/Microsoft-SQL-Server-Logo-1.png\"\u003e | \u003cimg width=100px src=\"https://logodownload.org/wp-content/uploads/2014/04/oracle-logo-0.png\"\u003e |\r\n|---|---|---|---|---|---|\r\n| CSV | MySQL | SQLite | PostgreSQL | Microsoft SQL Server | Oracle Database |\r\n| ✅ Available | ✅ Available | ✅ Available | ⚠️ Soon | ⚠️ Maybe soon | ⚠️ Maybe soon |\r\n\r\n# Installing\r\n \r\n1. Download: \r\n```bash\r\n\u003e git clone https://github.com/sandrewTx08/Gearbest_Scraper\r\n```\r\n\r\n2. Move to the directory \r\n\r\n```bash\r\n\u003e cd Gearbest_Scraper\r\n```\r\n\r\n3. Install dependencies: \r\n\r\n```bash\r\nGearbest_Scraper\u003e install.bat\r\n```\r\n\r\n__or__\r\n\r\n```bash\r\nGearbest_Scraper:~$ ./install.bash\r\n```\r\n\r\n__or__\r\n\r\n```bash\r\nGearbest_Scraper\u003e pip install -r requirements.txt\r\n```\r\n\r\n# How to use\r\n\r\n1. Define your search keywords in the __configuration.json__ file \r\n\r\n```json\r\n{\"search\":{\"list\":[\"keyword_foo_1\",\"keyword_foo_2\",\"keyword_foo_3\"]}}\r\n```\r\n\r\n2. 
Execute the program \r\n\r\n```bash\r\n\u003e cd Gearbest_Scraper\r\nGearbest_Scraper\u003e start.bat\r\n```\r\n\r\n__or__\r\n\r\n```bash\r\nGearbest_Scraper\u003e python main.py\r\n```\r\n\r\n# Methods\r\n\r\nMethods define how Gearbest_Scraper retrieves catalog ads.\r\n\r\nYou can use a simple script instead of passing an argument.\r\n\r\nWindows:\r\n\r\n```bash\r\nGearbest_Scraper\u003e start.bat\r\n```\r\n\r\nLinux:\r\n\r\n```bash\r\nGearbest_Scraper:~$ ./start.bash\r\n```\r\n\r\nExample of selecting the search method:\r\n \r\n```\r\nMethod: s\r\n```\r\n\r\n__Search__ is selected by default.\r\n\r\n## Link method\r\n\r\nThis method scrapes all catalogs linked from the main page panel called \"Category\".\r\nThe total number of pages is the sum of the parent and child links in the panel menu. Over time, the database gets larger.\r\n\r\n\u003cimg width=90% src=https://user-images.githubusercontent.com/89039740/140583781-b1ba8b7c-115c-4a3d-a17b-065bf359b39d.gif\u003e\r\n\r\nCommand line:\r\n```bash\r\n\u003e python main.py --mode link\r\n```\r\n\r\n## Search method\r\n\r\nThe search method uses a configuration file to set catalog targets.\r\nThe \"search\" section of the file must contain a \"list\" of keywords to scrape, as if each were typed into a search bar.\r\n\r\n\u003cimg width=90% src=https://user-images.githubusercontent.com/89039740/140584581-3d8c88f3-b32e-4eb4-9f86-575e27001e0b.png\u003e\r\n\r\nCommand line:\r\n```bash\r\n\u003e python main.py --mode search\r\n```\r\n\r\n## Popular method\r\n\r\nIt scrapes the most popular searches according to the web page.\r\n\r\n\u003cimg width=90% src=https://user-images.githubusercontent.com/89039740/140584576-1ed76b3c-beb8-464e-8fe0-10b78f5e85a6.png\u003e\r\n\r\nCommand line:\r\n```bash\r\n\u003e python main.py --mode popular\r\n```\r\n\r\n# Configuration file:\r\n\r\nThe configuration file must have the following fields:\r\n\r\n|field|key|description|\r\n|---|---|---|\r\n|method||settings related to its function|\r\n|connection|request|to request web 
pages|\r\n|connection|database|database settings|\r\n\r\n## Configuration examples:\r\n\r\nPhone brand list example:\r\n```json\r\n{\"search\":{\"list\":[\"asus\",\"huawei\",\"lenovo\",\"samsung\",\"ulefone\",\"xiaomi\"]}}\r\n```\r\n\r\nHTTP header example:\r\n```json\r\n{\"headers\":{\"User-Agent\":\"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)\"}}\r\n```\r\n\r\nDefining the database path:\r\n```json\r\n{\"database\":{\"sqlite\":{\"path\":\"C:/Users/some_user/Documents/gearbest_scraper.db\"}}}\r\n```\r\n\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsandrewtx08%2Fgearbest_scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsandrewtx08%2Fgearbest_scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsandrewtx08%2Fgearbest_scraper/lists"}