{"id":18885545,"url":"https://github.com/hazembz/mlh-demo","last_synced_at":"2026-04-16T04:31:06.932Z","repository":{"id":183271935,"uuid":"662523049","full_name":"HazemBZ/mlh-demo","owner":"HazemBZ","description":"A hybrid websites scraping system.","archived":false,"fork":false,"pushed_at":"2024-07-07T09:57:02.000Z","size":107,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-06-29T18:51:08.260Z","etag":null,"topics":["celery","python","redis","scraping","selenium","web-automation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/HazemBZ.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-07-05T10:09:54.000Z","updated_at":"2025-02-05T13:24:27.000Z","dependencies_parsed_at":"2024-07-07T11:05:50.974Z","dependency_job_id":null,"html_url":"https://github.com/HazemBZ/mlh-demo","commit_stats":null,"previous_names":["hazembz/mlh-demo"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/HazemBZ/mlh-demo","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HazemBZ%2Fmlh-demo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HazemBZ%2Fmlh-demo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HazemBZ%2Fmlh-demo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HazemBZ%2Fmlh-demo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/HazemBZ","download
_url":"https://codeload.github.com/HazemBZ/mlh-demo/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HazemBZ%2Fmlh-demo/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31871395,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-15T15:24:51.572Z","status":"online","status_checked_at":"2026-04-16T02:00:06.042Z","response_time":69,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["celery","python","redis","scraping","selenium","web-automation"],"created_at":"2024-11-08T07:19:43.600Z","updated_at":"2026-04-16T04:31:06.901Z","avatar_url":"https://github.com/HazemBZ.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n## Architecture\n\n\u003cp align=\"start\"\u003e\n    \u003cimg src=\"images/project-architecture.png\" width='600'\u003e\n\u003c/p\u003e\n\n\n## Components\n\n**Tasks Manager**: Orchestrates running Scenarios using a task queue;\n arranges the run order of scenarios (in sequence or in parallel),\n and handles pre/post scenario steps (resource cleanup, notifying the Web client of progress, etc.)\n\n**Task queue**: Runs jobs sent by the Tasks Manager\n\n**Scenario**: A unit that handles every step required to achieve the intended objective:\n - extracting data by scraping or from a received request\n - starting/stopping an agent that interacts with a web app\n\n**Agent**: A frontend client to 
interact with apps\n  that generate their content dynamically (single-page apps, etc.)\n\n**Templates**: Data formats (form inputs, selection menu dictionaries, etc.),\nreverse engineered to correctly parse and transfer data to the targeted website\n\n## Setup\n\n**Run containers**\n```\ndocker-compose up\n```\n\n## Testing\n\nIf you want to try a proof-of-concept run, you can test the system with\none of the automated interactions.\n\n1. Install httpie\n\n2. Send a pre-populated POST request\n\n```\nhttp localhost:8004/announce \u003c request.json\n```\n\n3. Log in at \"https://www.menzili.tn/connexion\" with \"tilzxivqvzjmubrgsh@cazlp.com\" as the email and \"tilzxivqvzjmubrgsh\" as the password, count to 10 😉 and refresh.\n\n4. Run `docker-compose logs web` for more detailed logs, and open the Flower dashboard at `localhost:5555` for info about created tasks\n\n5. Free resources with `docker-compose down`","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhazembz%2Fmlh-demo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhazembz%2Fmlh-demo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhazembz%2Fmlh-demo/lists"}