{"id":21885020,"url":"https://github.com/saeidemadi/scrapeformankan","last_synced_at":"2025-03-22T01:31:59.177Z","repository":{"id":243837727,"uuid":"812493448","full_name":"saeidEmadi/scrapeForMankan","owner":"saeidEmadi","description":"This is a student project for a data mining course and is a simple exercise","archived":false,"fork":false,"pushed_at":"2024-06-17T13:08:29.000Z","size":1926,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-26T19:46:43.274Z","etag":null,"topics":["activity","crawling","csv","dataset","food","linear-regression","randomforestregressor","spider","spiderman","webscraping"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/saeidEmadi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-09T03:48:51.000Z","updated_at":"2024-06-17T13:08:32.000Z","dependencies_parsed_at":"2024-06-17T13:06:13.966Z","dependency_job_id":null,"html_url":"https://github.com/saeidEmadi/scrapeForMankan","commit_stats":null,"previous_names":["saeidemadi/scrapeformankan"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/saeidEmadi%2FscrapeForMankan","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/saeidEmadi%2FscrapeForMankan/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/saeidEmadi%2FscrapeForMankan/releases","manifests_url":"https://repos
.ecosyste.ms/api/v1/hosts/GitHub/repositories/saeidEmadi%2FscrapeForMankan/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/saeidEmadi","download_url":"https://codeload.github.com/saeidEmadi/scrapeForMankan/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244893474,"owners_count":20527601,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["activity","crawling","csv","dataset","food","linear-regression","randomforestregressor","spider","spiderman","webscraping"],"created_at":"2024-11-28T10:18:19.839Z","updated_at":"2025-03-22T01:31:59.150Z","avatar_url":"https://github.com/saeidEmadi.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# scrape For Mankan\nThis is a student project for a data mining course and a simple exercise.\nIn this project, we extract data from a site using ``` web scraping ``` and ``` crawling ``` methods and build a dataset from it.\nAfter cleaning and preparing the dataset, we first analyze it and then use the results to make predictions and guide users.\n\n*You can easily calculate the amount of calories needed based on the amount of daily activity:*\n  - **ridingBike**\n  - **running**\n  - **walking**\n  - **cleaningUp**\n    \n*and foods that provide the same amount of calories to the body.*\n\n\n## DataSet [Mankan_dataset.csv](https://github.com/saeidEmadi/scrapeForMankan/blob/main/Mankan_dataset.csv)\nThis dataset contains useful information about foods and edibles, such as the amount of ```calories, protein, etc```.\n
\n**This dataset has 1821 records and 11 columns**\n\n| siteId | name | calory | carbo | protein | fat | fiber | activity1 | activity2 | activity3 | activity4 |\n| ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- |\n\nThe units of the columns are:\n- ` siteId ` : *Integer*\n- ` name ` : *String*\n- ` calory ` : *Kcal[kilocalorie]*\n- ` carbo ` : *g[Gram]*\n- ` protein ` : *g[Gram]*\n- ` fat ` : *g[Gram]*\n- ` fiber ` : *g[Gram]*\n- `activity1 = ridingBike` : *m[Minute]*\n- `activity2 = running` : *m[Minute]*\n- `activity3 = walking` : *m[Minute]*\n- `activity4 = cleaningUp` : *m[Minute]*\n\n\nIn this project, we use the following two models, listed with the accuracy each achieved:\n- `Linear regression`: *0.84*\n- `RandomForestRegressor`: *0.87*\n  \nWe use **RandomForestRegressor** for prediction because it is the more accurate of the two.\n\n\n\n**Dataset Reference:** [Mankan.me](https://www.mankan.me/)\n\u003e [!TIP]\n\u003e Thanks to the Mankan site\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsaeidemadi%2Fscrapeformankan","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsaeidemadi%2Fscrapeformankan","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsaeidemadi%2Fscrapeformankan/lists"}