{"id":23293316,"url":"https://github.com/theprojectsx/articleparser","last_synced_at":"2025-12-31T14:44:10.485Z","repository":{"id":230604902,"uuid":"779771454","full_name":"TheProjectsX/ArticleParser","owner":"TheProjectsX","description":"Parse Articles from WEB via Search Query","archived":false,"fork":false,"pushed_at":"2024-03-30T20:21:07.000Z","size":6,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-03T14:26:56.017Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TheProjectsX.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-03-30T18:34:59.000Z","updated_at":"2024-03-30T18:37:09.000Z","dependencies_parsed_at":"2025-04-06T18:57:29.654Z","dependency_job_id":null,"html_url":"https://github.com/TheProjectsX/ArticleParser","commit_stats":null,"previous_names":["theprojectsx/articleparser"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/TheProjectsX/ArticleParser","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TheProjectsX%2FArticleParser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TheProjectsX%2FArticleParser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TheProjectsX%2FArticleParser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TheProjectsX%2FArticleParser/manifests","owner_url":"https://repos.ecosyste.ms/api
/v1/hosts/GitHub/owners/TheProjectsX","download_url":"https://codeload.github.com/TheProjectsX/ArticleParser/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TheProjectsX%2FArticleParser/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265814838,"owners_count":23832780,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-20T06:14:49.722Z","updated_at":"2025-12-31T14:44:10.446Z","avatar_url":"https://github.com/TheProjectsX.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Article Parser\n\nGet Articles Data just by Search Query\n\n### Workflow:\n\n- Uses the googlesearch_python package to fetch Google Search result URLs\n- Uses requests to fetch each URL's webpage data\n- Uses the bs4, wikipedia and newspaper libraries to parse the actual contents of the URLs\n- Uses html2text to convert the HTML content to Markdown. 
Whether to convert is decided by the user\n- Returns a Generator Object containing Search Results\n\n### Installation:\n\nInstall using pip\n\n```bash script\npip install git+https://github.com/TheProjectsX/ArticleParser.git\n```\n\n## Usage\n\n### Get Articles via Search Query\n\n```python\nimport articleparser\n\narticlesData = articleparser.getArticles(query=\"What is Node JS?\")\nfor article in articlesData:\n    print(\"URL:\", article[\"url\"])\n    print(\"Title:\", article[\"title\"])\n    print(\"Body:\", article[\"body\"][:400])\n```\n\n### Get Google Search Results\n\n```python\nimport articleparser\n\nsearchResults = articleparser.getGoogleSearchResults(query=\"What is Node JS?\")\nfor article in searchResults:\n    print(\"URL:\", article[\"url\"])\n    print(\"Title:\", article[\"title\"])\n    print(\"Description:\", article[\"description\"])\n```\n\n### Parse an Article from a Specific URL\n\nPass a specific webpage URL to parse its content\n\n```python\nimport articleparser\n\narticle = articleparser.parseArticle(url=\"\")\n\nprint(\"Title:\", article[\"title\"])\nprint(\"Content:\", article[\"content\"][:400])\n```\n\n## NOTE:\n\nEach function has many useful parameters.\nYou can read their descriptions by hovering over the function in your editor or by opening the source file!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftheprojectsx%2Farticleparser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftheprojectsx%2Farticleparser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftheprojectsx%2Farticleparser/lists"}