{"id":22293633,"url":"https://github.com/xi/pyjsonproxy","last_synced_at":"2025-03-25T22:18:21.313Z","repository":{"id":26904545,"uuid":"30366268","full_name":"xi/PyJSONProxy","owner":"xi","description":"simple proxy and scraper","archived":false,"fork":false,"pushed_at":"2019-05-05T08:29:41.000Z","size":36,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-30T19:24:24.751Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/xi.png","metadata":{"files":{"readme":"README.rst","changelog":"CHANGES.rst","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-02-05T16:42:24.000Z","updated_at":"2019-05-05T08:29:19.000Z","dependencies_parsed_at":"2022-09-09T00:52:22.991Z","dependency_job_id":null,"html_url":"https://github.com/xi/PyJSONProxy","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xi%2FPyJSONProxy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xi%2FPyJSONProxy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xi%2FPyJSONProxy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xi%2FPyJSONProxy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/xi","download_url":"https://codeload.github.com/xi/PyJSONProxy/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245550681,"owners_count":20633883,"icon_url":"https://github.com/github.png","
version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-03T17:29:55.038Z","updated_at":"2025-03-25T22:18:21.270Z","avatar_url":"https://github.com/xi.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"PyJSONProxy - simple proxy and scraper\n\n\nsimple proxy\n============\n\nAJAX requests are restricted by the `same origin policy`_. This can be\nbypassed by using either `JSONP`_, `CORS`_ or a local proxy. This\nimplements the third variant. So you can do something like this::\n\n    $ curl http://localhost:5000/github/xi/\n    {\n      \"login\": \"xi\",\n      ...\n    }\n\nWith a configuration like this::\n\n    ENDPOINTS = {\n        'github': {\n            'host': 'https://api.github.com/users/'\n        }\n    }\n\n\nscraping\n========\n\nMaybe the more interesting part is that this also contains a simple\nscraping mechanism. 
So if a service does not offer an API but only plain\nHTML pages, PyJSONProxy can extract information from there::\n\n    $ curl http://localhost:5000/github/xi/\n    {\n      \"url\": \"https://github.com/xi/\",\n      \"login\": \"xi\",\n      \"activity\": [\n        ...\n      ],\n      \"repos\": [{\n        ...\n      }]\n      ...\n    }\n\n::\n\n    ENDPOINTS = {\n        'github': {\n            'host': 'https://github.com/',\n            'fields': {\n                'login': '.vcard-username',\n                'fullname': '.vcard-fullname',\n                'email': '.vcard-details .email',\n                'join-date': '.vcard-details .join-date@datetime',\n                'activity': {\n                    'selector': '.contribution-activity-listing ul a'\n                },\n                'repos': {\n                    'selector': '.popular-repos a.mini-repo-list-item',\n                    'fields': {\n                        'url': '@href',\n                        'name': '.repo',\n                        'description': '.repo-description'\n                    }\n                }\n            }\n        }\n    }\n\nSelectors are generally CSS-selectors with the additional option to\nselect an attribute by appending an ``@`` and the attribute name. If no\nattribute is selected, the text content of the element will be used.\n\n\nCORS header\n===========\n\nBy setting ``ALLOW_CORS`` to ``True``, an\n``Access-Control-Allow-Origin``-header with value ``*`` will be set for\nall responses.\n\n\nDocumentation\n=============\n\nSome simple documentation is automatically generated and available under\n``/`` (for all endpoints) or ``/{endpoint}/`` (for an individual\nendpoint). 
To provide some input for this documentation, you can add a\ndescription to both endpoints and fields::\n\n    ENDPOINTS = {\n        'github': {\n            'host': 'https://github.com/',\n            'doc': 'Access data about GitHub users',\n            'fields': {\n              'login': '.vcard-username',\n              'fullname': '.vcard-fullname',\n              'email': '.vcard-details .email',\n              'join-date': '.vcard-details .join-date@datetime'\n            },\n            'fields_doc': {\n              'login': 'github username',\n              'fullname': 'the user\\'s full name',\n              'join-date': 'date when the user joined github in ISO-xx format'\n            }\n        }\n    }\n\n\nNote on security and performance\n================================\n\nSecurity and performance were not a priority in this project. So it\nmight be bad.\n\n\n.. _same origin policy: https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin_policy\n.. _JSONP: https://en.wikipedia.org/wiki/JSONP\n.. _CORS: https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxi%2Fpyjsonproxy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxi%2Fpyjsonproxy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxi%2Fpyjsonproxy/lists"}