{"id":24611091,"url":"https://github.com/jonathanporta/pyetl-framework","last_synced_at":"2025-03-18T15:24:27.728Z","repository":{"id":77506282,"uuid":"56897631","full_name":"JonathanPorta/pyetl-framework","owner":"JonathanPorta","description":null,"archived":false,"fork":false,"pushed_at":"2016-06-26T05:49:08.000Z","size":44,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-24T19:36:56.964Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JonathanPorta.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-04-23T03:03:50.000Z","updated_at":"2025-01-22T09:56:06.000Z","dependencies_parsed_at":"2023-04-09T14:17:00.588Z","dependency_job_id":null,"html_url":"https://github.com/JonathanPorta/pyetl-framework","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JonathanPorta%2Fpyetl-framework","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JonathanPorta%2Fpyetl-framework/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JonathanPorta%2Fpyetl-framework/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JonathanPorta%2Fpyetl-framework/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JonathanPorta","download_url":"https://codeload.github.com/Jonath
anPorta/pyetl-framework/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244246987,"owners_count":20422529,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-24T19:34:41.970Z","updated_at":"2025-03-18T15:24:27.723Z","avatar_url":"https://github.com/JonathanPorta.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# pyscrape\n\n### Setup Local Dev\nInstall Python, virtualenv, and the dependencies. To get started, go [here](https://realpython.com/blog/python/flask-by-example-part-1-project-setup).\n\nOnce\n\nThen: `pip install -r requirements.txt`\n\n### Pushing to Production/Staging on Heroku (Don't do this; CI should do this)\nThis is an example of how a sample app would deploy. This shouldn't be here.\n\n`git remote add heroku-staging git@heroku.com:pyscrape-staging.git`\n\n`git remote add heroku-production git@heroku.com:pyscrape-production.git`\n\nOr\n`make deploy`\n\n### Release\nFirst, create a new pip package. 
This will bump the patch version and write it to `VERSION`.\n`make package`\n\nThen, to push the package to the repository:\n`make release`\n\n### Usage\nPyetl is meant as a framework to help with the extraction, transformation and loading of data between sources.\n\nTo get started, create a new Python project and then `pip install pypscraper-framework`.\n\nTo run the Flask app frontend: `pyetl_flask`\n\nTo run the worker process: `pyetl_worker`\n\nThe following two environment variables are required:\n```\nexport APP_SETTINGS='DevelopmentConfig' # name of the corresponding config class for this env.\nexport APP_BASEDIR=$(pwd) # must point to directory containing your config file.\n```\n\nA config file is also required. See `config.py.example`.\n\n### Concepts\nThe framework relies heavily on naming conventions and magic imports.\n#### Pipe\nThis represents the flow of data from a source to a target. `pipe.start()` should do whatever is necessary to determine and enqueue any and all `ETLJob`s that must execute in order for a run to be considered successful.\n\n#### ETLJob\nA base class defined in the framework. It has three methods: extract, transform, load.\n\n#### Transformer/Extractor/Loader\nExtend each of these classes with a class that has the same name as your pipe's class. The ETLJob will run `{Transformer|Extractor|Loader}.execute()` when executing.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjonathanporta%2Fpyetl-framework","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjonathanporta%2Fpyetl-framework","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjonathanporta%2Fpyetl-framework/lists"}