{"id":16390193,"url":"https://github.com/davidzajac1/reptoro","last_synced_at":"2026-05-15T01:39:01.941Z","repository":{"id":216234224,"uuid":"404814180","full_name":"davidzajac1/Reptoro","owner":"davidzajac1","description":"A Data Visualization and Analytics Platform for the Reptile Industry","archived":false,"fork":false,"pushed_at":"2022-07-05T01:34:34.000Z","size":97210,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-05-13T00:46:44.675Z","etag":null,"topics":["analytics","data-analysis","data-visualization","plotly-dash","python"],"latest_commit_sha":null,"homepage":"https://reptoro.herokuapp.com/","language":"CSS","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/davidzajac1.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2021-09-09T17:35:20.000Z","updated_at":"2023-05-20T05:21:06.000Z","dependencies_parsed_at":"2024-01-09T06:03:19.678Z","dependency_job_id":null,"html_url":"https://github.com/davidzajac1/Reptoro","commit_stats":null,"previous_names":["davidzajac1/reptoro"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidzajac1%2FReptoro","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidzajac1%2FReptoro/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidzajac1%2FReptoro/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidzajac1%2FReptoro/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owner
s/davidzajac1","download_url":"https://codeload.github.com/davidzajac1/Reptoro/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":219862885,"owners_count":16555951,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analytics","data-analysis","data-visualization","plotly-dash","python"],"created_at":"2024-10-11T04:42:24.633Z","updated_at":"2025-10-04T11:42:12.440Z","avatar_url":"https://github.com/davidzajac1.png","language":"CSS","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://reptoro.herokuapp.com/\"\u003e\n      \u003cimg width=\"100%\" src=\"img/reptoro_header.PNG\" alt=\"Header\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n![Language](https://img.shields.io/badge/Language-Python-success?style=flat)\n![Flask](https://img.shields.io/badge/Flask-v2.0.1-informational?style=flat)\n![pandas](https://img.shields.io/badge/pandas-v1.3.3-informational?style=flat)\n![SQLAlchemy](https://img.shields.io/badge/SQLAlchemy-v1.4.23-informational?style=flat)\n![dash](https://img.shields.io/badge/dash-v2.0.0-informational?style=flat)\n\n# A Data Visualization and Analytics Platform for the Reptile Industry\n\nMost billion-dollar industries have many large corporations that employ full-time business analysts to crunch numbers and analyze data, keeping prices and margins competitive.\n\nThe exotic reptile trade is a rare exception, with only a handful of medium-sized players. Because the majority of the industry consists of “Mom and Pop” operations, there 
are potentially undiscovered high-margin opportunities. These animals frequently sell for four and even five figures but live inside an enclosure the size of a shoebox.\n\nReptoro is the first and only data visualization and analytics platform specific to the exotic reptile industry.\n\n## How it Works\n\nReptoro continuously scrapes industry marketplaces, adding new animals and breeders to the database and tracking price and status changes on existing listings.\n\nBelow is a flowchart of the cloud-based tech stack and ETL pipeline. [Apache Airflow](https://github.com/apache/airflow) schedules and orchestrates periodic web scrapes, which use AWS Lambda functions to request and parse data.\n\nScraped data is written to an AWS S3 bucket, where it is cleaned and transformed using AWS. Once the scrape is over, a Lambda function updates our PostgreSQL database.\n\nReptoro is hosted on an AWS EC2 instance, runs on the Flask framework, and uses [`dash`](https://github.com/plotly/dash) to display data interactively on dashboards.\n\n![alt text](img/flowchart.JPG)\n\n## ETL Pipeline\n\nBelow is a flowchart of our ETL pipeline from the Apache Airflow GUI. All long-running tasks in the pipeline run at 10x concurrency, using Boto3 to trigger multiple Lambda functions at the same time.\n\nIn the first step, a Lambda function uses Selenium to drive a Chrome browser that interacts with the website, logs in to an account, and extracts the session ID so that it can be passed to other Lambda functions to access login-required data.\n\nAfter the login session IDs are extracted, they are passed down the pipeline, and more Lambda functions scrape all search results pages. 
The URLs are checked against the PostgreSQL database to find listings that are not already in the database.\n\nThe new listings are then scraped and inserted into the database. The same process is repeated for all seller profiles.\n\n![alt text](img/airflow_chart.JPG)\n\n\n## Web App\n\nReptoro is hosted on an AWS EC2 instance using AWS Elastic Beanstalk to auto-scale and provision resources, so that the website keeps functioning during an influx of requests.\n\nThe web application is written in Python using [`Flask`](https://github.com/pallets/flask) and [`Jinja2`](https://github.com/pallets/jinja), in tandem with a Bootstrap template, to quickly build a beautiful, dynamic user interface.\n\nPandas DataFrames are created by querying the database through the [`SQLAlchemy`](https://github.com/sqlalchemy/sqlalchemy) ORM and are then passed into the Dash framework to quickly yield interactive 2D and 3D visualizations.\n\n![alt text](img/dashboard.JPG)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavidzajac1%2Freptoro","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdavidzajac1%2Freptoro","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavidzajac1%2Freptoro/lists"}