{"id":15041233,"url":"https://github.com/joekakone/db-analytics-tools","last_synced_at":"2025-08-01T18:31:42.097Z","repository":{"id":190057913,"uuid":"631620694","full_name":"joekakone/db-analytics-tools","owner":"joekakone","description":"Databases Analytics Tools - Data Integration - Data Visualization - Machine Learning","archived":false,"fork":false,"pushed_at":"2024-10-31T21:06:32.000Z","size":133,"stargazers_count":0,"open_issues_count":2,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2024-10-31T21:22:20.109Z","etag":null,"topics":["data-engineering","data-integration","data-visualization","etl","machine-learning","pipeline","sql"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/db-analytics-tools/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/joekakone.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-23T15:45:57.000Z","updated_at":"2024-10-31T21:06:36.000Z","dependencies_parsed_at":"2024-09-25T01:35:00.414Z","dependency_job_id":"c6f76e42-5f52-4caa-8716-d74784511714","html_url":"https://github.com/joekakone/db-analytics-tools","commit_stats":{"total_commits":47,"total_committers":1,"mean_commits":47.0,"dds":0.0,"last_synced_commit":"a2ac068264faa52afbfa6c3e3cc028b7422e0ef4"},"previous_names":["joekakone/db-analytics-tools"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joekakone%2Fdb-analytics-tools","tags_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/joekakone%2Fdb-analytics-tools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joekakone%2Fdb-analytics-tools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joekakone%2Fdb-analytics-tools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/joekakone","download_url":"https://codeload.github.com/joekakone/db-analytics-tools/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":228397819,"owners_count":17913540,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-engineering","data-integration","data-visualization","etl","machine-learning","pipeline","sql"],"created_at":"2024-09-24T20:45:47.476Z","updated_at":"2025-08-01T18:31:42.060Z","avatar_url":"https://github.com/joekakone.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/joekakone/db-analytics-tools/master/cover.png\"\u003e\u003cbr\u003e\n\u003c/div\u003e\n\n# DB Analytics Tools\nDatabases Analytics Tools is a Python open source micro framework for data analytics. DB Analytics Tools is built on top of Psycopg2, Pyodbc, Pandas, Matplotlib and Scikit-learn. 
It helps data analysts interact with data warehouses the way traditional database clients do.\n\n\n## Why adopt DB Analytics Tools?\n- Easy to learn: It is a high-level API and doesn't require any special effort to learn.\n- Real problem solver: It is designed to solve the real-life problems of data analysts.\n- All in one: It supports queries, data integration, analysis, visualization and machine learning.\n\n\n## Core Components\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth\u003e#\u003c/th\u003e\n    \u003cth\u003eComponent\u003c/th\u003e\n    \u003cth\u003eDescription\u003c/th\u003e\n    \u003cth\u003eHow to import\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e0\u003c/td\u003e\n    \u003ctd\u003edb\u003c/td\u003e\n    \u003ctd\u003eDatabase Interactions (Client)\u003c/td\u003e\n    \u003ctd\u003e\u003ccode\u003eimport db_analytics_tools as db\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e1\u003c/td\u003e\n    \u003ctd\u003edbi\u003c/td\u003e\n    \u003ctd\u003eData Integration \u0026 Data Engineering\u003c/td\u003e\n    \u003ctd\u003e\u003ccode\u003eimport db_analytics_tools.integration as dbi\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e2\u003c/td\u003e\n    \u003ctd\u003edba\u003c/td\u003e\n    \u003ctd\u003eData Analysis\u003c/td\u003e\n    \u003ctd\u003e\u003ccode\u003eimport db_analytics_tools.analytics as dba\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e3\u003c/td\u003e\n    \u003ctd\u003edbviz\u003c/td\u003e\n    \u003ctd\u003eData Visualization\u003c/td\u003e\n    \u003ctd\u003e\u003ccode\u003eimport db_analytics_tools.plotting as dbviz\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e4\u003c/td\u003e\n    \u003ctd\u003edbml\u003c/td\u003e\n    \u003ctd\u003eMachine Learning \u0026 MLOps\u003c/td\u003e\n    \u003ctd\u003e\u003ccode\u003eimport 
db_analytics_tools.learning as dbml\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n\n## Install DB Analytics Tools\n### Dependencies\nDB Analytics Tools requires:\n* Python\n* Psycopg2\n* Pyodbc\n* Pandas\n* SQLAlchemy\n* Streamlit\n\nDB Analytics Tools can easily be installed using pip:\n```sh\npip install db-analytics-tools\n```\n\n\n## Get Started\n### Setup client\nAs with traditional database clients, you need to provide the database server's IP address, port and credentials. DB Analytics Tools supports Postgres and SQL Server.\n```python\n# Import DB Analytics Tools\nimport db_analytics_tools as db\n\n# Database Infos \u0026 Credentials\nENGINE = \"postgres\"\nHOST = \"localhost\"\nPORT = \"5432\"\nDATABASE = \"postgres\"\nUSER = \"postgres\"\nPASSWORD = \"admin\"\n\n# Setup client\nclient = db.Client(host=HOST, port=PORT, database=DATABASE, username=USER, password=PASSWORD, engine=ENGINE)\n```\n\n### Data Definition Language\n```python\nquery = \"\"\"\n----- CREATE TABLE -----\ndrop table if exists public.transactions;\ncreate table public.transactions (\n    transaction_id integer primary key,\n    client_id integer,\n    product_name varchar(255),\n    product_category varchar(255),\n    quantity integer,\n    unitary_price numeric,\n    amount numeric\n);\n\"\"\"\n\nclient.execute(query=query)\n```\n\n### Data Manipulation Language\n```python\nquery = \"\"\"\n----- POPULATE TABLE -----\ninsert into public.transactions (transaction_id, client_id, product_name, product_category, quantity, unitary_price, amount)\nvalues\n\t(1,101,'Product A','Category 1',5,100,500),\n\t(2,102,'Product B','Category 2',3,50,150),\n\t(3,103,'Product C','Category 1',2,200,400),\n\t(4,102,'Product A','Category 1',7,100,700),\n\t(5,105,'Product B','Category 2',4,50,200),\n\t(6,101,'Product C','Category 1',1,200,200),\n\t(7,104,'Product A','Category 1',6,100,600),\n\t(8,103,'Product B','Category 2',2,50,100),\n\t(9,103,'Product C','Category 
1',8,200,1600),\n\t(10,105,'Product A','Category 1',3,100,300);\n\"\"\"\n\nclient.execute(query=query)\n```\n\n### Data Query Language\n```python\nquery = \"\"\"\n----- GET DATA -----\nselect *\nfrom public.transactions\norder by transaction_id;\n\"\"\"\n\ndataframe = client.read_sql(query=query)\nprint(dataframe.head())\n```\n```txt\n   transaction_id  client_id product_name product_category  quantity  unitary_price  amount\n0               1        101    Product A       Category 1         5          100.0   500.0\n1               2        102    Product B       Category 2         3           50.0   150.0\n2               3        103    Product C       Category 1         2          200.0   400.0\n3               4        102    Product A       Category 1         7          100.0   700.0\n4               5        105    Product B       Category 2         4           50.0   200.0\n```\n\n## Show current queries\nYou can list the running queries of the current user.\n```py\nclient.show_sessions()\n```\n\nYou can cancel a query by its session_id.\n```py\nclient.cancel_query(10284)\n```\n\nYou can go further and cancel queries that are locked.\n```py\nclient.cancel_locked_queries()\n```\nThis cancels all currently locked queries.\n\n## Implement SQL based ETL\nThe ETL API is in the integration module `db_analytics_tools.integration`. Let's import it and create an ETL object.\n```python\n# Import Integration module\nimport db_analytics_tools.integration as dbi\n\n# Setup ETL\netl = dbi.ETL(client=client)\n```\n\nETLs in DB Analytics Tools consist of functions with date parameters. Everything is done in one place, i.e. on the database. 
So first create a function on the database like this:\n```python\nquery = \"\"\"\n----- CREATE FUNCTION ON DB -----\ncreate or replace function public.fn_test(rundt date) returns integer\nlanguage plpgsql\nas\n$$\nbegin\n\t--- DEBUG MESSAGE ---\n\traise notice 'rundt : %', rundt;\n\n\t--- EXTRACT ---\n\n\t--- TRANSFORM ---\n\n\t--- LOAD ---\n\n\treturn 0;\nend;\n$$;\n\"\"\"\n\nclient.execute(query=query)\n```\n\n### Run a function\nThe ETL function can then be run with the `ETL.run()` method of the ETL class.\n```python\n# ETL Function\nFUNCTION = \"public.fn_test\"\n\n## Dates to run\nSTART = \"2023-08-01\"\nSTOP = \"2023-08-05\"\n\n# Run ETL\netl.run(function=FUNCTION, start_date=START, stop_date=STOP, freq=\"d\", reverse=False)\n```\n```\nFunction    : public.fn_test\nDate Range  : From 2023-08-01 to 2023-08-05\nIterations  : 5\n[Runing Date: 2023-08-01] [Function: public.fn_test] Execution time: 0:00:00.122600\n[Runing Date: 2023-08-02] [Function: public.fn_test] Execution time: 0:00:00.049324\n[Runing Date: 2023-08-03] [Function: public.fn_test] Execution time: 0:00:00.049409\n[Runing Date: 2023-08-04] [Function: public.fn_test] Execution time: 0:00:00.050019\n[Runing Date: 2023-08-05] [Function: public.fn_test] Execution time: 0:00:00.108267\n```\n\n### Run several functions\nMost of the time, several ETL functions must be run together, and DB Analytics Tools supports running functions as pipelines.\n```python\n## ETL Functions\nFUNCTIONS = [\n    \"public.fn_test\",\n    \"public.fn_test_long\",\n    \"public.fn_test_very_long\"\n]\n\n## Dates to run\nSTART = \"2023-08-01\"\nSTOP = \"2023-08-05\"\n\n# Run ETLs\netl.run_multiple(functions=FUNCTIONS, start_date=START, stop_date=STOP, freq=\"d\", reverse=False)\n```\n```\nFunctions   : ['public.fn_test', 'public.fn_test_long', 'public.fn_test_very_long']\nDate Range  : From 2023-08-01 to 2023-08-05\nIterations  : 5\n*********************************************************************************************\n[Runing Date: 
2023-08-01] [Function: public.fn_test..........] Execution time: 0:00:00.110408\n[Runing Date: 2023-08-01] [Function: public.fn_test_long.....] Execution time: 0:00:00.112078\n[Runing Date: 2023-08-01] [Function: public.fn_test_very_long] Execution time: 0:00:00.092423\n*********************************************************************************************\n[Runing Date: 2023-08-02] [Function: public.fn_test..........] Execution time: 0:00:00.111153\n[Runing Date: 2023-08-02] [Function: public.fn_test_long.....] Execution time: 0:00:00.111395\n[Runing Date: 2023-08-02] [Function: public.fn_test_very_long] Execution time: 0:00:00.110814\n*********************************************************************************************\n[Runing Date: 2023-08-03] [Function: public.fn_test..........] Execution time: 0:00:00.111044\n[Runing Date: 2023-08-03] [Function: public.fn_test_long.....] Execution time: 0:00:00.123229\n[Runing Date: 2023-08-03] [Function: public.fn_test_very_long] Execution time: 0:00:00.078432\n*********************************************************************************************\n[Runing Date: 2023-08-04] [Function: public.fn_test..........] Execution time: 0:00:00.127839\n[Runing Date: 2023-08-04] [Function: public.fn_test_long.....] Execution time: 0:00:00.111339\n[Runing Date: 2023-08-04] [Function: public.fn_test_very_long] Execution time: 0:00:00.140669\n*********************************************************************************************\n[Runing Date: 2023-08-05] [Function: public.fn_test..........] Execution time: 0:00:00.138380\n[Runing Date: 2023-08-05] [Function: public.fn_test_long.....] Execution time: 0:00:00.111157\n[Runing Date: 2023-08-05] [Function: public.fn_test_very_long] Execution time: 0:00:00.077731\n*********************************************************************************************\n```\n\n## Get started with the UI\nDB Analytics Tools UI is a web-based GUI  (`db_analytics_tools.webapp.UI`). 
No coding is needed; all you need is a JSON config file. Run the command below:\n```sh\ndb_tools start --config config.json --address 127.0.0.1 --port 8050\n```\n![](https://raw.githubusercontent.com/joekakone/db-analytics-tools/master/db-analytics-tools-ui-screenshot.png)\n\n## Interact with Airflow\nWe also provide a class for interacting with the Apache Airflow REST API.\n```py\n# Import Airflow class\nfrom db_analytics_tools.airflow import AirflowRESTAPI\n\n# Create an instance\nairflow = AirflowRESTAPI(AIRFLOW_API_URL, AIRFLOW_USERNAME, AIRFLOW_PASSWORD)\n\n# Get list of dags\nairflow.get_dags_list(include_all=False).head(10)\n\n# Get a dag details\nairflow.get_dag_details(dag_id=\"my_airflow_pipeline\", include_tasks=False)\n\n# Get list of tasks of a dag\nairflow.get_dag_tasks(dag_id=\"my_airflow_pipeline\").head(10)\n\n# Trigger a Job\nairflow.trigger_dag(dag_id=\"my_airflow_pipeline\", start_date='2025-03-11', end_date='2025-03-12')\n```\n\n\n## Documentation\nDocumentation is available at [https://joekakone.github.io/db-analytics-tools](https://joekakone.github.io/db-analytics-tools).\n\n\n## Help and Support\nIf you need help with DB Analytics Tools, please send me a message on [WhatsApp](https://wa.me/+22891518923) or send me an [email](mailto:contact@josephkonka.com).\n\n\n## Contributing\n[Please see the contributing docs.](CONTRIBUTING.md)\n\n\n## Maintainer\nDB Analytics Tools is maintained by [Joseph Konka](https://www.linkedin.com/in/joseph-koami-konka/). Joseph is a Data Science Professional with a focus on Python-based tools. He developed the base code while working at Togocom to automate his daily tasks. He packaged the code into a Python package called **SQL ETL Runner**, which later became **Databases Analytics Tools**. 
For more about Joseph Konka, please visit [www.josephkonka.com](https://josephkonka.com).\n\n\n## Let's get in touch\n[![Github Badge](https://img.shields.io/badge/-Github-000?style=flat-square\u0026logo=Github\u0026logoColor=white\u0026link=https://github.com/joekakone)](https://github.com/joekakone) [![Linkedin Badge](https://img.shields.io/badge/-LinkedIn-blue?style=flat-square\u0026logo=Linkedin\u0026logoColor=white\u0026link=https://www.linkedin.com/in/joseph-koami-konka/)](https://www.linkedin.com/in/joseph-koami-konka/) [![Twitter Badge](https://img.shields.io/badge/-Twitter-blue?style=flat-square\u0026logo=Twitter\u0026logoColor=white\u0026link=https://www.twitter.com/joekakone)](https://www.twitter.com/joekakone) [![Gmail Badge](https://img.shields.io/badge/-Gmail-c14438?style=flat-square\u0026logo=Gmail\u0026logoColor=white\u0026link=mailto:joseph.kakone@gmail.com)](mailto:joseph.kakone@gmail.com)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjoekakone%2Fdb-analytics-tools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjoekakone%2Fdb-analytics-tools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjoekakone%2Fdb-analytics-tools/lists"}