{"id":27335255,"url":"https://github.com/buabaj/xplore","last_synced_at":"2025-04-12T14:47:34.910Z","repository":{"id":52429444,"uuid":"287091839","full_name":"buabaj/xplore","owner":"buabaj","description":"A python package built for data scientist/analysts, AI/ML engineers for exploring features of a dataset in minimal number of lines of code for quick analysis before data wrangling and feature extraction.","archived":false,"fork":false,"pushed_at":"2021-04-29T14:21:52.000Z","size":1825,"stargazers_count":21,"open_issues_count":3,"forks_count":11,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-04-04T23:29:17.216Z","etag":null,"topics":["artificial-intelligence","data-preprocessing","data-science","data-wrangling","machine-learning"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/xplore/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/buabaj.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-08-12T18:51:09.000Z","updated_at":"2025-02-03T15:22:34.000Z","dependencies_parsed_at":"2022-08-17T23:40:49.535Z","dependency_job_id":null,"html_url":"https://github.com/buabaj/xplore","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/buabaj%2Fxplore","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/buabaj%2Fxplore/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/buabaj%2Fxplore/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/buabaj%2Fxplore/manifests","owner_url"
:"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/buabaj","download_url":"https://codeload.github.com/buabaj/xplore/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248585279,"owners_count":21128974,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","data-preprocessing","data-science","data-wrangling","machine-learning"],"created_at":"2025-04-12T14:47:33.750Z","updated_at":"2025-04-12T14:47:34.895Z","avatar_url":"https://github.com/buabaj.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# xplore [![Downloads](https://static.pepy.tech/personalized-badge/xplore?period=total\u0026units=international_system\u0026left_color=brightgreen\u0026right_color=blue\u0026left_text=Downloads)](https://pepy.tech/project/xplore)\n---\nxplore is a Python package built on pandas that lets data scientists, analysts, AI/ML engineers, and researchers explore the features of a dataset in one line of code, for quick analysis before data wrangling and feature extraction. 
You can also generate a more detailed exploration report on request.\n\n---\n## Getting started\n\n### Install the package\n```bash\npip install xplore\n```\n\n### Import the package into your code\n```python\nfrom xplore.data import xplore\n```\n\n### Read your structured dataset from a file path or URL into a variable\n```python\nimport pandas as pd\ndata = pd.read_csv('path/to/your_dataset.csv')  # hypothetical path: use your own file path or URL\n```\n\n### Explore your dataset using the xplore() method\n```python\nxplore(data)\n```\n---\n\n### Testing xplore\nAfter installing the package, run the code in the test.py file to see how xplore works.\n\n---\n\n## Sample Output\n```python\n------------------------------------\nThe fist 5 entries of your dataset are:\n\n   rank country_full country_abrv  total_points  ...  three_year_ago_avg  three_year_ago_weighted  confederation   rank_date\n0     1      Germany          GER           0.0  ...                 0.0                      0.0           UEFA  1993-08-08\n1     2        Italy          ITA           0.0  ...                 0.0                      0.0           UEFA  1993-08-08\n2     3  Switzerland          SUI           0.0  ...                 0.0                      0.0           UEFA  1993-08-08\n3     4       Sweden          SWE           0.0  ...                 0.0                      0.0           UEFA  1993-08-08\n4     5    Argentina          ARG           0.0  ...                 0.0                      0.0       CONMEBOL  1993-08-08\n\n[5 rows x 16 columns]\n\n\n------------------------------------\nThe last 5 entries of your dataset are:\n\n       rank country_full country_abrv  total_points  ...  three_year_ago_avg  three_year_ago_weighted  confederation   rank_date\n57788   206     Anguilla          AIA           0.0  ...                 0.0                      0.0       CONCACAF  2018-06-07\n57789   206      Bahamas          BAH           0.0  ...     
            0.0                      0.0       CONCACAF  2018-06-07\n57790   206      Eritrea          ERI           0.0  ...                 0.0                      0.0            CAF  2018-06-07\n57791   206      Somalia          SOM           0.0  ...                 0.0                      0.0            CAF  2018-06-07\n57792   206        Tonga          TGA           0.0  ...                 0.0                      0.0            OFC  2018-06-07\n\n[5 rows x 16 columns]\n\n\n------------------------------------\nStats on your dataset:\n\n\u003cbound method NDFrame.describe of        rank country_full country_abrv  total_points  ...  three_year_ago_avg  three_year_ago_weighted  confederation   rank_date\n0         1      Germany          GER           0.0  ...                 0.0                      0.0           UEFA  1993-08-08\n1         2        Italy          ITA           0.0  ...                 0.0                      0.0           UEFA  1993-08-08\n2         3  Switzerland          SUI           0.0  ...                 0.0                      0.0           UEFA  1993-08-08\n3         4       Sweden          SWE           0.0  ...                 0.0                      0.0           UEFA  1993-08-08\n4         5    Argentina          ARG           0.0  ...                 0.0                      0.0       CONMEBOL  1993-08-08\n...     ...          ...          ...           ...  ...                 ...                      ...            ...         ...\n57788   206     Anguilla          AIA           0.0  ...                 0.0                      0.0       CONCACAF  2018-06-07\n57789   206      Bahamas          BAH           0.0  ...                 0.0                      0.0       CONCACAF  2018-06-07\n57790   206      Eritrea          ERI           0.0  ...                 0.0                      0.0            CAF  2018-06-07\n57791   206      Somalia          SOM           0.0  ...                 
0.0                      0.0            CAF  2018-06-07\n57792   206        Tonga          TGA           0.0  ...                 0.0                      0.0            OFC  2018-06-07\n\n[57793 rows x 16 columns]\u003e\n\n\n------------------------------------\nThe Value types of each column are:\n\nrank                         int64\ncountry_full                object\ncountry_abrv                object\ntotal_points               float64\nprevious_points              int64\nrank_change                  int64\ncur_year_avg               float64\ncur_year_avg_weighted      float64\nlast_year_avg              float64\nlast_year_avg_weighted     float64\ntwo_year_ago_avg           float64\ntwo_year_ago_weighted      float64\nthree_year_ago_avg         float64\nthree_year_ago_weighted    float64\nconfederation               object\nrank_date                   object\ndtype: object\n\n\n------------------------------------\nInfo on your Dataset:\n\n\u003cbound method DataFrame.info of        rank country_full country_abrv  total_points  ...  three_year_ago_avg  three_year_ago_weighted  confederation   rank_date\n0         1      Germany          GER           0.0  ...                 0.0                      0.0           UEFA  1993-08-08\n1         2        Italy          ITA           0.0  ...                 0.0                      0.0           UEFA  1993-08-08\n2         3  Switzerland          SUI           0.0  ...                 0.0                      0.0           UEFA  1993-08-08\n3         4       Sweden          SWE           0.0  ...                 0.0                      0.0           UEFA  1993-08-08\n4         5    Argentina          ARG           0.0  ...                 0.0                      0.0       CONMEBOL  1993-08-08\n...     ...          ...          ...           ...  ...                 ...                      ...            ...         ...\n57788   206     Anguilla          AIA           0.0  ...                 
0.0                      0.0       CONCACAF  2018-06-07\n57789   206      Bahamas          BAH           0.0  ...                 0.0                      0.0       CONCACAF  2018-06-07\n57790   206      Eritrea          ERI           0.0  ...                 0.0                      0.0            CAF  2018-06-07\n57791   206      Somalia          SOM           0.0  ...                 0.0                      0.0            CAF  2018-06-07\n57792   206        Tonga          TGA           0.0  ...                 0.0                      0.0            OFC  2018-06-07\n\n[57793 rows x 16 columns]\u003e\n\n\n------------------------------------\nThe shape of your dataset in the order of rows and columns is:\n\n(57793, 16)\n\n\n------------------------------------\nThe features of your dataset are:\n\nIndex(['rank', 'country_full', 'country_abrv', 'total_points',\n       'previous_points', 'rank_change', 'cur_year_avg',\n       'cur_year_avg_weighted', 'last_year_avg', 'last_year_avg_weighted',\n       'two_year_ago_avg', 'two_year_ago_weighted', 'three_year_ago_avg',\n       'three_year_ago_weighted', 'confederation', 'rank_date'],\n      dtype='object')\n\n\n------------------------------------\nThe total number of null values from individual columns of your data set are:\n\nrank                       0\ncountry_full               0\ncountry_abrv               0\ntotal_points               0\nprevious_points            0\nrank_change                0\ncur_year_avg               0\ncur_year_avg_weighted      0\nlast_year_avg              0\nlast_year_avg_weighted     0\ntwo_year_ago_avg           0\ntwo_year_ago_weighted      0\nthree_year_ago_avg         0\nthree_year_ago_weighted    0\nconfederation              0\nrank_date                  0\ndtype: int64\n\n\n------------------------------------\nThe number of rows in your dataset are:\n\n57793\n\n\n------------------------------------\nThe values in your dataset are:\n\n[[1 'Germany' 'GER' ... 
0.0 'UEFA' '1993-08-08']\n [2 'Italy' 'ITA' ... 0.0 'UEFA' '1993-08-08']\n [3 'Switzerland' 'SUI' ... 0.0 'UEFA' '1993-08-08']\n ...\n [206 'Eritrea' 'ERI' ... 0.0 'CAF' '2018-06-07']\n [206 'Somalia' 'SOM' ... 0.0 'CAF' '2018-06-07']\n [206 'Tonga' 'TGA' ... 0.0 'OFC' '2018-06-07']]\n\n\n------------------------------------\n\n\nDo you want to generate a detailed report on the exploration of your dataset?\n[y/n]: y\nGenerating report...\n\nSummarize dataset: 100%|████████████████████████████████████████████████████████████████████████████| 30/30 [03:34\u003c00:00,  7.14s/it, Completed] \nGenerate report structure: 100%|█████████████████████████████████████████████████████████████████████████████████| 1/1 [00:31\u003c00:00, 31.42s/it] \nRender HTML: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:12\u003c00:00, 12.07s/it] \nExport report to file: 100%|█████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00\u003c00:00,  8.00it/s] \nYour Report has been generated and saved as 'output.html'\n```\n---\n\n## Contributing to xplore\nFork and clone this repo if you have any contributions you want to make. \nPush your commits to a new branch and send a PR when done.\nI'll review your code and merge your PR as soon as possible.\n\n## Maintainers: \n[Jerry Buaba](https://www.linkedin.com/in/buabaj/) | \n[Labaran Mohammed](https://linkedin.com/in/adam-labaran-111358181) | \n[Benjamin Acquaah](https://linkedin.com/in/benjamin-acquaah-9294aa14b)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbuabaj%2Fxplore","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbuabaj%2Fxplore","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbuabaj%2Fxplore/lists"}