https://github.com/cbmira01/featureranking
A data science technique implemented in OpenCL.
- Host: GitHub
- URL: https://github.com/cbmira01/featureranking
- Owner: cbmira01
- License: other
- Created: 2021-06-08T14:52:44.000Z (about 5 years ago)
- Default Branch: main
- Last Pushed: 2022-04-14T20:03:47.000Z (about 4 years ago)
- Last Synced: 2025-01-19T04:36:00.250Z (over 1 year ago)
- Topics: data-science, entropy, feature-reduction, machine-learning-algorithms, numpy, opencl, pyopencl, python, windows-10
- Language: Python
- Homepage:
- Size: 391 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 9
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
# Feature Ranking
This Python project implements a compute-intensive data mining task on
workstation devices capable of parallel computation. A particular data science
algorithm for feature ranking is implemented in the OpenCL framework and
compared to an unaccelerated version written in Python (NumPy).
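As a rough illustration of what an entropy-based feature ranking can look like in plain NumPy, here is a minimal sketch. It is an assumption for illustration only and not necessarily the algorithm this project implements: it assumes the features are already scaled to [0, 1], uses a similarity-based entropy over all pairs of rows, and scores each feature by the entropy of the dataset with that feature left out. The names `dataset_entropy` and `rank_features` are hypothetical.

```python
import numpy as np


def dataset_entropy(data):
    """Similarity-based entropy of the rows of `data` (samples x features)."""
    # Pairwise Euclidean distances between all rows.
    diff = data[:, None, :] - data[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Keep only the upper triangle so each pair of rows is counted once.
    pairs = dist[np.triu_indices(len(data), k=1)]
    mean_dist = pairs.mean()
    if mean_dist == 0:
        return 0.0  # all rows identical: no disorder to measure
    # Scale so that a pair at the mean distance has similarity 0.5.
    alpha = np.log(2.0) / mean_dist
    sim = np.exp(-alpha * pairs)
    # Clip to avoid log(0) for identical or very distant pairs.
    sim = np.clip(sim, 1e-12, 1.0 - 1e-12)
    return float(-(sim * np.log2(sim) + (1 - sim) * np.log2(1 - sim)).sum())


def rank_features(data):
    """Return (order, scores): feature indices sorted by leave-one-out entropy,
    highest first, plus the raw score for each feature."""
    scores = [
        dataset_entropy(np.delete(data, f, axis=1))
        for f in range(data.shape[1])
    ]
    order = [int(i) for i in np.argsort(scores)[::-1]]
    return order, scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    order, scores = rank_features(rng.random((20, 5)))
    print("feature order:", order)
```

The pairwise distance step grows quadratically with the number of rows and is repeated once per feature, which is the kind of workload that benefits from parallel devices.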
## How to run this project
### Requirements
This project was built and tested on Windows 10 64-bit, with up-to-date
CPU and GPU drivers. You'll need at least one OpenCL device discoverable on
your workstation, a recent version of Anaconda (e.g., 2021.05), a Git
client like [Git Bash](https://git-scm.com) available at the Anaconda command
line, and a local directory on your workstation where your code repositories
are housed.
### Setup from scratch
- Open the Anaconda CMD prompt into the Conda (base) environment.
- Change to your preferred local repository base directory:
> cd C:\Users\your-user-name\...\your-local-repos
- Clone the FeatureRanking project GitHub repository:
> git clone https://github.com/cbmira01/FeatureRanking
- Change to the FeatureRanking project root directory:
> cd .\FeatureRanking
- Create the Conda environment for the project:
> conda env create --file environment.yml
- After packages have loaded, activate the new Conda environment:
> conda activate feature_ranking
### Test the installation
- To run Python programs, change to the .\src folder:
> cd .\src
- The quick-demo program will dump a list of available OpenCL devices
and run a very small OpenCL workload on each one:
> python quick_demo.py
- A few unit tests can be run:
> python run_tests.py
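
If the quick-demo step fails, it can help to check device discovery on its own. The following is a minimal sketch of listing OpenCL devices with PyOpenCL. It is not the project's `quick_demo.py`, it does not run a workload, and the function name `list_opencl_devices` is hypothetical. It returns an empty list when PyOpenCL is not installed or no OpenCL platform is found.

```python
def list_opencl_devices():
    """Return (platform name, device name) pairs for visible OpenCL devices."""
    try:
        import pyopencl as cl
    except ImportError:
        return []  # PyOpenCL is not installed in this environment
    try:
        platforms = cl.get_platforms()
    except cl.Error:
        return []  # no OpenCL platform (driver) could be found
    found = []
    for platform in platforms:
        for device in platform.get_devices():
            found.append((platform.name, device.name))
    return found


if __name__ == "__main__":
    devices = list_opencl_devices()
    if not devices:
        print("No OpenCL devices found.")
    for platform_name, device_name in devices:
        print(f"{platform_name}: {device_name}")
```

An empty result usually points to missing or outdated CPU/GPU drivers, or to an environment that was not activated.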
### After a successful installation
- Run the main program:
> python feature_ranking.py
- More information on setup is available in the project [Wiki](https://github.com/cbmira01/FeatureRanking/wiki).
## More information
Feature ranking, OpenCL, the experimental approach of this project, and its
results are described in the project [Wiki](https://github.com/cbmira01/FeatureRanking/wiki).
## Credits
Credits, attributions, and works consulted are summarized in the project [Wiki](https://github.com/cbmira01/FeatureRanking/wiki).
## Notes for Code Louisville project grading
This project fulfilled the requirements for the Summer 2021 Code Louisville Python session.
- Implements a “master loop” console application where the user can repeatedly
enter commands/perform actions, as seen in
> ./src/feature_ranking.py
- Creates and uses a dictionary or list, as seen in
> ./src/prep_data.py
- Reads data from external JSON and CSV files, as seen in
> ./src/prep_data.py
- Uses functions that return values throughout, for example
> ./src/fr_opencl.py
> ./src/feature_ranking.py
- Implements a "stretch" goal: setup and testing of the OpenCL framework.