Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/codersales/unsupervised-learning-clustering
- tarball handler py - shell script3.sh - Unsupervised learning Clustering kmeans and hierarchical - topics: 20-topics | may-2023-filtered | may-2023-filtered-2 | may-2023-filtered-3 | filtered-4
1st-triple-quarter-2023 2023 291-commits a-top-repo april-2023 clustering code current h1-2023 jupyter machine-learning major over-200-commits python python3 q2-2023 ranked shell tarball unsupervised-learning
Last synced: 22 days ago
- Host: GitHub
- URL: https://github.com/codersales/unsupervised-learning-clustering
- Owner: CoderSales
- License: mit
- Created: 2023-04-11T11:02:08.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-11-24T05:10:49.000Z (about 2 months ago)
- Last Synced: 2024-11-24T06:18:40.496Z (about 2 months ago)
- Topics: 1st-triple-quarter-2023, 2023, 291-commits, a-top-repo, april-2023, clustering, code, current, h1-2023, jupyter, machine-learning, major, over-200-commits, python, python3, q2-2023, ranked, shell, tarball, unsupervised-learning
- Language: Jupyter Notebook
- Homepage:
- Size: 58.7 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# unsupervised-learning-clustering
## Key elements in this repository:
### Setup:
1. References
2. `script3.sh` - activates the virtual environment (or run `source script3.sh`)
3. `.bashrc` - for the virtual environment
   - in the `.venv` folder
4. Python Select Interpreter
5. `pip install ipykernel`
6. `pip install jupyter`
7. `sh installer.sh`
8. `python.exe -m pip install --upgrade pip`
9. `pip install notebook`
10. `pip install pandas`
11. `python -m pip install -U pip`
12. `python -m pip install -U matplotlib`
13. `pip install seaborn`
14. `pip install -U scikit-learn`
15. `pip install openpyxl`
16. `pip install nb-black`
17. `pip install xlwings`
18. `xlwings addin install`
19. `pip install natsort`
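After running the install steps above, it can be useful to confirm that the packages are visible to the interpreter selected in step 4. The following is a minimal sketch, not part of the repository: the `REQUIRED_MODULES` list and the `missing_modules` helper are hypothetical names, and the list uses import names, which can differ from pip names (`scikit-learn` is imported as `sklearn`).

```python
import importlib.util

# Import names for packages installed in the setup steps (assumed mapping;
# pip names and import names are not always identical).
REQUIRED_MODULES = [
    "ipykernel",
    "notebook",
    "pandas",
    "matplotlib",
    "seaborn",
    "sklearn",
    "openpyxl",
    "xlwings",
    "natsort",
]


def missing_modules(module_names):
    """Return the names that cannot be found in the current environment."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

If the script reports missing modules even though `pip install` succeeded, the virtual environment is probably not the active interpreter.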
### Save setup
1. `pip freeze > requirements.txt`
### Load setup
1. `pip install -r requirements.txt`
### Analysis:
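For orientation, the k-means algorithm named in the notebook title can be sketched in plain Python. This is not the notebook's code, and the repository's setup installs scikit-learn, whose `KMeans` class would be the usual choice in practice. The sketch below is a bare Lloyd's algorithm with a deterministic farthest-first initialisation (a simplification; scikit-learn defaults to k-means++), and the toy points are made up for illustration.

```python
import math


def kmeans(points, k, max_iter=100):
    """Plain-Python Lloyd's algorithm; returns (labels, centroids)."""
    # Farthest-first initialisation: start from the first point, then
    # repeatedly add the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical toy data: two obvious groups in 2-D.
toy_points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (8.0, 8.0), (8.2, 7.9), (7.8, 8.1),
]
toy_labels, toy_centroids = kmeans(toy_points, k=2)
print(toy_labels)  # → [0, 0, 0, 1, 1, 1]
```

The two steps inside the loop (assign, then update) are the core of k-means; everything else is bookkeeping.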
1. `notebooks/K-Means.ipynb`
2. `data/technical_support_data-2.csv`
### Tarball Data Extraction:
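The contents of `tarball-handler.py` are not reproduced on this page, so the following is only a sketch of what an extraction step for a `.tar.gz` archive could look like using Python's standard `tarfile` module. The function name `extract_tarball` and the path check are illustrative assumptions, not the author's implementation.

```python
import tarfile
from pathlib import Path


def extract_tarball(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir; return the extracted file paths."""
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        members = archive.getmembers()
        for member in members:
            # Refuse entries that would be written outside dest.
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support extraction filters.
            archive.extractall(dest, filter="data")
        else:
            archive.extractall(dest)
        return sorted(str(dest / m.name) for m in members if m.isfile())
```

A call such as `extract_tarball("TCGA-PANCAN-HiSeq-801x20531.tar.gz", "data/gene_data/TCGA-PANCAN-HiSeq-801x20531")` would match the file locations listed in the gitignore section, assuming the archive contains a top-level folder of the same name.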
1. `python tarball-handler.py`
#### Add to gitignore:
##### 1. custom components
##### 1.1 large files
##### 1.1.1 tarball
TCGA-PANCAN-HiSeq-801x20531.tar.gz
##### 1.1.2 large data from tarball
data/gene_data/TCGA-PANCAN-HiSeq-801x20531/TCGA-PANCAN-HiSeq-801x20531/data.csv
data/gene_data/TCGA-PANCAN-HiSeq-801x20531/TCGA-PANCAN-HiSeq-801x20531/labels.csv
### Note on data files and Large Data sets on GitHub:
Add data files in their own commit, in case GitHub raises its 50 MB file-size warning.
### update to .gitignore:
```
*.json
!spec/*.json
```
adapted to:
`!*/ProcessedData.xlsx`
[git ignore all files of a certain type, except those in a specific subfolder](https://stackoverflow.com/questions/4621072/git-ignore-all-files-of-a-certain-type-except-those-in-a-specific-subfolder)
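Relating to the note on large data files above, a small helper can list files that would trip the 50 MB warning before they are committed, so they can be added to `.gitignore` or committed separately. This is a sketch under assumptions: the function name `find_large_files`, the skipped folder names, and the exact byte threshold are illustrative choices, not part of the repository.

```python
import os

# Threshold based on the 50 MB warning mentioned in the note (assumed as MiB).
WARN_BYTES = 50 * 1024 * 1024


def find_large_files(root, threshold=WARN_BYTES):
    """Return (relative_path, size_in_bytes) pairs at or above threshold, largest first."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version-control and virtual-environment folders in place.
        dirnames[:] = [d for d in dirnames if d not in {".git", ".venv"}]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # ignore symlinks; they may point outside the repo
            size = os.path.getsize(path)
            if size >= threshold:
                hits.append((os.path.relpath(path, root), size))
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for rel_path, size in find_large_files("."):
        print(f"{size / (1024 * 1024):8.1f} MB  {rel_path}")
```

Run from the repository root, it prints one line per oversized file, which can then be matched against the ignore patterns above.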