https://github.com/omarsar/data_mining_2017_fall_lab
Contains information and instructions for the first Data Mining lab session for 2017 Fall.
- Host: GitHub
- URL: https://github.com/omarsar/data_mining_2017_fall_lab
- Owner: omarsar
- Created: 2017-09-18T00:35:03.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2018-10-07T10:41:23.000Z (about 7 years ago)
- Last Synced: 2025-04-10T20:56:22.945Z (6 months ago)
- Topics: data, data-analysis, data-mining, data-science, data-visualization
- Language: Jupyter Notebook
- Homepage:
- Size: 1.1 MB
- Stars: 14
- Watchers: 2
- Forks: 40
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: news_data_mining.ipynb
README
### Lab For Data Mining 2017 Fall @ NTHU
This repository contains all the instructions and necessary code for the Data Mining 2017 (Fall) lab session.

---
### Computing Resources
- Operating system: Preferably Linux or macOS
- RAM: 8GB
- Disk space: Minimum 8GB

---
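As a quick way to compare a machine against the list above, here is a minimal sketch using only the Python standard library. The names `MIN_FREE_GB`, `free_disk_gb`, and `check_resources` are hypothetical and not part of this repository. RAM is not checked, because the standard library has no portable call for it.

```python
import platform
import shutil

MIN_FREE_GB = 8  # assumption: mirrors the "Disk space: Minimum 8GB" item above


def free_disk_gb(path="."):
    """Return the free disk space at `path` in GB (1 GB = 1024**3 bytes here)."""
    return shutil.disk_usage(path).free / 1024**3


def check_resources(path="."):
    """Report the OS name and whether free disk space meets the minimum."""
    free_gb = free_disk_gb(path)
    return {
        "os": platform.system(),  # "Linux", "Darwin" (macOS), or "Windows"
        "free_disk_gb": round(free_gb, 1),
        "disk_ok": free_gb >= MIN_FREE_GB,
    }


if __name__ == "__main__":
    print(check_resources())
```

Running the file prints a small dictionary; `disk_ok` is `False` when less than 8GB is free at the given path.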
### Software Requirements
Here is a list of the required programs and libraries necessary for this lab session:
- [Python 3+](https://www.python.org/download/releases/3.0/) (Note: coding will be done strictly in Python 3)
  - Install the latest version of Python 3
- [Anaconda](https://www.anaconda.com/download/) environment or any other environment (recommended but not required)
  - Install the Anaconda environment
- [Jupyter](http://jupyter.org/) (strongly recommended but not required)
  - Install Jupyter
- [Scikit Learn](http://scikit-learn.org/stable/index.html)
  - Install the latest `scikit-learn` Python library (imported as `sklearn`)
- [Pandas](http://pandas.pydata.org/)
  - Install the `pandas` Python library
- [Numpy](http://www.numpy.org/)
  - Install the `numpy` Python library
- [Matplotlib](https://matplotlib.org/)
  - Install `matplotlib` for Python
- [Plotly](https://plot.ly/)
  - Install and sign up for `plotly`
- [NLTK](http://www.nltk.org/)
  - Install the `nltk` library
- [WordCloud](https://github.com/amueller/word_cloud)
  - Install the library for generating word clouds

---
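To see which of the libraries listed above are still missing before opening a notebook, here is a minimal sketch using only the standard library. The `REQUIRED` mapping and the `missing_packages` function are hypothetical and not part of this repository. The install names on the left are assumed to be the usual pip/conda package names.

```python
import importlib.util

# Assumed install name -> name used in `import` statements.
REQUIRED = {
    "scikit-learn": "sklearn",
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "plotly": "plotly",
    "nltk": "nltk",
    "wordcloud": "wordcloud",
}


def missing_packages(required=REQUIRED):
    """Return the install names whose import name cannot be located."""
    return [
        install_name
        for install_name, import_name in required.items()
        if importlib.util.find_spec(import_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

`find_spec` only locates a package without importing it, so this check is fast. It does not confirm that the package imports cleanly.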
### Test script
Open a Jupyter notebook and run the following commands. If all the necessary libraries are installed properly, you should see no errors.
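```python
# Data handling
import pandas as pd
import numpy as np
import math

# Text processing and feature extraction
import nltk
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer

# Visualization
# Note: `plotly.plotly` exists only in plotly < 4; from plotly 4 on it lives in the separate `chart_studio` package.
import plotly.plotly as py
import plotly.graph_objs as go
from wordcloud import WordCloud
# Jupyter magic: render matplotlib figures inline (works only inside a notebook)
%matplotlib inline

# Helper modules (the author's own functions, imported from the `helpers` package)
import helpers.data_mining_helpers as dmh
import helpers.text_analysis as ta
```

---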
### Preview of Complete Jupyter Notebook
https://github.com/omarsar/data_mining_2017_fall_lab/blob/master/news_data_mining.ipynb