https://github.com/rasbt/datacollect


# datacollect

**A collection of tools to collect and download various data.**

I often write simple scripts and tools to collect data for various "data science" tasks. I have gathered them in this central repository because they might be useful to others!

#### Contents
- [Collect Lyrics](./collect_lyrics)
- [Twitter Timeline](./twitter_timeline)
- [Collect Popular Music Tags](./collect_music_tags)
- [PDB Info Table](./pdb_infotable)
- [ZINC Molecule Downloader](./zinc_downloader)
- [Collect English Premier League Soccer Data](./collect_fantasysoccer)


**Important Note**
I developed and tested these tools in Python 3.x. The scripts may not work flawlessly in Python 2.7.x because Unicode handling is more difficult there.
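
To illustrate the Unicode point, here is a minimal sketch of reading and writing text with an explicit encoding. It is not taken from the scripts in this repository; the helper names `write_text` and `read_text` are hypothetical. It relies on `io.open`, which accepts an `encoding` argument in both Python 2.7 and Python 3.x.

```python
import io


def write_text(path, text):
    """Write a unicode string to `path` as UTF-8 (hypothetical helper)."""
    # io.open takes an explicit encoding on both Python 2.7 and 3.x,
    # unlike the built-in open() in Python 2.7.
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


def read_text(path):
    """Read a UTF-8 file back into a unicode string (hypothetical helper)."""
    with io.open(path, "r", encoding="utf-8") as handle:
        return handle.read()
```

In Python 2.7 the `text` argument would need to be a `unicode` object (for example `u"Beyonc\u00e9"`), which is one of the places where scripts written for Python 3.x tend to break.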




## [Collect Lyrics](./collect_lyrics)

[[back to top](#contents)]

A [command line tool](./collect_lyrics) to download song lyrics given artist names and song titles.

![](./collect_lyrics/images/example_out.png)




## [Twitter Timeline](./twitter_timeline)

[[back to top](#contents)]

A [command line tool](./twitter_timeline) that downloads your personal Twitter timeline in CSV format, with an optional keyword filter.

![](./twitter_timeline/images/python_tweets.png)

[Tutorial](http://nbviewer.ipython.org/github/rasbt/datacollect/blob/master/dataviz/twitter_cloud/twitter_wordcloud.ipynb) for turning your Twitter timeline into a word cloud.
![](./dataviz/twitter_cloud/my_twitter_wordcloud_2_lowres.jpg)
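
The keyword filter and CSV export can be sketched roughly as follows. This is an illustration only, not the code of the actual tool: it assumes the tweets have already been downloaded into a list of dictionaries, and the field names (`id`, `created_at`, `text`) and the function names are hypothetical. The sketch targets Python 3.x.

```python
import csv
import io


def filter_tweets(tweets, keyword=None):
    """Keep tweets whose text contains `keyword` (case-insensitive).

    With no keyword, all tweets are returned unchanged.
    """
    if not keyword:
        return list(tweets)
    needle = keyword.lower()
    return [tweet for tweet in tweets if needle in tweet["text"].lower()]


def tweets_to_csv(tweets, fieldnames=("id", "created_at", "text")):
    """Serialize tweets to a CSV string with a header row."""
    buffer = io.StringIO(newline="")
    # extrasaction="ignore" drops any keys that are not listed in fieldnames.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(tweets)
    return buffer.getvalue()


# Made-up sample data standing in for a downloaded timeline.
sample = [
    {"id": 1, "created_at": "2015-01-01", "text": "I love Python"},
    {"id": 2, "created_at": "2015-01-02", "text": "Coffee time"},
]
csv_text = tweets_to_csv(filter_tweets(sample, keyword="python"))
```

Filtering before serializing keeps the CSV writer unaware of the keyword logic, so the same export function works with or without a filter.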




## [Collect Popular Music Tags](./collect_music_tags)

[[back to top](#contents)]

A [command line tool](./collect_music_tags) to download popular tags for a list of songs from [last.fm](http://www.last.fm), e.g., for various data mining projects.

![](./collect_music_tags/images/example.png)




## [PDB Info Table](./pdb_infotable)

[[back to top](#contents)]

A [command line tool](./pdb_infotable) that creates an info table from a list of PDB files.

![](./pdb_infotable/images/example.png)
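
As a rough idea of how such a table could be built, the sketch below reads the fixed-width `HEADER` record of a PDB file (classification in columns 11-50, deposition date in columns 51-59, ID code in columns 63-66). Which fields the actual tool extracts is not stated here, so the choice of fields and the function names `parse_header` and `info_table` are assumptions made for illustration.

```python
def parse_header(line):
    """Split a PDB HEADER record into its fixed-width fields."""
    if not line.startswith("HEADER"):
        raise ValueError("not a HEADER record")
    return {
        "pdb_id": line[62:66].strip(),          # columns 63-66
        "classification": line[10:50].strip(),  # columns 11-50
        "deposited": line[50:59].strip(),       # columns 51-59
    }


def info_table(header_lines):
    """Build tab-separated table rows (with a header row) from HEADER records."""
    columns = ("pdb_id", "classification", "deposited")
    rows = ["\t".join(columns)]
    for line in header_lines:
        record = parse_header(line)
        rows.append("\t".join(record[name] for name in columns))
    return rows
```

Each input line would typically be the first line of a `.pdb` file; files without a `HEADER` record raise a `ValueError` in this sketch rather than producing an empty row.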

## [ZINC Molecule Downloader](./zinc_downloader)

[[back to top](#contents)]

A [command line tool](./zinc_downloader) for downloading 3D structures of small chemical molecules from http://zinc.docking.org.

![](./zinc_downloader/images/example-1.png)




## [Collect English Premier League Soccer Data](./collect_fantasysoccer)

[[back to top](#contents)]

A [command line tool](./collect_fantasysoccer) to collect fantasy soccer data from the English Premier League.

![](./collect_fantasysoccer/images/example_table.png)