https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition

The official online compendium for Mining the Social Web, 2nd Edition (O'Reilly, 2013)

HTTP 301: Don't Use This Repository - 17 Jan 2019
========================================

There's good news! Mining the Social Web is now available in its 3rd Edition, and there's a fully updated repository available with all of the latest changes that you will definitely not want to miss out on: the code has been fully revised and ported to Python 3, the runtime has been converted to a more convenient Docker-based setup, and there's a brand new chapter on mining Instagram data.

My co-author, Mikhail Klassen, now maintains the code, and you can get it here: https://github.com/mikhailklassen/Mining-the-Social-Web-3rd-Edition

Enjoy!

Matthew A. Russell

Jan 17, 2019

Mining the Social Web (2nd Edition)
=================================

## Summary

_Mining the Social Web, 2nd Edition_ is available through O'Reilly Media, Amazon, and other fine book retailers. [Purchasing the ebook directly from O'Reilly](http://bit.ly/135dHfs) offers a number of great benefits, including a variety of digital formats and continual updates to the text of the book for life! Better yet, if you choose to use O'Reilly's Dropbox or Google Drive synchronization, your ebooks will automatically update every time there's an update. In other words, you'll always have the latest version of the book if you purchase the ebook through O'Reilly, which is why it's the recommended option over a paper copy or other electronic version. (If you prefer a [paperback or Kindle version from Amazon](http://amzn.to/GPd59m), that's a fine option as well.)

There's an incredible turn-key virtual machine experience for this second edition of the book that provides you with a powerful social web mining toolbox. This toolbox provides the ability to explore and run all of the source code in a hassle-free manner. All that you have to do is [follow a few simple steps](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/_Appendix A - Virtual Machine Experience.html) to get the virtual machine installed, and you'll be running the example code in as little as 20-30 minutes. (And by the way, most of that time is waiting for files to download.)

This [short screencast](https://vimeo.com/72383764) demonstrates the steps involved in installing the virtual machine, which installs every single dependency for you automatically and saves you a lot of time. Even sophisticated power users tend to prefer it over their own environments.

If you experience any problems at all with installation of the virtual machine, file an issue here on GitHub. Be sure to also follow [@SocialWebMining](http://twitter.com/socialwebmining) on Twitter and like http://facebook.com/MiningTheSocialWeb on Facebook.

Be sure to also visit http://MiningTheSocialWeb.com for additional content, news, and updates about the book and code in this GitHub repository.

## Preview the Full-Text of Chapter 1 (Mining Twitter)

Chapter 1 of the book provides a gentle introduction to hacking on Twitter data. It's available in a variety of convenient formats:

* A free [PDF download](http://bit.ly/135dHfs)
* An [online ebook](http://bit.ly/1an184a) excerpt
* An [IPython Notebook (ipynb) file](http://bit.ly/1aIXjFf) (checked into this repository)

Choose one, or choose them all. There's no better way to get started than following along with the opening chapter.

## Preview the IPython Notebooks

This edition of _Mining the Social Web_ extensively uses [IPython Notebook](http://ipython.org/notebook.html) to facilitate the learning and development process. If you're interested in what the example code for any particular chapter does, the best way to preview it is with the links below. When you're ready to develop, pull the source for this GitHub repository and follow the instructions for installing the virtual machine to get started.

A bit.ly bundle of all of these links is also available: http://bit.ly/mtsw2e-ipynb

* [Chapter 0 - Preface](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 0 - Preface.html)
* [Chapter 1 - Mining Twitter: Exploring Trending Topics, Discovering What People Are Talking About, and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 1 - Mining Twitter.html)
* [Chapter 2 - Mining Facebook: Analyzing Fan Pages, Examining Friendships, and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 2 - Mining Facebook.html)
* [Chapter 3 - Mining LinkedIn: Faceting Job Titles, Clustering Colleagues, and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 3 - Mining LinkedIn.html)
* [Chapter 4 - Mining Google+: Computing Document Similarity, Extracting Collocations, and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 4 - Mining Google+.html)
* [Chapter 5 - Mining Web Pages: Using Natural Language Processing to Understand Human Language, Summarize Blog Posts and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 5 - Mining Web Pages.html)
* [Chapter 6 - Mining Mailboxes: Analyzing Who's Talking To Whom About What, How Often, and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 6 - Mining Mailboxes.html)
* [Chapter 7 - Mining GitHub: Inspecting Software Collaboration Habits, Building Interest Graphs, and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 7 - Mining GitHub.html)
* [Chapter 8 - Mining the Semantically Marked-Up Web: Extracting Microformats, Inferencing Over RDF, and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 8 - Mining the Semantically Marked-Up Web.html)
* [Chapter 9 - Twitter Cookbook](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 9 - Twitter Cookbook.html)
* [Appendix A - Virtual Machine Experience](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/_Appendix A - Virtual Machine Experience.html)
* [Appendix B - OAuth Primer](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/_Appendix B - OAuth Primer.html)
* [Appendix C - Python & IPython Notebook Tips](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/_Appendix C - Python & IPython Notebook Tips.html)
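The preview links above contain literal spaces, `&`, and `+` characters in their file names, which some Markdown renderers and command-line tools will not accept as-is. The following is a minimal sketch, not part of the repository, of how such a link could be percent-encoded with Python's standard library. The helper name `preview_url` is hypothetical, and the base URL is simply copied from the links above; whether that host still serves the pages is not verified here.

```python
from urllib.parse import quote

# Base path copied from the preview links listed above.
BASE = (
    "https://rawgithub.com/ptwobrussell/"
    "Mining-the-Social-Web-2nd-Edition/master/ipynb/html/"
)


def preview_url(page_name: str) -> str:
    """Return the preview URL for a page, percent-encoding its file name.

    `page_name` is the notebook title without the ".html" suffix,
    e.g. "Chapter 1 - Mining Twitter". (Hypothetical helper for illustration.)
    """
    # quote() leaves letters, digits, and "-._~" untouched and encodes
    # spaces, "&", and "+" so the result is safe to paste into a link.
    return BASE + quote(page_name + ".html")


if __name__ == "__main__":
    for name in (
        "Chapter 1 - Mining Twitter",
        "Chapter 4 - Mining Google+",
        "_Appendix C - Python & IPython Notebook Tips",
    ):
        print(preview_url(name))
```

The same function works for any of the chapter or appendix titles in the list, since only the file name portion varies.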

## Blog & Screencasts

Be sure to bookmark the [Mining the Social Web Vimeo Channel]() to stay up to date with short instructional videos that demonstrate how to use the tools in this repository. More screencasts are being added all the time, so check back often -- or better yet, subscribe to the channel.


Installing the Virtual Machine

A ~3 minute screencast on installing a powerful toolbox for social web mining.

View a collection of all available screencasts at http://bit.ly/mtsw2e-screencasts

You might also benefit from the content that is being regularly added to the companion blog at http://MiningTheSocialWeb.com

## The _Mining the Social Web_ Virtual Machine

_You may enjoy [this short screencast](https://vimeo.com/72383764) that demonstrates the step-by-step instructions involved in installing the book's virtual machine._

The code for _Mining the Social Web_ is organized by chapter in an [IPython Notebook](http://ipython.org/notebook.html) format so that you can follow along with the examples interactively. Unfortunately, some of the Python dependencies for the example code can be a little bit tricky to get installed and configured, so a completely turn-key virtual machine is provided to make your reading experience as simple and enjoyable as possible. Even if you are a seasoned developer, you may still find some value in using this virtual machine to get started and save yourself some time. The virtual machine is powered by [Vagrant](http://vagrantup.com/), an amazing development tool that you'll probably want to know about and that arguably makes working with virtualization even easier than a native [VirtualBox](http://www.virtualbox.org/) or VMware image.

## Quick Start Guide

The recommended way of getting started with the example code is by taking advantage of the Vagrant-powered virtual machine as illustrated in [this short screencast](https://www.youtube.com/watch?v=BTyKPMfi_JQ). After all, you're more interested in following along and learning from the examples than installing and managing all of the system dependencies just to get to that point, right?

[Appendix A - Virtual Machine Experience](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/_Appendix A - Virtual Machine Experience.html) provides clear step-by-step instructions for installing the virtual machine and is intended to serve as a quick start guide.
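Before working through Appendix A, it can help to confirm that the tools the virtual machine relies on are installed. The following is a rough sketch, not taken from the book, that checks for them on your `PATH`. The command names `git`, `vagrant`, and `VBoxManage` are assumptions about how Git, Vagrant, and VirtualBox are typically exposed on the command line, and the function name `missing_tools` is hypothetical. Appendix A remains the authoritative list of steps.

```python
import shutil

# Assumption: the usual command-line names for Git, Vagrant, and VirtualBox.
# Appendix A is the authoritative source for what must be installed.
REQUIRED_TOOLS = ("git", "vagrant", "VBoxManage")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Install these before following Appendix A:", ", ".join(missing))
    else:
        print("Prerequisites found; continue with the steps in Appendix A.")
```

An empty result only means the executables were found; it says nothing about whether their versions match what the virtual machine expects.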

## The _Mining the Social Web_ Wiki

This project takes advantage of its GitHub repository's [wiki](https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/wiki) to act as a point of collaboration for consumers of the source code. Feel free to use the wiki however you'd like to share your experiences, and create additional pages as needed to curate additional information.

One of the more important wiki pages that you may want to bookmark is the [Advisories](https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/wiki/Advisories) page, which is an archive of notes about particularly disruptive commits or other changes that may affect you.

Another page of interest is a listing of all [100+ numbered examples](https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/wiki/Numbered-Examples) from the book, which conveniently hyperlink to read-only versions of the IPython Notebooks.

## "Premium Support"

The source code in this repository is free for you to use however you'd like. If you'd like a more rigorous study of social web mining, much like following along with a textbook in a classroom, you should consider picking up a copy of [Mining the Social Web](http://bit.ly/135dHfs) and following along. _Think of the book as offering a form of "premium support" for this open source project._

The publisher's description of the book follows for your convenience:

How can you tap into the wealth of social web data to discover who’s making connections with whom, what they’re talking about, and where they’re located? With this expanded and thoroughly revised edition, you’ll learn how to acquire, analyze, and summarize data from all corners of the social web including Facebook, Twitter, LinkedIn, Google+, GitHub, email, websites, and blogs.

* Employ IPython Notebook, the Natural Language Toolkit, NetworkX, and other scientific computing tools to mine popular social web sites
* Apply advanced text-mining techniques, such as clustering and TF-IDF, to extract meaning from human language data
* Bootstrap interest graphs from GitHub by discovering affinities among people, programming languages, and coding projects
* Build interactive visualizations with D3.js, a state-of-the-art HTML5 and JavaScript toolkit
* Take advantage of more than two-dozen Twitter recipes presented in O’Reilly’s popular and well-known cookbook format

The example code for this data science book is maintained in a public GitHub repository and is designed to be especially accessible through a turn-key virtual machine that facilitates interactive learning with an easy-to-use collection of IPython Notebooks.
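To make one of the techniques named in the description concrete, here is a minimal sketch of TF-IDF scoring in plain Python. It is not code from the book, which may use a different weighting or a library such as the Natural Language Toolkit. This version assumes documents are already tokenized into lowercase words, uses relative term frequency, and takes the inverse document frequency as the natural log of the number of documents divided by the number of documents containing the term. The function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Score every term of every tokenized document.

    tf  = occurrences of the term in the document / document length
    idf = ln(number of documents / number of documents containing the term)
    Returns one {term: score} dict per input document.
    """
    n_docs = len(docs)
    # Count, for each term, how many documents it appears in at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))

    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append(
            {
                term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return scores


if __name__ == "__main__":
    corpus = [
        ["social", "web", "mining"],
        ["social", "graph"],
    ]
    for doc_scores in tf_idf(corpus):
        print(doc_scores)
```

With this weighting, a term that appears in every document scores zero, which is why common words contribute little when ranking documents against a query.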