https://github.com/davealdon/html-table-scraper
:chart_with_upwards_trend: A C# app that makes sense of HTML tables. Created using Xamarin for your cross-compatible convenience.
database html mysql pattern scrape table
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/davealdon/html-table-scraper
- Owner: DaveAldon
- Created: 2016-09-21T15:59:57.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2017-07-13T14:20:31.000Z (almost 9 years ago)
- Last Synced: 2025-03-29T04:07:04.663Z (about 1 year ago)
- Topics: database, html, mysql, pattern, scrape, table
- Language: C#
- Homepage:
- Size: 12.5 MB
- Stars: 3
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# HTML Table Scraper
**Why?**
If you have one or a thousand HTML tables that you need to scan through and have their values inserted into a MySQL database, **AND** if you do not have a CSV file or any control over the source data, then this application is the perfect starting point for you!
This project uses a few interesting and elegant tricks to retrieve data from an HTML table and insert everything into a database. The HTML Agility Pack initially grabs the data between the tags and classes, but on its own it does not provide enough functionality to accurately retrieve data from an HTML table that relies heavily on rowspans or calendar schedule formats. This application picks up where the HTML Agility Pack leaves off and essentially makes those pesky rowspans a thing of the past. There is also a lot of query functionality for asking the database various questions about the retrieved data, but that portion is entirely optional.
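The rowspan problem described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: it assumes the cells of each `<tr>` have already been read (for example with the HTML Agility Pack) into a hypothetical `Cell` type holding the cell text and its `rowspan` value, and it ignores `colspan`. The names `Cell` and `RowSpanExpander` are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical cell model: the text of a <td> plus its rowspan attribute.
public sealed class Cell
{
    public string Text { get; }
    public int RowSpan { get; }

    public Cell(string text, int rowSpan = 1)
    {
        Text = text;
        RowSpan = Math.Max(1, rowSpan); // treat missing or invalid rowspans as 1
    }
}

public static class RowSpanExpander
{
    // Repeats the text of every spanning cell in each row it covers,
    // so each output row can be read (or inserted) on its own.
    public static List<List<string>> Expand(IEnumerable<IReadOnlyList<Cell>> rows)
    {
        var grid = new List<List<string>>();
        // column index -> text carried down from an earlier row and
        // the number of rows it still has to cover
        var pending = new Dictionary<int, (string Text, int Remaining)>();

        foreach (var row in rows)
        {
            var output = new List<string>();
            int lastCarried = pending.Count > 0 ? pending.Keys.Max() : -1;
            int col = 0;
            int next = 0; // index of the next unread cell in this row

            while (next < row.Count || col <= lastCarried)
            {
                if (pending.TryGetValue(col, out var carried))
                {
                    // This column is occupied by a cell from a previous row.
                    output.Add(carried.Text);
                    if (carried.Remaining > 1)
                        pending[col] = (carried.Text, carried.Remaining - 1);
                    else
                        pending.Remove(col);
                }
                else if (next < row.Count)
                {
                    var cell = row[next++];
                    output.Add(cell.Text);
                    if (cell.RowSpan > 1)
                        pending[col] = (cell.Text, cell.RowSpan - 1);
                }
                else
                {
                    // Gap between carried cells in a malformed table.
                    output.Add(string.Empty);
                }
                col++;
            }
            grid.Add(output);
        }
        return grid;
    }
}
```

For a schedule table whose first row holds "Mon" (with `rowspan="2"`) and "9:00", and whose second row holds only "10:00", `Expand` returns the rows `[Mon, 9:00]` and `[Mon, 10:00]`, which map directly onto database rows.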
The application is currently set up to scan through a university's publicly available faculty page and grab all of the faculty members' various availabilities, making it easy to query the data.
# Out-of-the-Box Requirements
**(v1.0)**
* Mac OS 10.5+
* A running MySQL server
**(v1.1+)**
* Mac OS 10.5+
* A running MySQL server
* If you want it to work correctly, change the settings in Globals.cs to accommodate your database setup. This means you will need to recompile at some point, so please see below for additional details.
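
As a rough idea of the kind of settings involved, here is a hypothetical sketch of a database settings class. The actual contents of Globals.cs are not shown in this README, so every field name and value below is an assumption made for illustration; only the connection string keys (`Server`, `Port`, `Database`, `Uid`, `Pwd`) follow the usual MySQL Connector/NET format.

```csharp
using System;

// Hypothetical stand-in for Globals.cs; the real file's names may differ.
public static class Globals
{
    public const string Server = "localhost";    // placeholder host
    public const uint Port = 3306;               // MySQL's default port
    public const string Database = "scraper";    // placeholder schema name
    public const string User = "scraper_user";   // placeholder account
    public const string Password = "change-me";  // placeholder password

    // Connection string assembled from the settings above.
    public static string ConnectionString =>
        $"Server={Server};Port={Port};Database={Database};Uid={User};Pwd={Password};";
}
```

Because values like these are compiled into the app, changing them requires a rebuild, which is why the requirement above mentions recompiling.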
# For Development
**Master Branch**
[Bitrise build status](https://www.bitrise.io/app/8eb52e35de8c2067)
[Codacy code quality](https://www.codacy.com/app/crawford_2/HTML-Table-Scraper?utm_source=github.com&utm_medium=referral&utm_content=DaveAldon/HTML-Table-Scraper&utm_campaign=badger)
**External Dependencies Required:**
HTMLAgilityPack-PCL
MySQL.Data
**Requirements for compiling non-UI elements:**
C# development environment (preferably Xamarin or Visual Studio)
**Requirements for compiling the Cocoa app:**
Mac OS with Xcode
Xamarin
**Special Requirements**
MySQL.Data cannot be added through the NuGet packages UI, so download the .dll and reference it manually.
You MUST target the Xamarin.Mac .NET 4.5 Framework, NOT the Xamarin.Mac Mobile Framework, which is the default.