https://github.com/pirtleshell/scrape-a-grave
Scrape and Retrieve FindAGrave memorial page data and save them to an SQL database
citation genealogy scraper web-scraping
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/pirtleshell/scrape-a-grave
- Owner: pirtleshell
- License: mit
- Created: 2016-11-03T19:10:17.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2016-11-03T20:25:56.000Z (almost 9 years ago)
- Last Synced: 2025-04-28T18:09:56.257Z (5 months ago)
- Topics: citation, genealogy, scraper, web-scraping
- Language: Python
- Size: 5.86 KB
- Stars: 21
- Watchers: 6
- Forks: 4
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
# scrape-a-grave
> Scrape and Retrieve [FindAGrave](http://findagrave.com) memorial page data and save them to an SQL database.
## Scraping
[FindAGrave](http://findagrave.com) is an index of grave markers from cemeteries around the world. When doing genealogy research, you often don't want to depend on a webpage remaining available, so you download the information to a local file. This Python script takes a list of grave marker numbers or FindAGrave URLs, scrapes the site for data, and prints a citation for each. It is currently set up to also save the data in an SQL database.
## Requirements
You need [Python 3](https://www.python.org/downloads/). The script also requires the BeautifulSoup package, installable through pip:
```sh
$ pip3 install bs4
```
## Usage
Download these files and change the contents of the input text file to a list of FindAGrave IDs or FindAGrave URLs. Then run
```sh
$ python3 getgraveids.py
```
The citations will be printed to the console and saved in an SQL database named `graves.db`.
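As a rough illustration of the saving step, here is a minimal sketch that assumes the database is SQLite, accessed through Python's standard `sqlite3` module. The table name `graves`, its columns, and the `save_grave` helper are hypothetical and may not match what `getgraveids.py` actually does.

```python
import sqlite3

def save_grave(conn, grave):
    """Insert or update one memorial record (hypothetical schema)."""
    # Create the table on first use; the column names are assumptions.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS graves ("
        "id INTEGER PRIMARY KEY, name TEXT, citation TEXT)"
    )
    # Re-running the scraper on the same ID overwrites the old row.
    conn.execute(
        "INSERT OR REPLACE INTO graves (id, name, citation) "
        "VALUES (:id, :name, :citation)",
        grave,
    )
    conn.commit()

# ":memory:" keeps this example self-contained; pass "graves.db" to write a file.
conn = sqlite3.connect(":memory:")
save_grave(conn, {"id": 1, "name": "Placeholder Name", "citation": "placeholder citation"})
save_grave(conn, {"id": 1, "name": "Placeholder Name", "citation": "updated citation"})
```

Keying the table on the memorial ID means repeated runs over the same input list do not create duplicate rows.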
It is also possible to **read links from a GEDCOM** by uncommenting the ["read from gedcom" section](https://github.com/PirtleShell/scrape-a-grave/blob/master/getgraveids.py#L88). This assumes that your GEDCOM source citations have a LINK field containing the FindAGrave URL.
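For orientation, the sketch below shows one way such links could be pulled out of GEDCOM lines. It is not the code in `getgraveids.py`: the `findagrave_links` helper is hypothetical, and it assumes each GEDCOM line has the form `level TAG value` with the URL as the value of a `LINK` tag.

```python
def findagrave_links(gedcom_lines):
    """Yield FindAGrave URLs found in LINK fields (hypothetical helper)."""
    for raw in gedcom_lines:
        # Split into at most three parts: level, tag, and the rest of the line.
        parts = raw.strip().split(" ", 2)
        if len(parts) != 3:
            continue
        _level, tag, value = parts
        if tag == "LINK" and "findagrave.com" in value.lower():
            yield value.strip()

# Made-up sample lines for illustration only.
sample = [
    "0 @S1@ SOUR",
    "1 TITL Find A Grave",
    "2 LINK http://www.findagrave.com/cgi-bin/fg.cgi?page=gr&GRid=1",
    "2 LINK http://example.com/not-a-grave",
]
links = list(findagrave_links(sample))
print(links)
```

The resulting URLs could then be fed to the script in the same way as the entries in the input list.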
## License
This is intended as a convenient tool for personal genealogy research. Please be aware of FindAGrave's [Terms of Service](https://secure.findagrave.com/terms.html).
MIT © [Robert Pirtle](https://pirtle.xyz)