https://github.com/pirtleshell/rust-gedcom
a gedcom parser written in rust :crab:
gedcom gedcom-parser genealogy rust rust-lang
- Host: GitHub
- URL: https://github.com/pirtleshell/rust-gedcom
- Owner: pirtleshell
- License: mit
- Created: 2020-05-29T02:45:16.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2023-01-05T14:24:38.000Z (almost 2 years ago)
- Last Synced: 2024-04-25T20:46:20.450Z (8 months ago)
- Topics: gedcom, gedcom-parser, genealogy, rust, rust-lang
- Language: Rust
- Homepage: https://crates.io/crates/gedcom
- Size: 177 KB
- Stars: 5
- Watchers: 4
- Forks: 5
- Open Issues: 5
Metadata Files:
- Readme: readme.md
- License: license.md
Awesome Lists containing this project
- awesome-gedcom - rust-gedcom - Rust library for GEDCOM parsing, with optional serialization to JSON. (Parsers / Rust)
README
# rust-gedcom
> a gedcom parser written in rust 🦀
## About this project
GEDCOM is a file format for sharing genealogical information like family trees! It's being made obsolete by [GEDCOM-X](https://github.com/FamilySearch/gedcomx) but is still widely used in many genealogy programs.
I wanted experience playing with parsers and representing tree structures in Rust, and noticed a parser for Rust did not exist. And thus, this project was born! A fun experiment to practice my Rust abilities.
It hopes to be ~~fully~~ mostly compliant with the [Gedcom 5.5.1 specification](https://edge.fscdn.org/assets/img/documents/ged551-5bac5e57fe88dd37df0e153d9c515335.pdf).
I have found this [5.5.2 specification](https://jfcardinal.github.io/GEDCOM-5.5.2/gedcom-5.5.2.html) useful in its assessment of which tags are worth supporting or not.
## Usage
This crate comes in two parts. The first is a binary called `parse_gedcom`, mostly used for my testing & development. It prints the `GedcomData` object and some stats about the gedcom file passed into it:
```bash
parse_gedcom ./tests/fixtures/sample.ged

# outputs tree data here w/ stats
# ----------------------
# | Gedcom Data Stats: |
# ----------------------
# submitters: 1
# individuals: 3
# families: 2
# repositories: 1
# sources: 1
# multimedia: 0
# ----------------------
```

The second is a library containing the parser.
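
GEDCOM files are line-oriented: each line carries a level number, an optional cross-reference id, a tag, and an optional value. As a rough illustration of the kind of input the parser consumes, here is a std-only sketch that splits one line into those parts. It is not this crate's implementation; `GedcomLine` and `parse_line` are names made up for the example.

```rust
/// One line of a GEDCOM file, split into its parts.
/// (Hypothetical type for illustration, not part of the `gedcom` crate.)
#[derive(Debug, PartialEq)]
pub struct GedcomLine<'a> {
    pub level: u8,
    pub xref: Option<&'a str>,
    pub tag: &'a str,
    pub value: Option<&'a str>,
}

/// Splits a line such as `0 @I1@ INDI` or `1 NAME John /Doe/`.
/// Returns `None` for blank or malformed lines.
pub fn parse_line(line: &str) -> Option<GedcomLine<'_>> {
    let line = line.trim();
    let (level, rest) = line.split_once(' ')?;
    let level: u8 = level.parse().ok()?;
    let rest = rest.trim_start();

    // An optional cross-reference id (`@...@`) may precede the tag.
    let (xref, rest) = if rest.starts_with('@') {
        let (xref, rest) = rest.split_once(' ')?;
        (Some(xref), rest.trim_start())
    } else {
        (None, rest)
    };

    // Whatever follows the tag is the line's value.
    let (tag, value) = match rest.split_once(' ') {
        Some((tag, value)) => (tag, Some(value)),
        None => (rest, None),
    };
    if tag.is_empty() {
        return None;
    }
    Some(GedcomLine { level, xref, tag, value })
}
```

Calling `parse_line("1 NAME John /Doe/")` yields level `1`, no xref, tag `NAME`, and value `John /Doe/`. A real parser would go on to nest lines by level into records.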
## JSON Serializing/Deserializing with `serde`
This crate has an optional feature called `json` that implements `Serialize` & `Deserialize` for the gedcom data structure. This allows you to easily integrate with the web. For more info about serde, [check them out](https://serde.rs/)!
The feature is not enabled by default. There are zero dependencies if just using the gedcom parsing functionality.
Use the `json` feature with any version >=0.2.1 by adding the following to your `Cargo.toml`:
```toml
gedcom = { version = "", features = ["json"] }
```

## 🚧 Progress 🚧
There are still parts of the specification not yet implemented and the project is subject to change. The way I have been developing is to take a gedcom file, attempt to parse it, and act on whatever errors or omissions occur. In its current state, it is capable of parsing the [sample.ged](tests/fixtures/sample.ged) in its entirety.
Here are some notes about parsed data & tags. Page references are to the [Gedcom 5.5.1 specification](https://edge.fscdn.org/assets/img/documents/ged551-5bac5e57fe88dd37df0e153d9c515335.pdf).
### Top-level tags
* `HEAD.SOUR` - p.42 - The source in the header is currently skipped.
* `SUBMISSION_RECORD` - p.28 - No attempt at handling this is made.
* `MULTIMEDIA_RECORD` - p.26 - Multimedia (`OBJE`) is not currently parsed.
* `NOTE_RECORD` - p.27 - Notes (`NOTE`) are also unhandled (except in the header).

Tags for families (`FAM`), individuals (`INDI`), repositories (`REPO`), sources (`SOUR`), and submitters (`SUBM`) are handled. Many of the most common sub-tags for these are handled, though some may not yet be parsed. Mileage may vary.
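
To get a feel for how much of a given file falls under the handled tags, one option is to tally its level-0 records before parsing. The sketch below uses only the standard library and does not touch this crate's API; `count_top_level_tags` and `unhandled_tags` are hypothetical helpers, and `INDI` is the tag GEDCOM uses for individual records.

```rust
use std::collections::BTreeMap;

/// Record tags the notes above list as handled.
const HANDLED: [&str; 5] = ["FAM", "INDI", "REPO", "SOUR", "SUBM"];
/// Header and trailer lines frame the file rather than hold records.
const STRUCTURAL: [&str; 2] = ["HEAD", "TRLR"];

/// Counts level-0 records by tag.
pub fn count_top_level_tags(contents: &str) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for line in contents.lines() {
        let mut parts = line.split_whitespace();
        if parts.next() != Some("0") {
            continue; // only level-0 lines start a record
        }
        // Skip the optional `@XREF@` id that precedes the tag.
        let tag = match parts.next() {
            Some(tok) if tok.starts_with('@') => parts.next(),
            other => other,
        };
        if let Some(tag) = tag {
            *counts.entry(tag.to_string()).or_insert(0) += 1;
        }
    }
    counts
}

/// Returns the top-level tags that are neither handled nor structural,
/// in alphabetical order.
pub fn unhandled_tags(contents: &str) -> Vec<String> {
    count_top_level_tags(contents)
        .into_keys()
        .filter(|tag| {
            let known = HANDLED.iter().chain(STRUCTURAL.iter());
            !known.into_iter().any(|k| *k == tag.as_str())
        })
        .collect()
}
```

Running `unhandled_tags` over a file gives a quick list of record types (for example `NOTE` or `OBJE`) that the parser would currently skip.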
## Notes to self
* Consider creating some Traits to handle change dates, notes, source citations, and other recurring fields.
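
One possible shape for that idea, as a sketch only: small traits for fields that recur across record types, each implemented by whichever structs carry the field. Every name here (`HasNotes`, `HasChangeDate`, `ExampleIndividual`) is hypothetical and none of it exists in the crate today.

```rust
/// Hypothetical trait for any record that can carry `NOTE` sub-tags.
pub trait HasNotes {
    fn notes(&self) -> &[String];
    fn add_note(&mut self, note: String);

    /// Default method shared by every implementor.
    fn has_notes(&self) -> bool {
        !self.notes().is_empty()
    }
}

/// Hypothetical trait for records that carry a `CHAN` (change date) sub-tag.
pub trait HasChangeDate {
    fn change_date(&self) -> Option<&str>;
    fn set_change_date(&mut self, date: String);
}

/// Stand-in record type for illustration; not the crate's own struct.
#[derive(Debug, Default)]
pub struct ExampleIndividual {
    notes: Vec<String>,
    change_date: Option<String>,
}

impl HasNotes for ExampleIndividual {
    fn notes(&self) -> &[String] {
        &self.notes
    }

    fn add_note(&mut self, note: String) {
        self.notes.push(note);
    }
}

impl HasChangeDate for ExampleIndividual {
    fn change_date(&self) -> Option<&str> {
        self.change_date.as_deref()
    }

    fn set_change_date(&mut self, date: String) {
        self.change_date = Some(date);
    }
}
```

With this layout, the code that parses a `NOTE` or `CHAN` sub-tag could be written once against the trait and reused for families, sources, and other records.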
## License
© 2021, [Robert Pirtle](https://robert.pirtle.xyz/). Licensed under [MIT](license.md).