rust-gedcom

A GEDCOM parser written in Rust 🦀

About this project

GEDCOM is a file format for sharing genealogical information like family trees! It's being made obsolete by GEDCOM-X but is still widely used in many genealogy programs.

I wanted experience playing with parsers and representing tree structures in Rust, and noticed that no GEDCOM parser for Rust existed. And thus, this project was born! A fun experiment to practice my Rust abilities.

It hopes to be mostly compliant with the Gedcom 5.5.1 specification.

I have found this 5.5.2 specification useful in its assessment of which tags are worth supporting or not.

Usage

This crate comes in two parts. The first is a binary called parse_gedcom, mostly used for my testing & development. It prints the GedcomData object and some stats about the gedcom file passed into it:

parse_gedcom ./tests/fixtures/sample.ged

# outputs tree data here w/ stats
# ----------------------
# | Gedcom Data Stats: |
# ----------------------
#   submitters: 1
#   individuals: 3
#   families: 2
#   repositories: 1
#   sources: 1
#   multimedia: 0
# ----------------------

The second is a library containing the parser.

JSON Serializing/Deserializing with serde

This crate has an optional feature called json that implements Serialize & Deserialize for the gedcom data structure. This allows you to easily integrate with the web.

For more info about serde, check them out!

The feature is not enabled by default. There are zero dependencies if just using the gedcom parsing functionality.

Use the json feature with any version >=0.2.1 by adding the following to your Cargo.toml:

gedcom = { version = "<version>", features = ["json"] }

🚧 Progress 🚧

There are still parts of the specification not yet implemented, and the project is subject to change. The way I have been developing is to take a gedcom file, attempt to parse it, and act on whatever errors or omissions occur. In its current state, it is capable of parsing the sample.ged in its entirety.

Here are some notes about parsed data & tags. Page references are to the Gedcom 5.5.1 specification.

Top-level tags

  • HEAD.SOUR - p.42 - The source in the header is currently skipped.
  • SUBMISSION_RECORD - p.28 - No attempt at handling this is made.
  • MULTIMEDIA_RECORD - p.26 - Multimedia (OBJE) is not currently parsed.
  • NOTE_RECORD - p.27 - Notes (NOTE) are also unhandled (except in the header).

Tags for families (FAM), individuals (INDI), repositories (REPO), sources (SOUR), and submitters (SUBM) are handled. Many of the most common sub-tags for these are handled, though some may not yet be parsed. Mileage may vary.

Notes to self

  • Consider creating some Traits to handle change dates, notes, source citations, and other recurring fields.
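One way the trait idea above might look, as a rough sketch only: the `Note`, `Individual`, and `Family` types and the `HasNotes` trait below are hypothetical stand-ins for illustration, not the crate's actual definitions.

```rust
// Hypothetical types for illustration; not the crate's real data structures.
#[derive(Debug, Default)]
struct Note {
    value: String,
}

/// Anything that can carry NOTE sub-records.
trait HasNotes {
    fn notes(&self) -> &[Note];
    fn add_note(&mut self, note: Note);
}

#[derive(Debug, Default)]
struct Individual {
    notes: Vec<Note>,
}

#[derive(Debug, Default)]
struct Family {
    notes: Vec<Note>,
}

impl HasNotes for Individual {
    fn notes(&self) -> &[Note] {
        &self.notes
    }
    fn add_note(&mut self, note: Note) {
        self.notes.push(note);
    }
}

impl HasNotes for Family {
    fn notes(&self) -> &[Note] {
        &self.notes
    }
    fn add_note(&mut self, note: Note) {
        self.notes.push(note);
    }
}

/// Shared handler: works for any record type that implements HasNotes,
/// so the note-parsing logic is written once.
fn attach_note<T: HasNotes>(record: &mut T, text: &str) {
    record.add_note(Note { value: text.to_string() });
}

fn main() {
    let mut person = Individual::default();
    let mut family = Family::default();
    attach_note(&mut person, "Born at home");
    attach_note(&mut family, "Married in spring");
    println!("{:?}", person.notes());
    println!("{:?}", family.notes());
}
```

The same pattern could presumably be repeated for change dates and source citations, with one trait per recurring field.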

License

© 2021, Robert Pirtle. Licensed under MIT.

rust-gedcom's Issues

Unhandled Individual Tag

When trying to read a Gedcom file exported from ancestry.de the crate panics with multiple unhandled tags.

thread 'main' panicked at 'line 35: Unhandled Individual Tag: OCCU',

or

thread 'main' panicked at 'line 33: Unhandled Individual Tag: PROP',

Maybe if the tag is unknown, it could just be skipped instead?
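The suggestion of skipping unknown tags could be sketched roughly as follows. This is a simplified illustration, not the crate's parser: the `Individual` struct and the `parse_individual` function are hypothetical, and input is modelled as plain (tag, value) pairs rather than tokenized GEDCOM lines.

```rust
// Hypothetical, simplified record type; the crate's real type has more fields.
#[derive(Debug, Default, PartialEq)]
struct Individual {
    name: Option<String>,
    sex: Option<String>,
}

/// Applies (tag, value) pairs to an individual. Tags that are not recognised
/// are collected and returned instead of causing a panic.
fn parse_individual(lines: &[(&str, &str)]) -> (Individual, Vec<String>) {
    let mut individual = Individual::default();
    let mut skipped = Vec::new();
    for (tag, value) in lines {
        match *tag {
            "NAME" => individual.name = Some(value.to_string()),
            "SEX" => individual.sex = Some(value.to_string()),
            // Unknown tag: remember it so the caller can report it, then move on.
            other => skipped.push(other.to_string()),
        }
    }
    (individual, skipped)
}

fn main() {
    let lines = [
        ("NAME", "Robert Eugene /Williams/"),
        ("OCCU", "Farmer"),
        ("PROP", "House"),
        ("SEX", "M"),
    ];
    let (individual, skipped) = parse_individual(&lines);
    println!("{:?}", individual);
    println!("skipped tags: {:?}", skipped);
}
```

A real implementation would also need to skip any sub-records nested under the unknown tag, which this flat sketch does not model.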

Parser Int Error at Tokenizer.rs 115:57

'called Result::unwrap() on an Err value: ParseIntError { kind: Empty }'

The error message also included a file path on my machine containing a github.com token ...
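For context, `ParseIntError { kind: Empty }` is what the standard library returns when an empty string is parsed as an integer. What the tokenizer does at the reported location is not shown here, so the following is only a sketch under the assumption that a numeric field (such as a GEDCOM line's level number) was empty; `parse_level` is a hypothetical helper, not a function from the crate.

```rust
use std::num::{IntErrorKind, ParseIntError};

/// Parses the leading level number of a GEDCOM-style line such as "0 HEAD".
/// Returns an error instead of panicking when the field is empty or malformed.
fn parse_level(line: &str) -> Result<u8, ParseIntError> {
    // An empty or whitespace-only line yields "", which fails with kind Empty.
    let first_field = line.split_whitespace().next().unwrap_or("");
    first_field.parse::<u8>()
}

fn main() {
    for line in ["1 NAME Robert Eugene /Williams/", "", "X NAME"] {
        match parse_level(line) {
            Ok(level) => println!("level {}", level),
            Err(e) if *e.kind() == IntErrorKind::Empty => {
                println!("empty level field, skipping line")
            }
            Err(e) => println!("bad level: {}", e),
        }
    }
}
```

Propagating the `Result` lets the caller decide whether to skip a blank line or report the offending line number, rather than unwrapping and aborting.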

Accessing values of variables in data structures

Hi,

Is there a simpler way to get at the values of the data structure, such that when I do:

println!("Name: {:?}", &name.value);

which results in:

Name: Some("Robert Eugene /Williams/")

ideally results in

Name: "Robert Eugene /Williams/"

without the "Some()" parts?

Thanks!
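The `Some(...)` wrapper appears because the field is an `Option`. A minimal sketch of two standard ways to reach the inner value, assuming only that `value` is an `Option<String>` as the output above suggests (the `Name` struct and `display_name` helper here are hypothetical stand-ins, not the crate's types):

```rust
// Hypothetical stand-in for the crate's name type; only the `value` field
// mentioned in the question is modelled.
struct Name {
    value: Option<String>,
}

/// Returns the name text, or a placeholder when no value was parsed.
fn display_name(name: &Name) -> &str {
    name.value.as_deref().unwrap_or("(unknown)")
}

fn main() {
    let name = Name {
        value: Some("Robert Eugene /Williams/".to_string()),
    };

    // Pattern-match to reach the inner String; {:?} keeps the quotes.
    if let Some(value) = &name.value {
        println!("Name: {:?}", value);
    }

    // Or fall back to a default; {} prints the text without quotes.
    println!("Name: {}", display_name(&name));
}
```

Using `if let` or `unwrap_or` avoids the panic that a bare `unwrap()` would cause for records with no name.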

Implementing header tag support

See pull request #7 or #6
I made a tiny mistake when submitting the PR, so just FYI: I accidentally also included the logging PR I made earlier 🙃

Add logging support (log crate) PRELIMINARY

I am adding support for some of the things the log crate offers for better logging, i.e. I replaced println! with info! and panic! with error!, but we can definitely do more robust things in the future, especially with slog.
