bib2lod's People

Contributors

daveneiman, j2blake, rjyounes, zimeon

bib2lod's Issues

Change Configuration object to hold a key-value map

Remove the getters and setters and store everything in a map. Individual objects will use the parts of the map they need and do the relevant validation themselves, throwing exceptions when validation fails. Move all validation out of the Configuration object.
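A minimal sketch of the proposed design, written in Python to match the test code quoted elsewhere on this page. The section name `InputService` and its `source` key come from that test's configuration; the `ConfigurationError` class and the `get` method are hypothetical names chosen for illustration, and the project's real classes may look different.

```python
class ConfigurationError(Exception):
    """Raised by a component when a setting it needs is missing or invalid."""


class Configuration:
    """Holds the raw settings as a key-value map and performs no validation."""

    def __init__(self, values):
        self._values = dict(values)

    def get(self, key, default=None):
        return self._values.get(key, default)


class InputService:
    """Hypothetical component that validates only the settings it uses."""

    def __init__(self, config):
        settings = config.get("InputService") or {}
        source = settings.get("source")
        if not source:
            raise ConfigurationError("InputService requires a 'source' setting")
        self.source = source


config = Configuration({"InputService": {"source": "102063.min.xml"}})
service = InputService(config)
print(service.source)  # 102063.min.xml
```

With this split, adding a new component never requires touching Configuration: the component reads its own section of the map and raises its own exception if something is missing.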

Convert minimal record

Create the following from a minimal record: Instance and Title, Work and Title, Local Identifier, and Item and Title. (Is there always an Item associated with a record?)

<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal MARC record. Cornell ILS requires only that a record have an  
  identifier (001), a LDR (which can have everything marked ‘no attempt to code’), 
  a 008 (ditto), and a title (130, 240, or 245).
-->
<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <leader>01050cam a22003011  4500</leader>
    <controlfield tag="001">102063</controlfield>
    <controlfield tag="008">860506s1957    nyua     b    000 0 eng  </controlfield>
    <datafield tag="245" ind1="0" ind2="0">
      <subfield code="a">Clinical cardiopulmonary physiology.</subfield>
      <subfield code="c">Sponsored by the American College of Chest Physicians.  Editorial board: Burgess L. Gordon, chairman, editor-in-chief, Albert H. Andrews [and others]</subfield>
    </datafield>
  </record>  
</collection>
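As a rough illustration of the first step, the sketch below pulls the local identifier (001) and a title out of a record like the one above, using only Python's standard library. The function name `extract_minimal_fields`, the returned dictionary shape, and the choice to check 130, then 240, then 245 are assumptions made for this example, not the converter's actual behavior.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Abbreviated copy of the minimal record, for demonstration only.
MINIMAL_RECORD = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <leader>01050cam a22003011  4500</leader>
    <controlfield tag="001">102063</controlfield>
    <datafield tag="245" ind1="0" ind2="0">
      <subfield code="a">Clinical cardiopulmonary physiology.</subfield>
    </datafield>
  </record>
</collection>"""


def extract_minimal_fields(marcxml):
    """Return the 001 identifier and the first $a found in 130, 240, or 245."""
    record = ET.fromstring(marcxml).find("marc:record", MARC_NS)
    identifier = record.findtext(
        "marc:controlfield[@tag='001']", namespaces=MARC_NS
    )
    title = None
    for tag in ("130", "240", "245"):  # assumed precedence order
        title = record.findtext(
            f"marc:datafield[@tag='{tag}']/marc:subfield[@code='a']",
            namespaces=MARC_NS,
        )
        if title:
            break
    return {"identifier": identifier, "title": title}


fields = extract_minimal_fields(MINIMAL_RECORD)
print(fields["identifier"], "|", fields["title"])
# 102063 | Clinical cardiopulmonary physiology.
```

The Instance, Work, Local Identifier, and Item resources would then be built from these two values.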

Functional tests not working

Second test fails:

def test02_cornell_ld4l_conversion(self):
    """Test Cornell LD4L conversions based on sample configuration file."""
    indirs = 'sample-data/marcxml-to-ld4l/cornell'
    outdir = self.tmpdir
    for indir in glob.glob(os.path.join(indirs, '*')):
        # FIXME - should look for *.xml in each dir and then build tests on that
        src = os.path.join(indir, '102063.min.xml')
        ref = os.path.join(indir, '102063.min.ttl')
        dst = os.path.join(outdir, '102063.min.ttl')
        config = example_config()
        config['InputService']['source'] = src
        config['OutputService']['destination'] = outdir
        config_filepath = self.write_config(config)
        out = run_bib2lod([config_filepath])
        self.assertTrue(os.path.exists(dst))
        self.assertEqual(RDiffB(['data.ld4l.org/cornell']).compare_files([ref, dst]), 0)

E AssertionError: 20 != 0

tests_functional/test_bib2lod.py:64: AssertionError

Note that I modified the code in my working copy to expect Turtle output, since I find the input/output comparisons easier that way. To reproduce this, change the output format in the config to TURTLE.

I've examined the input and output and can't find any differences.

Connect Instance and Work

It seems that the current output for the minimal record has both a Work and an Instance but they are not connected. I assume there should be a bf:instanceOf or bf:hasInstance triple.
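One way to check for the missing link is sketched below. It models the output as a set of plain (subject, predicate, object) string tuples rather than using an RDF library, and the `ex:` identifiers and the function name `unlinked_instances` are made up for the example.

```python
def unlinked_instances(triples):
    """Return Instances that have no bf:instanceOf or bf:hasInstance link."""
    instances = {
        s for s, p, o in triples if p == "rdf:type" and o == "bf:Instance"
    }
    linked = set()
    for s, p, o in triples:
        if p == "bf:instanceOf":
            linked.add(s)  # the Instance is the subject
        elif p == "bf:hasInstance":
            linked.add(o)  # the Instance is the object
    return sorted(instances - linked)


triples = {
    ("ex:work1", "rdf:type", "bf:Work"),
    ("ex:instance1", "rdf:type", "bf:Instance"),
}
print(unlinked_instances(triples))  # ['ex:instance1']

triples.add(("ex:instance1", "bf:instanceOf", "ex:work1"))
print(unlinked_instances(triples))  # []
```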

Add inferences to output

Add an inferencing module to generate inferences from the model before outputting. Start with inverse inferencing.
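A minimal sketch of inverse inferencing, assuming the model can be viewed as a set of (subject, predicate, object) tuples and that the inverse pairs are supplied as a lookup table. The table contents, the `ex:` identifiers, and the function name `add_inverse_triples` are illustrative assumptions, not part of the existing code.

```python
# Assumed table of inverse property pairs; a real module would likely
# load these from the ontology rather than hard-code them.
INVERSE_PROPERTIES = {
    "bf:instanceOf": "bf:hasInstance",
    "bf:hasInstance": "bf:instanceOf",
}


def add_inverse_triples(triples, inverses=INVERSE_PROPERTIES):
    """Return the input triples plus one inverse triple per known property."""
    inferred = {
        (o, inverses[p], s) for s, p, o in triples if p in inverses
    }
    return set(triples) | inferred


model = {("ex:instance1", "bf:instanceOf", "ex:work1")}
for triple in sorted(add_inverse_triples(model)):
    print(triple)
# ('ex:instance1', 'bf:instanceOf', 'ex:work1')
# ('ex:work1', 'bf:hasInstance', 'ex:instance1')
```

Running the step just before output keeps the builders simple: each one asserts a relationship in a single direction and the inverse is filled in afterwards.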

Define singleton classes

Not sure about this, but worth considering. Converter, Cleaner, Parser, InputService, OutputService, EntityBuilders, and EntityBuilder are all instantiated only once.
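For discussion, here is one common way to express a singleton, written in Python to match the test code quoted on this page; the project's own classes may call for a different idiom. The class body is a placeholder and only the instance-caching pattern is the point.

```python
class Converter:
    """Placeholder class showing one way to allow only a single instance."""

    _instance = None

    def __new__(cls):
        # Create the instance on first use and return the same one afterwards.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance


first = Converter()
second = Converter()
print(first is second)  # True
```

A trade-off worth weighing: singletons make it harder to run two differently configured conversions in one process, for example in tests.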
