
olac-ng's People

Contributors

lingomat

olac-ng's Issues

Improve re-harvest logic

Right now re-harvesting is triggered by the age of the record. For dynamic archives specifically, we can check the Identify response and the first page of ListRecords to see whether anything has changed and, if not, skip the re-harvest entirely.
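A minimal sketch of that check, assuming the two responses have already been fetched as XML text. The names `fingerprint` and `needs_reharvest` are hypothetical, not part of the existing harvester. One detail worth noting: OAI-PMH responses carry a `responseDate`, and list responses may carry a `resumptionToken`, both of which can differ between otherwise identical requests, so the sketch strips them before hashing. An unchanged first page is only a heuristic and does not prove that later pages are unchanged.

```python
import hashlib
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
# Elements that can change on every request and so must not affect the hash.
VOLATILE = {OAI_NS + "responseDate", OAI_NS + "resumptionToken"}


def fingerprint(xml_text: str) -> str:
    """Hash an OAI-PMH response, ignoring per-request volatile elements."""
    root = ET.fromstring(xml_text)
    # Snapshot the element list first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag in VOLATILE:
                parent.remove(child)
    return hashlib.sha256(ET.tostring(root)).hexdigest()


def needs_reharvest(identify_xml, first_page_xml, stored_fingerprint):
    """Return (changed, current_fingerprint) for one archive.

    stored_fingerprint is whatever was saved after the last full harvest
    (None if the archive has never been harvested).
    """
    current = fingerprint(identify_xml) + ":" + fingerprint(first_page_xml)
    return current != stored_fingerprint, current
```

The returned fingerprint would be stored alongside the archive after a successful harvest and passed back in on the next run; the age-based rule could remain as a fallback.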

Obtain a basic design for search

Search was axed from OLAC because it was broken, so I don't know what it looked like or what it should look like.

It would be reasonable to search by language and by free text in the title and similar fields. Searching the content of the metadata block is probably not going to work well given the variation in quality, with some exceptions such as date and linguistic type (which could be a drop-down, since OLAC only has three...).
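As a starting point for discussion, here is a rough sketch of those filters over in-memory records. The `Record` shape and the `search` function are hypothetical; the three values in `LINGUISTIC_TYPES` are the ones from the OLAC linguistic data type vocabulary, which is what makes a drop-down practical. A real implementation would presumably sit on a database or index rather than a list.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# The three values of the OLAC linguistic data type vocabulary.
LINGUISTIC_TYPES = ("language_description", "lexicon", "primary_text")


@dataclass
class Record:
    # Hypothetical minimal record shape, for illustration only.
    title: str
    language: str  # e.g. a language code taken from the metadata
    linguistic_type: Optional[str] = None


def search(
    records: Iterable[Record],
    text: Optional[str] = None,
    language: Optional[str] = None,
    linguistic_type: Optional[str] = None,
) -> List[Record]:
    """Filter records by exact language, linguistic type, and title substring."""
    if linguistic_type is not None and linguistic_type not in LINGUISTIC_TYPES:
        raise ValueError(f"unknown linguistic type: {linguistic_type!r}")
    results = []
    for record in records:
        if language and record.language != language:
            continue
        if linguistic_type and record.linguistic_type != linguistic_type:
            continue
        # Case-insensitive free-text match on the title only.
        if text and text.lower() not in record.title.lower():
            continue
        results.append(record)
    return results
```

Restricting free text to the title sidesteps the metadata quality problem, at the cost of missing records whose titles are uninformative.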

Remove XML view

There is not really any point to this unless we're extracting the individual item rather than the page (or the entire static archive!).

Resumable incremental dynamic harvesting

This might not actually be possible because at first glance there seems to be no way to request specific pages.

But check anyway, because it's not ideal that an error in a single API request out of thousands ends up discarding everything. The harvester does not even keep disk copies, and network-cached hits are no good because resumption tokens must be freshly generated.
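Even if arbitrary pages cannot be requested, a partial improvement might look like the sketch below: write every page to disk as it arrives, and retry a failed request with the same resumption token a few times before giving up. The OAI-PMH specification describes re-issuing the most recent resumption token as a way to recover from errors, but whether a given archive honours that is an assumption here. `harvest` and the `fetch_page` callable are hypothetical names; `fetch_page(token)` stands in for whatever performs the HTTP request and returns the page XML plus the next token (or `None` on the last page).

```python
import time
from pathlib import Path
from typing import Callable, Optional, Tuple

# fetch_page(token) -> (page_xml, next_token); token is None for the first request.
FetchPage = Callable[[Optional[str]], Tuple[str, Optional[str]]]


def harvest(fetch_page: FetchPage, out_dir, max_retries: int = 3, delay: float = 0.0) -> int:
    """Follow resumption tokens, saving each page to disk. Returns the page count."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    token: Optional[str] = None
    page_no = 0
    while True:
        # Retry the same request (same token) instead of discarding the run.
        for attempt in range(1, max_retries + 1):
            try:
                page_xml, next_token = fetch_page(token)
                break
            except OSError:
                if attempt == max_retries:
                    raise  # pages already written stay on disk for inspection
                time.sleep(delay)
        (out_dir / f"page-{page_no:05d}.xml").write_text(page_xml, encoding="utf-8")
        page_no += 1
        if not next_token:
            return page_no
        token = next_token
```

This does not make a harvest resumable across process restarts, since a saved token may have expired by then, but it keeps a single transient failure from throwing away thousands of successful requests.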
