pandzel-zz commented on June 24, 2024

The pattern language used here is called "glob" syntax. More about it can be found here. Basically, it is a pattern very similar to the one used to filter files with the 'dir' or 'ls' command on various operating systems. I doubt you can come up with a pattern that would skip a particular folder. Keep in mind that the 'pattern' feature on the WAF broker was designed to filter individual files by their extension rather than by the folder they belong to (the default pattern is '**.xml', which only allows the broker to grab XML files).
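To illustrate extension-based matching, here is a minimal sketch using Python's `fnmatch` module. This is only an approximation: the harvester is not written in Python, its glob dialect may treat `*` and `/` differently, and the file paths and the `select` helper below are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical file paths, as they might appear in a web accessible folder.
paths = [
    "metadata/roads.xml",
    "metadata/archive/rivers.xml",
    "docs/readme.pdf",
    "data/points.csv",
]


def select(pattern, candidates):
    """Return the candidates whose full path matches the glob pattern."""
    return [p for p in candidates if fnmatchcase(p, pattern)]


# The default pattern keeps XML files from every folder, nested or not.
xml_only = select("**.xml", paths)
print(xml_only)
```

Note that the pattern says nothing about which folder a file lives in, which is why it is a poor tool for skipping one.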

It doesn't mean it is not possible, but it requires a bit more explanation. All the harvester does is execute, or schedule for execution, something called a task. A task represents the workflow of data from the moment it is acquired from the input broker to the moment it is published to the output broker(s). In general it is as simple as that: take something from the input and publish it to the output. However, there is already an existing concept of filters and transformers. Each of these entities is a little piece of code: a filter selects incoming data by some sort of predicate, while a transformer converts data from one form to another.

Both filters and transformers can be chained together to perform more complicated operations; moreover, they can be executed in some sort of parallel manner. For example, one could create a task with several branches. One branch would select only PDF files and publish them to a local folder. At the same time, another would select XML files and, regardless of what kind of metadata they contain (FGDC, ISO, Dublin Core), normalize them to ISO and publish them to an instance of Geoportal Catalog 2.0. Yet another thread would select only CSV files, publish them to an instance of 'koop' to create a Feature Service based on the CSV data, then register the URL of each feature service with Geoportal Catalog 2.0 AND with ArcGIS Online.
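The idea of a task fanning out into filtered, transformed branches can be sketched roughly as follows. This is not the harvester's actual API: `Branch`, `run_task`, and the `"iso:"` prefix are hypothetical stand-ins, the lists stand in for output brokers, and the branches run one after another rather than in parallel.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Branch:
    """One path through a task: a filter, then transformers, then an output."""
    accepts: Callable[[str], bool]                      # filter predicate
    transformers: List[Callable[[str], str]] = field(default_factory=list)
    published: List[str] = field(default_factory=list)  # stands in for an output broker

    def offer(self, item: str) -> None:
        if not self.accepts(item):
            return
        for transform in self.transformers:
            item = transform(item)
        self.published.append(item)


def run_task(inputs, branches):
    """Offer every item from the input to every branch of the task."""
    for item in inputs:
        for branch in branches:
            branch.offer(item)


pdf_branch = Branch(accepts=lambda name: name.endswith(".pdf"))
xml_branch = Branch(
    accepts=lambda name: name.endswith(".xml"),
    transformers=[lambda name: "iso:" + name],  # placeholder for normalizing to ISO
)

run_task(["a.pdf", "b.xml", "c.csv"], [pdf_branch, xml_branch])
print(pdf_branch.published, xml_branch.published)
```

Here the CSV file is simply dropped because no branch accepts it; adding a third `Branch` would cover that case.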

Such a framework already exists; what is missing is a rich collection of filters and transformers and a sophisticated UI that would allow building such tasks. At the moment, only a REGEX filter is available, along with one transformer which uses XSLT to transform one metadata format into another.
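For what it's worth, a regular expression can express "skip this folder" where a glob struggles. The sketch below uses Python's `re` module and assumes the filter is applied to the full path of each file and that the regex engine supports negative lookahead; neither is confirmed for the harvester's REGEX filter, and the folder name `archive` and the paths are made up.

```python
import re

# Accept paths ending in .xml unless some folder on the way is named "archive".
skip_archive = re.compile(r"^(?!(?:.*/)?archive/).*\.xml$")

# Hypothetical paths for illustration only.
paths = [
    "metadata/roads.xml",
    "metadata/archive/rivers.xml",
    "archive/old.xml",
    "docs/readme.pdf",
]

kept = [p for p in paths if skip_archive.match(p)]
print(kept)
```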

It's worth mentioning that the harvester API is being kept intentionally simple so that entities such as filters, transformers, and brokers can easily be developed as needed.

I hope my explanation sheds some light on what may be coming in future releases of the harvester.

from geoportal-server-harvester.
