Comments (10)
Are you going to submit a pull request to both the harvester and the catalog?
from geoportal-server-harvester.
I just wanted to point out that this implementation uses the 'XOAI' libraries, which are released under the license at https://raw.githubusercontent.com/DSpace/DSpace/master/LICENSE. I would contact the Esri legal department for advice on whether we are allowed to mix such a library with our code.
from geoportal-server-harvester.
I can work on splitting this into two change sets.
I need to bring them into line with the 2.5 release.
If there is a better OAI library or codebase to use, that would be preferred. For now, it only handles oai_dc records.
from geoportal-server-harvester.
Correction: it turns out that the particular OAI library used here (developed by lyncode) is released under the Apache 2.0 licence (at least, this is what the Maven repository tells me).
from geoportal-server-harvester.
Then we can include this in the next release!
from geoportal-server-harvester.
If you issue a pull request, we will incorporate your changes into the code base.
from geoportal-server-harvester.
I put the code on a branch: https://github.com/CINERGI/geoportal-server-harvester/tree/oai_d1_harvest_2_5
I am having some issues with migrating to the 2.5 release.
from geoportal-server-harvester.
DataOne: there seem to be some XML parser conflicts.
30-Mar-2017 12:15:01.350 SEVERE [HARVESTING] com.esri.geoportal.harvester.support.ErrorLogger.logError Error processing task: PROCESS:: status: working, title: PROCESSOR: DEFAULT[], SOURCE: DataOne[d1-cn-url=https://cn.dataone.org/cn, d1-format-identifier=http://www.isotc211.org/2005/gmd-noaa], DESTINATIONS: [FOLDER[folder-root-folder=D:\dev_odm\geoportal_metadata, folder-cleanup=false]], INCREMENTAL: false, IGNOREROBOTSTXT: true | Error reading data.
com.esri.geoportal.harvester.api.ex.DataInputException: Error reading data.
at edu.sdsc.dataone.source.DataOneBroker$DataOneIterator.hasNext(DataOneBroker.java:159)
at com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess.lambda$new$43(DefaultProcessor.java:132)
at com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess$$Lambda$608/18700217.run(Unknown Source)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: org.xml.sax.SAXNotRecognizedException: Feature 'http://javax.xml.XMLConstants/feature/secure-processing' not recognized
at com.sun.xml.internal.bind.v2.util.XmlFactory.createParserFactory(XmlFactory.java:128)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.getXMLReader(UnmarshallerImpl.java:139)
at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:157)
at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:204)
at org.dataone.service.util.TypeMarshaller.unmarshalTypeFromStream(TypeMarshaller.java:297)
at org.dataone.client.rest.MultipartD1Node.deserializeServiceType(MultipartD1Node.java:870)
at org.dataone.client.v1.impl.MultipartCNode.listObjects(MultipartCNode.java:451)
at org.dataone.client.v1.impl.MultipartCNode.listObjects(MultipartCNode.java:473)
at edu.sdsc.dataone.source.DataOneBroker$DataOneIterator.hasNext(DataOneBroker.java:136)
... 3 more
Caused by: org.xml.sax.SAXNotRecognizedException: Feature 'http://javax.xml.XMLConstants/feature/secure-processing' not recognized
at com.ctc.wstx.sax.WstxSAXParserFactory.setFeature(WstxSAXParserFactory.java:144)
at com.sun.xml.internal.bind.v2.util.XmlFactory.createParserFactory(XmlFactory.java:121)
... 11 more
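The trace above shows `com.ctc.wstx.sax.WstxSAXParserFactory` rejecting the secure-processing feature that the JAXB runtime sets, which suggests that JAXP is resolving a third-party SAX factory from the classpath rather than the JDK default. A minimal diagnostic sketch follows; the class name `SaxFactoryCheck` and the helper `supportsSecureProcessing` are hypothetical and not part of the harvester.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical diagnostic class, not part of the harvester code base.
public class SaxFactoryCheck {

    /** Returns true if the factory accepts the secure-processing feature seen in the trace. */
    static boolean supportsSecureProcessing(SAXParserFactory factory) {
        try {
            // Same feature URI as in the trace:
            // http://javax.xml.XMLConstants/feature/secure-processing
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            return true;
        } catch (Exception e) {
            // SAXNotRecognizedException, SAXNotSupportedException
            // or ParserConfigurationException
            return false;
        }
    }

    public static void main(String[] args) {
        // JAXP picks the factory from a system property or the classpath.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        System.out.println("SAXParserFactory in use: " + factory.getClass().getName());
        System.out.println("secure-processing supported: " + supportsSecureProcessing(factory));
    }
}
```

Running this inside the same web application would show which factory wins the lookup. If it turns out to be the Woodstox one, a possible (untested here) workaround is to set the `javax.xml.parsers.SAXParserFactory` system property to the JDK implementation, or to exclude the conflicting dependency from the build.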
from geoportal-server-harvester.
OAI: I am finding that the 4.x library is not supported.
We probably need to migrate to a lighter library, or just roll our own.
https://github.com/xbib/oai
The OAI error looks like it is breaking deep in the OAI fetch routines. The buffer needs to be increased:
30-Mar-2017 12:23:40.442 INFO [HARVESTING] com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess.lambda$new$43 Started harvest: PROCESSOR: DEFAULT[], SOURCE: OAI[oai-host-url=https://oai.datacite.org/oai, oai-metaFormat=oai_dc], DESTINATIONS: [FOLDER[folder-root-folder=D:\dev_odm\geoportal_metadata, folder-cleanup=false]], INCREMENTAL: false, IGNOREROBOTSTXT: true
com.lyncode.xoai.serviceprovider.exceptions.InvalidOAIResponse: com.lyncode.xoai.serviceprovider.exceptions.HttpException: Error querying service. Returned HTTP Status Code: 502
at com.lyncode.xoai.serviceprovider.handler.ListRecordHandler.nextIteration(ListRecordHandler.java:100)
at com.lyncode.xoai.serviceprovider.lazy.ItemIterator.hasNext(ItemIterator.java:40)
at com.lyncode.xoai.serviceprovider.lazy.ItemIterator.<init>(ItemIterator.java:30)
at com.lyncode.xoai.serviceprovider.ServiceProvider.listRecords(ServiceProvider.java:65)
at edu.sdsc.oai.source.OAIBroker.initialize(OAIBroker.java:93)
at com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess.initialize(DefaultProcessor.java:98)
at com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess.lambda$new$43(DefaultProcessor.java:128)
at com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess$$Lambda$608/18700217.run(Unknown Source)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.lyncode.xoai.serviceprovider.exceptions.HttpException: Error querying service. Returned HTTP Status Code: 502
at com.lyncode.xoai.serviceprovider.client.HttpOAIClient.execute(HttpOAIClient.java:46)
at com.lyncode.xoai.serviceprovider.handler.ListRecordHandler.nextIteration(ListRecordHandler.java:69)
... 8 more
30-Mar-2017 12:23:41.408 SEVERE [HARVESTING] com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess.lambda$new$43 Error harvesting of PROCESSOR: DEFAULT[], SOURCE: OAI[oai-host-url=https://oai.datacite.org/oai, oai-metaFormat=oai_dc], DESTINATIONS: [FOLDER[folder-root-folder=D:\dev_odm\geoportal_metadata, folder-cleanup=false]], INCREMENTAL: false, IGNOREROBOTSTXT: true
com.esri.geoportal.harvester.api.ex.DataProcessorException: cannot connect to xOAI
at edu.sdsc.oai.source.OAIBroker.initialize(OAIBroker.java:97)
at com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess.initialize(DefaultProcessor.java:98)
at com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess.lambda$new$43(DefaultProcessor.java:128)
at com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess$$Lambda$608/18700217.run(Unknown Source)
at java.lang.Thread.run(Thread.java:745)
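For the "roll our own" option, the core of an OAI-PMH `ListRecords` harvest is small: build the request URL, then follow `resumptionToken` values until none is returned. The sketch below covers only those two steps, under the assumption that the endpoint follows OAI-PMH 2.0. The class `OaiListRecords` and its methods are hypothetical; the HTTP fetch itself, and any retry on transient errors such as the 502 in the trace, are deliberately left out.

```java
import java.io.ByteArrayInputStream;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the harvester or of the XOAI library.
public class OaiListRecords {

    static final String OAI_NS = "http://www.openarchives.org/OAI/2.0/";

    /**
     * Builds a ListRecords request URL. When a resumption token is given,
     * it replaces metadataPrefix, because OAI-PMH treats it as an
     * exclusive argument.
     */
    static String buildUrl(String baseUrl, String metadataPrefix, String resumptionToken)
            throws UnsupportedEncodingException {
        StringBuilder sb = new StringBuilder(baseUrl).append("?verb=ListRecords");
        if (resumptionToken != null && !resumptionToken.isEmpty()) {
            sb.append("&resumptionToken=").append(URLEncoder.encode(resumptionToken, "UTF-8"));
        } else {
            sb.append("&metadataPrefix=").append(URLEncoder.encode(metadataPrefix, "UTF-8"));
        }
        return sb.toString();
    }

    /** Returns the token for the next page, or null when the list is complete. */
    static String nextToken(String responseXml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(responseXml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = doc.getElementsByTagNameNS(OAI_NS, "resumptionToken");
        if (nodes.getLength() == 0) {
            return null;
        }
        // An empty resumptionToken element marks the last page.
        String token = nodes.item(0).getTextContent().trim();
        return token.isEmpty() ? null : token;
    }
}
```

A harvest loop would call `buildUrl` with a null token first, fetch the response, pass the body to `nextToken`, and repeat with the returned token until it is null.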
from geoportal-server-harvester.
OAI-PMH capabilities have been added with pull request #70.
from geoportal-server-harvester.