Comments (6)
Does this also happen when harvesting into a local folder?
from geoportal-server-harvester.
I was harvesting into both a folder and a server.
With 6 million records, I was going to rewrite the folder output to break it into ~1k blocks (or make an S3 store endpoint).
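For illustration, here is a minimal sketch of that folder split. It reads "~1k blocks" as subfolders of about 1,000 records each, which is an assumption, and `block_path` is a hypothetical helper, not part of the harvester's folder broker.

```python
from pathlib import Path

# Assumed block size, taken from the "~1k" mentioned in the comment above.
BLOCK_SIZE = 1000


def block_path(root: Path, index: int, name: str,
               block_size: int = BLOCK_SIZE) -> Path:
    """Return root/<block number>/<name> for the index-th harvested record.

    Hypothetical helper: records 0..999 land in block 000000,
    records 1000..1999 in block 000001, and so on.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    block = index // block_size
    return root / f"{block:06d}" / name


if __name__ == "__main__":
    root = Path("/opt/tomcat/webapps/metadata")
    # Show where a few record positions would be written.
    for i in (0, 999, 1000, 141500):
        print(block_path(root, i, f"record-{i}.xml"))
```

With 6 million records this would give roughly 6,000 subfolders of 1,000 files each, instead of one flat directory.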
I assumed it's a connection problem with the CSW server.
19-May-2017 12:36:53.488 INFO [HARVESTING] com.esri.geoportal.harvester.support.ProgressLogger.printStatusLog Harvesting of PROCESS:: status: working, title: PROCESSOR: DEFAULT[], SOURCE: CSW[csw-host-url=https://www.sciencebase.gov/catalog/csw, cred-username=, cred-password=*****, csw-profile-id=urn:ogc:CSW:2.0.2:HTTP:APISO:SCIENCBASE], DESTINATIONS: [GPT[gpt-host-url=http://localhost:8080/geoportal, cred-username=gptadmin, cred-password=*****, gpt-cleanup=false], FOLDER[folder-root-folder=/opt/tomcat/webapps/metadata/, folder-cleanup=false]], INCREMENTAL: false, IGNOREROBOTSTXT: true progress: 141500
19-May-2017 12:38:28.398 SEVERE [HARVESTING] com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess.lambda$new$43 Error harvesting of PROCESSOR: DEFAULT[], SOURCE: CSW[csw-host-url=https://www.sciencebase.gov/catalog/csw, cred-username=, cred-password=*****, csw-profile-id=urn:ogc:CSW:2.0.2:HTTP:APISO:SCIENCBASE], DESTINATIONS: [GPT[gpt-host-url=http://localhost:8080/geoportal, cred-username=gptadmin, cred-password=*****, gpt-cleanup=false], FOLDER[folder-root-folder=/opt/tomcat/webapps/metadata/, folder-cleanup=false]], INCREMENTAL: false, IGNOREROBOTSTXT: true
com.esri.geoportal.harvester.api.ex.DataInputException: Error reading data.
at com.esri.geoportal.harvester.csw.CswBroker$CswIterator.next(CswBroker.java:179)
at com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess.lambda$new$43(DefaultProcessor.java:136)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.http.client.HttpResponseException: Not Found
at com.esri.geoportal.commons.csw.client.impl.Client.readMetadata(Client.java:155)
at com.esri.geoportal.harvester.csw.CswBroker$CswIterator.next(CswBroker.java:174)
... 2 more
19-May-2017 12:38:28.398 SEVERE [HARVESTING] com.esri.geoportal.harvester.support.ErrorLogger.logError Error processing task: PROCESS:: status: working, title: PROCESSOR: DEFAULT[], SOURCE: CSW[csw-host-url=https://www.sciencebase.gov/catalog/csw, cred-username=, cred-password=*****, csw-profile-id=urn:ogc:CSW:2.0.2:HTTP:APISO:SCIENCBASE], DESTINATIONS: [GPT[gpt-host-url=http://localhost:8080/geoportal, cred-username=gptadmin, cred-password=*****, gpt-cleanup=false], FOLDER[folder-root-folder=/opt/tomcat/webapps/metadata/, folder-cleanup=false]], INCREMENTAL: false, IGNOREROBOTSTXT: true | Error reading data.
com.esri.geoportal.harvester.api.ex.DataInputException: Error reading data.
at com.esri.geoportal.harvester.csw.CswBroker$CswIterator.next(CswBroker.java:179)
at com.esri.geoportal.harvester.engine.defaults.DefaultProcessor$DefaultProcess.lambda$new$43(DefaultProcessor.java:136)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.http.client.HttpResponseException: Not Found
at com.esri.geoportal.commons.csw.client.impl.Client.readMetadata(Client.java:155)
at com.esri.geoportal.harvester.csw.CswBroker$CswIterator.next(CswBroker.java:174)
... 2 more
19-May-2017 12:38:28.399 INFO [HARVESTING] com.esri.geoportal.harvester.support.ReportLogger.completed Completed processing task: PROCESS:: status: completed, title: PROCESSOR: DEFAULT[], SOURCE: CSW[csw-host-url=https://www.sciencebase.gov/catalog/csw, cred-username=, cred-password=*****, csw-profile-id=urn:ogc:CSW:2.0.2:HTTP:APISO:SCIENCBASE], DESTINATIONS: [GPT[gpt-host-url=http://localhost:8080/geoportal, cred-username=gptadmin, cred-password=*****, gpt-cleanup=false], FOLDER[folder-root-folder=/opt/tomcat/webapps/metadata/, folder-cleanup=false]], INCREMENTAL: false, IGNOREROBOTSTXT: true
19-May-2017 12:38:28.399 INFO [HARVESTING] com.esri.geoportal.harvester.support.ReportStatistics.completed Harvesting of PROCESS:: status: completed, title: PROCESSOR: DEFAULT[], SOURCE: CSW[csw-host-url=https://www.sciencebase.gov/catalog/csw, cred-username=, cred-password=*****, csw-profile-id=urn:ogc:CSW:2.0.2:HTTP:APISO:SCIENCBASE], DESTINATIONS: [GPT[gpt-host-url=http://localhost:8080/geoportal, cred-username=gptadmin, cred-password=*****, gpt-cleanup=false], FOLDER[folder-root-folder=/opt/tomcat/webapps/metadata/, folder-cleanup=false]], INCREMENTAL: false, IGNOREROBOTSTXT: true completed at Fri May 19 12:38:28 UTC 2017. No. succeded: 283135, no. failed: 2
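When scanning a long Tomcat log for these summaries, a small helper can pull the counts out. This is only a sketch: the pattern is inferred from the single `ReportStatistics` line above (including the log's own spelling "succeded"), and `harvest_counts` is a hypothetical name.

```python
import re
from typing import Optional, Tuple

# Pattern inferred from the one summary line in this thread; the log
# itself spells the word "succeded", so the pattern matches that spelling.
SUMMARY_RE = re.compile(r"No\. succeded: (\d+), no\. failed: (\d+)")


def harvest_counts(line: str) -> Optional[Tuple[int, int]]:
    """Return (succeeded, failed) from a summary line, or None if absent."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    sample = ("completed at Fri May 19 12:38:28 UTC 2017. "
              "No. succeded: 283135, no. failed: 2")
    print(harvest_counts(sample))
```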
One of the failures is a server issue: the request dies at record 166666.
https://www.sciencebase.gov/catalog/csw
<csw:GetRecords
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    maxRecords="1"
    startPosition="166666"
    outputFormat="application/xml"
    outputSchema="http://www.isotc211.org/2005/gmd"
    resultType="results" service="CSW" version="2.0.2">
  <csw:Query typeNames="csw:Record">
    <csw:ElementSetName>full</csw:ElementSetName>
    <csw:Constraint version="1.1.0">
      <ogc:Filter xmlns:ogc="http://www.opengis.net/ogc" xmlns="http://www.opengis.net/ogc"
          xmlns:gml="http://www.opengis.net/gml">
        <ogc:PropertyIsLike escape="" singleChar="_" wildCard="%">
          <ogc:PropertyName>AnyText</ogc:PropertyName>
          <ogc:Literal>well</ogc:Literal>
        </ogc:PropertyIsLike>
      </ogc:Filter>
    </csw:Constraint>
  </csw:Query>
</csw:GetRecords>
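To probe other positions with the same request, the body can be generated from a template. The sketch below copies the request above verbatim and only parameterizes `startPosition`, `maxRecords`, and the `AnyText` literal. `build_get_records` and `post_get_records` are hypothetical helpers, and the `Content-Type` header is an assumption about what the server accepts.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

CSW_URL = "https://www.sciencebase.gov/catalog/csw"  # endpoint from the thread

# Request body copied from the comment above, with three placeholders.
TEMPLATE = """<csw:GetRecords
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    maxRecords="{max_records}"
    startPosition="{start}"
    outputFormat="application/xml"
    outputSchema="http://www.isotc211.org/2005/gmd"
    resultType="results" service="CSW" version="2.0.2">
  <csw:Query typeNames="csw:Record">
    <csw:ElementSetName>full</csw:ElementSetName>
    <csw:Constraint version="1.1.0">
      <ogc:Filter xmlns:ogc="http://www.opengis.net/ogc" xmlns="http://www.opengis.net/ogc"
          xmlns:gml="http://www.opengis.net/gml">
        <ogc:PropertyIsLike escape="" singleChar="_" wildCard="%">
          <ogc:PropertyName>AnyText</ogc:PropertyName>
          <ogc:Literal>{text}</ogc:Literal>
        </ogc:PropertyIsLike>
      </ogc:Filter>
    </csw:Constraint>
  </csw:Query>
</csw:GetRecords>"""


def build_get_records(start_position: int, any_text: str = "well",
                      max_records: int = 1) -> str:
    """Fill the template; the literal is XML-escaped before insertion."""
    return TEMPLATE.format(start=int(start_position),
                           max_records=int(max_records),
                           text=escape(any_text))


def post_get_records(body: str, url: str = CSW_URL,
                     timeout: float = 60.0) -> int:
    """POST the request and return the HTTP status code (e.g. 200 or 404)."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/xml"},  # assumed header
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


if __name__ == "__main__":
    body = build_get_records(166666)
    ET.fromstring(body)  # sanity check: the body is well-formed XML
    print(post_get_records(body))
```

Calling `post_get_records(build_get_records(n))` for positions around 166666 should show whether the failure is tied to that single record or to a range.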
Pull request #72 provides the ability to define an 'AnyText' literal for any CSW input broker.
The search text filter is now implemented in the harvester.