Comments (5)
Hi there. It looks like you've supplied an invalid SQL query:
SELECT top 100 percent primarykeyfieldvalue,updatetime FROM referrals WHERE CAS
It looks like part of the WHERE clause is missing. Should this be configured somewhere?
from cogstack-pipeline.
Sorry - to clarify, this SQL is generated by Spring Batch. I'll get back to you later when I can look at the Docman implementation.
from cogstack-pipeline.
You need to configure Tika to read documents from the filesystem. Check out the example config used in the integration test:
You should also activate the "docman" profile. Can you post your example config, minus sensitive info?
from cogstack-pipeline.
See config attached.
docman_config.txt
Also receiving this error now:
07:42:22.864 main INFO uk.ac.kcl.cleanup.CleanupBean 159 ****************STARTUP INITIATED*********************
07:42:22.922 main INFO org.springframework.scheduling.annotation.ScheduledAnnotationBeanPostProcessor 260 No TaskScheduler/ScheduledExecutorService bean found for scheduled processing
07:42:23.005 pool-4-thread-1 INFO uk.ac.kcl.utils.BatchJobUtils 87 Looking for last successful job
07:42:32.092 pool-4-thread-1 INFO uk.ac.kcl.utils.BatchJobUtils 75 Looking for status of last job
07:42:49.194 pool-4-thread-1 INFO uk.ac.kcl.scheduling.SingleJobLauncher 94 Last job failed. Attempting restart
07:42:49.195 pool-4-thread-1 INFO org.springframework.batch.core.launch.support.SimpleJobOperator 340 Locating parameters for next instance of job with name=docmanJob_testing2
07:43:20.952 pool-4-thread-1 INFO uk.ac.kcl.utils.BatchJobUtils 75 Looking for status of last job
07:43:40.786 pool-4-thread-1 INFO org.springframework.batch.core.launch.support.SimpleJobOperator 362 Attempting to launch job with name=docmanJob_testing2 and parameters={jobName=docmanJob_testing2, first_run_of_job=true, run.id=1}
07:43:48.693 pool-4-thread-1 INFO org.springframework.batch.core.launch.support.SimpleJobLauncher 133 Job: [FlowJob: [name=docmanJob_testing2]] launched with the following parameters: [{jobName=docmanJob_testing2, first_run_of_job=true, run.id=1}]
07:45:39.117 pool-4-thread-1 INFO org.springframework.batch.core.job.SimpleStepHandler 146 Executing step: [docmanJob_testing2MasterStep]
07:45:39.134 pool-4-thread-1 INFO com.zaxxer.hikari.HikariDataSource 93 HikariPool-2 - Started.
07:46:09.462 pool-4-thread-1 ERROR org.springframework.batch.core.step.AbstractStep 229 Encountered an error executing step docmanJob_testing2MasterStep in job docmanJob_testing2
org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: HikariPool-2 - Connection is not available, request timed out after 30000ms.
    at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:80)
    at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:394)
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:474)
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:484)
    at org.springframework.jdbc.core.JdbcTemplate.queryForObject(JdbcTemplate.java:494)
    at org.springframework.jdbc.core.JdbcTemplate.queryForObject(JdbcTemplate.java:500)
    at uk.ac.kcl.partitioners.CogstackJobPartitioner.getFirstTimestampInTable(CogstackJobPartitioner.java:281)
    at uk.ac.kcl.partitioners.CogstackJobPartitioner.configureForPKTimeStampPartitions(CogstackJobPartitioner.java:124)
    at uk.ac.kcl.partitioners.CogstackJobPartitioner.partition(CogstackJobPartitioner.java:105)
    at org.springframework.batch.core.partition.support.SimpleStepExecutionSplitter.getContexts(SimpleStepExecutionSplitter.java:234)
    at org.springframework.batch.core.partition.support.SimpleStepExecutionSplitter.split(SimpleStepExecutionSplitter.java:177)
    at org.springframework.batch.core.partition.support.AbstractPartitionHandler.handle(AbstractPartitionHandler.java:59)
    at org.springframework.batch.core.partition.support.PartitionStep.doExecute(PartitionStep.java:106)
    at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:200)
    at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:148)
    at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:64)
    at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:67)
    at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:169)
    at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:144)
    at org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:134)
    at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:306)
    at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:135)
    at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:50)
    at org.springframework.batch.core.launch.support.SimpleJobLauncher.run(SimpleJobLauncher.java:128)
    at org.springframework.batch.core.launch.support.SimpleJobOperator.startNextInstance(SimpleJobOperator.java:364)
    at uk.ac.kcl.scheduling.SingleJobLauncher.startNextInstance(SingleJobLauncher.java:141)
    at uk.ac.kcl.scheduling.SingleJobLauncher.launchJob(SingleJobLauncher.java:95)
    at uk.ac.kcl.scheduling.ScheduledJobLauncher.doTask(ScheduledJobLauncher.java:53)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:65)
    at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
    at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:81)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLTransientConnectionException: HikariPool-2 - Connection is not available, request timed out after 30000ms.
    at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:196)
    at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:147)
    at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:99)
    at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:111)
    at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:77)
    ... 41 common frames omitted
Caused by: java.sql.SQLException: ResultSet is from UPDATE. No Data.
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:957)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:896)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:885)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:860)
    at com.mysql.jdbc.ResultSetImpl.next(ResultSetImpl.java:6333)
    at com.mysql.jdbc.ConnectionImpl.loadServerVariables(ConnectionImpl.java:3871)
    at com.mysql.jdbc.ConnectionImpl.initializePropsFromServer(ConnectionImpl.java:3290)
    at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2299)
    at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2085)
    at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:795)
    at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:44)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
    at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:400)
    at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:327)
    at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:95)
    at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:101)
    at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:316)
    at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:173)
    at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:443)
    at com.zaxxer.hikari.pool.HikariPool.access$500(HikariPool.java:66)
    at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:568)
    at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:561)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    ... 3 common frames omitted
07:46:09.499 pool-4-thread-1 INFO org.springframework.batch.core.launch.support.SimpleJobLauncher 136 Job: [FlowJob: [name=docmanJob_testing2]] completed with the following parameters: [{jobName=docmanJob_testing2, first_run_of_job=true, run.id=1}] and the following status: [FAILED]
Please note: the MySQL server is very old (MySQL 3.23.58-nt). Could this also be the reason for it failing?
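One way to test the version theory outside the pipeline is a plain JDBC check. The sketch below is not part of CogStack: the class name `MySqlVersionCheck`, the helper `isOlderThan`, and the 4.1 threshold are assumptions for illustration only, so check the documentation of the Connector/J release on your classpath for the server versions it actually supports. With a server this old the connection attempt itself may fail, as in the trace above, where the root exception is thrown from `ConnectionImpl.loadServerVariables` during connection setup.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.SQLException;

public class MySqlVersionCheck {

    /** Returns {major, minor} parsed from a version string such as "3.23.58-nt". */
    public static int[] parseMajorMinor(String version) {
        String[] parts = version.split("[.\\-]");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unexpected version string: " + version);
        }
        return new int[] { Integer.parseInt(parts[0]), Integer.parseInt(parts[1]) };
    }

    /** True if the reported version is older than minMajor.minMinor. */
    public static boolean isOlderThan(String version, int minMajor, int minMinor) {
        int[] v = parseMajorMinor(version);
        return v[0] < minMajor || (v[0] == minMajor && v[1] < minMinor);
    }

    public static void main(String[] args) {
        if (args.length < 3) {
            System.err.println("usage: MySqlVersionCheck <jdbc-url> <user> <password>");
            return;
        }
        // Connect directly, bypassing the connection pool, so the driver's own
        // error is reported instead of a pool timeout.
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            DatabaseMetaData meta = conn.getMetaData();
            String version = meta.getDatabaseProductVersion();
            System.out.println("Server version: " + version);
            // 4.1 is an assumed threshold, not a value taken from this thread.
            if (isOlderThan(version, 4, 1)) {
                System.out.println("Server is older than 4.1; the driver may not support it.");
            }
        } catch (SQLException e) {
            System.err.println("Direct connection failed: " + e.getMessage());
        }
    }
}
```

If the direct connection fails with the same `SQLException`, the problem lies between the driver and the server rather than in the pipeline configuration.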
from cogstack-pipeline.
Issue solved. Details at https://docs.google.com/document/d/1Pq60-D8diaDESaYdzWXAXMw4j3DI_u_lgiLdOmgrPtc/edit?usp=sharing
from cogstack-pipeline.