solr-import-export-json's Introduction

solr-import-export-json

Import/Export (or Restore/Backup) a Solr collection from/to a json file.

As the title states, this little project helps you save your collection in JSON format and restore it where and when you need it.

Please report issues at https://github.com/freedev/solr-import-export-json/issues

Install

To run this console app you need a few dependencies (Java 11, git, Maven). If you are a Java developer you probably already have everything. If not, and you are on a Linux distribution that uses apt, run the following commands:

sudo apt update
sudo apt install git openjdk-11-jdk maven

git clone https://github.com/freedev/solr-import-export-json.git
cd solr-import-export-json
mvn clean package

Now you're ready.

How to use it

This is the list of command line parameters.

usage: myapp [-a <arg>] [-b <arg>] [-C] [-c <arg>] [-d] [-D] [-f <arg>]
       [-F <arg>] [-h] [-i <arg>] [-k <arg>] [-o <arg>] [-p <arg>] [-s
       <arg>] [-S <arg>] [-u <arg>] [-x <arg>]
solr-import-export-json

 -a,--actionType <arg>           action type
                                 [import|export|backup|restore]
 -b,--blockSize <arg>            block size (default 5000 documents)
 -C,--disableCursors             disable Solr cursors while reading
 -c,--commitDuringImport <arg>   Commit progress after specified number of
                                 docs. If not specified, whole work will
                                 be committed.
 -d,--deleteAll                  delete all documents before import
 -D,--dryRun                     dry run test
 -f,--filterQuery <arg>          filter Query during export
 -F,--dateTimeFormat <arg>       set custom DateTime format (default
                                 yyyy-MM-dd'T'HH:mm:ss.SSS'Z')
 -h,--help                       help
 -i,--includeFields <arg>        simple comma separated fields list to be
                                 used during export. if not specified all
                                 the existing fields are used
 -k,--uniqueKey <arg>            specify unique key for deep paging
 -o,--output <arg>               output file
 -p,--password <arg>             basic auth password
 -s,--solrUrl <arg>              solr url -
                                 http://localhost:8983/solr/collection_nam
                                 e
 -S,--skipFields <arg>           comma separated fields list to skip
                                 during export/import, this field list
                                 accepts for each field prefix/suffix a
                                 wildcard *. So you can specify skip all
                                 fields starting with name_*
 -u,--user <arg>                 basic auth username
 -x,--skipCount <arg>            Number of documents to be skipped when
                                 loading from file. Useful when an error
                                 occurs, so loading can continue from last
                                 successful save.

Real life examples

Export all documents into a JSON file

./run.sh -s http://localhost:8983/solr/collection -a export -o /tmp/collection.json

Import documents from JSON

./run.sh -s http://localhost:8983/solr/collection -a import -o /tmp/collection.json 

Export a subset of documents by adding one or more filter queries (like Solr fq parameters) to the export

 ./run.sh -s http://localhost:8983/solr/collection -a export -o /tmp/collection.json --filterQuery field1:A  --filterQuery field2:B

Import documents from JSON, but first delete all documents in the collection

 ./run.sh -s http://localhost:8983/solr/collection -a import -o /tmp/collection.json --deleteAll

Export documents and skip a few fields. In this example the following fields are skipped: field1_a, all fields starting with field2_, and all fields ending with _date

 ./run.sh -s http://localhost:8983/solr/collection -a export -o /tmp/collection.json --skipFields field1_a,field2_*,*_date
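
The wildcard rule described for --skipFields (a * allowed as prefix or suffix of each entry) can be sketched roughly as follows. This is only an illustration of the matching idea, not the project's actual code: the class SkipFieldsSketch and the helpers matches and isSkipped are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

public class SkipFieldsSketch {

    // Hypothetical helper: true if fieldName matches one pattern, where the
    // pattern may end or start with a single '*' wildcard.
    static boolean matches(String pattern, String fieldName) {
        if (pattern.length() > 1 && pattern.endsWith("*")) {
            // "field2_*" matches every field starting with "field2_"
            return fieldName.startsWith(pattern.substring(0, pattern.length() - 1));
        }
        if (pattern.length() > 1 && pattern.startsWith("*")) {
            // "*_date" matches every field ending with "_date"
            return fieldName.endsWith(pattern.substring(1));
        }
        // No wildcard: the names must be identical
        return pattern.equals(fieldName);
    }

    // True if the field matches any entry of a comma separated pattern list.
    static boolean isSkipped(String skipFields, String fieldName) {
        return Arrays.stream(skipFields.split(","))
                .map(String::trim)
                .anyMatch(p -> matches(p, fieldName));
    }

    public static void main(String[] args) {
        String skipFields = "field1_a,field2_*,*_date";
        for (String f : List.of("field1_a", "field2_x", "created_date", "title")) {
            System.out.println(f + " skipped: " + isSkipped(skipFields, f));
        }
    }
}
```

With the pattern list from the example above, field1_a, field2_x and created_date are reported as skipped, while title is kept.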

Import documents, skipping the first 49835000 records in the file, committing every 200000 documents, with a block size of 5000 (the value the help output above lists as the default)

./run.sh -s http://localhost:8983/solr/collection -a import -o /tmp/collection.json -x 49835000 -c 200000 -b 5000 
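
How -x, -b and -c interact can be pictured with a small sketch: skip the first documents, send the rest in blocks, and commit every so many documents. It works on an in-memory list only and does not talk to Solr. The class ImportBatchSketch and its methods are hypothetical and are not taken from the project's source.

```java
import java.util.ArrayList;
import java.util.List;

public class ImportBatchSketch {

    // Drops the first skipCount documents, then groups the rest into
    // consecutive blocks of at most blockSize documents.
    static <T> List<List<T>> blocks(List<T> docs, int skipCount, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        int start = Math.min(Math.max(skipCount, 0), docs.size());
        for (int i = start; i < docs.size(); i += blockSize) {
            result.add(docs.subList(i, Math.min(i + blockSize, docs.size())));
        }
        return result;
    }

    // True once the number of documents sent reaches a multiple of commitEvery.
    static boolean commitDue(long docsSent, long commitEvery) {
        return commitEvery > 0 && docsSent > 0 && docsSent % commitEvery == 0;
    }

    public static void main(String[] args) {
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            docs.add(i);
        }
        long sent = 0;
        // Small numbers stand in for -x 2 -b 5 -c 10
        for (List<Integer> block : blocks(docs, 2, 5)) {
            sent += block.size();
            System.out.println("send " + block + (commitDue(sent, 10) ? " then commit" : ""));
        }
    }
}
```

In this simplified model a commit only falls due when the commit interval is a multiple of the block size, as it is in the example above (200000 and 5000).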

solr-import-export-json's People

Contributors

dependabot-preview[bot], dependabot[bot], freedev, ivanbaricic, schnoddelbotz, spyk

solr-import-export-json's Issues

Unable to import with Solr auth enabled

Firstly, many thanks for this project. It's proving extremely useful. Currently I'm experiencing an issue whilst attempting an import with Solr Basic Auth enabled.

The command I am attempting to execute...

./run.sh -s http://username:[email protected]/solr/my-amazing-collection -a import -o ~/www/dk-solrcloud-data/my-amazing-collection.json

Relevant output below...

[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building import-export 1.0
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- exec-maven-plugin:1.6.0:java (default-cli) @ import-export ---
14:55:43,425 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.groovy]
14:55:43,425 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml]
14:55:43,425 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [file:/Users/simonharper/www/dk-solrcloud-persist/target/classes/logback.xml]
14:55:43,473 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - Setting ReconfigureOnChangeFilter scanning period to 30 seconds
14:55:43,473 |-INFO in ReconfigureOnChangeFilter{invocationCounter=0} - Will scan for changes in [[/Users/simonharper/www/dk-solrcloud-persist/target/classes/logback.xml]] every 30 seconds.
14:55:43,473 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - Adding ReconfigureOnChangeFilter as a turbo filter
14:55:43,479 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
14:55:43,481 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [STDOUT]
14:55:43,492 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
14:55:43,511 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting level of logger [org.apache.zookeeper] to INFO
14:55:43,511 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting additivity of logger [org.apache.zookeeper] to false
14:55:43,512 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[org.apache.zookeeper]
14:55:43,512 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting level of logger [org.apache.http] to INFO
14:55:43,512 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting additivity of logger [org.apache.http] to false
14:55:43,512 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[org.apache.http]
14:55:43,512 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting level of logger [it.damore.solr] to DEBUG
14:55:43,512 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting additivity of logger [it.damore.solr] to false
14:55:43,512 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[it.damore.solr]
14:55:43,512 |-INFO in ch.qos.logback.classic.joran.action.RootLoggerAction - Setting level of ROOT logger to INFO
14:55:43,512 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[ROOT]
14:55:43,513 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - End of configuration.
14:55:43,513 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@1814dedc - Registering current configuration as safe fallback point
14:55:43.731 [it.damore.solr.importexport.App.main()] INFO  i.d.s.i.c.ConfigFactory - Current configuration Config [actionType=IMPORT, solrUrl=http://username:[email protected]/solr/dkhub, fileName=/Users/simonharper/www/dk-solrcloud-data/dkhub-staging.json, deleteAll=false, skipFieldsSet=[], filterQuery=null, uniqueKey=null, dryRun=false]
14:55:43.733 [it.damore.solr.importexport.App.main()] INFO  i.d.s.i.App - Found config: Config [actionType=IMPORT, solrUrl=http://username:[email protected]/solr/dkhub, fileName=/Users/simonharper/www/dk-solrcloud-data/dkhub-staging.json, deleteAll=false, skipFieldsSet=[], filterQuery=null, uniqueKey=null, dryRun=false]
[WARNING]
java.io.IOException: Server returned HTTP response code: 401 for URL: http://username:[email protected]/solr/my-amazing-collection/schema/uniquekey?wt=json
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0 (HttpURLConnection.java:1876)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream (HttpURLConnection.java:1474)
    at java.net.URL.openStream (URL.java:1045)
    at it.damore.solr.importexport.App.readUrl (App.java:144)
    at it.damore.solr.importexport.App.readUniqueKeyFromSolrSchema (App.java:127)
    at it.damore.solr.importexport.App.main (App.java:82)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:282)
    at java.lang.Thread.run (Thread.java:748)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.135 s
[INFO] Finished at: 2017-11-22T14:55:43Z
[INFO] Final Memory: 14M/303M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.6.0:java (default-cli) on project import-export: An exception occured while executing the Java class. Server returned HTTP response code: 401 for URL: http://username:[email protected]/solr/my-amazing-collection/schema/uniquekey?wt=json -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

Output when run via curl...

{"responseHeader":{"status":0,"QTime":1},"uniqueKey":"id"}

Any ideas?
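
The 401 in the trace above comes from java.net.URL.openStream, which suggests the credentials embedded in the URL were not sent as an Authorization header. A minimal sketch of sending HTTP Basic credentials explicitly, assuming plain java.net classes, might look like this. The class BasicAuthSketch and its methods are hypothetical, and the URL and credentials are placeholders.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthSketch {

    // Builds the value of the HTTP "Authorization" header for Basic auth:
    // the word "Basic" followed by base64("user:password").
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    // Prepares a GET request carrying the header. Nothing is sent yet:
    // the connection is only opened when the response is read.
    static HttpURLConnection open(String url, String user, String password) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestProperty("Authorization", basicAuthHeader(user, password));
        return conn;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(basicAuthHeader("solr", "password"));
        // Placeholder URL; no request is made in this sketch
        HttpURLConnection conn = open(
                "http://localhost:8983/solr/collection/schema/uniquekey?wt=json",
                "solr", "password");
        System.out.println("prepared request for " + conn.getURL());
    }
}
```

The help output above also lists -u/--user and -p/--password for basic auth, which avoids putting credentials into the URL at all.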

Clarify Licensing on GitHub

The project on GitHub is listed as having an Apache 2.0 license, but I see GPLv3 in the code and in a gpl.txt file. If it is still GPLv3 licensed and also contains Apache 2.0 licensed code, then I would suggest renaming the gpl.txt file to "gpl-3.0-license.txt" so that GitHub detects the project as having multiple licenses.

An example screenshot of what that looks like.

[Screenshot: Screen Shot 2022-07-26 at 5 15 43 PM]

When you enabled Solr auth it could be export but not import

Thanks for your project! It has been very useful for my work. But when I import data into SolrCloud, I get a 401 error.
I ran the following:
./run.sh -s http://localhost:8983/solr/hdqs_solr_collection1 -a import -o /home/leq/solr.json -u solr -p password
The problem occurred:
[image]
Could you give me some advice?

--skipCount on exporting

It is a great, practical tool and it works well, but I have 1.5 billion records. I expect it will take 5-6 days just to export them to a JSON file. The problem is that I want to skip a given number of lines, because errors can occur and would force me to start all over again. So how can I skip already exported items or lines?

Potential issue with exporting content

Hi, I have a potential issue with exporting content. Was wondering if you consider this a bug or an issue with my setup. Stacktrace below...

> bash run.sh -s https://my-solr-host.com/solr/my-solr-collection -a export -o tmp/output.json --uniqueKey id
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building import-export 1.0
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- exec-maven-plugin:1.6.0:java (default-cli) @ import-export ---
10:20:10,456 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.groovy]
10:20:10,456 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml]
10:20:10,456 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [file:/Users/simonharper/www/dk-solrcloud-persist/target/classes/logback.xml]
10:20:10,501 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - Setting ReconfigureOnChangeFilter scanning period to 30 seconds
10:20:10,501 |-INFO in ReconfigureOnChangeFilter{invocationCounter=0} - Will scan for changes in [[/Users/simonharper/www/dk-solrcloud-persist/target/classes/logback.xml]] every 30 seconds.
10:20:10,501 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - Adding ReconfigureOnChangeFilter as a turbo filter
10:20:10,507 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
10:20:10,508 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [STDOUT]
10:20:10,518 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
10:20:10,535 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting level of logger [org.apache.zookeeper] to INFO
10:20:10,535 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting additivity of logger [org.apache.zookeeper] to false
10:20:10,535 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[org.apache.zookeeper]
10:20:10,536 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting level of logger [org.apache.http] to INFO
10:20:10,536 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting additivity of logger [org.apache.http] to false
10:20:10,536 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[org.apache.http]
10:20:10,536 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting level of logger [it.damore.solr] to DEBUG
10:20:10,536 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting additivity of logger [it.damore.solr] to false
10:20:10,536 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[it.damore.solr]
10:20:10,536 |-INFO in ch.qos.logback.classic.joran.action.RootLoggerAction - Setting level of ROOT logger to INFO
10:20:10,536 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[ROOT]
10:20:10,536 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - End of configuration.
10:20:10,537 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@1d63f4f2 - Registering current configuration as safe fallback point
10:20:10.761 [it.damore.solr.importexport.App.main()] INFO  i.d.s.i.c.ConfigFactory - Current configuration Config [actionType=EXPORT, solrUrl=https://solr.dkforeveryone.com/solr/tdk, fileName=tmp/tdk--20171207.json, deleteAll=false, skipFieldsSet=[], filterQuery=null, uniqueKey=id, dryRun=false]
10:20:10.765 [it.damore.solr.importexport.App.main()] INFO  i.d.s.i.App - Found config: Config [actionType=EXPORT, solrUrl=https://solr.dkforeveryone.com/solr/tdk, fileName=tmp/tdk--20171207.json, deleteAll=false, skipFieldsSet=[], filterQuery=null, uniqueKey=id, dryRun=false]
10:20:11.208 [it.damore.solr.importexport.App.main()] INFO  i.d.s.i.App - Found 3356 documents
10:20:11.208 [it.damore.solr.importexport.App.main()] INFO  i.d.s.i.App - Creating tmp/tdk--20171207.json
[WARNING]
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at https://solr.dkforeveryone.com/solr/tdk: Cursor functionality requires a sort containing a uniqueKey field tie breaker
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod (HttpSolrClient.java:560)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request (HttpSolrClient.java:234)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request (HttpSolrClient.java:226)
    at org.apache.solr.client.solrj.SolrRequest.process (SolrRequest.java:135)
    at org.apache.solr.client.solrj.SolrClient.query (SolrClient.java:943)
    at org.apache.solr.client.solrj.SolrClient.query (SolrClient.java:958)
    at it.damore.solr.importexport.App.readAllDocuments (App.java:283)
    at it.damore.solr.importexport.App.main (App.java:110)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:282)
    at java.lang.Thread.run (Thread.java:748)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.431 s
[INFO] Finished at: 2017-12-07T10:20:11Z
[INFO] Final Memory: 20M/312M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.6.0:java (default-cli) on project import-export: An exception occured while executing the Java class. Error from server at https://solr.dkforeveryone.com/solr/tdk: Cursor functionality requires a sort containing a uniqueKey field tie breaker -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

Export being attempted using 'id' as the unique key.
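
For reference, the server message above describes what Solr expects from a cursor request: a cursorMark parameter (the first page uses *) together with a sort that includes the uniqueKey field. The sketch below only builds such a query string. It does not diagnose why the request in this report lacked the sort. The class CursorQuerySketch and the method cursorQuery are hypothetical names.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class CursorQuerySketch {

    // Builds the query string of a cursor-based select request that sorts
    // on the uniqueKey field, as the error message asks for.
    static String cursorQuery(String uniqueKey, String cursorMark, int rows) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");
        params.put("rows", Integer.toString(rows));
        params.put("sort", uniqueKey + " asc");   // uniqueKey acts as the tie breaker
        params.put("cursorMark", cursorMark);     // "*" requests the first page
        return params.entrySet().stream()
                .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        // prints q=*%3A*&rows=5000&sort=id+asc&cursorMark=*
        System.out.println(cursorQuery("id", "*", 5000));
    }
}
```

Each response carries a nextCursorMark value that is passed as cursorMark in the following request. The help output above also lists -C/--disableCursors as a way to read without cursors.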

Logging is broken in latest version

My organization recently had to rebuild the docker container we use to run our SOLR backup. We now see this when the container starts:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

As our backup takes some time to run, we like to see the progress. This logging error turns off all logging. We had to revert to an image built earlier this year.

Had to build with solr 7.7.2 and jackson 2.9.6

I could not build the project out of the box due to error 1. I tried switching to Solr 8.3.1 and received error 2. I was finally able to build by using Solr 7.7.2 and pinning Jackson at 2.9.6.

Build errors:

1/ [ERROR] Failed to execute goal on project import-export: Could not resolve dependencies for project it.damore.solr:import-export:jar:1.0: Failed to collect dependencies at org.apache.solr:solr-test-framework:jar:8.1.1 -> com.jayway.jsonpath:json-path:jar:2.4.0: Failed to read artifact descriptor for com.jayway.jsonpath:json-path:jar:2.4.0: Failure to find org.eclipse.sensinact.gateway.nthbnd:parent:pom:1.5-SNAPSHOT in http://mvn-repo.wvrgroup.internal/maven was cached in the local repository, resolution will not be reattempted until the update interval of homeaway has elapsed or updates are forced -> [Help 1]

2/ [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.1:compile (default-compile) on project import-export: Compilation failure: Compilation failure:
[ERROR] /Users/jgoyer/work/workspace-137-solr-exim/solr-import-export-json/src/main/java/it/damore/solr/importexport/App.java:[332,17] cannot find symbol
[ERROR] symbol: method setDateFormat(java.text.DateFormat)
[ERROR] location: variable objectMapper of type com.fasterxml.jackson.databind.ObjectMapper
[ERROR] /Users/jgoyer/work/workspace-137-solr-exim/solr-import-export-json/src/main/java/it/damore/solr/importexport/App.java:[333,17] cannot find symbol
[ERROR] symbol: method configure(com.fasterxml.jackson.databind.SerializationFeature,boolean)
[ERROR] location: variable objectMapper of type com.fasterxml.jackson.databind.ObjectMapper
[ERROR] -> [Help 1]

Issue with default DateTimeFormat

First of all, thank you for the very useful project!

I noticed an issue when importing dates: some dates after the 28th of December are set one year ahead.

The default Solr datetime format is currently set to "YYYY-MM-dd'T'HH:mm:sss'Z'". The YYYY is the cause of the issue here, as it actually denotes the week year, that is, the year that the date's calendar week belongs to (more details here: https://dangoldin.com/2019/01/06/javas-simpledateformat-yyyy-vs-yyyy/)

The correct alternative should be: "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'" (also adding support for milliseconds).
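
The difference can be reproduced with a few lines of plain Java. This is a small demonstration, assuming Locale.US week rules and the UTC time zone. The class WeekYearDemo and the helper reformat are hypothetical names, not code from the project.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class WeekYearDemo {

    // Pattern used to read the input timestamp (calendar year, milliseconds)
    static final String ISO_UTC = "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'";

    static SimpleDateFormat utcFormat(String pattern) {
        SimpleDateFormat f = new SimpleDateFormat(pattern, Locale.US);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f;
    }

    // Parses an ISO UTC timestamp and formats it again with the given pattern.
    static String reformat(String pattern, String isoUtc) {
        try {
            Date date = utcFormat(ISO_UTC).parse(isoUtc);
            return utcFormat(pattern).format(date);
        } catch (ParseException e) {
            throw new IllegalArgumentException("not an ISO UTC timestamp: " + isoUtc, e);
        }
    }

    public static void main(String[] args) {
        String input = "2019-12-30T10:15:30.123Z";
        // Week year: 30 December 2019 falls in the week that contains 1 January 2020
        System.out.println(reformat("YYYY-MM-dd'T'HH:mm:ss.SSS'Z'", input)); // 2020-12-30T10:15:30.123Z
        // Calendar year: the date is unchanged
        System.out.println(reformat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'", input)); // 2019-12-30T10:15:30.123Z
    }
}
```

Dates in the middle of the year print identically with both patterns, which is why the problem only shows up around the turn of the year.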

Fails to build, ubuntu 20.04

When I try running mvn clean package I get the following error

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.12.1:compile (default-compile) on project import-export: Fatal error compiling: invalid flag: --release -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.12.1:compile (default-compile) on project import-export: Fatal error compiling
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:215)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:957)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:289)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:193)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
Caused by: org.apache.maven.plugin.MojoExecutionException: Fatal error compiling
    at org.apache.maven.plugin.compiler.AbstractCompilerMojo.execute (AbstractCompilerMojo.java:1191)
    at org.apache.maven.plugin.compiler.CompilerMojo.execute (CompilerMojo.java:212)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:137)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:957)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:289)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:193)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
Caused by: org.codehaus.plexus.compiler.CompilerException: invalid flag: --release
    at org.codehaus.plexus.compiler.javac.JavaxToolsCompiler.compileInProcess (JavaxToolsCompiler.java:179)
    at org.codehaus.plexus.compiler.javac.JavacCompiler.performCompile (JavacCompiler.java:169)
    at org.apache.maven.plugin.compiler.AbstractCompilerMojo.execute (AbstractCompilerMojo.java:1188)
    at org.apache.maven.plugin.compiler.CompilerMojo.execute (CompilerMojo.java:212)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:137)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:957)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:289)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:193)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
Caused by: java.lang.IllegalArgumentException: invalid flag: --release
    at com.sun.tools.javac.api.JavacTool.processOptions (JavacTool.java:206)
    at com.sun.tools.javac.api.JavacTool.getTask (JavacTool.java:156)
    at com.sun.tools.javac.api.JavacTool.getTask (JavacTool.java:107)
    at com.sun.tools.javac.api.JavacTool.getTask (JavacTool.java:64)
    at org.codehaus.plexus.compiler.javac.JavaxToolsCompiler.compileInProcess (JavaxToolsCompiler.java:125)
    at org.codehaus.plexus.compiler.javac.JavacCompiler.performCompile (JavacCompiler.java:169)
    at org.apache.maven.plugin.compiler.AbstractCompilerMojo.execute (AbstractCompilerMojo.java:1188)
    at org.apache.maven.plugin.compiler.CompilerMojo.execute (CompilerMojo.java:212)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:137)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:957)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:289)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:193)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[ERROR]
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

Any ideas?

How fast is an export of all collection data?

Hi,
I have two questions.
First: how fast is an export of all collection data if the collection has 1 billion records with 10 fields each and is 1 TB in size?

Second: does the tool send requests to Solr to do the work? I ask because some solutions I have found read the Lucene data directly and transform it to JSON.

I hope you can respond. Thanks very much.
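
As a rough sizing aid for the first question: the parameter list describes a Solr URL, a block size (default 5000 documents) and Solr cursors, which suggests the export pages through Solr in blocks rather than reading Lucene files directly. Under that assumption, the number of page requests is simple arithmetic. The sketch below is not part of the tool; the class name, method names and the throughput figure are hypothetical and would need to be measured on a real cluster.

```java
public class ExportSizing {

    // Page requests needed to read totalDocs in blocks of blockSize (ceiling division).
    static long pageRequests(long totalDocs, int blockSize) {
        if (totalDocs < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("totalDocs must be >= 0 and blockSize > 0");
        }
        return (totalDocs + blockSize - 1) / blockSize;
    }

    // Estimated duration in hours for an assumed, measured-elsewhere throughput.
    static double estimatedHours(long totalDocs, double docsPerSecond) {
        if (docsPerSecond <= 0) {
            throw new IllegalArgumentException("docsPerSecond must be > 0");
        }
        return totalDocs / docsPerSecond / 3600.0;
    }

    public static void main(String[] args) {
        long totalDocs = 1_000_000_000L;        // from the question above
        int blockSize = 5000;                   // default of -b in the parameter list
        double assumedDocsPerSecond = 10_000.0; // hypothetical, not a measured value

        System.out.println(pageRequests(totalDocs, blockSize) + " page requests");
        System.out.printf("about %.1f hours at %.0f docs/s%n",
                estimatedHours(totalDocs, assumedDocsPerSecond), assumedDocsPerSecond);
    }
}
```

With the default block size, 1 billion documents come to 200,000 page requests. The elapsed time depends entirely on the throughput a given Solr instance sustains, so a timed export of a small filtered subset (the -f option) gives a better estimate than any fixed figure.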

date format

In App.java line 256, should it be the following instead?
DateFormat df = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
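
To illustrate why the pattern letters matter: in SimpleDateFormat, `yyyy` is the calendar year while `YYYY` is the week year, and `sss` pads seconds to three digits rather than printing milliseconds. The sketch below compares the pattern proposed above with the pattern `YYYY-MM-dd'T'HH:mm:sss'Z'` that appears in a log quoted elsewhere on this page. It does not reproduce App.java; the class name, method names and sample date are made up for the demonstration.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

public class DatePatternDemo {

    // Format a date in UTC with a fixed locale so the result does not depend on the machine.
    static String format(String pattern, Date date) {
        SimpleDateFormat df = new SimpleDateFormat(pattern, Locale.US);
        df.setTimeZone(TimeZone.getTimeZone("UTC"));
        return df.format(date);
    }

    // 2018-12-31 23:59:05 UTC: a date whose week year differs from its calendar year.
    static Date sampleDate() {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"), Locale.US);
        cal.clear();
        cal.set(2018, Calendar.DECEMBER, 31, 23, 59, 5);
        return cal.getTime();
    }

    public static void main(String[] args) {
        Date d = sampleDate();
        // Calendar year and two-digit seconds.
        System.out.println(format("yyyy-MM-dd'T'HH:mm:ss'Z'", d));  // 2018-12-31T23:59:05Z
        // Week year and seconds padded to three digits.
        System.out.println(format("YYYY-MM-dd'T'HH:mm:sss'Z'", d)); // 2019-12-31T23:59:005Z
    }
}
```

The second pattern only goes wrong for dates near a year boundary, which makes the problem easy to miss in testing.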

NullPointerException in readAllDocuments()

When I run the first example,
./run.sh -s http://localhost:8983/solr/collection -a export -o /tmp/collection.json

I get the following error:

15:07:40.329 [main] INFO  i.d.s.i.App - Found config: CommandLineConfig{actionType=EXPORT, solrUrl='http://localhost:8983/solr/collection', fileName='/tmp/collection.json', deleteAll=false, disableCursors=false, skipFieldSet=[], includeFieldSet=[], filterQuery='null', uniqueKey='null', dryRun=false, skipCount=0, commitAfter=null, blockSize=5000, dateTimeFormat='YYYY-MM-dd'T'HH:mm:sss'Z'', user='null', password='**********'}
15:07:40.348 [main] WARN  i.d.s.i.App - unable to find valid uniqueKey defaulting to "id".
Exception in thread "main" java.lang.NullPointerException
	at it.damore.solr.importexport.App.readAllDocuments(App.java:324)
	at it.damore.solr.importexport.App.main(App.java:116)

I'm using Java 8 and running the master branch of this project. (To be clear, I've replaced "collection" with a real collection name when I get this error.)

Note that the following command succeeds (no NPE):
./run.sh -s http://localhost:8983/solr/collection -a export -o /tmp/collection.json --user= --password=

create a release

It would be good to create a release so people can use this without having to build it.

Issue building on CentOS

[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /opt/solr-import-export-json/src/main/java/it/damore/solr/importexport/App.java:[32,36] cannot access org.apache.solr.client.solrj.SolrQuery
bad class file: /root/.m2/repository/org/apache/solr/solr-solrj/9.0.0/solr-solrj-9.0.0.jar(org/apache/solr/client/solrj/SolrQuery.class)
class file has wrong version 55.0, should be 52.0
Please remove or make sure it appears in the correct subdirectory of the classpath.
[INFO] 1 error
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.216 s
[INFO] Finished at: 2022-06-20T01:47:02Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.10.1:compile (default-compile) on project import-export: Compilation failure
[ERROR] /opt/solr-import-export-json/src/main/java/it/damore/solr/importexport/App.java:[32,36] cannot access org.apache.solr.client.solrj.SolrQuery
[ERROR] bad class file: /root/.m2/repository/org/apache/solr/solr-solrj/9.0.0/solr-solrj-9.0.0.jar(org/apache/solr/client/solrj/SolrQuery.class)
[ERROR] class file has wrong version 55.0, should be 52.0
[ERROR] Please remove or make sure it appears in the correct subdirectory of the classpath.
[ERROR]
[ERROR] -> [Help 1]

Not even sure where to start.

I agree, a pre-built version would be awesome.
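
For readers hitting the same error: class-file version 55.0 corresponds to Java 11 and 52.0 to Java 8, so the message likely means the build ran on a Java 8 JDK while solr-solrj 9.0.0 was compiled for Java 11 (the install notes list java 11 as a dependency). The sketch below shows one way to check which class-file version the running JVM supports. The class and method names are hypothetical; only the `java.class.version` system property is a standard JVM feature.

```java
public class ClassFileVersion {

    // For class-file major versions 49 and above, the Java release is major - 44
    // (for example 52 -> Java 8, 55 -> Java 11).
    static int javaRelease(int classFileMajor) {
        if (classFileMajor < 49) {
            throw new IllegalArgumentException("mapping only applies to major versions >= 49");
        }
        return classFileMajor - 44;
    }

    // Extract the major part from a version string such as "55.0".
    static int parseMajor(String classVersion) {
        return Integer.parseInt(classVersion.trim().split("\\.")[0]);
    }

    public static void main(String[] args) {
        // Highest class-file version the running JVM can load.
        String running = System.getProperty("java.class.version");
        int runningRelease = javaRelease(parseMajor(running));
        int requiredRelease = javaRelease(parseMajor("55.0")); // version reported for the solr-solrj jar

        System.out.println("Running on Java " + runningRelease
                + ", dependency needs Java " + requiredRelease);
        if (runningRelease < requiredRelease) {
            System.out.println("Use a newer JDK for the build (check the output of mvn -version).");
        }
    }
}
```

Maven uses the JDK it finds through JAVA_HOME or the PATH, so a machine with several JDKs installed can still end up compiling with the older one.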

import json file

Could you please provide a sample JSON file that contains multiple JSON objects as input, each with a multi-valued field? For example:

[{"id": "1", "keywords_dict.en": ["losing money", "businessman and wins", "stock exchange trading", "stock traders", "new york stock exchange", "global financial crisis", "financial traders", "stock market"]}, {"id": "2", "keywords_dict.en": ["woman", "gift"]}]

Empty file on export without errors

Hello,

Thanks for sharing your tool on Github!

I ran a first test on Windows and then on a Linux server.

The first collection export (1 million docs) finished without visible errors, but a second export of another collection (2.5 million docs) produced a 0-byte file, also without errors. I checked file permissions, parameters, etc.

Example of what I see, anonymised a little bit:

image

The first file was generated without error, but I suspect that something is wrong, as the same file on Windows is 7 GB (instead of the 2.9 GB that I see on Linux).

I suspect a memory problem, as the system seems to be tight on memory. I will report back if I am able to find an explanation.

Eric
