typo3-solr / nutch-typo3-cms
Apache Nutch plugins for TYPO3 CMS
License: Apache License 2.0
Hi there!
I would like to ask whether the project is still maintained.
TYPO3 7? TYPO3 8?
Support for newer Solr versions?
New versions of Nutch?
Thank you very much for your answer.
Best regards,
Fedir
Hi,
I've tried to build on Ubuntu 16.04 Server LTS, but ant throws an error. Would you please give me a hint as to what's going wrong?
Thanks in advance,
Jan
Buildfile: /home/xyz/Downloads/nutch-typo3-cms-master/build.xml
init:
[mkdir] Created dir: /home/xyz/Downloads/nutch-typo3-cms-master/build
[mkdir] Created dir: /home/xyz/Downloads/nutch-typo3-cms-master/build/nutch
[mkdir] Created dir: /home/xyz/Downloads/nutch-typo3-cms-master/build/dist
compile-checkout-src:
[exec] A build/nutch/lib
[exec] A build/nutch/lib/native
[exec] A build/nutch/ivy
[exec] A build/nutch/src
[exec] A build/nutch/src/plugin
[exec] A build/nutch/src/plugin/index-more
[exec] A build/nutch/src/plugin/index-more/src
[exec] A build/nutch/src/plugin/index-more/src/test
[exec] A build/nutch/src/plugin/index-more/src/test/org
[exec] A build/nutch/src/plugin/index-more/src/test/org/apache
[exec] A build/nutch/src/plugin/index-more/src/test/org/apache/nutch
[exec] A build/nutch/src/plugin/index-more/src/test/org/apache/nutch/indexer
[exec] A build/nutch/src/plugin/index-more/src/test/org/apache/nutch/indexer/more
[exec] A build/nutch/src/plugin/index-more/src/java
[exec] A build/nutch/src/plugin/index-more/src/java/org
[exec] A build/nutch/src/plugin/index-more/src/java/org/apache
[exec] A build/nutch/src/plugin/index-more/src/java/org/apache/nutch
[exec] A build/nutch/src/plugin/index-more/src/java/org/apache/nutch/indexer
[exec] A build/nutch/src/plugin/index-more/src/java/org/apache/nutch/indexer/more
[exec] A build/nutch/src/plugin/parse-ext
[exec] A build/nutch/src/plugin/parse-ext/src
[exec] A build/nutch/src/plugin/parse-ext/src/test
[exec] A build/nutch/src/plugin/parse-ext/src/test/org
[exec] A build/nutch/src/plugin/parse-ext/src/test/org/apache
[exec] A build/nutch/src/plugin/parse-ext/src/test/org/apache/nutch
[exec] A build/nutch/src/plugin/parse-ext/src/test/org/apache/nutch/parse
[exec] A build/nutch/src/plugin/parse-ext/src/test/org/apache/nutch/parse/ext
[exec] A build/nutch/src/plugin/parse-ext/src/java
[exec] A build/nutch/src/plugin/parse-ext/src/java/org
[exec] A build/nutch/src/plugin/parse-ext/src/java/org/apache
[exec] A build/nutch/src/plugin/parse-ext/src/java/org/apache/nutch
[exec] A build/nutch/src/plugin/parse-ext/src/java/org/apache/nutch/parse
[exec] A build/nutch/src/plugin/parse-ext/src/java/org/apache/nutch/parse/ext
[exec] A build/nutch/src/plugin/urlnormalizer-pass
[exec] A build/nutch/src/plugin/urlnormalizer-pass/src
[exec] A build/nutch/src/plugin/urlnormalizer-pass/src/test
[exec] A build/nutch/src/plugin/urlnormalizer-pass/src/test/org
[exec] A build/nutch/src/plugin/urlnormalizer-pass/src/test/org/apache
[exec] A build/nutch/src/plugin/urlnormalizer-pass/src/test/org/apache/nutch
[exec] A build/nutch/src/plugin/urlnormalizer-pass/src/test/org/apache/nutch/net
[exec] A build/nutch/src/plugin/urlnormalizer-pass/src/test/org/apache/nutch/net/urlnormalizer
[exec] A build/nutch/src/plugin/urlnormalizer-pass/src/test/org/apache/nutch/net/urlnormalizer/pass
[exec] A build/nutch/src/plugin/urlnormalizer-pass/src/java
[exec] A build/nutch/src/plugin/urlnormalizer-pass/src/java/org
[exec] A build/nutch/src/plugin/urlnormalizer-pass/src/java/org/apache
[exec] A build/nutch/src/plugin/urlnormalizer-pass/src/java/org/apache/nutch
[exec] A build/nutch/src/plugin/urlnormalizer-pass/src/java/org/apache/nutch/net
[exec] A build/nutch/src/plugin/urlnormalizer-pass/src/java/org/apache/nutch/net/urlnormalizer
[exec] A build/nutch/src/plugin/urlnormalizer-pass/src/java/org/apache/nutch/net/urlnormalizer/pass
[exec] A build/nutch/src/plugin/parse-html
[exec] A build/nutch/src/plugin/parse-html/src
[exec] A build/nutch/src/plugin/parse-html/src/test
[exec] A build/nutch/src/plugin/parse-html/src/test/org
[exec] A build/nutch/src/plugin/parse-html/src/test/org/apache
[exec] A build/nutch/src/plugin/parse-html/src/test/org/apache/nutch
[exec] A build/nutch/src/plugin/parse-html/src/test/org/apache/nutch/parse
[exec] A build/nutch/src/plugin/parse-html/src/test/org/apache/nutch/parse/html
[exec] A build/nutch/src/plugin/parse-html/src/java
[exec] A build/nutch/src/plugin/parse-html/src/java/org
[exec] A build/nutch/src/plugin/parse-html/src/java/org/apache
[exec] A build/nutch/src/plugin/parse-html/src/java/org/apache/nutch
[exec] A build/nutch/src/plugin/parse-html/src/java/org/apache/nutch/parse
[exec] A build/nutch/src/plugin/parse-html/src/java/org/apache/nutch/parse/html
[exec] A build/nutch/src/plugin/protocol-httpclient
[exec] A build/nutch/src/plugin/protocol-httpclient/src
[exec] A build/nutch/src/plugin/protocol-httpclient/src/java
[exec] A build/nutch/src/plugin/protocol-httpclient/src/java/org
[exec] A build/nutch/src/plugin/protocol-httpclient/src/java/org/apache
[exec] A build/nutch/src/plugin/protocol-httpclient/src/java/org/apache/nutch
[exec] A build/nutch/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol
[exec] A build/nutch/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient
[exec] A build/nutch/src/plugin/parse-html/ivy.xml
[exec] A build/nutch/src/plugin/parse-html/plugin.xml
[exec] A build/nutch/CHANGES.txt
[exec] A build/nutch/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/Http.java
[exec] svn: E200014: Checksum mismatch for '/home/siepmannj/Downloads/nutch-typo3-cms-master/build/nutch/lib/native/README.txt':
[exec] expected: 2c629b324b2e63e16b28d635ae1e384a
[exec] actual: 2703395cb677b36bbe04f0e868a10d2b
[exec]
[exec] Result: 1
compile-patch-nutch:
[patch] can't find file to patch at input line 5
[patch] Perhaps you used the wrong -p or --strip option?
[patch] The text leading up to this was:
[patch] --------------------------
[patch] |Index: conf/nutch-default.xml
[patch] |===================================================================
[patch] |--- conf/nutch-default.xml (Revision 1380217)
[patch] |+++ conf/nutch-default.xml (Arbeitskopie)
[patch] --------------------------
[patch] File to patch:
[patch] Skip this patch? [y]
[patch] Skipping patch.
[patch] 1 out of 1 hunk ignored
[patch] can't find file to patch at input line 39
[patch] Perhaps you used the wrong -p or --strip option?
[patch] The text leading up to this was:
[patch] --------------------------
[patch] |Index: src/plugin/parse-html/src/java/org/apache/nutch/parse/html/HtmlParser.java
[patch] |===================================================================
[patch] |--- src/plugin/parse-html/src/java/org/apache/nutch/parse/html/HtmlParser.java (Revision 1380217)
[patch] |+++ src/plugin/parse-html/src/java/org/apache/nutch/parse/html/HtmlParser.java (Arbeitskopie)
[patch] --------------------------
[patch] File to patch:
[patch] Skip this patch? [y]
[patch] Skipping patch.
[patch] 6 out of 6 hunks ignored
BUILD FAILED
/home/xyz/Downloads/nutch-typo3-cms-master/build.xml:59: 'patch' failed with exit code 1
Total time: 26 seconds
https://github.com/features/actions
https://github.com/actions/setup-java
Set up GitHub Actions for this repo as shown in the links above.
With the update from Nutch 1.8 to Nutch 1.19, this patch had to be taken out as it was breaking the build process.
It should be implemented again.
The blog post "Release des Nutch-TYPO3-CMS Plugins 2.3.0" refers, in the section "Aufsetzen der solr-ddev-site", to several ddev ssh commands which are missing.
Maybe the blog post could be updated so that it also works for newbies who are not so deep into Solr and Nutch.
I then tried to run the ddev command solr:examples:nutch, but that has issues as well:
The Nutch plugin 2.3.0 is only available as a zip file on GitHub, but the script refers to a tar.gz version. => that's fixed already, thx!
After manual unpacking and running the script again I end up with this error:
/mnt/ddev_config/commands/web/examples-nutch: line 84: cd: /var/www/html/.ddev/nutch/logs: No such file or directory
For the logs problem I already provided a pull request: TYPO3-Solr/solr-ddev-site#47
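A failing `cd` like the one above can generally be avoided by creating the directory before entering it. The following is only a minimal sketch of that idea, not necessarily what the pull request does; the function name `enter_log_dir` is hypothetical and not part of the ddev command script.

```shell
# Hypothetical helper (not from the solr-ddev-site script):
# create the log directory if it is missing, then change into it.
enter_log_dir() {
  # mkdir -p succeeds even when the directory already exists
  mkdir -p "$1" && cd "$1"
}

# Possible usage inside the script, with the path from the error message:
#   enter_log_dir /var/www/html/.ddev/nutch/logs
```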
Injecting seed URLs
/var/www/html/.ddev/nutch/bin/nutch inject -Dmapreduce.job.reduces=2 -Dmapreduce.reduce.speculative=false -Dmapreduce.map.speculative=false -Dmapreduce.map.output.compress=true shop.dkd.de/crawldb /var/www/html/.ddev/nutch/urls/seed.txt
/var/www/html/.ddev/nutch/bin/nutch: line 184: /usr/lib/jvm/java-11-openjdk-amd64/bin/java: No such file or directory
/var/www/html/.ddev/nutch/bin/nutch: line 333: /usr/lib/jvm/java-11-openjdk-amd64/bin/java: No such file or directory
Error running:
/var/www/html/.ddev/nutch/bin/nutch inject -Dmapreduce.job.reduces=2 -Dmapreduce.reduce.speculative=false -Dmapreduce.map.speculative=false -Dmapreduce.map.output.compress=true shop.dkd.de/crawldb /var/www/html/.ddev/nutch/urls/seed.txt
Failed with exit value 127.
The reason for the second problem may be that I'm using DDEV on an Apple M2 Pro. There the path is /usr/lib/jvm/java-11-openjdk-arm64 instead of /usr/lib/jvm/java-11-openjdk-amd64.
I'll try to come up with a fix for that as well.
When I fix the path, the script works!
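One possible way to avoid hard-coding the architecture is to derive the JDK directory from the machine type. This is a rough sketch under assumptions: the function names `jdk_arch_suffix` and `java_home_for` are hypothetical, and it assumes the JDK lives under the Debian-style paths quoted above. It is not the fix that was actually applied to the script.

```shell
# Hypothetical helper: map a machine type (as printed by `uname -m`)
# to the suffix used in the JDK directory names quoted above.
jdk_arch_suffix() {
  case "$1" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: build the JAVA_HOME path for a given machine type.
java_home_for() {
  suffix="$(jdk_arch_suffix "$1")" || return 1
  echo "/usr/lib/jvm/java-11-openjdk-${suffix}"
}

# Export JAVA_HOME only when the architecture is recognized.
if JAVA_HOME="$(java_home_for "$(uname -m)")"; then
  export JAVA_HOME
fi
```

On an Apple Silicon host the container reports `aarch64`, so the sketch would select the `arm64` directory; on an Intel or AMD host it would select `amd64`.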