
typo3-solr / ext-solr


A TYPO3 extension that integrates the Apache Solr search server with TYPO3 CMS. dkd Internet Service GmbH is developing the extension. Community contributions are welcome. See CONTRIBUTING.md for details.

License: GNU General Public License v3.0

PHP 81.76% HTML 2.78% CSS 0.27% JavaScript 14.59% Shell 0.54% Dockerfile 0.04% Makefile 0.02%
solr typo3-cms php search cms cms-extension typo3 typo3-cms-extension ext backend

ext-solr's Introduction


Apache Solr for TYPO3 CMS

A TYPO3 extension that integrates the Apache Solr Enterprise Search Server into the TYPO3 CMS.

This extension serves as a base module that covers the most frequently used functionalities.

Additional features can be obtained through the following free add-ons:

  1. Apache Tika for TYPO3
  2. Apache Solr for TYPO3 - More Like This

and many more by searching for "solr" in the TYPO3 Extension Repository (TER).

If you need access to additional features, consider becoming a funding partner of the program. Further details, including a comparison chart, are available on the program homepage.

URL
Repository: https://github.com/TYPO3-Solr/ext-solr
Read online: https://docs.typo3.org/p/apache-solr-for-typo3/solr/main/en-us/
TER: https://extensions.typo3.org/extension/solr
Homepage: https://www.typo3-solr.com/
Fund: https://shop.dkd.de/Produkte/Apache-Solr-fuer-TYPO3/

Powered by the TYPO3 community and

dkd Internet Service GmbH

ext-solr's People

Contributors

3l73, astehlik, bmack, danielsiepmann, dkd-dobberkau, dkd-friedrich, dkd-kaehm, dkd-private-packagist, dkd-schmidt, dmitryd, frans-beech-it, froemken, georgringer, goldi42, gordon81, irnnr, jacobsenj, liwo, mabahe, markuskobligk, neufeind, nxpthx, peterkraume, sascha-egerer, saschanowak, sfroemkenjw, sgalinski, spoonerweb, thomashohn, timohund


ext-solr's Issues

Use inline language labels for frontend JavaScript

TYPO3 7 LTS introduced inline language labels for JavaScript. Certain components, such as the facets and the date picker, use JS-based language files.

Move those to language files and include them via inline language labels.

Example:

page = PAGE 
page.inlineLanguageLabelFiles {
  someLabels = EXT:myExt/Resources/Private/Language/locallang.xlf
  someLabels.selectionPrefix = idPrefix
  someLabels.stripFromSelectionName = strip_me
  someLabels.errorMode = 2
}

Status report crashes with exception when wrong solr port configured

Solr response does not appear to be valid JSON, please examine the raw response with getRawResponse() method

Apache_Solr_ParserException thrown in file
/var/www/7.5.0.local.typo3.org/typo3conf/ext/solr/Resources/Private/Php/SolrPhpClient/Apache/Solr/Response.php in line 189.

I think we should catch the exception and let the report fail.
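
The suggestion above could look roughly like the following sketch. It is not the extension's code: buildConnectionStatus(), ExampleSolrParserException, and the returned array keys are hypothetical names chosen for illustration, and the callable stands in for whatever call contacts Solr.

```php
<?php
// Hypothetical stand-in for the parser exception named in the report.
class ExampleSolrParserException extends Exception {}

/**
 * Hypothetical status check. $ping is any callable that contacts Solr
 * and may throw when the response is not valid JSON.
 */
function buildConnectionStatus(callable $ping): array
{
    try {
        $ping();
        return ['severity' => 'ok', 'message' => 'Solr responded.'];
    } catch (Exception $exception) {
        // Mark the check as failed instead of letting the exception
        // crash the whole status report.
        return ['severity' => 'error', 'message' => $exception->getMessage()];
    }
}

// Simulate a wrongly configured port that yields a non-JSON response.
$status = buildConnectionStatus(function () {
    throw new ExampleSolrParserException('Solr response does not appear to be valid JSON');
});
echo $status['severity'] . ': ' . $status['message'] . PHP_EOL;
```

With this shape, a bad port produces one failed entry in the report rather than an uncaught exception.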

Facet ViewHelper does not Fallback to current Page if no targetPage for Search is set

If plugin.tx_solr.search.targetPage is not set, the facet renderer's link target page ID is set to 0, which results in facets being rendered without any link.

Changing solr/Classes/ViewHelper/Facet.php line 87 from

$facetRenderer->setLinkTargetPageId($this->configuration['search.']['targetPage']);

to

$facetRenderer->setLinkTargetPageId($this->configuration['search.']['targetPage'] ? $this->configuration['search.']['targetPage'] : $GLOBALS['TSFE']->id);

fixes the problem.

Apart from that, if plugin.tx_solr.search.targetPage is set and the targetPage in the FlexForm of the plugin is also set, the FlexForm value is ignored and the default TypoScript value is used to render the link. I have not found a solution for that one yet.

[BUG] Frontend not rendering in TYPO3 version 7.5

This was fixed with a workaround in 18a57e6; we still need to find a solution for it.

Currently we get the following error in the frontend:

Fatal error: Class 'Tx_Solr_ViewHelper_Lll' not found in /var/www/7.5.0.local.typo3.org/typo3_src/typo3/sysext/core/Classes/Utility/GeneralUtility.php on line 4345

ViewHelpers are loaded by using the old class names

ApacheSolrForTypo3\Solr\Template::addViewHelper

uses the old ViewHelper classnames like Tx_Solr_ViewHelper_Lll

The issue is with loadViewHelper

Furthermore, the new ViewHelper class names shouldn't require manually including the .php file anymore, as the classes should already be loaded via the TYPO3 class loader.

Port documentation to rst

Port documentation from forge wiki to rst format.

(This is an easy task for anyone wanting to help out)

Namespace all the classes

Move all classes into namespaces. Namespace root is "ApacheSolrForTypo3\Solr".

  • Class names should not be changed
  • Priority is to have this work with TYPO3 6.2 first as a strict requirement.
  • If things work with TYPO3 7.x, that's cool, but optional.
  • TYPO3 4.5 specific code, class footers, and php end tags can be removed.

PRs should be roughly one per folder/namespace, with one commit per class, to keep reviews manageable.

The current goal is 3.5, but 3.1 would be fine, too.

Use &#124; instead of &#166; to escape pipe

Is there a special reason why you use &#166; (broken vertical bar) instead of &#124; (simple vertical bar) to escape the pipe symbol in Classes/Template.php?

Background to my question:
I crawl an external website with Nutch. The external website uses <title>site name | page title</title>. On the result page this is shown with an ugly broken vertical bar: site name ¦ page title.

PHP Error on Admin Tools > Search page

After going through the installation manual, I see the following PHP error:

PHP Catchable Fatal Error: Argument 1 passed to ApacheSolrForTypo3\Solr\Domain\Model\ModuleData::setSite() must be an instance of Tx_Solr_Site, null given, called in /webroot/typo3conf/ext/solr/Classes/Backend/SolrModule/AbstractModuleController.php on line 141 and defined in /webroot/typo3conf/ext/solr/Classes/Domain/Model/ModuleData.php line 50

Using TYPO3 6.2.14 and ext-solr 3.0.1.

(Attached the reports page section ... It is strange that there is no connection error because I put in a bad url on purpose.)


Shell script not working anymore

Hello,

your shell script install-solr-existing-tomcat.sh is not working anymore, because you removed solr-4.8.1 from your mirror.

pages:extendToSubpages not recognized

When I deactivate a page and additionally set the flag extendToSubpages to 1 (quite handy if I want to hide a big part of a website), its subpages will be deactivated, too.

But Solr still lists them in the search results, which leads to 404 errors.
Since the records of the subpages aren't changed, Solr doesn't notice this, and the scheduler task remains at 100%.

Clearing the index and reindexing all pages at least removes the hidden pages from the search results, but also leads to indexing errors for all those subpages:

1319116885: exception 'RuntimeException' with message 'Failed to execute Page Indexer Request. See log for details. Request ID: 55c1b8c640395' in /var/www/html/t3kons/typo3conf/ext/solr/Classes/IndexQueue/PageIndexerRequest.php:163
Stack trace:
#0 /var/www/html/t3kons/typo3conf/ext/solr/Classes/IndexQueue/PageIndexer.php(409): Tx_Solr_IndexQueue_PageIndexerRequest->send('https://www.kon...')
#1 /var/www/html/t3kons/typo3conf/ext/solr/Classes/IndexQueue/PageIndexer.php(55): Tx_Solr_IndexQueue_PageIndexer->getAccessGroupsFromContent(Object(Tx_Solr_IndexQueue_Item), 0)
#2 /var/www/html/t3kons/typo3conf/ext/solr/Scheduler/IndexQueueWorkerTask.php(115): Tx_Solr_IndexQueue_PageIndexer->index(Object(Tx_Solr_IndexQueue_Item))
#3 /var/www/html/t3kons/typo3conf/ext/solr/Scheduler/IndexQueueWorkerTask.php(78): Tx_Solr_Scheduler_IndexQueueWorkerTask->indexItem(Object(Tx_Solr_IndexQueue_Item))
#4 /var/www/html/t3kons/typo3conf/ext/solr/Scheduler/IndexQueueWorkerTask.php(57): Tx_Solr_Scheduler_IndexQueueWorkerTask->indexItems()
#5 /var/www/typo3/typo3-6.2.14/typo3/sysext/scheduler/Classes/Scheduler.php(148): Tx_Solr_Scheduler_IndexQueueWorkerTask->execute()
#6 /var/www/typo3/typo3-6.2.14/typo3/sysext/scheduler/Classes/Controller/SchedulerModuleController.php(873): TYPO3\CMS\Scheduler\Scheduler->executeTask(Object(Tx_Solr_Scheduler_IndexQueueWorkerTask))
#7 /var/www/typo3/typo3-6.2.14/typo3/sysext/scheduler/Classes/Controller/SchedulerModuleController.php(191): TYPO3\CMS\Scheduler\Controller\SchedulerModuleController->executeTasks()
#8 /var/www/typo3/typo3-6.2.14/typo3/sysext/scheduler/Classes/Controller/SchedulerModuleController.php(137): TYPO3\CMS\Scheduler\Controller\SchedulerModuleController->getModuleContent()
#9 /var/www/typo3/typo3-6.2.14/typo3/sysext/scheduler/mod1/index.php(22): TYPO3\CMS\Scheduler\Controller\SchedulerModuleController->main()
#10 /var/www/typo3/typo3-6.2.14/typo3/mod.php(32): require('/var/www/typo3/...')
#11 {main}

Problem with TYPO3 7.4 and Initialize Solr Connections

I installed the current Git version with TYPO3 7.4 and initialized the connection on the root page with 'Initialize Solr Connections'. I got the response 'Solr Connections initialized'. After switching to Admin Tools -> Search, there is an exception:
#1: PHP Catchable Fatal Error: Argument 1 passed to ApacheSolrForTypo3\Solr\Domain\Model\ModuleData::setSite() must be an instance of ApacheSolrForTypo3\Solr\Site, null given, called in […]/typo3conf/ext/solr/Classes/Backend/SolrModule/AbstractModuleController.php on line 146 and defined in […]/typo3conf/ext/solr/Classes/Domain/Model/ModuleData.php line 53 (More information)

Do you have any advice?
(I think there's no problem with the connection, the same Solr connection works with 3.0.0 and TYPO3 6.2.)

Provide Solr Docker containers through install script

Add install scripts that simply load/install ready-made Docker containers with the correct Solr version for the extension.

This can also help during development to quickly switch between multiple Solr versions.

Indexing of new records will crash if the name of the Indexing Queue Configuration is different from tablename

When your Solr configuration has an index queue item for a table:

plugin.tx_solr.index.queue {
    news {
        table = tt_news
        fields {
            ...
        }
    }
}

the indexing of a new record for this table will insert the wrong indexing configuration name (indexing_configuration) in the tx_solr_indexqueue_item table. It will insert "tt_news" instead of "news" and with this indexing_configuration the Indexer gets NULL instead of an array of the itemIndexingConfiguration because there is no configuration plugin.tx_solr.index.queue.tt_news.fields.

The problem is in the Hook "processDatamap_afterDatabaseOperations" of ext-solr/Classes/IndexQueue/RecordMonitor.php line 232:

if ($this->isEnabledRecord($recordTable, $record)) {
    $configurationName = NULL;
    if ($recordTable !== 'pages') {
/* #232 */      $configurationName = $this->getIndexingConfigurationName($table, $uid);
    }
    $this->indexQueue->updateItem($recordTable, $recordUid, $configurationName);
}

To get the correct indexing_configuration ($configurationName) you have to use $recordTable and $recordUid:

if ($this->isEnabledRecord($recordTable, $record)) {
    $configurationName = NULL;
    if ($recordTable !== 'pages') {
/* #232 Changed: */     $configurationName = $this->getIndexingConfigurationName($recordTable, $recordUid);
    }
    $this->indexQueue->updateItem($recordTable, $recordUid, $configurationName);
}

Do I see that right?

Lowercase synonyms

Since the lowercase filter is executed before the synonym filter, the synonyms have to be inserted in lowercase.

The example in the backend module should be fixed, a hint added and the synonyms automatically lowercased.
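
Automatic lowercasing could be sketched as follows. normalizeSynonyms() is a hypothetical helper, not part of the extension, and the sketch assumes the mbstring extension is available so that non-ASCII characters are lowercased correctly.

```php
<?php
/**
 * Hypothetical helper: lowercases a base word and its synonyms before
 * they are stored, so they match tokens that already passed the
 * lowercase filter. Duplicates created by lowercasing are dropped.
 */
function normalizeSynonyms(string $baseWord, array $synonyms): array
{
    $lower = function (string $value): string {
        // mb_strtolower handles multibyte characters such as umlauts.
        return mb_strtolower(trim($value), 'UTF-8');
    };

    $normalized = array_values(array_unique(array_map($lower, $synonyms)));

    return [$lower($baseWord), $normalized];
}

list($word, $synonyms) = normalizeSynonyms('Car', ['Automobile', 'VEHICLE', 'automobile']);
echo $word . ' => ' . implode(',', $synonyms) . PHP_EOL;
```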

Multiple Hooks for detectSerializedValue will override individual return value

Each hook for detectSerializedValue overrides the previously detected value. Even if the first hook returns TRUE, the result is still FALSE as long as the last hook returns FALSE.

if ($serializedValueDetector instanceof Tx_Solr_SerializedValueDetector) {
    $isSerialized = (boolean) $serializedValueDetector->isSerializedValue($indexingConfiguration, $solrFieldName);
}

Not sure what the best solution is:
1.) Only execute hook if $isSerialized is FALSE as most likely the value should be serialized, if one hook determines serialization
2.) Pass $isSerialized to hook
3.) Allow NULL return and do not override value if NULL
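
As an illustration of option 3 only, the sketch below lets a detector return NULL to abstain. The function name and the use of plain callables instead of Tx_Solr_SerializedValueDetector instances are assumptions made to keep the example self-contained.

```php
<?php
/**
 * Hypothetical variant of the hook loop: each detector returns TRUE,
 * FALSE, or NULL. NULL means "no opinion" and leaves the value
 * detected so far untouched.
 */
function detectSerializedValue(array $detectors, bool $default = false): bool
{
    $isSerialized = $default;
    foreach ($detectors as $detector) {
        $result = $detector();
        if ($result === null) {
            continue; // this detector abstains
        }
        $isSerialized = (bool) $result;
    }
    return $isSerialized;
}

// The first detector decides, the second one abstains.
$detected = detectSerializedValue([
    function () { return true; },
    function () { return null; },
]);
var_dump($detected);
```

A later hook that explicitly returns FALSE still wins under this scheme, so it only helps if hooks abstain for fields they do not handle.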

Content-length mismatch when using javascriptFiles loadIn footer

At the time of the Content-length header processing @ typo3/sysext/frontend/Classes/Controller/TypoScriptFrontendController.php:3905, the Javascript files added by solr are not within $this->content.

They are added after the fact, resulting in cut-off content with all clients respecting the Content-length header.

locallangXMLOverride doesn't work correctly in templates

Hey, I found a bug.
The feature to override an XML file doesn't work in templates.

$GLOBALS['TYPO3_CONF_VARS']['SYS']['locallangXMLOverride']['EXT:solr/Resources/Private/Language/PluginResults.xml'][] = 'EXT:test/Resources/Private/Extensions/Solr/Language/PluginResults.xml';

All markers of the form ###LLL:...### get the wrong XML file.
All markers handled with PHP ( ###RESULTS.SEARCHED_FOR### ) are fine.


typo3conf/ext/solr/Classes/PluginBase/PluginBase.php

$template->addViewHelper('LLL', array(
    'languageFile' => $GLOBALS['PATH_solr'] . 'Resources/Private/Language/' . str_replace('Pi', 'Plugin', $this->getPluginKey()) . '.xml',
    'llKey' => $this->LLkey
));

I know that I can handle this with TypoScript, but there are some problems with file encoding and Turkish characters.

Moving "Lib" to "Resources/Private/Php/Lib" (or similar) ?

I think it would make sense to move Lib to Resources/Private/Php/Lib (or similar): only the classes are needed, and moving it there would make it impossible to access anything inside the folder (e.g. PDFs or similar) directly from the web server (default htaccess rules).

HierarchicalFacetHelper renders wrong menu

Say we have a hierarchical ID structure like this:

array (
  '0-267' => [],
  '0-26' => [],
  '1-26/38' => [],
  '1-267/122' => [],
  '1-267/212' => [],
  '1-267/268' => []
)

The hierarchy will not be rendered properly, as the children of node 267 are also rendered under node 26.
The HierarchicalFacetHelper checks for isFirstPartOfStr, which is wrong in this case because "267" also begins with "26".
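
A minimal sketch of the difference, assuming the comparison is made on the path part after the level prefix (for example "267/122" from "1-267/122"). Both function names are hypothetical and only illustrate a plain prefix check versus a delimiter-aware one.

```php
<?php
// Plain prefix check, comparable to the behaviour described above.
function isPrefixNaive(string $parentPath, string $candidatePath): bool
{
    return strpos($candidatePath, $parentPath) === 0;
}

// Delimiter-aware check: the parent path must be followed by "/".
function isChildPath(string $parentPath, string $candidatePath): bool
{
    return strpos($candidatePath, rtrim($parentPath, '/') . '/') === 0;
}

var_dump(isPrefixNaive('26', '267/122')); // true: the false positive
var_dump(isChildPath('26', '267/122'));   // false: 267 is a different node
var_dump(isChildPath('267', '267/122'));  // true: a real child of 267
var_dump(isChildPath('26', '26/38'));     // true: a real child of 26
```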

AuthorizationService Frontend helper causes database error

The Tx_Solr_IndexQueue_FrontendHelper_AuthorizationService returns a user array that does not contain a value for uid.

This causes a problem when the AbstractUserAuthentication->createUserSession() method is called during the Frontend user initialization because the system tries to insert a NULL value in the ses_userid column:

Core: Session data could not be written to DB. Error: Column 'ses_userid' cannot be null

The solution would be to return a non-NULL uid value, e.g. 0, in the getUser() method.

TYPO3 7.5 "PHP Warning: array_flip() expects parameter 1 to be array, string given in ...." in Backend Module -> Index Queue

When clicking on Index Queue with Solr connections already initialized, I get the following error in the backend:

PHP Warning: array_flip() expects parameter 1 to be array, string given in /usr/www/users/fhpooz/typo3_src/typo3_src-7.5.0/typo3/sysext/backend/Classes/Form/Element/SelectCheckBoxElement.php line 45

The stack trace shows the following method call as being responsible for the warning:

ApacheSolrForTypo3\Solr\Backend\IndexingConfigurationSelectorField::renderSelectCheckbox(array, "")

I upgraded TYPO3 from 7.4 to 7.5 with an already initialized queue (pages only). Indexing and rendering seem to work as expected for now.

Request for Hook in UsedFacetRenderer

Request from Mickael VANCLOOSTER via Slack:

Hi guys. Is it possible to add a hook in the class UsedFacetRenderer, after $facetText is set, to create custom text for different or custom facet types? I do it like this and it works fine:

if (is_array($GLOBALS['TYPO3_CONF_VARS']['EXTCONF']['solr']['processUsedFacetText'])) {
    foreach ($GLOBALS['TYPO3_CONF_VARS']['EXTCONF']['solr']['processUsedFacetText'] as $classReference) {
        $params = array(
            'facetName' => $this->facetName,
            'facetValue' => $this->filterValue,
            'facetConfiguration' => $this->facetConfiguration
        );
        $procObj = &t3lib_div::getUserObj($classReference);
        $newText = $procObj->getUsedFacetText($params, $this);
        if (!empty($newText)) {
            $facetText = $newText;
        }
    }
}

Index title and alttext attributes on HTML pages

Currently, some attributes are not indexed, although they may contain content. Examples are the title and the alt attributes.

This could probably be done by parsing the HTML document using libxml/DOMDocument instead of doing magic with regular expressions.
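
The DOMDocument approach might look like the sketch below. extractTitleAndAltTexts() is a hypothetical function name, and the sketch assumes PHP's dom extension is installed. It only collects the attribute values; how they would be added to the indexed content is left open.

```php
<?php
/**
 * Hypothetical helper: collects the non-empty values of all title and
 * alt attributes in an HTML string.
 */
function extractTitleAndAltTexts(string $html): array
{
    $document = new DOMDocument();

    // Real-world markup is rarely valid, so collect libxml warnings
    // internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // Note: without an encoding declaration in the markup, loadHTML()
    // does not assume UTF-8.
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($document);
    $texts = [];
    foreach ($xpath->query('//@title | //@alt') as $attribute) {
        $value = trim($attribute->nodeValue);
        if ($value !== '') {
            $texts[] = $value;
        }
    }
    return $texts;
}

$texts = extractTitleAndAltTexts(
    '<p><a href="#" title="Home page">Start</a><img src="logo.png" alt="Company logo"></p>'
);
print_r($texts);
```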

Add \Tx_Solr_ExtractingQuery

EXT:tika v2.0 will need an extracting query. EXT:solrfile already provides that query class.

Add \Tx_Solr_ExtractingQuery to EXT:solr 3.0.2, then extend as a namespaced shell ApacheSolrForTypo3\Tika\Service\Tika\SolrCellQuery in EXT:tika 2.0

[BUG] enabling all cores from solr-example-all-languages.xml produces solr error

When I copy solr-example-all-languages.xml and the core configuration to a Solr server with version 4.10.4 and the Solr plugin, I get the following error:

SolrCore Initialization Failures

core_my: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load conf for core core_my: Plugin init failure for [schema.xml] fieldType "text": Plugin init failure for [schema.xml] analyzer/tokenizer: Error loading class 'solr.ICUTokenizerFactory'. Schema file is /var/solr/typo3cores/conf/burmese/schema.xml
core_km: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load conf for core core_km: Plugin init failure for [schema.xml] fieldType "text": Plugin init failure for [schema.xml] analyzer/tokenizer: Error loading class 'solr.ICUTokenizerFactory'. Schema file is /var/solr/typo3cores/conf/khmer/schema.xml
core_lo: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load conf for core core_lo: Plugin init failure for [schema.xml] fieldType "text": Plugin init failure for [schema.xml] analyzer/tokenizer: Error loading class 'solr.ICUTokenizerFactory'. Schema file is /var/solr/typo3cores/conf/lao/schema.xml
core_pl: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load conf for core core_pl: Plugin init failure for [schema.xml] fieldType "text": Plugin init failure for [schema.xml] analyzer/filter: Error loading class 'solr.StempelPolishStemFilterFactory'. Schema file is /var/solr/typo3cores/conf/polish/schema.xml 

Use SYS_LASTCHANGED for latest changes on pages

Right now, pages get indexed with the tstamp value as "changed" in the Solr index.
This means that only changes to the page record itself are considered, and facets for the age of pages are basically worthless if the actual content is not taken into account.

Wouldn't it be better if SYS_LASTCHANGED is used, since it considers both changes to pages and tt_content?

Editing

$document->setField('changed',     $pageRecord['SYS_LASTCHANGED']);

in /typo3conf/ext/solr/Classes/Typo3PageIndexer.php, line 242 solved the problem for me.
I'm not sure if that was all there was to do.
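
One open point is what to do when SYS_LASTCHANGED is empty or older than tstamp. The sketch below takes the newer of the two values. This is an assumption about desirable behaviour, not something the report confirms, and resolveChangedTimestamp() is a hypothetical helper.

```php
<?php
/**
 * Hypothetical helper: picks the value for the "changed" field from a
 * page record, using whichever of tstamp and SYS_LASTCHANGED is newer.
 */
function resolveChangedTimestamp(array $pageRecord): int
{
    $recordChanged = isset($pageRecord['tstamp']) ? (int) $pageRecord['tstamp'] : 0;
    $contentChanged = isset($pageRecord['SYS_LASTCHANGED']) ? (int) $pageRecord['SYS_LASTCHANGED'] : 0;

    return max($recordChanged, $contentChanged);
}

// Content changed after the page record itself was last edited.
$changed = resolveChangedTimestamp(['tstamp' => 1438000000, 'SYS_LASTCHANGED' => 1438700000]);
echo $changed . PHP_EOL;
```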
