silverstripe-solr-search's Introduction

MOVED TO CODEBERG

https://codeberg.org/Firesphere/silverstripe-solr

ISSUES AND PULL REQUESTS WILL NOT BE RESPONDED TO

Reopen issues on Codeberg.

License

[GPL v3](LICENSE.md)

Modern SilverStripe Solr Search

See the full documentation, or the docs folder. Please read the documentation before asking questions; most questions are answered there.

Solarium documentation:

https://solarium.readthedocs.io

API Docs:

https://firesphere.github.io/solr-api/

Usage and installation docs:

https://firesphere.github.io/solr-docs/

Supports

Solr 8 is supported by default. Backward compatibility with Solr 4 is available.

Installation

composer require firesphere/solr-search

More details can be found in the docs.

Cow?

Cow!


             /( ,,,,, )\
            _\,;;;;;;;,/_
         .-"; ;;;;;;;;; ;"-.
         '.__/`_ / \ _`\__.'
            | (')| |(') |
            | .--' '--. |
            |/ o     o \|
            |           |
           / \ _..=.._ / \
          /:. '._____.'   \
         ;::'    / \      .;
         |     _|_ _|_   ::|
       .-|     '==o=='    '|-.
      /  |  . /       \    |  \
      |  | ::|         |   | .|
      |  (  ')         (.  )::|
      |: |   |;  U U  ;|:: | `|
      |' |   | \ U U / |'  |  |
      ##V|   |_/`"""`\_|   |V##
         ##V##         ##V##

silverstripe-solr-search's People

Contributors

andrewandante, dependabot[bot], elliot-sawyer, firesphere, kinglozzer, marczhermo, petro-ivvysoft, phptek, rvxd, saibamen, ssmarco

silverstripe-solr-search's Issues

Add Atomic indexing

Solr allows atomic indexing, which means only changed fields are updated.

This can add a lot of efficiency to the Solr Update that hooks in to DataObject save operations.
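
As a rough illustration of what such an update looks like on the wire: in Solr's atomic update syntax the unique key stays a plain value, and each changed field is wrapped in a modifier such as "set". The sketch below builds that JSON payload in plain PHP. The helper name buildAtomicUpdate, the document ID and the field names are hypothetical and not part of this module, and Solr generally needs the fields to be stored (or to have docValues) for atomic updates to work.

```php
<?php
/**
 * Build one Solr atomic-update document.
 * Hypothetical helper for illustration only, not part of the module.
 *
 * @param string $idField       name of the unique key field
 * @param string $id            unique key of the document to update
 * @param array  $changedFields map of field name => new value
 */
function buildAtomicUpdate(string $idField, string $id, array $changedFields): array
{
    // The unique key is sent as-is so Solr can find the existing document.
    $doc = [$idField => $id];
    foreach ($changedFields as $name => $value) {
        // "set" replaces the stored value of this field only.
        $doc[$name] = ['set' => $value];
    }

    return $doc;
}

// Solr expects a JSON array of documents on its update handler.
$payload = json_encode([buildAtomicUpdate('id', 'Page-12', ['Title' => 'New title'])]);
echo $payload, PHP_EOL;
// [{"id":"Page-12","Title":{"set":"New title"}}]
```

Only the fields a DataObject reports as changed would need to go into $changedFields, which is where the efficiency gain would come from.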

Geospatial search

Adding geospatial awareness to the search would be great. Solr can do this easily, which reduces the need for PHP overhead.
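
One way Solr handles this is the geofilt query parser, which keeps only documents within a given distance (in kilometres) of a point. The sketch below only builds the filter string. The helper name and the "Location" field are hypothetical, and it assumes the schema defines that field with a spatial field type.

```php
<?php
/**
 * Build a Solr geofilt filter query string.
 * Hypothetical helper for illustration only; $field is assumed to be a
 * spatial field in the schema.
 */
function buildGeoFilter(string $field, float $lat, float $lon, float $distanceKm): string
{
    // %F formats floats without locale-specific decimal separators.
    return sprintf('{!geofilt sfield=%s pt=%F,%F d=%F}', $field, $lat, $lon, $distanceKm);
}

echo buildGeoFilter('Location', -41.2865, 174.7762, 5.0), PHP_EOL;
// {!geofilt sfield=Location pt=-41.286500,174.776200 d=5.000000}
```

The resulting string would be sent as a filter query, so the distance calculation stays inside Solr instead of PHP.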

Relations are all the same

For some reason, messages on my test environment are all marked to be in #general, and not in the actual channel they were posted in

Add queued job

A queued job to run at specific intervals to index a specific group instead of just running a CLI command.

Benefits:

  • No user interaction needed to index everything
  • Solr and PHP are less likely to run out of memory
  • More control from the CMS

Catch errors

Indexing can fail due to Java I/O or PHP memory errors. The system should catch these and create a new queued job (this depends on #20), so that the group that failed is re-added or updated soon.

Return a datalist of results

Currently, we're returning an ArrayData set, which is nice, but it doesn't contain the actual objects and is therefore possibly missing Link() etc.

Solution:
Use both. The Data set is small enough and highlighting etc. can be done differently

Use the schema API to update the managed schema

Schema.xml is ignored after Solr switched it to the managed schema.

This should be covered in the code changes as well. Right now, once a managed schema has been created, any changes to Schema.xml are ignored because Solr isn't aware of them.
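
For reference, the Schema API accepts JSON commands such as "add-field", posted to the core's /schema endpoint. The sketch below builds such a command and the endpoint URL in plain PHP. Both helper names, the base URL and the core name are assumptions for illustration, and no HTTP request is sent.

```php
<?php
/**
 * Build an "add-field" command for Solr's Schema API.
 * Hypothetical helper for illustration only.
 */
function buildAddFieldCommand(string $name, string $type, bool $multiValued = false): array
{
    return [
        'add-field' => [
            'name'        => $name,
            'type'        => $type,
            'stored'      => true,
            'indexed'     => true,
            'multiValued' => $multiValued,
        ],
    ];
}

/**
 * Build the schema endpoint URL for a core.
 * $baseUrl is assumed to look like http://localhost:8983/solr
 */
function schemaEndpoint(string $baseUrl, string $core): string
{
    return rtrim($baseUrl, '/') . '/' . rawurlencode($core) . '/schema';
}

echo schemaEndpoint('http://localhost:8983/solr/', 'my-core'), PHP_EOL;
// http://localhost:8983/solr/my-core/schema
echo json_encode(buildAddFieldCommand('Title', 'string')), PHP_EOL;
// {"add-field":{"name":"Title","type":"string","stored":true,"indexed":true,"multiValued":false}}
```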

Add Elevation

Results are currently not controlled by the CMS user

Problem:

  • Elevating results to the top, disregarding the score, is not possible

Solution:

  • Add Elevation and upload the elevation to Solr automatically after saving
  • Reload the elevation settings after every change
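
To give an idea of what would be uploaded: Solr's Query Elevation Component reads an elevate.xml file that lists, per query text, the document IDs to push to the top. A minimal sketch of generating that file from CMS-managed rules could look like this. The helper name and the shape of the $rules array are hypothetical.

```php
<?php
/**
 * Render elevation rules as elevate.xml content.
 * Hypothetical helper for illustration only.
 *
 * @param array $rules map of query text => list of document IDs to elevate
 */
function buildElevateXml(array $rules): string
{
    $escape = function ($value): string {
        // Escape for use inside a double-quoted XML attribute.
        return htmlspecialchars((string)$value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    };

    $xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<elevate>\n";
    foreach ($rules as $queryText => $docIds) {
        $xml .= '  <query text="' . $escape($queryText) . "\">\n";
        foreach ($docIds as $id) {
            // Documents are elevated in the order they are listed.
            $xml .= '    <doc id="' . $escape($id) . "\"/>\n";
        }
        $xml .= "  </query>\n";
    }

    return $xml . "</elevate>\n";
}

echo buildElevateXml(['solr' => ['Page-1', 'Page-7']]);
```

After the file changes, the core would still need to pick up the new settings, which is what the second bullet above refers to.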

FTS Stubs

A lot of the features "duplicate" those of the old Fulltext Search module, but under a new name.

Some stubs need to be added to support backward compatibility, making it easier to migrate. Possibly as a "Solr-FTS-Compat" module?

CMS Management

Items that need to be manageable:

  • Elevation
  • Boosting
  • Fields to be indexed
  • Facets

Search history

Add search history to the BaseQuery.

Search History gives engineers the option to show a user's most recent searches, which is an often-requested feature.

The value is available but not yet used as protected $history on the BaseQuery

Add support for SubSites

SubSites should do this themselves obviously, but adding support for the SubsiteID is not a bad idea.

This has no impact on the index, but it is useful.

Ongoing: PHP Docblock

There's a lack of documentation around methods at the moment. This should be fixed before a release.

Please keep updating docblocks while working on this

Add Post Store

Post store is the primary way to create cores where possible. The File Storage is not ideal

AC's:

  • Use a post
  • Enable uploading custom files through post
  • Let Solr store the files in its own data folder
  • Skip .solr folder creation in project

Search reports

Option to review searches and query count, timings and efficiency of boosting/elevation

Index from 1 - endless

Indexing should simply add all documents. The FTS implementation of %% id % count is not reliable and might skip items that are added after the run starts. A cleaner solution would be to build batches of 1000 documents: start at offset 0 with a limit of 1000, and continue from there.

This would prevent overhead of calculation as well.
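
The offset/limit approach can be sketched as a generator that keeps fetching windows until one comes back empty. This is a minimal sketch, not the module's implementation: batches() and the $fetch callable are hypothetical, and in SilverStripe the callable would presumably wrap something like DataList::limit().

```php
<?php
/**
 * Yield batches by walking offset/limit windows until a window is empty.
 * Hypothetical helper for illustration only.
 *
 * @param callable $fetch     function (int $offset, int $limit): array
 * @param int      $batchSize number of documents per batch
 */
function batches(callable $fetch, int $batchSize = 1000): Generator
{
    $offset = 0;
    while (true) {
        $items = $fetch($offset, $batchSize);
        if (count($items) === 0) {
            // Nothing left, including items added while indexing was running.
            return;
        }
        yield $items;
        $offset += $batchSize;
    }
}

// Stand-in data source: 2500 fake record IDs.
$records = range(1, 2500);
$fetch = function (int $offset, int $limit) use ($records): array {
    return array_slice($records, $offset, $limit);
};

$sizes = [];
foreach (batches($fetch) as $batch) {
    $sizes[] = count($batch);
}
echo implode(',', $sizes), PHP_EOL;
// 1000,1000,500
```

Because the loop only stops on an empty window, no total count has to be calculated up front.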

WebDav store

Preferably avoided, but if needed it can be added.

Add background task

To prevent PHP errors, it should be possible to run the index task in the background, as FTS originally did.

Unit testing

There are no tests yet, as I'm more discovering how Solarium works etc. than actually testing things.

Unit tests to be done. I'd like some input from Guy for this, as he seems to know a thing or two about testing the old FTS module

Manage schema

There's currently no process in place to manage the schema, even though it's a desired feature.

Extract xml generation

XML generation is cluttering up the index. Using an abstraction would clear things up massively.

Add boosting-at-query-time

The boosting implementation currently adds boosting at index time. Adding query-time boosting would be very valuable.
Resolved in #c613ac5e87348eda7eccfa6f89f562b56060472f

Use YML for configuration

Using YML is faster than running an init.

This is resolved in #c613ac5e87348eda7eccfa6f89f562b56060472f (Which has a wrong commit message, apologies)

Pagination

Yeah.... ehm, it doesn't do that properly yet

Add exclude option

Figure out how to do an exclude filter, as Solarium does not have this by default
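
One possible approach, assuming plain Solr query syntax is acceptable: a filter query prefixed with a minus sign removes matching documents from the results. The sketch below builds such a string. The helper name and field values are hypothetical, and the result would presumably be passed to a Solarium filter query via createFilterQuery()->setQuery().

```php
<?php
/**
 * Build a negative (exclude) filter query for one field.
 * Hypothetical helper for illustration only.
 */
function buildExcludeFilter(string $field, array $values): string
{
    $quoted = array_map(function (string $value): string {
        // Escape backslashes first, then double quotes, for use in a phrase.
        return '"' . str_replace(['\\', '"'], ['\\\\', '\\"'], $value) . '"';
    }, $values);

    // The leading minus excludes every document matching any listed value.
    return sprintf('-%s:(%s)', $field, implode(' OR ', $quoted));
}

echo buildExcludeFilter('ClassName', ['ErrorPage', 'RedirectorPage']), PHP_EOL;
// -ClassName:("ErrorPage" OR "RedirectorPage")
```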

Configurable Types.ss

types.ss is hardcoded in the base XML. This needs to be split back into an editable/configurable file.

Index documents

No documents are indexed yet
Requirements:

  • Get fields to index from Solr
  • Determine classes to index
  • Get lookup chain
  • Generate doc
  • Update docs in batches of {Solr limit} (~1000)

Determine what needs testing

Standard get/set/add methods that do nothing but update the protected variable do not need actual testing. A simple testing stub that does them all would be enough.

e.g.

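class SetGetAddTester extends SapphireTest
{
    public function testAll()
    {
        // $classes and $variables are placeholders, not defined here
        foreach ($classes as $testClass) {
            foreach ($variables as $testVar) { $this->assertNotNull($testClass->$testVar); }
        }
    }
}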

^^ Above code is a stub, not actually functional :)

Add boosting

No boosting is currently in place.

Solarium does support boosting; it's just not yet implemented.

Problem:

  • Boosting is not working
  • Clients/users would like to have control over what is and isn't boosted

Solution:

  • Boosting has a sane default (e.g. Title)
  • Boosting is CMS controllable on a high level
  • Boosting applies the proper setting to filterQueries and Terms

Add CI

No tests are written yet, nor is there any integration with a CI.

Problem:

  • Lack of insight into what's actually going on

Solution:

  • Add CircleCI

Caveats:

  • CircleCI is not free for closed source projects it seems

Add "can" option

Current indexing and results do not take canView into account.

This can be mitigated by adding the canView as a field in Solr to filter on.

Requirements:

  • canView is added as a field
  • canView contains a list of groups that are allowed to view
  • canView returns a group-level or logged-in level value

Outcome:
Solr can filter on a multi-valued field. For example, if the user is an administrator, the filterQuery should say group:administrator.
The method should always return true for that group, and the group should be in the index.
Group:content-authors, however, might not always be in the list of allowed viewers, so Solr would automatically leave those items out through a filterQuery.

Solarium has the support to do the above
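
A minimal sketch of how such a filter query string could be built, assuming the index has a multi-valued canView field holding group codes plus a marker for publicly viewable documents. The helper name, the field name default and the "anyone" marker are all hypothetical.

```php
<?php
/**
 * Build a filter query matching documents viewable by any of the given groups.
 * Hypothetical helper for illustration only; 'anyone' is an assumed marker
 * for documents that everybody may view.
 */
function buildCanViewFilter(array $groupCodes, string $field = 'canView'): string
{
    // Public documents are always allowed, logged in or not.
    $allowed = array_unique(array_merge(['anyone'], $groupCodes));
    $quoted = array_map(function (string $code): string {
        // Escape backslashes first, then double quotes, for use in a phrase.
        return '"' . str_replace(['\\', '"'], ['\\\\', '\\"'], $code) . '"';
    }, $allowed);

    return sprintf('%s:(%s)', $field, implode(' OR ', $quoted));
}

echo buildCanViewFilter(['administrators']), PHP_EOL;
// canView:("anyone" OR "administrators")
echo buildCanViewFilter([]), PHP_EOL;
// canView:("anyone")
```

Documents whose canView values do not include any of the member's groups simply never match, so no per-result permission check in PHP would be needed for the common case.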

Accept empty values

The query is passed on to the index, which doesn't check for empty values at the moment. This should be fixed.
