Comments (2)
Thank you for your quick reply. I used the wonderful service https://bleveanalysis.couchbase.com/analysis and its custom mapping export to reduce the analyzer step by step until I found the part that, in my opinion, causes the non-optimal search results. I came up with:
err = index.AddCustomAnalyzer("thesaurus_german_wortlautmapping_2",
    map[string]interface{}{
        "type":      custom.Name,
        "tokenizer": unicode.Name,
        "token_filters": []interface{}{
            // de.NormalizeName,
            de.StopName,
            de.LightStemmerName,
            lowercase.Name,
        },
    })
So de.NormalizeName is, in my opinion, what causes the unwanted normalizations.
@the42 Your code looks good. My German unfortunately is not :)
Per the rules set for de, the token streams generated are:
Nägel -> ["nagel"]
Fußpflege -> ["fusspfleg"]
Fuß -> ["fuss"]
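To make the folding in those token streams concrete, here is a minimal, stdlib-only sketch. It is not bleve's implementation: foldGerman is a hypothetical helper that only lowercases and folds ä/ö/ü/ß, and it leaves out stop words and stemming (so it would not reproduce "fusspfleg"). It only illustrates why Nägel and Nagel, or Fuß and Fuss, end up as the same term once normalization is enabled.

```go
package main

import (
	"fmt"
	"strings"
)

// foldGerman is a hypothetical helper, not part of bleve. It lowercases the
// input and folds ä/ö/ü to a/o/u and ß to "ss". This is an assumed,
// simplified stand-in for the lowercase + German normalization steps;
// it does no stop word removal and no stemming.
func foldGerman(s string) string {
	replacer := strings.NewReplacer("ä", "a", "ö", "o", "ü", "u", "ß", "ss")
	return replacer.Replace(strings.ToLower(s))
}

func main() {
	// Words that differ only by umlaut or ß collapse to the same term.
	for _, w := range []string{"Nägel", "Nagel", "Fuß", "Fuss"} {
		fmt.Printf("%s -> %s\n", w, foldGerman(w))
	}
}
```

If that collapsing is unwanted, leaving the normalization filter out of a custom analyzer, as in the snippet earlier in this thread, keeps such words distinct, at the cost of no longer matching Fuss against Fuß.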
You can test out analyzer behavior here - https://bleveanalysis.couchbase.com/analysis
So it seems that if just Fuß does not exist in your documents, your search for Fuß will not find anything.
As you can see here, these are the components used by the de analyzer:
- Tokenizer: unicodeTokenizer
- Token filters:
  - toLowerFilter
  - stopDeFilter
  - normalizeDeFilter
  - lightStemmerDeFilter
If these rules do not suffice, I'd recommend building a custom analyzer with these rules and any additional ones that you see fit for your use case. Also, we're happy to accept any contributions that make the stock de analyzer more accurate.