Comments (5)
@Shazwazza I've done some extra investigation and tried to simplify the above query.
I've changed my code a bit and my query now looks like this:
+(FullTextContent_nl:ventilatie* FullTextContent:ventilatie* nodeName_nl:ventilatie* nodeName:ventilatie*) +(__NodeTypeAlias:contentpage __NodeTypeAlias:labitem __NodeTypeAlias:newsitem __NodeTypeAlias:project __NodeTypeAlias:technicalcommittee __NodeTypeAlias:standardsantennaitem __NodeTypeAlias:jobitem __NodeTypeAlias:tool __NodeTypeAlias:contact __NodeTypeAlias:technologiesdetail)
I've noticed that our CPU usage also spikes to about 80-90% when running this query against our externalIndex in the Examine Management tab.
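For anyone trying to reproduce this, the query above is two required (+) groups: a prefix search (term*) across the text fields, and a filter on __NodeTypeAlias. The Python sketch below only assembles the raw Lucene query string so that the structure is easy to vary while testing. The helper name `build_query` is hypothetical, the alias list is shortened, no escaping of special characters is done, and none of this is the Examine API.

```python
def build_query(term, text_fields, node_type_aliases):
    """Assemble a raw Lucene query string with two required (+) groups."""
    # Group 1: prefix match (term*) on any of the text fields.
    text_group = " ".join(f"{field}:{term}*" for field in text_fields)
    # Group 2: the document must be one of the listed node types.
    type_group = " ".join(f"__NodeTypeAlias:{alias}" for alias in node_type_aliases)
    return f"+({text_group}) +({type_group})"


query = build_query(
    "ventilatie",
    ["FullTextContent_nl", "FullTextContent", "nodeName_nl", "nodeName"],
    ["contentpage", "labitem", "newsitem"],  # shortened list for illustration
)
print(query)
```

Dropping the FullTextContent fields from `text_fields` gives a quick way to compare the query with and without the large field.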
from examine.
Hi, you'll need to do some more investigating on this one to determine the underlying problem.
- What happens when you only search non 'FullTextContent' fields?
- What happens when you only do a very simple search?
- Is the problem directly related to the FullTextContent or not?
- Is the problem related to building the query or executing the query?
- Can you create a dump file for analysis of where the CPU bottleneck is?
- Can you replicate this from scratch with a simple setup?
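One way to approach the "building vs executing" question is to time the two phases separately. The sketch below is a generic Python timing harness, not Examine code: `build_query` and `execute_query` are hypothetical stand-ins that would be replaced by the real build and search calls, and the in-memory document list is made up for illustration.

```python
import time


def timed(label, func, *args):
    """Run func(*args), print the elapsed time, and return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed * 1000:.2f} ms")
    return result, elapsed


# Hypothetical stand-ins: replace with the real build and execute steps.
def build_query(term):
    return f"nodeName:{term}*"


def execute_query(query, documents):
    # Toy prefix search over whitespace-separated words.
    prefix = query.split(":", 1)[1].rstrip("*")
    return [d for d in documents if any(w.startswith(prefix) for w in d.split())]


documents = ["ventilatie systemen", "nieuws over ventilatie", "contact"]
query, build_seconds = timed("build", build_query, "ventilatie")
hits, execute_seconds = timed("execute", execute_query, query, documents)
```

If nearly all of the time lands in one phase, that narrows down where a CPU dump is worth taking.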
Hi @Shazwazza ,
I've done some more investigation.
We've got some pages with a lot of content (like a lot lot :-) ).
The FullTextSearch package creates an extra property in the externalIndex with a string that is generated based on the content of that page.
We notice that some pages have a fullTextSearch value in Examine with a length of 1 million+.
I'm guessing that this can cause serious performance issues when querying the fullTextSearch property with an Examine query?
And if so: what would be the ideal max length (if there is one) for an Examine property value?
Thanks a lot!
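To see how many pages are affected, the stored values could be exported and checked for length. A minimal Python sketch, assuming the values are available as a mapping from node id to text and that the "1 million+" figure refers to characters. The sample data and the 100,000 threshold are hypothetical and are not a recommended limit.

```python
def oversized_values(field_values, max_chars):
    """Return (node_id, length) pairs whose value exceeds max_chars, longest first."""
    lengths = ((node_id, len(text)) for node_id, text in field_values.items())
    too_long = [(node_id, n) for node_id, n in lengths if n > max_chars]
    return sorted(too_long, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data standing in for values exported from the index.
sample = {
    "1001": "kort",
    "1002": "x" * 1_200_000,
    "1003": "y" * 50_000,
}
report = oversized_values(sample, max_chars=100_000)
print(report)
```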
You should also check whether the FullTextSearch package sets LuceneSearchOptions.AllowLeadingWildcard = true. If it does, that is most likely the cause of the performance degradation. The Lucene documentation specifically warns about enabling this:
// When set, * or ? are allowed as the first character of a PrefixQuery and WildcardQuery.
// Note that this can produce very slow queries on big indexes.
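The reason behind that warning can be illustrated with a simplified model. Terms in an index are kept in sorted order, so a trailing wildcard (prefix*) can jump straight to the first candidate and stop as soon as terms no longer match, while a leading wildcard (*suffix) has no usable prefix and has to test every term. The Python sketch below uses a plain sorted list and made-up terms; it is not how Lucene implements its term dictionary, only a picture of the difference in work.

```python
import bisect
from fnmatch import fnmatchcase


def prefix_matches(sorted_terms, prefix):
    """Trailing wildcard (prefix*): binary search to the first candidate,
    then walk forward only while terms still start with the prefix."""
    i = bisect.bisect_left(sorted_terms, prefix)
    matches = []
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        matches.append(sorted_terms[i])
        i += 1
    return matches


def wildcard_matches(sorted_terms, pattern):
    """Leading wildcard (*suffix): no usable prefix, so every term is tested."""
    return [term for term in sorted_terms if fnmatchcase(term, pattern)]


# Made-up terms for illustration.
terms = sorted([
    "afzuiging", "lucht", "luchtventilatie",
    "ventilatie", "ventilatiesysteem", "ventilator",
])
print(prefix_matches(terms, "ventilatie"))
print(wildcard_matches(terms, "*ventilatie"))
```

With a very large field contributing many distinct terms, the full scan in the second function grows with the total number of terms, which is where the cost would show up.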
I will close this issue since I don't think this is an issue with Examine but rather with the usage of Examine via the FullTextSearch package. As for what an ideal max length for a field is, you'll need to investigate the Lucene docs/forums. I assume there are some performance penalties for indexing values of such huge sizes.