Comments (7)
Just adding a note that we see this same issue with our Umbraco Cloud (13.4.0) sites using the default configuration. Not surprising as Cloud uses Azure App Services.
from examine.
Regarding slot swapping: this was fixed in Umbraco with PR umbraco/Umbraco-CMS#15571, which created https://github.com/umbraco/Umbraco-CMS/blob/contrib/src/Umbraco.Examine.Lucene/UmbracoTempEnvFileSystemDirectoryFactory.cs and takes the site ID into account. Perhaps part of this needs to be ported to the built-in SyncedTempFileSystemDirectoryFactory. I'm not sure off the top of my head, but presumably SyncedTempFileSystemDirectoryFactory doesn't know anything about Umbraco's own UmbracoTempEnvFileSystemDirectoryFactory when it syncs locally, so that might be part of the solution.
... Ah, but I see: this is already taken into account, since Umbraco creates the SyncedTempFileSystemDirectoryFactory itself here: https://github.com/umbraco/Umbraco-CMS/pull/15571/files#diff-5bf01722a10a69e85bb95802d6e0e7e4baf9b8e6a4abd1caf184f8cdc73c0124R29-R43
from examine.
Thanks for reporting. The answer to this will be 'it depends on a lot of things'.
I would strongly encourage you to fully understand the challenges of running Lucene in Azure. I did a full talk on this at CodeGarden: https://youtu.be/qXKGVjTlEOk?si=uq7UQ9J5Ka4lTp-j
This and similar issues could occur depending on:
- How you are doing deployments
- If you are running via zip deployments
- If you are doing slot swapping
- If you are load balancing and have some misconfigurations (most common)
This PR will fix the slot swapping issue: umbraco/Umbraco-CMS#15571, which is part of Umbraco 13.2. There's a good long thread on a related issue here too: umbraco/Umbraco-CMS#15783, but I believe that thread is fixed by umbraco/Umbraco-CMS#15571.
As for the error, the top part of the error is what is important:
Lucene.Net.Index.CorruptIndexException: invalid deletion count: 2 vs docCount=1 (resource: BufferedChecksumIndexInput(SimpleFSIndexInput(path="C:\home\site\wwwroot\umbraco\Data\TEMP\ExamineIndexes\MembersIndex\segments_vd")))
at Lucene.Net.Index.SegmentInfos.Read(Directory directory, String segmentFileName)
at Lucene.Net.Index.IndexFileDeleter..ctor(Directory directory, IndexDeletionPolicy policy, SegmentInfos segmentInfos, InfoStream infoStream, IndexWriter writer, Boolean initialIndexExists)
at Lucene.Net.Index.IndexWriter..ctor(Directory d, IndexWriterConfig conf)
at Examine.Lucene.Directories.SyncedFileSystemDirectoryFactory
I've described what the SyncedFileSystemDirectoryFactory is here: umbraco/Umbraco-CMS#15783 (comment)
... Most importantly: this setting can ONLY be applied to the primary node; it cannot be used on several nodes. If you are slot swapping the primary node, then you are in fact load balancing, and that means this setting is being used by 2 nodes. As above: 'it depends on a lot of things'.
What could be done in Examine since the SyncedFileSystemDirectoryFactory is part of this codebase (but only Umbraco uses it), is to add some better error handling and diagnostics to output what might be going on. Potentially this is an issue with the files in the main storage and not the local temp storage, but without adding this info to the log, we won't know so that is something I can look into.
Many of these reasons are why ExamineX was created.
from examine.
We get this on an Umbraco 13.1.0 website if we leave the site alone and don't do anything for a while.
We deploy via DevOps/CI to a Linux web app.
There are hardly any content changes on the site, just a few news items here and there.
The app plan is a P0v3.
Everything is fine for the first few weeks after a deployment, and then the Examine back office drops off, and the FE search stops working.
It's a single app and the settings are:
"MainDomLock": "FileSystemMainDomLock",
"LocalTempStorageLocation": "EnvironmentTemp",
"LuceneDirectoryFactory": "SyncedTempFileSystemDirectoryFactory"
with Azure Blob Storage.
We have quite a few sites with similar setups that don't exhibit this behaviour, so it's puzzling.
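For reference, in appsettings.json these three keys normally sit in different sections rather than side by side. A minimal sketch, assuming the standard Umbraco 13 configuration layout (the nesting is an assumption, since the comment lists only the keys, and the blob storage configuration is left out):

```json
{
  "Umbraco": {
    "CMS": {
      "Global": {
        "MainDomLock": "FileSystemMainDomLock"
      },
      "Hosting": {
        "LocalTempStorageLocation": "EnvironmentTemp"
      },
      "Examine": {
        "LuceneDirectoryFactory": "SyncedTempFileSystemDirectoryFactory"
      }
    }
  }
}
```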
Funnily enough our usual deployment process is to
- start the preprod slot
- deploy the code to it
- smoke test
- swap slots
- turn off the preprod slot
But in this case (by accident) our infra guy has set it up to deploy directly, so slot swapping is not in play here.
from examine.
@binraider It's hard to say why this would happen. Do you have any logs or anything that can help? This is really an Umbraco-specific thing, even though the code that Umbraco uses is in this repo. As above, the only thing I can do at this stage would be to add more logging and checks to see how in sync the main storage is with the local file storage, but at this moment I don't have the time to do this. One suggestion would be to use TempFileSystemDirectoryFactory instead of SyncedTempFileSystemDirectoryFactory, pay the overhead of index rebuilding when App Service moves your site, and see if that resolves the problem. If it does, then it's an issue with syncing or with corrupt files in the main storage. See https://docs.umbraco.com/umbraco-cms/reference/configuration/examinesettings
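As a rough sketch of that suggestion, assuming the standard appsettings.json layout covered by the Examine settings documentation linked above, the switch should be a one-value change:

```json
{
  "Umbraco": {
    "CMS": {
      "Examine": {
        "LuceneDirectoryFactory": "TempFileSystemDirectoryFactory"
      }
    }
  }
}
```

With this factory there is no synced copy to restore from, so expect an index rebuild whenever the site is moved.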
from examine.
The Umbraco logs mirror the OP's errors largely:
Lucene.Net.Index.CorruptIndexException: invalid deletion count: 2 vs docCount=1
I will change the factory to see if it makes a difference.
from examine.
Umbraco has its own TempFileSystemDirectoryFactory, which I think should be used if you are using the non-'Synced' directory factory: https://github.com/umbraco/Umbraco-CMS/blob/contrib/src/Umbraco.Examine.Lucene/UmbracoTempEnvFileSystemDirectoryFactory.cs
from examine.