Shazwazza commented on July 29, 2024

It's just that your original question relates directly to your indexes getting out of sync. It sounds like the health checks you have implemented don't actually check whether the indexes are in sync, only whether they are empty. Please note that Umbraco will automatically rebuild empty indexes on startup, so you shouldn't have to handle that yourself either.

Determining whether your indexes are in sync would require custom logic that doesn't currently exist. There are a few ways to try to do that, but the ideal way would be for the node to query its local index for a specific record with a specific value; if the value doesn't match, the index is out of sync. How you would do that, or which alternative you pick, I'll have to leave up to you. Again, the ideal way to deal with indexes, load balancing and Azure is to use a hosted search service like Azure Search or Elasticsearch together with ExamineX; then there's nothing to worry about.
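As a rough illustration of the "query the local index for a specific record with a specific value" idea, here is a minimal sketch. It does not use the Examine or Umbraco APIs: `ILocalIndexLookup`, `IndexSyncCheck` and `InMemoryIndexLookup` are hypothetical names invented for this example, and where the expected value comes from (for instance the CM database) is an assumption you would have to settle yourself.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical abstraction over "look up one record in the local index".
// Not an Examine or Umbraco type; a real implementation would wrap an index searcher.
public interface ILocalIndexLookup
{
    // Returns the stored field value for the record, or null when the record is absent.
    string? GetFieldValue(string recordId, string fieldName);
}

public static class IndexSyncCheck
{
    // True only when the local index holds the expected value for the sentinel record.
    // A missing record or a different value is treated as "out of sync".
    public static bool IsInSync(
        ILocalIndexLookup index, string recordId, string fieldName, string expectedValue)
    {
        string? actual = index.GetFieldValue(recordId, fieldName);
        return actual is not null
            && string.Equals(actual, expectedValue, StringComparison.Ordinal);
    }
}

// In-memory stand-in used only to demonstrate the check.
public sealed class InMemoryIndexLookup : ILocalIndexLookup
{
    private readonly Dictionary<(string Id, string Field), string> _values = new();

    public void Set(string recordId, string fieldName, string value) =>
        _values[(recordId, fieldName)] = value;

    public string? GetFieldValue(string recordId, string fieldName) =>
        _values.TryGetValue((recordId, fieldName), out var value) ? value : null;
}
```

The check is deliberately a pure comparison, so the expensive part (a rebuild) only happens when the sentinel record is missing or stale, rather than on a timer.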

Shazwazza commented on July 29, 2024

Hi, yes, this is all a known issue with using Lucene-based indexes in Azure with load balancing. This is the reason I created ExamineX: so that you can have centrally managed hosted indexes instead of local Lucene indexes per node.

I've talked about this in great detail in a couple of talks:

There is no silver bullet for using Lucene-based indexes on Azure, especially if you are load balancing. To make Lucene indexes work in Azure at all, even without load balancing, a bunch of trickery needs to happen behind the scenes (i.e. %Temp% storage is required, etc.). When you add load balancing it gets even trickier, because there is no central index: there is an index per node and, as you say, they can get out of sync for all sorts of reasons. In Umbraco they only stay in sync through Umbraco's cache refreshers. Now if you add slot swapping to the mix, things probably get even more complex.

What is the answer? Well, ExamineX with Azure Search or Elasticsearch is the best answer since it solves all of these issues. However, if you choose to keep using Lucene-based indexes with Azure and load balancing, then there might be some options, but they will require custom implementations. For the most part, indexes will stay in sync with the CM, but with slot swapping, here's what happens:

When you swap your staging for your live, your staging site will have a local index based on the staging information from your CM staging site since it has only been kept in sync with your staging CM. This means that the local index on this node will need to be rebuilt so that it is in sync with your live database data. Similarly, the nucache file will also only be in sync with your staging CM, not your live CM so I'm not sure how you are currently working around this?

Indexes (and nucache) will be rebuilt automatically by Umbraco based on whether it is a cold boot ... This would be the ideal way to deal with this scenario. If your staging site (which is in sync with your staging DB) becomes live, then it will not be in sync and a cold boot should be executed. I'm not sure why this isn't documented anywhere on the Umbraco docs site, but to force a cold boot you can clear out the umbraco/Data/Temp/DistCache folder. This folder holds the txt files that record which 'instruction' Id in the database the site is in sync with. If this file doesn't exist, a cold boot will be initiated (based on this code: https://github.com/umbraco/Umbraco-CMS/blob/contrib/src/Umbraco.Infrastructure/Sync/LastSyncedFileManager.cs). You can see this technique used in some Umbraco tests themselves: https://github.com/umbraco/Umbraco-CMS/blob/ce769ffff4613904bcc8c65103166a722db388a5/tests/Umbraco.TestData/LoadTestController.cs#L325

Perhaps when a site is swapped it is not restarted, which would mean that a cold boot doesn't take place since there is no reboot? That is something you would need to investigate, and it would also depend on how you are doing the swap. If it is done programmatically, then you could probably force delete that folder and then do the swap.
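If the swap is scripted, clearing that folder beforehand could look roughly like the sketch below. It only uses `System.IO`; `ColdBootHelper` is a hypothetical name, and it assumes umbraco/Data/Temp/DistCache sits under the site's content root, which you should verify for your own hosting setup.

```csharp
using System;
using System.IO;

public static class ColdBootHelper
{
    // Deletes <contentRootPath>/umbraco/Data/Temp/DistCache if it exists.
    // Returns true when a folder was removed, false when there was nothing to delete.
    // Assumption: the folder is relative to the content root passed in by the caller.
    public static bool ClearDistCache(string contentRootPath)
    {
        string distCachePath = Path.Combine(
            contentRootPath, "umbraco", "Data", "Temp", "DistCache");

        if (!Directory.Exists(distCachePath))
        {
            return false;
        }

        // Remove the folder together with the last-synced files inside it.
        Directory.Delete(distCachePath, recursive: true);
        return true;
    }
}
```

Running this against the staging slot just before triggering the swap is the ordering suggested above; whether the restart then actually performs a cold boot is something to confirm in your own environment.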

Shazwazza commented on July 29, 2024

Actually, looking at the swap docs, the source site is restarted, but a cold boot will probably not occur because it has its last synced file. If you can swap the slots programmatically, you could first clear that folder and then initiate the swap; this should cause a cold boot during the restart, while the site is now pointing to your production database. Alternatively, you could use custom warm-up https://learn.microsoft.com/en-us/azure/app-service/deploy-staging-slots?tabs=portal#Warm-up and initiate index rebuilds. FYI, this is how Umbraco rebuilds indexes on startup, so you probably don't want to conflict with its own operations: https://github.com/umbraco/Umbraco-CMS/blob/contrib/src/Umbraco.Infrastructure/Examine/RebuildOnStartupHandler.cs

This handler waits for one minute after the first http request is made to initiate the rebuilds (so that it doesn't interfere with site bootup/loading). If it is a cold boot, it will rebuild all of them, else only empty ones. You could potentially copy the code in this handler, remove the default one and add your own with custom logic to force cold boot re-indexing if you know the site has just been swapped.
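The decision described here (rebuild everything on a cold boot, otherwise only empty indexes, plus an extra "just swapped" trigger) can be sketched as a small pure function. This is not the code of Umbraco's handler: `IndexState`, `StartupRebuildPlanner` and the `justSwapped` flag are hypothetical, and how you detect a swap is left open.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical snapshot of one index; not an Examine or Umbraco type.
public sealed record IndexState(string Name, long DocumentCount);

public static class StartupRebuildPlanner
{
    // Cold boot or a known slot swap: every index is selected.
    // Otherwise: only indexes that currently hold no documents.
    public static IReadOnlyList<string> SelectIndexesToRebuild(
        IEnumerable<IndexState> indexes, bool isColdBoot, bool justSwapped)
    {
        bool rebuildAll = isColdBoot || justSwapped;

        return indexes
            .Where(index => rebuildAll || index.DocumentCount == 0)
            .Select(index => index.Name)
            .ToList();
    }
}
```

Keeping the selection separate from the actual rebuild call makes it easier to test the swap logic without touching a real index.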

These are only ideas I'm coming up with, but essentially, this is all based on Umbraco logic, not Examine.

IOSven commented on July 29, 2024

Hi @Shazwazza,

First of all, thank you for the clear explanation!
We've currently added a custom workaround where we use Hangfire to run a recurring Examine health check background task every hour.
On top of that, we also run a single Examine health check background task on startup, with a delay of 3 minutes to give Umbraco some time to start up.

This health check is based on this piece of Umbraco code.
We simply check the document count / fieldContent / isHealth.Success bool, and if there is any problem we execute the following code:

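// Rebuild only when Umbraco reports that this index can be rebuilt.
if (_indexRebuilder.CanRebuild(indexName))
{
    _indexRebuilder.RebuildIndex(indexName);

    // WriteHangfireConsole is our own logging helper for the Hangfire console.
    _logger.WriteHangfireConsole(LogLevel.Information, performContext, $"The index '{indexName}' is being rebuilt in the background.");
}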

We're also looking to upgrade to Umbraco v13 and we're wondering if this bugfix would help us at all since they mention the following:

> This should help with Azure App Slot Swapping indexing locking as the SiteName property can be made sticky to each slot with a different value (as we can do for the published cache already). It also can be used when debugging locally with multiple launch profiles

Thanks!
Sven

Shazwazza commented on July 29, 2024

@IOSven yes that change will help because of how the DistCache files are named along with the naming conventions for the index folders. This is probably why nucache works for you today with slot swaps but not Umbraco.

> We've currently added a custom workaround where we use hangfire to run a recurring examine health check background task every hour.
> On top of that we also run a single examine health check background task on startup with a delay of 3 minutes to give Umbraco some time to startup.

Please be wary of rebuilding indexes too often. Rebuilding should only be done when necessary. The queries it executes carry a heavy database penalty, and because they take so long they can cause db lock timeouts for your editors if someone is actively trying to edit content at the same time.

> We simply check the document count / fieldContent / isHealth.Success bool, and if there is any problem we execute the following code

But how does this check if the index is in sync with the CM database?

IOSven commented on July 29, 2024

Hi @Shazwazza,

We're indeed only rebuilding our indexes if we really have to, when the document count is 0.

We're running this Examine health check on both our CD & CM environments.
The health check doesn't currently check whether the index is in sync with the CM database; instead, the automatic job on each environment only checks whether its own indexes are healthy, with a correct document/field count, etc.

Would there be a better alternative workaround maybe that you could think of?

Thanks!

IOSven commented on July 29, 2024

Hi @Shazwazza,

We've been experimenting with ExamineX and we also bought a paid license.
Everything was up and running without problems on our test/acc environments, but unfortunately we noticed performance issues on our production environment.

The implementation of ExamineX in our project is currently on hold pending further investigation.
We will create an issue on the ExamineX issue tracker when we have more information.

Shazwazza commented on July 29, 2024

Thanks @IOSven for the info. Happy to assist on the ExamineX tracker regarding any performance investigations. The only performance concerns with ExamineX would simply be latency due to HTTP requests when searching or indexing but there is far less overhead on the local CPU than Examine since there is no underlying Lucene engine. Would be interesting to see where your bottlenecks are/were.
