Giter Club home page Giter Club logo

Comments (8)

colinmollenhour avatar colinmollenhour commented on June 22, 2024

Hi Melvyn. I agree that the merit of locking is arguable given the performance/complexity downsides. However, if one wants a redis session store that does not use locking they could just use the handler built into phpredis which I know some people are already doing. So although I hadn't strongly considered supporting a non-locking version I am not opposed to it (as long as it isn't the default). The only two benefits I can see over the built-in handler would be support without the phpredis module (no idea how many people go this route) and compression of the session data. I think the data compression is very valuable since it will let you store more sessions or for longer in the same amount of memory which is a limited resource on a large site.

Regarding the use-case for locking, the typical basic ecommerce site would work fine without it for the most part. But, if a user were to for example add two products to the cart in quick succession (or your site was really slow ;) then they might for example see two success messages on one page and none on the other, or see only one success message, or possibly worse (only one product gets added?). Another example might be the order success page.. If while an order was processing another page load occurred it could clobber the "last order id" and then the success page would throw an error and redirect and the user would never know if their order completed. I haven't tested all of the scenarios and instead have just opted to go the safe/proper route. It is also quite possible that some other feature or third-party module really depends on stable session data and you wouldn't know it until it started causing problems. For what it's worth, the "files" session handler does use locking (with no break/timeout mechanism) and the other handlers do not support any locking.

In short: If I receive a well-formatted pull request that adds a 'use_locking' option that defaults to '1' (for all of the poeple out there expecting to get locking) then I will merge it. :)

from cm_redissession.

melvyn-sopacua avatar melvyn-sopacua commented on June 22, 2024

Hi Collin,

The session data can become quite big, because of site-specific data being stored in there so compression is needed. Adding two products to the cart I can see how that's an issue with Ajax-based cartadd. As for page loads during order completion, since the write locking is still in place, I don't think a conflict would occur, but it's an interesting test.
I'll work on properly formatted patch and perhaps revisit my initial idea about rwlocks, so we only have to do accounting for the write locks.
I'm still not sure why we're seeing queues between 6 and 32 long (upping concurrency to 32 got rid of 503's), but due to time constraints at present, not able to dive into that.

from cm_redissession.

colinmollenhour avatar colinmollenhour commented on June 22, 2024

The checkout success scenario:

  • User starts load of slow page, session is read
  • User completes checkout, session is read again, last order id is written to session
  • Slow page load completes, session is written, clobbers last order id
  • Checkout success page is rendered, last order id doesn't exist

It is a very unlikely race condition, but a race condition with negative consequences nevertheless.

The adminhtml probably has more scenarios with all of the "form_data" and grids and ajax features. Perhaps the configuration setting could be separate for each "area"?

from cm_redissession.

colinmollenhour avatar colinmollenhour commented on June 22, 2024

Added a config option and also a constant to allow disabling locking in 0a03d8c.

from cm_redissession.

melvyn-sopacua avatar melvyn-sopacua commented on June 22, 2024

Today at another site we replicated the issue. It was caused by a security scanner. It fires requests in parallel, causing the lock to occur. The issue to solve here is the lock wait timeout. When two requests are fired simultaneously, the lock wait timeout is the one causing the problem:

        if ($tries == 1 /* TODO - $tries % 10 == 0 ? */) {
            $detectZombies = TRUE;
            // TODO: allow configuration of sleep period?
            usleep(1500000); // 1.5 seconds
        }

If this were a spinlock that bailed out to a time wait, we'd solve it, as can be observed:

2014-07-11T06:14:16+00:00 DEBUG (7): web1.example.org|22261: Data read for ID sess_p9i3vvr885k8548pf5ghvu3ke1 after 0.00224 seconds
2014-07-11T06:14:28+00:00 DEBUG (7): web1.example.org|22261: Data read for ID sess_p9i3vvr885k8548pf5ghvu3ke1 after 0.00102 seconds
2014-07-11T06:14:34+00:00 DEBUG (7): web1.example.org|22104: Data read for ID sess_p9i3vvr885k8548pf5ghvu3ke1 after 0.00097 seconds
2014-07-11T06:14:49+00:00 DEBUG (7): web1.example.org|22261: Data read for ID sess_p9i3vvr885k8548pf5ghvu3ke1 after 0.00107 seconds
2014-07-11T06:15:01+00:00 DEBUG (7): web1.example.org|21734: Data read for ID sess_p9i3vvr885k8548pf5ghvu3ke1 after 0.00112 seconds
2014-07-11T06:15:15+00:00 DEBUG (7): web1.example.org|22104: Data read for ID sess_p9i3vvr885k8548pf5ghvu3ke1 after 0.00493 seconds
2014-07-11T06:15:17+00:00 DEBUG (7): web1.example.org|22104: Data read for ID sess_p9i3vvr885k8548pf5ghvu3ke1 after 0.00133 seconds
2014-07-11T06:15:23+00:00 DEBUG (7): web1.example.org|22104: Data read for ID sess_p9i3vvr885k8548pf5ghvu3ke1 after 0.00105 seconds
2014-07-11T06:15:43+00:00 DEBUG (7): web1.example.org|22104: Data read for ID sess_p9i3vvr885k8548pf5ghvu3ke1 after 0.00108 seconds
2014-07-11T06:15:49+00:00 DEBUG (7): web1.example.org|21734: Data read for ID sess_p9i3vvr885k8548pf5ghvu3ke1 after 0.00435 seconds
2014-07-11T06:15:53+00:00 DEBUG (7): web1.example.org|22104: Data read for ID sess_p9i3vvr885k8548pf5ghvu3ke1 after 0.00533 seconds

These are all reads without locking. They are in milliseconds, yet we sleep for 1.5 seconds. That's enough time for additional requests to come in and so the waits pile up.
While this was artificially created by a security scanner, the same will happen when asynchronous requests come in, for example when using Varnish/ESI blocks or Ajax requests for product data on a category page.
I see two solutions:

  1. configurable sleep time, like you already mentioned in the code (if you already did that, this is the Magento 1.8.1 bundled redis session module and we'll merge in upstream code)
  2. spinlocks with a configurable max spins that on failure go to the sleeping code

Or even both. What do you think?

from cm_redissession.

colinmollenhour avatar colinmollenhour commented on June 22, 2024

You have to be careful to not ignore other situations when trying to optimize another.. IIRC, the motivation behind a 1.5 second timeout is:

  1. Resources, namely number of req/sec that the Redis server/network can handle, are obviously limited so from this perspective, longer lock times means fewer requests (in theory based on the assumption that a lock is held by a long-running process).
  2. If there is a session locking issue it is due to either: a long running process (e.g. render cart with many shipping quotes) or multiple parallel requests. In eCommerce it seems that the former is more likely.
  3. On the first spin, wait a little longer than subsequent spins to allow time for all other waiters to increment the lock. That is, if 'waiting' is 5, by the time the newest process checks again, 'lock' should be old_lock+waiting or else there is something wrong.

All of that said, this was addressed already (0.51 second sleep time) so you should be using the github version. :)

from cm_redissession.

colinmollenhour avatar colinmollenhour commented on June 22, 2024

I just found a bug where there should have been a "continue" after the sleep, so it was actually sleeping twice. Fixing this should further help your issue with high concurrency so give the latest version a go.

from cm_redissession.

melvyn-sopacua avatar melvyn-sopacua commented on June 22, 2024

That may explain why I always saw a minimum delay of 3 seconds. Good catch.

from cm_redissession.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.