kr commented on August 19, 2024

Some discussion pasted from http://groups.google.com/group/beanstalk-talk/t/5d318b9847dc93a8:

Zhu Han wrote:

Hi,

I just found beanstalkd is quite simple but very useful. Thank you for
such a great contribution.

I'm not familiar with the design and implementation of beanstalkd
because I only went through the code yesterday. So if I have made any
mistake or invalid assumption, please correct me.

I noticed that beanstalkd uses ordinary read/write-style file I/O on
the binlog and keeps all the job data in memory, including the
bookmarking information and body data. If the binlog is activated, is
it a good idea to map the binlog file into memory directly? The
advantages of this approach are:

  1. No double caching of jobs. Just leave each job in the binlog and
    map it into memory; the VFS layer will manage it for beanstalkd. The
    current approach caches the job in heap memory, which is backed by
    swap space, and also caches it in the VFS layer of the file system.

  2. The number of system calls can be decreased, so the latency of a
    single operation might be better.

  3. The binlog can be enabled by default. Even if memory cannot hold
    all the jobs, that's fine: let the VFS layer do the cache
    management. Varnish uses the same approach and is reported to have
    good performance numbers. See:
    http://varnish-cache.org/wiki/ArchitectNotes
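
To make the proposal above concrete, here is a minimal sketch in Python
(not beanstalkd's C code) of writing job bodies through a memory
mapping instead of calling write() per job. The file name, the fixed
4096-byte size, and the helper append_job_mmap are all hypothetical,
chosen only for illustration; a real binlog would also need record
headers and a way to grow the file.

```python
import mmap
import os
import tempfile


def append_job_mmap(mm, offset, body):
    """Copy a job body into the mapped file; return the next free offset."""
    end = offset + len(body)
    if end > len(mm):
        raise ValueError("mapped binlog region is full")
    # A plain memory copy: no write() system call per job.
    mm[offset:end] = body
    return end


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "binlog.0")  # hypothetical file name
        size = 4096                         # hypothetical fixed size
        with open(path, "wb") as f:
            f.truncate(size)                # the file must exist at full size
        with open(path, "r+b") as f:
            mm = mmap.mmap(f.fileno(), size)
            try:
                off = append_job_mmap(mm, 0, b"job-1-body")
                off = append_job_mmap(mm, off, b"job-2-body")
                mm.flush()                  # ask the OS to write dirty pages
                first = bytes(mm[0:10])     # read back through the mapping
            finally:
                mm.close()
        with open(path, "rb") as f:
            on_disk = f.read(off)           # same bytes via ordinary file I/O
    return first, on_disk
```

The job body lives only in the page cache here; there is no second copy
on the heap, which is the "no double cache" point in item 1.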

The only disadvantage I can think of is that if the OS swaps out a
memory page, then, because beanstalkd is single-threaded, operations
on all client connections may be affected. For low-latency
environments we can tune the process's memory settings to avoid this,
or disable the binlog altogether.

(As a tiny bit of background, in case there's any doubt, I usually
assume that deployments of beanstalkd are configured so that they
never swap. If they do swap, performance is likely to go down the
toilet.)

Using mmap is worth thinking about if we want to allow storing more
jobs than fit in memory. Until then, I think the advantages of using
mmap are too slight to worry about. Right now beanstalkd uses the
binlog only for recovery. For this purpose, writing to a file
descriptor is not really easier or harder than mmap. Having fewer
system calls will make things a bit faster, but I think our
performance is already pretty good and our time is probably better
spent on bug fixing and features. Far more important for speed is
maintaining sequential access to avoid disk seeks. I think having an
extra copy of some jobs in the page cache makes little difference for
good or bad.
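
As an illustration of the recovery-only use described above, here is a
rough Python sketch of an append-only log written through a file
descriptor and replayed front to back after a restart. The record
layout (a 4-byte length followed by the body) and the function names
are assumptions made for this example; they are not beanstalkd's actual
binlog format.

```python
import os
import struct
import tempfile

HEADER = struct.Struct("<I")  # hypothetical header: body length only


def append_record(fd, body):
    """Append one record; with O_APPEND every write lands at the end."""
    # A production version would loop to handle short writes.
    os.write(fd, HEADER.pack(len(body)) + body)


def replay(path):
    """Recover job bodies by reading the log sequentially."""
    jobs = []
    with open(path, "rb") as f:
        while True:
            header = f.read(HEADER.size)
            if len(header) < HEADER.size:
                break                      # clean end of log
            (length,) = HEADER.unpack(header)
            body = f.read(length)
            if len(body) < length:
                break                      # truncated tail, e.g. crash mid-write
            jobs.append(body)
    return jobs


def demo_recovery():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "binlog.0")  # hypothetical file name
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        try:
            append_record(fd, b"first")
            append_record(fd, b"second")
            os.fsync(fd)                   # force the data to stable storage
        finally:
            os.close(fd)
        return replay(path)
```

Both the writes and the replay touch the file strictly in order, which
is the sequential access pattern that keeps disk seeks out of the
picture.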

If we want to use the binlog as the primary store of jobs (thus
letting us accept more jobs than memory can hold), we'll be both
reading and writing, which can cause seeks, which will kill
performance. This might be a desirable tradeoff for some, but we must
make sure, if we do it at all, that it doesn't cause any problems for
those who don't need the extra storage space.
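
To show where the seeks would come from, here is a small Python sketch
of the "binlog as primary store" idea: only an (offset, length) index
stays in memory, and a job body is read back from the file on demand.
The class DiskBackedJobs and its methods are hypothetical names for
this example only, and os.pread/os.pwrite assume a POSIX system.

```python
import os
import tempfile


class DiskBackedJobs:
    """Keep job bodies on disk; hold only their positions in memory."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        self.end = 0          # next append position (assumes a new file)
        self.index = {}       # job id -> (offset, length)
        self.next_id = 1

    def put(self, body):
        job_id = self.next_id
        self.next_id += 1
        os.pwrite(self.fd, body, self.end)   # write at the tail of the file
        self.index[job_id] = (self.end, len(body))
        self.end += len(body)
        return job_id

    def body(self, job_id):
        offset, length = self.index[job_id]
        # Reads can land anywhere in the file, interleaved with tail
        # writes; on a spinning disk that means seeks.
        return os.pread(self.fd, length, offset)

    def close(self):
        os.close(self.fd)


def demo_primary_store():
    with tempfile.TemporaryDirectory() as d:
        store = DiskBackedJobs(os.path.join(d, "binlog.0"))
        try:
            a = store.put(b"alpha")
            b = store.put(b"bravo-body")
            return store.body(b), store.body(a), store.index[b]
        finally:
            store.close()
```

Reading job b and then job a jumps backwards in the file while new
writes keep going to the end, which is the mixed read/write pattern the
paragraph above warns about.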

