Comments (7)

cmatthias commented on June 18, 2024

Hi @philippnormann, thanks for the reply! I first tried making that change but was still receiving the same error, unfortunately.

However, I'm now testing an additional fix, which involves adding a couple of arguments to create_async_engine in monitor/database/__init__.py. Setting up the engine using the following line has resulted in no "database is locked" errors for about 20 minutes, which is way better than what I was seeing before.

engine = create_async_engine(DATABASE_URL, echo=False, connect_args={'timeout': 30.0, 'isolation_level': None})

The reason I tried this was that I guessed that somewhere in the code a commit() was being missed, resulting in a transaction locking the DB for longer than it should. As far as I understand it, the timeout: 30 arg means "wait for the lock to be released for up to 30 sec before throwing an exception", and the isolation_level: None arg forces the python sqlite library to use autocommit mode.
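For anyone who wants to see those two arguments in isolation, here is a minimal sketch using only the standard-library sqlite3 module, not chia-monitor or SQLAlchemy. The function name, the table, and the throwaway database file are made up for illustration, and the timeout is shortened to 0.1 s so the demo finishes quickly.

```python
import os
import sqlite3
import tempfile


def demo_lock_and_autocommit() -> dict:
    """Show how `timeout` and `isolation_level=None` behave in sqlite3."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")  # hypothetical throwaway database

        # The writer keeps the default isolation level, so the INSERT below
        # implicitly opens a transaction that stays open until commit().
        writer = sqlite3.connect(path)
        writer.execute("CREATE TABLE t (x INTEGER)")
        writer.commit()
        writer.execute("INSERT INTO t VALUES (1)")  # write lock is now held

        # A second connection waits at most `timeout` seconds for the lock,
        # then raises OperationalError ("database is locked").
        other = sqlite3.connect(path, timeout=0.1, isolation_level=None)
        try:
            other.execute("INSERT INTO t VALUES (2)")
            results["locked"] = False
        except sqlite3.OperationalError as exc:
            results["locked"] = "locked" in str(exc)

        writer.commit()  # releasing the lock lets the other connection write

        # With isolation_level=None the statement is committed immediately,
        # so no explicit commit() is needed and no transaction stays open.
        other.execute("INSERT INTO t VALUES (2)")
        results["in_transaction"] = other.in_transaction

        reader = sqlite3.connect(path)
        results["rows"] = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]

        for conn in (writer, other, reader):
            conn.close()
    return results
```

Calling demo_lock_and_autocommit() first hits the lock error while the writer's transaction is still open, then succeeds once the writer commits. A longer timeout only helps if the lock holder eventually commits within that window.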

I also have the "sequential await" fix that you suggested in place still. Not sure if that's required with the above changes; I'll test removing it at some point.

I am sure that the timeout arg is probably too long, and I'm not sure how you feel about autocommit mode, but these changes make things more stable for me. I will let this run for some time today and report back. Thanks again!

from chia-monitor.

philippnormann commented on June 18, 2024

Hey @cmatthias, thanks for the elaborate bug report and your initial debugging efforts 🙂
You are the first one reporting this kind of issue, but I agree with your suspicion that too many concurrent queries to the DB might be the root cause. We could try to reduce the number of simultaneous queries by awaiting the notifications sequentially.

Could you try to replace the following lines in notifier.py

tasks = [n.run() for n in self.notifications]
await asyncio.gather(*tasks)

with

for notification in self.notifications:
    await notification.run()

and report if that helps to resolve your issue?
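To illustrate the difference between the two variants, here is a small self-contained sketch. The Notification class below is a hypothetical stand-in, not the real chia-monitor class: its run() only sleeps to simulate a DB query and counts how many calls are in flight at once.

```python
import asyncio


class Notification:
    """Hypothetical stand-in for a notification that queries the DB."""

    active = 0      # simulated queries currently in flight
    max_active = 0  # highest concurrency observed so far

    async def run(self) -> None:
        cls = Notification
        cls.active += 1
        cls.max_active = max(cls.max_active, cls.active)
        await asyncio.sleep(0.01)  # simulated DB query
        cls.active -= 1


async def peak_concurrency(sequential: bool, count: int = 5) -> int:
    """Run `count` notifications and return the peak number in flight."""
    Notification.active = 0
    Notification.max_active = 0
    notifications = [Notification() for _ in range(count)]
    if sequential:
        # One query at a time: each run() finishes before the next starts.
        for notification in notifications:
            await notification.run()
    else:
        # All run() coroutines are scheduled together and overlap.
        await asyncio.gather(*(n.run() for n in notifications))
    return Notification.max_active
```

With asyncio.run(peak_concurrency(sequential=False)) all five simulated queries overlap, while sequential=True never has more than one in flight, which is the reduction in simultaneous DB access the change is aiming for.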

philippnormann commented on June 18, 2024

Hey @cmatthias, I don't feel great about this kind of workaround, but I also failed to find any mistake in our DB usage and therefore suspect the bug lies somewhere in the async implementation of SQLAlchemy. For now, enabling AUTOCOMMIT and increasing the timeout seems like a good way to mitigate the issue. The documentation I found suggests slightly different parameters:

engine = create_async_engine(DATABASE_URL,
                             echo=False,
                             isolation_level="AUTOCOMMIT",
                             connect_args={'timeout': 30.0})

Not sure if it makes any difference, but I would definitely be curious to hear your results for this variant and the one you suggested without the "sequential await" fix. Thanks again for being so helpful!

cmatthias commented on June 18, 2024

Hey again @philippnormann, thanks for the extra info! It turns out I did eventually get DB lock errors (after about an hour) with my original set of changes. I just changed the isolation_level argument to the style you suggested (which I agree looks right based on the docs) and I'll let you know if that does any better.

cmatthias commented on June 18, 2024

Looks like I'm still seeing this issue even with your suggested change. I'm racking my brain trying to figure out why no one else sees this, because it's consistently reproducible on my machine. I'm on an Ubuntu 21.04 VM running python 3.9.5, which doesn't seem like it'd be an unusual setup. The underlying storage for the VM is an SSD (so it's not unusually slow). I am running my full chia node, wallet, and harvester on the same machine, so maybe that has something to do with it?

Any other thoughts/ideas would be most welcome :) Thanks for your help so far!

lludlow commented on June 18, 2024

I was seeing something similar: no errors, but I need to restart a few times a day to get notifications back.

cmatthias commented on June 18, 2024

Hi @philippnormann, I wanted to update you on this issue: I discovered there was a configuration issue with the VM that I was running my chia node and chia-monitor on. The VM's disk image was living on an NFS-mounted directory on the host, meaning the VM's storage was not local to the host and instead was going out over a (probably overtaxed) gigabit network. I discovered this after noticing some other weird I/O behavior on the VM (unrelated to chia-monitor).

After moving the VM's disk to be local on the VM host, this issue disappeared and I have been successfully receiving notifications, even after backing out the changes you suggested earlier in the thread. I'm not 100% sure if there is still an underlying issue in the code that was exposed by my probably-close-to-worst-case I/O scenario, but I'll close this issue for now since the issue as I was experiencing it is resolved.

If you'd still like to investigate more, feel free to reopen this, and now that I know how to reproduce the issue I could set a VM back up for testing any potential fixes.

Thanks so much for your help!
