Comments (7)
Hi @philippnormann, thanks for the reply! I first tried making that change but was still receiving the same error, unfortunately.
However, I'm now testing an additional fix, which involves adding a couple of arguments to create_async_engine in monitor/database/__init__.py. Setting up the engine using the following line has resulted in no "database is locked" errors for about 20 minutes, which is way better than what I was seeing before.
engine = create_async_engine(DATABASE_URL, echo=False, connect_args={'timeout': 30.0, 'isolation_level': None})
The reason I tried this was that I guessed that somewhere in the code a commit() was being missed, resulting in a transaction locking the DB for longer than it should. As far as I understand it, the timeout: 30 arg means "wait for the lock to be released for up to 30 sec before throwing an exception", and the isolation_level: None arg forces the Python sqlite library to use autocommit mode.
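For anyone following along, here is a minimal sketch of what those two arguments do, using only the standard-library sqlite3 module rather than the SQLAlchemy async engine that chia-monitor uses. The function name, table, and the short 0.2 s timeout are made up for illustration; the open transaction is created on purpose to mimic the "missed commit()" guess above.

```python
import os
import sqlite3
import tempfile


def demo_lock_timeout(timeout_s=0.2):
    """Hold a write lock on one connection and watch a second one time out."""
    path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")

    # isolation_level=None: the sqlite3 module opens no implicit transactions,
    # so every statement commits on its own unless we issue BEGIN ourselves.
    holder = sqlite3.connect(path, isolation_level=None)
    holder.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, msg TEXT)")

    # Simulate the suspected bug: a write transaction that is left open.
    holder.execute("BEGIN IMMEDIATE")
    holder.execute("INSERT INTO events (msg) VALUES ('uncommitted')")

    # timeout=...: how long this connection waits for the lock before raising.
    waiter = sqlite3.connect(path, timeout=timeout_s, isolation_level=None)
    try:
        waiter.execute("INSERT INTO events (msg) VALUES ('blocked')")
        outcome = "ok"
    except sqlite3.OperationalError as exc:
        outcome = str(exc)

    # Once the open transaction commits, the second writer gets through.
    holder.execute("COMMIT")
    waiter.execute("INSERT INTO events (msg) VALUES ('after commit')")
    rows = waiter.execute("SELECT COUNT(*) FROM events").fetchone()[0]

    holder.close()
    waiter.close()
    return outcome, rows


if __name__ == "__main__":
    print(demo_lock_timeout())
```

A longer timeout only helps if the lock holder eventually commits; it does not fix a transaction that stays open indefinitely.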
I also have the "sequential await" fix that you suggested in place still. Not sure if that's required with the above changes; I'll test removing it at some point.
The timeout arg is probably longer than it needs to be, and I'm not sure how you feel about autocommit mode, but these changes make things more stable for me. I will let this run for some time today and report back. Thanks again!
from chia-monitor.
Hey @cmatthias, thanks for the elaborate bug report and your initial debugging efforts 🙂
You are the first one reporting this kind of issue, but I agree with your suspicion that too many concurrent queries to the DB might be the root cause. We could try to reduce the number of simultaneous queries by awaiting the notifications sequentially.
Could you try to replace the following lines in notifier.py

    tasks = [n.run() for n in self.notifications]
    await asyncio.gather(*tasks)

with

    for notification in self.notifications:
        await notification.run()

and report if that helps to resolve your issue?
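To illustrate why this change reduces simultaneous queries, here is a small self-contained sketch. FakeNotification and peak_concurrency are hypothetical stand-ins, not chia-monitor code; the sleep takes the place of a DB query so we can count how many run() calls are in flight at once.

```python
import asyncio


class FakeNotification:
    """Hypothetical stand-in for a notification whose run() queries the DB."""

    active = 0  # run() calls currently in flight
    peak = 0    # highest value `active` has reached

    async def run(self):
        cls = type(self)
        cls.active += 1
        cls.peak = max(cls.peak, cls.active)
        await asyncio.sleep(0.01)  # pretend this is the DB query
        cls.active -= 1


async def peak_concurrency(sequential, count=5):
    """Run `count` notifications and return how many overlapped at most."""
    FakeNotification.active = 0
    FakeNotification.peak = 0
    notifications = [FakeNotification() for _ in range(count)]

    if sequential:
        # Each run() finishes before the next one starts.
        for notification in notifications:
            await notification.run()
    else:
        # gather() schedules all of them at once, so they overlap.
        await asyncio.gather(*(n.run() for n in notifications))

    return FakeNotification.peak


if __name__ == "__main__":
    print("gather:    ", asyncio.run(peak_concurrency(sequential=False)))
    print("sequential:", asyncio.run(peak_concurrency(sequential=True)))
```

The trade-off is that the sequential loop takes roughly the sum of all run() durations instead of the longest one.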
from chia-monitor.
Hey @cmatthias, I don't feel great about this kind of workaround, but I also failed to find any mistake in our DB usage and therefore suspect the bug to lie somewhere in the async implementation of SQLAlchemy. For now, enabling AUTOCOMMIT and increasing the timeout seems like a good way to mitigate the issue. The documentation I found suggests slightly different parameters:
engine = create_async_engine(DATABASE_URL,
                             echo=False,
                             isolation_level="AUTOCOMMIT",
                             connect_args={'timeout': 30.0})
Not sure if it makes any difference, but I would definitely be curious to hear your results for this variant and the one you suggested without the "sequential await" fix. Thanks again for being so helpful!
from chia-monitor.
Hey again @philippnormann, thanks for the extra info! It turns out I did eventually get DB lock errors (after about an hour) with my original set of changes. I just changed the isolation_level argument to the style you suggested (which I agree looks right based on the docs) and I'll let you know if that does any better.
from chia-monitor.
Looks like I'm still seeing this issue even with your suggested change. I'm racking my brain trying to figure out why no one else sees this, because it's consistently reproducible on my machine. I'm on an Ubuntu 21.04 VM running Python 3.9.5, which doesn't seem like it'd be an unusual setup. The underlying storage for the VM is an SSD (so it's not unusually slow). I am running my full chia node, wallet, and harvester on the same machine, so maybe that has something to do with it?
Any other thoughts/ideas would be most welcome :) Thanks for your help so far!
from chia-monitor.
Was seeing something similar, no errors, but I need to restart a few times a day to get notifications back
from chia-monitor.
Hi @philippnormann, wanted to update you on this issue: I discovered there was a configuration issue with the VM that I was running my chia node and chia-monitor on. The VM's disk image was living on an NFS-mounted directory on the host, meaning the VM's storage was not local to the host and instead was going out over a (probably overtaxed) gigabit network. I discovered this after noticing some other weird I/O behavior on the VM (unrelated to chia-monitor).
After moving the VM's disk to be local on the VM host, this issue disappeared and I have been successfully receiving notifications, even after backing out the changes you suggested earlier in the thread. I'm not 100% sure if there is still an underlying issue in the code that was exposed by my probably-close-to-worst-case I/O scenario, but I'll close this issue for now since the issue as I was experiencing it is resolved.
If you'd still like to investigate more, feel free to reopen this, and now that I know how to reproduce the issue I could set a VM back up for testing any potential fixes.
Thanks so much for your help!
from chia-monitor.