Comments (29)

Code-Slave avatar Code-Slave commented on May 22, 2024

to add. It did the first pass, then never again. Half the sources have stuff to grab, but there are no tasks for any channel except one.

from tubesync.

meeb avatar meeb commented on May 22, 2024

Do you have "scheduled" tasks on your Tasks page? Is there anything of note in your container logs? Can you provide the URL to a channel or playlist that isn't being indexed, and also upload a screenshot of your source settings page for the same source?

Code-Slave avatar Code-Slave commented on May 22, 2024

Only the one scheduled task after the initial run. It's like they ran to do the initial load, then never created the refresh task that should run every 24 hours.
https://www.youtube.com/c/halfasinteresting/

Screenshot from 2021-01-11 21-18-50
Screenshot from 2021-01-11 21-18-03

Code-Slave avatar Code-Slave commented on May 22, 2024

table background_task

Screenshot from 2021-01-11 21-21-41

meeb avatar meeb commented on May 22, 2024

Odd, does the "reset tasks" button fix it? I did encounter some weird race conditions with task management through Django signals when I was building it which is why there's a "reset tasks" button.

Code-Slave avatar Code-Slave commented on May 22, 2024

There we go: reset tasks, and now db lock errors, which is likely why nothing is being added. This is running in Docker, just FYI. It looks like the db locks while adding tasks after a reset, so only the first source gets added. If I add a new source, it's fine and the task gets added. If I hit reset, it only re-adds the first source and errors out due to the db lock. So somewhere, either during a source update or during a reset, something is hitting a lock. (I had not reset tasks before, but I did update sources.)

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/sqlite3/base.py", line 413, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.OperationalError: database is locked

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/background_task/tasks.py", line 43, in bg_runner
    func(*args, **kwargs)
  File "/app/sync/tasks.py", line 159, in index_source_task
    source.save()
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/base.py", line 754, in save
    force_update=force_update, update_fields=update_fields)
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/base.py", line 803, in save_base
    update_fields=update_fields, raw=raw, using=using,
  File "/usr/local/lib/python3.7/dist-packages/django/dispatch/dispatcher.py", line 179, in send
    for receiver in self._live_receivers(sender)
  File "/usr/local/lib/python3.7/dist-packages/django/dispatch/dispatcher.py", line 179, in <listcomp>
    for receiver in self._live_receivers(sender)
  File "/app/sync/signals.py", line 63, in source_post_save
    media.save()
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/base.py", line 754, in save
    force_update=force_update, update_fields=update_fields)
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/base.py", line 792, in save_base
    force_update, using, update_fields,
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/base.py", line 873, in _save_table
    forced_update)
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/base.py", line 926, in _do_update
    return filtered._update(values) > 0
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/query.py", line 803, in _update
    return query.get_compiler(self.db).execute_sql(CURSOR)
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/sql/compiler.py", line 1522, in execute_sql
    cursor = super().execute_sql(result_type)
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/sql/compiler.py", line 1156, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.7/dist-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/sqlite3/base.py", line 413, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.OperationalError: database is locked

Rescheduling task Index media from source "731woodworks" for 0:00:06 later at 2021-01-12 13:48:04.688548+00:00

[2021-01-12 08:48:06 -0500] [437] [CRITICAL] WORKER TIMEOUT (pid:460)

meeb avatar meeb commented on May 22, 2024

Ah, is your SQLite database / /config volume for TubeSync on an NFS share or something similar?

Code-Slave avatar Code-Slave commented on May 22, 2024

Nope, just a normal Docker mount. I have 25-30 containers running that use SQLite with no issues.

meeb avatar meeb commented on May 22, 2024

Odd, I've not encountered this myself. The gunicorn workers that host the front end and the background task workers do all run in the shared container and access the same sqlite db, but it really shouldn't be enough load to cause locking issues. Do you have anything else accessing the sqlite database? Sqlite viewer tool or anything? Did you tweak any of the advanced experimental options like GUNICORN_WORKERS or TUBESYNC_WORKERS env vars? There's nothing else fancy in TubeSync that would make it different from any other container that uses sqlite. You could try making sure nothing has the sqlite database open at all, setting TUBESYNC_WORKERS and GUNICORN_WORKERS both to 1, restarting the container and then try the "reset tasks" button again. If not, check fuser /path/to/db.sqlite or similar commands on the host as something must be locking it somewhere. If you have an SQLite database viewer connected to it, that will be the cause of the locking.

Code-Slave avatar Code-Slave commented on May 22, 2024

Nope, I left the Dockerfile pretty much standard. I'll reset the container, change those to 1, and see how that works. The viewer is using a copy of the db for exactly that reason (I'm a DBA by trade).

meeb avatar meeb commented on May 22, 2024

Ah, useful for debugging. Well, it's really just standard Django with nothing fancy and no long term open transactions or anything weird. All queries are atomic by default and unless a write takes an absurdly long amount of time or some process somewhere is locking the db I can't see how it would get locked to the state it can't handle single threaded writes from one worker. Typically, almost all the writes come from a single background worker process. The front end doesn't really do writes other than a few of the admin-ish buttons like "reset tasks" so it's really not designed in a way that can get easily contended on the db. Let me know if the above fixes it for you or you find out any additional info.
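
To make the failure mode concrete, here is a minimal sketch of how one connection holding an open write transaction makes a second writer fail with the same "database is locked" message. It uses only Python's standard sqlite3 module, not TubeSync or Django code, and the table and function names are hypothetical.

```python
import os
import sqlite3
import tempfile


def try_concurrent_write(db_path, timeout_seconds):
    """Return the error message a second writer gets, or None if it succeeds."""
    # isolation_level=None means transactions are controlled manually.
    holder = sqlite3.connect(db_path, isolation_level=None)
    writer = sqlite3.connect(db_path, timeout=timeout_seconds, isolation_level=None)
    try:
        holder.execute(
            "CREATE TABLE IF NOT EXISTS media (id INTEGER PRIMARY KEY, title TEXT)"
        )
        # The first connection takes the write lock and keeps its transaction open.
        holder.execute("BEGIN IMMEDIATE")
        holder.execute("INSERT INTO media (title) VALUES ('held by first writer')")
        try:
            # The second connection waits up to timeout_seconds for the lock.
            writer.execute("INSERT INTO media (title) VALUES ('second writer')")
        except sqlite3.OperationalError as exc:
            return str(exc)
        return None
    finally:
        if holder.in_transaction:
            holder.execute("ROLLBACK")
        holder.close()
        writer.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        print(try_concurrent_write(os.path.join(tmp_dir, "demo.sqlite3"), 0.1))
```

The point of the sketch is only that a lock error needs a second connection with an unfinished write, which is why the questions above focus on what else has the file open.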

Code-Slave avatar Code-Slave commented on May 22, 2024

It's pulling from a new source right now and got the same error when it went to reschedule a specific media item. I haven't reset it yet, as it's in the middle of a pull for a new source.

meeb avatar meeb commented on May 22, 2024

I've absolutely no idea how a single worker process issuing writes locks the database if nothing else at all is accessing the db. I'll put it on my todo list to check the code and make sure it's properly transactional everywhere though.

meeb avatar meeb commented on May 22, 2024

You could also always try the old echo ".dump" | sqlite existing.db | sqlite rebuilt.db trick to see if it was a corruption issue with the db itself as well, of course. Do it on a copy of the db and switch it out (with the container stopped) as a test.

Code-Slave avatar Code-Slave commented on May 22, 2024

I've been able to trigger it by updating a source while a task is running against that same source. Not always, but often enough to be obvious. Because of all the goofing around, I've now reset it all, imported my source table from the old db, and reset jobs. It's working so far, but I'm not updating sources or anything either. I think what may have started it was that I added a bunch of sources and then, while the jobs were running, edited settings because I had goofed up on naming. That's when the jobs started not showing up.

Code-Slave avatar Code-Slave commented on May 22, 2024

Update: it seems that if an indexing job is running (right now thumbnails), something triggers a reschedule for another indexing job on a different source. That reschedule fails with a db lock.

Code-Slave avatar Code-Slave commented on May 22, 2024

`
2021-01-12 10:31:34,745 [tubesync/INFO] Indexed media: 731woodworks / DCyBplfL2dU
2021-01-12 10:31:35,232 [tubesync/INFO] Scheduling task to download thumbnail for: What Tools Do You Need For Woodworking? from: https://i.ytimg.com/vi_webp/D7syYARDiug/maxresdefault.webp?v=5f3c9a6c
2021-01-12 10:31:35,789 [tubesync/INFO] Indexed media: 731woodworks / D7syYARDiug
2021-01-12 10:31:35,954 [tubesync/INFO] Scheduling task to download thumbnail for: Top 5 Woodworking Projects That Sell from: https://i.ytimg.com/vi/7pIKH-BCLoA/maxresdefault.jpg
2021-01-12 10:31:36,582 [tubesync/INFO] Indexed media: 731woodworks / 7pIKH-BCLoA
Rescheduling Index media from source "anawhitediy"
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/sqlite3/base.py", line 413, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.OperationalError: database is locked

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/background_task/tasks.py", line 43, in bg_runner
    func(*args, **kwargs)
  File "/app/sync/tasks.py", line 207, in index_source_task
    media.save()
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/base.py", line 754, in save
    force_update=force_update, update_fields=update_fields)
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/base.py", line 792, in save_base
    force_update, using, update_fields,
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/base.py", line 895, in _save_table
    results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/base.py", line 935, in _do_insert
    using=using, raw=raw,
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/query.py", line 1254, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/sql/compiler.py", line 1397, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.7/dist-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/sqlite3/base.py", line 413, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.OperationalError: database is locked
2021-01-12 10:31:37,002 [tubesync/INFO] Scheduling task to download thumbnail for: Woodworking Projects for Beginners from: https://i.ytimg.com/vi_webp/6cpktYBUPv8/maxresdefault.webp
2021-01-12 10:31:37,372 [tubesync/INFO] Indexed media: 731woodworks / 6cpktYBUPv8
2021-01-12 10:31:37,772 [tubesync/INFO] Scheduling task to download thumbnail for: DIY Floating Nightstand With Storage from: https://i.ytimg.com/vi_webp/3k9CZ059gAo/maxresdefault.webp
2021-01-12 10:31:38,147 [tubesync/INFO] Indexed media: 731woodworks / 3k9CZ059gAo
2021-01-12 10:31:38,583 [tubesync/INFO] Scheduling task to download thumbnail for: Make Your Workbench Mobile from: https://i.ytimg.com/vi_webp/uNmDuevPlR8/maxresdefault.webp`

meeb avatar meeb commented on May 22, 2024

Might be something damaged with your SQLite db (e.g. hard kill on a process which was writing to the db at the time), try a "repair":

  1. Stop the container
  2. Copy the db.sqlite3 file somewhere
  3. Run echo ".dump" | sqlite db.sqlite3 | sqlite db-fixed.sqlite3 on the copy (this command might be sqlite3 not sqlite depending on your setup)
  4. Check db-fixed.sqlite3 is valid (opens in the SQLite CLI or SQLite viewer etc.)
  5. Delete the old db.sqlite3 file and make sure to delete any hidden .*-journal files that might be hanging about
  6. Move the db-fixed.sqlite3 file back to where the original db.sqlite3 file was
  7. Check permissions and ownership on db.sqlite3 are correct
  8. Start the container again
  9. Click "reset tasks" again

If that doesn't work, I'll have to put the ticket on my backlog to experiment with later at some point as I'll need to see if I can replicate your issue to have any hope of finding a possible cause.
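
For anyone without the SQLite CLI on the host, the dump-and-rebuild step above can be approximated with Python's standard sqlite3 module. This is a minimal sketch, not part of TubeSync: the function name is hypothetical, it should be run on a copy with the container stopped, and the target file must not already exist.

```python
import sqlite3


def rebuild_database(source_path, rebuilt_path):
    """Dump source_path to SQL text and replay it into a fresh rebuilt_path."""
    source = sqlite3.connect(source_path)
    rebuilt = sqlite3.connect(rebuilt_path)
    try:
        # iterdump() yields the database as SQL statements, similar to the
        # ".dump" command in the sqlite3 shell.
        rebuilt.executescript("\n".join(source.iterdump()))
        # integrity_check returns a single row containing "ok" for a healthy file.
        return rebuilt.execute("PRAGMA integrity_check").fetchone()[0]
    finally:
        source.close()
        rebuilt.close()
```

Calling rebuild_database("db.sqlite3", "db-fixed.sqlite3") covers steps 3 and 4; the remaining steps (swapping files, journal cleanup, permissions) still have to be done by hand.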

Code-Slave avatar Code-Slave commented on May 22, 2024

This is from a brand new db. I will also do the above, but every time I've run checks the db has been fine. I spent time in the code today and, yeah, it's pretty standard Django. After it catches up, I'm going to dump the sources and the tasks in case it happens again. You might want to offer a "clear and reschedule tasks" option on a per-source basis instead of a global one.

meeb avatar meeb commented on May 22, 2024

The next release will have two split background workers, one that indexes media and one that downloads media, rather than N pools of workers doing generic tasks. This could probably help with locking issues on busy writes. I'll also spend some time wrapping a bunch of the heavier events in explicit transactions rather than leaving it up to Django magic internals.
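
As a rough illustration of what "explicit transactions" means here: a group of related writes either all commit or all roll back, and the write lock is held for one short, well-defined window. In Django the usual tool for this is transaction.atomic(). The sketch below uses the standard sqlite3 module instead so that it runs on its own, and the table and function names are hypothetical, not TubeSync's.

```python
import sqlite3


def save_media_batch(conn, source_name, titles):
    """Insert a source's media rows as one all-or-nothing transaction."""
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        for title in titles:
            conn.execute(
                "INSERT INTO media (source, title) VALUES (?, ?)",
                (source_name, title),
            )


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE media (source TEXT NOT NULL, title TEXT NOT NULL)")
    save_media_batch(demo, "example-source", ["first video", "second video"])
    print(demo.execute("SELECT COUNT(*) FROM media").fetchone()[0])
    demo.close()
```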

Code-Slave avatar Code-Slave commented on May 22, 2024

I think that will help. To me it looks like things are just stepping on each other at certain times when it's busy.

Code-Slave avatar Code-Slave commented on May 22, 2024

I'm wondering if this is a quantity issue. I have 2800 media items, and every reset-tasks run dies after inserting the first refresh media task, where it downloads the pages. I can send you my db if you like. It's 128 MB but should zip well.

meeb avatar meeb commented on May 22, 2024

The SQLite db should be able to handle millions of entries on paper. The issue is with locking when multiple writes happen at once. Assuming you have the TUBESYNC_WORKERS env var set to 1, this really shouldn't be possible, as it'll be one single non-threaded worker writing to it. Either way, it should get better when I next get a chance to go through these issues for the next release. Probably OK without the DB file for now; thanks for the offer, and we can look at that if issues persist after the next release. Cheers for the feedback so far.
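
One general Django-level mitigation for "database is locked" on SQLite is to raise the connection's lock timeout, so a writer waits longer before giving up. The fragment below is a sketch of a generic Django settings entry, not TubeSync's actual configuration; the /config path and the 20-second value are assumptions chosen for illustration.

```python
from pathlib import Path

# Hypothetical location of the database file; the real path may differ.
CONFIG_DIR = Path("/config")

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": str(CONFIG_DIR / "db.sqlite3"),
        # Seconds a connection waits for a lock to clear before raising
        # "database is locked". This only helps with short-lived contention.
        "OPTIONS": {"timeout": 20},
    }
}
```

A longer timeout hides brief contention between writers. It does not help if some process holds the lock indefinitely.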

Code-Slave avatar Code-Slave commented on May 22, 2024

No problem. Both env vars are set to 1 right now and it's still happening. I'll wait for the next release, as it keeps killing my tasks.

meeb avatar meeb commented on May 22, 2024

Have you tried :v0.9 or :latest? Any improvements with your issues?

Code-Slave avatar Code-Slave commented on May 22, 2024

Running latest. Sometimes it's better, sometimes the same. Basically it still gets to a point where I have to reset tasks because of the locks, and then it can never add all the new tasks again, also because of the locks. I'm not using network shares or anything, just Docker on a Synology NAS. I do want to test on a plain Ubuntu server soon to see if it's something particular to Synology.

meeb avatar meeb commented on May 22, 2024

I've not got a NAS handy to test it on; however, I suspect the advice for you would be to use Postgres once support is added.

Code-Slave avatar Code-Slave commented on May 22, 2024

That's my plan, as I hoard lots of channels.

cleverestx avatar cleverestx commented on May 22, 2024

Thanks for this thread. How do I add Postgres to TubeSync? I'm a noob when it comes to Postgres and have just installed it. Both are running in containers on the same network, but how do I get them to "talk"?
