Comments (14)
It didn't crash after 150 minutes, so I assume the issue has something to do with my specific code (maybe the cursor stuff? not sure.)
Either way, thanks for the help! ^^
from atproto.
We stumbled upon the same error and for now work around it by catching the panic (which raises a BaseException, not a normal Exception). This way the malformed message just gets ignored, and so far it has been running stably for 3 days.
try:
    commit = parse_subscribe_repos_message(message)
    # we need to be sure that it's commit message with .blocks inside
    if not isinstance(commit, models.ComAtprotoSyncSubscribeRepos.Commit):
        return
    repo = commit.repo
    car = CAR.from_bytes(commit.blocks)
except Exception as e:
    print(f"Error parsing Commit\n{e}")
    return
except BaseException as e:
    print(f"Error parsing Commit\n{e}")
    return
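Why the second except clause is needed can be shown without the Rust extension. This is a minimal sketch, assuming a stand-in class `FakePanic` and a stand-in function `decode` (both hypothetical, not part of atproto) that imitate an error type deriving from BaseException instead of Exception:

```python
class FakePanic(BaseException):
    """Hypothetical stand-in for a panic surfaced by a native extension."""


def decode(blocks: bytes) -> str:
    """Hypothetical decoder that 'panics' on empty input."""
    if not blocks:
        raise FakePanic("failed to parse uvarint for header")
    return f"decoded {len(blocks)} bytes"


def classify(blocks: bytes) -> str:
    """Report which handler, if any, ends up catching the error."""
    try:
        decode(blocks)
        return "ok"
    except Exception:
        # Never reached for FakePanic: it does not inherit from Exception.
        return "caught by Exception"
    except BaseException:
        # Catches FakePanic, but would also catch KeyboardInterrupt and SystemExit.
        return "caught by BaseException"
```

Since `except BaseException` also swallows KeyboardInterrupt and SystemExit, a handler this broad is probably best treated as a temporary workaround.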
Could you try to replace try-catch with "if not commit.blocks: return" pls
Full stack trace with RUST_BACKTRACE=1; maybe something here helps:
thread '<unnamed>' panicked at src/lib.rs:164:67:
called `Result::unwrap()` on an `Err` value: Parsing("failed to parse uvarint for header")
stack backtrace:
0: rust_begin_unwind
at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:595:5
1: core::panicking::panic_fmt
at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:67:14
2: core::result::unwrap_failed
at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/result.rs:1652:5
3: libipld::_::__pyfunction_decode_car
4: pyo3::impl_::trampoline::trampoline
5: libipld::_::<impl libipld::decode_car::MakeDef>::DEF::trampoline
6: _PyEval_EvalFrameDefault
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Python/ceval.c:5096:29
7: _PyEval_EvalFrame
at /tmp/python-build.20230707110648.1880/Python-3.11.4/./Include/internal/pycore_ceval.h:73:16
8: _PyEval_Vector
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Python/ceval.c:6439:24
9: _PyVectorcall_Call
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Objects/call.c:245:16
10: _PyObject_Call
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Objects/call.c:328:16
11: do_call_core
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Python/ceval.c:7357:12
12: _PyEval_EvalFrameDefault
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Python/ceval.c:5381:22
13: _PyEval_EvalFrame
at /tmp/python-build.20230707110648.1880/Python-3.11.4/./Include/internal/pycore_ceval.h:73:16
14: _PyEval_Vector
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Python/ceval.c:6439:24
15: _PyObject_FastCallDictTstate
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Objects/call.c:141:15
16: _PyObject_Call_Prepend
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Objects/call.c:482:24
17: slot_tp_init
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Objects/typeobject.c:7863:15
18: type_call
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Objects/typeobject.c:1112:19
19: _PyObject_MakeTpCall
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Objects/call.c:214:18
20: _PyEval_EvalFrameDefault
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Python/ceval.c:4774:23
21: _PyEval_EvalFrame
at /tmp/python-build.20230707110648.1880/Python-3.11.4/./Include/internal/pycore_ceval.h:73:16
22: _PyEval_Vector
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Python/ceval.c:6439:24
23: PyEval_EvalCode
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Python/ceval.c:1154:21
24: run_eval_code_obj
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Python/pythonrun.c:1714:9
25: run_mod
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Python/pythonrun.c:1735:19
26: PyRun_StringFlags
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Python/pythonrun.c:1605:15
27: PyRun_SimpleStringFlags
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Python/pythonrun.c:487:9
28: pymain_run_command
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Modules/main.c:255:11
29: pymain_run_python
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Modules/main.c:592:21
30: Py_RunMain
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Modules/main.c:680:5
31: pymain_main
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Modules/main.c:710:12
32: Py_BytesMain
at /tmp/python-build.20230707110648.1880/Python-3.11.4/Modules/main.c:734:12
33: __libc_start_main
34: _start
That's pretty strange!
My code is a little different for handling the firehose. I removed saving the cursor, and it certainly doesn't continually update the client's params (this was back before we had a wonderful rust library to speed up the python sdk.) Could that be the issue?
There's also only one worker thread (because of AWS limitations, can't use multiprocessing.Queue) and stuff is passed from the main thread to the worker with a multiprocessing.Pipe object instead. I'd be surprised if that's an issue, though, as I bet Queue uses pipes to communicate between threads too...
I'll try running the minimal example linked above (process_commits.py) for a while and see if I can reproduce the issue on my machine. I at least now have a bugfix that stops CAR.from_bytes() getting called on empty strings, so my feed is working for now =)
Could you try to replace try-catch with "if not commit.blocks: return" pls
This is almost exactly what I've been doing (and the feed is stable). It may be a bit safer than having a catch-all for every kind of exception:
if commit.blocks == b'' or len(commit.ops) == 0:
    return operation_by_type
"if not commit.blocks: return" should work too (actually, that's neater, and I might switch to that...)
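The two guards discussed here can be compared side by side. This is a minimal sketch, assuming a stand-in `FakeCommit` dataclass (hypothetical; the real commit model comes from atproto and has more fields):

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeCommit:
    """Hypothetical stand-in holding only the fields the guards look at."""
    repo: str
    blocks: Optional[bytes] = b""
    ops: List[str] = field(default_factory=list)


def skip_explicit(commit: FakeCommit) -> bool:
    """Guard spelled out: empty byte string or no operations."""
    return commit.blocks == b"" or len(commit.ops) == 0


def skip_truthy(commit: FakeCommit) -> bool:
    """Shorter guard: any falsy blocks value (b'' or None) is skipped."""
    return not commit.blocks
```

The truthiness check also covers a `blocks` value of None, while the explicit version additionally skips commits that carry blocks but no operations.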
pls add if statement like i did in the updated firehose example. feed generator repo has been updated too. i am closing it, thanks 42b74d4
Let's isolate from the feed-generator and try to reproduce it with this example: https://github.com/MarshalX/atproto/blob/main/examples/firehose/process_commits.py
i changed the 90th line from this:
to this:
client = FirehoseSubscribeReposClient(params, 'wss://bsky.network/xrpc')
and ran it locally on a Mac; the error doesn't happen, at least at start. Could you try too?
The error originates in the iroh-car lib, but the main question for now is what value of "commit.blocks" is passed to CAR.from_bytes. Could you pls log it?
The example has been working fine for more than 20 minutes, but I can reproduce the error. Please double-check whether you modified anything compared with the example I mentioned above. The error happens because an empty binary string is passed for decoding:
from atproto import CAR
CAR.from_bytes(b"")
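The panic message ("failed to parse uvarint for header") fits this reproduction: a CAR file starts with an unsigned varint giving the header length, so an empty input has no bytes to read. The following is a rough sketch of unsigned varint decoding with a hypothetical helper `read_uvarint`. It is not the iroh-car implementation, only an illustration of why empty input cannot succeed:

```python
from typing import Tuple


def read_uvarint(data: bytes) -> Tuple[int, int]:
    """Decode an unsigned LEB128 varint; return (value, bytes_consumed).

    Hypothetical helper for illustration only.
    """
    value = 0
    shift = 0
    for index, byte in enumerate(data):
        # The low 7 bits carry payload, least significant group first.
        value |= (byte & 0x7F) << shift
        # A clear high bit marks the final byte of the varint.
        if byte & 0x80 == 0:
            return value, index + 1
        shift += 7
    # Reached when data is empty or ends before a terminating byte.
    raise ValueError("failed to parse uvarint: input ended early")
```

With `b""` the loop body never runs, so the function falls straight through to the error.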
I tried running the firehose for a bit and yep, it looks like an empty binary string. These are the arguments in the commit that broke it:
commit.repo: 'did:plc:7ccq5bfdp3jk4acvcnnevds4'
commit.ops: [] # i.e. an empty list
commit.blocks: b'' # empty binary string - same as you
I get the same results when doing CAR.from_bytes(b''):
my firehose example still running fine. idk what kind of commit you are receiving :(
@emilyhunt Did you have to change anything to not do multiprocessing? I think I may have that same limitation and my firehose has not been acting right in production.
Do you mean not using a multiprocessing.Queue to allow the firehose to run on AWS? If so, then yes - here is the current firehose code I'm running, which just uses a single worker process and communicates between the main process and the worker with a multiprocessing.Pipe. This StackOverflow post has more context on this. As it stands, I don't think the default Python feed generator project can run on AWS at all; the code in firehose.py has to be modified. (It may be worth opening a new issue/discussion to chat about this further, or suggest a fix.)
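The single-worker arrangement described above could look roughly like the sketch below. It is not the code from the linked project: the names `worker_loop` and `run_with_worker`, and the use of None as a stop sentinel, are assumptions made for illustration.

```python
import multiprocessing


def worker_loop(receiver, handle):
    """Read messages from the pipe until the None sentinel arrives."""
    while True:
        message = receiver.recv()
        if message is None:
            break
        handle(message)


def run_with_worker(messages, handle):
    """Send each message to one worker process over a one-way pipe."""
    # duplex=False returns (receive-only end, send-only end).
    receiver, sender = multiprocessing.Pipe(duplex=False)
    worker = multiprocessing.Process(target=worker_loop, args=(receiver, handle))
    worker.start()
    for message in messages:
        sender.send(message)
    sender.send(None)  # tell the worker to stop
    worker.join()
```

A call such as `run_with_worker([b"a", b"b"], print)` would normally sit under an `if __name__ == "__main__":` guard, and `handle` has to be a picklable, module-level callable on platforms that spawn worker processes instead of forking them.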