Comments (10)
It should be the responsibility of the caller to not cause duplicates, which is the same semantics of Phoenix.PubSub.subscribe. Could you provide the stacktrace of the crash? This is not likely something I will address, but we can document is similarly to how we document subscribe:
https://github.com/phoenixframework/phoenix_pubsub/blob/master/lib/phoenix/pubsub.ex#L119-L126
I believe that operation should be idempotent(except for phx_ref), replacing the old record with new meta.
The Tracker.update
api accommodates this.
from phoenix_pubsub.
@chrismccord Even in that case, I believe server should return error tuple, instead of accepting state that will cause crash later. I fixed the bug in the app, but some sort of checking internal consistency would be useful in my opinion. Here is stacktrace:
Error: "GenServer MyApp.Presence terminating no match of right hand side value: 2"
(phoenix_pubsub) lib/phoenix/tracker/state.ex:331→ Phoenix.Tracker.State.remove/4
(elixir) lib/enum.ex:1623→ Enum."-reduce/3-lists^foldl/2-0-"/3
(phoenix_pubsub) lib/phoenix/tracker.ex:420→ Phoenix.Tracker.drop_presence/2
(phoenix_pubsub) lib/phoenix/tracker.ex:329→ Phoenix.Tracker.handle_info/2
/(stdlib) gen_server.erl:615→ :gen_server.try_dispatch/4
/(stdlib) gen_server.erl:681→ :gen_server.handle_msg/5
/(stdlib) proc_lib.erl:240→ :proc_lib.init_p_do_apply/3
I also do have another error on my tracker, but since I haven't got to the bottom of it yet, I only paste stacktrace for documentation purposes:
(phoenix_pubsub) lib/phoenix/tracker/state.ex:294→ anonymous fn/3 in Phoenix.Tracker.State.remove_down_replicas/2
(elixir) lib/enum.ex:1623→ Enum."-reduce/3-lists^foldl/2-0-"/3
(phoenix_pubsub) lib/phoenix/tracker/state.ex:293→ Phoenix.Tracker.State.remove_down_replicas/2
(phoenix_pubsub) lib/phoenix/tracker.ex:518→ Phoenix.Tracker.permdown/2
(phoenix_pubsub) lib/phoenix/tracker.ex:435→ Phoenix.Tracker.handle_heartbeat/2
(phoenix_pubsub) lib/phoenix/tracker.ex:289→ Phoenix.Tracker.handle_info/2
/(stdlib) gen_server.erl:615→ :gen_server.try_dispatch/4
/(stdlib) gen_server.erl:681→ :gen_server.handle_msg/5
from phoenix_pubsub.
Even in that case, I believe server should return error tuple, instead of accepting state that will cause crash later.
The issue is tracking becomes more expensive since we have to read to check if we are already tracked, then write. This also opens up race conditions which again are fixed by further serializing the requests thru a server, but at a perf cost. The cause of the crash is an easy fix since I am matching on 1 = :ets.select_delete(...)
as a sanity check, but I'm not still convinced this is something we should check for as there is no circumstance where the code is correct when duping track calls.
from phoenix_pubsub.
@chrismccord I think that this sanity check is very much in place. If it wasn't async operation, I would suggest returning the number of untracked instances, so the client could perform that check.
The question is: should the server be crashing because of it or not? If you think it should(as an easy precaution against unnoticed tracking issues), then this issue can be closed. Otherwise, the problem is how to (and if) inform the programmer, that there is a possible issue with duplicated tracking.
from phoenix_pubsub.
from phoenix_pubsub.
I forgot we were already serializing track
thru the tracker server, so checking for dups is a relatively cheap ets lookup, which is the same we do for update
. Thanks!
from phoenix_pubsub.
@chrismccord That was truly fast. Thanks a lot!
from phoenix_pubsub.
Dear Phoenix PubSub team,
we're having recurring Presence related errors which refer to @chrismccord 's quote above
The cause of the crash is an easy fix since I am matching on 1 = :ets.select_delete(...) as a sanity check, but I'm not still convinced this is something we should check for as there is no circumstance where the code is correct when duping track calls.
specifically
Child Phoenix.Tracker of Supervisor #PID<0.1949.0> (Supervisor.Default) terminated ** (exit) an exception was raised:
** (MatchError) no match of right hand side value: 6
(phoenix_pubsub) lib/phoenix/tracker/state.ex:294: anonymous fn/3 in Phoenix.Tracker.State.remove_down_replicas/2
(elixir) lib/enum.ex:1755: Enum."-reduce/3-lists^foldl/2-0-"/3
(phoenix_pubsub) lib/phoenix/tracker/state.ex:293: Phoenix.Tracker.State.remove_down_replicas/2 (phoenix_pubsub) lib/phoenix/tracker.ex:528: Phoenix.Tracker.permdown/2
(phoenix_pubsub) lib/phoenix/tracker.ex:447: Phoenix.Tracker.handle_heartbeat/2
(phoenix_pubsub) lib/phoenix/tracker.ex:305: Phoenix.Tracker.handle_info/2
(stdlib) gen_server.erl:601: :gen_server.try_dispatch/4
(stdlib) gen_server.erl:667: :gen_server.handle_msg/5 Pid: #PID<0.1951.0> Start Call: Phoenix.Tracker.start_link(Radio.Presence, [name: Radio.Presence, otp_app: :radio, pubsub_server: Journal.PubSub], [name: Radio.Presence, otp_app: :radio, pubsub_server: Journal.PubSub])
Restart: :permanent Shutdown: 5000 Type: :worker
the match error is due to
https://github.com/phoenixframework/phoenix_pubsub/blob/v1.0.1/lib/phoenix/tracker/state.ex#L294
i.e.
1 = :ets.select_delete(pids, [{{pid, topic, key}, [], [true]}])
fails on our app with matching with right hand values of 2
or 6
as in the stacktrace above.
We're using Phoenix Presence outside of the ordinary Channel flow, but to detect when
our satellite applications come up on remote erlang nodes after being scheduled.
Our setup consists of a applications of type A
and applications of type B
which run on different erlang nodes and both subscribe to the same topic runner:<runner_id>
Only nodes of type B
track they're presence (with Phoenix.Presence.track(self(), "runner:<runner-id>", "runner", %{some-meta})
whenever the application starts (and only then). Type B
applications can go down and up quite frequently for the same <runner_id>
even within short period of time.
Crashes above happen only on nodes of type A
.
Our guess is that there are collisions in the triple {pid, topic, key}
for replicas on A
-nodes but coming from B
-nodes, because the pid
would always have the same sequential index per same topic (key is a fixed string).
9f654a0 wouldn't help on our setup because guards against duplicate tracks on B
-apps would always pass.
Would it make sense to relax the sanity check above concerning number of triples present in the ETS table at the moment of removing a down replica.
Cheers Andrea :-)
from phoenix_pubsub.
@zampino I believe your issues have been fixed on the cm-clouds
branch. Can you try the branch and attempt to replicate your issues? Thanks!
from phoenix_pubsub.
@chrismccord thanks! I'll check that out
from phoenix_pubsub.
Related Issues (20)
- Network traffic up for 20 minutes after a server restart
- Release v1.1.3 suppressing warnings HOT 1
- Where is version 2.0? HOT 1
- It may be helpful for users, if parallel option can be provided when starting Registry. Without it partitions in registry executes sequentially. HOT 1
- unknown registry: XXX.PubSub HOT 4
- RFC: Provide a way to mass update presences HOT 1
- topic forced to be string HOT 1
- Race condition causing data inconsistency when nodes are coming up
- Does Phoenix.PubSub itself support subscribing to wildcard topics? HOT 1
- PubSub registry fails to start or race condition when testing? HOT 3
- [Feature Request] Be able to transform a module into a Pub.Sub HOT 6
- Unsubscribe all HOT 2
- v2.1 and v2.0 are incompatible
- Odd Jason Encoding error after new release HOT 2
- Broken Build Status link
- Presence list keep growing when using Presence.update HOT 3
- Presence stops working after ~1 week HOT 5
- Politely request release of 2.1.4
- Tracker: join/leave within broadcast_period unclear in handle_diff/2 HOT 5
- RFC: expose erlang pg's scope ability for Phoenix.PubSub.PG2 HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from phoenix_pubsub.