Giter Club home page Giter Club logo

Comments (10)

chrismccord avatar chrismccord commented on July 29, 2024

It should be the responsibility of the caller to not cause duplicates, which is the same semantics of Phoenix.PubSub.subscribe. Could you provide the stacktrace of the crash? This is not likely something I will address, but we can document is similarly to how we document subscribe:

https://github.com/phoenixframework/phoenix_pubsub/blob/master/lib/phoenix/pubsub.ex#L119-L126

I believe that operation should be idempotent(except for phx_ref), replacing the old record with new meta.

The Tracker.update api accommodates this.

from phoenix_pubsub.

rradz avatar rradz commented on July 29, 2024

@chrismccord Even in that case, I believe server should return error tuple, instead of accepting state that will cause crash later. I fixed the bug in the app, but some sort of checking internal consistency would be useful in my opinion. Here is stacktrace:

Error: "GenServer MyApp.Presence terminating no match of right hand side value: 2"

(phoenix_pubsub) lib/phoenix/tracker/state.ex:331→ Phoenix.Tracker.State.remove/4
(elixir) lib/enum.ex:1623→ Enum."-reduce/3-lists^foldl/2-0-"/3
(phoenix_pubsub) lib/phoenix/tracker.ex:420→ Phoenix.Tracker.drop_presence/2
(phoenix_pubsub) lib/phoenix/tracker.ex:329→ Phoenix.Tracker.handle_info/2
/(stdlib) gen_server.erl:615→ :gen_server.try_dispatch/4
/(stdlib) gen_server.erl:681→ :gen_server.handle_msg/5
/(stdlib) proc_lib.erl:240→ :proc_lib.init_p_do_apply/3

I also do have another error on my tracker, but since I haven't got to the bottom of it yet, I only paste stacktrace for documentation purposes:

(phoenix_pubsub) lib/phoenix/tracker/state.ex:294→ anonymous fn/3 in Phoenix.Tracker.State.remove_down_replicas/2
(elixir) lib/enum.ex:1623→ Enum."-reduce/3-lists^foldl/2-0-"/3
(phoenix_pubsub) lib/phoenix/tracker/state.ex:293→ Phoenix.Tracker.State.remove_down_replicas/2
(phoenix_pubsub) lib/phoenix/tracker.ex:518→ Phoenix.Tracker.permdown/2
(phoenix_pubsub) lib/phoenix/tracker.ex:435→ Phoenix.Tracker.handle_heartbeat/2
(phoenix_pubsub) lib/phoenix/tracker.ex:289→ Phoenix.Tracker.handle_info/2
/(stdlib) gen_server.erl:615→ :gen_server.try_dispatch/4
/(stdlib) gen_server.erl:681→ :gen_server.handle_msg/5

from phoenix_pubsub.

chrismccord avatar chrismccord commented on July 29, 2024

Even in that case, I believe server should return error tuple, instead of accepting state that will cause crash later.

The issue is tracking becomes more expensive since we have to read to check if we are already tracked, then write. This also opens up race conditions which again are fixed by further serializing the requests thru a server, but at a perf cost. The cause of the crash is an easy fix since I am matching on 1 = :ets.select_delete(...) as a sanity check, but I'm not still convinced this is something we should check for as there is no circumstance where the code is correct when duping track calls.

from phoenix_pubsub.

rradz avatar rradz commented on July 29, 2024

@chrismccord I think that this sanity check is very much in place. If it wasn't async operation, I would suggest returning the number of untracked instances, so the client could perform that check.
The question is: should the server be crashing because of it or not? If you think it should(as an easy precaution against unnoticed tracking issues), then this issue can be closed. Otherwise, the problem is how to (and if) inform the programmer, that there is a possible issue with duplicated tracking.

from phoenix_pubsub.

josevalim avatar josevalim commented on July 29, 2024

from phoenix_pubsub.

chrismccord avatar chrismccord commented on July 29, 2024

I forgot we were already serializing track thru the tracker server, so checking for dups is a relatively cheap ets lookup, which is the same we do for update. Thanks!

from phoenix_pubsub.

rradz avatar rradz commented on July 29, 2024

@chrismccord That was truly fast. Thanks a lot!

from phoenix_pubsub.

zampino avatar zampino commented on July 29, 2024

Dear Phoenix PubSub team,

we're having recurring Presence related errors which refer to @chrismccord 's quote above

The cause of the crash is an easy fix since I am matching on 1 = :ets.select_delete(...) as a sanity check, but I'm not still convinced this is something we should check for as there is no circumstance where the code is correct when duping track calls.

specifically

Child Phoenix.Tracker of Supervisor #PID<0.1949.0> (Supervisor.Default) terminated ** (exit) an exception was raised:
** (MatchError) no match of right hand side value: 6
(phoenix_pubsub) lib/phoenix/tracker/state.ex:294: anonymous fn/3 in Phoenix.Tracker.State.remove_down_replicas/2
(elixir) lib/enum.ex:1755: Enum."-reduce/3-lists^foldl/2-0-"/3
(phoenix_pubsub) lib/phoenix/tracker/state.ex:293: Phoenix.Tracker.State.remove_down_replicas/2         (phoenix_pubsub) lib/phoenix/tracker.ex:528: Phoenix.Tracker.permdown/2
(phoenix_pubsub) lib/phoenix/tracker.ex:447: Phoenix.Tracker.handle_heartbeat/2
(phoenix_pubsub) lib/phoenix/tracker.ex:305: Phoenix.Tracker.handle_info/2
(stdlib) gen_server.erl:601: :gen_server.try_dispatch/4
(stdlib) gen_server.erl:667: :gen_server.handle_msg/5 Pid: #PID<0.1951.0> Start Call: Phoenix.Tracker.start_link(Radio.Presence, [name: Radio.Presence, otp_app: :radio, pubsub_server: Journal.PubSub], [name: Radio.Presence, otp_app: :radio, pubsub_server: Journal.PubSub]) 
Restart: :permanent Shutdown: 5000 Type: :worker 

the match error is due to

https://github.com/phoenixframework/phoenix_pubsub/blob/v1.0.1/lib/phoenix/tracker/state.ex#L294

i.e.

1 = :ets.select_delete(pids, [{{pid, topic, key}, [], [true]}])

fails on our app with matching with right hand values of 2 or 6 as in the stacktrace above.

We're using Phoenix Presence outside of the ordinary Channel flow, but to detect when
our satellite applications come up on remote erlang nodes after being scheduled.

Our setup consists of a applications of type A and applications of type B which run on different erlang nodes and both subscribe to the same topic runner:<runner_id>

Only nodes of type B track they're presence (with Phoenix.Presence.track(self(), "runner:<runner-id>", "runner", %{some-meta}) whenever the application starts (and only then). Type B applications can go down and up quite frequently for the same <runner_id> even within short period of time.

Crashes above happen only on nodes of type A.

Our guess is that there are collisions in the triple {pid, topic, key} for replicas on A-nodes but coming from B-nodes, because the pid would always have the same sequential index per same topic (key is a fixed string).

9f654a0 wouldn't help on our setup because guards against duplicate tracks on B-apps would always pass.

Would it make sense to relax the sanity check above concerning number of triples present in the ETS table at the moment of removing a down replica.

Cheers Andrea :-)

from phoenix_pubsub.

chrismccord avatar chrismccord commented on July 29, 2024

@zampino I believe your issues have been fixed on the cm-clouds branch. Can you try the branch and attempt to replicate your issues? Thanks!

from phoenix_pubsub.

zampino avatar zampino commented on July 29, 2024

@chrismccord thanks! I'll check that out

from phoenix_pubsub.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.