Comments (6)

medmunds commented on May 23, 2024

I thought a little more about keeping track of the preferred connection, and it seems like it would be perfectly reasonable to use Django's cache framework for this. That avoids the performance hit of checking status on every send, but without the complexity of adding models.

I updated the (still untested) gist with code to cache the preferred ESP, and added a webhook view that invalidates the cache: https://gist.github.com/medmunds/e6b837bb3b382098d775cb412b889632
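
A minimal sketch of the caching idea, not the gist's actual code. The key name, the timeout, and the `choose_esp` callable are hypothetical. `DictCache` is only a stand-in so the example runs without a Django project; it mimics the `get`/`set`/`delete` calls of Django's cache, so `django.core.cache.cache` could be passed in its place.

```python
import time

PREFERRED_ESP_KEY = "preferred_esp"  # hypothetical cache key
CACHE_SECONDS = 300  # hypothetical: re-check ESP status at most every 5 minutes


class DictCache:
    """In-memory stand-in exposing the get/set/delete calls used below."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        value, expires = self._data.get(key, (default, None))
        if expires is not None and time.monotonic() >= expires:
            del self._data[key]
            return default
        return value

    def set(self, key, value, timeout=None):
        expires = None if timeout is None else time.monotonic() + timeout
        self._data[key] = (value, expires)

    def delete(self, key):
        self._data.pop(key, None)


def get_preferred_esp(cache, choose_esp):
    """Return the cached ESP name, running the slow status check only on a miss."""
    esp = cache.get(PREFERRED_ESP_KEY)
    if esp is None:
        esp = choose_esp()  # the expensive part: querying status APIs
        cache.set(PREFERRED_ESP_KEY, esp, timeout=CACHE_SECONDS)
    return esp


def invalidate_preferred_esp(cache):
    """What a status webhook view could call so the next send re-checks."""
    cache.delete(PREFERRED_ESP_KEY)
```

With this shape, the status check runs once per cache lifetime rather than once per message, and a webhook only needs to delete one key.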

from django-anymail.

medmunds commented on May 23, 2024

Really interesting idea -- thanks for the thought you've put into this. (Believe me, I understand the pain you're feeling with SendGrid recently.)

And it's clever to use the status APIs to determine if an ESP is usable, because the ESP's sending APIs often stay up, accepting and queuing messages, even when the ESP is significantly delaying delivery. So you have to find some other way to determine if the ESP is "down" for sending purposes. (BTW, is that statuspage.io v2/summary.json endpoint documented anywhere?)

There would be a handful of problems to work through:

1. Keeping track of the current preferred connection

Anymail deliberately doesn't maintain any persistent state, which avoids a whole lot of additional configuration overhead for users. I'm not sure how we could implement "save chosen backup as current preferred connection" within Anymail.

2. Data can diverge when mixing ESPs

This is probably more of a user-education issue: sending to the same set of recipients through multiple ESPs leads to potentially-confusing data divided among those ESPs.

If you're using your ESP's unsubscribe management, for example, you'd want to avoid backup ESPs sending to recipients on the primary ESP's unsubscribe list. And you'd have to figure out how to handle unsubscribes coming from messages sent through a backup ESP. That sort of syncing is beyond the scope of Anymail. (Similar issues with ESPs' bounced/blocked recipient lists.)

You'd also have to be careful how you interpret open and click rate data collected by any one of the ESPs. (Or really, anything where ESPs maintain data on your behalf.)

3. Deciding if an ESP is "up"

As I said earlier, I think using the status API is clever. But it also looks like it could be tricky to decide whether the ESP is "up" from the status summary.

The simplest approach would be, if there are any open incidents, consider it down. But a lot of incidents aren't about sending/delivery -- e.g., Mailgun recently had a control panel outage, and SparkPost had delays in metrics and reporting. I don't think I'd have wanted to switch to a backup ESP in either of these cases.

We could try to be smarter by looking for outages only in particular components. Each ESP labels its status components differently, so that would take some research. (E.g., SendGrid is currently showing "Mail Sending" as "Degraded Performance," but all three of its subcomponents as "Operational" -- including the ones that track "the flow of mail generated by mail.send API requests.") It could also be fragile to future status page changes.
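
To make the difference between the two approaches concrete, here is a rough sketch that works on an already-parsed status summary. It assumes the summary is a dict with `incidents` and `components` lists, and that each component has `name` and `status` keys. The example payload and the "Control Panel" component name are made up for illustration. Treating a missing component as "not working" is a design choice of this sketch, not something any ESP specifies.

```python
OPERATIONAL = "operational"

# Made-up payload: an incident is open, but it only affects the control panel.
EXAMPLE_SUMMARY = {
    "incidents": [{"name": "Control panel unavailable"}],
    "components": [
        {"name": "Mail Sending", "status": "operational"},
        {"name": "Control Panel", "status": "major_outage"},
    ],
}


def has_open_incidents(summary):
    """Simplest test: any open incident at all counts as 'down'."""
    return len(summary.get("incidents", [])) > 0


def is_component_working(summary, component_names):
    """Narrower test: only the named components have to be operational.

    A name that is missing from the summary (for example, after the ESP
    renames a component) is treated as not working.
    """
    statuses = {c["name"]: c["status"] for c in summary.get("components", [])}
    return all(statuses.get(name) == OPERATIONAL for name in component_names)
```

On the example payload, `has_open_incidents` reports a problem while `is_component_working(EXAMPLE_SUMMARY, ["Mail Sending"])` does not, which is the control-panel-outage case described above.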

Another problem is that (some) ESPs may not promptly list (some) service issues in their status pages. (Though I suppose you're no worse off in that case than you are now.)

All that said, though, I threw together a quick and untested implementation of a backend that checks status APIs and sends through the first "up" ESP. To avoid the whole persistent-state problem, it checks the status API(s) on every send. (Optimization is "left as an exercise for the reader." As is figuring out my bugs in it. 😄) Feel free to try it out, fork and edit, and if you end up with something useful we can either add it to Anymail or at least to the docs as an advanced example.
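
The "first 'up' ESP" selection could look roughly like the sketch below. This is not the linked implementation. The function name, the ESP ordering, and the choice to stay with the primary ESP when nothing looks "up" are all assumptions. The `is_backend_working` callable is passed in so the example runs without any network access.

```python
# Anymail backend paths for two ESPs; the order of preference is up to the user.
ESP_BACKENDS = {
    "sendgrid": "anymail.backends.sendgrid.EmailBackend",
    "mailgun": "anymail.backends.mailgun.EmailBackend",
}


def first_working_esp(esp_names, is_backend_working):
    """Return the first ESP, in preference order, whose status check passes.

    Checks stop at the first ESP that looks "up", so backups are only
    queried when the ESPs ahead of them look "down".
    """
    for name in esp_names:
        if is_backend_working(name):
            return name
    # Assumption: if nothing looks "up", keep using the primary ESP
    # rather than refusing to send.
    return esp_names[0]
```

The chosen name can then be mapped through `ESP_BACKENDS` and the resulting path handed to Django's `get_connection(backend=...)` to build the connection for that send.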

tgehrs commented on May 23, 2024

Thanks for the thoughtful response; there are definitely some things to work through here. I will fork it and try to work through your points a bit.

Statuspage.io's undocumented documentation
v2 -- appending /api to a statuspage will show their documentation
v1 -- this seems to be more for the owners, but includes schema such as component status:

component[status] - The status, one of operational|degraded_performance|partial_outage|major_outage.

1. Keeping track of preferred connection
Would probably need to implement a simple model for this. I noticed in another issue that keeping any sort of state is outside the scope of Anymail; with that in mind, maybe this belongs in a separate project.

2. Data divergence between ESPs
This is likely the deal breaker, especially if you are using ASM (SendGrid's Advanced Suppression Manager).

"If you're using your ESP's unsubscribe management, for example, you'd want to avoid backup ESPs sending to recipients on the primary ESP's unsubscribe list."

With that in mind, two potential workarounds for the simplest use case:

  • Throughout this current SendGrid fiasco, and from what I can tell in most of the cases this would deal with, their API is up, so simply syncing unsubscribes that way would not be the end of the world. Once again, though, that seems a bit outside the scope of Anymail.
  • Once the status is resolved, sync unsubscribes from the fallback ESP to

3. Deciding if an ESP is up
Good points on the fragility of my suggested implementation. I reached out to StatusPage about if/how companies can make these "breaking" changes to their status page. Unfortunately, they do not send notifications for changes in component structure, only for changes in status. I will plan to map the components out as-is (it shouldn't take long), but a later change in structure could cause us to not fall back when we should.

tgehrs commented on May 23, 2024

Good call on caching; it seems like the perfect use case.

I forked the gist and added a way to test a list of components to isolate what the issue actually is, so we are not falling back if, for example, SendGrid's marketing service is down. (Un)luckily, SendGrid is having some more downtime today, so I was able to confirm the is_component_working function works as expected.

Do you think checking components should replace is_backend_working, or should this be a setting? I will try to work on some testing soon.

medmunds commented on May 23, 2024

Nice. Yeah, I'd probably just move the component checking into is_backend_working. is_backend_working should represent our best guess at determining if the ESP is "up." If you've figured out the right components to check, that's a better test than the overall status check I had in there.
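
One way that merge might look, as a sketch only. `STATUS_COMPONENTS` is a hypothetical mapping. "Mail Sending" is the SendGrid component named earlier in this thread, and the right components for other ESPs would still need to be researched. The fallback assumes the summary also carries an overall `status.indicator` field that reads `"none"` when everything is fine.

```python
# Hypothetical mapping of ESP name to the status components that matter for sending.
STATUS_COMPONENTS = {
    "sendgrid": ["Mail Sending"],
}


def is_backend_working(esp_name, summary):
    """Best guess at whether an ESP is "up" for sending purposes.

    Uses component-level status when components are known for the ESP,
    and the overall status indicator otherwise.
    """
    wanted = STATUS_COMPONENTS.get(esp_name)
    if wanted:
        statuses = {c["name"]: c["status"] for c in summary.get("components", [])}
        return all(statuses.get(name) == "operational" for name in wanted)
    # No component list for this ESP: fall back to the overall indicator.
    return summary.get("status", {}).get("indicator") == "none"
```

Keeping the component lists in one mapping means a status page restructuring only requires updating that mapping, not the checking logic.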

medmunds commented on May 23, 2024

I don't think this code really belongs in the core Anymail, but I'm going to add a link to your gist to the docs on using multiple backends.
