Giter Club home page Giter Club logo

pyfailsafe's Introduction

Pyfailsafe

Build Status PyPI

A Python library for handling failures, heavily inspired by the Java project Failsafe.

Pyfailsafe provides mechanisms for dealing with operations that inherently can fail, such as calls to external services. It takes advantage of the Python's coroutines, and supports both "classic" and async operations, starting from Python 3.5.

Installation

To get started using Pyfailsafe, install with

pip install pyfailsafe

then read the rest of this document to learn how to use it.

Usage

Bare Failsafe call

from failsafe import Failsafe

async def my_async_function():
    return 'done'

# this is the same as just calling:
# result = await my_async_function()
result = await Failsafe().run(my_async_function)
assert result == 'done'

Failsafe call with retries

Use RetryPolicy class to define the number of retries which should be made before operation fails.

from failsafe import Failsafe, RetryPolicy

async def my_async_function():
    raise Exception()  # by default, every exception will cause a retry

retry_policy = RetryPolicy(allowed_retries=3)

await Failsafe(retry_policy=retry_policy).run(my_async_function)
# raises failsafe.RetriesExhausted
# my_async_function was called 4 times (1 regular call + 3 retries)

By default, retries are executed immediately - there is no backoff. If you want to wait before executing a retry, you can use the backoff parameter:

from datetime import timedelta
from failsafe import Failsafe, RetryPolicy, Delay

async def my_async_function():
    raise Exception()  # by default, every exception will cause a retry

delay = Delay(timedelta(seconds=5))
retry_policy = RetryPolicy(allowed_retries=3, backoff=delay)

await Failsafe(retry_policy=retry_policy).run(my_async_function)
# raises failsafe.RetriesExhausted
# my_async_function was called 4 times, waiting for 5 seconds between each call.

We also ship a Backoff class that supports jitter and incremental delay

from datetime import timedelta
from failsafe import Failsafe, RetryPolicy, Backoff

async def my_async_function():
    raise Exception()  # by default, every exception will cause a retry

backoff = Backoff(
    delay=timedelta(seconds=2),  # the initial delay
    max_delay=timedelta(seconds=15),
    jitter=False  # if True, the wait time will be random between 0 and the actual time for this attempt
)
retry_policy = RetryPolicy(allowed_retries=3, backoff=backoff)

await Failsafe(retry_policy=retry_policy).run(my_async_function)
# raises failsafe.RetriesExhausted
# my_async_function was called 4 times, waiting for 2, 4 and 8 seconds respectively.

It is possible to provide your own backoff logic by subclassing the failsafe.retry_logic.Backoff class and overriding the .for_attempt(attempt) method.

It is possible to specify a particular set of exceptions that should cause a retry - any exception not contained in that set will cause immediate failure instead.

from failsafe import Failsafe, RetryPolicy

async def my_async_function():
    return 3/0

retry_policy = RetryPolicy(allowed_retries=3, retriable_exceptions=[ZeroDivisionError])

await Failsafe(retry_policy=retry_policy).run(my_async_function)
# raises failsafe.RetriesExhausted
# my_async_function was called 4 times (1 regular call + 3 retries)
from failsafe import Failsafe, RetryPolicy

async def my_async_function():
    raise TypeError()

retry_policy = RetryPolicy(allowed_retries=3, retriable_exceptions=[ZeroDivisionError])

await Failsafe(retry_policy=retry_policy).run(my_async_function)
# TypeError is not ZeroDivisionError, so my_async_function was called just once in this example

RetryPolicy instances are immutable and thread-safe. They can be safely shared between Failsafe instances.

Failsafe call with abortable exceptions

If you need your code to be able to raise certain exceptions that should not be handled by the failsafe, you can add them as abortable_exceptions in RetryPolicy. This is useful when you know that the nature of failure was such that consecutive calls would never succeed.

from failsafe import Failsafe, RetryPolicy

async def my_async_function():
    raise ValueError()  # ValueError is an abortable exception, so it will not cause retry

retry_policy = RetryPolicy(allowed_retries=4, abortable_exceptions=[ValueError])

await Failsafe(retry_policy=retry_policy).run(my_async_function)
# raises ValueError
# my_async_function was called 1 time (1 regular call)

Circuit breakers

Circuit breakers are a way of creating systems that fail-fast by temporarily disabling execution as a way of preventing system overload.

from failsafe import Failsafe, CircuitBreaker

async def my_async_function():
    raise Exception()

circuit_breaker = CircuitBreaker(maximum_failures=3, reset_timeout_seconds=60)
failsafe = Failsafe(circuit_breaker=circuit_breaker)

await failsafe.run(my_async_function)
await failsafe.run(my_async_function)
await failsafe.run(my_async_function)
# now circuit breaker will get open and other calls to Failsafe.run will
# immediately raise the failsafe.CircuitOpen exception and the passed
# function will not even be called.
# Circuit will be closed again in 60 seconds.

A circuit breaker instance can and should be shared across code that accesses inter-dependent system components that fail together. This ensures that if the circuit is opened, executions against one component that rely on another component will not be allowed until the circuit is closed again.

A circuit breaker instance is stateful - it remembers how many failures occur and whether the circuit is open or closed.

A circuit breaker will not take into account abortable exceptions.

CircuitBreaker interface

A CircuitBreaker can also be manually operated in a standalone way:

from failsafe import CircuitBreaker

circuit_breaker = CircuitBreaker()

circuit_breaker.open()  # executions won't be allowed when circuit breaker is open
circuit_breaker.close()
circuit_breaker.current_state  # 'open' or 'closed'

if circuit_breaker.allows_execution():
    try:
        do_something()
        circuit_breaker.report_success()
    except:
        circuit_breaker.report_failure()

Circuit breaker with retries

It is recommended to use circuit breakers together with retry policies. Every failed retry will count as a failure to the circuit breaker.

from failsafe import Failsafe, CircuitBreaker, RetryPolicy

async def my_async_function():
    raise Exception()

circuit_breaker = CircuitBreaker()
retry_policy = RetryPolicy()
failsafe = Failsafe(circuit_breaker=circuit_breaker, retry_policy=retry_policy)
await failsafe.run(my_async_function)

RetryPolicy and CircuitBreaker events

RetryPolicy and CircuitBreaker accept event handlers at construction time, such as on_retry, on_retries_exhausted, on_abort, on_failed_attempt in the case of RetryPolicy, and on_open, on_half_open, on_close for CircuitBreaker. These event handlers can be useful for things like logging or metric recording.

import logging
from failsafe import Failsafe, CircuitBreaker, RetryPolicy

logger = logging.getLogger("MyLogger")
circuit_breaker = CircuitBreaker(on_open=lambda: logger.error("Circuit open!"))
retry_policy = RetryPolicy(on_retry=lambda: logger.warning("Retrying..."))
failsafe = Failsafe(retry_policy=retry_policy, circuit_breaker=circuit_breaker)

Using Pyfailsafe to make HTTP calls

Failsafe is not dependent on any HTTP client library, so a function making a call has to be provided by the developer. Said function must return a coroutine.

The example below uses aiohttp client to make a call.

from failsafe import Failsafe, RetryPolicy, CircuitBreaker, FailsafeError
import aiohttp

circuit_breaker = CircuitBreaker()
retry_policy = RetryPolicy()
failsafe = Failsafe(circuit_breaker=circuit_breaker, retry_policy=retry_policy)

async def make_get_request(url):
    async def _make_get_request(_url):
        with aiohttp.ClientSession() as session:
            async with session.get(_url) as resp:
                if resp.status != 200:
                    raise Exception()  # exception tells Failsafe to retry
                return await resp.json()

    try:
        return await failsafe.run(_make_get_request, url)
    except FailsafeError:
        raise RuntimeError("Error while getting data")


if __name__ == "__main__":
    async def print_response():
        from pprint import pprint
        result = await make_get_request('https://api.github.com/users/skyscanner/repos')
        pprint(result)


    import asyncio

    loop = asyncio.get_event_loop()
    loop.run_until_complete(print_response())
    loop.close()

Making HTTP calls with fallbacks

Use FallbackFailsafe class to simplify handling fallbacks:

import aiohttp
from urllib.parse import urljoin

from failsafe import FallbackFailsafe

class SortingClient:
    def __init__(self):
        endpoint_main = "http://eu-west-1.sorting-service.local"
        endpoint_secondary = "http://eu-central-1.sorting-service.local"

        self.fallback_failsafe = FallbackFailsafe([endpoint_main, endpoint_secondary])

    async def get_sorting(self, name, age):
        query_path = "/v1/sort/name/{0}/age/{1}".format(name, age)
        return await self.fallback_failsafe.run(self._request, query_path)

    async def _request(self, endpoint, query_path):
        url = urljoin(endpoint, query_path)

        async with aiohttp.ClientSession() as session:
            async with session.get(url) as resp:
                if resp.status != 200:
                    raise Exception()
                return await resp.json()

Examples

It is recommended to wrap calls in the class which will abstract away the outside service.

Check examples folder for comprehensive examples of how Pyfailsafe should be used. See examples/README.md to run examples.

Developing

When making changes to the module it is always a good idea to run everything within a python virtual environment to ensure isolation of dependencies.

# Python 3.5 or greater needed
python3 -m venv venv
source venv/bin/activate
pip install -r requirements_test.txt

Unit tests are written using pytest and can be run from the root of the project with

py.test tests/ -v

Coding standards are maintained using the flake8 tool which will run as part of the build process. To run locally simply use:

flake8 failsafe/ tests/ examples/

Publishing

  1. Set new version number in failsafe/init.py and commit it
  2. git tag X.Y.Z
  3. git push --tags

Contributing

See CONTRIBUTING.md file to add a contribution.

Maintainers:

pyfailsafe's People

Contributors

albertferras avatar alevinval avatar cajturner avatar carl0ff avatar dependabot[bot] avatar jakubka avatar joanvila avatar jsoucheiron avatar rmariano avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

pyfailsafe's Issues

Deployment fails for 'nightly' Python builds

Our Travis build is configured to run in Python 3.5 and 'nightly' Python: https://github.com/Skyscanner/pyfailsafe/blob/master/.travis.yml#L4

The problem is that it tries to deploy to pypi from both of these environments. The first one succeeds, but second one always fails: https://travis-ci.org/Skyscanner/pyfailsafe/jobs/408569147

(most likely because the package is already published).

The desired behaviour is that tests should be run in all environments (3.5, 3.6, 3.7, nightly) and after all these tests succeed, the deployment should be run once.

Improve interface of run to allow arguments

It seems the run() method doesn't take parameters and in order to make it work, we need to pass a wrapped function with the parameters already bound (similar to using functools.partial), as done in

return await self.failsafe.run(lambda: self._request(url))

If this can be done internally by the method, the interface would be nicer.
I'm happy to contribute if you think this is a valid use case.

Can be used without asyncio?

The readme says I can use it with "classic" functions. Does it mean classic coroutines or sync functions? And I saw a recent commit that seemed that you were removing support for sync calls. Do you plan to bring the support to sync functions back?

Ability to specify delay or backoff when retrying

Something similar to what original Failsafe supports:

It can add a fixed delay between retries:

retryPolicy.withDelay(1, TimeUnit.SECONDS);

Or a delay that backs off exponentially:

retryPolicy.withBackoff(1, 30, TimeUnit.SECONDS);

FallbackFailsafe needs better explanation in readme

We need to show an example of setting retry_policy_factory and the circuit_breaker_factory, also stress out as much as possible that there needs to be just ONE instance of FallbackFailsafe per dependency.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.