chaostoolkit-oci's Introduction

Chaos Toolkit Extension for OCI

This project is a collection of actions and probes, gathered as an extension to the Chaos Toolkit.

Setup environment

Install Python and pip

Install and create a virtual environment

Install chaostoolkit, chaostoolkit-oci

Configure your signing keys and obtain your OCIDs.

Create your OCI Configuration file.
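As a rough illustration of the last step, the sketch below checks that a configuration file in the standard OCI INI layout contains the keys the SDK normally expects (`user`, `fingerprint`, `key_file`, `tenancy`, `region`). The sample values and the `missing_keys` helper are placeholders invented for this example; they are not part of chaostoolkit-oci.

```python
import configparser

# Keys the OCI SDK normally expects in a profile of ~/.oci/config.
REQUIRED_KEYS = ("user", "fingerprint", "key_file", "tenancy", "region")

# Placeholder values for illustration only; substitute your own OCIDs and key path.
SAMPLE_CONFIG = """\
[DEFAULT]
user = ocid1.user.oc1..exampleuniqueid
fingerprint = 20:3b:97:13:55:1c:5b:0d:d3:37:d8:50:4e:c5:3a:34
key_file = ~/.oci/oci_api_key.pem
tenancy = ocid1.tenancy.oc1..exampleuniqueid
region = us-ashburn-1
"""


def missing_keys(config_text, profile="DEFAULT"):
    """Return the required keys that are absent or empty in the given profile."""
    # Only "=" is treated as a delimiter so colons in the fingerprint are safe.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(config_text)
    section = parser[profile]
    return [key for key in REQUIRED_KEYS if not section.get(key)]


if __name__ == "__main__":
    print(missing_keys(SAMPLE_CONFIG))
```

An empty list means every required key is present; the check says nothing about whether the key file exists or the OCIDs are valid.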

Using chaostoolkit-oci

Creating and running chaos experiments.
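For orientation, here is a minimal sketch of what an experiment file could look like, built as a Python dictionary and written out as JSON. The top-level keys (`version`, `title`, `description`, `method`, `rollbacks`) follow the Chaos Toolkit experiment format. The module path `chaosoci.core.compute.actions`, the function name `stop_instance`, and its `instance_id` argument are assumptions made for this example; check the extension's source for the actual names.

```python
import json


def build_experiment(instance_id):
    """Build a minimal Chaos Toolkit experiment as a plain dictionary."""
    return {
        "version": "1.0.0",
        "title": "Service tolerates the loss of one compute instance",
        "description": "Stop a single instance and observe the system.",
        "method": [
            {
                "type": "action",
                "name": "stop-compute-instance",
                "provider": {
                    "type": "python",
                    # Assumed module and function names, for illustration only.
                    "module": "chaosoci.core.compute.actions",
                    "func": "stop_instance",
                    "arguments": {"instance_id": instance_id},
                },
            }
        ],
        "rollbacks": [],
    }


if __name__ == "__main__":
    experiment = build_experiment("ocid1.instance.oc1..exampleuniqueid")
    with open("experiment.json", "w") as handle:
        json.dump(experiment, handle, indent=2)
```

The resulting file can then be passed to the Chaos Toolkit CLI with `chaos run experiment.json`.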

Contributing

Contributing to chaostoolkit-oci.

Changelog

View the CHANGELOG.

Acknowledgement

Chaos Toolkit installation: chaostoolkit.org

Python, pip installation and virtualenv documentation: https://docs.python-guide.org/#

License

chaostoolkit-oci's People

Contributors

akshat1145, hyder, lawouach, markxnelson, nazarenof

chaostoolkit-oci's Issues

Simulate an accidental nat gateway deletion

Simulate the accidental deletion of a nat gateway by a human operator. This can also be used to simulate isolation scenarios.

Need to be able to:

  • get the NAT gateways in a given VCN (by VCN id)
  • filter the NAT gateways using a given set of metadata
  • delete a NAT gateway using a given NAT gateway id
  • roll back the deletion of the NAT gateway
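The first two requirements could be sketched roughly as follows. Gateways are represented here as plain dictionaries so the example runs without an OCI account; the field names mirror attributes of the SDK's NAT gateway model, but `filter_nat_gateways` itself is a hypothetical helper, not an existing function in this project.

```python
def filter_nat_gateways(gateways, vcn_id, filters=None):
    """Keep gateways in the given VCN whose fields match every filter entry."""
    filters = filters or {}
    return [
        gw for gw in gateways
        if gw.get("vcn_id") == vcn_id
        and all(gw.get(key) == value for key, value in filters.items())
    ]


# Made-up sample data standing in for the result of a list call.
gateways = [
    {"id": "nat-1", "vcn_id": "vcn-a", "display_name": "prod-nat",
     "lifecycle_state": "AVAILABLE"},
    {"id": "nat-2", "vcn_id": "vcn-a", "display_name": "test-nat",
     "lifecycle_state": "AVAILABLE"},
    {"id": "nat-3", "vcn_id": "vcn-b", "display_name": "prod-nat",
     "lifecycle_state": "AVAILABLE"},
]

matches = filter_nat_gateways(gateways, "vcn-a", {"display_name": "prod-nat"})
print([gw["id"] for gw in matches])  # ['nat-1']
```

Passing no filters returns every gateway in the VCN, which keeps the "get" and "filter" steps in one code path.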

Improve documentation for setting up dev environment, coding standards

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

The chaostoolkit-oci project sets some coding standards (e.g. pep8) and also requires certain packages. These are currently documented in scattered places: in chaostoolkit-oci, in the chaostoolkit itself, and sometimes only in the Python documentation.

It would be good to document them in this project, to simplify developer setup, encourage common standards, and help new contributors get started quickly.

Potential Configuration

# Copy-paste your configuration here. 

References

Machine network isolation

A useful experiment is to produce a machine network isolation on a compute resource.

The reason for the experiment would be not only to see how a cluster behaves if it loses a node (that can be done by stopping the resource), but also to see how a resource behaves if it loses its cluster.

Please clarify copyright

Hi,
I am interested in using and contributing to this project. Would you be able to clarify who owns the copyright? I don't see any notices in the repo.
Thanks a lot for your help,
Mark Nelson

stop_instance_pool

The capability to stop an instance_pool, analogous to an AWS ASG.

This will make it possible to experiment with shutting down (stopping) several compute resources from a given group at once.

Delete a routing table

Simulate an accidental routing table deletion.

Simulate the accidental deletion of a routing table by a human operator. This can also be used to simulate isolation scenarios.

Need to be able to:

  1. get the route tables in a given VCN (by VCN id)
  2. filter the route tables using a given set of metadata
  3. delete a route table using a given route table id
  4. delete a route table using a given set of metadata
  5. roll back the deletion of the route table
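One way to think about steps 3 and 5 is "snapshot on delete, recreate on rollback". The sketch below uses an in-memory class as a stand-in for the networking API so that it runs anywhere; `InMemoryNetwork` and both helper functions are hypothetical names. It also assumes that a recreated route table receives a new id rather than the original one, so anything that referenced the old id would need to be re-pointed.

```python
import copy


class InMemoryNetwork:
    """Hypothetical stand-in for the OCI networking API (illustration only)."""

    def __init__(self, route_tables):
        self._tables = {rt["id"]: copy.deepcopy(rt) for rt in route_tables}
        self._created = 0

    def ids(self):
        return sorted(self._tables)

    def delete_route_table(self, rt_id):
        # Return what was deleted so the caller can keep it as a snapshot.
        return self._tables.pop(rt_id)

    def create_route_table(self, details):
        self._created += 1
        table = dict(copy.deepcopy(details), id=f"rt-restored-{self._created}")
        self._tables[table["id"]] = table
        return table


def delete_route_table(network, rt_id):
    """Delete the route table and return a snapshot usable for rollback."""
    return network.delete_route_table(rt_id)


def rollback_route_table(network, snapshot):
    """Recreate a route table from its snapshot; the id is newly assigned."""
    details = {key: value for key, value in snapshot.items() if key != "id"}
    return network.create_route_table(details)


network = InMemoryNetwork([
    {"id": "rt-1", "vcn_id": "vcn-a", "display_name": "private-rt",
     "route_rules": [{"destination": "0.0.0.0/0", "network_entity_id": "nat-1"}]},
])
snapshot = delete_route_table(network, "rt-1")
restored = rollback_route_table(network, snapshot)
print(network.ids(), restored["route_rules"] == snapshot["route_rules"])
```

Keeping the snapshot in the action's return value is a design choice: the rollback can then be driven entirely from data the experiment already holds.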

kill_failure_domain

Oracle Cloud has three concepts: Region, AD (Availability Domain) and FD (failure domain). Resources in one FD are independent from resources in a different FD: if one FD fails, the rest remain fine, hence the name.

This action needs to be able to stop (or force stop) the entire set of resources in a given FD.

To make this more useful, I would add a probe that counts the number of FDs in a given AD. For example, if we have resources in two failure domains, we can shut down one of them as part of the experiment; if there is only one FD, we shut down nothing, because that implies there is no failover scenario or HA.
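The probe-then-act logic described above might look roughly like this. Instances are plain dictionaries here; the keys `availability_domain` and `fault_domain` follow the attribute names on the SDK's instance model, while the function names and sample data are invented for the example.

```python
from collections import defaultdict


def group_by_fault_domain(instances, availability_domain):
    """Map each FD in the given AD to the ids of the instances it contains."""
    groups = defaultdict(list)
    for instance in instances:
        if instance["availability_domain"] == availability_domain:
            groups[instance["fault_domain"]].append(instance["id"])
    return dict(groups)


def count_fault_domains(instances, availability_domain):
    """Probe: number of FDs in the AD that actually hold instances."""
    return len(group_by_fault_domain(instances, availability_domain))


def plan_fault_domain_shutdown(instances, availability_domain, target_fd):
    """Return the instance ids to stop, or nothing if there is no second FD."""
    groups = group_by_fault_domain(instances, availability_domain)
    if len(groups) < 2:
        # A single FD means no failover target, so leave everything running.
        return []
    return groups.get(target_fd, [])


# Made-up sample data for illustration.
instances = [
    {"id": "inst-1", "availability_domain": "AD-1", "fault_domain": "FAULT-DOMAIN-1"},
    {"id": "inst-2", "availability_domain": "AD-1", "fault_domain": "FAULT-DOMAIN-1"},
    {"id": "inst-3", "availability_domain": "AD-1", "fault_domain": "FAULT-DOMAIN-2"},
    {"id": "inst-4", "availability_domain": "AD-2", "fault_domain": "FAULT-DOMAIN-1"},
]

print(count_fault_domains(instances, "AD-1"))
print(plan_fault_domain_shutdown(instances, "AD-1", "FAULT-DOMAIN-1"))
print(plan_fault_domain_shutdown(instances, "AD-2", "FAULT-DOMAIN-1"))
```

Separating the plan from the actual stop call makes the guard easy to test and lets the experiment log what it would stop before doing so.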

Block traffic in nat gateway

In cases where instances do not have a public ip address, they use a nat gateway to access the internet; by blocking the traffic of the nat gateway we can simulate an entire subnet failure and get insights on what would happen if we suddenly lost an entire group of nodes.

We can test:

  • The monitoring system, to see whether it raises alerts.
  • How external monitoring would alert us if the monitoring system inside the network is not able to reach the internet (if part of the experiment).

This can be an interesting experiment, and it is simple to do, since the NAT gateway in OCI can block traffic directly (and roll back if needed) without any major change in configuration.

Refactor and organize chaosoci along the same structure as the oci-python-sdk

Currently, under chaosoci, we have the following:

compute
__init__.py
types.py

We need to refactor and reorganize so it follows the structure of the oci-python-sdk e.g.:

chaosoci
|__core
   |__block
   |__compute
   |__computemanagement
   |__networking
....

Likewise, the tests need to follow the same structure.

This will make it easier to add the different probes and actions in the future.
