This project forked from llimllib/limbo

A simple, clean, easy to modify Slack chatbot

License: MIT License

Makefile 1.08% Shell 3.95% Python 94.97%

limbo's Introduction

Limbo

A Slack chatbot

Installation

1. Clone the repo
2. Create a bot user if you don't have one yet, and copy the API Token
3. export SLACK_TOKEN="your-api-token"
4. make run (or make repl for local testing)
5. Invite Limbo into any channels you want it in, or just message it in #general. Try typing !gif dubstep cat to test it out

I recommend that you always run limbo in a virtualenv so that you are running in a clean environment.

Command Arguments

--test, -t: Run in command-line mode and start a limbo repl.
--hook: Specify the hook to test. (Defaults to "message").
-c: Run a single command.
--database, -d: Where to store the limbo sqlite3 database. Defaults to limbo.sqlite3.
--pluginpath, -pp: The path where limbo should look to find its plugins (defaults to /plugins).

Environment Variables

SLACK_TOKEN: Slack API token. Required.
LIMBO_LOGLEVEL: The logging level. Defaults to INFO.
LIMBO_LOGFILE: File to log info to. Defaults to none.
LIMBO_LOGFORMAT: Format for log messages. Defaults to %(asctime)s:%(levelname)s:%(name)s:%(message)s.
LIMBO_CLOUDWATCH: Turn on CloudWatch metrics. Must have format NameSpace&DimName-1=Value-1&...&DimName-n=Value-n, for zero or more dimension-value pairs. A metric named EventCount will be published into the given namespace with the given dimensions (and unit Count). This metric gives the number of events Limbo received from Slack.
LIMBO_PLUGINS: Comma-delimited string of plugins to load. Defaults to loading all plugins in the plugins directory (which defaults to "limbo/plugins")
LIMBO_NEEDMENTION: If defined, the Limbo chatbot will only respond to commands addressed to it (via an @-mention). If undefined, it will respond to all commands sent to channels it has been invited to.

Note that if you are getting an error message about not seeing environment variables, you may be running limbo as sudo, which will clear the environment. Use a virtualenv and always run limbo as a user process!

Commands

It's super easy to add your own commands! Just create a python file in the plugins directory with an on_message function that returns a string.

You can use the !help command to print out all available commands and a brief help message about them. !help <plugin> will return just the help for a particular plugin.

By default, plugins won't react to messages from other bots (just messages from humans). Define an on_bot_message function to handle bot messages too. See the example plugins for an easy way to define these functions.
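
As a rough illustration of the plugin shape described above, here is a minimal sketch of a hypothetical plugin file (say, plugins/echo.py). The source only states that on_message returns a string; the (msg, server) parameters, the dict-like msg with a "text" key, and returning None to stay silent are assumptions made for this example.

```python
"""Hypothetical example plugin: "!echo <text>" repeats the text back."""
import re


def on_message(msg, server):
    # Assumption: msg is a dict-like Slack event whose "text" key holds the
    # message body, and server is the bot's connection object (unused here).
    text = msg.get("text", "")
    match = re.match(r"!echo\s+(.+)", text)
    if not match:
        # Assumption: returning None means this plugin has nothing to say.
        return None
    return match.group(1)
```

An on_bot_message function could presumably delegate to the same logic if the plugin should also react to other bots.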

These are the current default plugins:

Docker

How do I try out Limbo via docker?
- @PeterGrace maintains a public build of limbo, available from the docker registry. Executing make docker_run will start the default bot.
- make docker_stop will stop the bot
When I start the docker container, I see an error about being unable to source limbo.env. Is this a problem?
- No. The limbo.env file only exists when using Kubernetes with the included opaque secret recipe for storing your environment variables.
I'd like to develop plugins for Limbo, but would still like to use Docker to run the bot. Is there a quick way to add plugins to the bot?
- Yes! Use the included Dockerfile.dev as a template, and simply build via make docker_build. You'll then need to start the bot with your new_image_name, for example docker run -d -e SLACK_TOKEN=<your_token> new_image_name

Contributors

limbo's People

rstata davidchaiken tooler1 rstata-verticloud amva13 micarr19 xmwangi aravic amirfarhat sebastianrpalacios mohit-chawla mkorpusik

limbo's Issues

Switch to Fargate

The current ECS configuration uses EC2 instances rather than the new Fargate service (which was launched just after we started preparing for the class).

Fargate requires less configuration and monitoring than EC2 instances ("eliminate toil"). It would be better to use Fargate instead of EC2 instances.

Get Docker working

The Dockerfile from upstream and commands like "make docker_build" aren't working. Get them to work in a reasonable manner.

Require the bot's @name be present in message before responding

We expect to have a lot of bots in the same team during the class. It would be chaos if every bot responded to !CMD. To avoid this, modify the behavior of the bot to require @botname to be present in a message before responding to a !CMD.
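
One way the check could look is sketched below. It relies on Slack encoding @-mentions in message text as <@USERID> (optionally <@USERID|name>); the function names, the bot_user_id parameter, and the need_mention flag are hypothetical and not taken from Limbo's code.

```python
import re


def addressed_to_bot(text, bot_user_id):
    """Return True if text contains an @-mention of bot_user_id."""
    # Slack writes mentions as <@U12345> or <@U12345|name> in the raw text.
    pattern = r"<@%s(\|[^>]*)?>" % re.escape(bot_user_id)
    return re.search(pattern, text) is not None


def should_respond(text, bot_user_id, need_mention):
    """Decide whether a message carrying a !CMD should be handled."""
    # A command is any word that starts with "!".
    has_command = re.search(r"(^|\s)!\w+", text) is not None
    if not has_command:
        return False
    if need_mention:
        return addressed_to_bot(text, bot_user_id)
    return True
```

With need_mention set, "!gif cat" is ignored while "<@U123> !gif cat" is handled by the bot whose user ID is U123.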

Improve reproducibility of builds

The current Dockerfile.test file has a lot of time-dependent commands in it, which make the current build non-reproducible. If multiple team members were working together on the same 'bot, they would no doubt be tripped up by this problem.

This problem can be fixed by publishing to ECR (or another Docker registry) these time-dependent layers and having multiple team-members reference them by a name and version, which is a reproducible way of sharing these lower layers.

Remove AWS keys from environment of containers

Note to students: we have already implemented this feature (see pull #36), which will be the lab for Thursday. So this is probably not the best issue to pick up for "hack day." However, see issue #43 as an extension to this issue.

AWS keys are currently passed into the container environment so Limbo can talk to CloudWatch. These environments can be inspected with the ECS Web UI. A better approach would be to create an IAM Role for these containers with the right set of CloudWatch permissions.

Add "update-only" command to "deploy.sh" and change .travis.yml to use it

The Travis deploy automation will launch the limbo-travisci bot when we push changes onto the master branch. We probably don't want that. Do we? If not, it would be easy to add logic to deploy.sh to prevent it from happening.

Deploy automation using Docker compose files and "ecs-cli compose"

We want to use Docker compose files to describe the (very simple) orchestration required to deploy our 'bot. Using Docker compose should allow us to switch to Kubernetes more easily in the future, should we decide to.

We also want to support deployments from laptops without requiring that a lot of software be installed on them. "docker-compose run" is a good way to achieve this.

Finally, we want to design deployment orchestration in a manner that supports multiple 'bots being written by the multiple students in the class.

Migrate to Elastic Kubernetes Service (EKS)

Kubernetes is a fast-growing, widely adopted application management fabric that is functionally equivalent to ECS. EKS is a managed Kubernetes service provided by AWS. It would've been better to have done the class using EKS, but it wasn't available when we started preparing the labs. (We do use docker-compose for deployment, which works well with Kubernetes, so the migration to EKS should be simple.)

Autoscaling for Limbo cluster

Right now, if too many 'bots are started, the system throws an alarm which has to be resolved manually. To eliminate toil, it would be good to hook up the ECS Autoscaling feature to the ECS memory-reservation utilization metric to trigger expansions and collapses based on the memory reservation (which is a proxy for the number of bots that are running).

Avoid endless restart loop

Right now, if the Limbo process crashes, ECS is configured to restart it. If the crash is deterministic -- for example, if there are bad credentials or bad code was pushed -- ECS will restart it indefinitely, which causes a ton of log streams to be created in CloudWatch logs (among other problems). I've poked around the ECS documentation and don't see any configuration setting that tells it to stop restarting after too many tries. We could probably write a Lambda function that reads the event stream coming off of ECS and looks for this (although it would need to keep state to do so).
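
The state such a function would need can be quite small. The sketch below is only an illustration of the detection logic: the class name, the thresholds, and the idea of feeding it one timestamp per "task stopped" event are assumptions, and nothing here talks to ECS or Lambda.

```python
from collections import deque


class RestartLoopDetector:
    """Flags a service that stops too often within a sliding time window."""

    def __init__(self, max_restarts=5, window_seconds=600):
        self.max_restarts = max_restarts
        self.window_seconds = window_seconds
        self._stops = deque()  # timestamps of recent stop events, oldest first

    def record_stop(self, timestamp):
        """Record one stop event; return True if the limit is exceeded."""
        self._stops.append(timestamp)
        # Forget stop events that have fallen out of the window.
        while timestamp - self._stops[0] > self.window_seconds:
            self._stops.popleft()
        return len(self._stops) > self.max_restarts
```

In a real Lambda function the deque would not survive between invocations, so the timestamps would have to live in some external store; that is the state mentioned above.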

Switch to Ubuntu

We got Alpine Linux to work, but the experience convinced us to switch to Ubuntu:

- The fact that gliderlabs/docker-alpine#184 has remained open for over a year has us wondering about the overall security of this Linux distribution.
- Although I did get the build and tests to work, it's been flaky. Python/Pip in particular seem to be behaving inconsistently at times.
- Giving students some experience with Ubuntu SA is probably a more reusable experience than Alpine SA would be.

Add CloudWatch metric that counts received events

Add a CloudWatch metric for the count of events received by a Chatbot. Post "zero counts" so this metric can also be used as a health check.

Talk to CloudWatch through a Metrics class so that alternative monitoring systems could be used instead.

CloudWatch will be used when Limbo finds the environment variable LIMBO_CLOUDWATCH defined. This variable has the format "NameSpace&DimName=Value&DimName=Value&...&DimName=Value", which controls how metrics are sent to CloudWatch. Specifically, the given CloudWatch name space is used for metrics, and the provided dimensions are applied to all metrics. Metric names are hard-coded into Limbo. Right now, the only metric that is sent to CloudWatch is named EventCount, the event count talked about earlier in this description.
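
To make the variable's format concrete, here is a small parsing sketch. The function name and error messages are hypothetical; the Name/Value dictionaries follow the shape that boto3's put_metric_data accepts for its Dimensions field, which is an assumption about how the result would be consumed rather than a description of Limbo's Metrics class.

```python
def parse_cloudwatch_config(value):
    """Split "NameSpace&Dim=Val&..." into (namespace, dimensions)."""
    namespace, *pairs = value.split("&")
    if not namespace:
        raise ValueError("LIMBO_CLOUDWATCH must start with a namespace")
    dimensions = []
    for pair in pairs:
        name, sep, dim_value = pair.partition("=")
        if not sep or not name or not dim_value:
            raise ValueError("bad dimension %r, expected DimName=Value" % pair)
        dimensions.append({"Name": name, "Value": dim_value})
    return namespace, dimensions
```

A value with no dimension pairs, such as "Limbo", yields the namespace and an empty dimension list.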

Switch from BOTNAME to SERVICE_NAME

The use of BOTNAME has made the deployment scripts fragile, as they are dependent on people correctly setting the BOTNAME variable. (I was putting together detailed TravisCI setup instructions for the Wed lab and was hit again with this BOTNAME problem, which was the last straw.)

In fact, the BOTNAME variable is used to name the Docker Registry Repository used for a student's bot, plus the ECS Service used to run the bot. There's no reason for the name used for Docker Registry and ECS to agree with the BOTNAME.

Instead, let's switch to SERVICE_NAME, which won't be set manually but rather will be set by grabbing the username or organization name that "owns" the current repository in Github. In particular, the SERVICE_NAME is extracted from git using the following command line:

git remote get-url origin | sed 's|.*[:/]\([^:/]*\)/[^/]*$|\1|'

This is very specific to Github, but that's okay for 6.S188, and it greatly simplifies configuration.

The SERVICE_NAME variable will be set in the Makefile, and it will be set in such a way that it can be overridden by the SERVICE_NAME environment variable set by the caller of Make, so users are not stuck with the default, github-specific behavior.
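
For readers who find the sed expression hard to parse, the same extraction and override rule can be sketched in Python. This is an illustration only, since the issue places the logic in the Makefile; the function names are hypothetical, and unlike sed (which passes a non-matching line through unchanged) this version returns None when the URL does not match.

```python
import os
import re
import subprocess

# Same pattern as the sed command: capture the path segment just before
# the final "/<repo>" part of the remote URL.
_OWNER_RE = re.compile(r".*[:/]([^:/]*)/[^/]*$")


def owner_from_remote_url(url):
    """Return the user or organization that owns the repository, or None."""
    match = _OWNER_RE.match(url.strip())
    return match.group(1) if match else None


def service_name(environ=os.environ):
    """Use SERVICE_NAME from the environment if set, else derive it from git."""
    override = environ.get("SERVICE_NAME")
    if override:
        return override
    url = subprocess.check_output(
        ["git", "remote", "get-url", "origin"], text=True
    )
    return owner_from_remote_url(url)
```

Both SSH-style (git@github.com:owner/repo.git) and HTTPS-style remote URLs yield the owner segment.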

cloudwatch/pagerduty integration

We have verified that sending email from cloudwatch to pagerduty can trigger an alert when limbo is down. Would it be possible to use the PagerDuty API so that the incident can autoclear when the error condition clears?

Need more expressive error messages when credentials are wrong or missing

The error message below appeared during an attempt to eliminate the use of environment variables and rely on the AWS service instead. The problem was that the environment did not have access to the iap-secrets variables. A better error message could simply say that there is no access to iap-secrets. This is the error:

Missing master_SLACK_TOKEN
Makefile.windows:78: recipe for target 'ecs_stop' failed
make: *** [ecs_stop] Error 1

Switch Travis CI to Docker-based build and test

The upstream Limbo project uses Travis CI, but it doesn't use Docker for that purpose. Instead, it uses a direct build against a matrix of five versions of Python. In our case, since we're always running out of a Docker container with a fixed version of Python, we don't care to do matrix testing, but we do want to test to make sure that all the Docker stuff is working.

Further, the changes in this commit will set us up nicely to use Travis to push our build artifact to a Docker registry so it can be available for production.

Support configuration through a file or service

The ECS Console allows one to easily inspect the environment variables of an ECS task, which exposes the Slack token of Chatbots to all students. In general it's good practice to support configuration through files as well as environment variables, and in this case it would close a security hole.

Remove SLACK_TOKEN from environment of containers

The Slack token used by Limbo is passed via the SLACK_TOKEN environment variable. When running in ECS, this variable can be inspected with the ECS Web UI. A better approach would be to store the token in S3 or some other storage service that provides access control, and then use the IAM role provided by issue #22 to gain access to the token. (This would require a change to the initialization code of Limbo to provide an alternative mechanism for loading the Slack token. This alternative mechanism should be available through some sort of "feature flag," with the current environment-based initialization configured as the default.)

Aggregate posting of EventCount metric

The current implementation posts EventCount updates to CloudWatch for every iteration through the main loop in limbo.py. This loop executes once a second, and typically processes no events (and thus posts a zero to CloudWatch). Since CloudWatch charges by the number of times a metric is posted, this is not cost effective. (The AWS free tier supports 1M API calls free per month, including PutMetricData. A single bot running 24x7 all month would generate 2.7M API calls.) This is particularly wasteful because we aren't using "high resolution" events (which cost more), so on the CloudWatch side the EventCount metric is aggregated on a minute-wise basis anyway.

A better implementation would do some aggregation of the EventCount metric on the limbo side. An "optimal" solution would seek to post only once every 60s, because that's the aggregation boundary of CloudWatch. A simpler implementation would post once every ten seconds and could probably deliver much of the cost savings.
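
A minimal sketch of limbo-side aggregation follows. The class name, the publish callback, and the injectable clock are all hypothetical; the real change would sit behind whatever Metrics class talks to CloudWatch.

```python
import time


class EventCountAggregator:
    """Accumulates event counts and publishes the sum once per interval."""

    def __init__(self, publish, interval_seconds=60, clock=time.monotonic):
        self._publish = publish            # called with the aggregated count
        self._interval = interval_seconds
        self._clock = clock                # injectable so tests can fake time
        self._pending = 0
        self._last_flush = clock()

    def add(self, count):
        """Call once per main-loop iteration, passing 0 when idle."""
        self._pending += count
        now = self._clock()
        if now - self._last_flush >= self._interval:
            # A zero is still published, so the metric keeps working as a
            # health check.
            self._publish(self._pending)
            self._pending = 0
            self._last_flush = now
```

At one post per 60 seconds, a 31-day month comes to 44,640 calls instead of 2,678,400 at one per second; a ten-second interval would give 267,840.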

Generalize automation for upstream

The automation we've been adding lately -- especially in Makefile, ecr_push.sh, and deploy.sh -- is hardwired for the IAP class. For example, our Docker registry, the AWS region, and so forth are fixed in the scripts. To make our work more useful upstream, the tim77-specific assumptions should be parameterized.
