
go-choria's Introduction

Choria Broker and Server

Choria is a framework for building Control Planes, Orchestration Systems and Programmable Infrastructure.

This is a daemon and related tools written in Go that host services and autonomous agents, and generally provide a secure hosting environment for callable logic that you can interact with from code.

Additionally, this is the foundational technology for a monitoring pipeline called Choria Scout.

More information about the project can be found on Choria.IO.

go-choria's People

Contributors: ananace, bastelfreak, fklajn, jeffmccune, jpluscplusm, mpepping, mrbanzai, optiz0r, ploubser, ripienaar, smortex, treydock, vjanelle

go-choria's Issues

keep stats about protocol issues

  • JSON parse failures
  • JSON schema validation failures
  • Signing failures
  • messages that had their signatures validated
  • messages that had invalid signatures
  • invalid certificates received
  • general protocol errors - missing certs, unparseable certs etc
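A minimal sketch of such counters using the standard library's expvar package; the metric names here are illustrative assumptions, not a final naming scheme:

```go
package main

import (
	"expvar"
	"fmt"
)

// Counters for protocol level events, incremented from the security and
// protocol layers; expvar exposes them automatically on /debug/vars once
// an HTTP listener is running. All names below are hypothetical.
var (
	jsonParseFailures   = expvar.NewInt("choria_protocol_json_parse_failures")
	schemaFailures      = expvar.NewInt("choria_protocol_schema_validation_failures")
	signingFailures     = expvar.NewInt("choria_protocol_signing_failures")
	validSignatures     = expvar.NewInt("choria_protocol_valid_signatures")
	invalidSignatures   = expvar.NewInt("choria_protocol_invalid_signatures")
	invalidCertificates = expvar.NewInt("choria_protocol_invalid_certificates")
	protocolErrors      = expvar.NewInt("choria_protocol_errors")
)

func main() {
	// e.g. on a signature validation failure:
	invalidSignatures.Add(1)
	fmt.Println("invalid signatures so far:", invalidSignatures.Value())
}
```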

add buildinfo

Instead of showing the build details in choria --version, we should have a buildinfo sub command and have --version show just the version.

This currently breaks man page generation from kingpin

Build instructions and considerations

Hi!

I finally managed to build Choria on FreeBSD. The process was unexpectedly complicated, and I am not sure the process I used to build the code is exactly right.

Because I have no real experience with Go-based programs, I looked at how other Go software packages in the FreeBSD ports tree are built, and since there are a bunch of differences between them, it might make sense to consider a few points.

How I built choria

In case I did this utterly wrong, let me start by explaining how I built Choria:

git clone https://github.com/choria-io/go-choria
cd go-choria
go get
go build

This produced a working go-choria binary in the working directory \o/.

However, a ~/go directory was created and a lot of source code was downloaded there (I assume it's all the choria dependencies… 76 MB 😨).

Reproducible builds

After creating a new user account and building choria the same way (but at a different date), the content of the ~/go directory is not strictly the same (89 MB this time). Maybe I am wrong, but my guess is that the dependencies are not fetched at a particular commit, and at some point the build may break because a dependency got updated.

In order to package Choria on FreeBSD, the checksum of all sources must be registered so that they are checked before building. Most go applications in the FreeBSD ports tree have all their dependencies included in their source code (!), generally in a vendor directory (e.g. aptly, hub, syncthing, etc). While this is arguably ugly, it offers the benefit of reproducibility. Another option would be to add information about each dependency's version / commit, so that they could be checked. This is for example done for the go-cve-dictionary port, based on a Gopkg.lock file in upstream's repository (a lot more work for porters, but far less ugly at the repository level).

Moving into one of these directions (or something similar) would be awesome.

Binary name

The repository is called go-choria, and go build produces a go-choria binary. Can you confirm that the file should be renamed to choria when installed on the end-user system?

nats-io/go-nats dependency registered twice

It appears that the nats-io/go-nats dependency is registered twice in glide.lock, first here:

go-choria/glide.lock

Lines 46 to 50 in 8352d2b

- name: github.com/nats-io/go-nats
  version: d66cb54e6b7bdd93f0b28afc8450d84c780dfb68
  subpackages:
  - encoders/builtin
  - util

then here (with the name nats-io/nats that redirects to nats-io/go-nats):

go-choria/glide.lock

Lines 55 to 56 in 8352d2b

- name: github.com/nats-io/nats
  version: d66cb54e6b7bdd93f0b28afc8450d84c780dfb68

Notice how the second entry has the same commit as the first entry.

This was discovered while scripting dependency extraction from glide.yaml. I'll try to submit a PR that fixes the issue.

use os.Getuid rather than user.Current

At present user.Current() is used in a few places; unfortunately this does not work well when cross compiling since cgo isn't available then.

Annoyingly, Go is supposed to work around this by falling back to os.Getuid in those cases, but that appears not to work: running a cross compiled choria still produces errors about this when determining SSL paths etc.

So rip those out and use os.Getuid directly

keep stats in the connector

  • initial connect tries
  • initial connect total time
  • numbers of each type of message - direct, federated, broadcast etc
  • connection disconnects
  • connection reconnects
  • connection closed
  • connection errors

Keep stats of the federation broker

Old fed broker keeps stats and publishes this on the wire regularly so that mco federation observe can report on the global state.

We need at least something compatible for now, but down the line I'd like graphite/prometheus etc emitters.

Keep stats and expose

Keep stats using the https://github.com/rcrowley/go-metrics library:

  • Adapters
  • Federation Brokers
  • Network Server
  • Registration

Expose those via the expvar method for now, later we'll do graphite, prometheus etc

It should listen on plugin.choria.stats_port and be off by default
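A sketch of the off-by-default listener; statsServer returns nil when no port is configured. The handler path and wiring are assumptions:

```go
package main

import (
	"expvar"
	"fmt"
	"net/http"
)

// statsServer builds the stats listener for plugin.choria.stats_port;
// a port of 0 - the default - means stats stay disabled and nil is returned.
func statsServer(port int) *http.Server {
	if port == 0 {
		return nil
	}

	mux := http.NewServeMux()
	// expvar.Handler serves all registered expvar variables as JSON
	mux.Handle("/debug/vars", expvar.Handler())

	return &http.Server{Addr: fmt.Sprintf(":%d", port), Handler: mux}
}

func main() {
	if s := statsServer(8222); s != nil {
		fmt.Println("would serve stats on", s.Addr)
		// go s.ListenAndServe()
	}
}
```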

allow extra agents to be compiled in

It should be possible for files to be patched into the build process and built conditionally.

So say you have github.com/foo/go-foo-agent, you might patch in a file server/additional_agents_foo.go:

// +build foo

package server

import (
	"context"
	"fmt"

	fooagent "github.com/foo/go-foo-agent"

	"github.com/choria-io/go-choria/choria"
	"github.com/choria-io/go-choria/server/agents"
	"github.com/sirupsen/logrus"
)

func init() {
	registerAdditionalAgent(func(ctx context.Context, mgr *agents.Manager, connector choria.InstanceConnector, log *logrus.Entry) error {
		fa, err := fooagent.New(mgr)
		if err != nil {
			return fmt.Errorf("could not create foo agent: %s", err)
		}

		return mgr.RegisterAgent(ctx, "foo_agent", fa, connector)
	})
}

and then a go build -tags 'foo' should activate this agent

add a network broker

NATS is easily embedded, so let's make a choria broker --config /.... command that runs an embedded broker.

It should:

  • always use SSL as taken from the choria configs
  • support clusters with SRV resolution and manual configs
  • have minimal config
  • log to normal choria places/formats
  • not expose its own stats just yet, but a future integration for stats must be able to get at those

Proposed configs are:

  • plugin.choria.network_client_port
  • plugin.choria.network_peer_port
  • plugin.choria.network_peer_user
  • plugin.choria.network_peer_password
  • plugin.choria.network_peers

Down the line, as a feature that can be enabled using a compiler flag, it should support FIPS via something like https://github.com/spacemonkeygo/openssl which would let people build against the system openssl.

NATS doesn't play well with this library, so internally we'd open a listening port that takes normal TLS connections and routes them internally to the plain text NATS port. A future mcollective connector would then support the same - TLS connection to plain text NATS. Thus we can elevate NATS to FIPS compliance via a managed TLS proxy.

revisit 'mco rpc'

The default mco rpc tool is a bit meh and probably contributes greatly to the difficulty in using mcollective.

It was written before DDLs even existed and was never revisited.

This should be revisited to focus more on the problem it should solve, towards that I have come to the following basic sketch:

Goals:

  • Focus on what users will most often want to see by creating a dynamic interface
  • Rethink some of the user facing terminology, remove RPC in favor of Request etc
  • Gradually expose the complexity inherent in a generic RPC client rather than by default produce a wall of text
  • Have tab completion for every part of the cli, the mco completion can already do this
  • Use $PAGER for things like --action-doc if more than $LINES of output

There is one possibility I also want to explore, and that is a more interactive client that prompts you using the :prompt defined in the DDL. You'd start it up like choria puppet --interactive and it would ask you questions via prompts, defaults etc and construct the request - and possibly show you what command would have produced the same outcome, as a learning tool. This way people with almost no experience can interactively learn the system.

Some future suggestions:

  • Make some file where you can specify on a per agent/action basis defaults you always wish to apply, like say --batch or --noop for the runonce agent or whatever via @trevor-vaughan

The code used to produce the output can be found at https://gist.github.com/ripienaar/f68d2a9031b35f9dc3d467c9d85886ee - it just prints stuff and doesn't actually make requests.

Default action, show available agents

$ choria
Choria client version x.x.x

Usage: choria <agent> <action> [agent options] [request options]

Available agents:

  package        Install and uninstall software packages
  puppet         Run Puppet agent, get its status, and enable/disable it
  rpcutil        General helpful actions that expose stats and internals to SimpleRPC clients
  service        Start and stop system services

See choria <agent> --help for details about the agent

Per agent generated details

$ choria puppet
Puppet agent version 1.11.1

Usage: choria puppet <action> [agent options] [request options]

Run Puppet agent, get its status, and enable/disable it

Available actions:

  disable                Disable the Puppet agent
  enable                 Enable the Puppet agent
  last_run_summary       Get the summary of the last Puppet run
  resource               Evaluate Puppet RAL resources
  runonce                Invoke a single Puppet run
  status                 Get the current status of the Puppet agent

See choria puppet <action> --help for details about one of the actions

Per action view

Here we focus on showing the available inputs the action takes, turning them into --foo style flags and showing them as options.

All the old RPC noise is hidden by default behind --filter-help and --request-help; an additional --action-doc exists to show the DDL produced doc for the action.

Ideally these options would show as much as possible from the DDL - things like data type and default - but we have limited screen real estate.

$ choria puppet runonce --help
Puppet agent version 1.11.1

Usage: choria puppet runonce [agent options] [request options]

Run Puppet agent, get its status, and enable/disable it

Options for the runonce action:

    Use --action-doc to get details about these such as types, defaults and valid values

Optional options:
        --force                      Will force a run immediately else subject to default splay time
        --server                     Address and port of the Puppet Master in server:port format
        --tags                       Restrict the Puppet run to a comma list of tags
        --noop                       Do a Puppet dry run
        --splay                      Sleep for a period before initiating the run
        --splaylimit                 Maximum amount of time to sleep before run
        --environment                Which Puppet environment to run
        --use_cached_catalog         Determine if to use the cached catalog or not

Additional help:
        --action-doc                 View the documentation for the runonce action
        --filter-help                Help on selecting which nodes to act on
        --request-help               View a full set of request options

Per action DDL doc

This needs some iteration it really is just to show the idea here:

$ choria puppet runonce --action-doc
Puppet agent version 1.11.1

Definition of the runonce action

Action inputs:

  Optional options:
    environment (String):
      Description: Which Puppet environment to run
           Prompt: Environment
         Required: false
          Default: nil
       Max Length: 50
       Validation: puppet_variable

    force (Boolean):
      Description: Will force a run immediately else subject to default splay time
           Prompt: Force
         Required: false
          Default: nil
       Max Length: 0
       Validation: none

  <snip>


  Action outputs:

    initiated_at:
      Description: Timestamp of when the runonce command was issued
       Display As: Initiated at
          Default: 0

    summary:
      Description: Summary of command run
       Display As: Summary
          Default:

Host filters help

Supplying --filter-help will add just the filter options. For demo purposes this is just a copy/paste from mco rpc; some refining will be needed to make this suck a bit less.

$ choria puppet runonce --filter-help
Puppet agent version 1.11.1

Usage: choria puppet runonce [agent options] [request options]

Run Puppet agent, get its status, and enable/disable it

Options for the runonce action:

    Use --action-doc to get details about these such as types, defaults and valid values

Optional options:
        --force                      Will force a run immediately else subject to default splay time
        --server                     Address and port of the Puppet Master in server:port format
        --tags                       Restrict the Puppet run to a comma list of tags
        --noop                       Do a Puppet dry run
        --splay                      Sleep for a period before initiating the run
        --splaylimit                 Maximum amount of time to sleep before run
        --environment                Which Puppet environment to run
        --use_cached_catalog         Determine if to use the cached catalog or not

Additional help:
        --action-doc                 View the documentation for the runonce action
        --filter-help                Help on selecting which nodes to act on
        --request-help               View a full set of request options

Host Filters:
    -W, --with FILTER                Combined classes and facts filter
    -S, --select FILTER              Compound filter combining facts and classes
    -F, --wf, --with-fact fact=val   Match hosts with a certain fact
    -C, --wc, --with-class CLASS     Match hosts with a certain config management class
    -A, --wa, --with-agent AGENT     Match hosts with a certain agent
    -I, --wi, --with-identity IDENT  Match hosts with a certain configured identity

Full request help

For demo purposes this is just a copy/paste from mco rpc; some refining will be needed to make this suck a bit less.

$ choria puppet runonce --request-help
Puppet agent version 1.11.1

Usage: choria puppet runonce [agent options] [request options]

Run Puppet agent, get its status, and enable/disable it

Options for the runonce action:

    Use --action-doc to get details about these such as types, defaults and valid values

Optional options:
        --force                      Will force a run immediately else subject to default splay time
        --server                     Address and port of the Puppet Master in server:port format
        --tags                       Restrict the Puppet run to a comma list of tags
        --noop                       Do a Puppet dry run
        --splay                      Sleep for a period before initiating the run
        --splaylimit                 Maximum amount of time to sleep before run
        --environment                Which Puppet environment to run
        --use_cached_catalog         Determine if to use the cached catalog or not

Additional help:
        --action-doc                 View the documentation for the runonce action
        --filter-help                Help on selecting which nodes to act on
        --request-help               View a full set of request options

Request Modifiers:
        --no-results, --nr           Do not process results, just send request
        --np, --no-progress          Do not show the progress bar
    -1, --one                        Send request to only one discovered node
        --batch SIZE                 Do requests in batches
        --batch-sleep SECONDS        Sleep time between batches
        --limit-seed NUMBER          Seed value for deterministic random batching
        --limit-nodes, --ln, --limit COUNT
                                     Send request to only a subset of nodes, can be a percentage
    -j, --json                       Produce JSON output
        --display MODE               Influence how results are displayed. One of ok, all or failed
    -c, --config FILE                Load configuration from file rather than default
    -v, --verbose                    Be verbose

    -T, --target COLLECTIVE          Target messages to a specific sub collective
        --dt, --discovery-timeout SECONDS
                                     Timeout for doing discovery
    -t, --timeout SECONDS            Timeout for calling remote agents
    -q, --quiet                      Do not be verbose
        --ttl TTL                    Set the message validity period
        --reply-to TARGET            Set a custom target for replies
        --dm, --disc-method METHOD   Which discovery method to use
        --do, --disc-option OPTION   Options to pass to the discovery method
        --nodes FILE                 List of nodes to address
        --publish_timeout TIMEOUT    Timeout for publishing requests to remote agents.
        --threaded                   Start publishing requests and receiving responses in threaded mode.
        --sort                       Sort the output of a request before processing.
        --connection-timeout TIMEOUT Set the timeout for establishing a connection to the middleware

explore live provisioning

In large setups the desired middleware etc might not be known upfront when the machine is being installed.

Imagine a machine built in a DC so large that there are multiple choria networks in the same DC, or perhaps you just want to create separate networks for whatever reason.

I imagine a process like this:

  • At start look for a configuration file with choria.provision=1 set, go into provision mode if a provision server was compiled in

At this point if in provision mode it will try to connect to a compiled in nats server, or maybe one given on the CLI:

  • Connects to the provision collective
  • Publishes metadata without splay every n seconds
  • Wait for something to interact with the provision agent to tell it its configuration

The provisioning agent:

  • Receives a request to store configuration - which includes choria.provision=0
  • Writes the configuration and re-execs itself with the new config

We now have a normal configured choria, it:

  • Starts the provision agent should there be a provision URL compiled in
  • The provision agent exposes just a reprovision action that makes it write choria.provision=1, copy over logging and registration settings from the running instance, and reload

Any configuration file that gets loaded into the framework - even ones passed into it - should adjust itself this way when provisioning is on:

  • Turn off federation
  • Set main_collective and collectives to provisioning
  • Set registration interval to 120
  • Disable registration splay
  • Set the file_content registration target to choria.provisioning_data
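The configure-then-restart step above could use exec so the process is replaced in place; a rough sketch, where the CLI argument layout is made up for illustration:

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// reexecArgs builds the argument vector used to restart the daemon with
// its freshly written configuration; the flag layout is hypothetical.
func reexecArgs(self string, configFile string) []string {
	return []string{self, "server", "run", "--config", configFile}
}

// reexec replaces the running process with a new instance reading the
// provisioned configuration (Unix only, via execve).
func reexec(configFile string) error {
	self, err := os.Executable()
	if err != nil {
		return err
	}

	return syscall.Exec(self, reexecArgs(self, configFile), os.Environ())
}

func main() {
	fmt.Println(reexecArgs("/usr/sbin/choria", "/etc/choria/server.conf"))
}
```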

check request TTLs

The protocol layer does not check TTLs, so we have to do it in server.handleRawMessage.

fix TLS route connections

In version 0.0.2, trying to set up TLS routes yields:

{"component":"network_broker","level":"debug","msg":"192.168.88.39:5222 - rid:2 - TLS route handshake error: x509: certificate signed by unknown authority","time":"2017-12-10T13:51:44Z"}
{"component":"network_broker","level":"debug","msg":"192.168.88.39:5222 - rid:2 - Router connection closed","time":"2017-12-10T13:51:44Z"}

Appears we're missing some TLS setup from NATS still

cache dns lookups

The way the go nats package resolves servers results in many concurrent DNS lookups in every worker in every federation broker and adapter.

This stuff should be shared and cached - even a 5 second cache will help a ton

improve writing adapters

The current adapter is the first possible thing that worked - hacky, and it just served the need I had at the time.

A better adapter framework should exist. NATS ingest seems likely to be the most prolific use, so that should be a parameter to a well written package; the other side can stay roughly as it is now, but the whole thing should be written around channels and context for plumbing rather than the meh way it is now.
