cashapp / misk — Microservice Kontainer
Home Page: https://cashapp.github.io/misk/
License: Apache License 2.0
Currently action methods can only support a single request content type, and content negotiation picks actions based on the incoming content type. We should allow actions to support multiple request content types; this would allow a single action to accept both JSON and protobuf wire messages, for example.
We've added a number of actions to the Kotlin exemplar (for websockets and form actions) but haven't kept the Java version up to date. These need to be added to the Java variant to ensure they function properly in Java (right now the Java exemplar doesn't even start).
Currently misk uses the order in which actions were registered with Guice to determine precedence when routing requests. This is fragile: Guice module order is often effectively random, and the Guice modules used during testing don't always line up with production. Instead of relying on installation ordering, misk should have a deterministic precedence rule for routing requests to actions. Many routing frameworks determine precedence by how exactly the pattern matched the path: when a request comes in, find the set of candidate actions by pattern-matching on the path, then pick the action whose pattern has the most non-variable path segments. If this results in a tie between multiple actions, pick the action whose pattern has the greatest number of non-wildcard variable segments.
So given actions with the following patterns:

AdminObjectLookupAction = /admin/{admin_type}/{admin_container}/{admin_command}/{admin_object}
GenericAdminAction = /admin/{admin_path:*}
OrgAdminAction = /admin/org/{org}/{admin_command}/{admin_object}
OrgUserAdminAction = /admin/org/{org}/users/{user}
NotFoundAction = /{path:*}
/admin/org/foo/bar/zed
matches [OrgAdminAction, AdminObjectLookupAction, GenericAdminAction, NotFoundAction] and selects OrgAdminAction, since it has the most non-variable path segments (2, vs 1, 1, and 0 respectively)
/admin/org/foo/users
matches [OrgUserAdminAction, OrgAdminAction, AdminObjectLookupAction, GenericAdminAction, NotFoundAction] and selects OrgUserAdminAction, since it has the most non-variable path segments
/admin/foo/bar/zed/nolo
matches [AdminObjectLookupAction, GenericAdminAction, NotFoundAction] and picks AdminObjectLookupAction; it ties with GenericAdminAction on non-variable segments, but has the most non-wildcard variable segments
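A minimal sketch of this precedence rule, using a simplified hypothetical PathPattern model rather than misk's actual routing types:

```kotlin
// Hypothetical, simplified path pattern: a segment is a variable if written
// {name}, and a wildcard variable if written {name:*}.
data class PathPattern(val name: String, val segments: List<String>) {
  val nonVariableSegments = segments.count { !it.startsWith("{") }
  val nonWildcardVariableSegments =
      segments.count { it.startsWith("{") && !it.endsWith(":*}") }
}

// Among the candidates that matched the path, prefer the pattern with the
// most non-variable segments, breaking ties by the number of non-wildcard
// variable segments.
fun selectAction(candidates: List<PathPattern>): PathPattern? =
    candidates.maxWithOrNull(
        compareBy({ it.nonVariableSegments }, { it.nonWildcardVariableSegments }))
```

With the patterns above, OrgAdminAction (2 literal segments) beats AdminObjectLookupAction (1), and AdminObjectLookupAction beats GenericAdminAction on the non-wildcard-variable tie-breaker.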
Currently misk routes requests to actions based on the combination of path and method. The routing should take into account the Content-Type and Accept headers as well, with the latter also influencing the unmarshalling of requests and marshalling of responses. The proposal is to introduce new @RequestContentType and @ResponseContentType annotations. The @RequestContentType annotation takes a list of media ranges, and the @ResponseContentType annotation takes a media type. When a request arrives, we find the appropriate action by checking that the request Content-Type matches one of the action's @RequestContentType media ranges, and that the action's @ResponseContentType media type can satisfy the request's Accept media ranges.

To handle content-negotiation-based unmarshalling, we remove the existing @XXXRequestBody annotations and associated ParameterExtractors and introduce a new @PostRequestBody annotation and PostRequestBodyExtractor. The PostRequestBodyExtractor looks at the request Content-Type and finds an associated Unmarshaller for that Content-Type. Like ParameterExtractors, Unmarshallers are created via factories registered with the runtime.
interface Unmarshaller {
  fun unmarshal(source: BufferedSource): Any?

  interface Factory {
    fun create(mediaType: MediaType, parameter: KParameter): Unmarshaller?
  }
}
class JsonUnmarshaller(val adapter: JsonAdapter<Any>) : Unmarshaller {
  override fun unmarshal(source: BufferedSource) = adapter.fromJson(source)

  class Factory @Inject internal constructor(val moshi: Moshi) : Unmarshaller.Factory {
    override fun create(mediaType: MediaType, parameter: KParameter): Unmarshaller? {
      if (mediaType.type() != "application" || mediaType.subtype() != "json") return null
      return JsonUnmarshaller(moshi.adapter<Any>(parameter.type.javaType))
    }
  }
}
To handle content-negotiation-based marshalling, we remove the existing @XXXResponseBody annotations and associated interceptors, and replace them with a single MarshallingInterceptor. The MarshallingInterceptor uses the @ResponseContentType associated with the action to find a Marshaller for the action's return type. Marshallers are also created via factories registered with the runtime.
interface Marshaller<in T> {
  fun contentType(): MediaType
  fun responseBody(o: T): ResponseBody

  interface Factory<out T> {
    fun create(mediaType: MediaType, action: Action): Marshaller<T>?
  }
}
class JsonMarshaller<T>(val adapter: JsonAdapter<T>) : Marshaller<T> {
  override fun contentType(): MediaType = Json.MEDIA_TYPE

  override fun responseBody(o: T) = object : ResponseBody {
    override fun writeTo(sink: BufferedSink) {
      adapter.toJson(sink, o)
    }
  }

  class Factory @Inject internal constructor(val moshi: Moshi) : Marshaller.Factory<Any> {
    override fun create(mediaType: MediaType, action: Action): Marshaller<Any>? {
      if (mediaType.type() != "application" || mediaType.subtype() != "json") return null
      val responseType = when {
        action.returnType.rawType == Response::class.java -> {
          (action.returnType.type as ParameterizedType).actualTypeArguments[0]
        }
        else -> action.returnType.type
      }
      return JsonMarshaller<Any>(moshi.adapter<Any>(responseType))
    }
  }
}
A single media type can have different marshallers for different Kotlin response types; in the example above, a single marshaller supports both actions returning Response types and actions returning T directly, but you could have separate marshallers for each.

If an action lacks a @RequestContentType annotation, it is assumed to have the equivalent of @RequestContentType("*/*"), meaning it accepts all content types. If an action lacks a @ResponseContentType annotation, it is required to return a Response and handle all marshalling and header management itself.
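A rough sketch of the request-side matching step, with MediaRange as a simplified stand-in for a real media-range type (in practice okhttp's MediaType or similar would be used):

```kotlin
// Simplified media range: "*" in either position matches anything.
data class MediaRange(val type: String, val subtype: String) {
  fun matches(contentType: String): Boolean {
    val parts = contentType.split("/")
    return parts.size == 2 &&
        (type == "*" || type == parts[0]) &&
        (subtype == "*" || subtype == parts[1])
  }
}

// An action accepts a request if any of its @RequestContentType media
// ranges matches the request's Content-Type.
fun actionAccepts(requestContentTypes: List<MediaRange>, contentType: String) =
    requestContentTypes.any { it.matches(contentType) }
```

An action with no @RequestContentType annotation would simply get the single range MediaRange("*", "*").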
Currently the Hibernate service does nothing of note: it starts up and it shuts down. It needs to be fleshed out to support querying and modifying a database. The main points will look something like:
MiskApplication (or maybe someplace else) should automatically detect the cloud provider in which it is running and configure health systems appropriately (e.g. set up StackDriver / CloudWatch friendly logging, retrieve and cache InstanceMetadata, etc). We can determine the cloud provider by hitting the provider-specific instance metadata endpoints:
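A rough sketch of what detection could look like; the URLs below are the well-known AWS and GCP instance metadata addresses, but the timeouts, the CloudProvider type, and the overall shape are assumptions:

```kotlin
import java.net.HttpURLConnection
import java.net.URL

enum class CloudProvider { AWS, GCP, UNKNOWN }

// Returns true if the endpoint answers with a 2xx within the timeout.
fun reachable(url: String, headers: Map<String, String> = emptyMap()): Boolean =
    try {
      val conn = URL(url).openConnection() as HttpURLConnection
      conn.connectTimeout = 500
      conn.readTimeout = 500
      headers.forEach { (k, v) -> conn.setRequestProperty(k, v) }
      conn.responseCode in 200..299
    } catch (e: Exception) {
      false
    }

fun detectCloudProvider(): CloudProvider = when {
  // GCP's metadata server requires the Metadata-Flavor header
  reachable("http://metadata.google.internal/computeMetadata/v1/",
      mapOf("Metadata-Flavor" to "Google")) -> CloudProvider.GCP
  // AWS exposes instance metadata at the link-local address
  reachable("http://169.254.169.254/latest/meta-data/") -> CloudProvider.AWS
  else -> CloudProvider.UNKNOWN
}
```

The result should be cached at startup rather than re-probed per request.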
In parallel with the StackDriver and Graphite backends. Should include @AppName and InstanceMetadata components as tags, so we can do service-level slicing in SignalFx.
Similar to flags, the interface for this might need to live within misk, with the specific implementation (zk based, datastore based, etc) living elsewhere.
This will allow features such as real-time admin dashboards, which should offer a responsive user experience.
Should have the ability to emit logs to StackDriver - see https://cloud.google.com/logging/docs/reference/libraries#client-libraries-install-java
Right now we don't have a persistence module. Google's Datastore is a good first candidate: https://cloud.google.com/datastore/docs/reference/libraries#client-libraries-install-java
Should provide an interface for listening / responding to dynamic flag changes. Interface probably needs to live within misk itself so misk core components can leverage it, but implementation might live elsewhere.
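A sketch of what such an interface might look like; the names (DynamicFlag, FlagListener) are assumptions, and the real backends (zk, datastore, etc) would call update when the stored value changes:

```kotlin
fun interface FlagListener<T> {
  fun onChange(newValue: T)
}

// Hypothetical core interface: misk components subscribe to a flag and
// react when the backing implementation reports a change.
class DynamicFlag<T>(initial: T) {
  @Volatile var value: T = initial
    private set
  private val listeners = mutableListOf<FlagListener<T>>()

  fun subscribe(listener: FlagListener<T>) {
    listeners += listener
  }

  // invoked by the backend implementation when the flag store changes
  fun update(newValue: T) {
    value = newValue
    listeners.forEach { it.onChange(newValue) }
  }
}
```

Keeping this interface in misk core while binding implementations via Guice modules matches the split described above.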
We should have a script to easily create a JAR (and optionally publish it), so we can release easily and often. @swankjesse I think you mentioned you'd be okay to take this on?
Currently the ConfigModule only supports loading one file, and does not respect default values of data classes. There should be a simple naming scheme for overriding configurations, i.e. foo-production.yaml overrides foo-common.yaml, and a config class similar to the following:

data class FooConfig(
  val port: Int = 8080,
  val max_connections: Int = 300
) : Config

should by default have port and max_connections set without explicit configuration, to reduce the verbosity of config YAML files.
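To illustrate the intended override semantics without the YAML machinery, a sketch where values present in the override (e.g. parsed from foo-production.yaml) win and anything unset falls back to the data class defaults; the merge helper is purely hypothetical:

```kotlin
data class FooConfig(
  val port: Int = 8080,
  val max_connections: Int = 300
)

// Hypothetical merge: keys present in the override map replace values from
// the base config, everything else keeps its current (possibly default) value.
fun merge(base: FooConfig, overrides: Map<String, Any>) = base.copy(
    port = overrides["port"] as? Int ?: base.port,
    max_connections = overrides["max_connections"] as? Int ?: base.max_connections
)
```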
For handling form values posted with the application/x-www-form-urlencoded and multipart/form-data content types. Should extract specific form values and convert them into nullable primitive types, lists of nullable primitive types, and objects that can be encoded as JSON.
When running exemplar under Java 9, the following warning is reported:
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.inject.internal.cglib.core.$ReflectUtils$1 (file:/Users/mmihic/Development/misk/exemplar/build/libs/exemplar-0.1.1-SNAPSHOT.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
WARNING: Please consider reporting this to the maintainers of com.google.inject.internal.cglib.core.$ReflectUtils$1
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Possibly we need to update Guice.
With adapters for StackDriver tracing, as well as Zipkin.
Looks like exemplar only works with plain HTTP right now. However, the normal mode of operation will likely be TLS (and mutual TLS specifically), so the exemplar should reflect that.
Should define interfaces with implementations layered on Kinesis and cloud pub/sub. Note there are significant differences between these models, specifically Kinesis supporting ordered delivery and cloud pub/sub only supporting unordered delivery. Might need to allow applications to choose whether ordered delivery is required (determined as a property of the topic, not the client), with different implementations on GCP for each (ordered == lower throughput, backed by datastore; unordered == higher throughput, backed by pub/sub).
Currently the marshaller works off the return type of the action method. We likely want to allow interceptors to override this and return other response bodies without requiring that they implement their own marshalling. For example, when handling an exception, we may want to allow the exception mapping interceptor to return a structured response body in JSON format. Similarly, other interceptors might short-circuit the dispatch chain and want to return a customized response (e.g. an authentication interceptor that returns a redirect plus a structured response about how to authenticate). These should be possible without requiring the interceptors to write their own custom ResponseBody; they should be able to rely on the marshalling framework to do the content negotiation and object-to-ResponseBody conversion for them.
Provide some sort of mechanism for running tasks in process on a defined schedule, with appropriate metrics and configuration-/flag-driven schedule management. Should allow programmatic adjustment to schedule based on status of task execution. These are recurring tasks, not programmatic submission of jobs based on application events.
Each service should be able to provide resources such as HTML, CSS, etc, which Misk can combine to form an admin dashboard. The data displayed would be used for debugging and administrative purposes, such as setting dynamic config.
One of the nice features of dropwizard / the internal SC is the ability to register command line utilities inside of the main application class, so that you can do things like:
java -jar buckets.jar upload-protos -f /path/to/my/protos.zip -b MyBucket
instead of
java -cp buckets.jar com.squareup.buckets.cli.UploadProtos -f /path/to/my/protos.zip -b MyBucket
Might be good to provide a similar ability. Important to note that different commands will almost certainly require different sets of Guice modules (e.g. only the server start command would embed the JettyModule, for instance).
We should provide a DatasourceModule for connecting to a JDBC database instance. This should not be tied to Hibernate or any other framework; it should just bind a raw JDBC Datasource, optionally qualified by an annotation specified by the installer. Services that want to use higher-level frameworks can install other modules that use the bound datasource to initialize those frameworks. The DatasourceModule should use the Hikari connection pool and be opinionated about which configuration options are allowed (fixed pool size, pool check timeout, etc). It should also install appropriate readiness checks to prevent a service from reporting itself as ready until connectivity to the database has been confirmed, and expose appropriate database metrics.
Example install:
fun main(args: Array<String>) {
  val serviceConfig = MiskConfig.load<ExemplarConfig>("exemplar",
      Environment.fromEnvironmentVariable())
  MiskApplication(
      listOf(
          ConfigModule.create("exemplar", serviceConfig),
          TracingModule(serviceConfig.tracing),
          // Installs datasource support, binding it to an unqualified Datasource
          DatasourceModule("exemplar"),
          // Installs a second datasource for read-replica access, binding it to
          // a Datasource qualified by the @ReadReplica binding annotation
          DatasourceModule("exemplar-readonly", annotatedBy = ReadReplica::class))
  ).run(args)
}
class MyAction : WebAction {
  @Inject lateinit var primaryDatasource: DataSource
  @Inject @ReadReplica lateinit var readOnlyDatasource: DataSource
}
Example configuration:
default_db_settings: &default_db_settings
  type: mysql
  username: exemplar
  password:
  connection_properties: zeroDateTimeBehavior=convertToNull
  fixed_pool_size: 10
  pool_check_timeout: 30ms
  max_conn_lifetime: 1m

datasources:
  exemplar:
    <<: *default_db_settings
    host: exemplar.cluster-cmxb34unfd.us-west-2.rds.amazonaws.com
  exemplar-readonly:
    <<: *default_db_settings
    host: exemplar.cluster-cmxb34unfd-ro.us-west-2.rds.amazonaws.com
Theoretical example also using Hibernate:
fun main(args: Array<String>) {
  val serviceConfig = MiskConfig.load<ExemplarConfig>("exemplar",
      Environment.fromEnvironmentVariable())
  MiskApplication(
      listOf(
          ConfigModule.create("exemplar", serviceConfig),
          TracingModule(serviceConfig.tracing),
          // Installs datasource support, binding it to a Datasource qualified by
          // a @Primary binding annotation
          DatasourceModule("exemplar", annotatedBy = Primary::class),
          // Installs a second datasource for read-only access, binding it to a Datasource
          // qualified by a @ReadReplica binding annotation
          DatasourceModule("exemplar-readonly", annotatedBy = ReadReplica::class),
          // Installs hibernate support, telling it to use the Datasource with the @Primary
          // binding annotation for writable transactions and the Datasource with the
          // @ReadReplica binding annotation for read-only access
          HibernateModule(
              datasourceAnnotatedBy = Primary::class,
              readOnlyDatasourceAnnotatedBy = ReadReplica::class))
  ).run(args)
}
Currently the configuration framework only reads configuration YAML files that have been embedded into the application jar, and the only way to customize the application's configuration for local development is to edit the configuration files that have been committed to git. This is a poor development experience: it means remembering to skip those hand-modified files when building commits, or risking accidentally committing a development configuration specific to an individual machine. Propose that we provide a way to specify an override for the embedded YAML files at runtime, e.g. through an environment variable that points to an external YAML file that gets loaded instead of (or after) all of the embedded files.
With this, the development flow would look like:
src/main/resources contains the default configuration (<service_name>-common.yaml) and a development configuration (<service_name>-development.yaml) that runs "out of the box" inside a local docker image built from the service-supplied Dockerfile. Typically this development configuration will use the same hostnames, ports, and paths as the service would use in a staging/production configuration, which are local to the docker image / user-bridged docker network.

Developers that want to run a service on their laptop outside of Docker can create a YAML file anywhere on their local file system to explicitly control all of the configuration for the service, then tell the service to load that custom configuration via an env var (or some equivalent means).
The PathPatternParameterExtractorFactory unconditionally binds itself to all string parameters, regardless of their annotations. This causes the action to blow up if the string parameter has an annotation indicating it comes from somewhere else (header value, query parameter, request body). The extractor should only bind to strings that are explicitly annotated with @PathParam.
Submission of background tasks based on application demand, with tasks being dispatched by cloud infra and possibly executed in another process. Likely implementation on top of SQS for AWS, no real equivalent for GCP until cloud tasks becomes available (maybe can use some combination of datastore for tracking task state, with either polling or sending trigger messages via cloud pub/sub).
This repo uses JUnit 4 for now, but we should migrate to JUnit 5.
I wanna split the interceptors into two halves:
In the middle is our bridge, so users can decide whether they want a Response or a MyResponseObject.
Applications can indicate errors either by throwing an exception or by returning a Response object with a non-2XX response code. The tracing interceptor should understand both. The easiest way to do this is probably to follow the pattern set by MetricsInterceptor: install the tracing interceptor closer to the transport than the ExceptionMappingInterceptor (which translates exceptions to response codes), and check the Response.statusCode field.
The TracingInterceptor is installed closer to the application than the ExceptionMappingInterceptor, which means it receives exceptions from the application before they get translated into Response objects. Further, the TracingInterceptor is catching and swallowing exceptions from chain.proceed, which means application exceptions are not properly propagated to clients when tracing is installed.

Also, the TracingInterceptor should not log exceptions from chain.proceed; the ExceptionMappingInterceptor controls exception logging, based on configuration. In general, interceptors should never log exceptions from chain.proceed.
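A minimal model of the proposed ordering, with all types as simplified stand-ins for misk's real interceptor API: ExceptionMapping runs closer to the application, so Tracing (closer to the transport) only ever sees Response objects and can tag spans from the status code without catching anything itself.

```kotlin
data class Response(val statusCode: Int)

fun interface Chain {
  fun proceed(): Response
}

// Closer to the application: translates exceptions into status codes and
// owns the exception-logging policy.
class ExceptionMappingInterceptor(private val next: Chain) : Chain {
  override fun proceed(): Response = try {
    next.proceed()
  } catch (e: IllegalArgumentException) {
    Response(400) // client error
  } catch (e: Exception) {
    Response(500) // server error
  }
}

// Closer to the transport: never catches or logs, just observes the
// Response.statusCode to tag the span.
class TracingInterceptor(private val next: Chain) : Chain {
  var lastTaggedStatus: Int? = null
  override fun proceed(): Response {
    val response = next.proceed()
    lastTaggedStatus = response.statusCode
    return response
  }
}
```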
All HTTP and RPC endpoints should expose a set of standardized metrics, including:
Lots of tools process Apache httpd access logs, and the format is handy for debugging. Would be useful to have an interceptor that emits this log format.
misk currently offers nothing for sending requests to other services. It should have some strategy for this, at minimum to provide a standard way to hook in cross-cutting concerns such as context propagation, end-to-end request deadlines, chain-of-trust establishment, etc. Could range from having standardized okhttp client interceptors to a full-blown stub model.
We now have several illegal combinations of parameter types and action annotations; for example, non-POST/PUT methods cannot take a @RequestBody, and methods that are not @ConnectWebsocket cannot take WebSockets. We should check method signature compatibility on startup and fail fast if a method's parameters are incompatible with the type of action being bound.
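A sketch of what a fail-fast startup check could look like; ActionMetadata is an assumed simplified stand-in for the real binding metadata:

```kotlin
// Hypothetical, flattened view of an action binding.
data class ActionMetadata(
  val httpMethod: String,
  val hasRequestBodyParam: Boolean,
  val hasWebSocketParam: Boolean,
  val isWebSocketAction: Boolean
)

// Collects every signature violation so startup can report them all at once
// before failing.
fun validate(action: ActionMetadata): List<String> {
  val errors = mutableListOf<String>()
  if (action.hasRequestBodyParam && action.httpMethod !in setOf("POST", "PUT")) {
    errors += "@RequestBody is only allowed on POST/PUT actions"
  }
  if (action.hasWebSocketParam && !action.isWebSocketAction) {
    errors += "WebSocket parameters require @ConnectWebsocket"
  }
  return errors
}
```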
This will be useful when determining which YAML files to parse for config, as mentioned in #1. Examples of environments are testing, staging, production, etc.
A simple implementation could just be a String that gets bound so you can inject it like
@Inject @Env String environment
Misk can provide a module that binds it based on an environment variable. Alternatively a module can statically provide an environment, and an app can select which to install.
Currently we log all errors that occur in the dispatch pipeline. This can be noisy, especially if a client is submitting a lot of bad data. Consider allowing applications to determine which exceptions get logged and at what level; we might want to (for example) log IllegalArgumentExceptions and other client-generated errors as warnings, or suppress them altogether.
misk is starting to get big - it's probably about time to figure out if and how to split it up into smaller source+build components. For example, we could have a structure that looks something like the following, where each is a separate buildable artifact.
Right now we have exemplar which shows how to use misk to write a single service. It'd be very beneficial to have an actual running exemplar service ecosystem - e.g. three services all of which talk to each other on a continual basis, hooked into metrics and logging, etc. - so that we can demonstrate service interaction and integration with systems health infrastructure. This can also be used as the basis for monitoring of the base misk componentry, making sure that e.g. changes don't break systems health integration and so on.
Pretty much the entirety of the web dispatch layer is untested. This needs tests.
Websocket messages bypass the existing action dispatch framework, so we don't have common interceptors running. This might cause a variety of issues, particularly with things like tracing, auditing, and credential propagation which expect to have request level context established. We probably don't want to piggyback on existing interceptors since the websockets API is divergent from the actions API and interceptors will likely blow up if they are called in the wrong context, but may need to create a parallel interceptor specifically for websockets.
Possibly making use of HikariCP. This is a prerequisite for Issue #10.
A Pagerduty service would offer easy alerting to Pagerduty, by accepting an API key and creating incidents directly.
Other services like Jetty could alert if there are too many 500's over a time period, as a simple example.
Misk should offer an opinionated metrics service that acts as a facade to multiple backends.
For example, the counter, timer, gauge, and histogram models are all commonly used.
See Dropwizard metrics and Prometheus for examples.
To pull from query parameters. Should do proper type conversion into nullable primitive types (Int?, Long?, Double?, etc), lists of primitive types (for when the query parameter appears more than once), and support optional non-nullable primitive parameters (which probably necessitates switching from call to callBy when the target action method has optional parameters and/or the incoming request does not contain mappings for all of the target method's parameters).
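The conversion side can be sketched without the reflection machinery; the helpers below are hypothetical, and real extraction would key off each parameter's declared KType rather than explicit intParam/intListParam calls:

```kotlin
// Parse a raw query string into name -> values (a name may repeat).
fun queryParams(query: String): Map<String, List<String>> =
    query.split("&").filter { it.isNotEmpty() }
        .map { it.split("=", limit = 2) }
        .groupBy({ it[0] }, { it.getOrElse(1) { "" } })

// A missing or malformed parameter yields null rather than an error,
// matching the nullable primitive types described above.
fun intParam(params: Map<String, List<String>>, name: String): Int? =
    params[name]?.firstOrNull()?.toIntOrNull()

// Repeated parameters become a list, each element converted independently.
fun intListParam(params: Map<String, List<String>>, name: String): List<Int?> =
    params[name]?.map { it.toIntOrNull() } ?: emptyList()
```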
It seems that Jackson just treats the merge key << as a regular key, even though it uses SnakeYAML as a backend (which does resolve merge keys). Need to investigate how to help Jackson properly handle merge keys.
Applications and services should be able to register internal health checks which are run periodically, and exposed to outside systems (load balancer health checkers, monitoring tools) via a _status endpoint.
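A sketch of what a registry feeding the _status endpoint could look like; the names here are assumptions, not misk's actual API:

```kotlin
fun interface HealthCheck {
  fun isHealthy(): Boolean
}

// Hypothetical registry: checks are registered by services, run
// periodically (or on demand here), and the _status endpoint renders
// the per-check results for load balancers and monitoring tools.
class HealthCheckRegistry {
  private val checks = mutableMapOf<String, HealthCheck>()

  fun register(name: String, check: HealthCheck) {
    checks[name] = check
  }

  fun status(): Map<String, Boolean> = checks.mapValues { it.value.isHealthy() }

  // the service is healthy only if every registered check passes
  fun isHealthy(): Boolean = status().values.all { it }
}
```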
Should switch to either JSSE or OpenSSL integration for handling SSL on both the client and server. I think we decided on OpenSSL but don't recall exactly (@swankjesse?)