cashapp / misk — Microservice Kontainer
Home Page: https://cashapp.github.io/misk/
License: Apache License 2.0
Currently action methods can only support a single request content type, and content negotiation picks actions based on the incoming content type. We should allow actions to support multiple request content types; this would allow a single action to accept both JSON and protobuf wire messages, for example.
We've added a number of actions to the Kotlin exemplar (for websockets and form actions) but haven't kept the Java version up to date. These need to be added to the Java variant to ensure they function properly in Java (right now the Java exemplar doesn't even start).
Currently misk uses the order in which actions were registered with Guice to determine precedence when routing requests. This is fragile: Guice module order is often effectively random, and the Guice modules used during testing don't always line up with production. Instead of relying on installation ordering, misk should have a deterministic precedence rule for routing requests to actions. Many routing frameworks determine precedence by how exactly the pattern matched the path: when a request comes in, find the set of candidate actions by pattern-matching on the path, then pick the action whose pattern has the most non-variable path segments. If this results in a tie between multiple actions, pick the action whose pattern has the greatest number of non-wildcard variable segments.
So given actions with the following patterns:

AdminObjectLookupAction = /admin/{admin_type}/{admin_container}/{admin_command}/{admin_object}
GenericAdminAction = /admin/{admin_path:*}
OrgAdminAction = /admin/org/{org}/{admin_command}/{admin_object}
OrgUserAdminAction = /admin/org/{org}/users/{user}
NotFoundAction = /{path:*}
/admin/org/foo/bar/zed
matches [OrgAdminAction, AdminObjectLookupAction, GenericAdminAction, NotFoundAction] and selects OrgAdminAction, since it has the most non-variable path segments (2, vs 1, 1, and 0 respectively)
/admin/org/foo/users
matches [OrgUserAdminAction, OrgAdminAction, AdminObjectLookupAction, GenericAdminAction, NotFoundAction] and selects OrgUserAdminAction, since it has the most non-variable path segments
/admin/foo/bar/zed/nolo
matches [AdminObjectLookupAction, GenericAdminAction, NotFoundAction] and picks AdminObjectLookupAction; it ties with GenericAdminAction on non-variable segments, but has the most non-wildcard variable segments
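A minimal sketch of this precedence rule, using a simplified hypothetical PathPattern model rather than misk's actual routing types:

```kotlin
// Hypothetical, simplified path pattern: a segment is a variable if written
// {name}, and a wildcard variable if written {name:*}.
data class PathPattern(val name: String, val segments: List<String>) {
  val nonVariableSegments = segments.count { !it.startsWith("{") }
  val nonWildcardVariableSegments =
      segments.count { it.startsWith("{") && !it.endsWith(":*}") }
}

// Among the candidates that matched the path, prefer the pattern with the
// most non-variable segments, breaking ties by the number of non-wildcard
// variable segments.
fun selectAction(candidates: List<PathPattern>): PathPattern? =
    candidates.maxWithOrNull(
        compareBy({ it.nonVariableSegments }, { it.nonWildcardVariableSegments }))
```

With the patterns above, OrgAdminAction (2 literal segments) beats AdminObjectLookupAction (1), and AdminObjectLookupAction beats GenericAdminAction on the non-wildcard-variable tie-breaker.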
Currently misk routes requests to actions based on the combination of path and method. The routing should take into account the Content-Type and Accept headers as well, with the latter also influencing the unmarshalling of requests and marshalling of responses. The proposal is to introduce new @RequestContentType and @ResponseContentType annotations. The @RequestContentType annotation takes a list of media ranges, and the @ResponseContentType annotation takes a media type. When a request arrives, we find the appropriate action by checking that the request Content-Type matches one of the action's @RequestContentType media ranges, and that the action's @ResponseContentType media type can satisfy the request's Accept media ranges.

To handle content-negotiation-based unmarshalling, we remove the existing @XXXRequestBody annotations and associated ParameterExtractors and introduce a new @PostRequestBody annotation and PostRequestBodyExtractor. The PostRequestBodyExtractor looks at the request Content-Type and finds an associated Unmarshaller for that Content-Type. Like ParameterExtractors, Unmarshallers are created via factories registered with the runtime.
interface Unmarshaller {
  fun unmarshal(source: BufferedSource): Any?

  interface Factory {
    fun create(mediaType: MediaType, parameter: KParameter): Unmarshaller?
  }
}
class JsonUnmarshaller(val adapter: JsonAdapter<Any>) : Unmarshaller {
  override fun unmarshal(source: BufferedSource) = adapter.fromJson(source)

  class Factory @Inject internal constructor(val moshi: Moshi) : Unmarshaller.Factory {
    override fun create(mediaType: MediaType, parameter: KParameter): Unmarshaller? {
      if (mediaType.type() != "application" || mediaType.subtype() != "json") return null
      return JsonUnmarshaller(moshi.adapter<Any>(parameter.type.javaType))
    }
  }
}
To handle content-negotiation-based marshalling, we remove the existing @XXXResponseBody annotations and associated interceptors, and replace them with a single MarshallingInterceptor. The MarshallingInterceptor uses the @ResponseContentType associated with the action to find a Marshaller for the action's return type. Marshallers are also created via factories registered with the runtime.
interface Marshaller<in T> {
  fun contentType(): MediaType
  fun responseBody(o: T): ResponseBody

  interface Factory<out T> {
    fun create(mediaType: MediaType, action: Action): Marshaller<T>?
  }
}
class JsonMarshaller<T>(val adapter: JsonAdapter<T>) : Marshaller<T> {
  override fun contentType(): MediaType = Json.MEDIA_TYPE

  override fun responseBody(o: T) = object : ResponseBody {
    override fun writeTo(sink: BufferedSink) {
      adapter.toJson(sink, o)
    }
  }

  class Factory @Inject internal constructor(val moshi: Moshi) : Marshaller.Factory<Any> {
    override fun create(mediaType: MediaType, action: Action): Marshaller<Any>? {
      if (mediaType.type() != "application" || mediaType.subtype() != "json") return null
      val responseType = when {
        action.returnType.rawType == Response::class.java -> {
          (action.returnType.type as ParameterizedType).actualTypeArguments[0]
        }
        else -> action.returnType.type
      }
      return JsonMarshaller<Any>(moshi.adapter<Any>(responseType))
    }
  }
}
A single media type can have different marshallers for different Kotlin response types; in the example above, a single marshaller supports both actions returning Response types and actions returning T directly, but you could have separate marshallers for each.

If an action lacks a @RequestContentType annotation, it is assumed to have the equivalent of @RequestContentType("*/*"), meaning it accepts all content types. If an action lacks a @ResponseContentType annotation, it is required to return a Response and handle all marshalling and header management itself.
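A rough sketch of the request-side matching step, with MediaRange as a simplified stand-in for a real media-range type (in practice okhttp's MediaType or similar would be used):

```kotlin
// Simplified media range: "*" in either position matches anything.
data class MediaRange(val type: String, val subtype: String) {
  fun matches(contentType: String): Boolean {
    val parts = contentType.split("/")
    return parts.size == 2 &&
        (type == "*" || type == parts[0]) &&
        (subtype == "*" || subtype == parts[1])
  }
}

// An action accepts a request if any of its @RequestContentType media
// ranges matches the request's Content-Type.
fun actionAccepts(requestContentTypes: List<MediaRange>, contentType: String) =
    requestContentTypes.any { it.matches(contentType) }
```

An action with no @RequestContentType annotation would simply get the single range MediaRange("*", "*").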
Currently the Hibernate service does nothing of note: it starts up and it shuts down. It needs to be fleshed out to support querying and modifying a database. The main points will look something like:
MiskApplication (or maybe someplace else) should automatically detect the cloud provider in which it is running and configure health systems appropriately (e.g. set up StackDriver / CloudWatch friendly logging, retrieve and cache InstanceMetadata, etc). We can determine the cloud provider by hitting the provider-specific instance metadata endpoints:
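A rough sketch of what detection could look like; the URLs below are the well-known AWS and GCP instance metadata addresses, but the timeouts, the CloudProvider type, and the overall shape are assumptions:

```kotlin
import java.net.HttpURLConnection
import java.net.URL

enum class CloudProvider { AWS, GCP, UNKNOWN }

// Returns true if the endpoint answers with a 2xx within the timeout.
fun reachable(url: String, headers: Map<String, String> = emptyMap()): Boolean =
    try {
      val conn = URL(url).openConnection() as HttpURLConnection
      conn.connectTimeout = 500
      conn.readTimeout = 500
      headers.forEach { (k, v) -> conn.setRequestProperty(k, v) }
      conn.responseCode in 200..299
    } catch (e: Exception) {
      false
    }

fun detectCloudProvider(): CloudProvider = when {
  // GCP's metadata server requires the Metadata-Flavor header
  reachable("http://metadata.google.internal/computeMetadata/v1/",
      mapOf("Metadata-Flavor" to "Google")) -> CloudProvider.GCP
  // AWS exposes instance metadata at the link-local address
  reachable("http://169.254.169.254/latest/meta-data/") -> CloudProvider.AWS
  else -> CloudProvider.UNKNOWN
}
```

The result should be cached at startup rather than re-probed per request.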
In parallel with the StackDriver and Graphite backends. Should include @AppName and InstanceMetadata components as tags, so we can do service-level slicing in SignalFx.
Similar to flags, the interface for this might need to live within misk, with the specific implementation (zk based, datastore based, etc) living elsewhere.
This will allow features such as real-time admin dashboards, which should offer a responsive user experience.
Should have the ability to emit logs to StackDriver - see https://cloud.google.com/logging/docs/reference/libraries#client-libraries-install-java
Right now we don't have a persistence module. Google's Datastore is a good first candidate: https://cloud.google.com/datastore/docs/reference/libraries#client-libraries-install-java
Should provide an interface for listening / responding to dynamic flag changes. Interface probably needs to live within misk itself so misk core components can leverage it, but implementation might live elsewhere.
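A sketch of what such an interface might look like; the names (DynamicFlag, FlagListener) are assumptions, and the real backends (zk, datastore, etc) would call update when the stored value changes:

```kotlin
fun interface FlagListener<T> {
  fun onChange(newValue: T)
}

// Hypothetical core interface: misk components subscribe to a flag and
// react when the backing implementation reports a change.
class DynamicFlag<T>(initial: T) {
  @Volatile var value: T = initial
    private set
  private val listeners = mutableListOf<FlagListener<T>>()

  fun subscribe(listener: FlagListener<T>) {
    listeners += listener
  }

  // invoked by the backend implementation when the flag store changes
  fun update(newValue: T) {
    value = newValue
    listeners.forEach { it.onChange(newValue) }
  }
}
```

Keeping this interface in misk core while binding implementations via Guice modules matches the split described above.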
We should have a script to easily create a JAR (and optionally publish it), so we can release easily and often. @swankjesse I think you mentioned you'd be okay to take this on?
Currently the ConfigModule only supports loading one file, and does not respect default values of data classes. There should be a simple naming scheme for overriding configurations, i.e. foo-production.yaml overrides foo-common.yaml, and a config class similar to the following:

data class FooConfig(
  val port: Int = 8080,
  val max_connections: Int = 300
) : Config

should by default have port and max_connections set without explicit configuration, to reduce the verbosity of config YAML files.
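To illustrate the intended override semantics without the YAML machinery, a sketch where values present in the override (e.g. parsed from foo-production.yaml) win and anything unset falls back to the data class defaults; the merge helper is purely hypothetical:

```kotlin
data class FooConfig(
  val port: Int = 8080,
  val max_connections: Int = 300
)

// Hypothetical merge: keys present in the override map replace values from
// the base config, everything else keeps its current (possibly default) value.
fun merge(base: FooConfig, overrides: Map<String, Any>) = base.copy(
    port = overrides["port"] as? Int ?: base.port,
    max_connections = overrides["max_connections"] as? Int ?: base.max_connections
)
```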
For handling form values posted with the application/x-www-form-urlencoded and multipart/form-data content types. Should extract specific form values and convert them into nullable primitive types, lists of nullable primitive types, and objects that can be encoded as JSON.
When running exemplar under Java 9, the following warning is reported:
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.inject.internal.cglib.core.$ReflectUtils$1 (file:/Users/mmihic/Development/misk/exemplar/build/libs/exemplar-0.1.1-SNAPSHOT.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
WARNING: Please consider reporting this to the maintainers of com.google.inject.internal.cglib.core.$ReflectUtils$1
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Possibly we need to update Guice.
With adapters for StackDriver tracing, as well as Zipkin.
Looks like exemplar only works with plain HTTP right now. However, the normal mode of operation will likely be TLS (and mutual TLS specifically), so the exemplar should reflect that.
Should define interfaces with implementations layered on Kinesis and cloud pub/sub. Note there are significant differences between these models, specifically Kinesis supporting ordered delivery and cloud pub/sub only supporting unordered delivery. Might need to allow applications to choose whether ordered delivery is required (determined as a property of the topic, not the client), with different implementations on GCP for each (ordered == lower throughput, backed by datastore; unordered == higher throughput, backed by pub/sub).
Currently the marshaller works off the return type of the action method. We likely want to allow interceptors to override this and return other response bodies without requiring that they implement their own marshalling. For example, when handling an exception, we may want to allow the exception mapping interceptor to return a structured response body in JSON format. Similarly, other interceptors might short-circuit the dispatch chain and want to return a customized response (e.g. an authentication interceptor that returns a redirect plus a structured response about how to authenticate). These should be possible without requiring the interceptors to write their own custom ResponseBody; they should be able to rely on the marshalling framework to do the content negotiation and object-to-ResponseBody conversion for them.
Provide some sort of mechanism for running tasks in process on a defined schedule, with appropriate metrics and configuration-/flag-driven schedule management. Should allow programmatic adjustment to schedule based on status of task execution. These are recurring tasks, not programmatic submission of jobs based on application events.
Each service should be able to provide resources such as HTML, CSS, etc, which Misk can combine to form an admin dashboard. The data displayed would be used for debugging and administrative purposes, such as setting dynamic config.
One of the nice features of dropwizard / the internal SC is the ability to register command line utilities inside of the main application class, so that you can do things like:
java -jar buckets.jar upload-protos -f /path/to/my/protos.zip -b MyBucket
instead of
java -cp buckets.jar com.squareup.buckets.cli.UploadProtos -f /path/to/my/protos.zip -b MyBucket
Might be good to provide a similar ability. Important to note that different commands will almost certainly require different sets of Guice modules (e.g. only the server start command would embed the JettyModule, for instance).
We should provide a DatasourceModule for connecting to a JDBC database instance. This should not be tied to Hibernate or any other framework; it should just bind a raw JDBC Datasource, optionally qualified by an annotation specified by the installer. Services that want to use higher-level frameworks can install other modules that use the bound datasource to initialize those frameworks. The DatasourceModule should use the Hikari connection pool and be opinionated about which configuration options are allowed (fixed pool size, pool check timeout, etc). It should also install appropriate readiness checks to prevent a service from reporting itself as ready until connectivity to the database has been confirmed, and expose appropriate database metrics.
Example install:
fun main(args: Array<String>) {
  val serviceConfig = MiskConfig.load<ExemplarConfig>("exemplar",
      Environment.fromEnvironmentVariable())
  MiskApplication(
      listOf(
          ConfigModule.create("exemplar", serviceConfig),
          TracingModule(serviceConfig.tracing),
          // Installs datasource support, binding it to an unqualified Datasource
          DatasourceModule("exemplar"),
          // Installs a second datasource for read-replica access, binding it to
          // a Datasource qualified by the @ReadReplica binding annotation
          DatasourceModule("exemplar-readonly", annotatedBy = ReadReplica::class))
  ).run(args)
}
class MyAction : WebAction {
  @Inject lateinit var primaryDatasource: DataSource
  @Inject @ReadReplica lateinit var readOnlyDatasource: DataSource
}
Example configuration:
default_db_settings: &default_db_settings
  type: mysql
  username: exemplar
  password:
  connection_properties: zeroDateTimeBehavior=convertToNull
  fixed_pool_size: 10
  pool_check_timeout: 30ms
  max_conn_lifetime: 1m

datasources:
  exemplar:
    <<: *default_db_settings
    host: exemplar.cluster-cmxb34unfd.us-west-2.rds.amazonaws.com
  exemplar-readonly:
    <<: *default_db_settings
    host: exemplar.cluster-cmxb34unfd-ro.us-west-2.rds.amazonaws.com
Theoretical example also using Hibernate:
fun main(args: Array<String>) {
  val serviceConfig = MiskConfig.load<ExemplarConfig>("exemplar",
      Environment.fromEnvironmentVariable())
  MiskApplication(
      listOf(
          ConfigModule.create("exemplar", serviceConfig),
          TracingModule(serviceConfig.tracing),
          // Installs datasource support, binding it to a Datasource qualified by
          // a @Primary binding annotation
          DatasourceModule("exemplar", annotatedBy = Primary::class),
          // Installs a second datasource for read-only access, binding it to a Datasource
          // qualified by a @ReadReplica binding annotation
          DatasourceModule("exemplar-readonly", annotatedBy = ReadReplica::class),
          // Installs hibernate support, telling it to use the Datasource with the @Primary
          // binding annotation for writable transactions and the Datasource with the
          // @ReadReplica binding annotation for read-only access
          HibernateModule(
              datasourceAnnotatedBy = Primary::class,
              readOnlyDatasourceAnnotatedBy = ReadReplica::class))
  ).run(args)
}
Currently the configuration framework only reads configuration YAML files that have been embedded into the application jar, and the only way to customize the application's configuration for local development is to edit the configuration files that have been committed to git. This is a poor development experience: it means remembering to skip those hand-modified files when building commits, or risking accidentally committing a development configuration specific to an individual machine. Propose that we provide a way to specify an override for the embedded YAML files at runtime, e.g. through an environment variable that points to an external YAML file that gets loaded instead of (or after) all of the embedded files.
With this, the development flow would look like:
src/main/resources contains the default configuration (<service_name>-common.yaml) and a development configuration (<service_name>-development.yaml) that runs "out of the box" inside a local docker image built from the service-supplied Dockerfile. Typically this development configuration will use the same hostnames, ports, and paths as the service would use in a staging/production configuration, which are local to the docker image / user-bridged docker network.

Developers that want to run a service on their laptop outside of Docker can create a YAML file anywhere on their local file system to explicitly control all of the configuration for the service, then tell the service to load that custom configuration via an env var (or some equivalent means).
The PathPatternParameterExtractorFactory unconditionally binds itself to all string parameters, regardless of their annotations. This causes the action to blow up if the string parameter has an annotation indicating it comes from somewhere else (header value, query parameter, request body). The extractor should only bind to strings that are explicitly annotated with @PathParam.
Submission of background tasks based on application demand, with tasks being dispatched by cloud infra and possibly executed in another process. Likely implementation on top of SQS for AWS, no real equivalent for GCP until cloud tasks becomes available (maybe can use some combination of datastore for tracking task state, with either polling or sending trigger messages via cloud pub/sub).
This repo uses JUnit 4 for now, but we should migrate to JUnit 5.
I wanna split the interceptors into two halves:
In the middle is our bridge, so users can decide whether they want a Response or a MyResponseObject.
Applications can indicate errors either by throwing an exception or by returning a Response object with a non-2XX response code. The tracing interceptor should understand both. The easiest way to do this is probably to follow the pattern set by MetricsInterceptor: install the tracing interceptor closer to the transport than the ExceptionMappingInterceptor (which translates exceptions to response codes), and check the Response.statusCode field.
The TracingInterceptor is installed closer to the application than the ExceptionMappingInterceptor, which means it receives exceptions from the application before they get translated into Response objects. Further, the TracingInterceptor is catching and swallowing exceptions from chain.proceed, which means application exceptions are not properly propagated to clients when tracing is installed.

Also, the TracingInterceptor should not log exceptions from chain.proceed; the ExceptionMappingInterceptor controls exception logging, based on configuration. In general, interceptors should never log exceptions from chain.proceed.
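A minimal model of the proposed ordering, with all types as simplified stand-ins for misk's real interceptor API: ExceptionMapping runs closer to the application, so Tracing (closer to the transport) only ever sees Response objects and can tag spans from the status code without catching anything itself.

```kotlin
data class Response(val statusCode: Int)

fun interface Chain {
  fun proceed(): Response
}

// Closer to the application: translates exceptions into status codes and
// owns the exception-logging policy.
class ExceptionMappingInterceptor(private val next: Chain) : Chain {
  override fun proceed(): Response = try {
    next.proceed()
  } catch (e: IllegalArgumentException) {
    Response(400) // client error
  } catch (e: Exception) {
    Response(500) // server error
  }
}

// Closer to the transport: never catches or logs, just observes the
// Response.statusCode to tag the span.
class TracingInterceptor(private val next: Chain) : Chain {
  var lastTaggedStatus: Int? = null
  override fun proceed(): Response {
    val response = next.proceed()
    lastTaggedStatus = response.statusCode
    return response
  }
}
```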
All HTTP and RPC endpoints should expose a set of standardized metrics, including:
Lots of tools process Apache httpd access logs, and the format is handy for debugging. Would be useful to have an interceptor that emits this log format.
misk currently offers nothing for sending requests to other services. It should have some strategy for this, at minimum to provide a standard way to hook in cross-cutting concerns such as context propagation, end-to-end request deadlines, chain-of-trust establishment, etc. Could range from having standardized okhttp client interceptors to a full-blown stub model.
We now have several illegal combinations of parameter types and action annotations; for example, non-POST/PUT methods cannot take a @RequestBody, and methods that are not @ConnectWebsocket cannot take WebSockets. We should check method signature compatibility on startup and fail fast if a method's parameters are incompatible with the type of action being bound.
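A sketch of what a fail-fast startup check could look like; ActionMetadata is an assumed simplified stand-in for the real binding metadata:

```kotlin
// Hypothetical, flattened view of an action binding.
data class ActionMetadata(
  val httpMethod: String,
  val hasRequestBodyParam: Boolean,
  val hasWebSocketParam: Boolean,
  val isWebSocketAction: Boolean
)

// Collects every signature violation so startup can report them all at once
// before failing.
fun validate(action: ActionMetadata): List<String> {
  val errors = mutableListOf<String>()
  if (action.hasRequestBodyParam && action.httpMethod !in setOf("POST", "PUT")) {
    errors += "@RequestBody is only allowed on POST/PUT actions"
  }
  if (action.hasWebSocketParam && !action.isWebSocketAction) {
    errors += "WebSocket parameters require @ConnectWebsocket"
  }
  return errors
}
```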
This will be useful when determining which YAML files to parse for config, as mentioned in #1. Examples of environments are testing, staging, production, etc.
A simple implementation could just be a String that gets bound so you can inject it like
@Inject @Env String environment
Misk can provide a module that binds it based on an environment variable. Alternatively a module can statically provide an environment, and an app can select which to install.
Currently we log all errors that occur in the dispatch pipeline. This can be noisy, especially if a client is submitting a lot of bad data. Consider allowing applications to determine which exceptions get logged and at what level; we might want to (for example) log IllegalArgumentExceptions and other client-generated errors as warnings, or suppress them altogether.
misk is starting to get big - it's probably about time to figure out if and how to split it up into smaller source+build components. For example, we could have a structure that looks something like the following, where each is a separate buildable artifact.
Right now we have exemplar which shows how to use misk to write a single service. It'd be very beneficial to have an actual running exemplar service ecosystem - e.g. three services all of which talk to each other on a continual basis, hooked into metrics and logging, etc. - so that we can demonstrate service interaction and integration with systems health infrastructure. This can also be used as the basis for monitoring of the base misk componentry, making sure that e.g. changes don't break systems health integration and so on.
Pretty much the entirety of the web dispatch layer is untested. This needs tests.
Websocket messages bypass the existing action dispatch framework, so we don't have common interceptors running. This might cause a variety of issues, particularly with things like tracing, auditing, and credential propagation which expect to have request level context established. We probably don't want to piggyback on existing interceptors since the websockets API is divergent from the actions API and interceptors will likely blow up if they are called in the wrong context, but may need to create a parallel interceptor specifically for websockets.
Possibly making use of HikariCP. This is a prerequisite for Issue #10.
A Pagerduty service would offer easy alerting to Pagerduty, by accepting an API key and creating incidents directly.
Other services like Jetty could alert if there are too many 500's over a time period, as a simple example.
Misk should offer an opinionated metrics service that acts as a facade to multiple backends.
For example, the counter, timer, gauge, and histogram models are all commonly used.
See Dropwizard metrics and Prometheus for examples.
To pull from query parameters. Should do proper type conversion into nullable primitive types (Int?, Long?, Double?, etc), lists of primitive types (for when the query parameter appears more than once), and support optional non-nullable primitive parameters (which probably necessitates switching from call to callBy when the target action method has optional parameters and/or the incoming request does not contain mappings for all of the target method's parameters).
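The conversion side can be sketched without the reflection machinery; the helpers below are hypothetical, and real extraction would key off each parameter's declared KType rather than explicit intParam/intListParam calls:

```kotlin
// Parse a raw query string into name -> values (a name may repeat).
fun queryParams(query: String): Map<String, List<String>> =
    query.split("&").filter { it.isNotEmpty() }
        .map { it.split("=", limit = 2) }
        .groupBy({ it[0] }, { it.getOrElse(1) { "" } })

// A missing or malformed parameter yields null rather than an error,
// matching the nullable primitive types described above.
fun intParam(params: Map<String, List<String>>, name: String): Int? =
    params[name]?.firstOrNull()?.toIntOrNull()

// Repeated parameters become a list, each element converted independently.
fun intListParam(params: Map<String, List<String>>, name: String): List<Int?> =
    params[name]?.map { it.toIntOrNull() } ?: emptyList()
```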
It seems that Jackson just treats the merge key << as a regular key, even though it uses SnakeYAML as a backend (which does resolve merge keys). Need to investigate how to help Jackson properly handle merge keys.
Applications and services should be able to register internal health checks which are run periodically, and exposed to outside systems (load balancer health checkers, monitoring tools) via a _status endpoint.
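A sketch of what a registry feeding the _status endpoint could look like; the names here are assumptions, not misk's actual API:

```kotlin
fun interface HealthCheck {
  fun isHealthy(): Boolean
}

// Hypothetical registry: checks are registered by services, run
// periodically (or on demand here), and the _status endpoint renders
// the per-check results for load balancers and monitoring tools.
class HealthCheckRegistry {
  private val checks = mutableMapOf<String, HealthCheck>()

  fun register(name: String, check: HealthCheck) {
    checks[name] = check
  }

  fun status(): Map<String, Boolean> = checks.mapValues { it.value.isHealthy() }

  // the service is healthy only if every registered check passes
  fun isHealthy(): Boolean = status().values.all { it }
}
```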
Should switch to either JSSE or OpenSSL integration for handling SSL on both the client and server. I think we decided on OpenSSL but don't recall exactly (@swankjesse?)