cloudstateio / cloudstate
Distributed State Management for Serverless
Home Page: https://cloudstate.io
License: Apache License 2.0
It would be really cool to support .NET Core for user functions!
Define what metrics can and should be exposed by the platform.
We need to document what data format these should be in. Protobuf is not an ideal data storage format, as it is designed for protocols rather than data storage. Should we mandate a format or leave it up to the users?
Where is a good entry point for folks interested in Golang contributions on the client side?
We need to define the desired developer feedback loop (code-test-package-deploy-inspect)
It would be interesting to have the sidecar cache the description of the .proto provided by the user function and expose that description through an HTTP (or also a gRPC) endpoint - this would make it possible to obtain the descriptor from the outside.
Thoughts?
It would be interesting to have the CloudState Proxy handle gRPC and HTTP Authentication and Authorization transparently. Even for outbound calls made by functions (route outbound calls via the proxy too) and seamlessly add and verify security tokens.
Both event sourcing and CRDTs serialize Google protobuf Any values. Currently, the built-in Akka protobuf serializer is used to do this, which stores com.google.protobuf.Any in the manifest, and then the serialized Any as the message, which contains the type_url as well as the serialized value field. This is inefficient: the com.google.protobuf.Any is repeated for every value unnecessarily. Instead, we should write an Akka serializer for com.google.protobuf.Any that stores the type_url in the manifest and writes the value field directly as the bytes.
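The proposed encoding could be sketched as below. This is a simplified model, not the real Akka serializer API: ProtoAny and the two helper functions are hypothetical names, and the "current" payload is only an approximation of what the generic protobuf serializer stores.

```scala
// Any is modelled here as a plain case class; all names are hypothetical.
final case class ProtoAny(typeUrl: String, value: Array[Byte])

// Current scheme (approximated): the manifest names the wrapper class, and
// the payload re-encodes the type_url alongside the value, so the URL is
// stored again for every event.
def currentEncoding(any: ProtoAny): (String, Array[Byte]) =
  ("com.google.protobuf.Any", any.typeUrl.getBytes("UTF-8") ++ any.value)

// Proposed scheme: the type_url becomes the manifest, and the payload is
// just the raw value bytes, so nothing is duplicated.
def proposedEncoding(any: ProtoAny): (String, Array[Byte]) =
  (any.typeUrl, any.value)

def proposedDecoding(manifest: String, bytes: Array[Byte]): ProtoAny =
  ProtoAny(manifest, bytes)
```

For typical entities the type_url dwarfs small values, so moving it into the manifest (which Akka stores once per serializer id in many journals) is a meaningful saving per persisted event.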
This issue is specifically for when minScale > 0.
It's possible to "work around" this issue by setting the GC time limit to a low value and adjusting the number of old revisions to retain.
See: knative/serving#2720
User platform support for Ruby-based services
We should not return ready until we have established communication with the database. This would only apply to the Cassandra backend currently.
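A minimal sketch of the proposed readiness gating, with hypothetical names (there is no such class in the proxy today): the readiness probe would consult a flag that only flips once a session to the store has been established.

```scala
// Hypothetical readiness gate: the health endpoint reports ready only
// after the backing store (eg Cassandra) connection has been confirmed.
final class ReadinessCheck {
  @volatile private var storeConnected = false

  // Called once the session/handshake with the database succeeds.
  def onStoreConnected(): Unit = storeConnected = true

  // Consulted by the Kubernetes readiness probe.
  def ready: Boolean = storeConnected
}
```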
Obviously a big thing we're missing right now is CQRS support - ie, read side processors for event sourced entities.
In general, there are two different types of read side processors: local and remote. With local, you consume the event log directly. With remote, you publish to a message queue (eg Kafka) and consume from it in a different service.
The consumption of these two things should feel the same to the developer. There may be an extra step in publishing to Kafka where you translate your event log (possibly filter/transform etc) to a message queue stream - we could add support for publishing the log directly with no user code though - the only issue around that is whether it's a good idea to expose the persistence format to the outside world, or whether we should require an anti-corruption layer.
An event stream consumer could just go and talk to a database directly - and I think we have to support that. Though, it would be good to investigate whether other technologies (Lightbend pipelines, Knative events) would be better to do that.
It would also be good to investigate whether we could create some more CloudState-esque consumer protocols. For example, could a key-value entity be a consumer of an event stream? Or an event sourced entity? We would have to have a way of translating event streams to virtual gRPC calls. We would also need a way of expressing entity keys: the original events may be associated with the entity that produced them, but read side views very often want to pivot that. For example, in a chat application you might have room entities which hold a list of which users are in the room, but then you want a view that shows which rooms a user is in, so you pivot from users by room to rooms by user. To do that, you need to change the entity key. One idea would be to use forwards - a non entity key based service receives the user joined room event and forwards it to the rooms by user entity, with that message keyed by user. But that might be too much overhead, both from a developer experience and a performance perspective, so something more automated might be needed - perhaps the consumer can take the producer's proto spec and add their own pivot keys to the messages, possibly allowing fan out if the key is embedded in a repeated field.
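The key pivot described above can be sketched as a plain fold, independent of any transport. The event type and function names are illustrative only - nothing here is an existing CloudState API.

```scala
// Events are produced keyed by room, but the read side view wants rooms
// keyed by user, so each event is re-keyed before folding into the view.
final case class UserJoinedRoom(room: String, user: String)

def roomsByUser(events: Seq[UserJoinedRoom]): Map[String, Set[String]] =
  events
    .map(e => e.user -> e.room)                     // pivot the entity key
    .groupMapReduce(_._1)(kv => Set(kv._2))(_ ++ _) // fan in per user
```

Whatever mechanism is chosen (forwards, or pivot keys declared against the producer's proto spec), it ultimately has to perform this re-keying before routing each event to the target entity.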
Add support for creating Scala-based stateful serverless functions
We would like to add a P2P messaging pattern, that is, a protocol that user functions can use to do P2P messaging through Akka.
It's important to define what we mean by peer. A peer is an abstract concept that, for each domain, is defined by the domain. It could be a human, or it could be a device - eg, an IoT device - or it could be an entity (eg, an event sourced entity that is pushing updates to pages in real time). If it is a human, they may be interacting through many devices, for example, I have Slack installed on multiple laptops and multiple mobile devices, when someone sends me a message, and I have Slack on all my devices open, I expect to receive that message in real time on all of my devices at once. Typically, a device will have a TCP connection (perhaps gRPC stream or WebSocket) to a serverless service from which it will receive P2P messages. That connection may be over an unreliable network, and when it fails, it will reconnect, but not necessarily back to the same node that it was originally connected to.
While we probably can't address every possible use case, we want to come up with one or more solutions that cover a broad range of use cases. With that in mind, here are some different characteristics or requirements that some use cases might have.
The P2P messaging may in some cases be between more than 2 peers (eg, a chat room, or multiple IoT devices in a home), there may be multiple publishers for a single topic, and multiple subscribers for a single topic - this may expand the traditional definition of P2P, perhaps we really are talking about addressed communication, but note that address is not a machine or actor address, it is the abstract user/device as defined above.
Various use cases exist for a range of different delivery guarantees. At most once is useful when the current state is being sent, and new messages invalidate previous messages. For example, tracking the location of an IoT enabled vehicle. The other major useful guarantee is effectively once. In this case it's assumed that the device receiving updates can deduplicate (using a domain specific sequence number for example, or unique ids), but needs at least once delivery. Instant messaging is an example of this.
Delivery time guarantees for effectively once messaging vary too. The point of P2P messaging is to allow effectively instant delivery, ie the only latency comes from network, routing, and processing, and that should happen in the happy case. In failure scenarios however, in some use cases there should be a maximum time that it takes for the message to be delivered, in other cases it's ok for the dropped message to not be delivered until the next message is received.
Currently, the only out-of-the-box solution that Akka provides to implement P2P messaging as described above is distributed pubsub. This can be combined with Akka persistence to achieve at least once delivery: persist messages first, then publish them, then use the sequence number to detect dropped messages and the journal to recover.
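The persist-then-publish scheme can be sketched as follows. This is a toy model under assumed names (Envelope, a journal lookup function): the subscriber tracks the next expected sequence number, recovers any gap from the journal, and drops duplicates, giving effectively once delivery on top of an unreliable channel.

```scala
// A published message carries the journal sequence number it was
// persisted under.
final case class Envelope(seqNr: Long, payload: String)

// `journal` stands in for a by-sequence-number read of the event journal.
final class Subscriber(journal: Long => Envelope) {
  private var delivered = Vector.empty[Envelope]
  private var expected = 1L

  def receive(env: Envelope): Unit = {
    // A gap means pubsub dropped messages; replay them from the journal.
    while (expected < env.seqNr) {
      delivered :+= journal(expected)
      expected += 1
    }
    // seqNr < expected is a duplicate and is silently ignored.
    if (env.seqNr == expected) {
      delivered :+= env
      expected += 1
    }
  }

  def log: Vector[String] = delivered.map(_.payload)
}
```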
Distributed pubsub however requires replicating the subscriber state to all nodes, and hence doesn't scale well when there are a very large number of topics being subscribed to.
Here are two other distributed P2P possibilities that we might want to consider. These ideas are very raw and not fully thought out, they may be terrible.
See this improvement proposal which could address the issue: kubernetes/enhancements#753
Go through these:
Without the standalone version, we need to decide whether we will support scale to zero or not. It may suffice to say that if you want scale to zero, you need to use CloudState in combination with Knative or something similar. It's also possible that in integrating with Knative, we'll need to implement our own activator proxy anyway (to avoid the tight, undocumented coupling that Knative's activator proxy has with the rest of its infrastructure).
The design would essentially be the same as Knative - though I'm not 100% sure how it works in Knative, that is, I'm not sure how Knative ensures that requests are routed to the activator. Given that we don't have an independent autoscaler, our activator would need to also be responsible for scaling up from zero.
We should provide support for product CRDTs - ie, CRDTs that are a product of multiple CRDTs. The underlying CRDT for this would be a Grow-only Map, which would be a map of keys (any type, but typically strings) to child CRDTs. Akka doesn't have support for this out of the box, though it's very similar to the ORMap, it just uses a GSet instead of an ORSet for maintaining the keys. Since for the product use case, we'd anticipate only a small number of keys, it would make sense for the delta to contain deltas of the member CRDTs (if supported), rather than full values (this isn't the case in ORMap, though is for ORMultiMap).
Once we have a GMap, we can expose that in the CRDT protocol, and then build user APIs that allow it to be used directly, as well as allow it to be used as a product (ie, essentially a plain old object whose properties are keys in the map). This would be done using a proxy in JavaScript, as is currently done for the ORMap; for Java, we might have some reflection based mechanism that would inject a POJO with the CRDT values.
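The merge semantics described above can be sketched with a GCounter as the child CRDT. This is an illustrative model, not the Akka ddata implementation: because keys are never removed, merging the key sets is plain set union, and values merge pairwise with the child CRDT's own merge.

```scala
// Standard grow-only counter: one monotonic count per node, merged by max.
final case class GCounter(entries: Map[String, Long]) {
  def value: Long = entries.values.sum
  def increment(node: String, n: Long): GCounter =
    GCounter(entries.updated(node, entries.getOrElse(node, 0L) + n))
  def merge(that: GCounter): GCounter =
    GCounter((entries.keySet ++ that.entries.keySet).map { k =>
      k -> math.max(entries.getOrElse(k, 0L), that.entries.getOrElse(k, 0L))
    }.toMap)
}

// Grow-only map: union the keys, merge child CRDTs pairwise.
final case class GMap(entries: Map[String, GCounter]) {
  def merge(that: GMap): GMap =
    GMap((entries.keySet ++ that.entries.keySet).map { k =>
      (entries.get(k), that.entries.get(k)) match {
        case (Some(a), Some(b)) => k -> a.merge(b)
        case (a, b)             => k -> a.orElse(b).get
      }
    }.toMap)
}
```

A real implementation would abstract the value type over any ReplicatedData, and, as noted above, ship deltas of the member CRDTs rather than full values.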
Add automatic code formatting to the Reference Implementation build, and optionally, the respective frontend implementations.
It would be nice to have something like a page (perhaps at https://new.cloudstate.io) where users could go, pick the storage management strategy, target language, service name, etc, and it would generate a downloadable archive with a project structure, stub files and some documentation links…
This would allow new users to really quickly be able to write their own stateful services without much pre-knowledge about how to structure their projects, what dependencies are needed etc.
TODO devise a solution for being able to consume domain events to facilitate things like creating projections, or even consuming domain events from something like Alpakka (CloudEvents?)
Should this project recommend gRPC client libraries on a per-user-platform basis?
This would make it easier for developers to figure out how to call services. It would also be worthwhile for platform testing.
We should support either working within Knative, or as standalone.
Currently, an operator that works with Knative revisions has been implemented, though it is probably in a non-working state since we stopped work on integrating with Knative. We've created a patch for Knative that allows our operator to take over managing the deployment:
That patch has been rejected due to undocumented tight coupling with the autoscaler, and we have found that the Knative autoscaler isn't suitable for scaling Akka clusters anyway, so work needs to continue on Knative to disable the autoscaler for certain revisions too (and possibly the activator). But we will also need to make our proxy able to work either with Knative or standalone. To work with Knative, we need the following:
Note that what is meant by CRUD is not SQL, or joins, but rather being able to get an Entity value, modifying it, and having the modified version stored for the next command/request. So "destructive updates".
Could in theory be implemented on top of the EventSourcing support by either storing the new Entity value as an event, or by repeatedly generating new Snapshots for each new state.
This also impacts the user-facing API as they would not have to deal with anything but the inbound commands (not events).
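The first option above (storing the new Entity value as an event) can be sketched as follows, under hypothetical names: every destructive update is persisted as an event carrying the complete new state, so recovery only needs the most recent event rather than a fold over the whole log.

```scala
// State-replacing event: the "event" is simply the full new state.
final case class StateChanged(newState: String)

// Handling an update appends one StateChanged to the journal.
def handleUpdate(journal: Vector[StateChanged], newState: String): Vector[StateChanged] =
  journal :+ StateChanged(newState)

// Recovery ignores history and takes the latest state, if any.
def recover(journal: Vector[StateChanged]): Option[String] =
  journal.lastOption.map(_.newState)
```

The snapshot-based variant would behave identically from the user's perspective; the trade-off is journal growth versus snapshot store churn.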
Add support for defining stateful serverless functions in JavaScript
We should not return ready until the server actor is running and bound (which implies that we have done the ready handshake with the user function).
It would be extremely interesting to be able to make the sidecar application a Graal Native Image.
It would be interesting to target TypeScript
Discussing things like:
Instead of the original Akka Persistence
Clean up the .protos and create a formal specification for the interactions, and then transfer those rules to the TCK for verification.
We need to define, implement, verify and document how the backend platform is updated and what the implications are for migrating to a new version.
I think it'd be useful to create a Gitter channel (or Discord, or whatever is convenient) for discussion, especially early on as the project is moving at a high pace.
Define, implement, verify and document a Protobuf protocol that will ensure that the backend platform can be used by any user platform language (which supports protobuf).
TODO requirements
Having support for Key-Value style state management would be interesting. It could be implemented on top of CRUD #50 by the Entity being a Map.
This also impacts the user-facing API: users would not have to deal with anything but the inbound commands (not events) that add, modify, or remove values in the Map.
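That command surface can be sketched as below, with hypothetical command names: the entity's single CRUD value is a Map, and each command produces the replacement Map that the CRUD layer stores.

```scala
// The only things the user sees are inbound commands on a Map state.
sealed trait KvCommand
final case class Put(key: String, value: String) extends KvCommand
final case class Remove(key: String)             extends KvCommand

// Each command yields the new Map, which CRUD persists destructively.
def handle(state: Map[String, String], cmd: KvCommand): Map[String, String] =
  cmd match {
    case Put(k, v) => state.updated(k, v)
    case Remove(k) => state - k
  }
```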
Current proposed time is 6am AEST Wednesday (this means Tuesday for US and Europe).
User platform support for Python-based services
Create a well-documented example application
Make a straightforward how-to on how to add more language support.
Document and validate the steps necessary to create a working installation
CRDT support is going to be interesting.
I see two general approaches. The first is to offer a very low level protocol, which essentially would look like the Akka ddata ReplicatedData and associated traits (eg the delta handling traits). In this case, all CRDT types would need to be implemented in the language support libraries. The other approach would be to offer a high level protocol, reusing the CRDT types in Akka. This would mean the protocol itself would understand the operations that can be made on, for example, a PNCounter or LWWMap.
I think the former approach is going to be necessary, as the latter would be really restrictive: you wouldn't be able to implement custom CRDTs, and I'm not sure how you'd compose CRDTs. But the former pushes a lot more work into each implementation of the support library for each language.
A big challenge with all of these, however, is how to actually integrate with Akka ddata. All callbacks on Akka ddata types are synchronous, eg merge, mergeDelta etc. Also, the modify function on the Update message is synchronous. This makes it impossible to implement them in the user function, since an invocation from Akka to the user function involves IO and is therefore asynchronous.
What we might need to do is fork an asynchronous version of the Replicator that invokes merge and friends asynchronously. That could be equivalent to a rewrite.
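The callback shape such a fork would need can be sketched as follows. None of these names exist in Akka today; this only illustrates why the signature must change: merge would round-trip to the user function over gRPC, so it has to return a Future instead of a value.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global

// Asynchronous counterpart of ReplicatedData's synchronous merge.
trait AsyncReplicatedData[A] {
  def mergeAsync(that: A): Future[A]
}

// A CRDT whose merge logic lives in the user function; the Future here
// stands in for the IO hop to the user function's merge handler.
final case class RemoteCrdt(state: Set[Int]) extends AsyncReplicatedData[RemoteCrdt] {
  def mergeAsync(that: RemoteCrdt): Future[RemoteCrdt] =
    Future(RemoteCrdt(state ++ that.state))
}
```

Every internal call site in the Replicator that currently merges inline would have to be rewritten to sequence these Futures, which is why the fork could amount to a rewrite.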
User platform support for PHP-based services
Define, implement, verify and document that user code for the different platforms can interoperate.
Being able to emit events to be consumed by other services/endpoints as part of command processing, and then being able to declare/autogenerate that an endpoint can receive certain kinds of events - either by having a transformation from an event to a command and using the existing gRPC endpoints, or by implementing a specific endpoint signature to receive events of that kind.
Implementing support for this will require a bit of R&D.
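The event-to-command transformation option can be sketched as below. The event and command types are invented for illustration: the consumer declares how a published event maps onto one of its existing commands (returning None filters the event out), and matching events are then dispatched through the normal command handler.

```scala
// Event published by another service, and an existing command on ours.
final case class ItemAdded(cartId: String, item: String)
final case class RecordItem(cartId: String, item: String)

// The declared transformation; None drops the event.
def transform(event: ItemAdded): Option[RecordItem] =
  if (event.item.nonEmpty) Some(RecordItem(event.cartId, event.item)) else None

// Transformed events flow into the ordinary gRPC command handler.
def dispatch(events: Seq[ItemAdded], handler: RecordItem => Unit): Unit =
  events.flatMap(transform).foreach(handler)
```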
Add Java support for creating stateful serverless functions
README.md is just about the coolest guide that I've seen in a great long while; and I've pretty much seen them all. README.md has done nothing less for me than give me a manifesto for building (and using) next generation serverless, the way it should be done. Kudos 👍 It puts FaaS in the context it should be seen: a first (important) step toward stateful cloud computing, but a first step nonetheless 🌩 As good as README.md already is, I suggest that we make a handful of (minor) updates to it so as to enhance its flow and readability even more.