Note: this proposal contains a lot of pseudo-code, and some of its core aspects are described only in the code comments, so don't skip over them.
Abstract
This proposal addresses a couple of issues that come with MUD’s current approach to on-chain data modelling and state management.
We propose to move away from individual component contracts to store application state, and instead create a core spec and library for on-chain data modelling and storage. The core library can be used in any contract to store state in a MUD compatible way and emit MUD compatible events for general-purpose indexers. (The core library doesn’t implement access control.)
We then use this core library to create a framework to add storage access control and the ability for third party developers to register new data pools and mount new contracts (similar to the current `World` contract).
Issues with the previous approach
- Currently, state is organised into separate components to manage access control and implement functions with typed parameters and typed return values (since Solidity doesn’t support generic types)
- The component contracts call a function on a central World contract to register their update
- The components use Solidity’s `abi.encode` under the hood, which leads to unnecessarily high gas costs (because `abi.encode` reserves one 32 byte word per struct key)
- Currently, developers have to opt in to MUD’s entire framework to benefit from conceptually independent features like general-purpose indexers, instead of being able to upgrade their existing codebases
- Currently, developers have to manually define their components' schemas using a DSL, which is not intuitive for Solidity developers and leads to easy-to-miss bugs (when the defined schema doesn’t match the `abi.encode`d value)
- Currently, developers using MUD have to implement a lot of “boilerplate” code to read or write component values compared to setting vanilla Solidity storage variables
- Current MUD:
PositionComponent position = PositionComponent(getAddressById(components, PositionId));
position.set(0x00, Position(1,2));
- Vanilla Solidity:
positions[0x00] = Position(1,2);
- Currently, MUD is limited to the ECS pattern (Entity-Component-System), requiring every piece of data to be associated with a single `uint256` id. This makes some data modelling harder than desired, for example using a composite key (consisting of two values)
- The current workaround is to create composite entity ids by using `keccak(entity1, entity2)`, but this approach obfuscates the used entity ids and is cumbersome to work with
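To make the encoding overhead and the composite-key workaround from the list above concrete, here is a minimal sketch. The contract name `EncodingOverhead` and its functions are hypothetical and exist only for illustration; they are not part of MUD.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical demo contract, not part of MUD.
contract EncodingOverhead {
    struct Position {
        uint32 x;
        uint32 y;
    }

    // abi.encode pads every static field to a full 32-byte word,
    // so two uint32 fields take 64 bytes.
    function paddedLength(Position memory p) public pure returns (uint256) {
        return abi.encode(p.x, p.y).length;
    }

    // abi.encodePacked uses the minimal width of each static field,
    // so the same two fields take 4 + 4 = 8 bytes.
    function packedLength(Position memory p) public pure returns (uint256) {
        return abi.encodePacked(p.x, p.y).length;
    }

    // The composite-key workaround: the resulting id is a hash, so neither
    // entity1 nor entity2 can be recovered from it by an indexer.
    function compositeEntity(uint256 entity1, uint256 entity2) public pure returns (uint256) {
        return uint256(keccak256(abi.encode(entity1, entity2)));
    }
}
```

With tightly packed data, both fields of `Position` fit into a single storage slot, whereas the padded encoding spans two.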
The following is a proposal to address all of the issues above and more.
Design goals
- A way to store introspectable structured data on-chain
- Introspectable = data schema can be retrieved on-chain, so default general-purpose indexers are possible
- General-purpose indexers by default
- Events notifying indexers about every state change
- On-chain schema, so indexers know how to interpret state changes
- SQL-compatible data modelling on-chain, so indexers can benefit from decades of SQL research
- Dynamic schemas / ability to add more schemas after the core contract has been deployed
- This is important to enable “autonomous worlds” where third party developers can add data packets and systems to an application
- As little gas overhead as possible compared to the most efficient custom way of storing data on-chain
- As little “third party code managing core state” as possible. As much as possible should be done by the core library
- The best developer experience possible (at least as good as working with native solidity structs/mappings)
- Splitting up the core storage problem from the framework problem
- This allows more people to develop tools integrating with the core storage method, without having to opt-in to the framework
Core storage management library
- Implements logic to store and access data based on registered schemas
- Implements update events
- “Untyped” - uses `bytes` everywhere - typing is the responsibility of wrapping libraries (see below)
- Any contract can implement the `IMudStore` interface / extend the `MudStore` base contract to become compatible with a large chunk of MUD’s toolchain, like general-purpose indexers
- Data is organised by `table` and index into the table, where the index can be a tuple of multiple `bytes32` keys
- This is a superset of ECS: In ECS, “Components” correspond to tables, and “Entities” are indices into the table. In this proposal, we allow tuples to be used as the index into the table, allowing more complex relationships to be modelled (and making the data model more similar to a relational database). However, single keys are still possible, so ECS is still possible.
- The tuple of keys used to index a table is emitted as part of the event, so it can be picked up by indexers and we don’t have to rely on hacks like hashed composite entities anymore.
Illustration of data model
// Illustration of data model:
// Assume we want to index with two keys: A, B
keys: (A, B)
valueSchema: (X, Y, Z)
conceptually:
{
[A1]: {
[B1]: {
X1,
Y1,
Z1
},
[B2]: {
X2,
Y2,
Z2
}
},
[A2]: { ... }
}
-> translates into relational database:
| A | B | X | Y | Z |
| -- | -- | -- | -- | -- |
| A1 | B1 | X1 | Y1 | Z1 |
| A1 | B2 | X2 | Y2 | Z2 |
| ...
-> translates to on-chain:
mapping(
keccak(A1,B1) => {X1, Y1, Z1},
keccak(A1,B2) => {X2, Y2, Z2}
)
Pseudo-code implementation with more details
// Solidity-like pseudo code
// omitting some language features for readability
// eg using keccak(a,b,c) for keccak256(abi.encode(a,b,c))
// or omitting memory, public, pure etc
enum SchemaType {
UINT8,
..., // in steps of 8, so 32 total
UINT256,
INT8,
..., // in steps of 8, so 32 total
INT256,
BYTES1,
..., // in steps of 1, so 32 total
BYTES32,
BOOL,
ADDRESS,
BYTES,
STRING, // until here we have 100 types
BIT, // we could add a native "bitpacking" type using the same approach described below
<T>_ARRAY // everything above as an array - until here we have 202 types
// 54 more slots to define more types and keep SchemaType a uint8
}
// A table schema can have up to 32 keys, so it fits into a single evm word.
// (Schemas above 32 keys are definitely an anti-pattern anyway)
// Working with unnamed schemas makes the core library simpler; naming keys is the job of wrapping libraries
type Schema = SchemaType[32];
// Interface to turn any contract into a MudStore
interface IMudStore {
event StoreUpdate(bytes32 table, bytes32[] index, uint8 schemaIndex, bytes[] data);
function registerSchema(bytes32 table, SchemaType[] schema);
function setData(bytes32 table, bytes32[] index, bytes[] data);
function setData(bytes32 table, bytes32[] index, uint8 schemaIndex, bytes data);
function getData(bytes32 table, bytes32[] index) returns (bytes[] data);
function getDataAtIndex(bytes32 table, bytes32[] index, uint8 schemaIndex) returns (bytes data);
function isMudStore() returns (bool); // Checking for the existence of this function is sufficient for consumers to check whether the caller is a MUD store (this could potentially be turned into EIP-165 in the future)
}
library MudStoreCore {
// Note: the preimage of the tuple of keys used to index is part of the event, so it can be used by indexers
event StoreUpdate(bytes32 table, bytes32[] index, uint8 schemaIndex, bytes[] data);
constant bytes32 _slot = keccak("mud.store");
constant bytes32 _schemaTable = keccak("mud.store.table.schema");
// Register a new schema
// Stores the schema in the default "schema table", indexed by table id
function registerSchema(bytes32 table, SchemaType[] schema) {
// Optional: verify the schema only has one dynamic type at the last slot, see note 1 below
setData(_schemaTable, table, Convert.encode(schema));
}
// Return the schema of a table
function getSchema(bytes32 table) returns (SchemaType[] schema) {
bytes value = getData(_schemaTable, table);
return Convert.decodeUint8Array(value);
}
// Check whether a schema exists for a given table
function hasTable(bytes32 table) returns (bool) {
return getData(_schemaTable, table).length > 0;
}
// Update full data
function setData(bytes32 table, bytes32[] index, bytes[] data) {
// Optional: verify the value has the correct length for the table (based on the table's schema)
// (Tradeoff, slightly higher cost due to additional sload, but higher security - library could also provide both options)
// Store the provided value in storage
bytes32 location = _getLocation(table, index);
assembly {
// loop over data and sstore it, starting at `location`
}
// Emit event to notify indexers
emit StoreUpdate(table, index, 0, data);
}
// Update partial data (minimize sstore if full data wraps multiple evm words)
function setData(bytes32 table, bytes32[] index, uint8 schemaIndex, bytes data) {
// Get schema for this table to compute storage offset
SchemaType[] schema = getSchema(table);
// Compute storage location for given table, index and schemaIndex
bytes32 location = _getLocation(table, index);
uint256 offset = _getByteOffsetToSchemaIndex(schema, schemaIndex); // Simple helper function
assembly {
// set data at the computed location (location + offset)
}
// Emit event to notify indexers
emit StoreUpdate(table, index, schemaIndex, [data]);
}
// Get full data
function getData(bytes32 table, bytes32[] index) returns (bytes[] data) {
// Get schema for this table
// Compute length of the full schema
// Load the data from storage using assembly
// Split up data into bytes[] based on schema
// Return the data as bytes[]
}
// Get partial data based on schema key
// (Only access the minimum required number of storage slots)
function getDataAtIndex(bytes32 table, bytes32[] index, uint8 schemaIndex) returns (bytes data) {
// Get schema for this table
// Compute offset and length of this schema index
// Load the data for this schema index from storage using assembly
// Return the data as bytes
}
// Compute the storage location based on table id and index tuple
// (Library could provide different overloads for single index and some fixed length array indices for better devex)
function _getLocation(bytes32 table, bytes32[] index) returns (bytes32) {
return keccak(_slot, table, index);
}
// Simple helper function to compute the byte offset to the given schema index based on the given schema
function _getByteOffsetToSchemaIndex(schema, schemaIndex) returns (uint256) {
// Sum `_getByteLength(schemaType)` for every schema index before the given index
}
// Simple helper function to return the byte length for each schema type
// (Because Solidity doesn't support constant arrays)
function _getByteLength(SchemaType schemaType) returns (uint8) {
// Binary tree using if/else to return the byte length for each type of schema
}
}
// A helper library to convert any primitive type (+ arrays) into bytes and back
library Convert {
// Overloads for all possible base types and array types
// Encode dynamic arrays in such a way that the first 2 bytes are reserved for the array length = max arr length 2**16 - 1 (to help decoding)
function encode(uint256 input) returns (bytes);
// Decoder functions for all possible base types and array types
function decodeUint8Array(bytes input) returns (uint8[]);
...
}
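As a more concrete companion to the pseudo-code above, the following is a minimal sketch of the storage core for static-length data only. It assumes tightly packed `bytes` instead of `bytes[]`, leaves out schema registration, and takes the data length as a parameter instead of deriving it from the schema. The names `StoreSketch` and `StoreSketchDemo` are hypothetical and not part of the proposal.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical, reduced version of the core library (static data only).
library StoreSketch {
    // The key tuple is emitted as-is so indexers see the preimage of the location hash
    event StoreUpdate(bytes32 table, bytes32[] index, bytes data);

    bytes32 internal constant SLOT = keccak256("mud.store");

    // Storage location derived from the table id and the full key tuple
    function getLocation(bytes32 table, bytes32[] memory index) internal pure returns (bytes32) {
        return keccak256(abi.encode(SLOT, table, index));
    }

    // Write tightly packed data into consecutive slots starting at the location
    function setData(bytes32 table, bytes32[] memory index, bytes memory data) internal {
        uint256 location = uint256(getLocation(table, index));
        for (uint256 offset = 0; offset < data.length; offset += 32) {
            bytes32 word;
            assembly {
                word := mload(add(add(data, 0x20), offset))
            }
            uint256 remaining = data.length - offset;
            if (remaining < 32) {
                // Zero out the bytes past the end of `data` in the last word
                bytes32 mask = bytes32(type(uint256).max << (8 * (32 - remaining)));
                word = word & mask;
            }
            uint256 slot = location + offset / 32;
            assembly {
                sstore(slot, word)
            }
        }
        emit StoreUpdate(table, index, data);
    }

    // Read `length` bytes back; in the proposal the length would come from the schema
    function getData(bytes32 table, bytes32[] memory index, uint256 length)
        internal
        view
        returns (bytes memory data)
    {
        uint256 location = uint256(getLocation(table, index));
        data = new bytes(length);
        for (uint256 offset = 0; offset < length; offset += 32) {
            uint256 slot = location + offset / 32;
            assembly {
                mstore(add(add(data, 0x20), offset), sload(slot))
            }
        }
    }
}

// Hypothetical consumer: stores a packed (x, y) pair under the composite key (a, b)
contract StoreSketchDemo {
    bytes32 internal constant POSITION = keccak256("demo.table.position");

    function roundTrip(bytes32 a, bytes32 b, uint32 x, uint32 y) external returns (bytes memory) {
        bytes32[] memory index = new bytes32[](2);
        index[0] = a;
        index[1] = b;
        StoreSketch.setData(POSITION, index, abi.encodePacked(x, y));
        return StoreSketch.getData(POSITION, index, 8);
    }
}
```

Because the library functions are `internal`, they operate on the storage of whichever contract calls them, which is what lets any contract act as a store.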
Notes
- If we only allow one dynamic array type per table schema, encoding/decoding/storing partial data gets much simpler and cheaper (the dynamic array type always has to come last in the schema)
- cheaper because only one storage access to get the schema, instead of additional storage access to get the length of each dynamic array. Also, dynamic array types anywhere else but at the last schema slot would shift all remaining schema values (even non-dynamic ones), so modifying partial data would be much more expensive (worst case as expensive as modifying the full data) - we could save developers from having to think about this in their model by restricting schemas to one dynamic type that has to come last.
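The "dynamic type last" restriction and the byte offset computation it enables could look roughly like the sketch below. The enum is cut down to five entries for readability, and `SchemaSketch`, `byteLength`, `validate` and `byteOffset` are hypothetical names chosen for this illustration.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical, heavily reduced type list; the enum in the proposal has about 200 entries.
enum SchemaType {
    UINT8,
    UINT32,
    UINT256,
    ADDRESS,
    UINT32_ARRAY
}

library SchemaSketch {
    error DynamicTypeNotLast();

    // Static byte length per type; 0 marks a dynamic type in this sketch
    function byteLength(SchemaType schemaType) internal pure returns (uint256) {
        if (schemaType == SchemaType.UINT8) return 1;
        if (schemaType == SchemaType.UINT32) return 4;
        if (schemaType == SchemaType.UINT256) return 32;
        if (schemaType == SchemaType.ADDRESS) return 20;
        return 0; // UINT32_ARRAY
    }

    // Reject schemas with a dynamic type anywhere but in the last position
    function validate(SchemaType[] memory schema) internal pure {
        for (uint256 i = 0; i + 1 < schema.length; i++) {
            if (byteLength(schema[i]) == 0) revert DynamicTypeNotLast();
        }
    }

    // Sum the static lengths of all fields before `schemaIndex`.
    // Only meaningful for validated schemas, where every earlier field is static.
    function byteOffset(SchemaType[] memory schema, uint256 schemaIndex)
        internal
        pure
        returns (uint256 offset)
    {
        for (uint256 i = 0; i < schemaIndex; i++) {
            offset += byteLength(schema[i]);
        }
    }
}
```

For a schema of (UINT32, ADDRESS, UINT8, UINT32_ARRAY), the offset of the UINT8 field is 4 + 20 = 24 bytes, and it can be computed from the schema alone without reading any array length from storage.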
Wrapping typed libraries
- While Solidity doesn’t support generic types, we can autogenerate libraries to set/get typed values based on user defined schemas to emulate the experience of working with a generically typed core library.
- The libraries encode typed values to raw bytes and vice versa to improve developer experience (in theory devs could call the core functions manually but devex would suck)
- The library detects whether the call comes from within a `MudStore` (eg if the contract using the library is called via `delegatecall` from a `MudStore`) or if the `msg.sender` is a `MudStore` (eg if the contract using the library is called via `call` from a `MudStore`) and automatically switches between writing to own storage using the core library and calling the respective access controlled methods on the calling `MudStore`.
Pseudo-code implementation with more details
// Solidity-like pseudo code
// omitting some language features for readability
// eg using keccak(a,b,c) for keccak256(abi.encode(a,b,c))
// or omitting memory, public, pure etc
// ----- Example of an auto-generated typed library for a Position table -----
// -- User defined schema and id --
bytes32 constant id = keccak("mud.store.table.position");
struct Schema {
uint32 x;
uint32 y;
}
// -- Autogenerated schema and library --
library PositionTable {
// Detect whether the call to the system was done via delegatecall or a regular call
// to switch between writing to own storage and using access controlled external storage functions
// (see note 1. below)
function isDelegateCall() internal returns (bool) {
(bool success, bytes memory data) = address(this).call(
abi.encodeWithSignature("isMudStore()")
);
return success && abi.decode(data, (bool));
}
// Register the table's schema
// (used to compute data length when returning values from core lib and for input validation)
function registerSchema() {
// Autogenerated schema based on schema struct definition
SchemaType[2] schema = [SchemaType.UINT32, SchemaType.UINT32];
// Call core lib or wrapper contract to register schema
if(isDelegateCall()) {
MudStoreCore.registerSchema(id, schema);
} else {
MudStore(msg.sender).registerSchema(id, schema);
}
}
// Set the full position value
function set(uint256 entity, uint32 x, uint32 y) {
bytes[] data = [
Convert.encode(x),
Convert.encode(y)
];
// Set the data via core lib or wrapper contract
if(isDelegateCall()) {
MudStoreCore.setData(id, entity, data);
} else {
MudStore(msg.sender).setData(id, entity, data);
}
}
// Offer both syntax for convenience
function set(uint256 entity, Schema data) {
set(entity, data.x, data.y);
}
// Set partial schema values
function setX(uint256 entity, uint32 x) {
// Set the data via core lib or wrapper contract
if(isDelegateCall()) {
MudStoreCore.setData(id, entity, 0, Convert.encode(x));
} else {
MudStore(msg.sender).setData(id, entity, 0, Convert.encode(x));
}
}
function setY(uint256 entity, uint32 y) {
// Set the data via core lib or wrapper contract
if(isDelegateCall()) {
MudStoreCore.setData(id, entity, 1, Convert.encode(y));
} else {
MudStore(msg.sender).setData(id, entity, 1, Convert.encode(y));
}
}
// Get the full position value
function get(uint256 entity) returns (Schema) {
// Get data via core lib or wrapper contract
bytes[] data = isDelegateCall()
? MudStoreCore.getData(id, entity)
: MudStore(msg.sender).getData(id, entity);
return Schema(
Convert.decodeUint32(data[0]),
Convert.decodeUint32(data[1])
);
}
// Get partial schema values
function getX(uint256 entity) returns (uint32) {
// Get data via core lib or wrapper contract
bytes data = isDelegateCall()
? MudStoreCore.getDataAtIndex(id, entity, 0)
: MudStore(msg.sender).getDataAtIndex(id, entity, 0);
return Convert.decodeUint32(data);
}
function getY(uint256 entity) returns (uint32) {
// Get data via core lib or wrapper contract
bytes data = isDelegateCall()
? MudStoreCore.getDataAtIndex(id, entity, 1)
: MudStore(msg.sender).getDataAtIndex(id, entity, 1);
return Convert.decodeUint32(data);
}
}
Usage examples
// Usage examples from within System:
PositionTable.set(0x01, 1, 2);
PositionTable.set(0x01, {x: 1, y: 2});
PositionTable.set({entity: 0x01, x: 1, y: 2});
PositionTable.setX(0x01, 1);
Schema position = PositionTable.get(0x01);
uint32 x = PositionTable.getX(0x01);
Notes
- We want to be able to detect `delegatecall` in the storage library called in the system
- If the system is called via `delegatecall`, it means it can write to storage using `MudStoreCore` directly without having to call functions with access control on a `MudStore` contract. This saves (700 call base gas + x calldata gas + y access control check gas) per storage operation
- To detect `delegatecall` inside of a library, we can check if `this` has the `isMudStore()` function
- Since systems don’t implement their own `isMudStore` function, if `this` supports `isMudStore`, it means the current context is a `MudStore` and we can use libraries directly (this could be turned into something like ERC165’s `supportsInterface`)
- This approach is cheaper than alternatives like setting a temporary storage variable (5k gas to temp store, 2.1k to read from the system)
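The detection described in these notes could be sketched as follows. This is an assumption-laden illustration: it uses `staticcall` instead of the `call` shown in the pseudo-code, adds a return-length check so that addresses without code are not mistaken for a store, and all names (`IMudStoreProbe`, `StoreDetection`, `DemoStore`, `DemoSystem`) are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical interface containing only the marker function
interface IMudStoreProbe {
    function isMudStore() external view returns (bool);
}

library StoreDetection {
    // True if `target` answers `isMudStore()` with a single ABI-encoded `true`
    function isStore(address target) internal view returns (bool) {
        (bool success, bytes memory data) = target.staticcall(
            abi.encodeWithSelector(IMudStoreProbe.isMudStore.selector)
        );
        // A call to an address without code succeeds with empty return data,
        // so the length check is needed to rule that case out.
        return success && data.length == 32 && abi.decode(data, (bool));
    }

    // Under delegatecall, address(this) is the calling store, so the probe succeeds
    function runsInsideStore() internal view returns (bool) {
        return isStore(address(this));
    }
}

// Minimal store exposing only the marker function
contract DemoStore {
    function isMudStore() external pure returns (bool) {
        return true;
    }
}

// A system does not implement isMudStore() itself, so the probe on address(this)
// only succeeds when the system code runs in a store's context.
contract DemoSystem {
    function check() external view returns (bool insideStore, bool callerIsStore) {
        insideStore = StoreDetection.runsInsideStore();
        callerIsStore = StoreDetection.isStore(msg.sender);
    }
}
```

A contract with a permissive fallback function could still answer the probe in unexpected ways, which is one reason the notes suggest moving towards something like ERC165's `supportsInterface`.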
Framework (aka World)
Edit: the original proposal included a section on the World framework. Since then we reworked the World framework concept and moved the discussion about it to a new issue (#393). The original section is kept below for reference.
- Using the `MudStoreCore` library, any contract can become compatible with MUD’s toolchain
- To further improve developer experience, we create a framework around `MudStoreCore` (like the current World contract and conventions)
- Common patterns for modularising code (into modular systems)
- Common patterns for approvals akin to ERC20-like approvals, used for:
- system-to-system calls
- session wallets
- atomic contract interactions (akin to ERC20 swaps)
- Replacing dynamic contract addresses with known and human-readable function names inside the framework
- The framework has similarities to the well known diamond pattern, but implements facets differently to support an “autonomous mode”, where third party developers can register new tables and new systems on the core World contract.
- Systems (akin to facets) can be registered as `DELEGATE` systems, meaning they are called via `delegatecall` from the World contract
- `DELEGATE` systems have full access to all storage, so they can only be registered and upgraded by the World’s owner
- The World can be made “autonomous” by setting its owner to `address(0)`
- This means no more `DELEGATE` systems can be registered and the existing `DELEGATE` systems can not be upgraded anymore
- Systems can be registered as `AUTONOMOUS` systems, meaning they are called via `call` from the World contract
- `AUTONOMOUS` systems set state via the World’s access controlled `setData` method
- They can read from all tables, but can only write data to tables they have write access to
- Anyone can register a new `AUTONOMOUS` system
- The owner of an `AUTONOMOUS` system can upgrade the system (by overwriting the existing entry in the `SystemTable`)
- All systems are called via the World’s `fallback` method
- Why?
- The central World contract can implement logic like access control, approval pattern, system-to-system calls, account abstraction
- This central logic can be upgraded by the World owner (which can be a DAO)
- Access control bugs can be fixed and new features can be added for the entire World instead of each system separately
- Systems neither need a reference to “their World” in storage, nor does the World need to be passed via a parameter
- Instead systems can trust the `msg.sender` to be the World contract (if called via `call`) and therefore read and write data via World’s access controlled methods, or have write access to the delegated storage directly (if called via `delegatecall`). All of this can be abstracted into the autogenerated libraries per table.
- This also enables systems to be deployed once and then be registered in and called from multiple different World contracts (akin to diamond's facets).
- Same developer and user experience independent of working in “diamond mode” with mostly `DELEGATE` systems or in “autonomous mode” with `AUTONOMOUS` systems.
- How?
- When registering a new system, the World computes a new function selector based on the system’s name and function signature
- Example: Registering a `CombatSystem`’s `attack` function:
- Register via call to `world.registerSystem(<contractAddr>, "Combat", "attack(bytes32)")`
- Now the system can be called via `world.Combat_attack(bytes32)` (the call will be forwarded to `CombatSystem.attack(bytes32)`)
- Since systems are called via the World contract, `msg.sender` is either the external `msg.sender` (if the system is called via `delegatecall`) or the World contract (if the system is called via `call`).
- Therefore all systems’ functions need to have `address _from` as their first parameter, which will be populated by the World contract with the external `msg.sender`, or other addresses based on some approval pattern (see discussion in #327)
- Great benefit of this approach: access control, account abstraction, etc can all be implemented (and upgraded) at the central World contract instead of separately in each system (see notes on “Why” above)
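The selector registration and `_from` injection described above can be illustrated with a minimal sketch, under several simplifying assumptions: it covers only the `call` (AUTONOMOUS) path, has no access control or ownership checks, takes the World-facing and system-facing signatures as two separate strings instead of stripping `address _from` from one, and only works for systems whose remaining arguments are all static types. `MiniWorld` and `CombatSystem` as written here are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical, stripped-down World: selector registry plus forwarding fallback
contract MiniWorld {
    struct SystemEntry {
        address addr;
        bytes4 selector;
    }

    mapping(bytes4 => SystemEntry) public systems;

    // worldSig omits `address _from` (e.g. "attack(bytes32)"),
    // systemSig includes it (e.g. "attack(address,bytes32)")
    function registerSystem(
        address addr,
        string calldata name,
        string calldata worldSig,
        string calldata systemSig
    ) external {
        bytes4 worldSelector = bytes4(keccak256(abi.encodePacked(name, "_", worldSig)));
        systems[worldSelector] = SystemEntry(addr, bytes4(keccak256(bytes(systemSig))));
    }

    fallback() external payable {
        require(msg.data.length >= 4, "missing selector");
        SystemEntry memory system = systems[msg.sig];
        require(system.addr != address(0), "unknown system");

        // New calldata: [system selector][_from = msg.sender][original arguments].
        // Prepending a word like this is only valid when all original arguments
        // are static types, since dynamic types carry offsets relative to the head.
        bytes calldata args = msg.data[4:];
        bytes memory forwarded = abi.encodePacked(system.selector, abi.encode(msg.sender), args);

        (bool success, bytes memory result) = system.addr.call{value: msg.value}(forwarded);
        assembly {
            switch success
            case 0 {
                revert(add(result, 0x20), mload(result))
            }
            default {
                return(add(result, 0x20), mload(result))
            }
        }
    }
}

// Hypothetical system: trusts `_from` because it expects to be called via the World
contract CombatSystem {
    event Attacked(address from, bytes32 target);

    function attack(address _from, bytes32 target) external {
        emit Attacked(_from, target);
    }
}
```

After registering `CombatSystem` with name "Combat", a call to the World using the selector of `Combat_attack(bytes32)` reaches `CombatSystem.attack` with `_from` set to the external caller.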
Pseudo-code implementation with more details
// Solidity-like pseudo code
// omitting some language features for readability
// eg using keccak(a,b,c) for keccak256(abi.encode(a,b,c))
// or omitting memory, public, pure etc
// `MudStore` base contract implements all view functions from IMudStore (getData, ...)
// that don't require access control checks.
// World contract extends `MudStore` and implements access control for write methods (`setData`)
contract World is MudStore {
error World_TableExists();
function registerSchema(bytes32 table, SchemaType[] schema) {
// Require unique table ids
if(MudStoreCore.hasTable(table)) revert World_TableExists();
// Register schema
MudStoreCore.registerSchema(table, schema);
// Set table's owner in owner table
// (OwnerTable uses auto-generated typed helper table like `PositionTable` described above)
OwnerTable.set({ index: table, owner: msg.sender });
}
function setData(bytes32 table, bytes32[] index, bytes[] data) {
// TODO: Require caller to have permission to modify table
// (access control details tbd)
// Set data
MudStoreCore.setData(table, index, data);
}
// Register a new system
// -> Anyone can call this method, but only World owner can pass DELEGATE mode
// - DELEGATE systems are called via delegatecall and have access to all storage
// - AUTONOMOUS systems are called via call and modify storage via access controlled `setData` method
function registerSystem(
address contractAddress,
string contractName,
string functionSig,
ExecutionMode mode) {
// TODO: if mode is DELEGATE, require msg.sender to be World's owner
// TODO: check if contract name is already registered
// - if so, require msg.sender to be owner
// - else, register contract name and set msg.sender as owner
// TODO: check if function signature already exists for the given contract
// - if so, this is an upgrade
// - require msg.sender to be system's owner
// - and if the given system is a DELEGATE system, require World's owner to be system's owner
// (to prevent upgrades to DELEGATE systems in fully autonomous mode)
// Compute the selector to use to call this system via the fallback() entry point
// using the format <contractName>_<functionSig>()
// NOTE: this is slightly simplified - in reality we have to remove the `address _from` parameter
// from the function signature because it will be automatically populated by the World based on `msg.sender` (see notes above)
bytes4 worldSelector = bytes4(keccak(abi.encodePacked(contractName, "_", functionSig)));
// Register World selector with contract address
SystemTable.set({
index: bytes32(worldSelector),
addr: contractAddress,
selector: bytes4(keccak(functionSig)),
mode: mode
});
}
// TODO: Set approval (see general approval pattern discussion in mud#327)
function approve( ... ) { ... }
// The fallback function is used for consumers to call system functions
// with proper types. We can generate an ABI for the World contract based
// on registered systems.
// The function selector is generated in `registerSystem` (see above)
fallback() external payable {
// Find system based on function selector
SystemTableEntry system = SystemTable.get(msg.sig);
if(system.mode == ExecutionMode.DELEGATE) {
// TODO: If system is DELEGATE system, populate the _from parameter with msg.sender,
// forward the call via `delegatecall`, and return any value.
// This is almost equivalent to EIP2535 (diamond pattern), except from
// using `_from` instead of `msg.sender`
} else {
// TODO: If system is an AUTONOMOUS system, populate the _from parameter with msg.sender
// forward the call via `call` and return any value.
// The called system will use access controlled `setData` methods of this contract.
}
}
}
Usage example
// ----- Example of a move system -----
contract MoveSystem {
// System can trust the `move` function will only be called via a `MudStore` contract (in our case World)
// and must therefore use the _from parameter instead of msg.sender. (Note: this requires something like the "general access pattern" (#327) to be in place)
// Since system doesn't have any internal state, it doesn't have to check whether the call actually comes from a `MudStore`
// (because state will always be modified in the calling contract and the call fails if it doesn't come from a MudStore)
function move(address _from, bytes32 _entity, Position _position) public {
// Check if the `_from` address owns the given entity
require(OwnerTable.get(_entity) == _from, "only owner can move entity");
// Set the new entity's new position value
PositionTable.set(_entity, _position);
}
}
Further work / extensions
Table migrations
- For a persistent world it is plausible that table schemas need to be upgraded from time to time. How could this be implemented in this proposal?
- We could add an additional signature for `setData` and `getData` that includes a `uint16 version` parameter
- `MudStoreCore._getLocation` includes the version to get the storage location hash
- If the version parameter is omitted, it is set to `0` by default
- To increase a table’s version, a “migration” has to be specified (how to interpret the original data with the new schema). This migration is used to generate a typed access library using the new schema, which calls `setData` with an incremented version value and the new schema, and implements the migration in the getter functions.
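One possible shape for the versioned location hash is sketched below. The library name `VersionedLocationSketch` and the position of `version` inside the hash are assumptions made for illustration, not something the proposal specifies.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical sketch of a version-aware storage location
library VersionedLocationSketch {
    bytes32 internal constant SLOT = keccak256("mud.store");

    // Each version of a table hashes to a disjoint storage region,
    // so old data stays readable while new writes use the new schema
    function getLocation(bytes32 table, bytes32[] memory index, uint16 version)
        internal
        pure
        returns (bytes32)
    {
        return keccak256(abi.encode(SLOT, table, version, index));
    }

    // Omitting the version defaults to version 0
    function getLocation(bytes32 table, bytes32[] memory index) internal pure returns (bytes32) {
        return getLocation(table, index, 0);
    }
}
```

Note that with this layout, version 0 hashes differently from the unversioned `keccak(_slot, table, index)` used in the core pseudo-code, so the version would have to be part of the hash from the start for existing data to remain addressable.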
Acknowledgements
- This proposal is based on many internal discussions with and ideas by @ludns, @holic, @Kooshaba and @authcall
- Generating libraries to improve developer experience and allow typed access to tables is based on an idea by @FlynnSC
- Registering contracts as “facets” and calling them via a `fallback` method, as well as using delegated storage is based on Nick Mudge, "EIP-2535: Diamonds, Multi-Facet Proxy," Ethereum Improvement Proposals, no. 2535, February 2020. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-2535.
- Using diamond storage to improve the developer experience and gas efficiency of MUD is based on ideas by @cha0sg0d, @0xhank and @dk1a