The Mir Execution Environment

Brendan Farmer, Daniel Lubarov & William Borgeaud · 2021-01-25

TLDR: Mir is designed to support applications that developers can build on Ethereum, but in an execution environment where transactions are validated with zero-knowledge proofs.

Introduction

Mir aims to allow developers to build and run scalable, full-featured, and private decentralized applications. On Mir, every transaction is verified with a zero-knowledge proof. The arithmetic circuits required for these proofs are a significantly different computing environment from a typical VM, but Mir's kernel presents a rich execution environment that supports a wide range of applications.

State on Mir

On Mir, the basic unit of state is a record. A record is defined as a tuple $(\phi_b, \phi_m, \phi_d, \phi_f, \textsf{address}, \textsf{state})$, where $\phi_*$ are birth, mutation, death, and function predicates, $\textsf{address}$ is an optional address tag for mutable records, and $\textsf{state}$ is a field element.1

Mir represents the current state of all records in the Commitment Set, which consists of two data structures, the Commitment Log and the Liveness Mask (described here). An application's state and logic need not be published to the network; validators instead use the Commitment Set to verify claims about record state.

Record Diagram

Applications running on Mir can create, mutate, consume, and call records as functions. When a record is created, it is added to the Commitment Set. A record is consumed when it is set to inactive in the Commitment Set. When a record is mutated, it is consumed, and its address is tagged to a newly-created record representing its updated state.

Operations on records are performed in transactions. A transaction consists of a list of created record commitments $\{b_i\}$, consumed record addresses $\{d_i\}$, accessed records $\{a_i\}$, previous block roots $\{r_i\}$,2 optional auxiliary data $\textsf{aux}$, and a zero-knowledge proof $\pi$.

Beyond ZEXE

In ZEXE,3 record logic is encoded in birth and death predicates. This programming model is an extension of Pay-to-Script-Hash in Bitcoin: it allows developers to create new tokens with birth predicates, and specify arbitrary conditions for spending those tokens with death predicates. However, this model can't support many of the most popular applications on Ethereum, specifically those that rely on shared state.

ZEXE transaction

Consider an application that relies on a shared interest rate or collateral ratio, as in MakerDAO, or a liquidity ratio, as in Uniswap. When we open a CDP, we have to access the global state of the system. In ZEXE, we can create a record for this shared state, but in order for a predicate to read state from a record in a transaction, it must consume that record. Unfortunately, a record can only be consumed once, limiting synchronous access to shared state.

The Mir Execution Environment

Mir introduces a new execution environment (EE), where records are mutable, and where it's possible to access record state (and even execute functions stored in records) without consuming those records. This is achieved by adding an address tag to identify mutable records, adding a mutation predicate and a function predicate to records, and providing addresses of accessed records to the transaction proof as public inputs.

Mir Kernel

We can see that records 1 and 2 are consumed in this transaction, records 3 and 4 are created, record 5 is called as a function to access state without consuming a record, and record 6 is mutated to record 7. Mutating a record is similar to consuming and recreating it, except the EE enforces that the consumed record's address tag is carried over.

Transactions

Consider a transaction consisting of created records $\{b_i\}$,4 mutated records $\{m_i\}$, consumed records $\{d_i\}$, and accessed records $\{a_i\}$.

Executing an application

The transaction creator retrieves the addresses, block roots, state, predicate circuits for each existing record, either via local storage or externally. Application logic is executed off-chain, and used to generate state, verification keys, and commitments for newly-created records $\{b_i\}$ and $\{m_i\}$. Newly created mutable records are assigned an address tag that is inherited from the mutated record.

Generating predicate proofs

Predicate proofs verify that the predicate $ \phi_* $ corresponding to an operation being performed on a record (creation, mutation, access, or consumption) is satisfied. Predicate proofs $\pi_{*}$ are generated for each record involved in the transaction.5

The public input for each predicate proof is constructed from the set of addresses of existing records, the states and predicates for each record,6 and the transaction memory. Each predicate in the transaction is given a memory location that can be "written" to by that predicate proof, allowing predicate circuits to access return values from function predicates.

Generating a transaction proof

The transaction proof recursively verifies predicate proofs with non-deterministically provided verification keys, and verifies that record commitments contain the correct state and hashed verification keys, ensuring that the public input provided to predicate proofs is consistent with the record commitments. It also checks that the address tag for each mutable record is correctly applied.

The transaction creator non-deterministically provides inclusion proofs in previous blocks for each existing record commitment, and the transaction proof verifies these inclusion proofs with the provided addresses and block roots $\{r_i\}$ for each existing record.

EE features

Composability

The Mir Execution Environment allows developers to build applications that are composable: they can synchronously access records containing shared state, like an oracle or an order book, or call functions that enforce global behavior, like a whitelist or blacklist. Importantly, this increase in functionality doesn't present a penalty in proving time or efficiency.

Global state

Transactions can synchronously read from global state, so a natural question is whether it's possible to synchronously write to global state. Application developers have two options to handle synchronous writes, either via non-atomic "update records," or Chained Transactions.

Update records

To see how the non-atomic option works, let's say that I have a simplified MakerDAO protocol that lets users lock up collateral, and the total amount of collateral in the system is tracked in a global state record, which determines interest rates, collateral ratios, and so forth. Suppose that 10 users lock up collateral in a single block, and each must update the global state of the application.

Global State Diagram

A developer can require an update record to be emitted when collateral is locked up in a CDP. These intermediate records are then gathered and used in a transaction to update the record that contains shared state.7 Each transaction that creates a CDP can then access the updated global Maker state.

Chained transactions

Certain applications, like Uniswap, wouldn't work well with update records, because transaction creators can't be sure of the exact state of a pool when they submit their transactions. Instead, we should be able to support the same application flow as on Ethereum, where users can submit transactions that are then sequenced and processed to update shared state.

On Mir, we call those Miner- or Validator-Completed Transactions. A user will submit either a witness, or a partially-proven transaction, to a sequencer. The sequencer includes transactions that write to the same record, and orders the transactions such that multiple writes can occur atomically in a single block. In a future post, we'll describe how this works in more detail.

Chained Transactions

Data availability

An immediate question arises: if a shared record can be mutated by any user, how do we prevent a malicious user from mutating it while keeping the new state private? That would make it impossible for other users to interact with it, so clearly we need a way to prevent such behavior.

We could designate certain records as public records, and mandate that validators store the state of those records. However, this approach raises a couple concerns. First, how would we enforce this behavior? Mir validators wouldn't actually need any record data to validate transactions (in contrast to Ethereum), so we would need to turn to proofs of replication,8 which we would rather avoid. Second, how can we manage storage costs? Ethereum's approach of charging one-time fees for writing data is not sustainable, since actual storage costs are a function of time. Some projects are turning to rent as a solution,9 but this presents its own challenges.

We decided to take an unopinionated approach. Predicates can require that some data (including, but not limited to, the payload of a shared record) be published to the network, but we don't mandate long-term storage of this data. This is similar to Ethereum's CALLDATA, which is propagated but not stored by default, and is used by rollups like zkSync as a cheaper storage alternative. We expect that many applications will work fine with this IPFS-like storage model, as some parties will be inclined to store application data without any mandate or reward from the protocol. Let's consider a few examples:

  • A price oracle, such as the one in MakerDAO. Since a price oracle's state is tiny, the cost of storing it is trivial, so users of the oracle may as well persist its state in order to guarantee continued availability of the oracle.
  • A DEX with off-chain order books, such as 0x. As suggested in Zexe,3 each order could be represented by a record. These records would be stored by their owners, so there is no data availability concern.
  • A game such as CryptoKitties. Companies like Axiom Zen would be incentivized to store any shared record data. There could be concerns about the centralization of this data, but a user could always eliminate this concern for herself by storing a replica herself.

In the future, one or more sidechains could be added for storing data with explicit incentives, to support any applications which lack these natural storage incentives. Application developers could then write predicates to verify that a piece of data is included in the committed state of a storage sidechain, ensuring its availability.

Future development

There are a few areas where the current implementation of Mir's EE could be extended or improved, including:

Synchronous reads

There's currently a limitation in how accessed mutable records are handled in the EE. Consider a price feed, and an application that checks whether the price is below some value $v$. Transaction proofs make assertions about the specific state of a mutable record, but a quickly changing price feed might mutate to a new state (but still be below $v$), invalidating the transaction proof before it's accepted by the network.

We're currently working on changing how accessing records is handled, so that a transaction proof could make an assertion about a mutable record, and the assertion could be proven as valid in the recursive proof verifying the transaction proof, allowing assertions about record state to operate on the live state of a quickly changing record.

Subpredicates

Instead of encoding a birth/mutation/death predicate as a single circuit, it would be more efficient and useful for different applications to encode a condition as a set of subpredicates, and then prove in the transaction proof that certain subpredicates are satisfied. This would allow Mir to support complex applications without a significant proving time penalty, and would make it easier for applications to enforce conditions on records (such as a slashing condition) without dictating their exact logic.

Conclusion

The Mir EE is thin rather than opinionated: it consists of generic actions performed on records commitments. Mir doesn't have gas, or a complicated VM or runtime. It doesn't mandate a particular storage method, but separates availability from storage, so devs can use whatever storage works best for them.

The idea that it's difficult or impractical to implement Ethereum contracts in the UTXO model is increasingly being challenged,10 as Ethereum and other systems begin to migrate to a UTXO approach to support stateless transaction validation.

Mir demonstrates that it's possible to support rich functionality in a pseudo-UTXO model. The kernel is powerful enough to support applications currently running on Ethereum, so Mir can provide privacy, scalability, and space efficiency to decentralized applications.

Notes

1

This field element is generally the image of a hash function, or a commitment to some larger amount of state.

2

By block root we mean the root of the subtree of the Commitment Log that contains the record commitment involved in the transaction. We need these block roots because we need to check that the correct record commitment is used in a transaction proof, so within the transaction, we verify a proof of inclusion with the block root and record address.

4

We'll use $i$ as a dummy index for all records, even though there might be different numbers of consumed, mutated, created, and accessed records.

5

Mir actually allows developers to write predicates that are repeated in a transaction (like a balance conservation check) so that they're verified only once, to improve proof generation time.

6

We have to provide record states to predicate proofs, rather than commitments, as we have to verify record commitment openings in the transaction proof, which is defined over a different curve, with a different scalar field, and evaluating a SNARK-friendly hash non-natively is inefficient.

7

The kernel is agnostic as to who pays for this update transaction. It could be an application developer, those with an incentive for the application to continue operating, or a mechanism requiring that users contribute a fee with their update records could be implemented in the kernel.

9

See, for example, Vitalik's rent suggestion from 2018.