web3.py Patterns: Typed Data Message Signing

Message signing in Ethereum has a surprisingly complicated history.

Best practices and economic realities have evolved considerably over the years. In Ethereum's earliest days, you could get away with storing any and all activity onchain without breaking the bank. Years later, when demand for block space grew and even simple transactions priced many users out, app developers were faced with very different questions, e.g., "How can I maximize the user experience in as few transactions as possible?" As the cost of transactions grew, so too did the popularity of replacing them with signed messages, where possible.

For many use cases, submitting a transaction is overkill; an application just needs to know that an account owner wants to take some action. That action could be logging in to a service, posting on a social network, or anything else that doesn't necessarily need to update onchain state. In these cases, a signed message and some offchain processing gets the job done.

For activity that does need to make its way onchain, message signing can permit relayers to subsidize gas for users. This enables app developers to remove the biggest sources of onboarding friction (wallet download, setup, and funding) while still providing users with a wallet they fully own. The rise of account abstraction tooling and "Layer 2" scaling solutions has made this economically reasonable for many use cases, and app developers can factor the transaction fee subsidies into their cost of user acquisition.

TL;DR – signing and sending a message is a solved problem. Modern cryptography lets an app developer be certain that a message signer is in possession of a private key without revealing that private key. The issue is that the user experience is miserable – and even outright dangerous – without an agreed-upon standard for what can be signed, how to display it, and how to parse it.

In order to be efficiently transmitted, the content a user signs needs to be hashed, but presenting a user with an obscure hash to sign is, at best, uninformative and, at worst, an opportunity for bad actors to drain the user's account.

Pause for dramatic effect

This is a good opportunity to stress the point: never sign an obscure hash in isolation. It's important to understand that a hash function produces a random-looking set of bytes. At a glance, there's no way to tell if you're looking at the hash of an innocuous message or the hash of a transaction that approves an attacker's ability to move your tokens. One of the primary goals of any message signing standard is to produce safer, more human-readable interactions.
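To make the point concrete, here is a minimal sketch that hashes two very different payloads. Both strings are made-up illustrations rather than real transaction encodings, and SHA-256 from the standard library stands in for Ethereum's Keccak-256 so the snippet has no dependencies; the observation holds for any cryptographic hash.

```python
import hashlib

# Illustrative payloads only; neither is a real Ethereum message format.
payloads = {
    "harmless": b"Log me in to Ether Mail",
    "harmful": b"Approve an unlimited token allowance for an attacker",
}

# SHA-256 is a stand-in for Keccak-256 here.
digests = {label: hashlib.sha256(data).hexdigest() for label, data in payloads.items()}

for label, digest in digests.items():
    print(f"{label:>8}: 0x{digest}")
```

Both lines of output are 32 bytes of hex with nothing to distinguish intent, which is exactly what a wallet would be showing the user if it displayed only the hash.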

An answer

Today's best, i.e., most adopted, solution is eth_signTypedData_v4. As the name suggests, this method is the latest of several iterations on a concept introduced via EIP-712: Typed structured data hashing and signing. (Very brief version history here.)

In short, EIP-712 outlines a formula for apps to turn metadata, data types, and data values into meaningful messages for users to understand and sign. There's a lot to digest when walking through the EIP, but you don't need to deeply understand the process of encoding and hashing each section in order to use typed data messages; eth-account and similar libraries abstract away much of that complexity. If you're curious though, you'll find those details in the EIP and in JavaScript or Python source code.

How to use it

As mentioned, there are three components to assembling typed data for a user to sign: metadata, types, and values. In this section, we'll walk through a Python version of the example used in the EIP. The sample code is for a fictitious email/messaging app called "Ether Mail," where each signed message contains some content being sent from one account to another.

Domain Separator

That metadata mentioned above goes by the name domain data or domain separator and its purpose is twofold: 1) to make it clear to the user which application is requesting their signature, and 2) for the app developer to have smart contract-readable metadata within the message, allowing them to guard against stale or fraudulent messages.

The domain data is flexible; each application can optionally use whichever fields it requires.

domain_data = {
    "name": "Ether Mail",
    "version": "1",
    "chainId": 1,
    "verifyingContract": "0xCcCCccccCCCCcCCCCCCcCcCccCcCCCcCcccccccC",
    "salt": b"decafbeef",
}

Message Types

Messages in this format must be statically typed and mapped to types present in smart contract languages: bytes1 to bytes32, uint8 to uint256, int8 to int256, bool, address, bytes, string, and custom arrays or structs.

This example outlines a message that could be an email from one address to another. We define a custom Person struct which contains a name and a wallet address. Next, a custom Mail struct includes a from and to of the new Person type, in addition to the email contents.

msg_types = {
    "Person": [
        {"name": "name", "type": "string"},
        {"name": "wallet", "type": "address"},
    ],
    "Mail": [
        {"name": "from", "type": "Person"},
        {"name": "to", "type": "Person"},
        {"name": "contents", "type": "string"},
    ],
}
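Under the hood, EIP-712 turns these type definitions into a canonical string (the primary struct first, followed by any structs it references in alphabetical order) and hashes that string. eth-account handles this for you; the sketch below is a simplified, dependency-free reimplementation for illustration only, and the helper names are hypothetical.

```python
import re

msg_types = {
    "Person": [
        {"name": "name", "type": "string"},
        {"name": "wallet", "type": "address"},
    ],
    "Mail": [
        {"name": "from", "type": "Person"},
        {"name": "to", "type": "Person"},
        {"name": "contents", "type": "string"},
    ],
}


def referenced_structs(type_name, types, found=None):
    """Collect every custom struct reachable from `type_name`."""
    found = set() if found is None else found
    base = re.match(r"\w+", type_name).group(0)  # "Person[]" -> "Person"
    if base in types and base not in found:
        found.add(base)
        for field in types[base]:
            referenced_structs(field["type"], types, found)
    return found


def encode_type(primary, types):
    """Primary struct first, then its dependencies in alphabetical order."""
    deps = sorted(referenced_structs(primary, types) - {primary})

    def signature(name):
        fields = ",".join(f"{f['type']} {f['name']}" for f in types[name])
        return f"{name}({fields})"

    return "".join(signature(name) for name in [primary] + deps)


person_type = encode_type("Person", msg_types)
mail_type = encode_type("Mail", msg_types)
print(mail_type)
# Mail(Person from,Person to,string contents)Person(string name,address wallet)
```

The Keccak-256 hash of each of these strings is the "typehash" that a verifying contract needs to reproduce.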

Message Values

Finally, populate the Mail message you want the user to sign.

msg_data = {
    "from": {
        "name": "Cow",
        "wallet": "0xCD2a3d9F938E13CD947Ec05AbC7FE734Df8DD826",
    },
    "to": {
        "name": "Bob",
        "wallet": "0xbBbBBBBbbBBBbbbBbbBbbbbBBbBbbbbBbBbbBBbB",
    },
    "contents": "Hello, Bob!",
}

Putting it all together

Once you've got the domain, types, and values, you're ready to encode the message and present it to the user for signing.

from eth_account import Account

# Obligatory: don't store your private key in plain text or commit it to git:
private_key = "0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

# sign the message with the private key:
signed_msg = Account.sign_typed_data(private_key, domain_data, msg_types, msg_data)

# note: eth-account 0.13+ renames this attribute to `message_hash`
print(signed_msg.messageHash)
# HexBytes('0xc5bb16ccc59ae9a3ad1cb8343d4e3351f057c994a97656e1aff8c134e56f7530')

Note: this example leverages eth-account directly, but you can access the same API within web3.py. For example:

signed_msg = w3.eth.account.sign_typed_data(...)

A signed message represents an intent to perform some action. The message sender, in this case, would like to send some text to another account. In the next section, we'll make sure the signature came from the proper author.

Verify the signer

You, the app developer, can now take this signed message, pass it to your smart contract or decode it locally, and decide to perform some action based on what's in the message. As an example, it might be a very good idea to verify that the signer of the message matches the from wallet address in the message.

You'll find plenty of documentation written about recovering the address of a message signer. Be warned that these functions work just fine for typed data (i.e., they will return the message signer's address), but generally they won't take into account or verify your domain data.

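from eth_account.messages import encode_typed_data

# re-encode the signed payload, then recover the signer's address:
signable_msg = encode_typed_data(domain_data, msg_types, msg_data)
signer = Account.recover_message(signable_msg, signature=signed_msg.signature)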

recover_message can recover the signer's address, but can't tell you if the message had valid domain data.

Smart contracts can perform the same decoding. In Solidity, the ecrecover function accepts a message hash, v, r, and s as inputs and returns an address:

contract Verifier {
    function recoverSigner(bytes32 message, uint8 v, bytes32 r, bytes32 s) public pure returns (address) {
        return ecrecover(message, v, r, s);
    }
}
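To call a function like this from Python, the 65-byte signature has to be split into its three parts. Signatures produced by eth-account are laid out as r (32 bytes), s (32 bytes), then v (1 byte); the signed message object also exposes `r`, `s`, and `v` directly as integers. The helper below is a hypothetical, dependency-free sketch, and the placeholder bytes stand in for a real `signed_msg.signature`.

```python
def split_signature(signature: bytes):
    """Split a 65-byte signature laid out as r || s || v into (v, r, s)."""
    if len(signature) != 65:
        raise ValueError(f"expected 65 bytes, got {len(signature)}")
    r = signature[:32]    # bytes32 in Solidity
    s = signature[32:64]  # bytes32 in Solidity
    v = signature[64]     # uint8 in Solidity (27 or 28 for eth-account signatures)
    return v, r, s


# Placeholder bytes, not a real signature; substitute signed_msg.signature.
placeholder = bytes([0x11]) * 32 + bytes([0x22]) * 32 + bytes([27])

v, r, s = split_signature(placeholder)
print(v, r.hex(), s.hex())
```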

Verify the domain

The whole point of the domain separator is to provide additional context and security, so don't throw it out. EIP-712 supplies a formula for hashing all the components of the typed message and verifying not just the message signer, but the domain too.

To be frank, it's not a straightforward process. Fortunately, OpenZeppelin does provide helpful utilities (EIP712.sol) to make parts of this easier. In particular, the OpenZeppelin contract can handle most of the domain separator components and includes a generic encoding function (_hashTypedDataV4), but you're still left to your own devices for the type definitions and value hashing.

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.23;

// Note: ERC-5267 "Retrieval of EIP-712 domain" baked into OZ's EIP712:
import {EIP712} from "@openzeppelin/utils/cryptography/EIP712.sol";
import {ECDSA} from "@openzeppelin/utils/cryptography/ECDSA.sol";

contract Verifier is EIP712 {

    // define custom data types: Person and Mail
    
    struct Person {
        string name;
        address wallet;
    }

    struct Mail {
        Person from;
        Person to;
        string contents;
    }


    // per EIP-712, the hash of each custom type's definition (its typehash)
    //    is included when encoding data of that type
    
    bytes32 constant PERSON_TYPEHASH = keccak256(
      "Person(string name,address wallet)"
    );

    bytes32 constant MAIL_TYPEHASH = keccak256(
      "Mail(Person from,Person to,string contents)Person(string name,address wallet)"
    );


    // instantiate the contract:
    // OZ EIP712 sets `verifyingContract` and `chainId` upon instantiation
    
    constructor() EIP712("Ether Mail", "1") {}


    // data hashing helpers:
    
    function hashString(string calldata _source) private pure returns (bytes32) {
      return keccak256(bytes(_source));
    }
    
    function hashPerson(Person calldata _person) private pure returns (bytes32) {
      return keccak256(abi.encode(PERSON_TYPEHASH, hashString(_person.name), _person.wallet));
    }

    function hashMail(Mail calldata _mail) private pure returns (bytes32) {
      return keccak256(abi.encode(MAIL_TYPEHASH, hashPerson(_mail.from), hashPerson(_mail.to), hashString(_mail.contents)));
    }
    
    function mailHashData(Mail calldata _mail) public view returns (bytes32) {
      bytes32 encoded = hashMail(_mail);
      return _hashTypedDataV4(encoded);
    }


    // return the address of the signer:
    
    function recoverAddress(Mail calldata _mail, bytes calldata _signature) public view returns (address) {
      bytes32 encoded = mailHashData(_mail);
      return ECDSA.recover(encoded, _signature);
    }
}

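Once this contract is deployed, you can confidently use its recoverAddress function to return the signer's address of a signed typed data message. If any of the domain separator values have been altered, the recovered address will be completely different. Keep in mind that OpenZeppelin's EIP712 builds its domain from name, version, chainId, and verifyingContract only, so the domain data you sign in Python must use exactly those fields (no salt) and the deployed contract's address as verifyingContract.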

Below is a pseudocode usage example of that contract. (Full example within an Ape project available here.)

msg_data = {
    "from": {
        "name": "Cow",
        "wallet": "0xCD2a3d9F938E13CD947Ec05AbC7FE734Df8DD826",
    },
    "to": {
        "name": "Bob",
        "wallet": "0xbBbBBBBbbBBBbbbBbbBbbbbBBbBbbbbBbBbbBBbB",
    },
    "contents": "Hello, Bob!",
}

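# pseudocode: `contract` is a deployed Verifier instance (e.g., in an Ape project)
# structs are passed to the contract as nested tuples, in field order:
mail = (
    (msg_data["from"]["name"], msg_data["from"]["wallet"]),
    (msg_data["to"]["name"], msg_data["to"]["wallet"]),
    msg_data["contents"],
)

# verify the message signer + domain in the contract
signer_712 = contract.recoverAddress(mail, signed_msg.signature)
assert signer_712 == expected_address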

Wrapping Up

While the implementation details include some complexity, hopefully the value of typed data signing is now clear. Now that you've got a baseline, you can get more creative with how to verify and leverage signed messages. Maybe you can imagine how this could grow into a full authentication scheme or how you might leverage timestamps or block numbers to set time limits for actions.

Note that the typed data methods are a recent investment in the eth-account library. Feedback, feature requests, bug reports and other suggestions are welcome within an issue on the repo.

A few more helpful resources:

  • EIP-712 and OpenZeppelin's 712 utilities
  • Python: sample repo using Ape and eth-account
  • Python: eth-account docs
  • JS: Dan's thread in the wagmi repo; verifier contract modeled after his ✊
  • JS: MetaMask's eth_signTypedData_v4 docs
  • An older, but still salient message signing deep dive from Maarten