Concordance: Safely Simplify Complex Smart Contract with LLM

September 12, 2025

Concordance: Automatically Simplifying Smart Contracts

Making complex smart contracts easier to read.

Author:

John Toman

Editors:

Uri Kirstein

Ilya Leybovich

As the Web3 space has matured, smart contract developers have likewise grown more and more sophisticated, writing heavily optimized code, often in the low-level assembly language available in Solidity. While this approach means faster (and cheaper) smart contract execution, these sophisticated coding techniques make auditing and code reviews more difficult.

That’s why we’re excited to announce the release of a new open-source tool named Concordance that can help bring greater understanding to deeply complex code. Concordance automatically simplifies complex smart contract implementations using LLMs without changing the observable on-chain behavior of the code. We are able to guarantee that the behavior doesn’t change thanks to our equivalence checker Concord.

An Overview of Concordance

To explain why we wrote Concordance and how it can be useful, let’s start by looking at an example function taken from the Solady library:

function safeTransferFrom(
   address token,
   address from,
   address to,
   uint256 amount
) internal {
   assembly {
       let m := mload(0x40)
       mstore(0x60, amount)
       mstore(0x40, to)
       mstore(0x2c, shl(96, from))
       mstore(0x0c, 0x23b872dd000000000000000000000000)
       let success := call(gas(), token, 0, 0x1c, 0x64, 0x00, 0x20)
       if iszero(and(eq(mload(0x00), 1), success)) {
           if iszero(
               lt(
                   or(iszero(extcodesize(token)), returndatasize()),
                   success
               )
           ) {
               mstore(0x00, 0x7939f424)
               revert(0x1c, 0x04)
           }
       }
       mstore(0x60, 0)
       mstore(0x40, m)
   }
}

The name suggests that this is performing a “safe” `transferFrom` operation, but how it’s doing so is not immediately clear and the revert condition borders on inscrutable. It’s quite common to find hand-optimized code like this in Web3 applications outside of the Solady library. Auditing or simply reviewing heavily optimized code like this example is extremely difficult and error prone.

What if we could review an equivalent but simpler version?

Concordance will iteratively simplify our Solady example to:

function safeTransferFrom(
   address token,
   address from,
   address to,
   uint256 amount
) internal {
   (bool success, bytes memory returndata) = token.call(
       abi.encodeWithSelector(0x23b872dd, from, to, amount)
   );
  
   // Check if the call succeeded AND returned exactly 1 (true)
   bool returnsOne = success && returndata.length >= 32 &&
                       uint256(bytes32(returndata)) == 1;
  
   if (!returnsOne) {
       // When success = 0: always revert
       // When success = 1: revert if (token.code.length == 0 OR returndata.length > 0)
       if (!success) {
           revert TransferFromFailed();
       } else {
           // success = 1 case
           if (token.code.length == 0 || returndata.length > 0) {
               revert TransferFromFailed();
           }
       }
   }
}

Here the assembly code that builds the call to `token` has been replaced by the use of Solidity’s built ABI encoding APIs. The revert logic has also been significantly clarified. We can now see that the function reverts if:

The transferFrom function call reverted, or
The function returned a non-empty buffer which didn’t correctly encode the boolean “true” or
The function returned an empty buffer and the token has no code

This implementation behaves the same on all possible parameters to safeTransferFrom and all possible behaviors of the called token (e.g., the token reverts, or returns a malformed buffer or is an EOA, etc.).

How It Works

Notice that Concordance arrived at this result “iteratively.” Concordance first uses an LLM to propose an equivalent rewrite to the provided code. In our example, this was the Solady safeTransferFrom. However, as we just discussed, the original function is quite complex, and so this initial proposal from the LLM may have some subtle differences in behavior. To ensure the rewrite is equivalent, Concordance always compares the proposal from the LLM to the original code using an “equivalence checker” called Concord.

This equivalence checker will either produce a mathematical proof that the two pieces of code behave the same on all possible inputs, or an example input where the behavior diverges. In the latter case, this example found by Concord is fed back into the LLM with instructions to correct the proposed rewrite. This loop iterates until Concord, the equivalence checker, judges the proposed rewrite equivalent to the original code.

By using Concord to check the proposals from the LLM, the rewrite returned to the user is necessarily equivalent to the original code in every observable way. Concord is thus the foundation on which Concordance rests.

Equivalence Checking

Concord is the name of our general purpose equivalence checker, a tool that can automatically (and exhaustively) check whether two smart contracts are equivalent. We’ve mentioned this tool before when discussing the Vyper compiler bug we found using a prototype of Concord. In that writeup we didn’t get into the details of what it means for two programs to be equivalent, but we’ll address that here.

We’ve said that the resulting code is equivalent in every observable way. This distinction is important: if we insisted that the Ethereum Virtual Machine’s memory and stack always looked exactly the same when running two different programs, that would leave little room for any of the changes necessary for the rewrite above. Instead, we insist that the observable side effects of the code must be the same. These side effects are:

Storage changes
Generating log events
Reverting/returning the current EVM frame
An external call (including value transfers)

The data associated with these side effects (the reason for a revert, the topics of an emitted log) must also be the same between two programs. More technically, two equivalent implementations A and B will satisfy the following properties:

For any input, if A reverts with reason R, B must also revert with reason R on the same input
For any input, if A returns with buffer R, B must also return with buffer R on the same input

If both implementations revert then any side effects they performed are rolled back. However, when the implementations return, we insist the effects they had on the state are the same. In particular, if A and B return without exception then they must both make the same sequence of external calls, to the same addresses, with the same ether value and calldatas. They must also emit the same logs, with the same topics and data, and the values in every storage slot must be equal.

Gas consumption is not part of our definition of equivalence. Indeed, unless two EVM executions execute exactly the same opcode with the exact same data, the chances that gas consumption is equivalent is incredibly small. Further, when simplifying code, it is quite likely that the simplified version will use more gas–this is the necessary tradeoff made for undoing optimization.

With this rough definition of equivalence, a natural question arises: how can we ensure the above holds for two implementations A and B on all possible inputs? As we mentioned in our Vyper blog post, Concord is built on the symbolic reasoning used by the Certora Prover, which lets it reason about effectively infinite state spaces, covering all possible inputs.

How Concord achieves this within the framework of the Certora Prover is itself the topic of several blog posts. However, to give a quick overview: Concord instruments the two implementations it is comparing to record all the side effects encountered on any path of execution into two different vectors. After symbolically executing both implementations, we assert that the two vectors are equivalent; if they are not, then there is some difference in behavior between the two implementations. Concord analyzes the vectors to provide a human readable explanation of the different behavior: it is this explanation that is returned back to the LLMs by Concordance. A much more detailed explanation of this technique, and a longer discussion around defining equivalence can be found in our technical report on Concord here.

Concordance Revisited

To reiterate, Concord is a standalone tool that can be used to check the difference between two smart contract implementations. To better understand how Concordance builds upon it, let’s examine a single round of interaction between the LLM and Concord, mediated by Concordance.

The LLM first proposes the following rewrite for the Solady safeTransferFrom:

function safeTransferFrom(
   address token,
   address from,
   address to,
   uint256 amount
) internal {
   (bool success, bytes memory returndata) = token.call(
       abi.encodeWithSelector(0x23b872dd, from, to, amount)
   );
  
   // Check if the call succeeded and returned true (1)
   if (success && returndata.length >= 32 && abi.decode(returndata, (bool))) {
       return;
   }
  
   // Allow calls to contracts that don't return values (non-standard tokens)
   if (success && token.code.length > 0 && returndata.length == 0) {
       return;
   }
  
   // All other cases should revert with TransferFromFailed error using same pattern as original
   assembly {
       mstore(0x00, 0x7939f424) // TransferFromFailed()
       revert(0x1c, 0x04)
   }
}

This is quite close, but Concord finds a counter example:

There were 1 call(s) prior to this event:
   External call to 0xffffffffffffffffffffffffffffffffffffffff with eth: 0
    The calldata buffer was:
23b872dd00000000000000000000000000000000000000000000ffffffffffffffffffffffffffffffffffffffff00000000000000000000000000000000
     The callee codesize was chosen as: 1
     The call result was:
         A successful return
         With a buffer of length: 4294967295
         The returned buffer model is:
             0000000000000000000000000000000000000000000000000000000000000002... missing 4294967263 more bytes
The methods performed different actions.
In SafeTransferFromHarness.safeTransferFromExternal(address,address,address,uint256):
    ! The call reverted
    Event Context:
        No additional context information
    The raw buffer used in the event was:
        7939f424
In SafeTransferFromRewriteHarness.safeTransferFromExternal(address,address,address,uint256):
    ! The call reverted
    Event Context:
        No additional context information
    The raw buffer used in the event was:
        ! Empty

Interpreting the low-level results here, we can see that Concord is considering a case where the `safeTransferFrom` function being called returns an enormous 4294967295 byte buffer. The original code reverts with 7939f424 (AKA TransferFromFailed()) whereas the proposed rewrite reverts with an empty buffer. The enormous buffer size is not the source of divergent behavior. Instead, notice that the first word in this buffer is 0x0…02. This is not a valid encoding of a boolean, and thus the call to `abi.decode` in the proposed rewrite will revert with an empty buffer. The above counter-example is sent back to Claude for refinement, and it eventually converges to a correct rewrite.

Installing Concordance

Concordance is a Python script that is included in the open-source Certora repository. You can find the README describing how to install dependencies and run the tool here. As mentioned, Concordance is an open source tool you can try today, although we stress it is very much in pre-alpha development. Of course, we welcome any feedback you may have if you try it out yourself.

The Takeaway

As LLMs get more powerful, we can apply them to more and more complex software engineering tasks. However, as has been documented widely, LLMs are not perfect and can make subtle (and sometimes not so subtle) mistakes. By using formal verification tools as guardrails for LLMs, we can benefit from the automation and ease of AI while still getting the bulletproof assurance of formal verification. We’re also exploring other possibilities in this same vein which we’ll hopefully be able to share soon. Stay tuned!