Bitcoin 2.0.0 App For Ledger
No, this article is not about the next, better Bitcoin. With the successful Taproot upgrade that’s been active since block 709,632, Bitcoin just got better on its own.
With improvements that affect both the scalability of certain types of transactions and the privacy of the users transacting with them, Bitcoin is still king… or queen.
This is about the new Ledger Nano App for Bitcoin:
The very first versions of the Ledger Bitcoin app date back to 2013. Since then, Bitcoin’s been through two block reward halvings, the MtGox hack and collapse, the invention of the Lightning Network, the Block Size War that culminated with the SegWit soft-fork, the Taproot soft-fork, and the explosion of the broader cryptocurrency market. In these last eight years, the price went from less than 1,000 USD to over 60,000 USD, with two bear markets in-between.
Oh, and the Ledger Nano S didn’t even exist yet!
How things have changed! There are now tens of millions users of Bitcoin, which is now even legal tender in a sovereign country and an increasing number of companies and institutions now hold Bitcoin in their balance sheet. In order to reach a billion users, it is imperative that we make the technology both easier to use and more robust, and to do this we must give users the tools to be sovereign individuals, in control of their assets.
The Ledger Bitcoin application version 2.0.0 brings you a welcoming ₿ logo when you use it − but it’s not just a little UX redesign. It’s an entirely new application.
Let’s look a bit more in depth at the design principles of the new application, and some of the new features that it brings.
There are three core concepts that are at the foundation of the new Bitcoin application:
- Partially Signed Bitcoin Transactions (PSBT)
- Data Merkleization
- Wallet policies
The usage of Merkle trees and Merkle proofs allows us to design protocols that can work on large data structures that are not in the hardware wallet’s memory. Partially Signed Bitcoin Transactions (PSBTs) and Wallet policies (based on Output Script Descriptors) allow us to define, work with, and create workflows for wallets with complex policies that might involve multiple parties (like multisig wallets), and to stay interoperable with software and hardware tools from different vendors.
PSBT-based signing flow
What is a hardware wallet? Well, first of all: not a wallet. It would probably be more precise to call it a hardware signer, but it’s hard to change a wording that is so rooted into the cryptocurrency jargon.
It’s a device that cryptographically signs transactions after they have been verified and approved by the user, and its job is to make sure that the user cannot be tricked into signing something they don’t want to sign. When a user wants to make a transaction, the message looks like, “I own this UTXO (loosely speaking, some amount of coins) and I agree to send it as part of this transaction”.
During a transaction, a user needs to inspect and approve everything that matters about the transaction: the total amount spent, how much is sent to each of the outputs, how much is spent in transaction fees.
While the user inspects and validates this information, a lot more work has to be done by the hardware wallet: is all the necessary information to sign the transaction available and correct? Are all the inputs of the transaction under control of the wallet, or not? How about the outputs, which one is the change address?
As a mean to standardize the necessary information that a signer needs in order to sign a transaction, BIP-0174 introduced the Partially Signed Bitcoin Transaction (PSBT) standard, which is an interchange format that among other things, can be used in order to handle more complex scenarios where signatures from multiple parties are involved (e.g. multi signature wallets). BIP-0340 introduces PSBTv2, with a number of improvements that incidentally are extremely convenient for hardware wallets.
By adopting PSBTv2 as the native language spoken by the new Bitcoin app, we inherit its interoperability benefits. The new app does not ask the device to sign a transaction, but rather to sign a PSBTv2, where the client side is responsible for filling in all the necessary information. This helps us be adaptable and ready for the future, because no change in the application’s API is necessary to support a new transaction type: the PSBT format is highly extensible!
Merkleize all things
Transactions can be large, and PSBTs can be even larger.
The total RAM available on the Secure Element of your Ledger Nano S is 10 kB; the RAM actually available for running an application is less than 5 kB.
Working with such a constrained environment certainly presents some challenges. In fact, the legacy application had to resort to streaming transactions in small pieces while parsing it, in order to compute its transaction id (necessary step when signing legacy and SegWit transactions, for security reasons).
On the other hand, this approach wouldn’t work for a PSBT because the signing flow is a complex process, and it’s not realistic to read the entire PSBT just once from beginning to end: random access is needed. We could just ask the client wallet software to provide the required data when needed, but that gives a compromised client the power to adaptively choose the data later, or to provide two different answers when asked the same question.
While it’s possible to design interactive protocols that are safe against adaptive adversaries, it is a lot harder to think about them, and it’s a notoriously difficult problem in cryptographic protocols to make sure that no subtle vulnerabilities are involved.
To avoid any kind of problem at the source, we make extensive use of Merkle trees. Whenever we are faced with a large collection of objects, we require the client to provide the root of the Merkle tree that commits to all the objects. That creates a sort of short summary (only 32 bytes) that is a cryptographic commitment to the arbitrarily large collection, while providing random access for the hardware wallet. When it wants to know the 7th element in the collection, the client replies not just with the element itself, but with a proof that it’s the correct element that was committed to in the first place. By validating such proofs, the hardware wallet is certain that the client is behaving correctly.
The details and process of this approach are quite technical, but in short, thanks to Merkle trees, the Bitcoin app can work on arbitrarily large objects that are never stored on the hardware wallet’s memory. This approach is key to ensure that we can keep the full generality of PSBT for present and future applications, and still make everything work in the limited memory of the user’s Ledger Nano S.
But what does it mean to sign a PSBT?
A PSBT contains information about:
- A transaction (which is unsigned, or not completely signed)
- Each of the transaction’s inputs
- Each of the transaction’s outputs
The information associated with each input and output allows, among other things, to verify if the inputs and outputs that are controlled by the wallet or not, since it contains information about the keys used to create those same inputs and outputs.
But who is signing the PSBT? Signing an input only makes sense once it’s known what type of script locks those coins, and which of the many possible keys that can be derived from the seed should be used. Standards like BIP‑44, BIP‑49, BIP‑84, and BIP‑86 were created in order to easily characterize the outputs that are logically part of the same wallet or account, and describe what scripts and keys should be used. There are many more possible combinations and a more general solution was necessary.
Output script descriptors make a giant step forward in solving this problem, by providing a language to describe sequences of scripts and the corresponding addresses.
For example, a Native Segwit wallet following BIP-84 is described by the following pair of descriptors:
This fully identifies a specific Native Segwit P2WPKH account. Once a policy is established, it’s easier to understand what signing a PSBT means: sign all the inputs that correspond to this policy. As long as the language to describe policies is expressive enough, this will work for all the standardized wallet types above, for multi-signature wallets, and potentially for any arbitrarily complex script policy. An example would be any policy expressible in miniscript.
On top of descriptors, we built a modified language (see documentation here) with the following main changes:
- The keys are stripped and replaced with placeholders and the set of keys are committed to using a Merkle tree for random access.
- The wildcard
*is removed, and replaced with a double wildcard that represents both the
1/*derivations, to represent in a compact way both the receive and change address descriptors.
Therefore, the wallet policy above would be simply represented as
@0 is the placeholder for the key
Compactness allows more intuitive human inspection, while splitting the keys allows the hardware wallet to work with potentially a very large number of keys, without having to keep them in its memory. Moreover, this model works better with policies where the same key might be present in multiple sub-policies, which is likely to happen in more complex scripts, and especially in tapscripts.
The advantage is more obvious for multi-signature wallets: a three-of-five native SegWit multi-signature wallet could have the policy template
wsh(sortedmulti(3,@0,@1,@2,@3,@4)), which is reasonably easy to inspect for a human.
Using multi-signature wallet policies
Standard single-signature policies can be used straight away. Any other policy such as policies with external keys or policies using non-standard paths, must be registered on the device first.
A policy is registered with a name, which is a short string that identifies the policy to the user. In the examples below, the policy name is “Cold storage”.
In order to keep the hardware wallet stateless, meaning that your seed is the only information that the device stores, the wallet registration feature is designed in such a way that the client is responsible for its storage. Yet, care is taken to ensure that a compromised client cannot alter information about a registered policy or otherwise trick the user into signing something they did not intend to sign.
Once properly integrated with desktop wallet software, these features allow for the first time the ability to create really secure multisignature setups with Ledger hardware wallets.
Wallet policy registration
During the registration flow, the user verifies the name of the policy, the policy itself, and each of the cosigners.
Once approved by the user, the Hardware Wallet returns a 32-byte string, a hmac-sha256, which can be used in any future call to provide proof that the policy was already approved.
This allows the ability to keep the hardware wallet stateless: it is the client’s responsibility to store the registered wallet’s metadata and the hmac, and provide it for future calls.
Moreover, it is the user’s responsibility to make sure that wallet policies are registered with different and easy to distinguish names. The hardware wallet cannot warn the user if different policies are registered under the same name!
Receive a transaction to a registered wallet
In order to use a registered wallet safely, it is crucial to be able to derive addresses for the wallet. The device can show the name of the registered wallet and the derived address on its secure screen:
Spending from a registered wallet
Spending from a registered wallet has a flow that is exactly the same as for usual single-signature transactions, except that the name of the registered wallet is shown first, and has to be explicitly approved by the user.
This approach will be generalized to more complex policies in future versions. We are excited about the potential of policy wallets for more secure and straightforward and advanced tools for self-custody.
The new application’s API is not compatible with the legacy application, and the legacy API is going to stop being supported. As an interim solution to avoid disruption, the Bitcoin app still supports all the legacy API unchanged. The downside is a noticeable impact on the application size, which is currently 64kB on a Nano S, versus 42kB for version 1.6.5.
What’s next for the Ledger Bitcoin App?
Integrating our multisignature support with bitcoin-core/HWI and with popular multi-signature software wallets is the next natural step.
In no particular order, these are some of the features we plan to work on in the next six months:
- Integrations and client libraries
- Taproot script spends (currently only BIP-86 key path accounts are supported)
- More general wallet policies/miniscript
- Better and simplified UX flows
- Support for SIGHASH flags
A great amount of effort has been put into the architecture of the new Bitcoin app. While the history of signatures and our user interface with transactions might feel complex, the future only brings more security and interface challenges, which is why it’s so important to build innovation on top of the progress that’s been made before us.
We plan to build the best possible application for Bitcoiners for the next decade to come and more! Onwards.
Your feedback is fundamental.
Here are some links to get more in-depth information about the new application:
- Technical specifications
- Merkle trees and how we use them
- Wallet policies
Come discuss with us: https://discord.com/invite/ledger.