BIP138: Compact Encryption Scheme for Non-seed Wallet Data #1951

Open
pythcoiner wants to merge 12 commits into bitcoin:master from pythcoiner:encrypted_descriptor

@pythcoiner pythcoiner commented Sep 4, 2025

Contributor

This is a BIP for encrypted backup: an encryption scheme for Bitcoin wallet-related metadata.

Mailing list post: https://groups.google.com/g/bitcoindev/c/5NgJbpVDgEc

@pythcoiner pythcoiner marked this pull request as draft September 4, 2025 06:47
Comment thread bip-encrypted-backup.md Outdated
Comment thread bip-0138.md
@pythcoiner

Contributor Author

Thanks for the review! I will address the comments tomorrow.

@Sjors

Sjors commented Sep 4, 2025

Member

Open questions

  • Deterministic nonce: Currently the nonce is generated randomly. Is it safe to produce a deterministic nonce, e.g. hash("NONCE" || plaintext || key_1 || … || key_n), or are there known security concerns with this approach?

In general nonce reuse is unsafe because if you make multiple backups over time, e.g. as you add more transaction labels, you would be reusing the nonce with a different message. Including the plaintext in the nonce does address that concern.

However it still seems unwise to mess with cryptographic standards. It doesn't seem worth the risk for saving 32 bytes on something that's going to be at least a few hundred bytes for a typical multisig.
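
For illustration only, the deterministic construction from the open question could be sketched as below. The hash function (SHA-256), the function name `deterministic_nonce`, and the assumption that the caller passes the keys in a canonical order are not taken from the BIP text. The digest is returned in full (32 bytes); the nonce length the cipher actually needs is not stated in this thread.

```python
import hashlib


def deterministic_nonce(plaintext: bytes, keys: list[bytes]) -> bytes:
    """Sketch of hash("NONCE" || plaintext || key_1 || ... || key_n).

    Assumptions (not from the BIP): SHA-256 as the hash, and keys
    supplied by the caller in a canonical order.
    """
    h = hashlib.sha256()
    h.update(b"NONCE")
    h.update(plaintext)
    for key in keys:
        h.update(key)
    return h.digest()


keys = [b"\x02" * 33, b"\x03" * 33]
first = deterministic_nonce(b"backup v1", keys)
second = deterministic_nonce(b"backup v2", keys)
# A changed plaintext yields a different nonce, which is the property
# discussed above; identical inputs always yield the same nonce.
print(first != second)
```

Because the nonce depends on the plaintext, re-encrypting an updated backup does not reuse the previous nonce, but the construction remains a departure from the usual random-nonce practice.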

@shocknet-justin

Concept ACK. This seems adjacent to how some Lightning tools enable users to recover SCBs using just their seed to identify and decrypt the backup. It makes sense for descriptors to have something similar.

@pythcoiner pythcoiner force-pushed the encrypted_descriptor branch 7 times, most recently from 1e4ca34 to 3b6b6ad Compare September 5, 2025 06:30
@Sjors

Sjors commented Sep 5, 2025

Member

Concept ACK

@pythcoiner

Contributor Author

(not yet finished addressing comments)

Comment thread bip-encrypted-backup.md Outdated
@KeysSoze

KeysSoze commented Sep 9, 2025


Hi @pythcoiner,

By coincidence, two weeks ago I started working on a proposal for a "Standard Encrypted Wallet Payload" to be placed inside an "Encrypted Envelope". The "Wallet Payload" contains descriptors and metadata but can also act as a full wallet backup including transactions, UTXOs and addresses. The proposal is very much a work in progress.

I only just found this discussion so am reading through it to compare it to my proposal. The descriptor backup in the "Wallet Payload" of my proposal seems to have some overlap with the BIP proposed here. If there is too much overlap I may reconsider progressing with my proposal.

As mentioned, my proposal is very much a work in progress but the wallet payload proposal can be found here:

https://gist.github.com/KeysSoze/7109a7f0455897b1930f851bde6337e3

Maybe jump to the test vector section to see what a basic backup of a descriptor and some metadata would look like prior to encryption.

https://gist.github.com/KeysSoze/7109a7f0455897b1930f851bde6337e3#test-vectors

As my proposal is designed to be modular and extensible the encryption envelopes may be extended to offer Multiparty Encryption and Authentication. See:

https://gist.github.com/KeysSoze/7109a7f0455897b1930f851bde6337e3#user-content-Expanding_the_Security_Model

I have already started documenting an encryption envelope that uses AES-256-GCM and password protection:

https://gist.github.com/KeysSoze/866d009ccd082edf6802df240154b20d

I have not written a reference implementation yet, but there are well-established Python and Rust libraries for CBOR and COSE that should make implementing the BIPs relatively simple.

@pythcoiner

pythcoiner commented Sep 13, 2025

Contributor Author

> Hi @pythcoiner,
>
> By coincidence, two weeks ago I started working on a proposal for a "Standard Encrypted Wallet Payload" to be placed inside an "Encrypted Envelope". The "Wallet Payload" contains descriptors and metadata but can also act as a full wallet backup including transactions, UTXOs and addresses. The proposal is very much a work in progress.

Hi @KeysSoze, this work seems more related/parallel to the wallet_backup specs I worked on a few months ago.

But I've adopted a slightly different approach by simply using JSON.

FYI we already implemented this wallet backup format in Liana wallet and I plan to work on a BIP proposal relatively soon.

@pythcoiner

Contributor Author

> I think we should have this BIP specify how descriptors are stored. They are the main use case for the encryption scheme and they're essential to it.

I'm OK with that. If we do it for BIP-0380, we should also define how BIP-0388 is encoded, since in almost all cases the resulting backup should be smaller than with 380.

@Sjors

Sjors commented May 29, 2026

Member

Supporting BIP 388 makes sense. My first thought would be to have it refer to the BIP 380 rules, then add a few extra fields in the JSON / extra lines in the text representation.

Since Bitcoin Core doesn't support BIP388 (yet?), I probably won't implement that part in my C++ proof-of-concept.

@pythcoiner

pythcoiner commented May 29, 2026

Contributor Author

> I find the current BIP text unclear about how multiple data blobs are supposed to be encoded.

Hmm, I initially did not plan to allow several blobs, in order to keep a clear boundary between the decryption logic and the content parsing. In which cases do you think it would be better to allow several blobs?

@pythcoiner

Contributor Author

> The term CONTENT is also too ambiguous.

I agree, it should be CONTENT_KIND or CONTENT_TYPE.

@Sjors

Sjors commented Jun 1, 2026

Member

There should be one encrypted blob, but the plaintext should be able to contain multiple pieces of data.

@murchandamus

Member

Hey @pythcoiner, how is this coming along? Anything we can help with? I didn’t expect a number assignment to derail progress. ;)

@pythcoiner

Contributor Author

> Hey @pythcoiner, how is this coming along? Anything we can help with? I didn’t expect a number assignment to derail progress. ;)

@murchandamus I've been underwater for the past few weeks (and still am). I've started looking into the padding and have a draft implementation that I want to finalize; I will also address Sjors' suggestion before updating the test vectors.

I'll try to push this forward in the coming week

@pythcoiner

Contributor Author

update:

I'll try to re-review myself during the weekend with a fresh brain

@jonatack jonatack removed the PR Author action required Needs updates, has unaddressed review comments, or is otherwise waiting for PR author label Jun 18, 2026
@Sjors

Sjors commented Jun 19, 2026

Member

Thanks for the updates!

I think all I need now, is something like this (with better terminology):

diff --git a/bip-0138.md b/bip-0138.md
index 2aa276b..db6b783 100644
--- a/bip-0138.md
+++ b/bip-0138.md
@@ -267,5 +267,5 @@ Implementations MUST reject empty payloads.
 defined in `ENCRYPTION` where `PAYLOAD` is encoded following this format:

-`CONTENT` `LENGTH` `PLAINTEXT` (`PADDING`)
+`CONTENT` `LENGTH` `PLAINTEXT` (`CONTENT` `LENGTH` `PLAINTEXT` ... ) (`PADDING`)

 `LENGTH`: variable-length integer representing the length of `PLAINTEXT` in bytes. It MUST
@@ -274,8 +274,6 @@ be present.
 `PLAINTEXT`: the `LENGTH` bytes of payload data.

-`PADDING`: OPTIONAL bytes after `PLAINTEXT`, up to the end of the decrypted `PAYLOAD`.
-Parsers MUST consume exactly `LENGTH` bytes of `PLAINTEXT` and MUST ignore everything after
-it. These bytes are reserved for size padding (see Padding) and/or vendor-specific data,
-the same way trailing bytes after `CIPHERTEXT` are reserved and ignored.
+`PADDING`: OPTIONAL bytes after the final `PLAINTEXT`, up to the end of the decrypted
+`PAYLOAD`. Parsers MUST consume exactly `LENGTH` bytes of each `PLAINTEXT`.

 #### Padding
@@ -308,12 +306,11 @@ All variable-length integers are encoded as
 #### Content

-`CONTENT` is a variable length field defining the type of `PLAINTEXT` being encrypted,
-it follows this format:
+`CONTENT` is a variable length field defining the type of the following `PLAINTEXT`.
+It follows this format:

 `TYPE` (`LENGTH`) `DATA`

-`CONTENT` is a single `TYPE (LENGTH) DATA` triple, one blob describing `PLAINTEXT`.
-It is not a sequence of entries; it is immediately followed by the payload `LENGTH`
-and `PLAINTEXT`.
+Each `CONTENT` field is a single `TYPE (LENGTH) DATA` triple, one blob describing the
+`PLAINTEXT` item immediately following it.

 `TYPE`: 1-byte unsigned integer identifying how to interpret `DATA`.
@@ -341,6 +338,6 @@ the remaining payload bytes.

 For an unknown `TYPE` less than `0x80`, parsers MUST consume its `LENGTH` bytes of
-`DATA`, treat the content type as unknown, and continue with the payload `LENGTH`
-and `PLAINTEXT`.
+`DATA`, treat the content type as unknown, consume the following payload `LENGTH` and
+`PLAINTEXT`, and continue.

 For an unknown `TYPE` greater than or equal to `0x80`, parsers MUST reject the

I think the test vectors lack coverage for rejecting an all-zero nonce.

I updated Sjors/bitcoin#109 with the latest changes, plus my remaining wish list item.

Though I also want to take a closer look at my implementation to see if it reveals any issues with the proposal.

@jonatack jonatack added the PR Author action required Needs updates, has unaddressed review comments, or is otherwise waiting for PR author label Jun 19, 2026
@pythcoiner

Contributor Author

> I think all I need now, is something like this (with better terminology):

Hmm, I'm not strongly against that, but I don't see a use case in the current context:

  • of the 4 payloads quoted in the spec (BIPs 380/388/329/139), I don't see any combination that is really valuable, as 139 is to me a simple way to wrap everything that you can get with 380+329 or 388+329
  • I agree it could be useful in the future, but if it's intended to be used to encrypt a future BIP, I just think the BIP can then define how to bundle

In the end, I'm happy to add it if there are concrete examples (I may have missed an obvious use case, by the way).

@Sjors

Sjors commented Jun 22, 2026

Member

I prefer to keep each piece of content simple, so e.g. I would have one blob with the BIP388 policy (or BIP380 descriptors), one blob with BIP329 transaction labels, and maybe a partially signed transaction for a recovery scenario.

By allowing multiple pieces of content (in a single encrypted blob), it's easy to expand later.

I don't believe in the approach of BIP129 of trying to define a format to store everything in a single JSON blob (not opposed to it either).

> if it's intended to be used to encrypt a future BIP, I just think the BIP can then define how to bundle

I don't think you can add support for multiple pieces of content in a single encrypted blob later, you'd have to break compatibility.

@pythcoiner

Contributor Author

@Sjors I've added what you suggested

Comment thread bip-0138.md

| Value | Definition |
|:-------|:---------------------------------------|
| 0x00 | Reserved |

Member


Claude suggested the following to make padding work with multiple items:

 | Value  | Definition                             |
 |:-------|:---------------------------------------|
-| 0x00   | Reserved                               |
+| 0x00   | End of content items; padding follows  |
 | 0x01   | BIP Number (big-endian uint16)         |
 | 0x02   | Vendor-Specific Opaque Tag             |

(and elsewhere don't say 0x00: parsers MUST reject the payload.)

Contributor Author


That's also how I've implemented it; I'll update the spec.

Sjors added a commit to Sjors/bitcoin that referenced this pull request Jun 30, 2026
When walking the CONTENT/LENGTH/PLAINTEXT items of a decrypted payload,
stop at the first 0x00 TYPE byte and treat the remaining bytes as padding,
rather than rejecting the payload. This lets encoders zero-fill a payload up
to a padding bucket (BIP138 "Padding") without breaking decryption.

This is kept as a separate commit because the BIP138 text is not yet
unambiguous: the content-type table still lists 0x00 as "reject", while the
payload section describes the first 0x00 as the end of the item sequence.
See bitcoin/bips#1951 for the proposed clarification.
Other malformed content (e.g. an unknown TYPE >= 0x80) is still rejected.
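
The item walk described in this commit message could look roughly like the following Python sketch. It is not the code from Sjors/bitcoin#109. Several details are assumptions because the spec excerpt quoted in this thread is partial: variable-length integers are taken to be Bitcoin CompactSize, `TYPE` 0x01 is taken to carry a fixed 2-byte `DATA` with no `LENGTH`, every other `TYPE` below 0x80 is taken to carry a `LENGTH`, and the names `read_compact_size`, `take` and `walk_items` are hypothetical.

```python
def read_compact_size(buf: bytes, pos: int) -> tuple[int, int]:
    """Read an (assumed) CompactSize integer; return (value, new_pos)."""
    if pos >= len(buf):
        raise ValueError("truncated varint")
    first = buf[pos]
    if first < 0xFD:
        return first, pos + 1
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    end = pos + 1 + width
    if end > len(buf):
        raise ValueError("truncated varint")
    return int.from_bytes(buf[pos + 1:end], "little"), end


def take(buf: bytes, pos: int, n: int) -> tuple[bytes, int]:
    """Return exactly n bytes starting at pos, or fail if they are missing."""
    end = pos + n
    if end > len(buf):
        raise ValueError("truncated field")
    return buf[pos:end], end


def walk_items(payload: bytes) -> list[tuple[int, bytes, bytes]]:
    """Walk CONTENT/LENGTH/PLAINTEXT items; stop at the first 0x00 TYPE."""
    items = []
    pos = 0
    while pos < len(payload):
        content_type = payload[pos]
        pos += 1
        if content_type == 0x00:
            break  # everything after this byte is treated as padding
        if content_type >= 0x80:
            raise ValueError("unknown TYPE >= 0x80")
        if content_type == 0x01:
            # Assumed: BIP number is a fixed big-endian uint16, no LENGTH.
            data, pos = take(payload, pos, 2)
        else:
            # Known or unknown TYPE < 0x80: consume LENGTH bytes of DATA.
            n, pos = read_compact_size(payload, pos)
            data, pos = take(payload, pos, n)
        n, pos = read_compact_size(payload, pos)
        plaintext, pos = take(payload, pos, n)
        items.append((content_type, data, plaintext))
    return items


# Two items (BIP 380 = 0x017c, BIP 329 = 0x0149) followed by zero padding.
payload = b"\x01\x01\x7c\x04desc" + b"\x01\x01\x49\x02{}" + b"\x00" * 5
print(walk_items(payload))
```

With this approach an encoder can zero-fill the payload up to a padding bucket, and the parser never needs to know the padded length in advance.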
@Sjors

Sjors commented Jun 30, 2026

Member

I updated Sjors/bitcoin#109 with your changes in 56deb75; pretty much the same code, since I had already anticipated the change.

I added a commit that implements my suggested change in how to handle padding in #1951 (review), but I'm happy to switch to a different approach.

@pythcoiner

Contributor Author

Something I have in mind is to optionally pad the list of individual secrets with decoy entries: random values indistinguishable from a real individual secret. The padding could round the count up to fixed buckets, e.g. start at 5 and double on overflow (5 -> 10 -> 20).
@Sjors @bigspider
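
A minimal sketch of that bucket rule, assuming 32-byte individual secrets and assuming the padded list is sorted so that position does not reveal which entries are real (neither detail is stated in this thread; `bucket_size` and `pad_with_decoys` are hypothetical names):

```python
import secrets


def bucket_size(count: int, start: int = 5) -> int:
    """Round count up to the next bucket: start, 2*start, 4*start, ..."""
    size = start
    while size < count:
        size *= 2
    return size


def pad_with_decoys(individual_secrets: list[bytes], secret_len: int = 32) -> list[bytes]:
    """Append random decoys until the list length reaches its bucket."""
    target = bucket_size(len(individual_secrets))
    missing = target - len(individual_secrets)
    decoys = [secrets.token_bytes(secret_len) for _ in range(missing)]
    # Sorting is an assumption made here to hide which entries were appended.
    return sorted(individual_secrets + decoys)


real = [bytes([i]) * 32 for i in range(6)]
padded = pad_with_decoys(real)
print(len(padded))  # 6 real entries are rounded up to the 10 bucket
```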

@pythcoiner

Contributor Author

On the same topic: as derivation paths are optional, maybe we should state in the spec that, for decryption, if common derivation paths are not present in the list, the implementation should iterate over the list of derivation paths for accounts 0-10.
@Sjors would you implement such a thing?

@Sjors

Sjors commented Jul 1, 2026

Member

> pad the list of individual secrets with decoy entries

I like that idea, assuming it doesn't add too much overhead relative to the descriptor size. Could round up to Fibonacci numbers >= 3, so most simple setups have a good anonymity set.
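
As a rough illustration of the Fibonacci variant (the function name `fibonacci_bucket` is hypothetical), the entry count would be rounded up to the next Fibonacci number that is at least 3:

```python
def fibonacci_bucket(count: int) -> int:
    """Smallest Fibonacci number >= max(count, 3): 3, 5, 8, 13, 21, ..."""
    a, b = 2, 3
    while b < count:
        a, b = b, a + b
    return b


for n in (1, 3, 4, 6, 9):
    print(n, "->", fibonacci_bucket(n))
```

Compared with doubling from 5, these buckets grow more slowly, so less decoy overhead is added for mid-sized setups, while any setup with up to three keys lands in the same bucket.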

> the implementation should iterate over the list of derivation paths for accounts 0-10

BIP87 mainnet account 0 seems like a good default to check (m/87h/0h/0h). You could expand that:

  • common derivation paths: m/44h, m/49h, m/84h, m/86h as well as m/48h
  • a couple of accounts

If the user is expecting testnet / signet, they should just specify that (m/.../1h).

If you have access to the seed then we can suggest a fairly wide range to scan. But when you need to fetch an xpub from an external device, and possibly deal with a permission prompt on that device, I would stick to m/87h/0h/0h as the default. And then perhaps suggest iterating over accounts first if nothing is found.
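
One possible reading of that search order, as a sketch: the default path first, then further accounts under the same purpose, and only with seed access the wider list of purposes. The function name, the `max_account` and `wide` parameters, and the account-level path shape used for every purpose are assumptions; in particular, m/48h paths normally carry an additional script-type level that is not modelled here.

```python
from typing import Iterator

# Purposes listed in the comment above, tried only when a wide scan is cheap.
OTHER_PURPOSES = (44, 49, 84, 86, 48)


def candidate_paths(max_account: int = 2, coin_type: int = 0, wide: bool = False) -> Iterator[str]:
    """Yield account-level derivation paths in the order they would be tried."""
    # Default: BIP87 account 0 on the given network (0 = mainnet, 1 = testnet/signet).
    yield f"m/87h/{coin_type}h/0h"
    # Next, iterate over further accounts under the same purpose.
    for account in range(1, max_account + 1):
        yield f"m/87h/{coin_type}h/{account}h"
    # With access to the seed, a wider range can be scanned.
    if wide:
        for purpose in OTHER_PURPOSES:
            for account in range(max_account + 1):
                yield f"m/{purpose}h/{coin_type}h/{account}h"


print(list(candidate_paths(max_account=1)))
```

When each candidate requires a round trip to an external signing device, keeping `wide` off limits the number of permission prompts the user may face.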

Labels

New BIP PR Author action required Needs updates, has unaddressed review comments, or is otherwise waiting for PR author
