I noticed core-rs stores state tree prefixes in hexadecimal serialization both on-chain and in the database. So, instead of storing hex path 0x0000…, it would store 0x30303030…
While it is a simpler way of storing a sequence of nibbles, it is obviously not as efficient as it could be. For example, the in-memory representation of “AddressNibbles” uses contiguous sequences of nibbles with an 8-bit length indicator. Ethereum goes even further and uses a 4-bit odd/even indicator.
I’m going to explore a few ways how we can improve efficiency, in this post.
If time allows it, I’ll write up some proof-of-concept Go code.
A more efficient encoding for the database
I can imagine historical reasons for storing prefixes with hex encoding. Perhaps browser engines imposed restrictions on core-js that only ASCII-compatible keys can be stored in the database. Since all the database code now directly depends on LMDB, I don’t think this is a problem anymore.
Whether changing the key encoding on Nimiq 1.0 nodes is worth doing is debatable. core-rs nodes have hit production usage already and such a change would require a full re-sync or complex upgrading code.
However, for Nimiq 2.0 (core-rs-albatross) it seems to make sense. There is no blockchain to lose and it’s in heavy development anyways. Cutting the key size in half could allow for some good performance gains.
A comment by @paberr in the core-rs database code supports this, by the way 
impl AsDatabaseBytes for AddressNibbles {
fn as_database_bytes(&self) -> Cow<[u8]> {
// TODO: Improve AddressNibbles, so that no serialization is needed.
let v = Serialize::serialize_to_vec(&self);
Cow::Owned(v)
}
}
Here are a few encoding suggestions.
Prefixing a 4-bit odd/even indicator
- The first nibble (“prefix”) is the indicator.
- Prefix
0b0000/0x0means an odd number of nibbles follows, - Prefix
0b1000/0x8means an even number of nibbles follows. - A single
0x0nibble is used as padding at the end.
Example encodings:
-
[]=>80 -
[1]=>01 -
[8, 4]=>8840 -
[8, 8, 4]=>0884
Converting between the compressed form and the nibble list is fairly simple,
and this encoding is space-efficient. However, it would create an unnatural
split on the key index, and make in-order iteration impossible.
Suffixing a 4-bit odd/even indicator
- The last nibble (“suffix”) is the indicator.
- Suffix
0b0000/0x0means an odd number of nibbles precedes, - Suffix
0b11111111/0xFFmeans an even number of nibbles precedes.
Example encodings:
-
[]=>FF -
[1]=>10 -
[8, 4]=>84FF -
[8, 8, 4]=>8840
Conversion is slightly more complicated here in this case.
I like this one the most.
Prefixing the 8-bit length
This encoding is a bit more exotic, but it would allow for efficiently iterating the list of accounts/leaves in-order, without touching branches (select by prefix 0x28...).
It’s probably also less efficient that the other proposed encodings.
Changing the “on-chain” tree
Changing the accounts tree itself is probably not productive.
However it might decrease the amount of data sent over the wire when doing a full light sync.

I’m going to implement one of the aforementioned serialization schemes in core-rs and run some benchmarks.