-
@guillaumemichel maybe something you can answer?
-
Thanks for the heads up @elenaf9 @maqi, your summary seems correct; it is a valid concern. I am not sure why this design choice was adopted, but I agree that it is suboptimal.

About refreshing buckets
How the refresh process can be improved
-
Hey @guillaumemichel, that's some good feedback. I think the pre-split buckets approach of libp2p is good in some ways, but it does skew the algorithm to create some targeting-like behaviour which is deep and non-intuitive. Dragons and all that live there :-) However the approach seems reasonable. In traditional KAD the number of buckets is more correct in terms of network size, due to the splitting-buckets-when-full approach, and in some ways can be easier to reason with (i.e. most likely the close bucket is the non-full one and so on). Dead nodes aside.

A pre-image for search seems something to be careful about. If we search in a bucket for more nodes but use a precalculated fixed address, we are likely to get the nodes closest to that address as the result for nodes found "in that bucket" from our perspective. So using a computed pre-image of addresses to query for may give a more constant set of very close nodes in a bucket, if we are not careful, i.e. the address space of the knowledge we have about that part of the network is confined to a smaller, less distributed group than a random search in that bucket would give.

I think it is worth looking at using the bucket address (common leading bits to you) and then a random number to fill out the rest of the address, i.e. in bucket n-100 create a (255-100 = 155) bit random number and append that to the bucket address (rough sketch after this comment). In that way we can probe across the address range of the bucket each time, as opposed to a more constant target within the bucket. It would be safe to use the same random number (so create it once per pass) and use the common leading bits from each bucket we wish to test as the beginning of our target address, so that it's more efficient in CPU terms. That would be akin to trad KAD using the closest bucket as a target for a random find node. I think this is what you mean, but just being sure here.

The other issue is random nodes appearing in buckets further down the bucket list. Where in KAD they would all usually fit in the first non-full bucket, here they could appear at random in a low-numbered bucket (i.e. in a 20K network we find a node in bucket 90 and one in bucket 140 and then a couple in another low bucket, etc.). This gives a potentially sparse-bucket situation, which is another thing we should consider IMO. It would be smashing to have entirely even distributions, but I think we need to consider the case where they are not, especially with sybil-type attacks and more; sparse buckets do perhaps need some consideration. It's all pretty deep IMO, but interesting for sure.
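A rough sketch of that prefix-plus-random-tail idea, just to make it concrete (illustrative only, assuming 256-bit keys and buckets indexed by common prefix length; `random_bucket_target` is not the actual routing-table code):

```rust
use rand::RngCore;

// Rough sketch only: 256-bit keys as [u8; 32], buckets indexed by common
// prefix length (cpl). Not the real routing-table code.
//
// Build a target that shares the first `cpl` bits with `local_key`, differs at
// bit `cpl`, and is random in the remaining 255 - cpl bits, so each refresh
// probes a different point of that bucket's address range.
fn random_bucket_target(local_key: &[u8; 32], cpl: usize) -> [u8; 32] {
    assert!(cpl < 256);

    // Start fully random so the tail bits are uniformly distributed.
    let mut target = [0u8; 32];
    rand::thread_rng().fill_bytes(&mut target);

    // Copy the whole bytes of the shared prefix from the local key.
    let full = cpl / 8;
    target[..full].copy_from_slice(&local_key[..full]);

    // In the boundary byte: keep the remaining prefix bits from the local key,
    // force the bit at position `cpl` to differ, leave the rest random.
    let rem = cpl % 8;
    let prefix_mask: u8 = if rem == 0 { 0 } else { 0xff << (8 - rem) };
    let flip_bit: u8 = 0x80 >> rem;
    target[full] = (local_key[full] & prefix_mask)
        | (!local_key[full] & flip_bit)
        | (target[full] & !(prefix_mask | flip_bit));

    target
}
```

Reusing the same random tail across all buckets in one pass, as suggested above, would just mean generating the random bytes once and re-applying the prefix/flip step per bucket.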
-
Nice @guillaumemichel, appreciate the input.
I think perhaps it's worth checking whether nodes use the first found in the bucket (in the first beta replies etc.) as opposed to the closest? It would be more random/spread across that range, perhaps?
-
Rust-libp2p's Kademlia protocol periodically refreshes its routing table (RT) to maintain an updated view of the network.
The process involves generating random target keys for each bucket, attempting to ensure that each non-empty bucket is refreshed.
However, due to constraints in target generation and the probabilistic nature of hitting specific buckets, the actual distribution of refresh targets is skewed towards the farthest buckets:

- a uniformly random key lands in the bucket whose peers share exactly `b` leading bits with the local key with probability `2^-(b+1)`, so the probability of hitting that bucket within 16 attempts is only `1 - (1 - 2^-(b+1))^16`, which is vanishingly small for the closer (long shared prefix) buckets;
- if the bucket hasn't been hit after 16 attempts, the last generated target still gets used, and that target most likely falls into one of the farthest buckets.

The final generated targets therefore cluster in the farthest buckets, which end up over-queried unnecessarily, while the closer buckets are rarely actually hit.

Wondering if this has been considered previously, and why such a scheme was chosen?
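For concreteness, here is a simplified sketch of the scheme as described above (illustrative only, not the actual rust-libp2p implementation; the 32-byte keys, the helper names and the 16-attempt loop just follow the description):

```rust
use rand::RngCore;

// Simplified sketch of the refresh-target generation scheme as described above
// (not rust-libp2p's actual code): draw up to 16 random 256-bit keys; if none
// of them lands in the intended bucket, the last draw is used anyway.
fn refresh_target(local: &[u8; 32], target_cpl: usize) -> [u8; 32] {
    let mut rng = rand::thread_rng();
    let mut key = [0u8; 32];
    for _ in 0..16 {
        rng.fill_bytes(&mut key);
        if common_prefix_len(local, &key) == target_cpl {
            return key; // hit the intended bucket
        }
    }
    // Fallback after 16 misses: this key is uniformly random, so with high
    // probability it belongs to one of the farthest (short-prefix) buckets.
    key
}

// Number of leading bits shared by two 256-bit keys.
fn common_prefix_len(a: &[u8; 32], b: &[u8; 32]) -> usize {
    for (i, (x, y)) in a.iter().zip(b.iter()).enumerate() {
        let diff = x ^ y;
        if diff != 0 {
            return i * 8 + diff.leading_zeros() as usize;
        }
    }
    256
}
```

For close buckets (large `target_cpl`) the hit probability `1 - (1 - 2^-(target_cpl+1))^16` is tiny, so the fallback key is used almost every time, and being uniformly random it almost always belongs to one of the farthest buckets; hence the skew.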