Kademlia full bucket node eviction logic #5970
Replies: 2 comments 1 reply
-
I found this after searching for "ping": rust-libp2p/protocols/kad/src/handler.rs, lines 883 to 887 in a898a04. The line was added in #3152.
-
rust-libp2p/protocols/kad/src/kbucket/bucket.rs, lines 310 to 365 in 9f85d5c. From glancing at the code, IMO seeing the buckets as an LRU cache may be misleading. It is important to prune peers from a bucket as soon as they go offline, because holding unreachable peers in the routing table hurts the network. However, it isn't practical to keep pinging peers too often. The best strategy is probably to periodically ping the peers in the buckets and remove them when they don't answer. And for buckets that are not full, run [truncated]. Hence the replacement strategy isn't exactly LRU, but rather "keep peers as long as they are online": no peer should be replaced unless it is offline.
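That policy could be sketched as a periodic maintenance pass over a bucket. This is a hypothetical helper for illustration, not the rust-libp2p API; the `ping` closure stands in for a real liveness probe:

```rust
/// Sketch of the proposal above: periodically probe every peer in a bucket
/// and drop the ones that don't answer, so a peer is only ever evicted
/// because it is offline. (Hypothetical helper, not rust-libp2p code.)
fn prune_offline(bucket: &mut Vec<u64>, ping: impl Fn(u64) -> bool) {
    // Keep exactly the peers that answered the liveness probe.
    bucket.retain(|&peer| ping(peer));
}

fn main() {
    let mut bucket = vec![1, 2, 3, 4];
    // Pretend only even-numbered peers answer the ping.
    prune_offline(&mut bucket, |peer| peer % 2 == 0);
    assert_eq!(bucket, vec![2, 4]);
    println!("{:?}", bucket);
}
```

Under this policy a full bucket never churns at all while its members stay reachable, which matches the "no peer should be replaced unless it is offline" rule.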
-
Hey everyone,
I've been reviewing the Kademlia implementation in rust-libp2p, specifically how it handles the eviction of nodes from full buckets when a new pending node is introduced.
According to the Kademlia specification, existing nodes in a bucket should always be favored. When a new node attempts to join a full bucket, the least recently used (LRU) node should only be replaced if it is found to be unresponsive. This is typically done via a PING check. The original Kademlia paper also suggests an optimization where nodes are not pinged immediately but only when the application actually needs to send a message to the LRU node.
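The spec behavior described above can be sketched as follows. The `KBucket` type and `NodeId` alias are hypothetical illustrations, and the `ping` closure stands in for the PING RPC:

```rust
use std::collections::VecDeque;

/// Hypothetical node ID type for illustration.
type NodeId = u64;

/// A k-bucket holding at most `k` peers; least recently seen at the front.
struct KBucket {
    k: usize,
    peers: VecDeque<NodeId>,
}

impl KBucket {
    fn new(k: usize) -> Self {
        KBucket { k, peers: VecDeque::new() }
    }

    /// Classic Kademlia insertion: existing peers are always favored.
    /// `ping` is a caller-supplied liveness probe (a stand-in for PING).
    fn insert(&mut self, new_peer: NodeId, ping: impl Fn(NodeId) -> bool) {
        // Already known: refresh to the most-recently-seen position.
        if let Some(pos) = self.peers.iter().position(|&p| p == new_peer) {
            let p = self.peers.remove(pos).unwrap();
            self.peers.push_back(p);
            return;
        }
        if self.peers.len() < self.k {
            self.peers.push_back(new_peer);
            return;
        }
        // Bucket full: ping the least-recently-seen peer before deciding.
        let lru = *self.peers.front().unwrap();
        self.peers.pop_front();
        if ping(lru) {
            // LRU is alive: keep it (now most recent), discard the new peer.
            self.peers.push_back(lru);
        } else {
            // LRU is unresponsive: evict it and admit the new peer.
            self.peers.push_back(new_peer);
        }
    }
}

fn main() {
    let mut bucket = KBucket::new(2);
    bucket.insert(1, |_| true);
    bucket.insert(2, |_| true);
    // Full bucket, LRU (peer 1) answers the ping: peer 3 is discarded.
    bucket.insert(3, |_| true);
    assert_eq!(bucket.peers, [2, 1]);
    // Full bucket, LRU (peer 2) fails the ping: it is evicted for peer 4.
    bucket.insert(4, |_| false);
    assert_eq!(bucket.peers, [1, 4]);
    println!("{:?}", bucket.peers);
}
```

The key property is that a responsive LRU node can never be displaced; the delayed-ping optimization from the paper only changes *when* the probe happens, not this invariant.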
However, looking at the implementation here,
it seems that a pending node will almost always replace the LRU node within a full bucket. Instead of immediately pinging the LRU node to check for liveness, libp2p defers this check until a natural reconnection occurs. Meanwhile, the pending node is assigned a timeout (defaults to 60s). When this timeout expires, the pending node will replace the current LRU node—unless that LRU node is still connected, in which case the pending node is discarded.
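The deferred behavior described above could be sketched like this. All types and field names here are hypothetical illustrations of the mechanism, not the actual rust-libp2p internals:

```rust
use std::time::{Duration, Instant};

#[derive(Clone, Copy, PartialEq, Debug)]
enum Status { Connected, Disconnected }

struct Peer { id: u64, status: Status }

struct Pending { peer: Peer, deadline: Instant }

/// Hypothetical bucket modeling the pending-node mechanism.
struct Bucket {
    k: usize,
    peers: Vec<Peer>,      // index 0 = least recently seen
    pending: Option<Pending>,
    timeout: Duration,     // the pending timeout (60 s by default)
}

impl Bucket {
    /// Returns true if the peer was inserted directly.
    fn try_insert(&mut self, peer: Peer, now: Instant) -> bool {
        if self.peers.len() < self.k {
            self.peers.push(peer);
            return true;
        }
        // Full bucket: park the new node as pending; no ping is sent.
        if self.pending.is_none() {
            self.pending = Some(Pending { peer, deadline: now + self.timeout });
        }
        false
    }

    /// Called periodically: once the timeout expires, the pending node
    /// replaces the LRU peer unless that peer is currently connected.
    fn apply_pending(&mut self, now: Instant) {
        if let Some(p) = self.pending.take() {
            if now < p.deadline {
                self.pending = Some(p); // not due yet
            } else if self.peers[0].status == Status::Disconnected {
                self.peers.remove(0);    // evict the LRU peer
                self.peers.push(p.peer); // admit the pending node
            }
            // else: LRU still connected, the pending node is dropped.
        }
    }
}

fn main() {
    let now = Instant::now();
    let mut b = Bucket {
        k: 1,
        peers: vec![Peer { id: 1, status: Status::Disconnected }],
        pending: None,
        timeout: Duration::from_secs(60),
    };
    // Bucket is full, so peer 2 becomes pending instead of being inserted.
    assert!(!b.try_insert(Peer { id: 2, status: Status::Connected }, now));
    // Before the timeout, nothing changes.
    b.apply_pending(now);
    assert_eq!(b.peers[0].id, 1);
    // After the timeout, the disconnected LRU peer is replaced.
    b.apply_pending(now + Duration::from_secs(61));
    assert_eq!(b.peers[0].id, 2);
    println!("LRU replaced by pending node");
}
```

Note how the only thing protecting the LRU node in this model is its *current connection state* at the moment the timeout fires, rather than a ping-based liveness check.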
So does this mean that, unless all nodes in a full bucket stay connected at all times, a pending node will almost always replace the LRU node after the timeout?
I see two potential issues with this approach:
Does anyone have insights on the reasoning behind this design choice?
Thank you!