DWDHT, an acronym for "dWeb DHT," is a protocol that forms a distributed hash table of dataset identifiers (dWeb network addresses) and the peers that are announcing them. Each database identifier and its announcing peers form what is referred to as a "swarm," which can be queried for connected peers or joined (announced) by a specified amount of peers. dWeb's DHT is literally how a particular dataset is discovered and downloaded from the peers that are openly seeding it (announcing it). Each peer who is announcing a dataset on a long-term basis, by design, becomes a DHT node and is responsible for storing a specific portion of dWeb's DHT. Put another way, each DHT node stores a specific portion of the dWeb's dataset identifiers, along with the peers who currently possess the data related to each of those data identifiers.
While each DHT node stores various swarms, it also stores information regarding other DHT nodes, identified by
node identifiers, which can be mathematically determined to possess information regarding a specific swarm. dWeb's DHT is based on the
Kademilia Distributed Hash Table popularized by BitTorrent and others. Nodes and data in dWeb's DHT are assigned 160-bit integers as IDs, while swarms are stored in the form of key-value pairs, where the key is a 32 byte value generated by a one-way hash function, such as SHA-1 (normally a dWeb network address), where the value is an object containing public and LAN-based peers that are announcing the key (the dataset identifier).
Node & Data Distribution
The DHT defines the distance between 2 DHT nodes
j by the bitwise exclusive OR operations
d(i,j) = i (XOR) j. This distance means for any given key
i and a distance
L > 0, there can only be a single key
j that satisfies
d(i,j) = L [ZHAN13].
All key-value pairs are stored on
k nodes whose UIDs are closest to the actual key.
k is an important parameter that helps determine data redundancy and how stable the DHT is at any time. Each node
i always maintains multiple
k-bucket stores a list of other DHT nodes, which are organized in an order that reflects the most recently active nodes. The node that is most recently active is stored at the tail, while the least active node is stored at the head.
The node whose distance from node
i is in the range of [pow(2,m), pow(2,m+1)] is stored in the
k-bucket (node that 0 <
m < 160). The nodes in the
k-buckets are regarded as the neighbors of the node
i. A dWeb DHT node dynamically updates its neighbors upon receiving any messages from them. This process can be better explained more specifically, when node
i receives a message from another DHT node
j, which is located in the
k-bucket, where the
k-bucket of node
i, will be updated in the following way:
j already exists in the
j to the tail of the list, as node
j is the most recently seen.
j is not in the
k-bucket and the bucket has fewer than
k nodes, node
j at the tail of the list.
-If the bucket is full,
i PINGs the node at the head of that particular
-If the head node responds, node
i makes it to the tail and ignores node
i removes the head node and inserts
j at the tail.
PING - Probes a DHT node to check whether it's online or not.
STORE - Used to store a key-value pair.
FIND_NODE - Finds a set of nodes that are closest to a given node.
FIND_VALUE - Operates like
FIND_NODE but returns a stored value.
These RPC-like primitives work in a recursive way, which improves the efficiency of dWeb's DHT. A
lookup procedure is initiated by the
FIND_VALUE primitives, where the lookup initiator chooses
B nodes from its closest
k-buckets and send many parallel
FIND_NODE requests to these
B nodes. If the node being searched for in a given iteration is not found, the lookup initiator resends the
FIND_NODE to the nodes that were found during the previous recursive operation and repeats this iterative functionality.
A KV pair may be stored on multiple DHT nodes. Thanks to the recursive procedure explained above, the key-value pair spreads across the DHT network every hour. This process insures that multiple replicas of data exist across the network. Every key-value is deleted 24 hours after it is initially pushed into the network.
Joining and Leaving
i joins dWeb's DHT, it is assumed that it is aware of or knows about node
j. The joining process consists of multiple steps as follows:
j into its
i starts a node lookup procedure for its own ID, where
i is made aware of other newer nodes.
i updates the
During this process, node
i strengthens its
k-buckets and inserts itself into other nodes'
k-buckets. When other nodes leave or fail, they DO NOT notify any other node. There is no need for a special procedure to cope with node departures, as these mechanisms insure that leaving nodes will be removed from the
Announcing a swarm means that you're either: announcing a dWeb network address that doesn't exist on dWeb's DHT and therefore become the first peer related to the address; or announcing a dWeb network address that does exist on dWeb's DHT and are added to a list of peers who are announcing the address.
A swarm entry in dWeb's DHT, takes on the following format:
If a peer is announcing a dWeb network address that matches a key in the DHT, the entry is mutated to include the peer's announced IP address and port. When announcing a dWeb network address, a peer must set their public port and public IP address during the announcement and can choose to set a LAN-based address and port as well, so that those on the same public IP who are performing a DHT lookup, can retrieve local announcers for a particular dWeb address. It's important to note that when announcing a dWeb network address on dWeb's DHT, the announcer becomes a DHT node on the network by design.
A swarm can be looked up by its dWeb network address, which is a 32 byte buffer (normally a hash of something). When performing a lookup, the querying peer is temporarily an announcing peer, but is removed as an announcing peer (unannounces), once the lookup is finalized.
A lookup for a particular dWeb key, returns the following data:
DHT Bootstrap Nodes
dWeb's network utilized various
bootstrap nodes to launch the initial dWeb network. These 3 bootstrap nodes are as follows:
These nodes going down would not affect the dWeb or a user's ability to announce or lookup dWeb network addresses, due to the way other DHT nodes on the network share data and information regarding each other. dWeb developers have the option of launching their own bootstrap nodes so that their apps have a point of entry into the dWeb's network of DHT nodes. Those nodes would initially use a set of already existing nodes to gain access to the DHT and would no longer need access to the nodes used for bootstrapping. Any node on the DHT can be used to bootstrap a new node.
DWDHT Reference Implementation
You can use this reference implementation to allow dWeb-based applications (desktop, mobile or web) to act as DHT nodes.
You can also launch your own dWeb DHT node via the command-line, by using the DHT CLI here.
dWeb Swarm API
The dWeb swarm API, also known as "Swarm Programming" is a high level API built on top of the DWDHT Protocol used for finding and connecting to peers of a particular swarm and interacting with the dWeb DHT.
It expands the default set of parameters when creating a DHT node, adds specific swarm events where callbacks can be used to react to specific swarm-based events and introduces several functions, like
connect, which make it easy to programmatically interact with the dWeb's DHT.
DHT Node Initiation Parameters
Using the dWeb Swarm API, a DHT node can be created with the
dwebswarm() function, using the following parameters:
bootstrap - An array of bootstrap servers used to initiate the dWeb DHT node.
ephemeral - A boolean that is set to
false if this is intended to be a long running DHT node.
maxPeers - The total amount of peers that the node initiator will connect to.
maxServerSockets - The number to restrict the number of server socket based peer connections. Set to
Infinity by default.
maxClientSockets - The number to restrict the number of client socket based peer connections. Set to
Infinity by default.
validatePeer - A function for applying filters before connecting to a peer.
queue - An object for configuring peer management behavior.
requeue - An array of backoff times in milliseconds every time a failing peer connection is retried.
forget - An object for when to forget certain peer characteristics and treat them as fresh connections again.
unresponsive - How long to wait before forgetting that a peer has become unresponsive.
banned - How long to wait before forgetting that a peer has been banned.
multiplex - Set to
true in order to reuse existing connections between peers across multiple dWeb network addresses.
Swarm Event Emitters
Using the dWeb Swarm API, applications can listen for the following events:
connection - A new connection has been created. Event should be handled by using the socket. This emits an
info object that describes the connection using the following details:
info object, the
info.ban() method can be called to ban the connected peer and will keep the DHT node from connecting to the peer in the future. The
info.backoff() method can be called to get the DHT node to backoff from connecting to the peer.
disconnection - A connection has been dropped. Emits an
info object that is identical to the
info object, describing the dropped connection.
peer - A new peer has been discovered on the network and has been queued for connection. Emits a
peer object, including the port, host referrer, and dWeb network address peer was discovered under. Also includes a boolean on whether the peer is a LAN-based peer.
peer-rejected - A peer has been rejected as a connection candidate. Emits a
peer object that is identical to the
peer object, describing the rejected peer.
updated - Emitted once a discovery cycle for a particular dWeb network address has completed. Emits a
key object that identifies the dWeb network address the discovery cycle is related to.
For more information on dWeb Swarm's events, take a look at the README.md file within the reference implementation of the dWeb Swarm API here.
Swarm API Functions
join - Join the swarm for a given
topic (dWeb network address). This will cause peer to be discovered for the topic (
topic (Buffer) - The dWeb network address to list under. Must be 32 bytes in length.
announce (Boolean) - List this peer under the topic as a connectable target. Defaults to
lookup (Boolean) - Look for peers in the topic and attempt to connect to them. If
announce is false, this automatically becomes true.
onion - A function that is called when your topic has been fully announced to the local network and the DHT.
connect() - Establish a connection to a given peer. You usually won't need to use this function, because the DHT node connects to discovered peers automatically. Accepts a
peer object (identical to the
peer object emitted by the
peer event) for the peer the function is to establish a connection with. Also has a callback function that emits a connection error (err), and the established TCP or UDP socket (socket) and details regarding the established connection (details).
leave - Leave the swarm for a given dWeb network address.
topic (Buffer) - The identifier of the peer-group to delist from.
onleave - A function that is called when your topic has been fully unannounced to the local network and the DHT.
For more in-depth examples on how to call these functions within dWeb-based applications, view the