Joining the cycle

This page was last reviewed at version 595. If the NodeManager class has been updated since that version, this page may not reflect the latest behavior. Please review the release notes to answer any questions.

When a Nyzo verifier is started, it asks other verifiers for information on the current state of the blockchain. The locations of these verifiers are specified in the file /var/lib/nyzo/production/trusted_entry_points. A copy of this file is included in the Git repository, and it contains entries for verifier0.nyzo.co through verifier9.nyzo.co. Since the deactivation of the official Nyzo verifiers, these names point to multiple arbitrarily selected in-cycle verifiers.

In order to be added to the NodeManager, which is necessary for eventual consideration in the lottery, a verifier must be able to send node-join messages from an IP address and receive node-join messages at the same address. Additionally, a verifier must have exclusive control over that IP address. If multiple verifiers broadcast from the same IP address, none of them will be entered into the lottery.

The following explanations describe the interactions between a single out-of-cycle verifier and a single in-cycle verifier. The end result of this process is addition to the lottery pool of that in-cycle verifier. Because this is a democratic system, and because the in-cycle verifier has only a single vote, the out-of-cycle verifier must repeat this process with the entire cycle in order to have an optimal probability of joining the cycle.

The first step for a new verifier is requesting the cycle from trusted entry points. Due to code changes over time, the following code refers to the "mesh," and that term is no longer accurate. "Mesh" is an early term that described the set of all in-cycle and out-of-cycle nodes, and it has fallen out of favor in our development as the separation between in-cycle nodes and candidate (out-of-cycle) nodes has become clearer. This code actually requests just the list of in-cycle nodes from the trusted entry points.

// Send mesh requests to all trusted entry points.
AtomicInteger numberOfMeshResponsesPending = new AtomicInteger(trustedEntryPoints.size());
for (TrustedEntryPoint entryPoint : trustedEntryPoints) {
    fetchMesh(entryPoint, numberOfMeshResponsesPending);
    sendNodeJoinMessage(entryPoint);
}

In the fetchMesh() method, a node-join message is enqueued for every node in the response. So, as long as one trusted entry point is aware of a node, this node will send a join message to it.

private static void fetchMesh(TrustedEntryPoint entryPoint, AtomicInteger numberOfMeshResponsesPending) {

    Message meshRequest = new Message(MessageType.MeshRequest15, null);
    Message.fetchTcp(entryPoint.getHost(), entryPoint.getPort(), meshRequest, new MessageCallback() {
        @Override
        public void responseReceived(Message message) {

            // Enqueue node-join requests for all nodes in the response.
            MeshResponse response = (MeshResponse) message.getContent();
            for (Node node : response.getMesh()) {
                NodeManager.enqueueNodeJoinMessage(node.getIpAddress(), node.getPortTcp());
            }

            numberOfMeshResponsesPending.decrementAndGet();
        }
    });
}

The NodeManager.enqueueNodeJoinMessage() method adds an entry to a map, keyed on the IP address of the receiver. This provides natural de-duplication when the same node is received in the MeshResponse from multiple trusted entry points.

public static void enqueueNodeJoinMessage(byte[] ipAddress, int port) {

    nodeJoinRequestQueue.put(ByteBuffer.wrap(ipAddress), port);
}
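The de-duplication works because the key is a ByteBuffer, which compares and hashes by content, whereas a raw byte[] key would be compared by identity. The following is a minimal sketch of that behavior, not the Nyzo implementation: the class name JoinQueueSketch, the choice of ConcurrentHashMap, the sample addresses, and the port value are all illustrative assumptions.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the node-join request queue; not the Nyzo class.
final class JoinQueueSketch {

    // IP address (wrapped so that equality is content-based) mapped to a TCP port.
    private static final Map<ByteBuffer, Integer> queue = new ConcurrentHashMap<>();

    static void enqueue(byte[] ipAddress, int port) {
        // A second put() for the same address replaces the entry instead of adding one.
        queue.put(ByteBuffer.wrap(ipAddress), port);
    }

    static int size() {
        return queue.size();
    }

    static void clear() {
        queue.clear();
    }

    public static void main(String[] args) {
        // The same address reported by two entry points, in two distinct arrays.
        enqueue(new byte[] { (byte) 203, 0, 113, 1 }, 9444);
        enqueue(new byte[] { (byte) 203, 0, 113, 1 }, 9444);
        enqueue(new byte[] { (byte) 203, 0, 113, 2 }, 9444);
        System.out.println("queued addresses: " + size());
    }
}
```

In this sketch, the two arrays holding the same four bytes collapse into a single queue entry, which is the behavior the paragraph above describes for nodes reported by multiple trusted entry points.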

After the responses are received from the trusted entry points, this node sends node-join requests to all nodes in the node-join request queue.

// Instruct the node manager to send the node-join messages. The queue is based on IP address, so deduping
// naturally occurs and only one request is typically sent to each node at this point. The -1 value tells
// the node manager to empty the queue.
NodeManager.sendNodeJoinRequests(-1);

As we can see here, in its initialization process, a verifier should send node-join requests to every verifier in the cycle. This behavior applies to both in-cycle and out-of-cycle verifiers.

After initial clearing of the node-join queue, the verifier initializes the local blockchain state. It then clears the node-join request queue a second time in case some delayed responses from trusted entry points have caused additional entries to be added.

ChainInitializationManager.initializeFrozenEdge(trustedEntryPoints);

// In order to process efficiently, we need to be well-connected to the cycle. If there are slow-downs that
// have prevented connection to this point, they should be addressed before entering the main verifier loop.
// We set 75% of the current cycle as a threshold, as it is the minimum required for automatic consensus.
NodeManager.sendNodeJoinRequests(-1);
NodeManager.updateActiveVerifiersAndRemoveOldNodes();

A verifier should have sufficient connection to the cycle at this point. If it does not, a supplemental connection process continues to fetch the cycle from trusted entry points and send node-join requests to the nodes in the responses.

// In order to process efficiently, we need to be well-connected to the cycle. If there are slow-downs that
// have prevented connection to this point, they should be addressed before entering the main verifier loop.
// We set 75% of the current cycle as a threshold, as it is the minimum required for automatic consensus.
NodeManager.sendNodeJoinRequests(-1);
NodeManager.updateActiveVerifiersAndRemoveOldNodes();
int meshRequestIndex = 0;
while (NodeManager.getNumberOfActiveCycleIdentifiers() < BlockManager.currentCycleLength() * 3 / 4) {
    System.out.println(String.format("entering supplemental connection process because only %d in-cycle " +
                    "connections have been made for a cycle size of %d (%.1f%%)",
            NodeManager.getNumberOfActiveCycleIdentifiers(), BlockManager.currentCycleLength(),
            NodeManager.getNumberOfActiveCycleIdentifiers() * 100.0 / BlockManager.currentCycleLength()));
    System.out.println("missing in-cycle verifiers: " + NodeManager.getMissingInCycleVerifiers());

    // Fetch the mesh from one trusted entry point.
    numberOfMeshResponsesPending = new AtomicInteger(1);
    fetchMesh(trustedEntryPoints.get(meshRequestIndex), numberOfMeshResponsesPending);
    meshRequestIndex = (meshRequestIndex + 1) % trustedEntryPoints.size();

    // Wait up to two seconds for the mesh response to return.
    for (int i = 0; i < 10 && numberOfMeshResponsesPending.get() > 0; i++) {
        ThreadUtil.sleep(200L);
    }

    // Clear the node-join request queue. Then, sleep one second to allow more requests to return, and wait
    // until the message queue has cleared. Finally, before the loop condition is checked again, update the
    // active verifiers to reflect any that have been added since the last iteration.
    NodeManager.sendNodeJoinRequests(-1);
    ThreadUtil.sleep(1000L);
    MessageQueue.blockThisThreadUntilClear();

    NodeManager.updateActiveVerifiersAndRemoveOldNodes();
}
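The loop condition and the entry-point rotation are both plain integer arithmetic, and they can be examined in isolation. The sketch below is illustrative only: the class and method names are hypothetical, and only the two expressions (cycleLength * 3 / 4 and (index + 1) % size) are taken from the code above.

```java
// Hypothetical helper that isolates the arithmetic of the supplemental connection loop.
final class ConnectionThresholdSketch {

    // Same expression as the loop condition. Integer division rounds down.
    static int threshold(int cycleLength) {
        return cycleLength * 3 / 4;
    }

    // True when the supplemental connection process would run another iteration.
    static boolean needsSupplementalConnection(int activeInCycle, int cycleLength) {
        return activeInCycle < threshold(cycleLength);
    }

    // Round-robin selection of the next trusted entry point.
    static int nextIndex(int currentIndex, int entryPointCount) {
        return (currentIndex + 1) % entryPointCount;
    }

    public static void main(String[] args) {
        int cycleLength = 10;
        System.out.println("threshold for cycle of " + cycleLength + ": " + threshold(cycleLength));
        System.out.println("6 active, loop continues: " + needsSupplementalConnection(6, cycleLength));
        System.out.println("7 active, loop continues: " + needsSupplementalConnection(7, cycleLength));
        System.out.println("index after 9 of 10 entry points: " + nextIndex(9, 10));
    }
}
```

Because the division rounds down, a cycle of 10 yields a threshold of 7, so the loop in this sketch ends at seven active in-cycle connections. The rotation wraps to index 0 after the last trusted entry point, so each iteration queries a different entry point.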

When considering this process, keep in mind that connectedness to the cycle serves different purposes for in-cycle and out-of-cycle verifiers. For in-cycle verifiers, connectedness to the rest of the cycle is essential for maintaining communication necessary to build the blockchain. While in-cycle-to-in-cycle connectedness is crucial for proper operation of Nyzo, it is not especially interesting when considering the process of admitting new verifiers to the cycle.

Out-of-cycle verifiers have no responsibility for building the blockchain, and they have no power in the system at all. For them, connectedness serves a different purpose. Maintaining an exclusive presence at a particular IP address lets in-cycle verifiers know that an out-of-cycle verifier controls the IP address. IP addresses are the scarce resource used to control admission into the new-verifier lottery.

In the verifier run loop, a block of "mesh-maintenance" operations is performed roughly once per block duration. As the comment explains, these operations continue to be performed at a regular time interval even if the blockchain is not freezing blocks at normal speed.

// These are mesh-maintenance operations. These were previously performed when a block was frozen,
// but they have been moved to a separate condition, based on block interval, to ensure that they
// still happen regularly when the cycle is experiencing problems or for an out-of-cycle verifier
// that is not always tracking the blockchain.
if (lastMeshMaintenanceTimestamp < System.currentTimeMillis() - Block.blockDuration) {
    lastMeshMaintenanceTimestamp = System.currentTimeMillis();

    // Reload the node-join queue. The node manager maintains a counter to ensure it is only
    // performed once per cycle equivalent.
    NodeManager.reloadNodeJoinQueue();

    // Send up to 10 node-join requests. Previously, these were all sent when the mesh was
    // requested. Now, they are enqueued and sent a few at a time to reduce the spike in network
    // activity.
    NodeManager.sendNodeJoinRequests(10);

    // Update the top-voted verifier. This is done periodically to save frequent derivation from the
    // vote map.
    NewVerifierVoteManager.updateTopVerifier();

    // Update the new-verifier vote.
    NewVerifierQueueManager.updateVote();
}
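The time-based condition above can be illustrated with a small stand-alone gate. This is a sketch under stated assumptions, not the verifier's run loop: the class MaintenanceGateSketch is hypothetical, the clock is passed in as a parameter so that the behavior is deterministic, and the 7,000 ms interval is an assumed placeholder for Block.blockDuration.

```java
// Hypothetical gate that mirrors the comparison used for mesh-maintenance operations.
final class MaintenanceGateSketch {

    private final long intervalMillis;
    private long lastRunTimestamp = 0L;

    MaintenanceGateSketch(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    // Returns true, and records the run, when more than one interval has elapsed since the last run.
    // The decision depends only on the clock, not on whether a block was frozen.
    boolean shouldRun(long nowMillis) {
        if (lastRunTimestamp < nowMillis - intervalMillis) {
            lastRunTimestamp = nowMillis;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // ASSUMPTION: 7,000 ms stands in for Block.blockDuration.
        MaintenanceGateSketch gate = new MaintenanceGateSketch(7000L);
        long[] samples = { 10000L, 12000L, 17000L, 17001L };
        for (long now : samples) {
            System.out.println("t=" + now + " run=" + gate.shouldRun(now));
        }
    }
}
```

Since the gate consults only elapsed time, the maintenance operations keep their cadence when the cycle is stalled, which is the property the comment in the run loop describes.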

We have seen several calls to the NodeManager.sendNodeJoinRequests() method. In NodeManager, the queue of node-join requests is stored as a map of IP addresses to port numbers. This method simply removes some number of IP addresses from the map, retrieving their port numbers, and sends node-join requests to those IP addresses on the specified ports. Positive argument values indicate sending a specified maximum number of messages from the queue, while negative argument values indicate emptying the queue.

public static void sendNodeJoinRequests(int count) {

    // This method is called from multiple places, and threading issues could result in an odd state for the
    // queue. Rather than adding synchronization or logic to deal with this, the try/catch will ensure that
    // any exceptions do not leave this method.
    try {
        // A positive value indicates the specified number of requests should be sent from the queue, emptying the
        // queue if the queue size is less than or equal to the specified number. A negative number indicates that
        // the queue should be emptied, regardless of its size.
        if (count < 0) {
            count = nodeJoinRequestQueue.size();
        }

        for (int i = 0; i < count && !nodeJoinRequestQueue.isEmpty(); i++) {

            ByteBuffer ipAddressBuffer = nodeJoinRequestQueue.keySet().iterator().next();
            Integer port = nodeJoinRequestQueue.remove(ipAddressBuffer);

            if (port != null && port > 0) {
                nodeJoinRequestsSent.incrementAndGet();

                // This is the V2 node-join message.
                Message nodeJoinMessage = new Message(MessageType.NodeJoinV2_43, new NodeJoinMessageV2());
                Message.fetchTcp(IpUtil.addressAsString(ipAddressBuffer.array()), port, nodeJoinMessage,
                        new MessageCallback() {
                            @Override
                            public void responseReceived(Message message) {

                                if (message != null && message.getContent() instanceof NodeJoinResponseV2) {

                                    updateNode(message);

                                    NodeJoinResponseV2 response = (NodeJoinResponseV2) message.getContent();
                                    NicknameManager.put(message.getSourceNodeIdentifier(), response.getNickname());
                                }
                            }
                        });
            }
        }
    } catch (Exception ignored) { }
}
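The draining behavior of this method can be separated from the networking. The sketch below reproduces only the queue handling: it returns the targets that would be contacted instead of sending messages. The class DrainSketch and its dotted() helper are hypothetical names introduced for illustration.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical reproduction of the queue handling; no messages are sent.
final class DrainSketch {

    // Removes up to count entries (all entries if count is negative) and returns "ip:port"
    // strings for the entries that would receive a node-join request.
    static List<String> drain(Map<ByteBuffer, Integer> queue, int count) {
        if (count < 0) {
            count = queue.size();
        }

        List<String> targets = new ArrayList<>();
        for (int i = 0; i < count && !queue.isEmpty(); i++) {
            ByteBuffer ipAddressBuffer = queue.keySet().iterator().next();
            Integer port = queue.remove(ipAddressBuffer);

            // Entries without a usable port are removed from the queue but not contacted.
            if (port != null && port > 0) {
                targets.add(dotted(ipAddressBuffer.array()) + ":" + port);
            }
        }
        return targets;
    }

    // Illustrative formatter for a 4-byte IPv4 address.
    static String dotted(byte[] a) {
        return (a[0] & 0xff) + "." + (a[1] & 0xff) + "." + (a[2] & 0xff) + "." + (a[3] & 0xff);
    }

    public static void main(String[] args) {
        Map<ByteBuffer, Integer> queue = new ConcurrentHashMap<>();
        for (int i = 1; i <= 5; i++) {
            queue.put(ByteBuffer.wrap(new byte[] { (byte) 203, 0, 113, (byte) i }), 9444);
        }
        System.out.println("batch of 2: " + drain(queue, 2));
        System.out.println("remaining after batch: " + queue.size());
        System.out.println("emptying: " + drain(queue, -1));
    }
}
```

Which entries come out first is not specified in this sketch, because the map does not define an iteration order. The batch size bounds how many entries leave the queue in one call.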

Now that we have examined how verifiers send node-join requests, we should look at how verifiers process those requests. As mentioned above, in-cycle and out-of-cycle verifiers both send these requests, but they serve different purposes for in-cycle and out-of-cycle verifiers.

In MeshListener, a set is defined with message types that are not allowed over UDP. Because IP addresses can be spoofed over UDP, node-join requests and node-join responses are disallowed. As the comment explains, node-join responses are not processed on incoming connections. However, adding the response type to this set improves robustness and readability.

// To promote forward compatibility with messages we might want to add, the verifier will accept all readable
// messages except those explicitly disallowed. The response types should not be processed for incoming messages,
// but adding them to this set adds another level of protection.
private static final Set<MessageType> disallowedUdpTypes = new HashSet<>(Arrays.asList(MessageType.NodeJoinV2_43,
        MessageType.NodeJoinResponseV2_44));

In the MeshListener.readDatagramPacket() method, this set is used to ensure that UDP messages of these types are ignored.

...
if (message != null && !disallowedUdpTypes.contains(message.getType())) {
...

This provides an important guarantee for node-join messages. Due to the nature of TCP connection negotiation, the receiver of these messages knows with certainty that the messages actually originate from the IP address where they claim to originate.
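The UDP filter described above amounts to a deny-list check on the message type. The following sketch shows the idea with a hypothetical, reduced MessageType enum. It uses an EnumSet where the original uses a HashSet, and the method acceptOverUdp() is an illustrative name, not part of MeshListener.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical, reduced model of the UDP deny-list; not the MeshListener implementation.
final class UdpFilterSketch {

    // A small subset of message types, named after those that appear on this page.
    enum MessageType { Transaction5, BlockVote19, NodeJoinV2_43, NodeJoinResponseV2_44, Ping200 }

    private static final Set<MessageType> disallowedUdpTypes =
            EnumSet.of(MessageType.NodeJoinV2_43, MessageType.NodeJoinResponseV2_44);

    // Everything readable is accepted unless it is explicitly disallowed.
    static boolean acceptOverUdp(MessageType type) {
        return type != null && !disallowedUdpTypes.contains(type);
    }

    public static void main(String[] args) {
        for (MessageType type : MessageType.values()) {
            System.out.println(type + " accepted over UDP: " + acceptOverUdp(type));
        }
    }
}
```

A deny-list, as opposed to an allow-list, matches the forward-compatibility goal stated in the comment: a message type added later is accepted over UDP by default, and only the node-join types are excluded.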

Node-join requests received over TCP, along with responses to locally initiated node-join requests (also sent over TCP), are relayed to NodeManager.updateNode().

Requests are processed in the MeshListener.response() method.

public static Message response(Message message) {

    // This is the single point of dispatch for responding to all received messages.

    Message response = null;
    try {
        ...
            MessageType messageType = message.getType();

            if (messageType == MessageType.Transaction5) {
                ...
            } else if (messageType == MessageType.NodeJoinV2_43) {

                NodeManager.updateNode(message);

                NodeJoinMessageV2 nodeJoinMessage = (NodeJoinMessageV2) message.getContent();
                NicknameManager.put(message.getSourceNodeIdentifier(), nodeJoinMessage.getNickname());

                // Send a UDP ping to help the node ensure that it is receiving UDP messages
                // properly.
                Message.sendUdp(message.getSourceIpAddress(), nodeJoinMessage.getPortUdp(),
                        new Message(MessageType.Ping200, null));

                response = new Message(MessageType.NodeJoinResponseV2_44, new NodeJoinResponseV2());

            } else if (messageType == MessageType.FrozenEdgeBalanceListRequest45) {
                ...
            }
        }
    } catch (Exception e) {
        ...
    }

    return response;
}

Responses are processed as they are received in the Verifier.sendNodeJoinMessage() method.

private static void sendNodeJoinMessage(TrustedEntryPoint trustedEntryPoint) {

    System.out.println("sending node-join messages to trusted entry point: " + trustedEntryPoint);

    Message message = new Message(MessageType.NodeJoinV2_43, new NodeJoinMessageV2());
    Message.fetchTcp(trustedEntryPoint.getHost(), trustedEntryPoint.getPort(), message,
            new MessageCallback() {
                @Override
                public void responseReceived(Message message) {
                    if (message != null) {

                        NodeManager.updateNode(message);

                        NodeJoinResponse response = (NodeJoinResponse) message.getContent();
                        if (response != null) {

                            NicknameManager.put(message.getSourceNodeIdentifier(),
                                    response.getNickname());

                            if (!ByteUtil.isAllZeros(response.getNewVerifierVote().getIdentifier())) {
                                NewVerifierVoteManager.registerVote(message.getSourceNodeIdentifier(),
                                        response.getNewVerifierVote(), false);
                            }
                        }
                    }
                }
            });
}

Along with the response callback in NodeManager.sendNodeJoinRequests() shown earlier, these are the only places that pass node-join messages to the NodeManager.updateNode() method.

In earlier versions of Nyzo, the NodeManager.updateNode() method was called for several different messages. Its scope was reduced to accommodate the sentinel, and its scope has been reduced further to tighten the security of the NodeManager. The latest significant change was in version 572, which eliminated immediate reciprocal node-join messages to cap the rate at which new verifiers can be added to the queue. In version 595, legacy node-join messages were removed to improve code readability.

public static void updateNode(Message message) {

    // In previous versions, more types of requests were registered to increase mesh density. However, to make the
    // system more flexible, we have changed this to only update a node when explicitly requested to do so through
    // a node join.
    if (message.getType() == MessageType.NodeJoinV2_43 || message.getType() == MessageType.NodeJoinResponseV2_44) {

        PortMessageV2 portMessage = (PortMessageV2) message.getContent();
        int portTcp = portMessage.getPortTcp();
        int portUdp = portMessage.getPortUdp();

        // Determine whether this is a node-join response. This is one of the pieces of information used to
        // determine whether a node is added to the map immediately or if it is deferred to the node-join queue.
        boolean isNodeJoinResponse = message.getType() == MessageType.NodeJoinResponseV2_44;

        // Update the node.
        updateNode(message.getSourceNodeIdentifier(), message.getSourceIpAddress(), portTcp, portUdp,
                isNodeJoinResponse);

    } else if (message.getType() == MessageType.MissingBlockVoteRequest23 ||
            message.getType() == MessageType.MissingBlockRequest25) {

        // This is not a full update. Instead, to offset our marking of in-cycle nodes as inactive, we allow a
        // missing block vote request or a missing block request to reactivate the node. These requests are
        // typically made when a node comes back online after a temporary network issue.
        Node node = ipAddressToNodeMap.get(ByteBuffer.wrap(message.getSourceIpAddress()));
        if (node != null) {
            node.markSuccessfulConnection();
        }
    } else {
        LogUtil.println("unrecognized message type in updateNode(): " + message.getType());
    }
}

This method now responds to only four messages: NodeJoinV2_43, NodeJoinResponseV2_44, MissingBlockVoteRequest23, and MissingBlockRequest25.

Looking at disallowedNonCycleTypes in the Message class, we can see that MissingBlockVoteRequest23 and MissingBlockRequest25 will only be processed if they are received from in-cycle verifiers. So, these will not affect the join queue.

private static final Set<MessageType> disallowedNonCycleTypes = new HashSet<>(Arrays.asList(MessageType.BlockVote19,
        MessageType.NewVerifierVote21, MessageType.MissingBlockVoteRequest23, MessageType.MissingBlockRequest25));

This leaves just two messages, NodeJoinV2_43 and NodeJoinResponseV2_44, possibly adding a node to the join queue. Further examination of the logic, though, will show that only NodeJoinResponseV2_44 actually performs this action.

The private overload of NodeManager.updateNode() does the low-level work of manipulating maps and nodes.

private static void updateNode(byte[] identifier, byte[] ipAddress, int portTcp, int portUdp,
        boolean isNodeJoinResponse) {

    if (identifier != null && identifier.length == FieldByteSize.identifier && ipAddress != null &&
            ipAddress.length == FieldByteSize.ipAddress && !IpUtil.isPrivate(ipAddress)) {

        // Try to get the node from the map.
        ByteBuffer ipAddressBuffer = ByteBuffer.wrap(ipAddress);
        Node existingNode = ipAddressToNodeMap.get(ipAddressBuffer);

        if (existingNode != null && ByteUtil.arraysAreEqual(existingNode.getIdentifier(), identifier)) {
            // This is the case when there is already a node at the IP with the same identifier. Update the ports
            // and mark a successful connection.
            existingNode.setPortTcp(portTcp);
            if (portUdp > 0) {
                existingNode.setPortUdp(portUdp);
            }
            existingNode.markSuccessfulConnection();
        } else {
            // If the existing node is not null, remove it.
            if (existingNode != null) {
                ipAddressToNodeMap.remove(ipAddressBuffer);
            }

            // Now, determine what to do with the new node.
            ByteBuffer identifierBuffer = ByteBuffer.wrap(identifier);
            if (BlockManager.verifierInCurrentCycle(identifierBuffer) || isNodeJoinResponse) {
                // All in-cycle nodes, in addition to out-of-cycle nodes due to node-join responses, are added now,
                // subject to a limit per verifier. Set the timestamp of the node so that it is immediately eligible
                // for the lottery if sufficient history is not present.
                int instanceCount = 0;
                for (Node mapNode : ipAddressToNodeMap.values()) {
                    if (ByteUtil.arraysAreEqual(mapNode.getIdentifier(), identifier)) {
                        instanceCount++;
                    }
                }
                if (instanceCount < maximumNodesPerInCycleVerifier) {
                    Node node = new Node(identifier, ipAddress, portTcp, portUdp);
                    if (!haveNodeHistory) {
                        node.setQueueTimestamp(System.currentTimeMillis() -
                                NewVerifierQueueManager.lotteryWaitTime);
                    }
                    ipAddressToNodeMap.put(ipAddressBuffer, node);
                    if (!BlockManager.verifierInCurrentCycle(identifierBuffer)) {
                        LogUtil.println("added new out-of-cycle node to NodeManager: " +
                                NicknameManager.get(identifier));
                    }
                }
            } else {
                // Out-of-cycle nodes due to node joins are added to a map for later querying.
                newNodeIpToPortMap.put(ipAddressBuffer, portTcp);
                LogUtil.println("added new out-of-cycle node to queue: " + NicknameManager.get(identifier));
                if (newNodeIpToPortMap.size() > maximumNewNodeMapSize) {
                    newNodeIpToPortMap.remove(newNodeIpToPortMap.keySet().iterator().next());
                    LogUtil.println("removed node from new out-of-cycle queue due to size");
                }
            }

            // If the node that was just processed is the local verifier and not the temporary entry, remove the
            // temporary entry.
            if (!ByteUtil.isAllZeros(ipAddress) &&
                    ByteUtil.arraysAreEqual(identifier, Verifier.getIdentifier())) {
                ipAddressToNodeMap.remove(ByteBuffer.wrap(new byte[4]));
            }
        }
    }
}
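The branching in this method can be condensed into a small decision function. The sketch below is a simplification for illustration, not the NodeManager code: the class, the Outcome enum, and the parameter names are hypothetical, the identifier and IP validation is omitted, and the removal of a stale node with a different identifier is not modeled.

```java
// Hypothetical condensation of the branching in the private updateNode() overload.
final class UpdateNodeDecisionSketch {

    enum Outcome { REFRESH_EXISTING, ADD_TO_NODE_MAP, REJECT_INSTANCE_LIMIT, DEFER_TO_QUEUE }

    static Outcome decide(boolean sameIdentifierAtIp, boolean inCycle, boolean isNodeJoinResponse,
                          int instanceCount, int maximumInstances) {

        // A known node at the same IP address only has its ports and connection status refreshed.
        if (sameIdentifierAtIp) {
            return Outcome.REFRESH_EXISTING;
        }

        // In-cycle nodes, and out-of-cycle nodes that answered a node-join request from this
        // verifier, are added to the node map, subject to the per-identifier limit.
        if (inCycle || isNodeJoinResponse) {
            return instanceCount < maximumInstances ? Outcome.ADD_TO_NODE_MAP : Outcome.REJECT_INSTANCE_LIMIT;
        }

        // An unsolicited node-join request from an out-of-cycle node is only queued for later contact.
        return Outcome.DEFER_TO_QUEUE;
    }

    public static void main(String[] args) {
        System.out.println("out-of-cycle request:  " + decide(false, false, false, 0, 1));
        System.out.println("out-of-cycle response: " + decide(false, false, true, 0, 1));
        System.out.println("in-cycle request:      " + decide(false, true, false, 0, 1));
    }
}
```

In this sketch, an out-of-cycle verifier reaches the node map only through the response path, which corresponds to the earlier statement that NodeJoinResponseV2_44 is the message that actually adds such a node. The limit of 1 used in main() is an arbitrary example value, not the value of maximumNodesPerInCycleVerifier.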

In all of this, only one line will add a new node to the node map.

ipAddressToNodeMap.put(ipAddressBuffer, node);

Additionally, when the haveNodeHistory flag is false, all new nodes are given a timestamp that enables immediate entry into the lottery.

if (!haveNodeHistory) {
    node.setQueueTimestamp(System.currentTimeMillis() -
            NewVerifierQueueManager.lotteryWaitTime);
}
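The effect of backdating the timestamp can be shown with simple arithmetic. The sketch below rests on two assumptions that this page does not confirm: that a node's queue timestamp otherwise defaults to the time it was created, and that eligibility is checked by comparing the elapsed time since the queue timestamp against the lottery wait time. All names and values in it are hypothetical.

```java
// Hypothetical illustration of backdating a queue timestamp; the eligibility rule is an assumption.
final class QueueTimestampSketch {

    // ASSUMPTION: without backdating, the queue timestamp is the creation time.
    static long initialQueueTimestamp(long nowMillis, long lotteryWaitTimeMillis, boolean haveNodeHistory) {
        return haveNodeHistory ? nowMillis : nowMillis - lotteryWaitTimeMillis;
    }

    // ASSUMPTION: a node has waited long enough once the full wait time has elapsed.
    static boolean hasWaitedLongEnough(long queueTimestamp, long nowMillis, long lotteryWaitTimeMillis) {
        return nowMillis - queueTimestamp >= lotteryWaitTimeMillis;
    }

    public static void main(String[] args) {
        long now = 1_000_000L;      // arbitrary example clock value
        long waitTime = 100_000L;   // arbitrary example wait time, not the Nyzo constant

        long backdated = initialQueueTimestamp(now, waitTime, false);
        long regular = initialQueueTimestamp(now, waitTime, true);

        System.out.println("no history, eligible now:   " + hasWaitedLongEnough(backdated, now, waitTime));
        System.out.println("with history, eligible now: " + hasWaitedLongEnough(regular, now, waitTime));
    }
}
```

Under these assumptions, subtracting the wait time makes the elapsed time equal to the wait time at the moment the node is added, so a verifier without node history does not exclude every candidate while it rebuilds that history.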

This page is under active development.