
Nyzo 476: mesh reopening 2

Nyzo version 476 (commit on GitHub) is the second of two updates to allow the mesh to be reopened for new verifiers.

The BlacklistManager was added to protect in-cycle verifiers from excess network traffic. The blacklistDuration, the time for which an IP address typically remains in the blacklist after a violation, is 10 minutes. The useIpTables argument can be changed by an individual operator if they do not want the verifier to use the system firewall to restrict connections.

In the static block, all iptables rules are flushed to avoid problems with lingering rules that may have been set by a previous instance of the verifier.

RN_476 code 0

BlacklistManager.addToBlacklist() adds a single IP address to the blacklist map. The address is only added when the BlockManager is initialized and has a complete cycle, because erroneous identification of verifiers as being out-of-cycle might happen otherwise.

RN_476 code 1

BlacklistManager.inBlacklist() is used to enforce the blacklist.

RN_476 code 2

BlacklistManager.getBlacklistSize() is used by BlacklistStatusResponse to provide information to the verifier operator.

RN_476 code 3

BlacklistManager.performMaintenance() removes in-cycle node addresses from the blacklist. This allows the blacklist to mitigate attacks or inadvertent bursts from in-cycle verifiers without causing long-term mesh connectivity issues.

This method also removes expired nodes from the blacklist to control memory usage.

RN_476 code 4
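
The blacklist behavior described above can be illustrated with a minimal sketch. This is not the Nyzo source: the class name, the use of strings for IP addresses, and the explicit time parameters are assumptions made so the example is self-contained, and the check that BlockManager is initialized with a complete cycle is omitted.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the blacklist map described above; not the Nyzo source.
class BlacklistSketch {

    // 10 minutes, as stated in the release notes.
    static final long BLACKLIST_DURATION_MS = 10L * 60L * 1000L;

    // Maps an IP address (a string, for simplicity) to the time it was blacklisted.
    private final Map<String, Long> blacklist = new ConcurrentHashMap<>();

    void addToBlacklist(String ipAddress, long nowMs) {
        blacklist.put(ipAddress, nowMs);
    }

    boolean inBlacklist(String ipAddress) {
        return blacklist.containsKey(ipAddress);
    }

    int getBlacklistSize() {
        return blacklist.size();
    }

    // Removes addresses of in-cycle nodes and entries older than the blacklist duration.
    void performMaintenance(Set<String> inCycleAddresses, long nowMs) {
        for (String ipAddress : new HashSet<>(blacklist.keySet())) {
            Long timestamp = blacklist.get(ipAddress);
            boolean expired = timestamp != null && nowMs - timestamp >= BLACKLIST_DURATION_MS;
            if (expired || inCycleAddresses.contains(ipAddress)) {
                blacklist.remove(ipAddress);
            }
        }
    }
}
```

Passing the current time as a parameter is only a convenience for the sketch; it makes the expiration rule easy to exercise without waiting 10 minutes.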

BlacklistManager.setIpTableEntry() runs the iptables command to add or remove a firewall entry. The entry, when active, drops TCP packets from the specified source address to the MeshListener port.

RN_476 code 5
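
As a rough illustration of the kind of firewall entry described above, the following sketch builds an iptables argument list. The INPUT chain, the exact flags, and the port value are assumptions; the release notes say only that TCP packets from the source address to the MeshListener port are dropped.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; the actual command built by setIpTableEntry() may differ.
class IpTablesCommandSketch {

    // Assumed port value; the notes refer only to "the MeshListener port".
    static final int MESH_LISTENER_PORT = 9444;

    // Builds the arguments for appending (-A) or deleting (-D) a rule that drops
    // TCP packets from the given source address to the listener port.
    static List<String> command(String ipAddress, boolean add) {
        return Arrays.asList("iptables", add ? "-A" : "-D", "INPUT",
                "-p", "tcp", "-s", ipAddress,
                "--dport", Integer.toString(MESH_LISTENER_PORT), "-j", "DROP");
    }
}
```

A list like this could be handed to java.lang.ProcessBuilder, which accepts a List<String> of command arguments. Running iptables normally requires root privileges, which is one reason an operator might choose to turn off useIpTables.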

BlacklistManager.runProcess() is a helper method used by the setIpTableEntry() method. BlacklistManager.readStream() is a helper method used to read the input and error streams of the process in the runProcess() method.

RN_476 code 6

In the Block.chainScore() method, the base offset for a new verifier was changed from -6 to -2. This gives only the top new verifier a score lower than the next in-cycle verifier, where previously the top two new verifiers were assigned scores lower than the next in-cycle verifier.

RN_476 code 7

In BlockManager, the currentAndNearCycleSet was added so that top-voted verifiers could be treated similarly to in-cycle verifiers for messaging purposes.

RN_476 code 8

Synchronization was removed from the BlockManager.verifiersInCurrentCycleList() and BlockManager.verifiersInCurrentCycleSet() methods to improve efficiency.

RN_476 code 9

Synchronization was removed from the BlockManager.verifierInCurrentCycle() method, and the BlockManager.verifiersInCurrentAndNearCycleSet() method was added. The diff's ordering of changes is mildly deceptive.

RN_476 code 10

The BlockManager.verifierInOrNearCurrentCycle() method provides lookup from the currentAndNearCycleSet.

RN_476 code 11

In BlockManager.updateVerifiersInCurrentCycle(), the currentAndNearCycleSet is populated with the contents of the currentCycleList and NewVerifierVoteManager.topVerifiers().

RN_476 code 12
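
The relationship between the cycle list, the top verifiers, and the combined set can be sketched as follows. Identifiers are strings here for brevity, and building a fresh set before swapping the reference is an assumption about how unsynchronized readers might be kept safe, not a description of the actual implementation.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a combined "in or near cycle" lookup set.
class NearCycleSketch {

    // Replaced as a whole on each update, so readers see either the old or the new set.
    private volatile Set<String> currentAndNearCycleSet = new HashSet<>();

    void updateVerifiersInCurrentCycle(List<String> currentCycleList, List<String> topVerifiers) {
        Set<String> updated = new HashSet<>(currentCycleList);
        updated.addAll(topVerifiers);
        currentAndNearCycleSet = updated;
    }

    boolean verifierInOrNearCurrentCycle(String identifier) {
        return currentAndNearCycleSet.contains(identifier);
    }
}
```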

In BlockVoteManager, the minimumVoteInterval was increased from 2.0 seconds to 5.0 seconds to improve stability of the voting process. The flipVoteMap was added to prevent a verifier's vote from being changed unless the same changed vote was received two consecutive times, separated by the minimumVoteInterval.

RN_476 code 13

The full Message object for a BlockVote is now passed to the BlockVoteManager.registerVote() method. The fields of this message are now stored on the vote to allow the votes to be shared later in BlockWithVotesResponse objects.

RN_476 code 14

The next section of the diff is largely due to indentation changes. Setting of the receiptTimestamp, with the addition of new fields, was just explained. Most of the vote registration code pictured here is unchanged. The new code, at lines 51 and 52, fetches the existing vote and checks whether it is null.

RN_476 code 15

If the existingVote is null, the new vote is registered.

RN_476 code 16

When a verifier has changed its vote, the flipVoteMap is now used. This is an important new protection to avoid one-block forks in the blockchain. Instead of accepting and registering changed votes immediately, the changed votes are stored in the flipVoteMap, and a confirmation vote is required before the vote is updated in the primary map.

RN_476 code 17
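
One way to read the vote-flip rule is sketched below. The class, the string hashes, and the handling of edge cases (for example, discarding a pending flip when the original vote is repeated) are assumptions for illustration; only the 5-second interval and the need for a confirming vote come from the notes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a confirmation requirement for changed votes.
class FlipVoteSketch {

    static final long MINIMUM_VOTE_INTERVAL_MS = 5000L;  // 5.0 seconds, per the notes

    // A pending changed vote: the hash voted for and when the change was first seen.
    static final class PendingFlip {
        final String hash;
        final long timestampMs;

        PendingFlip(String hash, long timestampMs) {
            this.hash = hash;
            this.timestampMs = timestampMs;
        }
    }

    private final Map<String, String> votes = new HashMap<>();           // verifier -> hash
    private final Map<String, PendingFlip> flipVotes = new HashMap<>();  // verifier -> pending change

    // Returns the hash recorded for the verifier after processing this vote.
    String registerVote(String verifier, String hash, long nowMs) {
        String existing = votes.get(verifier);
        if (existing == null) {
            votes.put(verifier, hash);       // first vote: register immediately
        } else if (existing.equals(hash)) {
            flipVotes.remove(verifier);      // vote reaffirmed: discard any pending flip
        } else {
            PendingFlip pending = flipVotes.get(verifier);
            if (pending == null || !pending.hash.equals(hash)) {
                // First sighting of this changed vote: hold it for confirmation.
                flipVotes.put(verifier, new PendingFlip(hash, nowMs));
            } else if (nowMs - pending.timestampMs >= MINIMUM_VOTE_INTERVAL_MS) {
                votes.put(verifier, hash);   // confirmed after the interval: apply the change
                flipVotes.remove(verifier);
            }
        }
        return votes.get(verifier);
    }
}
```

In this sketch, a single stray or conflicting vote message cannot change the tally on its own, which is the property the notes credit with avoiding one-block forks.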

In BlockVoteManager.removeOldVotes(), votes are now retained for 40 blocks behind the frozen edge to support the BlockWithVotesResponse. The new flipVoteMap is also cleaned in this method.

RN_476 code 18

BlockVoteManager.votesAtHeight() was planned to be a temporary entry in the StatusResponse. However, it has been more useful than anticipated, so it will be retained.

RN_476 code 19

BlockVoteManager.numberOfVotesAtHeight() provides the size of the vote map at the specified height. This is used in UnfrozenBlockManager.updateVote() to delay attempts to reach consensus until a suitable number of votes for the height have been tabulated.

RN_476 code 20

BlockVoteManager.getLocalVotes() was removed. This method was previously used by NodeJoinResponse, but it is no longer included in the updated, less burdensome initialization process.

RN_476 code 21

In BlockVoteManager.requestMissingVotes(), fallback wait times were reduced to improve the speed of consensus when votes have been dropped. The call to registerVote() now passes the entire Message instead of just the identifier and BlockVote.

RN_476 code 22

Some log statements that were no longer needed were removed from ChainInitializationManager.

RN_476 code 23

Counters were added to MeshListener to track how many messages are accepted and rejected under the new message rules.

RN_476 code 24

In the outer thread of MeshListener.start(), the inline inner thread started in response to accepted sockets has been removed.

RN_476 code 25

The replacement code first checks if the socket is connected to a blacklisted IP address. If the IP is blacklisted, the socket is closed immediately. Otherwise, a thread is started and the clientSocket is passed to the readMessageAndRespond() method for processing. The appropriate counters are incremented in both cases.

RN_476 code 26

The readMessageAndRespond() method processes the clientSocket. It reads the request message from the socket's input stream, produces a response, and writes that response to the socket's output stream. At the end of the method, the socket is closed.

RN_476 code 27

In MeshListener.response(), the call to BlockVoteManager.registerVote() was changed to match the changes in that method. This allows additional fields from the message to be temporarily stored on the BlockVote objects to facilitate later production of the BlockWithVotesResponse.

RN_476 code 28

The call to NodeManager.updateNode() was removed from the condition for MessageType.BootstrapRequestV2_35. This message type is no longer processed by that method. A condition was added to produce a BlockWithVotesResponse for MessageType.BlockWithVotesRequest37.

RN_476 code 29

In response to BlacklistStatusRequest416, a BlacklistStatusResponse is now produced. This allows the operator of a verifier to monitor the blacklist to ensure it is not causing communication problems.

RN_476 code 30

Accessors were added for numberOfMessagesRejected and numberOfMessagesAccepted. These are used by BlacklistStatusResponse.

RN_476 code 31

In Message, the whitelist set was added. This is used for IP addresses that are exempt from the blacklist. The disallowedNonCycleTypes set contains messages that are not allowed to be sent from verifiers not in the cycle.

RN_476 code 32

The fullMeshMessageTypes set contains the message types for which out-of-cycle verifiers are also potential data sources for random-node fetches.

In the static block of Message, the whitelist is loaded.

RN_476 code 33

In Message.broadcast(), the BlockManager.verifiersInCurrentAndNearCycleSet() method is now used. This does not change behavior. Previously, this method assembled its own set of verifiers in and near the current cycle.

RN_476 code 34

In Message.fetchFromRandomNode(), the fullMeshMessageTypes are now considered when selecting a random node. Full-mesh message types can be fetched from any node, while other types must be fetched from in-cycle nodes.

RN_476 code 35
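
The selection rule can be sketched as a filter applied before the random pick. The Node class, the string message types, and the method signature are hypothetical; the notes do not list which types belong to fullMeshMessageTypes, so the caller supplies that set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch of random-node selection restricted by message type.
class RandomNodeSketch {

    static final class Node {
        final String identifier;
        final boolean inCycle;

        Node(String identifier, boolean inCycle) {
            this.identifier = identifier;
            this.inCycle = inCycle;
        }
    }

    // Returns a random eligible node, or null if no eligible node exists.
    static Node select(List<Node> mesh, String messageType,
                       Set<String> fullMeshMessageTypes, Random random) {
        boolean anyNodeAllowed = fullMeshMessageTypes.contains(messageType);
        List<Node> candidates = new ArrayList<>();
        for (Node node : mesh) {
            if (anyNodeAllowed || node.inCycle) {
                candidates.add(node);
            }
        }
        return candidates.isEmpty() ? null : candidates.get(random.nextInt(candidates.size()));
    }
}
```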

Logging statements were added to communicate the node to which the request is made or if a node could not be found.

RN_476 code 36

The Message.fetch() method now filters outgoing messages to avoid sending messages that would result in blacklisting.

RN_476 code 37

In Message.fromBytes(), out-of-cycle verifiers sending disallowed message types are added to the blacklist. The blacklist makes message rejection more efficient. The first disallowed message must be read to determine the verifier identifier, but after the IP address is added to the blacklist, later sockets from that address can be closed before any message is read.

RN_476 code 38

In Message.processContent(), deserialization was added for BlockWithVotesRequest37, BlockWithVotesResponse38, and BlacklistStatusResponse417.

RN_476 code 39

Message.loadWhitelist() loads a list of IP addresses from /var/lib/nyzo/production/whitelist. These addresses are exempt from the blacklist.

RN_476 code 40
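
The format of the whitelist file is not described in these notes. The sketch below assumes one address per line, with blank lines ignored and text after "#" treated as a comment; all of this is an assumption for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical whitelist parser; the real file format may differ.
class WhitelistSketch {

    static Set<String> parse(List<String> lines) {
        Set<String> addresses = new HashSet<>();
        for (String line : lines) {
            int commentIndex = line.indexOf('#');
            String address = (commentIndex >= 0 ? line.substring(0, commentIndex) : line).trim();
            if (!address.isEmpty()) {
                addresses.add(address);
            }
        }
        return addresses;
    }
}
```

The lines could be obtained with java.nio.file.Files.readAllLines() on the whitelist path, with a missing file treated as an empty whitelist.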

In MessageQueue, the inBadState field was removed. This field was used to debug stalls of the MessageQueue, and it is no longer needed.

RN_476 code 41

The sleep in the MessageQueue.blockThisThreadUntilClear() loop was reduced from 0.5s to 0.1s. This allows the method to complete faster on average. Also, a 0.05s sleep was added after the loop to give the last message in the queue additional time to complete processing.

RN_476 code 42

In MessageQueue.add() and MessageQueue.next(), logging of inBadState was removed.

RN_476 code 43

In the main MessageQueue loop, the print statement for inBadState was removed, and printing of the exception was eliminated. Setting of the MessageQueue.lastMessageStatus field was added. In the event of future problems with the MessageQueue, this field can be used to determine what may have caused a stall.

RN_476 code 44

BlockWithVotesRequest37, BlockWithVotesResponse38, BlacklistStatusRequest416, and BlacklistStatusResponse417 were added to MessageType.

RN_476 code 45

In NewVerifierQueueManager, the consecutiveBlocksVotingForSameVerifier field was added to ensure that a verifier does not receive a vote for too long if it is not joining the cycle.

RN_476 code 46

In NewVerifierQueueManager.updateVote(), the vote is now registered locally even if this verifier is not in the cycle. A later condition is applied to avoid broadcast of votes from out-of-cycle verifiers. This is the condition that is true when the vote changes, so consecutiveBlocksVotingForSameVerifier is reset to 1.

RN_476 code 47

When the vote has not changed, consecutiveBlocksVotingForSameVerifier is incremented. After allowing for approximately 50 blocks more than the blockchain-enforced entry interval, a selected verifier is demoted to give another verifier a chance to join.

RN_476 code 48
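
The counter logic described above might look roughly like the following. The entry interval is passed in as a parameter because its value is not given here, and returning the identifier to demote is a simplification chosen for the sketch.

```java
import java.util.Objects;

// Hypothetical sketch of limiting how long one new verifier keeps this verifier's vote.
class NewVerifierVoteSketch {

    static final int EXTRA_BLOCKS = 50;  // "approximately 50 blocks", per the notes

    private String currentVote = null;
    private int consecutiveBlocksVotingForSameVerifier = 0;

    // Called once per block. Returns the identifier to demote, or null if none.
    String updateVote(String topCandidate, int entryIntervalBlocks) {
        if (!Objects.equals(topCandidate, currentVote)) {
            // The vote changed, so the counter restarts at 1.
            currentVote = topCandidate;
            consecutiveBlocksVotingForSameVerifier = 1;
            return null;
        }
        consecutiveBlocksVotingForSameVerifier++;
        boolean votedTooLong =
                consecutiveBlocksVotingForSameVerifier > entryIntervalBlocks + EXTRA_BLOCKS;
        return (votedTooLong && currentVote != null) ? currentVote : null;
    }

    String getCurrentVote() {
        return currentVote;
    }
}
```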

An accessor is provided for currentVote. This is used by MeshStatusResponse.

RN_476 code 49

In NewVerifierVoteManager.topVerifiers(), the list of verifiers is now limited to a size of 3. The list is now displayed.

RN_476 code 50
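
Limiting a ranked list to three entries can be sketched as below. Ranking by vote count and breaking ties by identifier are assumptions made so the example is deterministic; the actual ordering used by NewVerifierVoteManager is not described in these notes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of producing a top-3 list from vote totals.
class TopVerifiersSketch {

    static final int MAXIMUM_SIZE = 3;

    static List<String> topVerifiers(Map<String, Integer> voteTotals) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(voteTotals.entrySet());
        entries.sort((a, b) -> {
            int byVotes = Integer.compare(b.getValue(), a.getValue());  // most votes first
            return byVotes != 0 ? byVotes : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(MAXIMUM_SIZE, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }
}
```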

In NodeManager, the new persistedQueueTimestamps map provides lookup of timestamps from previous runs of the verifier. The queue timestamps are periodically written to the queueTimestampsFile, and this file is loaded into the map in the class's static block.

RN_476 code 51

When a new node is created, the persistedQueueTimestamps map is consulted. If a favorable timestamp for the node is available, it is applied to the node.

RN_476 code 52

The NodeManager.demoteIdentifier() method sets the timestamps for a specified identifier to the current timestamp. This sends the nodes to the end of the queue.

RN_476 code 53

When the NodeJoinResponse is received in NodeManager.sendNodeJoinMessage(), the block votes from that response are no longer registered. These were originally included to allow a new verifier to quickly become aware of the state of consensus, but they were unhelpful and a waste of bandwidth. While the NodeJoinResponse still understands the inclusion of these votes in its serialized form, they are no longer serialized, and they are discarded if present during deserialization.

RN_476 code 54

NodeManager.demoteInCycleNodes() is called periodically to ensure that verifiers that drop from the cycle are not immediately placed at the top of the entrance queue.

RN_476 code 55

NodeManager.persistQueueTimestamps() writes queue timestamps to a file so that queue information does not reset each time a verifier restarts.

RN_476 code 56

NodeManager.loadPersistedQueueTimestamps() reads the timestamp file into the persistedQueueTimestamps map. This map is then used to assign timestamps when nodes are added to the primary map.

RN_476 code 57
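
The persistence and demotion behavior described in the NodeManager notes above can be summarized in a small sketch. The "identifier:timestamp" line format, the string identifiers, and the reading of "favorable" as "earlier" are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of queue-timestamp persistence and demotion.
class QueueTimestampSketch {

    // Produces one "identifier:timestamp" line per node.
    static List<String> toLines(Map<String, Long> queueTimestamps) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> entry : queueTimestamps.entrySet()) {
            lines.add(entry.getKey() + ":" + entry.getValue());
        }
        return lines;
    }

    // Parses lines written by toLines(), skipping anything malformed.
    static Map<String, Long> fromLines(List<String> lines) {
        Map<String, Long> result = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split(":");
            if (parts.length == 2) {
                try {
                    result.put(parts[0], Long.parseLong(parts[1]));
                } catch (NumberFormatException ignored) {
                    // Malformed timestamp: skip this line.
                }
            }
        }
        return result;
    }

    // A new node keeps its persisted timestamp only when that timestamp is earlier.
    static long timestampForNewNode(String identifier, long nowMs, Map<String, Long> persisted) {
        Long persistedTimestamp = persisted.get(identifier);
        return (persistedTimestamp != null && persistedTimestamp < nowMs) ? persistedTimestamp : nowMs;
    }

    // Demotion moves a node to the end of the queue by assigning the current time.
    static void demote(String identifier, long nowMs, Map<String, Long> queueTimestamps) {
        queueTimestamps.put(identifier, nowMs);
    }
}
```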

In UnfrozenBlockManager.updateVote(), a comment was updated to clarify that the 0.2-second time calculation offset was not solely accounting for network jitter.

RN_476 code 58

In the vote calculation of UnfrozenBlockManager.updateVote(), attempts to reach consensus are now delayed until votes from at least 75% of the cycle have been received. This avoids possible selection of unpreferred blocks based on the receipt of coherent votes from a small portion of the cycle.

RN_476 code 59
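
The 75% condition reduces to a simple integer comparison. The method name and the use of "at least" rather than "more than" follow the wording of the notes; the exact comparison in the source may differ.

```java
// Hypothetical sketch of the vote-count threshold check.
class VoteThresholdSketch {

    // True when votes from at least 75% of the cycle have been tabulated.
    // Integer arithmetic avoids floating-point rounding at the boundary.
    static boolean enoughVotes(int numberOfVotesAtHeight, int cycleLength) {
        return numberOfVotesAtHeight * 4 >= cycleLength * 3;
    }
}
```

For example, in a cycle of 100 verifiers, this check would pass at 75 votes and fail at 74.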

In UnfrozenBlockManager.updateVote(), the vote is now broadcast regardless of whether it has changed. The new consensus process relies on multiple consecutive vote messages to change a vote, and this does not cause a problem with excess message traffic, because votes are still spaced by BlockVoteManager.minimumVoteInterval.

RN_476 code 60

In UnfrozenBlockManager.castVote(), the entire vote message is now registered with BlockVoteManager to support building of the BlockWithVotesResponse. The Verifier.inCycle() method is now used instead of the BlockManager.verifierInCurrentCycle() method. This change is purely for succinctness and does not change behavior.

RN_476 code 61

In UnfrozenBlockManager.attemptToFreezeBlock(), the delay and re-check before freezing a block was removed. This check is no longer necessary due to the vote-flip mechanism.

RN_476 code 62

The UnfrozenBlockManager.performMaintenance() method is an encapsulation of behavior previously in the UnfrozenBlockManager.attemptToFreezeBlock() method. The only new code in this method is setting of lastBlockVoteTimestamp to a value of 0L. All other changes are indentation differences.

RN_476 code 63

In Verifier, three counters were added for controlling the out-of-cycle tracking process. The intent of these counters is to transition all verifiers to use the new block-with-votes messages as support allows, falling back to legacy messages as necessary.

RN_476 code 64

The new Verifier.inCycle() method is now used instead of the equivalent BlockManager.verifierInCurrentCycle() call. Fetching of a block based on the bootstrap response is now logged.

RN_476 code 65

Logging was also added to indicate when verifier initialization has completed and the main loop is about to commence.

RN_476 code 66

An additional check, intended to avoid blacklisting, is now performed before a block is transmitted to the cycle.

RN_476 code 67

Requests for unfrozen blocks and for individual block votes are no longer allowed from out-of-cycle verifiers. So, if this verifier is not in the cycle, those requests are no longer made. Instead, requests are now made for frozen blocks bundled with votes. As this is a newer message that may not be widely supported, the old message that requests frozen blocks without votes is used as a fallback.

RN_476 code 68

Cleanup due to freezing of a block is now logged. BlacklistManager.performMaintenance() and UnfrozenBlockManager.performMaintenance() are now included in this process.

RN_476 code 69

When a block at a height divisible by 100 is frozen, the queue timestamps are written to a file so they will persist between runs of the verifier. Before the file is written, in-cycle nodes are demoted so they will not be able to jump to the front of the queue if they are removed from the cycle.

RN_476 code 70

The Verifier.inCycle() convenience method was added to improve code readability.

RN_476 code 71

Verifier.requestBlockWithVotes() requests blocks with bundled votes. This allows a verifier, in a single message, to get both the block and the votes that show the cycle accepted the block. The block is registered with UnfrozenBlockManager. The block-vote messages are reconstructed, and the votes are registered with the BlockVoteManager. Blocks will then be frozen with the typical mechanism.

Successes and failures are tracked to allow this verifier to adjust usage of this method if the cycle does not yet have broad support. As more of the cycle adopts this version and supports the new message, usage of the less-safe legacy method will automatically be tapered and eventually eliminated.

RN_476 code 72

Verifier.requestBlockWithoutVotes() falls back to the universally supported BlockRequest11 message. This method is less safe because it does not verify votes and could potentially freeze an incorrect block.

RN_476 code 73

A missing parenthesis was added to BlockResponse.toString().

RN_476 code 74

In BlockVote, the comment for timestamp was modified to note that the timestamp in the vote is redundant. Also, three fields were added for storing properties from the Message. These fields are used to construct the BlockWithVotesResponse.

RN_476 code 75

Accessors and mutators were added for the new fields.

RN_476 code 76

The new BlockWithVotesRequest contains the height for a request.

RN_476 code 77

BlockWithVotesResponse contains a Block and a list of BlockVote objects.

RN_476 code 78

When serializing the BlockWithVotesResponse, the Block is included first, followed by the BlockVote list. While BlockVote objects contain height and blockHash values, these do not need to be written for each individual vote, as they are all the same as the height and hash of the block.

RN_476 code 79
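
The space saving from writing the height and hash only once can be seen in a reduced sketch. The layout here is not the Nyzo wire format: the block is represented only by its height and a 32-byte hash, each vote by a single timestamp, and the block is always present.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified layout: height, hash, vote count, then per-vote data only.
class BlockWithVotesSketch {

    static final int HASH_LENGTH = 32;  // assumed hash size for this sketch

    static final class Vote {
        final long height;
        final byte[] blockHash;
        final long messageTimestamp;  // stands in for the per-vote fields from the Message

        Vote(long height, byte[] blockHash, long messageTimestamp) {
            this.height = height;
            this.blockHash = blockHash;
            this.messageTimestamp = messageTimestamp;
        }
    }

    // Writes the shared height and hash once, followed by the per-vote timestamps.
    static byte[] serialize(long height, byte[] blockHash, List<Long> voteTimestamps) {
        ByteBuffer buffer = ByteBuffer.allocate(8 + HASH_LENGTH + 2 + 8 * voteTimestamps.size());
        buffer.putLong(height);
        buffer.put(blockHash);  // must be exactly HASH_LENGTH bytes
        buffer.putShort((short) voteTimestamps.size());
        for (long timestamp : voteTimestamps) {
            buffer.putLong(timestamp);
        }
        return buffer.array();
    }

    // Rebuilds full votes by copying the height and hash from the block portion.
    static List<Vote> deserializeVotes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        long height = buffer.getLong();
        byte[] blockHash = new byte[HASH_LENGTH];
        buffer.get(blockHash);
        int count = buffer.getShort() & 0xffff;
        List<Vote> votes = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            votes.add(new Vote(height, blockHash, buffer.getLong()));
        }
        return votes;
    }
}
```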

Deserialization in BlockWithVotesResponse.fromByteBuffer() reassembles the block first. Then, the method reassembles the list of BlockVote objects. Note that the response does not require a block to be included, but if a block is not included, the list of BlockVote objects must be empty.

An override of toString() is provided for convenience.

RN_476 code 80

The list of blockVotes was removed from NodeJoinResponse. These votes were unhelpful and a waste of bandwidth.

RN_476 code 81

To maintain compatibility, the serialized version of NodeJoinResponse is still aware of the possible presence of the list of block votes. However, an empty list is always serialized, and any votes present in a serialized version are discarded.

RN_476 code 82

PingResponse now uses the Message.putString() and Message.getString() convenience methods. This does not change behavior.

RN_476 code 83

BlacklistStatusResponse is a MultilineTextResponse that reports the count of messages rejected due to the blacklist, count of messages accepted, and blacklist size.

RN_476 code 84

BlacklistStatusResponse is serialized with a 16-bit list length specifier followed by the list of character strings.

RN_476 code 85

BlacklistStatusResponse.toString() displays the number of lines in the response.

RN_476 code 86

MeshStatusResponse has a new constant, maximumNumberOfLines, to limit the list size.

RN_476 code 87

The list of nodes in MeshStatusResponse is now ordered by queue timestamp instead of identifier.

RN_476 code 88

The TODO comments describe persistence of queue timestamps and demotion of in-cycle nodes in NodeManager. These features were both implemented in this version, and the TODO comments were inadvertently left in the code.

RN_476 code 89

The currentNewVerifierVote is retrieved and marked in the list with "C". The topVerifiers are retrieved and marked in the list with their indices. The list is limited to maximumNumberOfLines.

RN_476 code 90

To support more than 256 lines, the list length was changed from a 1-byte (8-bit) integer to a 2-byte (16-bit) integer. This is a breaking change, incompatible with the previous version of the message.

RN_476 code 91
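
The effect of the wider length field can be shown with a small round-trip sketch. The per-string encoding used here (a 2-byte length followed by UTF-8 bytes) is an assumption; only the 16-bit list length comes from the notes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a multiline text payload with a 16-bit line count.
class MultilineTextSketch {

    static byte[] serialize(List<String> lines) {
        List<byte[]> encodedLines = new ArrayList<>();
        int size = 2;  // 16-bit list length
        for (String line : lines) {
            byte[] encoded = line.getBytes(StandardCharsets.UTF_8);
            encodedLines.add(encoded);
            size += 2 + encoded.length;  // assumed 16-bit string length plus content
        }
        ByteBuffer buffer = ByteBuffer.allocate(size);
        buffer.putShort((short) lines.size());
        for (byte[] encoded : encodedLines) {
            buffer.putShort((short) encoded.length);
            buffer.put(encoded);
        }
        return buffer.array();
    }

    static List<String> deserialize(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        int count = buffer.getShort() & 0xffff;  // read as unsigned
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            byte[] encoded = new byte[buffer.getShort() & 0xffff];
            buffer.get(encoded);
            lines.add(new String(encoded, StandardCharsets.UTF_8));
        }
        return lines;
    }
}
```

A reader that still expects a 1-byte count would misinterpret the first two bytes of such a payload, which is why the change is incompatible with the previous version of the message.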