EMBEDDED: Bitcoin core tutorial & code walk through (Part 7)

In part 7, the tutorial will analyse the networking code of Bitcoin core.

The net.cpp and net_processing.cpp contain the bulk of the socket handling and network message processing.

In net.cpp:
In the CConnman::Start(CScheduler&, Options) , this function initialises the connection options, such as maximum connections, maximum buffer size, and starts threads. CConnman::Start() is called in AppInitMain() as "connman.Start(scheduler, connOptions)".

The threads are listed below.

The ThreadSocketHandler reads from socket and puts the messages into vRecvMsg. The select() is used to listen to file descriptor sets, the accept() is used to accept from fdSetRecv, and recv() is used to read from fdSetRecv into the buffer. If received bytes is > 0, the CNode->ReceiveMsgBytes() is called to store buffer into vRecvMsg**.
The ThreadMessageHandler reads the messages from vRecvMsg, processes and sends out the messages. The handler is a loop. It loops through std::vector<CNode>. For each node, it calls ProcessMessages() to read messages and SendMessages() to send out messages.
The ThreadDNSAddressSeed finds addresses from DNS seeds. It loops through std::vector<CDNSSeedData>. From the DNS host, it look up the IP address. If IP address is found, it stores the ip address and port number in std::vector<CAddress>.
The ThreadOpenAddedConnections opens network connections to added nodes. The handler loops through std::vector<AddedNodeInfo>. If not connected, it calls function OpenNetworkConnection().
The ThreadOpenConnections is a loop. It prepares feeler connection setup. Feeler connection is short lived connections, used to test if address is online or offline The purpose of feeler connections is to increase the number of online addresses. It opens network connection from "CAddrMan addrman" variable in CConnman. It uses the addrman to setup "CAddrInfo addr". If addr is valid and feeler flag is setup, It calls function OpenNetworkConnection().

**The vRecvMsg is declared as std::list<CNetMessage> in CNode.

In net_processing.cpp:
In ProcessMessages(CNode*, CConnman&, std::atomic<bool>&), if std::dequeue<CInv> vRecvGetData is not empty, it calls ProcessGetData() to get from CNode->vRecvGetData. Then, it declares std::list<CNetMessage> msgs, and uses splice() to get from CNode->vProcessMsg to msgs. vProcessMsg is also of type std::list<CNetMessage>. After that, it checks the header for validity, set message size, initialise CDataStream& vRecv, compare checksum. Subsequently, it calls ProcessMessage() and pass vRecv to it.

In ProcessMessage(CNode*, std::string&, CDataStream&, int64_t, CChainParams&, CConnman&, td::atomic<bool>&) , if the command type is VERSION, it deserialise the vRecv to nVersion, nServiceInt, nTime. It checks the services offered, if peers' services not matched, it pushes reject message to the peers and returns. If version is less than minimum required. it pushes reject message to the peers, and returns. The CConnman.PushMessage() is used to push messages. Otherwise, it push version ack message to the peers. Then, if fInbound is false, the code uses CNode->PushAddress() to advertise its own address.

If the command type is VERACK, it pushes send headers message to peers if version is greater than sendheaders version. It pushes send compact block message to peers if version is greater than ids block version.

If the command type is ADDR, it reads from vRecv to std::vector<CAddress> vAddr. Then, it calls RelayAddress(). If addr is reacheable, it stores them by calling CConnman.AddNewAddress().

If the command type is SENDHEADERS, it sets the CNodeState fPreferHeaders to true.

If the command type is SENDCMPCT, it reads from vRecv to fAnnounceUsingCMPCTBLOCK and nCMPCTBLOCKVersion. Then it sets the CNodeState flags.

If the command type is INV, it reads from vRecv to std:vector<CInv> vInv. If size is too big, it calls Misbehaving(). Then, it loops through vInv, if the inventory msg type is MSG_BLOCK, it pushes get headers message.

If the command type is GETDATA, it also reads from vRecv to vInv, checks for size. After that, it calls ProcessGetData().

If the command type is GETBLOCKS, it reads from vRecv into locator and hashStop. It activates the best chain from most_recent_block by calling ActivateBestChain(). It uses locator and chainActive**, loop thru the chainActive, and push newly created CInv by calling PushInventory().

**chainActive is the blockchain, starts from genesis block and ends with tip, of class CChain, declared in validation.cpp. ActivateBestChain() is called in init.cpp to initialise blockchain.

If the command type is GETBLOCKTXN, it reads from vRecv to BlockTransactionsRequest req. If older block is requested, it calls ProcessGetData() to send block response and returns. Otherwise, it read blocks from disk and calls SendBlockTransactions().

If the command type is GETHEADERS, it reads from vRecv into locator and hashStop. It pushes the headers message after using std::vector<CBlock> vHeaders to store CBlockIndex * pindex. The pindex is from locator value.

If the command type is TX, it reads from vRecv to CTransactionRef ptx. Then, it creates double ended queue of COutPoint (vWorkQueue) and vector of uint256 (vEraseQueue), creates CInv of MSG_TX. If inv is not available, it stores inv hash to vWorkQueue. It loops through if work queue is empty, calls RelayTransaction() and stores orphan hash to work queue and erase queue. If missing input is true, it sets that orphan parents are rejected. Else, it calls AddToCompactExtraTransactions() and lastly it checks for nDos flag.

If the command type is CMPCTBLOCK, it read from vRecv to CBlockHeaderAndShortTxIDs cmpctblock. It calls ProcessNewBlockHeaders(). If fAlreadyInFlight is set, push get data message out. It checks the chainActive height, check block transaction request from compact block tx count.

If the command type is BLOCKTXN, it reads from vRecv to BlockTransactions resp. It opens a shared_ptr to CBlock pblock, checks Read status from pblock and resp. If status is invalid, it calls Misbehaving() and returns. If status is failed, it push get data message. Else it calls MarkBlockAsReceived().

If the command type is HEADERS, it reads from vRecv to CBlockHeader vector. If it is block announcement and headers is at the end, push get headers message. Then, calls UpdateBlockAvailability(). If header msg is at max size, peer may have more headers, push the get headers message again. If headers are valid, ends in block that is greater than block in the tip, download as much as possible by calling MarkBlockAsInFlight().

If the command type is BLOCK, it reads from vRecv to pblock, shared_ptr to CBlock. Then calls ProcessNewBlock().

If the command type is GETADDR, it loops thru addr from CConnman and push addr using CNode->PushAddress().

If the command type is MEMPOOL, it checks for bloom filter and bandwidth limit.

If the command type is PING, it push pong message with nonce. Nonce is read from vRecv.

If the command type is PONG, and vRecv.in_avail() is bigger than size of nonce, read vRecv to nonce. It checks nonce to find matching ping, process pong msg only if there is matching ping.

The ProcessGetData(CNode*, Consensus::Params&, CConnman&std::atomic<bool>&), it loops thru CNode->vRecvGetData. If inv type is MSG_BLOCK , MSG_FILTERED_BLOCK, MSG_CMPCT_BLOCK, or MSG_WITNESS_BLOCK, it calls ActivateBestChain() if not yet validated. It checks blocks for data, pushes message of BLOCK if MSG_WITNESS_BLOCK. If MSG_FILTERED_BLOCK , it needs to send merkle block, push block message and serialise_transaction message. If MSG_CMPCT_BLOCK, it pushes message of block or compact block.

Code to Bitcoin protocol mapping

Referring to Bitcoin protocol, Bitcoin core nodes work on p2p network. The new nodes download blocks from sync nodes, using block-first or header-first method.

For block-first download method:

Message	`getblocks`	`inv`	`getdata`	`block`
From→To	IBD→Sync	Sync→IBD	IBD→Sync	Sync→IBD
Payload	One or more header hashes	Up to 500 block inventories (unique identifiers, hash of block's header)	One or more block inventories	One serialized block

**IBD : initial block download, refers to new node which is just trying to download blocks

and for header-first download method:

Message	`getheaders`	`headers`	`getdata`	`block`
From→To	IBD→Sync	Sync→IBD	IBD→Many	Many→IBD
Payload	One or more header hashes	Up to 2,000 block headers	One or more block inventories derived from header hashes	One serialized block

The ProcessMessage() code indeed processes the Bitcoin protocol messages.

EMBEDDED

Monday, 6 November 2017

Bitcoin core tutorial & code walk through (Part 7) - networking

No comments:

Post a Comment