Thursday 16 November 2017

Bitcoin core tutorial & code walk through (Part 9) - P2PKH/P2SH

In part 9 of the tutorial, the Bitcoin transaction will be analysed. Specifically, we will look at two types of transaction.
  • Pay to Public Key Hash (P2PKH)
  • Pay to Script Hash (P2SH)
As an example, a typical Bitcoin transaction is shown below:
Input:
Previous tx: f5d8ee39a430901c91a5917b9f2dc19d6d1a0e9cea205b009ca73dd04470b9a6
Index: 0
scriptSig: 304502206e21798a42fae0e854281abd38bacd1aeed3ee3738d9e1446618c4571d10
90db022100e2ac980643b0b82c0e88ffdfec6b64e3e6ba35e7ba5fdd7d5d6cc8d25c6b241501

Output:
Value: 100000000
scriptPubKey: OP_DUP OP_HASH160 404371705fa9bd789a2fcd52d2c580b65d35549d
OP_EQUALVERIFY OP_CHECKSIG
In input field, previous tx is the hash of the previous transaction. Index chooses the specific output in the transaction. ScriptSig contains the signature to satisfy the transaction conditions, for the recipient who is spending the bitcoin. This input uses the previous transaction of f5d8... and chooses output 0 of that transaction (as seen in Index: 0). 
In output field, scriptPubKey defines the conditions to spend the bitcoins. Value is the satoshi to be sent. One BTC is 100,000,000 satoshi. This output sends 1 BTC, it is sent to Bitcoin address 4043... 

P2PKH

scriptPubKey: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
scriptSig: <sig> <pubKey>
As seen above, the scriptSig contains sender public key and signature of the sender. The scriptPubKey contains the hash of the receiver's public key. The recipient of P2PKH Bitcoin transaction, checks the signature and the public key hash. The public key must generate the hash that matches the pubKeyHash. The sender signature can be verified using the sender's public key.

The scriptSig part is what the receiver uses to spend the money that they got from P2PKH. Because they are spending the money, at that point they would be the new sender. The <pubKey> for this new transaction hashes to the receiver's <pubKeyHash> from the old transaction when they first got the money.
The checking process is as below:
StackScriptDescription
Empty.<sig> <pubKey> OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIGscriptSig and scriptPubKey are combined.
<sig> <pubKey>OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIGConstants are added to the stack.
<sig> <pubKey> <pubKey>OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIGTop stack item is duplicated.
<sig> <pubKey> <pubHashA><pubKeyHash> OP_EQUALVERIFY OP_CHECKSIGTop stack item is hashed.
<sig> <pubKey> <pubHashA> <pubKeyHash>OP_EQUALVERIFY OP_CHECKSIGConstant added.
<sig> <pubKey>OP_CHECKSIGEquality is checked between the top two stack items.
trueEmpty.Signature is checked for top two stack items.
For the source code, in the bitcoin-tx.cpp, there is a separate main() function that calls CommandLineRawTx(), which in turn calls MutateTx(). MutateTx() handles adding key and value to CMutableTransaction. The key and value are declare as std::string key, valueThe bitcoin-tx.cpp is compiled into bitcoin-tx binary.

In MutateTx(), depending on the command line option, MutateTxAddOutPubKey() adds value and scriptPubKey to the output, using public key from command line input. MutateTxAddOutAddr() adds value and scriptPubKey to the output, using address from command line input.

For bitcoind binary, in rest.cpp, the rest_getutxos(HTTPRequest *, std::string &) finds the txid of the UTXO, and calls ScriptPubKeyToUniv() to include the scriptPubKey in output.

In rest.cpp, the rest_tx(HTTPRequest*, std::string &) prepares the transaction with CTransactionRef class, then for JSON format, calls TxToUniv() to write the HTTP request.

In core_write.cpp, the TxToUniv(CTransaction&, uint256&, UniValue&) pushes scriptSig to vin and scriptPubKey to vout. For vout, it calls ScriptPubKeyToUniv() to include the scriptPubKey in vout.

The rest_tx() and rest_getutxos() are in uri_prefixes structure. This structure is registered in HTTP handler in StartREST().  StartREST() is called by AppInitServers() in init.cpp.
P2SH
P2SH lets the sender funds a transaction using 20 byte hash. The script supplied to redeem must hash to the scriptHash. 
Without P2SH, the scripts are shown below:
locking script: 2 <pubKey1> <pubKey2> <pubKey3> 3 OP_CHECKMULTISIG
unlocking script: <sig1> <sig2>
With P2SH, the scripts become:

redeem script: 2 <pubKey1> <pubKey2> <pubKey3> 3 OP_CHECKMULTISIG 
locking script: OP_HASH160 <redeem script Hash> OP_EQUAL
unlocking script: <sig1> <sig2>

So with P2SH, the locking script is simplified. The actual scripts in transactions are as below:

scriptPubKey: OP_HASH160 <scriptHash> OP_EQUAL 
scriptSig: <sig1> <sig2> OP_m <pubKey1> ... OP_n OP_CHECKMULTISIG
The locking script is OP_HASH160 <scriptHash> OP_EQUAL. The locking script is a simplified form of multisig script. From multisig script: 2 <pubKey1> <pubKey2> <pubKey3> 3 OP_CHECKMULTISIG, the mutisig script hashes to 20 byte value of 8ac1d7a2fa204a16dc984fa81cfdf86a2a4e1731. Therefore, the locking script becomes OP_HASH160 8ac1d7a2fa204a16dc984fa81cfdf86a2a4e1731 OP_EQUAL.
The scriptSig <sig1> <sig2> OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG, is presented when recipient wants to spend the bitcoins.

The checking process:
StackScriptDescription
Empty.OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG OP_HASH160 <scriptHash> OP_EQUALredeem script checked with locking script, to make sure scriptHash matches
true<sig1> <sig2> OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIGunlocking script executed to unlock redeem script
trueEmpty.Signatures validated in the order of the keys in the script.
For the source code, to be continued...

Friday 10 November 2017

Bitcoin core tutorial & code walk through (Part 8) - blockchain

In part 8, we will look at code that handles blockchain.

In validation.cpp,
ProcessNewBlockHeaders(std::vector<CBlockHeader>&, CValidationState&, CChainParams&, CBlockIndex**) loops through the CBlockHeaders, for each header, it calls AcceptBlockHeader().

**ProcessNewBlockHeaders() is called in ProcessMessage() in net_processing.cpp.

ProcessNewBlock(CChainParams&, std::shared_ptr<const CBlock>, bool, bool) calls CheckBlock(), if that is successful, it calls AcceptBlock(). Then, the best chain is activated using ActivateBestChain().

**ProcessNewBlock() is called in ProcessMessage() in net_processing.cpp.

AcceptBlock(std::shared_ptr<const CBlock>&, CValidationState&, CChainParams&, CBlockIndex**, bool, CDiskBlockPos*, bool) calls AcceptBlockHeader(). It checks if it already has the block and the block has more work to advance blockchain tip. It also checks for block height if the block is too too ahead for blockchain pruning. If the above conditions are fulfilled, it calls CheckBlock(). If header is valid and there is sufficient work, merkle tree and segwit merkle tree are good, calls NewPowValidBlock() to announce new block to peer nodes. Then, it calls WriteBlockToDisk(). Finally, the ReceivedBlockTransactions() is called to connect the new block to the chain.

AcceptBlockHeader(CBlockHeader&, CValidationState&, CChainParams&, CBlockIndex**) firstly checks for genesis block. If block header is already known, the block is not accepted. It checks for previous block and bad block. It calls CheckBlockIndex().

CheckBlockIndex(Consensus::Params&) iterates over the entire blockchain, and checks for consistency using CBlockIndex *pindex. For example, it checks for block height, the chainwork length. tree validity.

CheckBlock(CBlock&, CValidationState&, Consensus::Params&, bool, bool) calls CheckBlockHeader() for checking header validity. Next, it check merkle root. After that, it checks for size limits, duplicate coinbase. The transactions is checked using CheckTransaction().

In consensus/tx_verify.cpp.
CheckTransaction(CTransaction&, CValidationState&, bool) firstly check for empty vin and vout. Then, it checks for negative or too large output value, duplicate inputs.

Monday 6 November 2017

Bitcoin core tutorial & code walk through (Part 7) - networking

In part 7, the tutorial will analyse the networking code of Bitcoin core.

The net.cpp and net_processing.cpp contain the bulk of the socket handling and network message processing.

In net.cpp:
In the CConnman::Start(CScheduler&, Options) , this function initialises the connection options, such as maximum connections, maximum buffer size, and starts threads. CConnman::Start() is called in AppInitMain() as "connman.Start(scheduler, connOptions)".

The threads are listed below.

  1. The ThreadSocketHandler reads from socket and puts the messages into vRecvMsg. The select() is used to listen to file descriptor sets, the accept() is used to accept from fdSetRecv, and recv() is used to read from fdSetRecv into the buffer. If received bytes is > 0, the CNode->ReceiveMsgBytes() is called to store buffer into vRecvMsg**.
  2. The ThreadMessageHandler reads the messages from vRecvMsg, processes and sends out the messages. The handler is a loop.  It loops through std::vector<CNode>. For each node, it calls ProcessMessages() to read messages and SendMessages() to send out messages.
  3. The ThreadDNSAddressSeed finds addresses from DNS seeds. It loops through std::vector<CDNSSeedData>. From the DNS host, it look up the IP address. If IP address is found, it stores the ip address and port number in std::vector<CAddress>.
  4. The ThreadOpenAddedConnections opens network connections to added nodes.  The handler loops through std::vector<AddedNodeInfo>. If not connected,  it calls function OpenNetworkConnection().
  5. The ThreadOpenConnections is a loop. It prepares feeler connection setup. Feeler connection is short lived connections, used to test if address is online or offline The purpose of feeler connections is to increase the number of online addresses. It opens network connection from "CAddrMan addrman" variable in CConnman. It uses the addrman to setup "CAddrInfo addr". If addr is valid and feeler flag is setup, It calls function OpenNetworkConnection().
**The vRecvMsg is declared as std::list<CNetMessage> in CNode.


In net_processing.cpp:
In ProcessMessages(CNode*, CConnman&, std::atomic<bool>&), if std::dequeue<CInv> vRecvGetData is not empty, it calls ProcessGetData() to get from CNode->vRecvGetData. Then, it declares std::list<CNetMessage> msgs, and uses splice() to get from CNode->vProcessMsg to msgs. vProcessMsg is also of type std::list<CNetMessage>. After that, it checks the header for validity, set message size, initialise CDataStream& vRecv, compare checksum. Subsequently, it calls ProcessMessage() and pass vRecv to it.

In ProcessMessage(CNode*, std::string&, CDataStream&, int64_t, CChainParams&, CConnman&, td::atomic<bool>&) , if the command type is VERSION, it deserialise the vRecv to nVersion, nServiceInt, nTime. It checks the services offered, if peers' services not matched, it pushes reject message to the peers and returns. If version is less than minimum required. it pushes reject message to the peers, and returns. The CConnman.PushMessage() is used to push messages. Otherwise, it push version ack message to the peers. Then, if fInbound is false, the code uses CNode->PushAddress() to advertise its own address.

If the command type is VERACK, it pushes send headers message to peers if version is greater than  sendheaders version. It pushes send compact block message to peers if version is greater than ids block version.

If the command type is ADDR, it reads from vRecv to std::vector<CAddress> vAddr. Then, it calls RelayAddress(). If addr is reacheable, it stores them by calling CConnman.AddNewAddress().

If the command type is SENDHEADERS, it sets the CNodeState fPreferHeaders to true.

If the command type is SENDCMPCT, it reads from vRecv to fAnnounceUsingCMPCTBLOCK and nCMPCTBLOCKVersion. Then it sets the CNodeState flags.

If the command type is INV, it reads from vRecv to std:vector<CInv> vInv. If size is too big, it calls Misbehaving(). Then, it loops through vInv, if the inventory msg type is MSG_BLOCK, it pushes get headers message.

If the command type is GETDATA, it also reads from vRecv to vInv, checks for size. After that, it calls ProcessGetData().

If the command type is GETBLOCKS, it reads from vRecv into locator and hashStop. It activates the best chain from most_recent_block by calling ActivateBestChain(). It uses locator and chainActive**, loop thru the chainActive, and push newly created CInv by calling PushInventory().

**chainActive is the blockchain, starts from genesis block and ends with tip, of class CChain, declared in validation.cpp. ActivateBestChain() is called in init.cpp to initialise blockchain.

If the command type is GETBLOCKTXN, it reads from vRecv to BlockTransactionsRequest req.  If older block is requested, it calls ProcessGetData() to send block response and returns. Otherwise, it read blocks from disk and calls SendBlockTransactions().

If the command type is GETHEADERS, it reads from vRecv into locator and hashStop. It pushes the headers message after using std::vector<CBlock> vHeaders to store CBlockIndex * pindex. The pindex is from locator value.

If the command type is TX, it reads from vRecv to CTransactionRef ptx. Then, it creates double ended queue of COutPoint (vWorkQueue) and vector of uint256 (vEraseQueue), creates CInv of MSG_TX. If inv is not available, it stores inv hash to vWorkQueue. It loops through if work queue is empty, calls RelayTransaction() and stores orphan hash to work queue and erase queue. If missing input is true, it sets that orphan parents are rejected. Else, it calls AddToCompactExtraTransactions() and lastly it checks for nDos flag.

If the command type is CMPCTBLOCK, it read from vRecv to CBlockHeaderAndShortTxIDs cmpctblock. It calls ProcessNewBlockHeaders(). If fAlreadyInFlight is set, push get data message out. It checks the chainActive height, check block transaction request from compact block tx count.

If the command type is BLOCKTXN, it reads from vRecv to BlockTransactions resp. It opens a shared_ptr to CBlock pblock, checks Read status from pblock and resp. If status is invalid, it calls Misbehaving() and returns. If status is failed,  it push get data message. Else it calls MarkBlockAsReceived().

If the command type is HEADERS, it reads from vRecv to CBlockHeader vector. If it is block announcement and headers is at the end, push get headers message. Then, calls UpdateBlockAvailability(). If header msg is at max size, peer may have more headers, push the get headers message again. If headers are valid, ends in block that is greater than block in the tip, download as much as possible by calling MarkBlockAsInFlight().

If the command type is BLOCK, it reads from vRecv to pblock, shared_ptr to CBlock. Then calls ProcessNewBlock().

If the command type is GETADDR, it loops thru addr from CConnman and push addr using CNode->PushAddress().

If the command type is MEMPOOL, it checks for bloom filter and bandwidth limit.

If the command type is PING, it push pong message with nonce. Nonce is read from vRecv.

If the command type is PONG, and vRecv.in_avail() is bigger than size of nonce, read vRecv to nonce. It checks nonce to find matching ping,  process pong msg only if there is matching ping.

The ProcessGetData(CNode*, Consensus::Params&, CConnman&std::atomic<bool>&), it loops thru  CNode->vRecvGetData. If inv type is MSG_BLOCK , MSG_FILTERED_BLOCK, MSG_CMPCT_BLOCK, or MSG_WITNESS_BLOCK, it calls ActivateBestChain() if not yet validated. It checks blocks for data, pushes message of BLOCK if MSG_WITNESS_BLOCK. If MSG_FILTERED_BLOCK , it needs to send merkle block, push block message  and serialise_transaction message. If MSG_CMPCT_BLOCK, it pushes message of block or compact block.


Code to Bitcoin protocol mapping

Referring to Bitcoin protocol, Bitcoin core nodes work on p2p network. The new nodes download blocks from sync nodes, using block-first or header-first method.

For block-first download method:
Messagegetblocksinvgetdatablock
From→ToIBD→SyncSync→IBDIBD→SyncSync→IBD
PayloadOne or more header hashesUp to 500 block inventories (unique identifiers, hash of block's header)One or more blockinventoriesOne serialized block
**IBD : initial block download, refers to new node which is just trying to download blocks

and for header-first download method:
Messagegetheadersheadersgetdatablock
From→ToIBD→SyncSync→IBDIBDManyManyIBD
PayloadOne or more header hashesUp to 2,000 block headersOne or more block inventories derived from header hashesOne serialized block

The ProcessMessage() code indeed processes the Bitcoin protocol messages.

Saturday 4 November 2017

Bitcoin core tutorial & code walk through (Part 6) - signal

In part 6 of the tutorial, the signals used in bitcoin core will be discussed.

Looking at init.cpp, it defines registerSignalHandler() function. It registers signal handler to a signal in linux style. The "struct sigaction" is same as what is available in linux. In AppInitBasicSetup(), three signal handlers are registered respectively for SIGTERM, SIGINT, SIGHUP.

In net.h, "struct CNodeSignals" is declared. It defines boost C++ library style signal. This line
boost::signals2::signal<bool (CNode*, CConnman&, std::atomic<bool>&), CombinerAll> ProcessMessages
means a ProcessMessages signal is defined. The return type of the connector is bool, the connector takes in 3 parameters. The "CombinerAll" is a combiner. 

In net_processing.cpp, compare the function declaration bool ProcessMessages(CNode*, CConnman&, const std::atomic<bool>&)
with the signal connector signature. The signature exactly matches.

In net_processing.cpp, the slot is connected to the signal in RegisterNodeSignals(CNodeSignals& nodeSignals).
Both the slot and signal are called ProcessMessages, as seen in the line below !! nodeSignals.ProcessMessages.connect(&ProcessMessages);

In net.h, it defines the combiner. The result_type is the return value of the combiner.
struct CombinerAll {
    typedef bool result_type;


    template<typename I>

    bool operator()(I first, I last) const

    {

        while (first != last) {

            if (!(*first)) return false;
            ++first;
        }
        return true;
    }
};

The combiner takes in two input iterator, "first" and "last". It compares all connector return values, and returns true if all values are equal.

In validationinterface.h, "struct CMainSignals" is declared. It also defines boost C++ library style signal. This line

boost::signals2::signal<void (const CBlockIndex *, const CBlockIndex *, bool fInitialDownload)> UpdatedBlockTip;

means for the UpdatedBlockTip signal, the connector returns void, and takes in 3 parameters.

In validationinterface.cpp, the connector is

g_signals.UpdatedBlockTip.connect(boost::bind(&CValidationInterface::UpdatedBlockTip, pwalletIn, _1, _2, _3));

The boost:bind() is used. It means it stores a copy of pwalletIn->UpdateBlockTip(_1, _2, _3). That is the three paramters as in the connector.

In validation.cpp, the UpdatedBlockTip signal is called with three parameters.

GetMainSignals().UpdatedBlockTip(pindexNewTip, pindexFork, fInitialDownload);

So, for headless bitcoin core source code, linux and boost style signal are used.