Thursday, 16 November 2017

Bitcoin core tutorial & code walk through (Part 9) - P2PKH/P2SH

In part 9 of the tutorial, the Bitcoin transaction will be analysed. Specifically, we will look at two types of transaction.
  • Pay to Public Key Hash (P2PKH)
  • Pay to Script Hash (P2SH)
As an example, a typical Bitcoin transaction is shown below:
Input:
Previous tx: f5d8ee39a430901c91a5917b9f2dc19d6d1a0e9cea205b009ca73dd04470b9a6
Index: 0
scriptSig: 304502206e21798a42fae0e854281abd38bacd1aeed3ee3738d9e1446618c4571d10
90db022100e2ac980643b0b82c0e88ffdfec6b64e3e6ba35e7ba5fdd7d5d6cc8d25c6b241501

Output:
Value: 100000000
scriptPubKey: OP_DUP OP_HASH160 404371705fa9bd789a2fcd52d2c580b65d35549d
OP_EQUALVERIFY OP_CHECKSIG
In the input field, Previous tx is the hash of the previous transaction, and Index selects a specific output of that transaction. scriptSig contains the signature that satisfies the spending conditions, supplied by the recipient who is now spending the bitcoin. This input refers to previous transaction f5d8... and spends output 0 of that transaction (as seen in Index: 0).
In the output field, scriptPubKey defines the conditions for spending the bitcoins, and Value is the amount to send, in satoshi. One BTC is 100,000,000 satoshi, so this output sends 1 BTC. It is sent to the Bitcoin address that corresponds to public key hash 4043...

P2PKH

scriptPubKey: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
scriptSig: <sig> <pubKey>
As seen above, the scriptSig contains the spender's signature and public key, and the scriptPubKey contains the hash of the receiver's public key. To validate a spend of a P2PKH output, both the public key hash and the signature are checked: the public key must hash to a value that matches pubKeyHash, and the signature must verify against that public key.

The scriptSig part is what the receiver uses to spend the money that they got from P2PKH. Because they are spending the money, at that point they would be the new sender. The <pubKey> for this new transaction hashes to the receiver's <pubKeyHash> from the old transaction when they first got the money.
The checking process is as below:
Stack | Script | Description
Empty. | <sig> <pubKey> OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG | scriptSig and scriptPubKey are combined.
<sig> <pubKey> | OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG | Constants are added to the stack.
<sig> <pubKey> <pubKey> | OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG | Top stack item is duplicated.
<sig> <pubKey> <pubHashA> | <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG | Top stack item is hashed.
<sig> <pubKey> <pubHashA> <pubKeyHash> | OP_EQUALVERIFY OP_CHECKSIG | Constant added.
<sig> <pubKey> | OP_CHECKSIG | Equality is checked between the top two stack items.
true | Empty. | Signature is checked for top two stack items.
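
The stack walk above can be sketched as a tiny interpreter. This is only a toy model and not Bitcoin Core's script engine: ToyHash160() stands in for the real HASH160 (RIPEMD160 of SHA256), ToyCheckSig() treats a signature as valid when it equals "sig:" followed by the public key, and all function names are hypothetical.

```cpp
#include <string>
#include <vector>

// Placeholder for HASH160 (really RIPEMD160(SHA256(x))). NOT cryptographic.
std::string ToyHash160(const std::string& data) {
    return "h(" + data + ")";
}

// Placeholder signature check. A real node verifies an ECDSA signature.
bool ToyCheckSig(const std::string& sig, const std::string& pubkey) {
    return sig == "sig:" + pubkey;
}

// Runs scriptSig followed by scriptPubKey on a single stack.
// The four opcodes used by P2PKH are interpreted; any other token is
// pushed onto the stack as data.
bool EvalToyP2PKH(const std::vector<std::string>& script_sig,
                  const std::vector<std::string>& script_pubkey) {
    std::vector<std::string> script(script_sig);
    script.insert(script.end(), script_pubkey.begin(), script_pubkey.end());

    std::vector<std::string> stack;
    for (const std::string& token : script) {
        if (token == "OP_DUP") {
            if (stack.empty()) return false;
            stack.push_back(stack.back());
        } else if (token == "OP_HASH160") {
            if (stack.empty()) return false;
            stack.back() = ToyHash160(stack.back());
        } else if (token == "OP_EQUALVERIFY") {
            if (stack.size() < 2) return false;
            std::string a = stack.back(); stack.pop_back();
            std::string b = stack.back(); stack.pop_back();
            if (a != b) return false;  // hash mismatch: script fails here
        } else if (token == "OP_CHECKSIG") {
            if (stack.size() < 2) return false;
            std::string pubkey = stack.back(); stack.pop_back();
            std::string sig = stack.back(); stack.pop_back();
            stack.push_back(ToyCheckSig(sig, pubkey) ? "true" : "false");
        } else {
            stack.push_back(token);  // constant: push as data
        }
    }
    // The spend is valid only if a single "true" is left on the stack.
    return stack.size() == 1 && stack.back() == "true";
}
```

Calling EvalToyP2PKH() with a scriptSig of {"sig:pubA", "pubA"} and a scriptPubKey whose hash token is ToyHash160("pubA") follows the same rows as the table and ends with true on the stack.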
For the source code, bitcoin-tx.cpp has its own main() function that calls CommandLineRawTx(), which in turn calls MutateTx(). MutateTx() handles adding a key and value to a CMutableTransaction. The key and value are declared as std::string key, value. bitcoin-tx.cpp is compiled into the bitcoin-tx binary.

In MutateTx(), depending on the command line option, MutateTxAddOutPubKey() adds value and scriptPubKey to the output, using public key from command line input. MutateTxAddOutAddr() adds value and scriptPubKey to the output, using address from command line input.

For bitcoind binary, in rest.cpp, the rest_getutxos(HTTPRequest *, std::string &) finds the txid of the UTXO, and calls ScriptPubKeyToUniv() to include the scriptPubKey in output.

In rest.cpp, rest_tx(HTTPRequest*, std::string &) looks up the transaction as a CTransactionRef, then, for JSON format, calls TxToUniv() to build the JSON object that is written to the HTTP reply.

In core_write.cpp, the TxToUniv(CTransaction&, uint256&, UniValue&) pushes scriptSig to vin and scriptPubKey to vout. For vout, it calls ScriptPubKeyToUniv() to include the scriptPubKey in vout.

The rest_tx() and rest_getutxos() handlers are listed in the uri_prefixes structure. This structure is registered with the HTTP handler in StartREST(). StartREST() is called by AppInitServers() in init.cpp.

P2SH

P2SH lets the sender fund a transaction using a 20-byte hash. The script supplied to redeem the funds must hash to the scriptHash.
Without P2SH, the scripts are shown below:
locking script: 2 <pubKey1> <pubKey2> <pubKey3> 3 OP_CHECKMULTISIG
unlocking script: <sig1> <sig2>
With P2SH, the scripts become:

redeem script: 2 <pubKey1> <pubKey2> <pubKey3> 3 OP_CHECKMULTISIG 
locking script: OP_HASH160 <redeem script Hash> OP_EQUAL
unlocking script: <sig1> <sig2> <redeem script>

So with P2SH, the locking script is simplified. The actual scripts in transactions are as below:

scriptPubKey: OP_HASH160 <scriptHash> OP_EQUAL 
scriptSig: <sig1> <sig2> OP_m <pubKey1> ... OP_n OP_CHECKMULTISIG
The locking script is OP_HASH160 <scriptHash> OP_EQUAL, which replaces the full multisig script with its hash. The multisig script 2 <pubKey1> <pubKey2> <pubKey3> 3 OP_CHECKMULTISIG hashes to the 20-byte value 8ac1d7a2fa204a16dc984fa81cfdf86a2a4e1731, so the locking script becomes OP_HASH160 8ac1d7a2fa204a16dc984fa81cfdf86a2a4e1731 OP_EQUAL.
The scriptSig <sig1> <sig2> OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG is presented when the recipient wants to spend the bitcoins.

The checking process:
Stack | Script | Description
Empty. | OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG OP_HASH160 <scriptHash> OP_EQUAL | redeem script checked with locking script, to make sure scriptHash matches
true | <sig1> <sig2> OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG | unlocking script executed to unlock redeem script
true | Empty. | Signatures validated in the order of the keys in the script.
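
The two-step check above can be sketched in a few lines. This is a toy model under stated assumptions: ToyScriptHash() is a placeholder for HASH160, a signature counts as valid for a key when it equals "sig:" followed by that key, and ToyRedeemScript, SerializeToy() and SpendToyP2SH() are hypothetical names that do not exist in Bitcoin Core.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy m-of-n multisig redeem script.
struct ToyRedeemScript {
    std::size_t required;              // m: number of signatures needed
    std::vector<std::string> pubkeys;  // the n public keys, in script order
};

// Renders the redeem script as text, e.g. "OP_2 k1 k2 k3 OP_3 OP_CHECKMULTISIG".
std::string SerializeToy(const ToyRedeemScript& rs) {
    std::string out = "OP_" + std::to_string(rs.required);
    for (const std::string& key : rs.pubkeys) out += " " + key;
    out += " OP_" + std::to_string(rs.pubkeys.size()) + " OP_CHECKMULTISIG";
    return out;
}

// Placeholder for HASH160. NOT cryptographic.
std::string ToyScriptHash(const std::string& script) {
    return "h(" + script + ")";
}

bool SpendToyP2SH(const std::vector<std::string>& sigs,
                  const ToyRedeemScript& redeem,
                  const std::string& script_hash) {
    // Step 1: the supplied redeem script must hash to the locked scriptHash.
    if (ToyScriptHash(SerializeToy(redeem)) != script_hash) return false;

    // Step 2: run the redeem script. Signatures must appear in the same
    // order as their keys, so the key cursor only ever moves forward.
    if (sigs.size() != redeem.required) return false;
    std::size_t key = 0;
    for (const std::string& sig : sigs) {
        while (key < redeem.pubkeys.size() &&
               sig != "sig:" + redeem.pubkeys[key]) {
            ++key;
        }
        if (key == redeem.pubkeys.size()) return false;  // no key matched
        ++key;  // a key can satisfy at most one signature
    }
    return true;
}
```

With a 2-of-3 redeem script, signatures for pubKey1 and pubKey3 in that order pass, while the same two signatures in reverse order fail the second step.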
For the source code, to be continued...

Friday, 10 November 2017

Bitcoin core tutorial & code walk through (Part 8) - blockchain

In part 8, we will look at code that handles blockchain.

In validation.cpp,
ProcessNewBlockHeaders(std::vector<CBlockHeader>&, CValidationState&, CChainParams&, CBlockIndex**) loops through the CBlockHeaders and calls AcceptBlockHeader() for each header.

**ProcessNewBlockHeaders() is called in ProcessMessage() in net_processing.cpp.

ProcessNewBlock(CChainParams&, std::shared_ptr<const CBlock>, bool, bool) calls CheckBlock() and, if that is successful, calls AcceptBlock(). Then the best chain is activated using ActivateBestChain().

**ProcessNewBlock() is called in ProcessMessage() in net_processing.cpp.

AcceptBlock(std::shared_ptr<const CBlock>&, CValidationState&, CChainParams&, CBlockIndex**, bool, CDiskBlockPos*, bool) calls AcceptBlockHeader(). It checks whether it already has the block and whether the block has more work, so that it could advance the blockchain tip. It also checks the block height, to see whether the block is too far ahead for blockchain pruning. If the above conditions are fulfilled, it calls CheckBlock(). If the header is valid, there is sufficient work, and the merkle tree and segwit merkle tree are good, it calls NewPoWValidBlock() to announce the new block to peer nodes. Then it calls WriteBlockToDisk(). Finally, ReceivedBlockTransactions() is called to connect the new block to the chain.

AcceptBlockHeader(CBlockHeader&, CValidationState&, CChainParams&, CBlockIndex**) first checks for the genesis block. If the block header is already known, it is not processed again. It checks for the previous block and for a bad block, then calls CheckBlockIndex().

CheckBlockIndex(Consensus::Params&) iterates over the entire blockchain and checks it for consistency using CBlockIndex *pindex. For example, it checks the block height, the chainwork, and the tree validity.

CheckBlock(CBlock&, CValidationState&, Consensus::Params&, bool, bool) calls CheckBlockHeader() to check header validity. Next, it checks the merkle root. After that, it checks the size limits and for a duplicate coinbase. The transactions are checked using CheckTransaction().

In consensus/tx_verify.cpp:
CheckTransaction(CTransaction&, CValidationState&, bool) first checks for empty vin and vout. Then it checks for negative or too-large output values and for duplicate inputs.
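
The kind of checks described for CheckTransaction() can be sketched on a simplified transaction type. ToyOutPoint, ToyTx and CheckToyTransaction() are hypothetical stand-ins, not the Bitcoin Core classes, and the reason strings are only illustrative. The running-total check is an addition of this sketch that goes slightly beyond the list above.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for COutPoint and CTransaction.
struct ToyOutPoint {
    std::string txid;  // hash of the previous transaction
    uint32_t index;    // which output of that transaction is spent
};

struct ToyTx {
    std::vector<ToyOutPoint> vin;
    std::vector<int64_t> vout;  // output values in satoshi
};

constexpr int64_t COIN = 100000000;             // satoshi per BTC
constexpr int64_t MAX_MONEY = 21000000 * COIN;  // upper bound on any amount

// Returns an empty string when all checks pass, otherwise a short reason.
std::string CheckToyTransaction(const ToyTx& tx) {
    if (tx.vin.empty()) return "vin-empty";
    if (tx.vout.empty()) return "vout-empty";

    int64_t total = 0;
    for (int64_t value : tx.vout) {
        if (value < 0) return "vout-negative";
        if (value > MAX_MONEY) return "vout-toolarge";
        total += value;
        if (total > MAX_MONEY) return "txouttotal-toolarge";
    }

    // Each (txid, index) pair may be spent only once per transaction.
    std::set<std::pair<std::string, uint32_t>> seen;
    for (const ToyOutPoint& in : tx.vin) {
        if (!seen.emplace(in.txid, in.index).second) return "inputs-duplicate";
    }
    return "";
}
```

A transaction with one input and a 1 BTC output passes, while adding the same input a second time is rejected as a duplicate.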

Monday, 6 November 2017

Bitcoin core tutorial & code walk through (Part 7) - networking

In part 7, the tutorial will analyse the networking code of Bitcoin core.

The net.cpp and net_processing.cpp contain the bulk of the socket handling and network message processing.

In net.cpp:
CConnman::Start(CScheduler&, Options) initialises the connection options, such as maximum connections and maximum buffer size, and starts the threads. It is called in AppInitMain() as "connman.Start(scheduler, connOptions)".

The threads are listed below.

  1. The ThreadSocketHandler reads from the sockets and puts the messages into vRecvMsg. select() is used to wait on the file descriptor sets, accept() is used to accept connections from fdSetRecv, and recv() is used to read from fdSetRecv into the buffer. If the number of received bytes is > 0, CNode->ReceiveMsgBytes() is called to store the buffer into vRecvMsg**.
  2. The ThreadMessageHandler reads the messages from vRecvMsg, processes them and sends out messages. The handler is a loop over std::vector<CNode>. For each node, it calls ProcessMessages() to read messages and SendMessages() to send out messages.
  3. The ThreadDNSAddressSeed finds addresses from DNS seeds. It loops through std::vector<CDNSSeedData> and looks up the IP address of each DNS host. If an IP address is found, it stores the IP address and port number in std::vector<CAddress>.
  4. The ThreadOpenAddedConnections opens network connections to added nodes. The handler loops through std::vector<AddedNodeInfo>. If a node is not connected, it calls OpenNetworkConnection().
  5. The ThreadOpenConnections is a loop. It prepares the feeler connection setup. Feeler connections are short-lived connections used to test whether an address is online or offline. The purpose of feeler connections is to increase the number of known online addresses. The thread opens network connections using the "CAddrMan addrman" variable in CConnman. It uses addrman to set up "CAddrInfo addr". If addr is valid and the feeler flag is set up, it calls OpenNetworkConnection().
**The vRecvMsg is declared as std::list<CNetMessage> in CNode.


In net_processing.cpp:
In ProcessMessages(CNode*, CConnman&, std::atomic<bool>&), if std::deque<CInv> vRecvGetData is not empty, it calls ProcessGetData() to serve the requests in CNode->vRecvGetData. Then it declares std::list<CNetMessage> msgs and uses splice() to move messages from CNode->vProcessMsg into msgs. vProcessMsg is also of type std::list<CNetMessage>. After that, it checks the header for validity, sets the message size, initialises CDataStream& vRecv, and compares the checksum. Finally, it calls ProcessMessage() and passes vRecv to it.

In ProcessMessage(CNode*, std::string&, CDataStream&, int64_t, CChainParams&, CConnman&, std::atomic<bool>&), if the command type is VERSION, it deserialises vRecv into nVersion, nServiceInt and nTime. It checks the services offered: if the peer's services do not match, it pushes a reject message to the peer and returns. If the version is less than the minimum required, it pushes a reject message to the peer and returns. CConnman.PushMessage() is used to push messages. Otherwise, it pushes a version ack (verack) message to the peer. Then, if fInbound is false, the code uses CNode->PushAddress() to advertise its own address.

If the command type is VERACK, it pushes a sendheaders message to the peer if the peer's version is at least the sendheaders version (SENDHEADERS_VERSION). It pushes a send compact block (sendcmpct) message to the peer if the version is at least the short IDs block version (SHORT_IDS_BLOCKS_VERSION).

If the command type is ADDR, it reads from vRecv into std::vector<CAddress> vAddr. Then it calls RelayAddress(). If the addresses are reachable, it stores them by calling CConnman.AddNewAddress().

If the command type is SENDHEADERS, it sets the CNodeState fPreferHeaders to true.

If the command type is SENDCMPCT, it reads from vRecv to fAnnounceUsingCMPCTBLOCK and nCMPCTBLOCKVersion. Then it sets the CNodeState flags.

If the command type is INV, it reads from vRecv into std::vector<CInv> vInv. If the size is too big, it calls Misbehaving(). Then it loops through vInv; if the inventory message type is MSG_BLOCK, it pushes a getheaders message.

If the command type is GETDATA, it also reads from vRecv into vInv and checks the size. After that, it calls ProcessGetData().

If the command type is GETBLOCKS, it reads from vRecv into locator and hashStop. It activates the best chain from most_recent_block by calling ActivateBestChain(). Using locator and chainActive**, it loops through chainActive and pushes each newly created CInv by calling PushInventory().

**chainActive is the blockchain, starts from genesis block and ends with tip, of class CChain, declared in validation.cpp. ActivateBestChain() is called in init.cpp to initialise blockchain.

If the command type is GETBLOCKTXN, it reads from vRecv into BlockTransactionsRequest req. If an older block is requested, it calls ProcessGetData() to send a block response and returns. Otherwise, it reads the block from disk and calls SendBlockTransactions().

If the command type is GETHEADERS, it reads from vRecv into locator and hashStop. It pushes the headers message after using std::vector<CBlock> vHeaders to store the headers found through CBlockIndex * pindex. The pindex is derived from the locator value.

If the command type is TX, it reads from vRecv into CTransactionRef ptx. Then it creates a double-ended queue of COutPoint (vWorkQueue) and a vector of uint256 (vEraseQueue), and creates a CInv of type MSG_TX. If the inv is not already available, it stores the inv hash in vWorkQueue. It loops while the work queue is not empty, calling RelayTransaction() and storing orphan hashes in the work queue and erase queue. If inputs are missing, it records whether the orphan's parents have been rejected. Otherwise, it calls AddToCompactExtraTransactions(), and lastly it checks the nDoS value.

If the command type is CMPCTBLOCK, it reads from vRecv into CBlockHeaderAndShortTxIDs cmpctblock. It calls ProcessNewBlockHeaders(). If fAlreadyInFlight is set, it pushes a getdata message. It checks the chainActive height and checks the block transaction request against the compact block transaction count.

If the command type is BLOCKTXN, it reads from vRecv into BlockTransactions resp. It creates a shared_ptr to CBlock pblock and checks the read status from pblock and resp. If the status is invalid, it calls Misbehaving() and returns. If the status is failed, it pushes a getdata message. Otherwise, it calls MarkBlockAsReceived().

If the command type is HEADERS, it reads from vRecv into a vector of CBlockHeader. If it is a block announcement and the headers are at the end, it pushes a getheaders message. Then it calls UpdateBlockAvailability(). If the headers message is at maximum size, the peer may have more headers, so it pushes the getheaders message again. If the headers are valid and end in a block that is greater than the block at the tip, it downloads as much as possible by calling MarkBlockAsInFlight().

If the command type is BLOCK, it reads from vRecv into pblock, a shared_ptr to CBlock. Then it calls ProcessNewBlock().

If the command type is GETADDR, it loops through the addresses from CConnman and pushes each one using CNode->PushAddress().

If the command type is MEMPOOL, it checks for bloom filter and bandwidth limit.

If the command type is PING, it pushes a pong message with the nonce. The nonce is read from vRecv.

If the command type is PONG and vRecv.in_avail() is at least the size of the nonce, it reads the nonce from vRecv. It checks the nonce to find the matching ping, and processes the pong message only if there is a matching ping.

ProcessGetData(CNode*, Consensus::Params&, CConnman&, std::atomic<bool>&) loops through CNode->vRecvGetData. If the inv type is MSG_BLOCK, MSG_FILTERED_BLOCK, MSG_CMPCT_BLOCK or MSG_WITNESS_BLOCK, it calls ActivateBestChain() if the block is not yet validated. It checks that the block has data, and pushes a BLOCK message if the type is MSG_WITNESS_BLOCK. If the type is MSG_FILTERED_BLOCK, it needs to send a merkle block, so it pushes the merkle block message and the serialised transaction messages. If the type is MSG_CMPCT_BLOCK, it pushes either a block or a compact block message.


Code to Bitcoin protocol mapping

According to the Bitcoin protocol documentation, Bitcoin Core nodes work on a P2P network. New nodes download blocks from sync nodes, using either the block-first or the header-first method.

For block-first download method:
Message | getblocks | inv | getdata | block
From→To | IBD→Sync | Sync→IBD | IBD→Sync | Sync→IBD
Payload | One or more header hashes | Up to 500 block inventories (unique identifiers, hash of block's header) | One or more block inventories | One serialized block
**IBD : initial block download, refers to new node which is just trying to download blocks

and for header-first download method:
Message | getheaders | headers | getdata | block
From→To | IBD→Sync | Sync→IBD | IBD→Many | Many→IBD
Payload | One or more header hashes | Up to 2,000 block headers | One or more block inventories derived from header hashes | One serialized block

The ProcessMessage() code indeed processes the Bitcoin protocol messages.

Saturday, 4 November 2017

Bitcoin core tutorial & code walk through (Part 6) - signal

In part 6 of the tutorial, the signals used in bitcoin core will be discussed.

Looking at init.cpp, it defines the registerSignalHandler() function, which registers a handler for a signal in the Linux style. The "struct sigaction" it uses is the same as the one available in Linux. In AppInitBasicSetup(), signal handlers are registered for SIGTERM, SIGINT and SIGHUP.
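
As a rough sketch of Linux-style registration with struct sigaction: the code below is not the Bitcoin Core code. InstallHandler(), HandleShutdownSignal(), DemoShutdownSignal() and the g_shutdown_requested flag are hypothetical names chosen for this example.

```cpp
#include <atomic>
#include <signal.h>

// Set from the signal handler and polled by the main program.
static std::atomic<bool> g_shutdown_requested{false};

// Keep the handler minimal: it only records that a shutdown was requested.
static void HandleShutdownSignal(int /*signum*/) {
    g_shutdown_requested.store(true);
}

// Installs `handler` for `signum` using the POSIX sigaction() call.
bool InstallHandler(int signum, void (*handler)(int)) {
    struct sigaction sa{};
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);  // block no additional signals in the handler
    sa.sa_flags = 0;
    return sigaction(signum, &sa, nullptr) == 0;
}

// Installs the handler for SIGTERM, sends SIGTERM to this process and
// reports whether the handler ran.
bool DemoShutdownSignal() {
    g_shutdown_requested.store(false);
    if (!InstallHandler(SIGTERM, HandleShutdownSignal)) return false;
    raise(SIGTERM);  // returns only after the handler has returned
    return g_shutdown_requested.load();
}
```

The handler only sets a flag, which keeps the work done in signal context small; the rest of the program can poll the flag and shut down in its normal control flow.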

In net.h, "struct CNodeSignals" is declared. It defines boost C++ library style signal. This line
boost::signals2::signal<bool (CNode*, CConnman&, std::atomic<bool>&), CombinerAll> ProcessMessages
means a ProcessMessages signal is defined. The return type of the connector is bool, the connector takes in 3 parameters. The "CombinerAll" is a combiner. 

In net_processing.cpp, compare the function declaration bool ProcessMessages(CNode*, CConnman&, const std::atomic<bool>&)
with the signal connector signature. The signature exactly matches.

In net_processing.cpp, the slot is connected to the signal in RegisterNodeSignals(CNodeSignals& nodeSignals).
Both the slot and the signal are called ProcessMessages, as seen in the line below:
nodeSignals.ProcessMessages.connect(&ProcessMessages);

In net.h, it defines the combiner. The result_type is the return value of the combiner.
struct CombinerAll {
    typedef bool result_type;

    template<typename I>
    bool operator()(I first, I last) const
    {
        while (first != last) {
            if (!(*first)) return false;
            ++first;
        }
        return true;
    }
};

The combiner takes two input iterators, "first" and "last". It walks over the return values of all connected slots and returns false as soon as one of them is false; it returns true only if all of them are true.
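
To see what the combiner does without pulling in boost, we can feed it a plain list of bool values in place of the slot results. The struct is repeated here so that the sketch compiles on its own; CombineResults() is a hypothetical helper for this example.

```cpp
#include <vector>

// Same shape as the combiner in net.h, repeated so the sketch is standalone.
struct CombinerAll {
    typedef bool result_type;

    template <typename I>
    bool operator()(I first, I last) const {
        while (first != last) {
            if (!(*first)) return false;  // stop at the first failing slot
            ++first;
        }
        return true;
    }
};

// Hypothetical helper: passes a list of slot results through the combiner.
bool CombineResults(const std::vector<bool>& slot_results) {
    return CombinerAll()(slot_results.begin(), slot_results.end());
}
```

CombineResults({true, true, true}) gives true and CombineResults({true, false, true}) gives false. With no slot results at all, the loop body never runs and the result is true.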

In validationinterface.h, "struct CMainSignals" is declared. It also defines boost C++ library style signal. This line

boost::signals2::signal<void (const CBlockIndex *, const CBlockIndex *, bool fInitialDownload)> UpdatedBlockTip;

means for the UpdatedBlockTip signal, the connector returns void, and takes in 3 parameters.

In validationinterface.cpp, the connector is

g_signals.UpdatedBlockTip.connect(boost::bind(&CValidationInterface::UpdatedBlockTip, pwalletIn, _1, _2, _3));

boost::bind() is used here. It stores a callable equivalent to pwalletIn->UpdatedBlockTip(_1, _2, _3), where _1, _2 and _3 are placeholders for the three parameters passed by the signal.

In validation.cpp, the UpdatedBlockTip signal is called with three parameters.

GetMainSignals().UpdatedBlockTip(pindexNewTip, pindexFork, fInitialDownload);

So, for headless bitcoin core source code, linux and boost style signal are used.

Thursday, 5 October 2017

Linux Information

1. Linux Auto Login

Sometimes we may want to skip the linux login prompt, for example, a remotely-powered linux server in the basement of a building. Here is how we do it on RedHat linux 7.3.

  • modify /etc/inittab , comment out:
    1:2345:respawn:/sbin/mingetty tty1
    Add:
    1:2345:respawn:/usr/bin/openvt -f -c 1 -w -- /bin/login -f root
    -c 1 for console 1
    -- for separator

  • modify ~/.bash_profile, add:
    startx
    to start the X windows
  • modify ~/.xinitrc, add:
    gnome-session
    or
    startkde

2. Linux Programming

Linux Command Line Tools
Some useful tools are listed here.
% nm <object-files>
list symbols from object files

% readelf -d <elf-files>
display dynamic info about ELF files

% ldd <executable-files>
list shared library dependencies

% ar crv <lib-files> <object-files>
create static library

***GNU Make***
The make command is used extensively in Linux software development.

A) Make command line syntax:
The `-s' or `--silent' flag prevents all echoing, as if all commands are started with `@'.
eg. make -s zImage
The `-f' flag specifies a name of the makefile.
eg. make -f myMakefile
We can run make -n <target> to tell make to print out what it would do, without actually doing it.

B) The general Makefile syntax is:
target : dependency
 rules (or commands)
The command lines start with a tab character. Blank lines and lines of just comments may appear among the command lines; they are ignored. (But beware, an apparently "blank" line that begins with a tab is not blank! It is an empty command.)
To make a particular target, we need to pass its name to make as a parameter. Without a parameter, make will try to make the first target listed in the makefile.
Make invokes a shell to execute the commands of a rule, and uses a new shell for each command line. To make several script commands run in one shell, we must join them into one logical line by adding a backslash at the end of each line. The @ sign tells make not to print the command on standard output before running it.

C) Makefile contents:
Makefiles contain five kinds of things: explicit rules, implicit rules, variable definitions, directives, and comments.

directive:
     include filenames...

When make processes an include directive, it suspends reading of the containing makefile and reads from each listed file in turn. When that is finished, make resumes reading the makefile in which the directive appears.
If you want make to simply ignore a makefile which does not exist and cannot be remade, with no error message, use the -include directive instead of include, like this:
     -include filenames...

D) Other Makefile stuff:
Targets that do not refer to files but are just actions are called phony targets.
objects = main.o kbd.o command.o display.o \
               insert.o search.o files.o utils.o

.PHONY : clean
clean :
	-rm $(objects)

clean:
	-rm -f *.o
The rule for the target "clean" has no dependencies. This means that, as long as no file named clean exists, the target is always considered out of date and its commands are always executed. Also notice that the command starts with "-". This tells make to ignore the exit status of the command, so make clean will always succeed. The `-' is discarded before the command is passed to the shell for execution.
A few special macros are defined in make. Here is the summary.
foo.o : foo.c
	gcc -c $(CFLAGS) $^ -o $@

$^ is the list of all dependencies (here just foo.c)
$@ is the target (foo.o)

foo.o : foo.c defs.h hack.h
 gcc -c $(CFLAGS) $< -o $@

$< is the first dependency (foo.c)

.c.a:
 gcc -c $<
 ar rv $@ $*.o

$* is the name of the target without the suffix
Make has a special syntax for dealing with library files. The syntax is lib(file.o). It means that the object file file.o is stored in the library file lib.a.
$(LIBRARY): $(LIBRARY)(db_api.o)
db_api.o: db_api.c db_api.h
For a project with multiple subdirectories, a main makefile in the main directory will invoke the sub-makefiles. The syntax is like this:
(cd subdirectory;$(MAKE))
or, equivalently, this:
$(MAKE) -C subdir
The brackets ensure that it is all processed by a single shell. Since a new shell is invoked for this, the program running the make doesn't execute the cd command. Only the shell invoked to carry out the rule is in a different directory.

E) Variable substitution
To substitute a variable's value, write a dollar sign followed by the name of the variable in parentheses or braces: either `$(foo)' or `${foo}' is a valid reference to the variable foo. This special significance of `$' is why you must write `$$' to have the effect of a single dollar sign in a file name or command.
The first flavor of variable is a Recursively expanded variable. Variables of this sort are defined by lines using `=' or by the define directive.
foo = $(bar)
bar = $(ugh)
ugh = Huh?

all:;@echo $(foo)
It will echo `Huh?'. `$(foo)' expands to `$(bar)' which expands to `$(ugh)' which finally expands to `Huh?'.
To avoid all the problems and inconveniences of recursively expanded variables, there is another flavor: Simply expanded variables. Simply expanded variables are defined by lines using `:='
foo := $(bar)
bar := $(ugh)
ugh := Huh?

all:;@echo $(foo)
This will echo an empty line, because $(bar) is still undefined at the point where foo is assigned, so foo is set to the empty string.
There is another way, called a Conditional variable assignment operator, because it only has an effect if the variable is not yet defined. This statement:
  FOO ?= bar
is exactly equivalent to this:
ifeq ($(origin FOO), undefined)
  FOO = bar
endif

F) String Substitution (in Makefile):
$(patsubst pattern,replacement,text)
Finds whitespace-separated words in text that match pattern and replaces them with replacement. Here pattern may contain a `%' which acts as a wildcard, matching any number of any characters within a word. If replacement also contains a `%', the `%' is replaced by the text that matched the `%' in pattern. `%' characters in patsubst function invocations can be quoted with preceding backslashes (`\'). Whitespace between words is folded into single space characters; leading and trailing whitespace is discarded.
For example,
$(patsubst %.c,%.o,x.c.c bar.c)
produces the value `x.c.o bar.o'.
Substitution references are a simpler way to get the effect of the patsubst function:
$(var:pattern=replacement)
is equivalent to
$(patsubst pattern,replacement,$(var))
The second shorthand simplifies one of the most common uses of patsubst: replacing the suffix at the end of file names.
$(var:suffix=replacement)
is equivalent to
$(patsubst %suffix,%replacement,$(var))
For example, you might have a list of object files:
objects = foo.o bar.o baz.o
To get the list of corresponding source files, you could simply write:
$(objects:.o=.c)
instead of using the general form:
$(patsubst %.o,%.c,$(objects))
Another example in real Makefile:
SUB = server client

%.build:
 (cd $(patsubst %.build, %, $@) && $(MAKE))

%.clean:
 (cd $(patsubst %.clean, %, $@) && $(MAKE) clean)

all:  $(patsubst %, %.build, $(SUB))
clean:  $(patsubst %, %.clean, $(SUB))

G) Conditional statement (in Makefile):
Three types: if-else-fi, ifeq-else-endif, ifdef-else-endif.
@if [ -z $(DESTDIR) ]; then \
    /sbin/depmod -ae ; \
elif [ -f $(SYSTEMMAP) ]; then \
    /sbin/depmod -ae -b $(DESTDIR) -F $(SYSTEMMAP) ; \
else \
    echo "Don't forget the target system."; \
fi
ifeq ($(strip $(foo)),)
  text-if-empty
else
  text-if-not-empty
endif
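When testing a variable with ifdef, give only the variable name; do not write $(foo). ifdef only tests whether the variable has a non-empty value, without expanding it, so the example below sets frobozz to yes even if bar is empty.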
foo = $(bar)
ifdef foo
  frobozz = yes
else
  frobozz = no
endif


Using GNU GDB and gdbserver
GDB can be used to debug programs written in C and C++. The GDB distribution contains gdbserver in one of its subdirectories. We run gdbserver on the target platform in order to save space, and use GDB on the host to connect to gdbserver for debugging.

Compiling and generating library images
The sequences of generating a *.a (static library)
a) gcc -c xxx.c
b) ar crv libxxx.a xxx.o

The sequences of generating a *.so (dynamic library)

gcc -c -fPIC xxx.c
gcc -shared -o libxxx.so xxx.o

Files, Pipes, Sockets
Everything is represented as a file under Linux. Even hardware devices are represented by files in Linux. The low-level system calls can be used to access the files, and similarly the hardware devices, pipes, sockets.
int open(const char *path, int flags);
- open system call returns a new file descriptor
ssize_t read(int fd, void * buf, size_t nbytes);
- read system call reads up to nbytes from the file associated with the file descriptor fd into buf
ssize_t write(int fd, const void * buf, size_t nbytes);
- write system call writes up to nbytes from buf to the file referenced by the file descriptor fd
int close(int fd);
- close system call closes a file descriptor, so that it may be reused
int ioctl(int fd, int cmd, ...);
- ioctl system call provides an interface for controlling the behavior of devices
off_t lseek(int fd, off_t offset, int whence);
- lseek system call sets the read/write pointer of a file descriptor
The standard library and its header file stdio.h provide an interface to the low-level system calls. The library provides functions that take care of buffering. In the standard library, the equivalent of a file descriptor is called a stream, and is implemented as FILE * .
FILE * fopen(const char *filename, const char *modes);
- fopen library function returns a FILE * pointer
size_t fread(void * ptr, size_t size, size_t nitems, FILE * stream);
- fread library function reads from the stream into ptr, for a record of size and a count of nitems
size_t fwrite(const void * ptr, size_t size, size_t nitems, FILE * stream);
- fwrite library function writes from ptr to the stream, for a record of size and a count of nitems
int fclose(FILE * stream);
- fclose library function closes the specified stream
int fseek(FILE * stream, long int offset, int whence);
- fseek library function sets the position in the stream for the next read or write on that stream.
Pipe is a primitive form of inter process communications. It allows the data flow from one process to go to another process. For shell commands, it is entered as :
cmd1 | cmd2
The high-level pipe functions operate on file streams.
FILE * popen(const char * command, const char * open_mode);
It allows a program to invoke another program as a new process, and either pass data or receive data from it.
int pclose(FILE * stream);
It closes the file stream associated with the pipe.
In contrast, the low-level pipe function provides a way of passing data between two programs, without the overhead of invoking a shell to interpret the requested command.
int pipe(int fd[2]);
It is passed an array of two integer file descriptors. Any data written to fd[1] can be read back from fd[0]. Low-level system calls, read and write, are used to access the data.
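
A minimal sketch of pipe() together with the read and write system calls, kept inside one process for simplicity. RoundTripThroughPipe() is a hypothetical helper name, and the sketch assumes the message is small enough to fit in the pipe buffer, so that write() does not block waiting for a reader.

```cpp
#include <string>
#include <unistd.h>

// Writes `message` into a pipe and reads it back within one process.
// Returns the bytes read, or an empty string on any error.
std::string RoundTripThroughPipe(const std::string& message) {
    int fd[2];
    if (pipe(fd) != 0) return "";

    // fd[1] is the write end of the pipe.
    ssize_t written = write(fd[1], message.data(), message.size());
    close(fd[1]);  // closing the write end lets read() see end-of-file

    std::string result;
    if (written == static_cast<ssize_t>(message.size())) {
        char buf[256];
        ssize_t n;
        // fd[0] is the read end; read until end-of-file or error.
        while ((n = read(fd[0], buf, sizeof(buf))) > 0) {
            result.append(buf, static_cast<size_t>(n));
        }
    }
    close(fd[0]);
    return result;
}
```

In a real program the two ends would normally be used by different processes, typically after a fork(), with each process closing the end it does not use.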
A named pipe exists in the file system as a special type of file, but behaves like the unnamed pipes we discussed earlier. It is called FIFO too.
int mkfifo(const char * filename, mode_t mode);
It creates a named pipe, using absolute pathname.
We can remove the FIFO by using the rm command, or from within a program by using the unlink function.
A FIFO exists as a named file, not as an open file descriptor. It must be opened before it can be read from or written to. The open and close system calls can be used for the purpose. The read and write system calls can be used to access the FIFO after it is opened.

Using GCC __func__ macro
GCC provides three magic variables which hold the name of the current function, as a string. The first of these is __func__, which is part of the C99 standard:
The identifier __func__ is implicitly declared by the translator as if, immediately following the opening brace of each function definition, the declaration
           static const char __func__[] = "function-name";
appeared, where function-name is the name of the lexically-enclosing function. This name is the unadorned name of the function.
__FUNCTION__ is another name for __func__. Older versions of GCC recognize only this name. However, it is not standardized. For maximum portability, we recommend you use __func__, but provide a fallback definition with the preprocessor:
     #if __STDC_VERSION__ < 199901L
     # if __GNUC__ >= 2
     #  define __func__ __FUNCTION__
     # else
     #  define __func__ ""
     # endif
     #endif
In C, __PRETTY_FUNCTION__ is yet another name for __func__. However, in C++, __PRETTY_FUNCTION__ contains the type signature of the function as well as its bare name, such as "void a::sub(int)".
Click here to see the examples: exam.c and exam.h file.
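
A short sketch of __func__ in C++ (C++11 or later). The exact string is implementation-defined in C++; the comments below assume GCC or Clang, which store the unadorned function name. WhoAmI() and demo::Inner() are hypothetical names for this example.

```cpp
#include <string>

// Returns this function's own name, taken from __func__.
std::string WhoAmI() {
    return __func__;
}

namespace demo {
// With GCC and Clang, __func__ holds only the bare name, so the
// enclosing namespace is not part of the string.
std::string Inner() {
    return __func__;
}
}  // namespace demo
```

This is handy in logging macros, where __func__ can be printed next to __FILE__ and __LINE__ to show where a message came from.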

Debugging
There are 8 severity levels in the Linux kernel, defined in <linux/kernel.h>. DEFAULT_MESSAGE_LOGLEVEL is specified in kernel/printk.c, and is applied to any printk call with no specified priority. If the priority value is less than console_loglevel, the message is displayed on the console. If both klogd and syslogd are running on the system, kernel messages are appended to /var/log/messages, independent of console_loglevel.
console_loglevel is defined in <linux/kernel.h> and defaults to DEFAULT_CONSOLE_LOGLEVEL. printk writes each message to a circular buffer and then wakes up any process that is waiting for messages. If klogd is running, it retrieves the kernel messages and dispatches them to syslogd, which in turn checks the settings in /etc/syslog.conf.
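The console filtering rule is a single comparison, sketched here in Python rather than kernel C. The function name is made up, and the values 4 and 7 used in the usage lines are assumed typical defaults for the message and console log levels, not values taken from this text.

```python
# Severity names in priority order; index 0 is the most urgent.
LEVELS = ["KERN_EMERG", "KERN_ALERT", "KERN_CRIT", "KERN_ERR",
          "KERN_WARNING", "KERN_NOTICE", "KERN_INFO", "KERN_DEBUG"]

def reaches_console(priority: int, console_loglevel: int) -> bool:
    """A message is shown on the console only if its priority value is
    numerically lower than console_loglevel."""
    if not 0 <= priority < len(LEVELS):
        raise ValueError("priority must be between 0 and 7")
    return priority < console_loglevel

if __name__ == "__main__":
    assumed_console_loglevel = 7      # assumption, see the note above
    for value, name in enumerate(LEVELS):
        print(name, reaches_console(value, assumed_console_loglevel))
```

With a console level of 7, everything except KERN_DEBUG messages would reach the console.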

To use syslogd:
  • /etc/init.d/syslog -- script file
  • /etc/syslog.conf -- config file
  • create /var/log directory.

    An example syslog.conf config file:
    # cat syslog.conf
    *.*                      /var/log/messages
    

    AWK 
    It is an interpreted programming language for performing complex text processing tasks. It is also a simple text processing utility. It stands for the names of its authors: Aho, Weinberger & Kernighan.
    simple syntax:
    awk <search pattern> {<program actions>} <data filename>
    eg. awk '/gold/ {print $5,$6,$7,$8}' coins.txt
    full syntax:
    awk 'BEGIN   {<initialization>}
     <search pattern 1> {<program actions>}
     <search pattern 2> {<program actions>}
     ...
     END   {<final actions>}'
    
    eg. awk 'END {print NR, "coins"}' coins.txt
    
    NR - one of Awk's pre-defined variables, stands for Number of Records
    Awk regards each line of input data as composed of multiple fields, which are essentially words separated by blank spaces. A blank space is the default "field separator". To tell Awk to use another field separator, use -F.
    eg. awk -F\"  -- use " as the separator
    
    eg. awk -F\" '/REL/ {print $2}' include/linux/version.h
    (inside a Makefile, write $$2 so that make passes a literal $2 to awk)
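To make the field model concrete, here is a small Python sketch of how a line is cut into numbered fields. It is an illustration only, not how Awk is implemented. The helper name awk_field and the sample version.h line are assumptions.

```python
from typing import Optional

def awk_field(line: str, index: int, sep: Optional[str] = None) -> str:
    """Return the awk-style field $index (1-based); $0 is the whole line.

    sep=None splits on runs of blanks, like Awk's default separator.
    Any other value plays the role of the -F option.
    """
    if index == 0:
        return line
    fields = line.split(sep)
    return fields[index - 1] if index <= len(fields) else ""

if __name__ == "__main__":
    sample = '#define UTS_RELEASE "2.4.20-8"'    # hypothetical version.h line
    print(awk_field(sample, 2, '"'))             # field 2 when " is the separator
    print(awk_field("1 oz gold coin", 3))        # default blank separator
```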
    

    SED
    Stream editor
    sed -e s/<reg expr>/<replace text>/{flags}
    
    eg. sed -e '1,2 s/line/LINE/' test.txt
    eg. sed -e 's/cat/dog/g' test.txt
    
    regular expr
    ^ - match the beginning of the line
    $ - match the end of the line
    . - match any single character
    * - match zero or more occurrences of the preceding character
    ? - match 0 or 1 instance of the preceding character
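The two sed examples above differ in their line range and in the g flag. The Python sketch below mimics that behaviour with re.sub so the difference is easy to see. The function name and sample lines are made up, and Python's regular expression syntax is not identical to sed's.

```python
import re

def sed_substitute(lines, pattern, replacement, first=1, last=None, global_flag=False):
    """Mimic `sed -e 'first,last s/pattern/replacement/[g]'` on a list of lines.

    Without global_flag only the first match on each selected line is
    replaced, which is what sed does when the g flag is absent.
    """
    result = []
    for number, line in enumerate(lines, start=1):
        in_range = number >= first and (last is None or number <= last)
        if in_range:
            line = re.sub(pattern, replacement, line, count=0 if global_flag else 1)
        result.append(line)
    return result

if __name__ == "__main__":
    text = ["line one line", "line two", "line three"]
    print(sed_substitute(text, "line", "LINE", first=1, last=2))
    print(sed_substitute(["cat and cat"], "cat", "dog", global_flag=True))
```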
    

    Spinlock and Semaphore
  • Spinlocks are very small and fast, and can be used anywhere. If your task can't get the spinlock, your task keeps trying (spinning) until your task can.
  • A semaphore can have more than one holder at any time (the number is decided at initialization time), although it is most commonly used as a single-holder lock (known as a mutex). If your task can't get a semaphore, your task will put itself on the queue, and be woken up when the semaphore is released. This means the CPU will do something else while your task is waiting.
  • A semaphore is used for synchronization between user contexts. User context means the kernel is executing on behalf of a particular process (i.e. a system call or trap) or kernel thread. This is not to be confused with userspace. It can be interrupted by software or hardware interrupts.
  • A spinlock is used for synchronization between user context and interrupt context. spin_lock_bh() disables softirqs on that CPU, then grabs the lock. spin_lock_irq() is defined to disable interrupts on that CPU, then grab the lock.
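The idea that the number of holders is fixed at initialization can be shown with a loose userspace analogy. This sketch uses Python's threading.Semaphore, not the kernel API, and the helper name is made up. A kernel task would sleep on the queue where this sketch simply reports a failed attempt.

```python
import threading

def try_acquire_many(limit: int, attempts: int) -> list:
    """Create a semaphore with `limit` holders and try to take it
    `attempts` times without blocking. Returns one bool per attempt."""
    sem = threading.Semaphore(limit)
    return [sem.acquire(blocking=False) for _ in range(attempts)]

if __name__ == "__main__":
    print(try_acquire_many(2, 3))   # two holders allowed, the third attempt fails
    print(try_acquire_many(1, 2))   # a limit of 1 behaves like a mutex
```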

    Exporting Symbols
  • In source file:
     #define EXPORT_SYMTAB
    
    and then after you write your function:
     EXPORT_SYMBOL(your_function_name);
    
  • In Makefile:
     export-objs := filename.o
    

    Networking commands
    Commands for device statistics:
    cat /proc/net/dev
      show device statistics
    cat /proc/net/snmp
      show snmp statistics
    
    
    Commands for routing table:
    route add -net 192.168.2.0 netmask 255.255.255.0 gw 192.168.1.1
      add gateway 192.168.1.1 to routing table for network 192.168.2.0
    route del -net 192.168.2.0/24
      delete routing table entry
    route -n
      show routing table status
    

    Commands for ethernet:
    ifconfig eth0 txqueuelen 10000
      increase the transmit queue of the network interface
    echo 1 > /proc/sys/net/ipv4/ip_forward
      enable ip forwarding
    /sbin/ethtool -K eth0 tso off
      disable TCP segmentation offload
    /bin/uname -r
      show kernel version
    
    netstat -p --tcp
      show network connection for tcp protocol

    Reset procedures for MIPS and ARM




  • MIPS IDT438
    sys_reboot (kernel/sys.c)
    --> machine_restart (arch/mips/kernel/reset.c)
        --> _machine_restart (assigned in arch/mips/rc32438/79EB438/setup.c)
           --> idt_reset (arch/mips/rc32438/79EB438/reset.c)
    
  • ARM IXP425
    sys_reboot (kernel/sys.c)
    --> machine_restart (arch/arm/kernel/process.c)
        --> arch_reset (include/asm-arm/arch-ixp425/system.h)
    

    3. Shell Script Programming

    A. Passing parameters to shell script
    As a simple example, I have a shell script that takes one parameter. I name it "cp_script".
    #!/bin/sh
    echo "Parameter $1"
    
    Then I call the shell script on the command line and pass it a parameter:
    $./cp_script madwifi
    Parameter madwifi
    
    That's it: $1 in the shell script holds the first command line parameter.

    B. If-else-fi in shell script:
    [ expr ] is used to see if an expression is true.
    if [ -f "$dir/$file" ] || [ -f "$dir/$newfile" ]; then
        echo "Either this filename [$file] exists"
        echo "Or this filename [$newfile] exists"
    elif [ -d "$dir" ]; then
        echo "This dirname [$dir] exists"
    else
        echo "Neither [$dir] nor [$file or $newfile] exists"
    fi
    

    C. Variable declaration:
    A variable is assigned with NAME=value and referenced as $NAME, for example:
    #!/bin/sh
    
    TC="/sbin/tc"
    
    $TC qdisc ls
    $TC qdisc del dev eth0 root
    $TC qdisc add dev eth0 root pfifo_fast
    
    
    
    
    D. The getopt command
    The getopt command is used to parse the command line parameters, which are made up of two parts: options and non-option parameters.
    -o specifies the short options and -n is the name of the program. The -- separates getopt's own options from the parameters to be parsed; in the output, -- marks the start of the non-option parameters.
    #! /bin/bash
    
    echo "param" $@
    TEMP=`getopt -o nr -n oudanhodou -- "$@"`
    echo "temp" $TEMP
    unset TEMP
    
    
    Execution:
    $ ./oudanhodou -r
    param -r
    temp -r --
    
    $ ./oudanhodou -r fresh
    param -r fresh
    temp -r -- 'fresh'
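The same split into options and non-option parameters can be reproduced with Python's standard getopt module, which is modelled on the C getopt function. This is only a sketch: the function name split_args is made up, and unlike the shell command, Python's getopt.getopt stops scanning at the first non-option parameter.

```python
import getopt

def split_args(argv):
    """Split argv into recognised short options (-n, -r) and the rest."""
    options, rest = getopt.getopt(argv, "nr")    # "nr" mirrors `-o nr`
    return [flag for flag, _value in options], rest

if __name__ == "__main__":
    print(split_args(["-r"]))
    print(split_args(["-r", "fresh"]))
```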
    

    4. Linux Setup

    A. Linux network setup on Redhat 9
    /etc/hosts
    127.0.0.1  localhost
    
    /etc/host.conf
    order hosts,bind
    
    /etc/resolv.conf
    nameserver  192.168.4.254
    
    This is the IP address of the DNS server.
    /etc/services
    
    This file contains port allocation information.

    B. Change ip address
    /etc/sysconfig/network-scripts/ifcfg-eth0
    IPADDR=192.168.4.239
    DHCP_HOSTNAME=192.168.4.251
    
    /sbin/ifup eth0
    
    C. Use dhcp client
    execute redhat-config-network
    
    then execute /sbin/dhclient
    
    settings in /etc/dhclient-eth0.conf
    
    For DHCP server and client settings on an embedded system, see DHCP info.

    D. Update linux kernel
    1. create floppy boot disk
    ls /lib/modules
     2.4.2-2
    
    mkbootdisk --device /dev/fd0 2.4.2-2
    
    2. clean old kernel config
    make mrproper
    
    3. change whatever settings you want
    make menuconfig
    
    4. update the change
    make dep
    
    5. make it
    make bzImage
    
    6. make the modules
    make modules
    
    7. install the newly made modules
    make modules_install
    
    8. update the boot loader config file with new kernel information
    /etc/lilo.conf
    
    9. read from new boot loader config file and store into boot sector
    /sbin/lilo -v
    
    
    E. lilo information
    "/sbin/lilo -v" reads the /etc/lilo.conf file to determine what to write to the MBR.
    lilo.conf example
    boot=/dev/hda #install lilo on the first hard disk
    map=/boot/map
    install=/boot/boot.b #specify new boot sector file
    prompt
    timeout=50 #wait for 5 sec
    lba32    #use 32-bit LBA addressing instead of C/H/S geometry
    default=linux
    image=/boot/vmlinuz-2.4.20-8 #specify linux kernel
     label=linux
     root=/dev/hda1  #specify root partition
    image=/boot/vmlinuz
     label=failsafe
     root=/dev/hda1
     initrd=/boot/initrd.img
    other=/dev/hda2
     label=windows
     table=/dev/hda
    
    
    F. Make ramdisk
    mkdir ramdisk
    cd ramdisk
    dd if=/dev/zero of=initrd bs=1k count=8192
     bs is the block size
    /sbin/mke2fs -vFm0 initrd 8192
    mkdir mnt
    sudo mount -o loop initrd mnt
    cd mnt
    cp -a .....
    cd ..
    sudo umount initrd
    
    To do "mount -o loop" on target platform:
    1. the kernel must have loopback device support enabled
    2. on the file system, create the /dev/loop device file
    3. the kernel must have ext2 support enabled
    
    target bootloader configured with:
    bootparm1=/root=/dev/ram init=/linuxrc rw
    
    
    G. Starting X windows
    startx > log 2>&1
    gcc -Wl,-v 2>&1 | grep "GNU ld"
    

    H. Sudo List
    In /etc/sudoers file, we can add in a list of users to specify their permissions.
    User_Alias   SW_STAFF = julian, hunk, yeosv
    # User privilege specification
    root   ALL=(ALL) ALL
    SW_STAFF  ALL=NOPASSWD: ALL
    newcomer ALL=(ALL) NOPASSWD: ALL
    
    
    I. Kernel Booting Sequences
    This is the kernel boot sequence for ARM architecture processors. It may apply to other architectures with minor modifications.
    (arch/arm/boot/compressed/head.S)
     -> setup the stack
     -> call decompress_kernel (arch/arm/boot/compressed/misc.c)
     -> jump to decompressed code
    
    (arch/arm/kernel/head.S)
    _stext   -> start_kernel (init/main.c)     [ task 0, idle task ]
       -> setup_arch (arch/arm/kernel/setup.c)
       -> trap_init
       -> init_IRQ
       -> sched_init
       -> softirq_init
       -> time_init
       -> console_init
       -> init_modules
       -> kmem_cache_init
       -> calibrate_delay
       -> mem_init
       -> kmem_cache_sizes_init
       -> fork_init
       -> proc_caches_init
       -> vfs_caches_init
       -> ...
       -> rest_init             [ launch init kernel thread ]
        -> init (init/main.c)     [ the "init" kernel thread ]
         -> do_basic_setup
         -> prepare_namespace
         -> launch /sbin/init
    
    include/asm-arm/arch-ixp425/memory.h
     contains PAGE_OFFSET and PHYS_OFFSET
    
    arch/arm/boot/Makefile
     contains ZTEXTADDR and ZRELADDR

    Saturday, 26 August 2017

    Bitcoin core tutorial & code walk through (Part 5) - rawtransaction

    1) In part 5, we will look at Bitcoin raw transactions. A raw transaction gives you full manual control of the Bitcoin transaction, at the risk of making errors and losing bitcoins permanently.

    The bitcoin-cli command can be used to create a raw transaction that pays any recipient address. The decoderawtransaction command displays the raw transaction that we have created. After that, the sendrawtransaction command actually spends the bitcoin.

    For testing purpose, the regtest network is used.


    Firstly, we use listunspent command to find out the UTXO (the txid) in the wallet
    ./src/bitcoin-cli -regtest listunspent 

    Then, we use getaccountaddress to find out the default address (the address) 

    ./src/bitcoin-cli -regtest getaccountaddress ""

    Then, we use createrawtransaction command to construct the raw transaction
    ./src/bitcoin-cli -regtest createrawtransaction "[{\"txid\":\"2ca49406de60010f88876ac142f7846647942b33522a3a10bc493df480cffb34\",\"vout\":0}]" "{\"mjQgnYbydSCupJzkQS9HDBoP1CGcZZQMjt\":0.01}"

    The txid is from the listunspent command and the address is from the getaccountaddress command. The vout of 0 is the index of the output being spent in the transaction identified by txid. The 0.01 is the amount, in BTC, to be paid to the address.

    ** When forming a new transaction, the txid in vin must be a valid txid from listunspent, and the vout in vin must be a valid output index of that referenced transaction.

    We can see that relationship using decoderawtransaction and listunspent. The decoderawtransaction command shows the contents of the transaction that was created.
    {
      "txid": "23eb6ae82550ce2dfcc9014c9226a755609195100139cdd57e58d2a0509ea175",
      "hash": "23eb6ae82550ce2dfcc9014c9226a755609195100139cdd57e58d2a0509ea175",
      "version": 2,
      "size": 85,
      "vsize": 85,
      "locktime": 0,
      "vin": [
        {
          "txid": "2ca49406de60010f88876ac142f7846647942b33522a3a10bc493df480cffb34",
          "vout": 0,
          "scriptSig": {
            "asm": "",
            "hex": ""
          },
          "sequence": 4294967295
        }
      ],
      "vout": [
        {
          "value": 0.01000000,
          "n": 0,
          "scriptPubKey": {
            "asm": "OP_DUP OP_HASH160 a3a350969090584be8887193c4eb96125ae06226 OP_EQUALVERIFY OP_CHECKSIG",
            "hex": "76a914a3a350969090584be8887193c4eb96125ae0622688ac",
            "reqSigs": 1,
            "type": "pubkeyhash",
            "addresses": [
              "mjQgnYbydSCupJzkQS9HDBoP1CGcZZQMjt"
            ]
          }
        }
      ]
    }

    Then, we sign the raw transaction
    ./src/bitcoin-cli -regtest signrawtransaction 020000000134fbcf80f43d49bc103a2a52332b94476684f742c16a87880f0160de0694a42c0000000000ffffffff0140420f00000000001976a9142ab10fb15ad20af203ee529a973cdbb1a0c2e68688ac00000000
    {
      "hex": "020000000134fbcf80f43d49bc103a2a52332b94476684f742c16a87880f0160de0694a42c0000000049483045022100b5928c21d2ea2436e669948e6eb4036a0d1de9b70210d3b103d475c9cf5b7b0b02200ac3a7fd690b16607ba833bef0e15dd84a21f0b609692727af7838dd7ac21f9d01ffffffff0140420f00000000001976a9142ab10fb15ad20af203ee529a973cdbb1a0c2e68688ac00000000",
      "complete": true

    }
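The unsigned hex string passed to signrawtransaction above can be split into its fields by hand, which shows where the txid, vout and amount end up. The Python sketch below is a minimal parser written for this one-input, one-output legacy layout only. It is not Bitcoin Core code, the function name is made up, and it assumes script lengths fit in a single byte.

```python
import struct

RAW_TX = ("0200000001"
          "34fbcf80f43d49bc103a2a52332b94476684f742c16a87880f0160de0694a42c"
          "00000000" "00" "ffffffff"
          "01" "40420f0000000000"
          "19" "76a9142ab10fb15ad20af203ee529a973cdbb1a0c2e68688ac"
          "00000000")

def parse_simple_tx(hex_tx: str) -> dict:
    """Decode a legacy transaction with exactly one input and one output."""
    raw = bytes.fromhex(hex_tx)
    pos = 0

    def take(n: int) -> bytes:
        nonlocal pos
        chunk = raw[pos:pos + n]
        pos += n
        return chunk

    version = struct.unpack("<I", take(4))[0]        # little-endian uint32
    if take(1) != b"\x01":
        raise ValueError("this sketch handles exactly one input")
    txid = take(32)[::-1].hex()                      # stored byte-reversed
    vout = struct.unpack("<I", take(4))[0]
    script_sig = take(take(1)[0]).hex()              # empty before signing
    sequence = struct.unpack("<I", take(4))[0]
    if take(1) != b"\x01":
        raise ValueError("this sketch handles exactly one output")
    value = struct.unpack("<Q", take(8))[0]          # amount in satoshi
    script_pubkey = take(take(1)[0]).hex()
    locktime = struct.unpack("<I", take(4))[0]
    return {"version": version, "txid": txid, "vout": vout,
            "scriptSig": script_sig, "sequence": sequence,
            "value_satoshi": value, "scriptPubKey": script_pubkey,
            "locktime": locktime}

if __name__ == "__main__":
    for key, item in parse_simple_tx(RAW_TX).items():
        print(key, item)
```

The txid appears byte-reversed inside the hex, and the value 40420f0000000000 is 1,000,000 satoshi in little-endian order, which is the 0.01 BTC requested.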

    Finally, we spend the bitcoins.
    ./src/bitcoin-cli -regtest sendrawtransaction 020000000134fbcf80f43d49bc103a2a52332b94476684f742c16a87880f0160de0694a42c0000000049483045022100b5928c21d2ea2436e669948e6eb4036a0d1de9b70210d3b103d475c9cf5b7b0b02200ac3a7fd690b16607ba833bef0e15dd84a21f0b609692727af7838dd7ac21f9d01ffffffff0140420f00000000001976a9142ab10fb15ad20af203ee529a973cdbb1a0c2e68688ac00000000
    error code: -26
    error message:
    256: absurdly-high-fee

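    As shown here, the transaction fee is (total input - total output). This transaction has a single 0.01 BTC output and no change output, so the rest of the input would become the fee. If the fee is too large, Bitcoin Core refuses to send the transaction.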
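The fee arithmetic can be sketched as below. The 50 BTC input value and the 0.1 BTC limit are assumptions made for illustration (the text above does not state either), and the function names are made up.

```python
COIN = 100_000_000                   # satoshi per BTC

def fee_satoshi(input_values, output_values):
    """Fee is whatever the outputs leave unspent from the inputs."""
    fee = sum(input_values) - sum(output_values)
    if fee < 0:
        raise ValueError("outputs exceed inputs")
    return fee

def exceeds_limit(fee, max_fee=COIN // 10):
    """max_fee defaults to 0.1 BTC, an assumed limit for this sketch."""
    return fee > max_fee

if __name__ == "__main__":
    assumed_input = 50 * COIN        # hypothetical value of the spent output
    fee = fee_satoshi([assumed_input], [COIN // 100])
    print(fee, exceeds_limit(fee))
```

Adding a second output that pays most of the input back to an address you control (a change output) keeps the fee small.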

    2) In this analysis, the source code that handles raw transaction RPC will be walked through.
    rpc/rawtransaction.cpp
    UniValue createrawtransaction(const JSONRPCRequest& request)
    This function creates a CMutableTransaction object, rawTx, and sets rawTx.nLockTime from parameter 2. It parses each txid and vout, and pushes a CTxIn object into rawTx. The function then checks whether each output is data or an address. If an address is used, it builds the CScript scriptPubKey from the address. Then the CTxOut is set with the amount and scriptPubKey. Lastly, it encodes rawTx with the function EncodeHexTx(rawTx) and returns the hex.

    UniValue decoderawtransaction(const JSONRPCRequest& request)
    The function creates a CMutableTransaction object, mtx, and calls DecodeHexTx() to decode the hex into mtx. It then calls TxToUniv() and returns the UniValue result.

    UniValue sendrawtransaction(const JSONRPCRequest& request)
    The function creates a CMutableTransaction object, mtx, and decodes the parameter into it with DecodeHexTx(). It creates a CTransactionRef object from the decoded mtx, then checks the mempool and the connection manager. If the connection manager (g_connman) is available, it creates a CInv object and pushes the inventory to the peer nodes (CNode * pnode). It returns the transaction hash (txid).

    UniValue signrawtransaction(const JSONRPCRequest& request)
    The function creates std::vector<unsigned char> txData from the raw transaction hex data, and then a CDataStream is made from the txData.
    Next, the std::vector<CMutableTransaction> txVariants is filled from the CDataStream object. Then, a new CMutableTransaction mergedTx is created from txVariants.
    Then, CCoinsView and CCoinsViewCache objects are created to fetch the previous inputs.
    If optional keys are passed as parameters, each one is stored in a CKey object.
    If previous txouts are passed as parameters, a COutPoint object and a Coin are created and added to the CCoinsView object.

    It also checks whether a redeem script is given in the parameters.
    Then, the CKeyStore object is added.
    In the signing part, a SignatureData sigdata is created. ProduceSignature() is called to sign and CombineSignatures() is called to merge other signatures. UpdateTransaction() is called to update the mergedTx object. Finally, the UniValue result is filled with the hex and complete fields and returned.

    script/sign.cpp
    bool ProduceSignature(const BaseSignatureCreator& creator, const CScript& fromPubKey, SignatureData& sigdata)
    This function takes the CScript, SignatureData, and BaseSignatureCreator, and calls SignStep() to update a boolean variable named "solved". Depending on whether the type is TX_SCRIPTHASH, TX_WITNESS_V0_KEYHASH or TX_WITNESS_V0_SCRIPTHASH, it processes and updates the sigdata. The return value is a logical "and" of VerifyScript() and the bool variable "solved".


    static bool SignStep(const BaseSignatureCreator& creator, const CScript& scriptPubKey, std::vector<valtype>& ret, txnouttype& whichTypeRet, SigVersion sigversion)
    The function consists of one big switch-case statement. Depending on the script type, different actions are taken to produce the signature. The script type is decided using Solver().

    script/standard.cpp
    bool Solver(const CScript& scriptPubKey, txnouttype& typeRet, std::vector<std::vector<unsigned char> >& vSolutionsRet)
    This function first sets up a std::multimap of templates, mapping txnouttype to CScript. If the script is pay to script hash, witness keyhash, witness script hash, or null data, it sets the transaction type and returns. If not, the function enters a for loop over the templates. In the for loop, it sets up two CScript iterators, one over the script and one over the template, and compares them. If there is a match, it returns the transaction type if it is not TX_MULTISIG. If no template matches, the function sets the type to TX_NONSTANDARD and returns false.
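The kind of pattern check that Solver() performs can be approximated by comparing the fixed bytes of a script. The Python sketch below is a simplified stand-in, not the Bitcoin Core algorithm: it checks byte layouts directly instead of walking templates, it leaves out the pay to public key and multisig templates, and the function name is made up.

```python
def classify_script(script_hex: str) -> str:
    """Return a type name for a few well-known scriptPubKey layouts."""
    s = bytes.fromhex(script_hex)
    # OP_HASH160 <20 bytes> OP_EQUAL
    if len(s) == 23 and s[:2] == b"\xa9\x14" and s[22] == 0x87:
        return "scripthash"
    # OP_0 <20 bytes> and OP_0 <32 bytes>
    if len(s) == 22 and s[:2] == b"\x00\x14":
        return "witness_v0_keyhash"
    if len(s) == 34 and s[:2] == b"\x00\x20":
        return "witness_v0_scripthash"
    # OP_RETURN followed by data
    if len(s) >= 1 and s[0] == 0x6a:
        return "nulldata"
    # OP_DUP OP_HASH160 <20 bytes> OP_EQUALVERIFY OP_CHECKSIG
    if len(s) == 25 and s[:3] == b"\x76\xa9\x14" and s[23:] == b"\x88\xac":
        return "pubkeyhash"
    return "nonstandard"

if __name__ == "__main__":
    print(classify_script("76a914a3a350969090584be8887193c4eb96125ae0622688ac"))
```

The example script is the scriptPubKey hex shown in the decoderawtransaction output earlier, which that output also labels as pubkeyhash.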