Sunday 24 May 2020

Begin Machine Learning as a software engineer

In this post, I am writing about how to start applying machine learning (ML) in software as a software engineer.
The picture below shows that in ML, we build an ML model using data and results. This stands in contrast to traditional programming, where we write a program that does the actual computation.
After the model is built, we use it to make predictions.
For sure, we can always create our own ML model. To do that, we have to gain a rather good understanding of ML algorithms. If we are not able or not willing to create our own ML model (such as when we want to apply ML in a practical solution), we can reuse existing models from ML libraries.
In ML programming, we choose an existing model, build the model architecture (such as a classifier), feed training data to the model, and use the trained model to make decisions on newly arrived data.
The difference between an algorithm and a model is:
This is the algorithm of linear regression with one variable: y = w₀ + w₁x
This is the model after applying data and results: y = (5) + (-2)x
The purpose of training a model is to adjust the model parameters so that the model fits the user-supplied data well.
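As a minimal sketch of this idea (using scikit-learn, with made-up data points that lie on y = 5 - 2x):

import numpy as np
from sklearn.linear_model import LinearRegression

# made-up training data lying on y = 5 - 2x
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 5 - 2 * X.ravel()

# training adjusts the parameters (intercept and coefficient) to fit the data
model = LinearRegression().fit(X, y)
print(model.intercept_, model.coef_)  # approximately 5.0 and [-2.0]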
We choose the Keras library as a starting point. Keras provides a Sequential model. For this article, we look at bank customer data and decide whether a customer would stop using the bank's services. This is a classification problem, and we will use the Sequential model to do the classification.
In part one, we do data preprocessing. Firstly, we import the numpy and pandas libraries. Pandas is a data manipulation library.
import numpy as np
import pandas as pd
Secondly, we import the dataset using pandas.
dataset = pd.read_csv('Churn_Modelling.csv')
X = dataset.iloc[:, 3:13].values
y = dataset.iloc[:, 13].values
Then, we encode the categorical data (label encoding for the gender column, one-hot encoding for the geography column). Note that the categorical_features parameter of OneHotEncoder has been removed in recent scikit-learn releases, so we use ColumnTransformer instead.
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
labelencoder_X_2 = LabelEncoder()
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])  # encode gender as 0/1
ct = ColumnTransformer([('geography', OneHotEncoder(), [1])], remainder = 'passthrough')
X = ct.fit_transform(X)
X = X[:, 1:]  # drop one dummy column to avoid the dummy variable trap
We split the dataset into the training set and test set.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
After that, we apply feature scaling to the data so that each feature is standardised (zero mean, unit variance).
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
In part two, we will make an artificial neural network. Firstly, we import the Keras library.
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
Secondly, we use the Sequential model to build an ML classifier. A Sequential model is a model with a sequence of layers: input, hidden, and output layers.
classifier = Sequential()
We add the input layer and the first hidden layer.
classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu', input_dim = 11))
We add the second hidden layer.
classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu'))
Lastly, we add the output layer.
classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
Now, compile the classifier. The adam optimizer is an adaptive moment estimator (the n-th moment of a random variable is the expected value of the variable raised to the power n). The optimizer decides how the network weights are updated.
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
For binary classification, binary_crossentropy is the suitable loss function.
We fit the training set to the classifier.
classifier.fit(X_train, y_train, batch_size = 10, epochs = 100)
In part three, we use the classifier to make predictions and evaluate the model. Firstly, we feed the test data to the classifier.
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)
Secondly, we use the confusion matrix to evaluate the model.
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
When I examine the cm value, it shows that out of 2000 test samples, the model makes correct predictions 1595 times (an accuracy of about 80%).
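The accuracy can be computed directly from the confusion matrix, since the correct predictions sit on its diagonal:

correct = cm[0, 0] + cm[1, 1]  # true negatives + true positives
print(correct / cm.sum())      # about 0.80 for 1595 correct out of 2000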
The python and CSV files can be found at:

Sunday 2 February 2020

Bug descriptions and solutions

This post is about bugs and their solutions on embedded Linux.

Bug 1
Remove internet connectivity from the Gateway device. The first time a USB device containing swupdate.swu is plugged in, no firmware update is performed. The second time the USB device is plugged in, a firmware update occurs. When internet connectivity is removed, it should not do a firmware update at all, so this is a bug.

Solution:
Check the gateway system time against the x.509 certificate validity period. If the system time is outside the x.509 validity period, swupdate does not run, because the swupdate software checks the x.509 certificate. The problem has nothing to do with internet connectivity.

# openssl x509 -in x509.crt -text -noout
Validity
    Not Before: ...
    Not After : ...
# date
-- shows the system time

Compare the system time against the x509 certificate validity period.

Bug 2
When using the mobile app to connect to the Gateway device via BLE, multiple occurrences of the D-Bus error message "Rejected send message" are seen in the journal log. This error message should not happen.

Solution:
Using dbus-monitor, we can see the D-Bus system messages. There is a method call from BlueZ to the config daemon, and there is a signal and a method return in response to the method call. The method return is rejected by the D-Bus daemon, and the D-Bus error message is printed out.

The gateway sends a GATT Indicate to the mobile app, and the mobile app responds with a GATT Confirm. The Indicate is used to transfer data; the Confirm is the acknowledgement. When the mobile app sends a Confirm to the gateway, BlueZ receives it and sends a D-Bus confirmation to the config daemon, which then sends the next chunk of data to BlueZ. The method return is also sent out by the config daemon. The config daemon uses the dbus-python library, and the solution is in that library: we have to check the D-Bus must-not-reply flag on the incoming message. If the flag is set, we stop the library from sending out the method return.
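A hedged sketch of the idea in dbus-python terms (get_no_reply() is the dbus-python accessor for that message flag; the wrapper function below is illustrative, not the actual library code):

def maybe_send_method_return(connection, message, reply):
    # if the caller set the no-reply-expected flag, sending a method return
    # is what makes dbus-daemon log "Rejected send message"
    if message.get_no_reply():
        return
    connection.send_message(reply)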

Bug 3
If the Gateway is physically connected to ethernet and also has wifi, and the ethernet link is unable to come up, the gateway is not able to use wifi. The gateway should automatically fail over to wifi.

Solution
Use NetworkManager to check the link: if the ethernet link is not up, automatically switch to wifi. The NetworkManager config file provides checking of the link status.
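A sketch of the relevant section in NetworkManager.conf (the URI and interval below are example values; NetworkManager periodically fetches the URI to verify that the active link really has connectivity):

[connectivity]
uri=http://example.com/nm-check
interval=60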

Bug 4
When running a high-throughput network application over ethernet, the CPU is loaded to over 90%. This slows down the response time of other processes.

Solution:
The ethernet driver was using interrupt-driven mode. Switching to NAPI, and implementing TSO so that segmentation of the packets is handled by the NIC, reduces the CPU load.
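For example, assuming the interface is eth0 and the NIC supports the offload, TSO can be inspected and enabled from userspace with ethtool:

# show current offload settings
ethtool -k eth0
# enable TCP segmentation offload
ethtool -K eth0 tso on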

Thursday 4 April 2019

Security Token Offering

Concept

Security tokens are cryptographic blockchain based tokens that represent financial assets such as bonds, notes, debentures, shares, options, and private equities; as well as tokenised real assets.

It allows fractional ownership of assets, and it meets regulatory scrutiny.

Companies use an STO to raise money from investors. STO investors are promised gains in the form of dividends, rewards (interest), or an increase in the value of the company.


STO standards

-  ST-20, from Polymath, basically ERC20 with an investor whitelist
-  ERC1400, draft standard, tranches of security (different filings of the same underlying security, e.g. Reg D for U.S. investors and Reg S for foreign investors), ERC20 compatible, incorporates ERC1410
-  ERC1410, partially fungible token (organises tokens into a set of partitions)
-  ERC1404, draft standard, with transfer restrictions, ERC20 compatible
-  R-Token, from Harbor, ERC20 with additional compliance checking

ERC1400
-  Transfer of tokens can be reversed
-  Token balance includes metadata - shareholder rights , other restrictions
-  Token separated into tranches
-  Standard interface to query the validity of a transfer
-  Standard event for redemption and issuance
-  ST-20 results in ERC1400

ERC1404
-  Maintain a whitelist of investor addresses
-  Enforce complex restrictions
-  Support branded standards, such as ST-20 and R-token

STO issuance platforms
- Polymath, with DAPP for STO token issuance, using Poly tokens
- Harbor, using R-tokens
- Securitize, using DS protocols, issues security tokens on XRP and Ethereum
- Swarm, using src20 protocol
- Securrency, using CAT-20 token, with KYC and AML engines, compatible with any blockchain
- tZERO, using tZERO token

Polymath STO steps
- Register ticker symbol
- Deploy smart token contract
- Add investor to whitelist
- Mint tokens for shareholder
- Setup STO parameters - start date, end date, supply cap
- Start the STO - deploy the STO contract
It has a modular approach: STO module, Transfer Manager module, etc.
It uses smart contract methods, such as the verifyTransfer method in the Transfer Manager module, to validate transfers.



Wednesday 6 March 2019

Rails setup problems and solutions

After setting up Rails, you can run 'rails -v' and get 'Rails 5.2.2'.

You copy or clone an existing Rails project and cd to the project folder. You run 'rails -v' and get the error:
/home/<username>/.rbenv/versions/2.5.3/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require': cannot load such file -- bundler/setup (LoadError)

You run 'bundle install' and get the error:
/home/<username>/.rbenv/versions/2.5.3/lib/ruby/2.5.0/rubygems.rb:289:in `find_spec_for_exe': can't find gem bundler (>= 0.a) with executable bundle (Gem::GemNotFoundException)

Then, to solve the problem, you run 'gem install bundler'.
You run 'bundle install' again, but you still get the same error.

The solution is:
Open Gemfile.lock in the project folder and check the BUNDLED WITH version. If it is 1.16.1, you need to run:
'gem install bundler -v 1.16.1'
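For reference, the version is recorded at the bottom of Gemfile.lock in a section like this:

BUNDLED WITH
   1.16.1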

Then, you can run 'bundle install'. It will be successful.

Similarly, you can run 'rails -v' and it will show the version. The version is the one specified in Gemfile.lock.



Friday 19 October 2018

Elliptic Curve Cryptography

An elliptic curve is a set of points that satisfies a math equation:
   y² = x³ + ax + b

(The graph of the curve is omitted here.) The curve has interesting properties:
- any point on the curve can be reflected across the x-axis and remains on the curve
- any non-vertical line can intersect the curve in at most 3 points
- point operations are easy to compute forward and hard to reverse, the property of a trap-door function


An elliptic curve crypto-system can be defined by picking a prime number as a maximum (the modulus), a curve equation, and a public point on the curve.


A private key is a number N, and the public key is the public point dotted with itself N times (that is, the point added to itself N times, a scalar multiplication by N).

Computing the private key from the public key requires solving the elliptic curve discrete logarithm problem, analogous to the classic discrete logarithm y = g^x mod q.

The discrete logarithm function is hard to invert: nobody can recover x from y (think of y as the public key and x as the private key).

It is a good trap-door function.

It can obtain the same level of security with a smaller key size (compared to RSA).
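To make the point arithmetic concrete, here is a minimal sketch of curve addition and double-and-add scalar multiplication over a tiny prime field (toy parameters, not secp256k1; requires Python 3.8+ for pow(x, -1, p)):

def ec_add(P, Q, a, p):
    # add two points on y^2 = x^3 + ax + b over F_p (None = point at infinity)
    if P is None: return Q
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p  # tangent slope
    else:
        m = (y2 - y1) * pow(x2 - x1, -1, p) % p         # chord slope
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)

def scalar_mult(N, G, a, p):
    # public key = G added to itself N times (double-and-add)
    R = None
    while N:
        if N & 1:
            R = ec_add(R, G, a, p)
        G = ec_add(G, G, a, p)
        N >>= 1
    return R

# toy curve y^2 = x^3 + 7 over F_97; (1, 28) is on the curve since 28^2 ≡ 1 + 7 (mod 97)
print(scalar_mult(5, (1, 28), 0, 97))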

For Bitcoin, secp256k1 names the parameters of the elliptic curve used in Bitcoin public key cryptography.

secp256k1 details:

y² = x³ + ax + b over the finite field 𝔽P, defined by T = (P, a, b, G, n, h), where:

a = 0, b = 7, so y² = x³ + 7
P = a large prime number: 2²⁵⁶ − 2³² − 2⁹ − 2⁸ − 2⁷ − 2⁶ − 2⁴ − 1
G = 02 79BE667E F9DCBBAC 55A06295 CE870B07 029BFCDB 2DCE28D9 59F2815B 16F81798
h = 01
n = FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364141

Saturday 29 September 2018

SHA Hash Algorithm

Important property:
  • One way from input to hash value, cannot reverse
  • It is computationally infeasible to find two different inputs that generate the same hash value

Actual SHA example:
  • Choose a word to hash, eg CRYPTO
  • Convert the word to ASCII
         CRYPTO becomes 67 82 89 80 84 79
  • Convert from ASCII to binary
         01000011-01010010-01011001-01010000-01010100-01001111 
        (it becomes a 48 bit message)
  • Join and add 1 at the end
         0100001101010010010110010101000001010100010011111
  • Add zeros until the message length is congruent to 448 mod 512; the 48-bit message with the appended 1 needs 399 zeros added to the end
  • Add the original message length to the remaining 64-bit field (the field left over after the 448 modular arithmetic), so the message becomes 16 sections of 32 bits
  • 01000011010100100101100101010000
    01010100010011111000000000000000
    00000000000000000000000000000000
    00000000000000000000000000000000
    00000000000000000000000000000000
    00000000000000000000000000000000
    00000000000000000000000000000000
    00000000000000000000000000000000
    00000000000000000000000000000000
    00000000000000000000000000000000
    00000000000000000000000000000000
    00000000000000000000000000000000
    00000000000000000000000000000000
    00000000000000000000000000000000
    00000000000000000000000000000000
    00000000000000000000000000110000
  • Transform the 16 x 32 message into 80 words using a step loop function. Firstly, do ((14 XOR 9) XOR 3) XOR 1 (the 14th, 9th, 3rd and 1st words), and we get
         01000011010100100101100101010000
  • Rotate left, we get
         10000110101001001011001010100000
  • Process is repeated until there are 80 words (one word = 32 bits)
  • The 1st, 3rd, 9th and 14th words are chosen by the algorithm (a runnable sketch follows below):
  • for i from 16 to 79
          w[i] = (w[i-3] xor w[i-8] xor w[i-14] xor w[i-16]) leftrotate 1

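A minimal Python sketch of this expansion step, treating each word as a 32-bit integer:

  def expand(words):
      # words: the 16 initial 32-bit block words; extend to 80 words
      w = list(words)
      for i in range(16, 80):
          x = w[i-3] ^ w[i-8] ^ w[i-14] ^ w[i-16]
          w.append(((x << 1) | (x >> 31)) & 0xFFFFFFFF)  # leftrotate 1
      return w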
  • Run a set of operations on the 80 words in specific order using the five variables

  • H0 - 01100111010001010010001100000001
    H1 - 11101111110011011010101110001001
    H2 - 10011000101110101101110011111110
    H3 - 00010000001100100101010001110110
    H4 - 11000011110100101110000111110000
    • The operations are combinations of AND, OR, XOR and NOT operators, together with additions and rotations
    • The five initial values H0-H4 are fixed constants defined in the SHA-1 specification
    • The result: we get five variables
    H0 - 01000100101010010111000100110011
    H1 - 01010000111001010011100001011000
    H2 - 11110000010110000100011000111101
    H3 - 01001011111101111111000111100101
    H4 - 01000010110110011100101001001011

    • Convert the five variables to hex
    H0 - 44a97133
    H1 - 50e53858
    H2 - f058463d
    H3 - 4bf7f1e5
    H4 - 42d9ca4b
    • Join the variables together, get the hash
    44a9713350e53858f058463d4bf7f1e542d9ca4b
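    We can check the worked example against Python's built-in implementation; the printed digest should match the hash joined above:

    import hashlib
    print(hashlib.sha1(b"CRYPTO").hexdigest())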

    Tuesday 3 July 2018

    Run Bitcoind as a Docker Service

    This is a continuation from the previous blog article http://embedded-design-vic.blogspot.com/2018/06/run-bitcoind-in-docker-container.html


    1) append this line to the Dockerfile,
       so that bitcoind is started automatically when the container image is run
    ENTRYPOINT ./bin/bitcoind -datadir=node -daemon && /bin/bash

    2) rebuild the image
     docker build -t bitcoin-docker .

    3) tag image for upload to registry
    docker tag <image> username/repository:tag

    4) upload tagged image to registry
    docker push username/repository:tag 

    5) add a docker-compose.yml file
    version: "3"
    services:
      web:
        # replace username/repo:tag with your name and image details
        image: chaintope99/bitcoin:dev
        deploy:
          replicas: 5
          resources:
            limits:
              cpus: "0.1"
              memory: 50M
          restart_policy:
            condition: on-failure
        ports:
          - "5000:12001"
        networks:
          - webnet
    networks:
      webnet:

    6) init the swarm; a swarm is a group of machines joined as a cluster,
    and the swarm manager uses several strategies to run containers
    docker swarm init

    7) run the specified docker compose file
    docker stack deploy -c <composefile> <appname>
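    To verify that the replicas are running (using the <appname> from the deploy step):
    docker stack ps <appname>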

    8) use curl to access the dockerised bitcoind service
    curl --user username:password --data '{"method": "getinfo"}' http://127.0.0.1:5000

    Monday 25 June 2018

    101 confirmations

    Question:
    After mining, I run the listtransactions command and can see the address has an amount of 50 coins in each transaction. Why does the getbalance command return 0?

    $ src/bitcoin-cli -datadir=../datadir listtransactions
    [
      {
        "account": "",
        "address": "15454L2G44NuZoZrE2HdBwMmCchQVcGvKm",
        "category": "immature",
        "amount": 50.00000000,
        "label": "",
        "vout": 0,
        "confirmations": 2,
        "generated": true,
        "blockhash": "00000000df3ae70ae6c5f3eb24b0dca0c37ef55b76fad5396f1386aaab2b0027",
        "blockindex": 0,
        "blocktime": 1527831787,
        "txid": "24db59cbb12a8ffd3f1421931f2a6a2293b1b6437021af88119da95937c8f737",
        "walletconflicts": [
        ],
        "time": 1527831787,
        "timereceived": 1527831828,
        "bip125-replaceable": "no"
      }, 
      {
        "account": "",
        "address": "15454L2G44NuZoZrE2HdBwMmCchQVcGvKm",
        "category": "immature",
        "amount": 50.00000000,
        "label": "",
        "vout": 0,
        "confirmations": 1,
        "generated": true,
        "blockhash": "00000000eb85f984adc2905671aaa8663d505c0ee71fb5a0d47996f76a12f336",
        "blockindex": 0,
        "blocktime": 1527831949,
        "txid": "d47efd3725fb3be3072131ea6612b2e6581a876f11760e844e91ecd8b414f22e",
        "walletconflicts": [
        ],
        "time": 1527831949,
        "timereceived": 1527831973,
        "bip125-replaceable": "no"
      }
    ]
    $ src/bitcoin-cli -datadir=../datadir getbalance ""
    0.00000000

    Generated coins cannot be spent until the generation transaction has gone through 101 confirmations. Transactions that try to spend generated coins before then will be rejected. The 101 confirmations are the maturity time.
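    In regtest mode this is easy to demonstrate: generating 99 more blocks gives the first coinbase transaction 101 confirmations, so its 50 coins mature and show up in the balance (using the generate RPC available in this version of Bitcoin Core):

    $ src/bitcoin-cli -datadir=../datadir generate 99
    $ src/bitcoin-cli -datadir=../datadir getbalance ""
    50.00000000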

    The reason for this is that sometimes the block chain forks, blocks that were valid become invalid, and the mining reward in those blocks is lost. That is an unavoidable part of how Bitcoin works. If there was no maturation time, then whenever a fork happened, everyone who received coins that were generated on an unlucky fork (possibly through many intermediaries) would have their coins disappear, even without any sort of double-spend or other attack. On long forks, people could find coins disappearing from their wallets, even though there is no one actually attacking them and they had no reason to be suspicious of the money they were receiving. 

    For example, without a maturation time, a miner might deposit 50 BTC into an EWallet, and if the user withdraws money from a completely unrelated account on the same EWallet, the withdrawn money might just disappear if there is a fork and he/she is unlucky enough to withdraw coins that have been "tainted" by the miner's now-invalid coins. 

    Due to the way this sort of taint tends to "infect" transactions, far more than 50 BTC per block would be affected. Each invalidated block could cause transactions collectively worth hundreds of bitcoins to be reversed. The maturation time makes it impossible for anyone to lose coins by accident like this as long as a fork doesn't last longer than 100 blocks. 

    If a fork does last longer than 100 blocks, then the damage caused by invalidated transactions would likely be a huge disaster. This is unlikely to happen, as something would have to be seriously wrong with Bitcoin or the Internet for a fork to last this long.



    Friday 22 June 2018

    Run Bitcoind in Docker container


    Use a Docker container to run Bitcoin Core

    0) Install Docker
     Docker can be installed on Windows, Linux, MacOS

    1) Create new directory for running bitcoin core in Docker
     mkdir bitcoin-core-docker
     cd bitcoin-core-docker

    2) Create new dir  to store bitcoin.conf
     mkdir node; cd node

    3) Create bitcoin.conf and add the contents
    server=1
    regtest=1
    port=12000
    rpcport=12001
    rpcallowip=0.0.0.0/0
    rpcuser=username
    rpcpassword=password
    daemon=1
    txindex=1

    4) Build the Docker container
      cd .. (back to bitcoin-core-docker)
      create a Dockerfile

    5) Add contents to the Dockerfile
    # Dockerfile must start with a FROM instruction
    # FROM instruction specifies the Base Image from which you are building
    # FROM <image>[:<tag>]
    FROM ubuntu:16.04

    ENV BTCVERSION=0.15.1

    ENV BTCPREFIX=/bitcoin-prefix

    RUN apt-get update && apt-get install -y git build-essential wget pkg-config curl libtool autotools-dev automake libssl-dev libevent-dev bsdmainutils libboost-system-dev libboost-filesystem-dev libboost-chrono-dev libboost-program-options-dev libboost-test-dev libboost-thread-dev

    WORKDIR /

    RUN mkdir -p /berkeleydb

    # clone the bitcoin source code (0.15 branch)
    RUN git clone -b 0.15 --single-branch  https://github.com/bitcoin/bitcoin.git

    WORKDIR /berkeleydb

    RUN wget http://download.oracle.com/berkeley-db/db-4.8.30.NC.tar.gz && tar -xvf db-4.8.30.NC.tar.gz && rm db-4.8.30.NC.tar.gz && mkdir -p db-4.8.30.NC/build_unix/build

    ENV BDB_PREFIX=/berkeleydb/db-4.8.30.NC/build_unix/build
    WORKDIR /berkeleydb/db-4.8.30.NC/build_unix

    RUN ../dist/configure --disable-shared --enable-cxx --with-pic --prefix=$BDB_PREFIX

    RUN make install

    RUN apt-get update && apt-get install -y libminiupnpc-dev libzmq3-dev  libprotobuf-dev protobuf-compiler libqrencode-dev

    WORKDIR /bitcoin

    RUN git checkout v${BTCVERSION} && mkdir -p /bitcoin/bitcoin-${BTCVERSION}

    WORKDIR /bitcoin

    RUN ./autogen.sh

    RUN ./configure CPPFLAGS="-I${BDB_PREFIX}/include/ -O2" LDFLAGS="-L${BDB_PREFIX}/lib/ -static-libstdc++" --prefix=${BTCPREFIX}

    RUN make

    RUN make install DESTDIR=/bitcoin/bitcoin-${BTCVERSION}

    RUN mv /bitcoin/bitcoin-${BTCVERSION}${BTCPREFIX} /bitcoin-${BTCVERSION} && strip /bitcoin-${BTCVERSION}/bin/* && rm -rf /bitcoin-${BTCVERSION}/lib/pkgconfig && find /bitcoin-${BTCVERSION} -name "lib*.la" -delete && find /bitcoin-${BTCVERSION} -name "lib*.a" -delete

    WORKDIR /

    RUN tar cvf bitcoin-${BTCVERSION}.tar bitcoin-${BTCVERSION}

    # copy bitcoin.conf
    ADD . /bitcoin-${BTCVERSION}

    # expose rpc port for the node to allow access from outside container
    EXPOSE 12001

    WORKDIR /bitcoin-${BTCVERSION}

    6) build the docker image
       docker build -t bitcoin-docker .
    • The -t flag sets a name for the image.
    • The . tells Docker to look for the Dockerfile in the current directory

     6.1) list the built images:
      docker images

    7) run the image in container
    docker run -it -p 5000:12001 bitcoin-docker
    • -it is required for interactive processes (like a bash shell)
    • -p maps host port 5000 to the container’s exposed port 12001, which is where the Bitcoin RPC will be listening

    If everything works, Docker presents a bash shell. Both the node directory and the bitcoin.conf file were copied into the container by the ADD instruction in the Dockerfile, so they will be present in the current working directory.

    In the bash shell, run
    7.1) bitcoind -datadir=node -daemon
    7.2) bitcoin-cli -datadir=node getinfo

    8) Connect to bitcoind from outside Docker (open second terminal window)
    curl --user username:password --data '{"method": "getinfo"}' http://127.0.0.1:5000

    9) In first terminal window, run this to stop bitcoind
     bitcoin-cli -datadir=node stop
    9.1) In second terminal window, exit the docker container
     exit

    Use Docker to run Bitcoin core as service
    TBC…

    PS: we can push the image to Docker Hub and share it with others

    # tag the image

    docker tag bitcoin-docker <username>/bitcoin:custom
    # push to docker hub

    docker push <username>/bitcoin:custom
    # run the image, if image is not available locally, docker pull from repo

    docker run -it -p 5000:12001 <username>/bitcoin:custom

    Wednesday 6 June 2018

    Segregated Witness (Segwit)

    Segwit
    Introduction
    Segregated Witness (Segwit) [1], proposed in BIP 141 [5], was activated on August 24, 2017. The contributions of Segwit [2]:
    1) solve transaction malleability [3]
    2) mitigate block size limitation problem
    Problem
    1) Transaction malleability:
    When a transaction is signed, the signature (script_sig) does not cover all the data in the transaction. Specifically, the script_sig is part of the transaction, but the signature itself is stored in script_sig, so the signature cannot cover script_sig. The script_sig is added after the transaction is created and signed.


    The script_sig is the tampering point: if script_sig changes, the TXID will change. The script_sig can be changed by anyone who has access to the corresponding private keys.
    2) Block size limitation problem
    Originally, Bitcoin did not have a limit on block size. This allowed attackers to create blocks of very large size, so a 1 MB block size limit was introduced. The 1 MB value was a tradeoff between network propagation times, node capability, the number of transactions that can fit into one block, etc. [4].
    Proposal
    Segwit defines a new structure called the witness. The signature and redeem script are moved into this structure, which is not counted against the 1 MB block size limit.
    1) Transaction structure

    The conventional transaction structure, with script_sig left empty, is used in the TXID calculation. So even if script_sig is tampered with, the TXID does not change.
    2) Lock/Unlock script
    For a conventional P2PKH:
    scriptPubKey (lock script): OP_DUP OP_HASH160 <pubkey hash> OP_EQUALVERIFY OP_CHECKSIG
    scriptSig (unlock script): <sig> <pubkey>
    For a Segwit P2WPKH:
    scriptPubKey (lock script): 0 <pubkey hash>
    scriptSig (unlock script): empty
    witness: <sig> <pubkey>
    In the scriptPubKey there are no opcodes; only two data items (the version and the hash) are pushed. When a lock script of this pattern is seen, it is evaluated like a conventional P2PKH script, except that the signature and public key are obtained from the witness instead of the scriptSig.
    3) Witness extension method
    In the extension method, Segwit builds on the approach of OP_CLTV (OP_NOP2) and OP_CSV (OP_NOP3), which were introduced by redefining NOP opcodes.
    The witness structure
    <witness version>
    <witness program>
    For Segwit, the witness version is 0; the witness program is P2WPKH if the hash length is 20 bytes and P2WSH if it is 32 bytes.
    4) Address format
    Segwit uses the Bech32 address format. It is based on a BCH code instead of the previously used Base58 encoding, so that error correction is possible [6]. There is no distinction between uppercase and lowercase letters, and QR codes are also more compact.

    5) Increase of block size
    The increase of block size from Segwit depends on the types of transaction.
    • Before Segwit
    block data ≦ 1,000,000 bytes
    • After Segwit
    block weight = base size × 3 + total size (a worked example follows after this list)
    base size: size of the transaction data not including the witness
    total size: size of the transaction data including the witness
    block weight ≦ 4,000,000 weight units
    • If all transactions in the block are non-Segwit transactions, the block size is 1 MB, the same as before
    • If all transactions in the block are P2WPKH transactions with 1 input and 2 outputs, the block size is about 1.6 MB
    • If the block holds one huge transaction whose inputs are all P2WPKH, the block size is about 2.1 MB
    • If the block consists of P2WSH transactions with huge witnesses (all 15-of-15 multisig, etc.), the block size is about 3.7 MB
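    As a quick worked example of the weight formula (assumed sizes):

    # a transaction of 300 bytes in total, of which 100 bytes is witness data
    base_size = 200                      # bytes, not counting the witness
    total_size = 300                     # bytes, including the witness
    weight = base_size * 3 + total_size  # 900 weight units of the 4,000,000 limit
    print(weight)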
    6) Changes in signature data
    The conventional message digest items are based on the conventional transaction structure. The message digest items are:
    version, txin count, txins, txout count, txouts, locktime, sighash type
    For Segwit, the message digest items are:
    version
    hashPrevouts: hash of all input outpoints
    hashSequence: hash of all input sequences (TxIns)
    outpoint: previous output (32-byte TXID + 4-byte index) of the TxIn
    script code
    value: amount of coins held by the TxIn
    sequence: sequence of the TxIn
    hashOutputs: hash of all outputs (TxOuts)
    locktime
    sighash type
    Segwit changes the calculation of the transaction hash for signatures, so that each byte of a transaction is hashed at most twice [7]. The sighash calculation cost is reduced.
    7) Witness commitment in Coinbase transaction
    For a conventional transaction, the merkle root is calculated using the original Tx format.

    Segwit adds the witness commitment: a merkle tree is constructed over the transaction data including the witness signature data, and that merkle root is stored in one of the coinbase transaction outputs to make a commitment that includes the witness data.


    Effects and Challenges
    Segwit changes the consensus rules, the P2P messages, and the address format of the Bitcoin protocol. It is amazing that Segwit could be realised as a soft fork.
    Segwit introduces the witness extension method. It eliminates transaction malleability and increases the block size. The actual block size increase depends on the transaction type.
    References
    1. https://en.bitcoin.it/wiki/Segregated_Witness
    2. https://en.wikipedia.org/wiki/SegWit
    3. https://en.bitcoin.it/wiki/Transaction_malleability
    4. https://en.bitcoin.it/wiki/Block_size_limit_controversy
    5. https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki
    6. https://en.wikipedia.org/wiki/BCH_code
    7. https://bitcoincore.org/en/2016/01/26/segwit-benefits/#linear-scaling-of-sighash-operations