As mentioned in my previous blog, transactions are the fundamental aspect of a block chain system. In this blog we will dig deep into what exactly constitutes a “transaction” in a block chain framework. It is challenging to cover all the technologies that make a transaction possible in one blog, so this is the first part of a few blogs on ‘Block Chain Transactions’.
Transactions are the central aspect of a block chain system. Everything else revolves around them: creating a transaction, validating it, packaging transactions into blocks, propagating them across the network, and so on. As you can see, understanding transactions is very important. Let us start by analyzing what exactly constitutes a transaction.
A transaction is a data structure and the key fields in the transaction are mentioned below:
- Input Counter
- Output Counter
Note: I have excluded other fields such as version and lock time to focus on the minimum necessary entities in this data structure.
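To make the shape of this data structure concrete, here is a minimal Python sketch. The class and field names (`TxInput`, `TxOutput`, `Transaction`, `payer`, `payee`, `amount_satoshi`) are my own illustrative assumptions and not Bitcoin's actual wire format. The counters are simply derived from the number of inputs and outputs, and the payer and payee are placeholder strings for now.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TxInput:
    # Placeholder for "who is paying"; a hypothetical field for illustration.
    payer: str


@dataclass
class TxOutput:
    # Placeholder for "who gets paid" and how much (1 BTC = 100,000,000 satoshi).
    payee: str
    amount_satoshi: int


@dataclass
class Transaction:
    inputs: List[TxInput] = field(default_factory=list)
    outputs: List[TxOutput] = field(default_factory=list)

    @property
    def input_counter(self) -> int:
        """Number of inputs in this transaction."""
        return len(self.inputs)

    @property
    def output_counter(self) -> int:
        """Number of outputs in this transaction."""
        return len(self.outputs)


tx = Transaction(
    inputs=[TxInput(payer="Michael Corleone")],
    outputs=[TxOutput(payee="Emilio Barzini", amount_satoshi=100_000_000)],
)
print(tx.input_counter, tx.output_counter)
```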
If you look at the above entities, it looks straightforward, right? A transaction has some input and output. Logically thinking, an input should come from the person initiating the transaction, saying “I am Michael Corleone and I am looking to pay 1 BTC from my bank account number ‘1234’”. The output should say “Pay this 1 BTC to Emilio Barzini into his bank account ‘2345’”. This is exactly what is contained in a transaction. However, there is no bank involved, and no personal names are involved either. In order to understand transactions from a block chain perspective we have to ask several questions:
- “Michael Corleone” could be a very common name (is it?!). How do I uniquely identify this person?
- How do I know that the person claiming to be “Michael Corleone” is really him?
- If there are no banks and bank accounts, how can I know if Michael really owns this “1 BTC”?
- How can Michael uniquely refer to Emilio Barzini?
Voila! Probing questions open the door to possible answers. Well, the possible answers to the above questions should be something like this:
- For Qn. 1, maybe we need a unique identifier for each person (like an SSN or, technically, a GUID).
- For Qn. 2, maybe we need to get his digital signature to prove that it is indeed him.
- For Qn. 3, maybe there should be a decentralized store where anyone can look up what this unique identifier owns and validate the amount and ownership.
- For Qn. 4, refer to answer 1.
What if there is a single powerful technology that can solve all of this? No, it’s not block chain! It’s something that we have been using in IT security for more than four decades: public key cryptography. Invented in the 1970s, this beauty has been at the heart of information security ever since. It makes it possible for us to assign unique identifiers and then generate non-forgeable signatures associated with those identifiers. I know that the sentence above is as confusing as it can get. To get some clarity, let’s dig a little deeper into public key cryptography.
Public Key Cryptography:
Public Key Cryptography (PKC) works with a pair of keys called the private key and the public key. The private key can be any random number in the allowed range. We then use the private key and some algorithms to generate the public key. That’s the simplest version of PKC. If you are still curious, read on.
Block chains such as Bitcoin use a variant of the Digital Signature Algorithm (DSA) called Elliptic Curve DSA (ECDSA). The same algorithm is used by all participating nodes in a block chain network. As a starting point, a random number is chosen as the private key. (Note that the system does this automatically whenever you generate a new key pair.) A cryptographically secure pseudo random number generator (CSPRNG) should be used for this, because the quality of the randomness is critical for the overall security. A private key is 256 bits long, so there are roughly 2^256 possible keys. Once the private key is generated, the public key is generated by using the following function:
K = k * G
where k is the private key and G is a constant point called the generator point. The output “K” is the public key. The * here is not ordinary multiplication but elliptic curve point multiplication, which is easy to compute in this direction but infeasible to reverse to recover k from K. Bitcoin uses a specific elliptic curve and set of constants defined by a standard called “secp256k1”. This is a very detailed area and detailed knowledge of it is not required for working with block chain. From a simple technologist perspective, as long as you are able to generate public and private keys and use them for signatures, you are good. Also, it’s not my intention to go into the details of ECDSA in this blog. However, for the curious mind I am going to make a small attempt to explain how DSA works in four parts.
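Before moving on to DSA, here is a sketch of the K = k * G computation described above, in plain Python. It is only an illustration, assuming Python 3.8+ (for modular inverses via `pow(x, -1, m)`). The helper names `point_add`, `scalar_mult` and `generate_keypair` are my own, and the simple double-and-add loop is not hardened against side-channel attacks, so use a vetted library for real keys. The constants are the published secp256k1 parameters.

```python
import secrets

# secp256k1 domain parameters: the curve y^2 = x^3 + 7 over the prime field F_P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # order of G
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(p1, p2):
    """Add two curve points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # p2 is the negation of p1
    if p1 == p2:
        # Tangent slope for point doubling (curve coefficient a = 0).
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        # Chord slope for adding two distinct points.
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point using double-and-add."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def generate_keypair():
    """Return (k, K): a random private key and its public key K = k * G."""
    k = secrets.randbelow(N - 1) + 1  # CSPRNG, private key in [1, N - 1]
    return k, scalar_mult(k, G)


private_key, public_key = generate_keypair()
print(hex(private_key))
print(tuple(hex(c) for c in public_key))
```

Note that the public key is a point with two coordinates, and that going from K back to k is the hard direction.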
Part 1: Behind the scenes
Step 1: Select the hashing algorithm to be used. Today the SHA-2 family is among the hashing algorithms recommended by NIST. Let us call this hash function H.
Step 2: Select the desired key lengths (L, N), where L is the bit length of p (and so of the public key) and N is the bit length of q (and so of the private key). The NIST FIPS standard recommends specific pairs such as (2048, 256) and (3072, 256).
Step 3: Select three values p, q and g based on certain conditions: p is an L-bit prime, q is an N-bit prime that divides p - 1, and g is a number whose powers modulo p repeat with period q.
This (p, q, g) is shared along with H with all systems. Please note that these values are part of the software solution. As a user of a public key infrastructure you normally never need to look at the values of (p, q, g), and there is no use in knowing them. They are not secret either: even if you find them out, that is not a loophole to crack the PKC.
Part 2: Key Pair Generation
Step 4: Select a random private key x (using a CSPRNG) with 0 < x < q, which makes it up to 256 bits long.
Step 5: Calculate the public key y = g^x mod p.
Now we have the pair (x, y), where x is the secret private key known only to the user who owns it and y is the public key that can be shared with everyone. If the private key is lost, then the BTC is gone forever. So it’s recommended to back up your private keys, and to keep a backup of your backup as well! If you are using a PKC based system or block chain, you will not be exposed to steps 1 to 5. All you will see is that a private and public key pair (x, y) is magically generated for you.
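Steps 3 to 5 can be sketched with toy numbers. The parameters below (p = 607, q = 101, g = 64) are hypothetical values I picked so the arithmetic stays readable. They satisfy the conditions from step 3 but are far too small to be secure, and the function name `generate_keypair` is my own.

```python
import secrets

# Toy domain parameters (hypothetical, insecure sizes; real ones are 2048+ bits).
p = 607                      # prime modulus
q = 101                      # prime that divides p - 1 (606 = 6 * 101)
g = pow(2, (p - 1) // q, p)  # g = 2^6 mod 607 = 64, which has order q modulo p


def generate_keypair():
    """Return (x, y): private key x and public key y = g^x mod p."""
    x = secrets.randbelow(q - 1) + 1  # Step 4: CSPRNG, 0 < x < q
    y = pow(g, x, p)                  # Step 5
    return x, y


x, y = generate_keypair()
print("private key x:", x)
print("public key  y:", y)
```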
Part 3: Signature Creation:
Now let us understand how a signature is created. Let us take a random message M. As a user, you pass the message M and your key pair (x, y) to the PK solution. Everything that I mention below is done behind the scenes.
Step 1: Generate a random secret value k with 0 < k < q. A fresh k must be used for every signature.
Step 2: Calculate r = (g^k mod p) mod q.
Step 3: Calculate s = k^(-1) * (H(M) + x*r) mod q, where k^(-1) is the inverse of k modulo q.
Step 4: The signature is (r, s).
As a user, you only see the output (r, s). From a block chain perspective, please note that the objective when you send a message is not to encrypt it but to let others verify that it has not been tampered with and that it comes from the rightful owner. So when you publish the message, you are actually publishing your public key (y), the message (M) and the signature (r, s). Please note that the block chain network does this for you when it propagates the message to the other nodes.
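The four signing steps can be sketched as follows, again with hypothetical toy parameters (p = 607, q = 101, g = 64) that are far too small for real use. Reducing the SHA-256 digest modulo q is a simplification I am assuming for this toy group, and `hash_to_int` and `sign` are illustrative names of my own.

```python
import hashlib
import secrets

# Toy domain parameters (hypothetical, insecure): q divides p - 1, g has order q.
p, q, g = 607, 101, 64


def hash_to_int(message: bytes) -> int:
    """H(M): SHA-256 digest as an integer, reduced mod q for this toy group."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % q


def sign(message: bytes, x: int):
    """Return a DSA signature (r, s) on message under the private key x."""
    while True:
        k = secrets.randbelow(q - 1) + 1  # Step 1: fresh secret k, 0 < k < q
        r = pow(g, k, p) % q              # Step 2
        s = pow(k, -1, q) * (hash_to_int(message) + x * r) % q  # Step 3
        if r != 0 and s != 0:             # retry in the rare degenerate case
            return r, s                   # Step 4


x = 7  # example private key, 0 < x < q
print(sign(b"Pay 1 BTC to Emilio Barzini", x))
```

Because k is random, signing the same message twice normally gives two different (r, s) pairs, and both are valid.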
Part 4: Signature Validation:
When another node in the system receives this message, it runs a verification algorithm that mirrors steps 2 and 3 of part 3, using the parameters H, M, y, (r, s) and (p, q, g). The value it generates (v) is then compared with r. If v = r, then the signature is valid, meaning it was created by the holder of the private key that corresponds to the public key y. It also means the message has not been tampered with, because even a single bit change in M changes H(M) and therefore v.
This is how DSA operates. ECDSA follows the same scheme but replaces the modular exponentiation with elliptic curve point operations to come up with the signatures. Please note that the block chain network does this validation itself whenever it receives a new message. From your standpoint, all this is hidden and taken care of by the block chain architecture. But if you are still curious to know more about ECDSA, you can always read the specs on “secp256k1”.
In this blog we started with transactions but ended up with ECDSA. This is the reason why I mentioned that we will cover “transactions” as a multi part series. Before I sign off, I will leave you with a chicken or egg problem. In a block chain transaction, what came first: the transaction input or the transaction output?
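For the curious, the standard DSA check computes w = s^(-1) mod q, u1 = H(M)*w mod q, u2 = r*w mod q and finally v = (g^u1 * y^u2 mod p) mod q. It works because g^u1 * y^u2 = g^((H(M) + x*r)*w), and by the definition of s that exponent equals k modulo q, so v is the same g^k value that produced r. Below is a minimal sketch using the same hypothetical toy parameters (p = 607, q = 101, g = 64). The names `hash_to_int` and `verify` are my own, and reducing the hash modulo q is a simplification for this toy group.

```python
import hashlib

# Toy domain parameters (hypothetical, insecure): q divides p - 1, g has order q.
p, q, g = 607, 101, 64


def hash_to_int(message: bytes) -> int:
    """H(M): SHA-256 digest as an integer, reduced mod q for this toy group."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % q


def verify(message: bytes, signature, y: int) -> bool:
    """Check a DSA signature (r, s) on message against the public key y."""
    r, s = signature
    if not (0 < r < q and 0 < s < q):
        return False                  # out-of-range values are rejected outright
    w = pow(s, -1, q)                 # inverse of s modulo q
    u1 = hash_to_int(message) * w % q
    u2 = r * w % q
    v = pow(g, u1, p) * pow(y, u2, p) % p % q
    return v == r
```

Note that the verifier needs only public information: the message, the signature, the public key y and the shared parameters. The private key x never leaves its owner.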