January 17, 2019

Walden by Henry David Thoreau

I recently finished reading Walden, by Henry David Thoreau.

The book describes Thoreau's time as he lived in a small cabin he built in the woods of Massachusetts, on the shores of Walden Pond. He describes building the cabin, living a simple life mostly separate from others, and the beauty of the environment throughout the seasons.

Thoreau could be described as a minimalist. He lives in a simple cabin, works a small garden in the mornings, and spends the afternoons taking walks through the woods or swimming in the pond. He lives on the bare necessities, and does the minimal amount of work to feed himself through the year, which leaves time for enjoyment of the little things.

The book was written in a different time: there are brief passages of narrative widely spaced between philosophical wanderings and vivid descriptions of the natural world. Some might call the book "boring", but I would rather call it "peaceful". Thoreau has a very down-to-earth, methodically rational worldview that is very different from the rushing, conclusion-jumping world of today's popular culture.

This is an example of classic American literature, and I recommend that everyone read it once.

-

January 16, 2019

A Simple Introduction to Crypto

Last weekend I was visiting with my grandmother and she said to me and my brothers "Can anybody explain crypto? I keep hearing about crypto on the news and I don't know what that is?"

We tried to briefly explain, but I don't think we did a good job. So I decided to lay out some simple groundwork for crypto that anybody could understand, even my grandmother.

The first thing to understand is that when the guys on the news talk about "crypto" they are probably talking about "cryptocurrencies", like Bitcoin, which could also be called cryptographic-currencies.

Let's start at the beginning: if you have a message written as letters, you can rewrite that as a big number. Here, let me demonstrate: let's use a simple system where each letter corresponds to a two-digit number, a is 01 on up to z is 26, make 00 a space and 27 a period, and we can write a sentence. So to write "abc" we could use the number 010203, and 101112 would be "jkl". Or the number 160520051800091900071805012027 is the message "peter is great." Actual cryptography will use ASCII or a similar system so that you have the whole alphabet, upper and lower case letters, a wide variety of punctuation, and numerals; but the underlying idea is the same - any message can be written as a really big number.
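For anybody who wants to try the letters-to-numbers idea on a computer, here is a minimal sketch in Python. The names ALPHABET, encode and decode are just my own choices for illustration. I keep the result as a string of digits rather than a true number so the leading zero in "01" is not lost.

```python
# Position in this string is the two-digit code: 00 = space, 01..26 = a..z, 27 = period.
ALPHABET = " abcdefghijklmnopqrstuvwxyz."

def encode(message):
    """Rewrite a message as a string of two-digit codes."""
    return "".join(f"{ALPHABET.index(ch):02d}" for ch in message)

def decode(digits):
    """Split the digits into pairs and look each pair up in the alphabet."""
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    return "".join(ALPHABET[int(pair)] for pair in pairs)

print(encode("abc"))     # 010203
print(decode("101112"))  # jkl
print(decode(encode("peter is great.")))  # peter is great.
```

This only handles lower case letters, spaces and periods; anything else makes encode raise an error, which is the limitation a system like ASCII removes.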

The next thing to understand is the idea of one-way or "trapdoor" functions. Let's take prime factoring as an example: what are the prime factors of 527? You might start by noticing it is odd, so not 2; then you start dividing by each prime number going up - 3 does not work (if it is a multiple of 3 then the sum of the digits will also be); it's not a multiple of 5 (does not end in a 5 or 0); I don't know a trick for 7 but that does not divide evenly either; some people make it to 11 and then quit. But if I ask what 17 x 31 is, you might even be able to do it in your head: 17 x 31 = 10 x (17 x 3) + (17 x 1) = 510 + 17 = 527. So you see that going one way (finding the prime factorization) takes much more work than going the other way (multiplying two primes). You can use a computer to make it easier, up to a point. If you have a "small" number the computer can factor it quickly, but as the number gets bigger the factorization takes longer and longer, so if you have a big enough number then not even the world's largest supercomputer can crack that prime factorization. (4096 bits should be enough for everybody)
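To make the lopsidedness concrete, here is a small Python sketch that counts how many divisions it takes to find a factor by simply trying every number in turn. This is the naive method described above, not what serious factoring programs do, and the function name is my own.

```python
def smallest_prime_factor(n):
    """Try 2, 3, 4, ... until something divides n; also count the attempts."""
    tries = 0
    candidate = 2
    while candidate * candidate <= n:
        tries += 1
        if n % candidate == 0:
            return candidate, tries
        candidate += 1
    return n, tries  # nothing divided n, so n itself is prime

factor, tries = smallest_prime_factor(527)
print(factor, 527 // factor, tries)  # 17 31 16
print(17 * 31)                       # 527, a single multiplication
```

Sixteen divisions against one multiplication is no big deal, but the gap grows enormously as the numbers get longer.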

People can then use such a one-way function to create what is called asymmetric cryptography. The idea here is that each person creates a pair of keys with a "public key" portion and a "private key" portion. A message is stored as a large number, the one-way function is applied to it with the private key, and then anybody can use the public key to check that the message really was made by that person. (Alternatively, a message scrambled using the public key can only be read by the person holding the private key, so this is also useful for secure communication.)

As an example of a digital signature, the RSA system relies on the difficulty of prime factorization, as mentioned above, to keep the private key secure. In RSA, a key pair is made by taking two large primes p and q (2048 bits long) and publishing their product (N) as part of the public key, along with a second number (e) that is chosen separately and shares no factors with (p - 1)(q - 1). Using the two primes, the key generator also calculates e's modular inverse (d) modulo (p - 1)(q - 1), which is a unique number, and stores that as the private key. Since you need the two primes to calculate d, and the number N is so large that it is infeasible to factor, you can give other people the public key (e, N) and still the private key (d, N) will stay a secret. A message m (remember, the message is converted from letters to a really big number) is then signed by taking the modular exponentiation c = m^d mod N, and anybody can check that you signed it because they can easily calculate m = c^e mod N (this works because e and d are modular inverses).
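Here is a toy version of that signing and checking in Python, reusing the tiny primes 17 and 31 from the factoring example. The value e = 7 and the message 123 are arbitrary choices of mine, and the three-argument pow with -1 needs Python 3.8 or newer. Real systems use enormous primes and sign a padded digest of the message rather than the raw number, so treat this purely as a sketch of the arithmetic.

```python
# Toy RSA signature with tiny primes (real keys use primes hundreds of digits long).
p, q = 17, 31            # the two secret primes
N = p * q                # 527, published as part of the public key
phi = (p - 1) * (q - 1)  # 480, easy to compute only if you know p and q
e = 7                    # public number; must share no factors with phi
d = pow(e, -1, phi)      # private number: e * d leaves remainder 1 when divided by phi

def sign(m):
    """Sign with the private key (d, N): c = m^d mod N."""
    return pow(m, d, N)

def verify(m, signature):
    """Check with the public key (e, N): does signature^e mod N give back m?"""
    return pow(signature, e, N) == m

message = 123  # the message as a number; it must be smaller than N
signature = sign(message)
print(verify(message, signature))      # True
print(verify(message + 1, signature))  # False, the signature does not fit a changed message
```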

Once you have an asymmetric cryptographic system like RSA, or elliptic curve cryptography (ECC) which is more complicated but the basic idea is the same, then you can create a cryptocurrency. This is as simple as each person having a key-pair, and people can sign messages, or transactions, like "move $1 from {Peter's key} to {John's key}" - signed by {Peter's key}. Then everybody can check to see that was, in fact, signed by Peter. And if Peter had $1, then it is subtracted from his account and added to John's.

In a centralized system, with one company keeping a ledger with all the accounts, that will be sufficient. But if you are running a world-wide, peer-to-peer system and you receive such a transaction, how do you know Peter did not just sign a transaction giving all his money to Rachel instead and give that transaction to everybody else? You could say whichever message is received first is valid, but it is hard to get people spread around the world to agree on things like the order of messages because somebody else could have seen the messages in a different order.

The innovation of Bitcoin was to introduce the idea of a "blockchain" to serve as a secure, trustable ledger for transactions of digital money. Anybody can create transactions to move their own money within the system, called bitcoins, and these are shared with all users. A block is created by collecting valid transactions together, and it also lists the previous block. Thus a chain of these blocks is created, and balances are updated based on the transactions that are included in the blocks. So if Peter, who has 1 bitcoin in his account, creates one transaction that says "move 1 to John", and another that says "move 1 to Rachel", the person who creates the block will only include the one they heard first, and everybody will update the accounts based on the transaction that ends up in the blockchain; the other transaction will then be rejected by everybody.
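The Peter, John and Rachel example can be sketched in a few lines of Python. This is a big simplification that I made up for illustration: transactions are plain (sender, receiver, amount) lists with no signatures, the block layout and function names are my own, and "lists the previous block" is done by storing the previous block's SHA-256 hash.

```python
import hashlib
import json

def block_hash(block):
    """Fingerprint of a block: any change to its contents changes this value."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def build_block(previous_hash, pending, balances):
    """Collect pending transactions in the order heard, keeping only affordable ones."""
    working = dict(balances)  # copy, so the caller's balances are left untouched
    included = []
    for sender, receiver, amount in pending:
        if working.get(sender, 0) >= amount:
            working[sender] -= amount
            working[receiver] = working.get(receiver, 0) + amount
            included.append([sender, receiver, amount])
    block = {"previous": previous_hash, "transactions": included}
    return block, working

genesis = {"previous": None, "transactions": []}
balances = {"peter": 1}
pending = [["peter", "john", 1], ["peter", "rachel", 1]]  # heard in this order

block, balances = build_block(block_hash(genesis), pending, balances)
print(block["transactions"])  # [['peter', 'john', 1]]
print(balances)               # {'peter': 0, 'john': 1}
```

The second transaction is left out because Peter's balance is already zero by the time it is considered, which is the "only include the one they heard first" rule.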

In systems like Bitcoin, the people who publish these blocks to the blockchain are sometimes called "miners" because of the particular way in which Bitcoin introduces new money into the system. Each block is created with a certain amount of new bitcoin (it started at 50 per block, is cut in half every 4 years, and is now at 12.5), and people making transactions include a "fee" to get their transaction included. These all go to the one person who makes the block, so people doing the work to check that transactions are valid and making the blocks are rewarded with a supply of new money, just as people who work in mines are rewarded with a supply of new gold.
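The halving schedule is simple enough to write down as a formula. The sketch below follows the "every 4 years" description and counts calendar years from 2009, which is only an approximation: the real rule counts blocks (a halving every 210,000 blocks), so the actual dates do not land exactly on year boundaries.

```python
def block_reward(year, start_year=2009):
    """Approximate new bitcoin per block, halving once per full 4 years elapsed."""
    halvings = (year - start_year) // 4
    return 50 / 2 ** halvings

for year in (2009, 2013, 2019):
    print(year, block_reward(year))
# 2009 50.0
# 2013 25.0
# 2019 12.5
```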

Naturally this incentivizes each person to have their own block included in the blockchain so they get the "miner reward". So if two different blocks are created at the same time, which gets included? This is solved by the idea of "difficulty": each block is identified by its "hash", a number computed from the block's contents by a "hash function", which is another one-way function. The function is chosen to give an essentially random distribution. The difficulty is then calculated as a function of the number of leading zeros in the number. So 1234 would have a difficulty of 0, 0234 has a difficulty of 1 (probability of 1 in 10), and 0056 has a difficulty of 10 (1 in 100, ten times as hard as the previous one). Anyway, the next block has to meet a minimum difficulty score, which is adjusted periodically so that a new block is found roughly every ten minutes. If there are two competing blocks, the one included is always the one that has the greatest difficulty score. So the miners will build slightly different versions of a block and calculate the hash function until they find one with the right score.
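Here is a rough Python sketch of that "keep trying slightly different versions" loop. It is my own simplification: the block is just a piece of text, the thing that changes between versions is a counter I call nonce, and the score is the number of leading zeros in the hexadecimal hash, so each extra zero is 16 times harder rather than 10. Bitcoin itself hashes twice with SHA-256 and compares the result against a numeric target.

```python
import hashlib

def leading_zeros(hex_digest):
    """Count the zeros at the front of a hash written in hexadecimal."""
    return len(hex_digest) - len(hex_digest.lstrip("0"))

def mine(block_contents, required_zeros):
    """Try nonce = 0, 1, 2, ... until the hash has enough leading zeros."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_contents}|{nonce}".encode()).hexdigest()
        if leading_zeros(digest) >= required_zeros:
            return nonce, digest
        nonce += 1

nonce, digest = mine("peter pays john 1", 3)
print(nonce, digest)  # the digest starts with at least three zeros
```

Asking for three zeros takes a few thousand attempts on average; every extra zero multiplies the expected work by 16, which is how the minimum score can be tuned to keep blocks about ten minutes apart.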

Finding a hash with the right score is designed to be computationally difficult for computers. But a stronger computer will calculate hashes faster, and so in the beginning of bitcoin anybody could have their computer working on hashing blocks and expect to find a valid one every once in a while; a computer that was twice as fast would just get twice as many hits over a long period of time. Within a couple of years of bitcoin starting, though, people had discovered that graphics cards could be programmed to do the hash calculation much faster (by orders of magnitude) than a normal computer CPU. So for a while people would buy high-end graphics cards and stack them together. Within a few more years, though, specialty circuits were made which could do this calculation faster by a couple more orders of magnitude. Because of the way that the difficulty requirement is periodically redefined, these application-specific circuits still generate about one block every ten minutes, while the chance that a normal computer will find a valid block is essentially 0, and most mining is controlled by a few companies, largely in China, that have built their own custom bitcoin-mining supercomputers.

The rule is that the only valid cryptocurrency blockchain is the one with the highest difficulty score, and for that there is nothing close to Bitcoin, which has been running since 2009. However, all those blocks add up, so storing the bitcoin blockchain requires several hundred gigabytes of disk space. There are 1 TB disk drives available (1024 GB), so anybody can build a computer that is capable of holding all this data, and then they can run the check themselves to show whether any Bitcoin transaction is valid. This peer-to-peer structure makes Bitcoin more resilient than other types of digital currency which have a central point of failure. Because of this resiliency, the fact that bitcoins can easily be sent anywhere around the globe instantly, and the fact that there is a defined limit to the total number of bitcoin (unlike US dollars, which can be printed whenever the US needs more money, causing inflation), Bitcoin can be used as a secure store of value or as a way to securely transfer funds globally, which is why the exchange rate has consistently increased over time (current exchange rate is about 3600 US dollars per bitcoin).

-