January 17, 2019

Walden by Henry David Thoreau

I recently finished reading Walden, by Henry David Thoreau.

The book describes Thoreau's time as he lived in a small cabin he built in the woods of Massachusetts, on the shores of Walden Pond. He describes building the cabin, living a simple life mostly separate from others, and the beauty of the environment throughout the seasons.

Thoreau could be described as a minimalist. He lives in a simple cabin, works a small garden in the mornings, and spends the afternoons taking walks through the woods or swimming in the pond. He lives on the bare necessities, and does the minimal amount of work to feed himself through the year, which leaves time for enjoyment of the little things.

The book was written in a different time, the are brief passages of narrative widely spaced between philosophical wanderings and vivid descriptions of the natural world. Some might call the book "boring", but I would rather call it "peaceful". He has a very down-to-earth, methodically rational worldview that is very different from the rushing, conclusion-jumping world of today's popular culture.

This is an example of classic American Literature, and I recommend that everyone should read it once.

-

January 16, 2019

A Simple Introduction to Crypto

Last weekend I was visiting with my grandmother and she said to me and my brothers "Can anybody explain crypto? I keep hearing about crypto on the news and I don't know what that is?"

We tried to briefly explain, but I don't think we did a good job. So I decided to lay out a simple groundwork to understand crypto that could be understood by anybody, even my grandmother.

The first thing to understand is that when the guys on the news talk about "crypto" they are probably talking about "cryptocurrencies", like Bitcoin, which could also be called cryptographic-currencies.

Let's start at the beginning: if you have a message written as letters, you can rewrite that as a big number. Here, let me demonstrate: lets's use a simple system where each letter corresponds to a two digit number, a is 01 on up to z is 26, make 00 a space and 27 a period and we can write a sentence. So to write "abc" we could use the number 010203, and 101112 would be "jkl".  Or the number 160529051800091900071805012027 is the message "peter is great." Actual cryptography will use ASCII or a similar system so that you have the whole alphabet, upper and lower case letters, a wide variety of punctuation, and numerals; but the underlying idea is the same - any message can be written as a really big number.

The next thing to understand is the idea of one-way or "trapdoor" functions. Let's take prime factoring as an example: what are the prime factors of 527 ? You might start by noticing it is odd, so not 2; then you start dividing each prime number going up - 3 does not work (if it is a multiple of 3 then the sum of the digits will also be); it's not a multiple of 5 (does not end in a 5 or 0); I don't know a trick for 7 but that does not divide evenly either; some people make it to 11 and then quit. But if I say what is 17 x 31 you might even be able to do it in your head: 10(17 x 3) + (17 x 1) -->  51_ + 17 -> 527. So you see that going one way (finding the prime factorization) takes much more work than going the other way (multiplying two primes). You can use a computer to make it easier, up to a point. If you have a "small" number the computer can factor it quickly, but as the number gets bigger the factorization takes longer and longer, so if you have a big enough number then not even the world's largest supercomputer can crack that prime factorization. (4096 bits should be enough for everybody)

People can then use such a one-way function to create what is called asymmetric cryptography. The idea here is that each person creates a pair of keys with a "public key" portion and a "private key" portion. A message is stored as a large number, a one-way function is used on it using the private key, and then anybody can check using the public key with the one-way function to prove that the message was made by that person. (Alternately, a message created using the public key can only be read by the person holding the private key, so this is also useful for secure communication).

As an example of a digital signature, the RSA system uses prime factorization, as mentioned above, to keep the private key secure. In RSA, a private key is made by taking two large primes (2048 bits long) and publishing their product (N) as part of the public key, along with an unrelated number (e). Using the two primes, the key generator also calculates e's modular inverse (d), which is a unique number, and stores that as the private key. Since you need the two primes to calculate d, and the number N is so large that it is impossible to factor, you can give other people the public key (e, N) and still the private key (d, N) will stay a secret. A message m (remember, the message is converted from letters to a really big number) is then signed by taking the modular exponentiation c = m^d mod N, and anybody can check that you signed it because they can easily calculate m = c^e mod N (this is true because e and d are modular inverses).

Once you have an asymmetric cryptographic system like RSA, or elliptic curve cryptography (ECC) which is more complicated but the basic idea is the same, then you can create a cryptocurrency. This is as simple as each person having a key-pair, and people can sign messages, or transactions, like "move $1 from {Peter's key} to {John's key}" - signed by {Peter's key}. Then everybody can check to see that was, in fact, signed by Peter. And if Peter had $1, then it is subtracted from his account and added to John's.

In a centralized system, with one company keeping a ledger with all the accounts, that will be sufficient. But if you are running a world-wide, peer-to-peer system and you receive such a transaction, how do you know Peter did not just sign a transaction giving all his money to Rachel instead and give that transaction to everybody else? You could say whichever message is received first is valid, but it is hard to get people spread around the world to agree on things like the order of messages because somebody else could have seen the messages in a different order.

The innovation of Bitcoin was to introduce the idea of a "blockchain" to serve as a secure, trustable ledger for transactions of digital money. Anybody can create transactions to move their own money within the system, called bitcoins, and these are shared with all users. A block is created by collecting valid transactions together and also lists the previous block. Thus a chain of these blocks is created, and balances are updated based on the transactions that are included in the blocks. So if Peter, who has 1 bitcoin in his account, creates one transaction that says "move 1 to John", and another that says "move 1 to Rachel", the person who creates the block will only include the one they heard first, and everybody will update the accounts based on the transaction that ends up in the blockchain; the other transaction will then be rejected by everybody.

In systems like Bitcoin, the people who publish these blocks to the blockchain are sometimes called "miners" because of the particular way in which Bitcoin introduces new money into the system: Each block is created with a certain amount of new bitcoin (started as 50 per block, cuts in half every 4 years, now at 12.5), and people making transactions include a "fee" to get their transaction included, these all go to the one person who makes the block (so people doing the work to check that transactions are valid and making the blocks are rewarded with a supply of new money, like people who work in mines are rewarded with a supply of new gold).

Naturally this incentivizes each person to have their own block included in the blockchain so they get the "miner reward", and if two different blocks are created at the same time which gets included? This is solved by the idea of "difficulty": each block is identified by a "hash function", another one-way function, which converts the contents into a number. The function is chosen to give an essentially random distribution. The difficulty is then calculated as a function of the number of leading zeros in the number. So 1234 would have a difficulty of 0, 0234 has a difficulty of 1 (probability of 1 in 10), and 0056 has a difficulty of 10 (1 in 100, ten times as hard as previous). Anyway, the next block has to meet a minimum difficulty score, which is adjusted periodically so that a new block is found roughly every ten minutes. If there are two competing blocks, the one included is always the one that has the greatest difficulty score. So the miners will build slightly different versions of a block and calculate the hash function until they find one with the right score.

The hash function is designed to be computationally difficult for computers. But a stronger computer will calculate it faster, and so in the beginning of bitcoin anybody could have their computer working on hashing blocks and expect to find a valid one every once in a while, a computer that was twice as fast would just get twice as many hits over a long period of time. Within a couple years of bitcoin starting, though, people had discovered that graphics cards could be programmed to do the hash calculation much faster (by orders of magnitude) than a normal computer CPU. So for a while people would buy high end graphics cards and stack them together. Within a few more years, though, specialty circuits were made which could do this calculation faster by a couple more orders of magnitude. Because of the way that the difficulty requirement is periodically redefined, these application specific circuits still generate about one block every ten minutes, while the chance that a normal computer will find a valid block is essentially 0, and all mining is controlled by a few companies in China that have built their own custom bitcoin-mining supercomputers.

The rule is that the only valid cryptocurrency blockchain is the one with the highest difficulty score, and for that there is nothing close to Bitcoin, which has been running since 2009. However, all those blocks add up, so to store the bitcoin blockchain requires several hundred gigabytes of memory. There are 1 TB disk drives available (1024 GB), so anybody can build a computer that is capable of holding all this data, and then they can run the check themself to show that any Bitcoin transaction is valid or not. This peer-to-peer structure makes Bitcoin more resilient than other types of digital currency which have a central point of failure. Because of this resiliency, the fact that Bitcoins can easily be sent anywhere around the globe instantly, and the fact that there is a defined limit to the total number of bitcoin (unlike US dollars, which can be printed whenever the US needs more money, causing inflation), Bitcoin can be used as a secure store of value or as a way to securely transfer funds globally, which is why the exchange rate has consistently increased over time (current exchange rate is about 3600 US dollars per bitcoin).

-

September 11, 2018

Security Theater

The other day I was at the hardware store with my wife. She has been thinking of installing cameras around her business to keep an eye on things, and so we were looking at the store's section of security cameras. I noticed on the shelf they had a couple boxes that were pretty cheap, only ten bucks, and so I picked one up to look at it. To my dismay, I found out that this is not an actual camera, it is a decoy. It even comes with a little blinking red led so that it looks like it is on. Not only did they have decoy cameras, but these are offered by multiple brands.

This is just silly security theater. These decoy cameras project a sad facade of security with no actual security behind them.

July 30, 2018

Fasting Prayer of a Young Boy

My Sunday school teacher said to me
That fasting is good for the soul.
And though I'm a lad of tender years,
I thought I'd give it a go.

My stomach rumbles, my head is faint;
Oh, this is how I'm brought low.
I perish! Unless I break this fast
I started ten minutes ago.

-

July 11, 2018

vdiff_sha_define_swap.vpatch

Recently I tried building vtools on a windows system running cygwin. I found that there was one function it barfed on, SWAP_BE64. I did some digging, and I found the code was taken from busybox. I dug around a bit there, and I found where the thing is defined, so I copied that definition into my copy of the program, and it worked.

Keep in mind, this function seems to depend on the endianness of the machine, so in the busybox code there are a bunch of #IFDEFs surrounding this definition. If you are using a big-endian machine then you would want to apply a different definition for this function.

This is a patch using sha hashing. There seems to be some movement leaning towards using the Keccak hash for vpatches (see this discussion and that following it), so I might have to get a Keccak hasher and regrind this later.

File vdiff_sha_define_swap.vpatch :

diff -uNr vdiff_a/vtools/manifest vdiff_b/vtools/manifest
--- vdiff_a/vtools/manifest cece6cf281356c63d1f0a935d8f772345b51f63bc77ada2b70646b395036a22a88962e06661dee77977e1d434290adec5d76e540597bbd2213a36cb5890a8cf8
+++ vdiff_b/vtools/manifest 2e651c3ae3f563a3ee7f6a0cfbd3f18e76bf57f81615643e6df3c60f6d6a2d868a5ff9809de76184ee9779091d621ef088d5f695c2c7d9d5b0fd040d7173ab8f
@@ -6,3 +6,5 @@
Fixes C99 compatibity for __attribute__((noreturn)), vdiff support for "No newline at end of file" directive.
2018-04-07 phf
Fixes for xalloc's use of static inline courtesy of spyked.
+2018-06-28 PeterL
+Added definition for SWAP_BE64 in sha.c (valid for littleendian system).
\ No newline at end of file
diff -uNr vdiff_a/vtools/src/sha.c vdiff_b/vtools/src/sha.c
--- vdiff_a/vtools/src/sha.c 8ef7baea0ae21e44b34318a0202e3a7c4ca1fe4df6c1a8a29d8a5eeee930926968ea9bd062c514460208bf89e2fa4ec958950c3776b47513cd5979961e4182c6
+++ vdiff_b/vtools/src/sha.c 3eac373f23f903ef55354750cf8921a705aefa2cb61a1e8acc0fc091fc357b434fd0598822f90beaa1d1c4ecf4ede4d2184eb12764f806d304fc70575967a4a5
@@ -3,6 +3,9 @@
#include
#include

+/* SWAP_BEnn means "convert CPU<->big_endian by swapping bytes" */
+# define SWAP_BE64(x) __bswap64(x)
+
/* \url{https://git.busybox.net/busybox/tree/libbb/hash_md5_sha.c} */

static inline uint64_t rotr64(uint64_t x, unsigned n) {


And here is the signature for the file, vdiff_sha_define_swap.vpatch.PeterL.sig :

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAABAgAGBQJbRsWTAAoJELTi4Nystj0ltSUP/A9V+5pjBVYr6qOZLOYCOF6c
/pU4YvdVW9sMc6EZ/p0Vdq6bWGr/RIhkgMiPsHhnxsdHS6xSpEFGWOMj+mzOBjRd
7sJl3GEZ14kKbweP+Z/Uwky/yYG5MAKORLBGb24HF+aBdShq3GqQFcd2B9Rb/F3g
YgIwLY4gVBsFFgjTBR47/9rXJAgpJKb3S/dPQ3sdWoLXb1ZO4ayZfzcjeCmEpuyq
DbjSbPiiHvfoTSJ4/N8UdNSAccV1UKofnpnnhQ4A43mEQ3T00Q7tnI7NzW3wLD3K
3PPEQOduAffdrPbliigVLQhDOzHiIyuQhENVI7N8IQwUxvjF6RJ3A+YWHeMAvLf9
yTOf7DSIUWMee4YX02ZhPvclomrG9+PRJUkDsk7LqwoNxVEo5ZeXlPFIE4zBYQZg
F+my5bNl4OqJzJT/0voMMza6oDNXIigX/kSqJuCEuQ+dMX0XgbiejX94GmAWO+qB
DuyQuYCdslH8dVmEd1UBKQKL/6or6HFxTOSDDUUdfcRBbrWLFdToyaWMxIz3Ea5z
m8hcDi+olE5yTKz6DAvy0SDrOwlOk0cCXoeo1XuiVVwfdzRiYPFKWkWFqFleZghm
FIZif8TuRJl6LunzV2L9P2RZn2sNuN59cuwC0e7+7XW1XroP1VPmLNP+/yaeK/gR
kdTB76aB/ebzaUXqV+U4
=rxQr
-----END PGP SIGNATURE-----

-

Chicken Potato Soup

This is a yummy soup with lots of flavors. I made this soup in an InstaPot (electronic pressure cooker), but you could also make it more traditionally in a pot on the stove.

Ingredients

  • 1 Tab butter
  • 1 Tab cooking oil
  • 3 cloves garlic, chopped finely
  • About a pound of chicken breast, cubed into about 1 inch pieces
  • About a pound of potatoes, cubed into about 1 inch pieces 
  • 1 yellow squash, cubed into about 1 inch pieces
  • 2 cups chicken broth
  • 1 tsp pepper
  • 1 tsp thyme
  • 1 tsp Garam Masala
  • 1 lemon

Procedure - InstaPot

  1. Turn the InstaPot on with "Saute" mode. Add butter, oil, and garlic, stirring frequently. Once the oil is hot, let the garlic cook a minute until it gets soft.
  2. Add the chicken. Stir frequently, cook until the chicken is slightly browned on the outside.
  3. Add the potatoes, squash, broth, pepper, thyme, and Garam Masala. Stir everything together.
  4. Cut the lemon in half, squeeze the juice out of both sides into the pot.
  5. Close the InstaPot, cook on the poultry setting for 12 minutes.

Procedure - Stovetop

Proceed as above, saute the garlic and chicken on medium-high heat. Add all other ingredients, and add water to cover all ingredients, bring to a boil. Reduce the heat to a simmer and cook until the potatoes are soft (check by poking them with a fork).

-

July 10, 2018

Beaches

I recently returned from a week of camping and swimming with my family in northern Michigan. We had a good time, and part of that was the excellent beaches we found. We have set a family goal to try to set foot in all of the Great Lakes during this summer, and we were able to check off three of the five lakes on just this trip. We camped at Straits State Park, located on the Upper Peninsula side of the Straits of Mackinac.

The first beach we visited was at the campsite; this was Lake Huron, but within the shadow of the Mackinaw Bridge. The water was cold but refreshing compared to the summer heat. This beach had some sand, but was mostly rocks. It was a perfectly clear day, with no wind, and so there were very little waves (perhaps 6 inches). It should be noted that the water here is amazingly clear, such that you can stand in water up to your shoulder and still see every detail on the bottom.

The next day we traveled north into the Upper Peninsula. We visited Tahquemenon Falls. It was a nice ten minute hike in the woods to see the falls, which were fun to see but I feel they are a bit over-hyped. We stopped at the first little beach area we came to on Lake Superior, but the beach was entirely make of very sharp rocks, so we didn't stay long. We then had lunch at a little diner in Paradise, MI, where we were directed by the waitress to a beach a bit further down the coast. This turned out to be on the south shore of Whitefish Bay, and it was a beautiful beach made entirely of amazingly soft sand. We had the beach almost to ourselves and played for several hours. The water was cold but bearable, with about one to two foot waves. Being on the south side of the lake, the trees here give shade onto the beach, which is nice for those of us like my family who are pasty white and sunburn easily.

The next beach we visited was on Lake Michigan. We drove about ten miles west of St. Ignace on US 2, at which point the highway runs right along the lake shore. We parked on the side of the highway and jumped in the water. This beach is entirely made of very fine sand, which is fun to play in but gets everywhere: into your clothes, into your towel, into the baby diaper, into the bag of chips, etc. There were also no trees here, the closet trees were about 100 meters the other way from the highway (the water is about 20 meters from the road), and so sunscreen is a necessity here. The sky was cloudless, but the wind had picked up some and so we spent all afternoon playing in the two to three foot waves.

-

June 4, 2018

V-tronics

I have been thinking about v-tronics lately, that is the mechanisms of the versioning system used with The Most Serene Republic, and variously expounded upon here and there and most recently here.

One problem that has been seen is bifurcation of the patch tree, when two patches are created for a project which touch non-overlapping sets of files.  This can be seen most easily if you imagine a genesis with two files, A and B. If patch P1 makes a change to file A, and patch P2 makes a change to file B, then (the way v currently works) the user has to manually apply one of the patches if they want both to apply.

A proposed solution is to add a "manifest" file, which is touched by every patch applied to the project. This adds a nifty way to easily graft two projects together: you just need a patch which touches the manifest file for both projects (in project A manifest: "+added project B", and in project B manifest "+added this to project A") and keeps the v-patches small and readable. That way, even if no changes are made to the substance of the project that is pulled in, the addition will be applied to all later presses.

While I was composing this post, Trinque added a specification for the V manifest.

-

May 19, 2018

lamport.py

This is a script I wrote in Python to use the ideas expressed by Lamport and Stanislav.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

#!/usr/bin/python

#
# (C) 2016 Peter Lambert. 
# You do not have, nor can you ever acquire the right to use, copy or distribute this software; 
# should you use this software for any purpose, or copy and distribute it to anyone or in any manner,
# you are breaking the laws of whatever soi-disant jurisdiction, and you promise to continue doing
# so for the indefinite future.
#

import hashlib
import os
import sys

#
# Comment out the line below with the hash you do not want to use

# sha = hashlib.sha256
sha = hashlib.sha512


def binmessagehash(filename):
    m = open(filename, 'r')
    messagehash = sha(m.read()).hexdigest()
    m.close()    
    return bin(int(messagehash, 16))[2:].zfill(len(messagehash) * 4)
    
def decode(pubkeyfile, sigfile):
    kf = open(pubkeyfile, 'r')
    pubkey = [line.strip().split() for line in kf.readlines()]
    kf.close()
    
    sf = open(sigfile, 'r')
    sig_keys = [line.strip() for line in sf.readlines()]
    sf.close()
    
    binmessage = ''
    
    for k in range(len(sig_keys)):
        key_hash = sha(sig_keys[k].decode('hex')).hexdigest()
        if key_hash in pubkey[k]:
            binmessage += str(pubkey[k].index(key_hash))
        else:
            return 'Bad signature, hash not found for %s' % sig_keys[k]
    
    hexmessage = '%x' % (int(binmessage, 2))
    return hexmessage
    
def encode(privkeyfile, messagefile):
    pkf = open(privkeyfile, 'r')
    key = [line.strip().split() for line in pkf.readlines()]
    pkf.close()
    
    bmess = binmessagehash(messagefile)
    for k in range(len(bmess)):
        print key[k][int(bmess[k])]

def generate_key(payloadbits, strengthbits):
    if '-s' in sys.argv:
        rs = randstr
    else:
        rs = os.urandom
    key = [[rs(int(strengthbits) / 8) for n in [0, 1]] for m in range(int(payloadbits))]
    for k in key:
        print ' '.join(kp.encode('hex') for kp in k)
            
def output_help():
    print '''Usage: 
        ./lamport.py -g PAYLOADBITS STRENGTHBITS > PRIVKEYFILE  
            generates a key
            
        ./lamport.py -g -s PAYLOADBITS STRENGTHBITS > PRIVKEYFILE
            generates a key with the slower, more secure /dev/random
            
        ./lamport.py -p PRIVKEYFILE > PUBKEYFILE
            creates a pubkey from a privkey file
            
        ./lamport.py -e PRIVKEYFILE MESSAGE > ENCODED.TXT
            creates a digital signature
            
        ./lamport.py -d PUBKEYFILE ENCODED.TXT
            decode the digital signature
            
        ./lamport.py -v PUBKEYFILE ENCODED.TXT MESSAGE
            check that encoded is the correct signature for message
    ''' 

def priv_to_pubkey(privkeyfile):
    pkf = open(privkeyfile, 'r')
    for line in pkf.readlines():
        print ' '.join(sha(k.decode('hex')).hexdigest() for k in line.strip().split())
    pkf.close()
    
def randstr(num_bytes):
    stream = open('/dev/random', 'rb')
    res = stream.read(num_bytes)
    stream.close()
    return res

def verify(pubkeyfile, sigfile, messagefile):
    decoded = decode(pubkeyfile, sigfile)
    print decoded
    
    m = open(messagefile, 'r')
    messagehash = sha(m.read()).hexdigest()
    m.close()
    
    if decoded == messagehash:
        print 'Good signature'
    else:
        print 'Bad signature'

if __name__ == "__main__":
    if '-g' in sys.argv and sys.argv[-2].isdigit() and sys.argv[-1].isdigit():
        generate_key(sys.argv[-2], sys.argv[-1])
    elif '-p' in sys.argv:
        priv_to_pubkey(sys.argv[-1])
    elif '-e' in sys.argv:
        encode(sys.argv[-2], sys.argv[-1])
    elif '-d' in sys.argv:
        print decode(sys.argv[-2], sys.argv[-1])
    elif '-v' in sys.argv:
        verify(sys.argv[-3], sys.argv[-2], sys.argv[-1])
    else:
        output_help()
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAEBAgAGBQJX89k8AAoJELTi4Nystj0llqwQAJpubTEEl4VysaO/IlPExNFl
mQ+md7cBEAmK3ithZcJSS5XsKXhSWs+w6b0IJRQB0r2XMHu4hZUiRicVubkiKtwc
jqZ8ots+Pr9mzcuw4dlVNH6Avr0WPpcObwCEWXzlLVP27kEeXCPSwPTp0Eh3YAgv
YGOkMvu4Yw2HmzwfNlHAdhdAPuNpW7dOHGwZAzBkeWegZDK/urDp7oEIw8D/DBBz
FbH/eXsRcHzmjk1cOi9lgQyaTZ3qSA+ULriSDM/LCjMNso+fKcxO0Pi7MpdNnxTj
l0KMHndWWaJkiROZ6M9zMyUt68/y4F1sJ0T/qy07msVYC8D7p+hEtOOJfAJuWiZL
VgLZzi5WiV93QwnKrJCrRR72SbPyAL9YGFbS5vBbilxv7dktoXUXnTeUP3Ddvn4B
4rW19wha3AZ5AT+ssuC8TggLzlDKlPLsGQmXEtyk7HAN+ynXnQ3ioHwwPOePzGHG
UfNIoXQnXDTLS4jLD09VXf3gYP27gb3ouT7dhOYQ0yignuE/5vvJ2UuneI52VV0i
pupS/vJ4h7qaKFJfhQmsczGkN0SnhyKjjE5Xi4A0NWHLKtaKU9Wxs0f/xA9b8sJg
hLjf2x1+3bHZeMsIPoxjd3z3xeHOgLj+BRSZTUxTmEZnlz6IaTzjNe5iSHL8ucef
tilzFtS1wA4DSSWPDbOg
=WGyj
-----END PGP SIGNATURE-----

May 18, 2018

Constructing Digital Signatures from One Way Functions

This is something I pulled from a pdf a while ago. It was referenced on Loper-OS, and so I thought it would be worthwhile to de-pdf-enate it.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Op. 52

Constructing Digital Signatures from One Way Function

Leslie Lamport
Computer Science Laboratory
SRI International

18 October 1979
CSL 98

333 Ravenswood Ave. Menlo Park, California 94025
(415) 326-6200 Cable: SRI INTL MPK TWX: 910-373-1246

1. Introduction

A digital signature created by sender P for a document m is data
item o_p(m) having the property that upon receiving m and o_p(m) one can
determine (and if necessary prove in court of law) that P generated the
document m .

A one way function is function that is easy to compute, but whose
inverse is difficult to compute [1]. More precisely one way function T is
a function from set of data objects to a set of values having the following
two properties:

        1. Given any value v , it is computationally infeasible to find a
        data object d such that T(d) = v .
        2. Given any data object d , it is computationally infeasible to find
        a different data object d' such that T(d') = T(d) .

If the set of data objects is larger than the set of values, then such a
function is sometimes called one way hashing function.

We will describe a method for constructing digital signatures from such a
one way function T . Our method is an improvement of a method devised by
Rabin [2]. Like Rabin's, it requires the sender P to deposit a piece of
data o in some trusted public repository for each document he wishes to
sign. This repository must have the following properties:

        - o can be read by anyone who wants to verify P's signature.
        - It can be proven in court of law that P was the creater of o .

Once o has been placed in the repository, P can use it to generate a
signature for any single document he wishes to send.

Rabin's method has the following drawbacks not present in ours.

        1. The document must be sent to single recipient Q , who then
        requests additional information from P to validate the signature.
        P cannot divulge any additional validating information without
        compromising information that must remain private to prevent
        someone else from generating new document m' with valid
        signature o_p(m') .

        2. For a court of law to determine if the signature is valid, it is
        necessary for P to give the court additional private information.

This has the following implications.
        - P -- or a trusted representative of P must be available
        to the court.
        - P must maintain private information whose accidental
        disclosure would enable someone else to forge his signature on
        a document.

With our method, P generates a signature that is verifiable by anyone,
with no further action on P's part. After generating the signature, P can
destroy the private information that would enable someone else to forge his
signature. The advantages of our method over Rabin's are illustrated by the
following considerations when the signed document m is a check from P
payable to Q .

        1. It is easy for Q to endorse the check payable to third party
        R by sending him the signed message "make m payable to R ".
        However, with Rabin's scheme, R cannot determine if the check m
        was really signed by P , so he must worry about forgery by Q as
        well as whether or not P can cover the check. With our method,
        there is no way for Q to forge the check, so the endorsed check
        is as good as check payable directly to R signed by P .
        (However, some additional mechanism must be introduced to prevent
        Q from cashing the original check after he has signed it over to
        R .)

        2. If P dies without leaving the executors of his estate the
        information he used to generate his signatures, then Rabin's method
        cannot prevent Q from undetectably altering the check m -- for
        example, by changing the amount of money payable. Such posthumous
        forgery is impossible with our method.

        3. With Rabin's method, to be able to successfully challenge any
        attempt by Q to modify the check before cashing it, P must
        maintain the private information he used to generate his signature.
        If anyone (not just Q ) stole that information, that person could
        forge a check from P payable to him. Our method allows P to
        destroy this private information after signing the check.

2. The Algorithm

We assume a set M of possible documents, set K of possible keys [footnote 1],
and set V of possible values. Let S denote the set of all subsets of
{1, ... , 40} containing exactly 20 elements. (The numbers 40 and 20 are
arbitrary, and could be replaced by 2n and n. We are using these numbers
because they were used by Rabin, and we wish to make it easy for the reader to
compare our method with his.)

We assume the following two functions.
        1. function F : K -> V with the following two properties:
                a. Given any value v in V , it is computationally infeasible
                to find a key k in K such that F(k) = v .
                b. For any small set of values v_1, ... , v_m , it is easy to
                find a key such that F(k) is not equal to any of the
                v_i .
        2. A function G : M -> S with the property that given any document
        m in M , it is computationally infeasible to find document
        m' /= m  such that G(m') = G(m)
        
For the function F , we can use any one way function T whose domain is
the set of keys. The second property of F follows easily from the second
property of the one way function T . We will discuss later how the function
G can be constructed from an ordinary one way function.

For convenience, we assume that P wants to generate only a single
signed document. Later, we indicate how he can sign many different documents.
The sender P first chooses 40 keys k_i such that all the values F(k_i) are
distinct. (Our second assumption about F makes this easy to do.) He puts
in public repository the data item o = (F(k_1), ... , F(k_40)) . Note that
P does not divulge the keys k_i , which by our first assumption about F
cannot be computed from o .

To generate a signature for a document m , P first computes G(m) to
obtain a set [i_1, ... , i_20] of integers. The signature consists of the 20
keys k_i_1, ... , k_i_20 . More precisely, we have o_p(m) =  (k_i_1, ... , k_i_20) ,
where the i_j are defined by the following two requirements:
        (i) G(m) =  {i_1, ... , i_20}
        (ii) i_1 < ... < i_20
After generating the signature, P can destroy all record of the 20 keys k_s
with s not in G(m) .

To verify that 20-tuple (h_1, ... , h_20) is valid signature o_p(m)
for the document m, one first computes G(m) to find the i_j and then uses
o to check that for all j , h_j is the i_j^th key. More precisely, the
signature is valid if and only if for each j with 1 _< j _< 20 :
F(h_j) o_i_j , where o_i denotes the i^th component of o , and the i_j are 
defined by the above two requirements.

To demonstrate that this method correctly implements digital signatures,
we prove that it has the following properties.
        1. If P does not reveal any of the keys k_i , then no-one else can
        generate valid signature o_p(m) for any document m .
        2. If P does not reveal any of the keys k_j except the ones that
        are contained in the signature o_p(m) , tnen no-one else can
        generate valid signature o_p(m') for any document m' /= m .

The first property is obvious, since the signature o_p(m) must contain
20 keys k_i such that F(k_i) = o_i , and our first assumption about F states
that it is computationally infeasible to find the keys k_i just knowing the
values F(k_i) .

To prove the second property, note that since no-one else can obtain any
of the keys k_i , we must have o_p(m') = o_p(m) . Moreover, since the o_i are
all distinct, for the validation test to be passed by o_p(m') we must also
have G(m') = G(m) . However, our assumption about G states that it is
computationally infeasible to find such document m' . This proves the
correctness of our method.

For P to send many different documents, he must use a different o for
each one. This means that there must be sequence of 40-tuples o_1, o_2, ...
and the document must indicate which o_i is used to generate that document's
signature. The details are simple, and will be omitted.

3. Constructing the Function G

One way functions have been proposed whose domain is the set of documents
and whose range is a set of integers of the form {0, ... , 2^n - 1} for any
reasonably large value of n . (It is necessary for n to be large enough to
make exhaustive searching over the range of T computationally infeasible.)
Such functions are described in [1] and [2]. The obvious way to construct the
required function G is to let T be such a one way function, and define
G(m) to equal R(T(m)) , where R : {0, ... , 2^n - 1} -> S .

It is easy to construct function R having the required range and
domain. For example, one can compute R(s) inductively as follows:
        1. Divide s by 40 to obtain quotient q and a remainder r
        2. Use r to choose an element x from {1, ... , 40} (This is
        easy to do, since 0 _< r _< 40 .)
        3. Use q to choose 19 elements from the set {1, ... , 40} - {x} as
        follows:
                a. Divide q by 39 to obtain quotient ...
It requires careful analysis of the properties of the one way function T
to be sure that the resulting function G has the required property. We
suspect that for most one way functions T , this method would work. However,
we cannot prove this.

The reason constructing G in this manner might not work is that the
function R from {0, ... , 2^n} into S is a many to one mapping, and the
resulting "collapsing" of the domain might defeat the one way nature of T .
However, it is easy to show that if the function R is one to one, then
property (ii) of T implies that G has the required property. To construct
G we need only find an easily computable one to one function R from
{0, ... , 2^n - 1} into S , for a reasonably large value of n .

We can simplify our task by observing that the function G need not be
defined on the entire set of documents. It suffices that for any document
m , it is easy to modify m in a harmless way to get new document that is
in the domain of G. For example, one might include meaningless number as
part of the document, and choose different values of that number until he
obtains a document that is in the domain of G . This is an acceptable
procedure if (i) it is easy to determine whether a document is in the domain,
and (ii) the expected number of choices one must make before finding a
document in the domain is small.

With this in mind, we let n = 40 and define R(s) as follows: if the
binary representation of s contains exactly 20 ones, then R(s) = {i : the
i^th bit of s equals one} , otherwise R(s) is undefined. Approximately
13% of all 40 bit numbers contain exactly 20 ones. Hence, if the one way
function T is sufficiently randomizing, there is a 0.13 probability that any
given document will be in the domain of G . This means that randomly
choosing documents (or modifications to a document), the expected number of
choices before finding one in the domain of G is approximately 8. Moreover,
after 17p choices, the probability of not having found document in the
domain of G is about 1/10^p. (If we use 60 keys instead of 40, the expected
number of choices to find document in the domain becomes about 10, and 22p
choices are needed to reduce the probability of not finding one to 1/10^p.)

If the one way function T is easy to compute, then these numbers
indicate that the expected amount of effort to compute G is reasonable.
However, it does seem undesirable to have to try so many documents before
finding one in the domain of G . We hope that someone can find more
elegant method for constructing the function G , perhaps by finding a one to
one function R which is defined on a larger subset of {0, ... , 2^n} .

Note; We have thus far insisted that G(m) be a subset of
{1, ... , 40} consisting of exactly 20 elements. It is clear that the
generation and verification procedure can be applied if G(m) is any proper
subset. An examination of our correctness proof shows that if we allow G(m)
to have any number of elements less than 40, then our method would still have
the same correctness properties if G satisfies the following property:
        - For any document m , it is computationally infeasible to find a
        different document m' such that G(m') is a subset of G(m) .
By taking the range of G to be the collection of 20 element subsets, we
insure that G(m') cannot be proper subset of G(m) . However, it may be
possible to construct a function G satisfying this requirement without
constraining the range of G in this way.

REFERENCES

[1] Diffie, W. and Hellman, M. "New Directions in Cryptography".
_IEEE Trans. on Information Theory IT-22_ (November 1976),
644-654.

[2] Rabin, M. "Digitalized Signatures", in _Foundations of
Secure Computing_, Academic Press (1978), 155-168.

FOOTNOTES

[1] The elements of K are not keys in the usual cryptographic sense, but are
arbitrary data items. We call them keys because they play the same role as
the keys in Rabin's algorithm.

- ---

Editor's note: 

Some characters in the original paper were changed to members of the 
Roman alphabet. 

/= takes the place of the "not equal" symbol.
_< takes the place of the "less than or equal" symbol.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAEBAgAGBQJX8l0SAAoJELTi4Nystj0lHlcP/jfrI7hl6urPXXh3dShYxN62
vY/NfuL2WDK0bjHA65EoDN0u8MEZ7CuVtm+Rzh6GRYyUTLyZtowWFNy3mDD5aMl0
zePELZ4w6aOr7zRcwnBF5v+tHV7y3YL8yyforLYU6+1CYRwqu/1UZRyeaV9QfBA+
jYDEyxuR8zVTJ+JKlM8PtAGEi96KsuZiajP21jlvm0lN4/5bJ7ASBFYTqYiI6pwH
QOE1QaBKFHmTKZTcF/BkrE1J13ewgpqNu4as2aepexsSW7hadbQRVNl5r3lABohd
3bDyPYlUcZHf/XZSQX+u0oxLUkdjKhDGyNDQXIdBisNU04jqxwN1wZ+IvZYLRH25
u7IkSq+cCwaA8vpDHzviRV+Hp+9u4GAqsJo5TYuavQ8mMQDjN93ji1M0CHM54E96
nFUIvAHNbdsyaQlNuFek+/xFinhFiAERbO1c9SggyxAB1Ct6uyOycgTW5WYfFZrR
wiQ9Q9YYVX2CqciERj/AUiliSF/tM50+bEJdrHdA2CASlUKDrTmPtiidV9HNvNjE
efakV7MsKytCjNJM3Ay6Wtm5nd36YmBOKu4g3PtG8huUr/EXYEeIPGaVWUn3tKQF
X2SvUjnHBj6XU8oe7X3ZKGO/iNYYzCyE9FfrE/Dqu4FZbg9YdJL0fPMX9Ad3GM5A
7f97Wy9HTnjzyuL1IVwF
=VRzi
-----END PGP SIGNATURE-----