an afternoon with Don Knuth

It’s not often you get the chance to spend an afternoon with probably the most famous hacker/developer/computer scientist the world has ever seen. However, last week I got the chance to do just that when myself and three colleagues had the great fortune to spend some time with Don Knuth.

For the uninformed IT “professionals” that have never heard of Don Knuth this is the guy that brought to us the idea of analysis of algorithms and asymptotic notation (big-O notation), the Knuth-Morris-Pratt string searching algorithm, the Art of Computer Programming book series, the list goes on. However, he is not only an “algorithm’s guy”, he also developed the Tex typesetting system and the METAFONT language used to define vector fonts. So basically he is the most famous computing guy out there.

Knuth is a surprisingly easy guy to talk to. Sure, he can really lose you pretty quickly in a conversation, but he also has some great insights.

Our conversations tended to centre around stuff to do with algorithms. His next volume of the Art of Computer Programming will likely focus on constraint satisfaction problems and satisfiability problems – the former being something I worked on myself in the not to distant past. I asked what he thought was a good algorithm to teach people and he said he thought the biparite graph matching algorithm was a nice one in terms of beauty (he did mention another which escapes me now). Not everyone will find the algorithms stuff that interesting (you should!) but his view of beauty is maybe something more universal.

He also expressed a love for writing code, he said that when he gets up in the morning he thinks about writing code and misses it on days when he doesn’t get the chance. That is pretty cool by me and sits in stark contrast to many academics. I got the feeling that he wasn’t too keen on the “apps” developers as he called them. My guess is that his thoughts lie with more meaningful problems than fart apps – however people download them so who are we to say. Still, there was definitely some lamenting going on about the fact that people use software without ever trying to understand what the software is actually doing. That is, have at least a high-level view of the data structures and algorithms used that make the said piece of software useful. Having this kind of understanding allows you to select the right tools for the job. In my experience people that tend to have this knowledge and understanding are far better developers and is likely why Google, Microsoft, Facebook et al. try to attract developers with this kind of knowledge.

He was telling us that he watched The Social Network on the way over on the plane. He said he thought it was great how Mark Zuckerberg was also someone who just liked building stuff like him – this was something Zuckerberg said himself at Startup School 2010. What is even cooler is Mark Zuckerberg actually sent him a copy of the latest Art of Computer Programming book and asked him if he would sign it for him.

So Don Knuth himself will have long forgotten who I am but at least I will be able to recollect years down the line this encounter with a computing genius.

generating a unique range of numbers

Often at work I find the need to generate a set of n unique integers in a specified range. In order to do this as efficiently and as easily as possible, I created a small C# class, which a colleague thought may be of general interest. Hence, I’m posting it here. I’m sure someone else has come up with a similar (or better) way to do this in the past, but I’m sharing my way regardless 😀 .

The code is shown below (and you download the C# class here):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
 class UniqueSetGenerator
 {
      private int[] store_;
      private int size_;
      private Random random_;
 
      public UniqueSetGenerator(int size, int start)
      {
          size_ = size;
          store_ = new int[size];
          random_ = new Random();
          PopulateArray(start);
      }
 
      private void PopulateArray(int start)
      {
          for (int i = 0; i < size_; i++)
              store_[i] = start++;
      }
 
      private int Delete(int pos)
      {
          int val = store_[pos];
          store_[pos] = store_[--size_];
          return val;
      }
 
      public int GetRandomNumber()
      {
          if (size_ <= 0)
             return -1;
 
          return Delete(random_.Next(size_));
      }
 }

I think it’s pretty easy to see how this works, so I will not go into it in much detail discussing it, but the sequence of figures below shows the basic operations.

First populate the array with the values.

First populate the array with the values.


Use the Random class to obtain our first random number - in this case 6. Now copy the value at position size_ in the array to position 6, and decrement size_. Note not we don't actually delete anything from the array.

Use the Random class to obtain our first random number - in this case 6. Now copy the value at position size_ in the array to position 6, and decrement size_. Note: not we don't actually delete anything from the array.


We then use the Random class to generate another random number in our reduced range - in this case 3. We then copy the value at position size_ (i.e. 9) to position 3, and then decrement the size_ count as before. This process continues until size_ is reduced to 0.

We then use the Random class to generate another random number in our reduced range - in this case 3. We then copy the value at position size_ (i.e. 9) to position 3, and then decrement the size_ count as before. This process continues until size_ is reduced to 0.

So when would you require something like this? Well, say you need to generate unique random IDs with values 1 to 10, then you can use the class as follows:

1
2
3
4
5
6
7
8
    UniqueSetGenerator uniqueSet = new UniqueSetGenerator(10, 1);
 
    for(int i = 0; i < 9; i++)
    {
        int id = uniqueSet.GetRandomNumber();
 
        // Now do something with the id....
    }

For the moment I have only included a C# version but I will update this post with a Java version soon. Hope some of you find this useful.

does not being aware of np-completeness make you dumb?

I have recently been reading a series of posts ([1][2][3]) from authors about the pros and cons of having or not having a degree in CS.  I’m sure the argument goes both ways, but my thoughts are focused on how does someone who has not spent 4 years obtaining a degree, accrue the knowledge they may need.

For the purpose of this post I will use the concept of NP-completeness (or in fact algorithmics in general) as an example.  I think from experience I have learned that NP-completeness is a topic that quite a few developers are not familiar with despite its importance in the field.  Many may be aware of it through Jeff Atwood‘s post over at codinghorror, where he suffered at the hands of the topic’s rigour.

So how does someone who has never obtained a CS degree find out about such a topic?  My feeling is that they don’t.  OK some people will go out their way to find such topics but I haven’t met too many that do.  Some may even point out that NP-completeness did (does) not appear on their course curriculum, in which case it probably boils down to Dan Dyer‘s point about the virtues of choosing a good university course.  What I’m trying to say is that someone who goes to a decent university and obtains a good degree has spent (possibly) 4 years learning about the foundations of computing, giving them a good base to make important technical decisions.

For example.  You are hit with a problem at work, let’s say some sort of job scheduling problem or even allocation of people to resources.  The problem seems easy when described and the agile developer takes over you and you start churning out code.  However, if this person had only taken a simple algorithms course at university, he might have thought twice about jumping into such a problem – as many variants of such problems are known to be NP-hard.

This brings me back round to my question, where do such developers learn about these things?  They may read articles on sites like dZone or reddit, but c’mon, let’s face it, these sites are pretty good at giving you the latest articles about Java, .NET, jQuery, but it’s rare that you see an article about algorithmic issues rise to the top of the popularity list.  I mean who wants to read the details of what a B+ Tree, Trie, Boyer-Moore string matching algorithm, etc are, because it’s hard going, instead most will take the easy option of reading guff like 7 ways to write beautiful code.  However, if you attend university you are forced to know many of the finer details of CS, as you are examined on them. OK, you may not need them for a simple web app, but an employer is certainly going to be happier that you have the details in your head should you need them.  The point is that you never know when you are going to need such things, and knowing when you do is maybe what you spend 4 years gaining.