Over the last few years I have been party to discussions about software with people who know little to nothing about the art of computer science. Many of them were in charge of very large governmental budgets for some fairly high-profile, public-facing software systems. These systems typically involved a large amount of deep algorithmic work which, on the face of it, might seem simple but was actually very complicated. Given this, you would have thought people would listen when advice was being given out. Oh how wrong you would be!
For some context, I will introduce a problem that is not strictly related to the situation described above (to protect the guilty), but has a similar flavour.
The problem I’m going to describe is that of allocating a set of school children to their (or, I suppose, their parents’) most desired school. Each parent is asked to rank five schools in strict order of preference. A centralised system then assigns each child to a school.
The naïve (serial-dictatorship) solution to this problem would be to simply assign each child, in turn, to the first school on their list that is not already full. So what is the problem with such a solution? Well, firstly, we can end up with an assignment of children to schools that does not have maximum cardinality, i.e. there exists another allocation in which a larger number of children are matched to schools. Also, we may end up with an allocation in which a child c1 ends up at his fifth-choice school while another child c2 obtains his first choice, but if c2 moved to his second choice then c1 would be able to get his third choice. If we allowed this switch to take place we would clearly end up with a “better” allocation.
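To make the failure concrete, here is a minimal sketch of the naïve approach. The names (`serial_dictatorship`, `prefs`, `capacity`) and the two-child example are my own illustration, not taken from any real system:

```python
def serial_dictatorship(prefs, capacity):
    """Assign each child, in the order given, to the first school
    on their preference list that still has a free place."""
    remaining = dict(capacity)
    assignment = {}
    for child, ranked in prefs.items():
        for school in ranked:
            if remaining[school] > 0:
                remaining[school] -= 1
                assignment[child] = school
                break  # child placed; children with no free school stay out
    return assignment

# Two children, two one-seat schools: c1 is processed first and takes A,
# leaving c2 unmatched -- even though c1 -> B, c2 -> A places everyone.
prefs = {"c1": ["A", "B"], "c2": ["A"]}
capacity = {"A": 1, "B": 1}
print(serial_dictatorship(prefs, capacity))  # {'c1': 'A'} -- c2 gets nothing
```

With the order of arrival reversed, both children would have been placed, which is exactly the sense in which the naïve allocation fails to have maximum cardinality.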
Despite the poor characteristics of the naïve algorithm, I suspect that many organisations, both governmental and private, use such an algorithm when assigning people to resources when preference information is available. The problem is that the people in charge of building such systems often only see the need to allocate people to resources in whatever way possible and in the shortest time. So what happens? Well, they get a system that looks flashy and kind of works. However, a couple of months in, when they have a group of people complaining and operating outside the system, the reality finally dawns on them. I’m sure in the UK alone we could count quite a few computer systems that have fallen under the weight of this pressure.
So we have shown that the naïve algorithm does not always work. Can we safely assume that the heart of a software product lies in those complicated algorithms at its core? I think we can say so with Google, but who knows in general. In the example I described I think this is definitely the case. So what would be the “best” way to allocate these children to schools?
There are different meanings of “best”, but all would ensure that the allocation has maximum cardinality; without that we are at a lost cause straight away. Subject to this, we could attempt to maximise the number of children assigned to their first-choice school; then, subject to that, maximise the number assigned to their second-choice school, and so on. Therefore, if an angry parent complains that their child got their third-choice school, you can at least tell them that there is no other allocation in which their child does better without some child who got their first or second choice becoming worse off. Hence there is no incentive for the administrator to allow any changes to take place. This is known as a greedy maximum matching (allocation). Lastly, we need to make sure that these algorithms are efficient; an algorithm is of no use to anybody if it takes exponential time.
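As a sketch of the maximum-cardinality requirement alone (the greedy refinement over choice ranks would need a weighted matching on top of this, which I omit), here is the classic augmenting-path algorithm for bipartite matching, with each school expanded into one node per seat. All names here are my own invention for illustration:

```python
def max_cardinality_assignment(prefs, capacity):
    """Maximum-cardinality matching of children to school seats,
    via Kuhn's augmenting-path algorithm on a bipartite graph."""
    # Expand each school into `capacity` individual seats (slots).
    slots = [(s, i) for s, c in capacity.items() for i in range(c)]
    slot_index = {slot: k for k, slot in enumerate(slots)}
    # Each child may take any seat of any school on their list.
    adj = {child: [slot_index[(s, i)]
                   for s in ranked
                   for i in range(capacity[s])]
           for child, ranked in prefs.items()}
    match = [None] * len(slots)  # seat -> child currently holding it

    def try_augment(child, seen):
        # Try to place `child`, possibly displacing another child
        # who can in turn move to a different free seat.
        for k in adj[child]:
            if k in seen:
                continue
            seen.add(k)
            if match[k] is None or try_augment(match[k], seen):
                match[k] = child
                return True
        return False

    for child in prefs:
        try_augment(child, set())
    return {match[k]: slots[k][0] for k in range(len(slots))
            if match[k] is not None}

# The same instance that defeated the naïve algorithm: here c1 is
# displaced from A to B so that c2 can be placed, and everyone matches.
prefs = {"c1": ["A", "B"], "c2": ["A"]}
capacity = {"A": 1, "B": 1}
print(max_cardinality_assignment(prefs, capacity))  # c1 -> B, c2 -> A
```

This runs in polynomial time, which matters for the efficiency point above; production systems would typically reach for a mature matching or flow library rather than hand-rolled code like this.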
So there we go; just remember that naïve algorithms are not always (and often are not) the best.