abolish the switch statement

I can’t explain how much I hate the switch statement. I often do anything and everything to avoid using it. I just think that it makes code very, very ugly. Moreover, you just wouldn’t believe the number of times I have seen people use it when they only have two cases! Let’s look at an example in Java:

switch (n)
{
    case 0: 
       System.out.println("I'm zero");
       break;
    case 1:
    case 3:
    case 5:
       System.out.println("I'm odd");
       break;
    case 2:
    case 4:
       System.out.println("I'm even");
}

Surely no-one can tell me this actually looks “nice”? I mean coding is all about looking nice and neat after all. You can give the compiler something that looks like dog meat and it won’t give a shit as long as it’s valid syntax. Code is ALL about us humans reading it. The above is also a simple example, and switch statements are normally way more complicated than this.

The normal argument for using a switch is that it looks better than loads of if-then-elses, which I completely agree with as that looks ugly as well. Also, with Java (and others) you can only switch on integer values which is the height of annoyance. Furthermore, there can’t be a single person that has not suffered the “forgetting the break” problem on a switch statement. For the record, my main bugbear is the extra indentation the switch forces on me along with the fact that scrolling all the cases is frequently a nightmare.

So what am I proposing? Well, what I tend to use to solve both the ugliness problem and the lack of support for switching on stuff other than an integer is a HashMap. I’m sure some are screaming “Am I hearing this idiot correctly?!”

I really do think this looks particularly nice when used in code and can reduce a massive sprawling switch statement in a method to one simple call. So what does this entail? First, let’s assume that we have a method that takes a string as an argument and, depending on the contents of this string, performs a specific operation – this is a particularly common task. Let’s say we have a class like the following:

import java.util.HashMap;
import java.util.Map;

class Operations {
    // Operation is an interface with a single method: String run(OpContext op)
    private final Map<String, Operation> mapper = new HashMap<>();

    public Operations() {
        mapper.put("eyeSurgery", new EyeSurgery());
        mapper.put("heartSurgery", new HeartSurgery());
        mapper.put("heartKeyHoleSurgery", new HeartSurgery());
        // etc
    }

    /**
     * Carries out an operation, which is dependent on the value of op,
     * and returns a string containing the results of this operation.
     * (The method is named perform because do is a reserved word in Java.)
     */
    public String perform(OpContext op) {
        return mapper.get(op.toString()).run(op);
    }
}

Now surely the one-line lookup method above looks better than a switch statement would? Obviously in this case you couldn’t even use a switch statement in Java, but even if the type of the HashMap key was Integer, I still proclaim this looks better. OK, you have the extra setup cost but it’s got to be worth it and should run fast enough for most needs. It also opens up the possibility of “switching” on any object type rather than just an integer – presuming sensible hashCode and equals methods are defined for your object. Obviously this is a simple example but hopefully it illustrates what I’m trying to put across.
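For anyone who wants something they can paste in and run, here is a minimal self-contained sketch of the same idea. The Operation interface here is a hypothetical stand-in (it takes a plain String rather than an OpContext), the lambdas replace the surgery classes, and the names Dispatcher and perform are my own inventions for illustration. It also shows one way to handle a key that isn’t in the map, which the class above doesn’t deal with.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Operation interface; the real one is not shown in the post.
interface Operation {
    String run(String patient);
}

class Dispatcher {
    private final Map<String, Operation> mapper = new HashMap<>();

    Dispatcher() {
        // Each "case" of the would-be switch becomes one entry in the map.
        mapper.put("eyeSurgery", patient -> "eye surgery on " + patient);
        mapper.put("heartSurgery", patient -> "heart surgery on " + patient);
    }

    // The whole "switch" collapses to a lookup plus a call.
    String perform(String opName, String patient) {
        Operation op = mapper.get(opName);
        if (op == null) {
            // Plays the role of the default branch of a switch.
            throw new IllegalArgumentException("Unknown operation: " + opName);
        }
        return op.run(patient);
    }
}

class DispatchDemo {
    public static void main(String[] args) {
        Dispatcher dispatcher = new Dispatcher();
        System.out.println(dispatcher.perform("eyeSurgery", "Alice"));
    }
}
```

Adding a new case is then a single put in the constructor, and the lookup method never changes.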

I’m not sure if this was really obvious and everyone is doing this already. Hopefully someone will get something out of this. So, down with the switch statement, long live the hash map!

bad software engineering – made easy

Often I find myself suffering from a chronic bout of “project fatigue” – I made this term up off the top of my head so don’t go to your doctor asking about it 😉 . However, it’s not the symptoms of this illness that concern me, it’s the consequences of getting it. So what is this “project fatigue”?

You may be one of the lucky ones never to have experienced this, but I know my close friends who also work in the programming trade have suffered. It can set in at any time. It’s characterised by a morning or afternoon of doing anything but typing code into the computer, which can then be followed by days (or weeks) of doing absolutely nothing related to the work you’re supposed to be doing. And it all happens because you got tired/bored/frustrated/stuck with the project you are working on.

When you first start a project (or a new job for that matter) it normally has a euphoric feel to it. You charge in head first, and it’s brilliant, you are learning all these new things. You come in every day and after the obligatory email and quick favourites check you are straight on to coding away like a demon (or daemon). You’re thinking that “if I’m this productive and can get so much done, I don’t understand why it takes so long to finish things”. What you have forgotten about is THE FATIGUE.

So after maybe a month of working away night and day, it hits. Myself, I normally find it starts when I find out that the way I was doing something doesn’t seem to work. This inevitably means that some of the code you have written is now useless, and other stuff you have done really should be changed to accommodate the new design. To generalise, I would say that it causes you to do ANYTHING just to get the project done. This may be because you either don’t have the time to let the fatigue pass, or because you are just so desperate to get off the project that anything will do.

What this means is that where before you were creating a design of your code to die for, you now just make sure that it “works”. This is where it becomes so very easy to do some bad software engineering.

For example, you think “I should really change that class/method to such and such” but you are also thinking “well I could just hack round it by doing something not so nice, as that is bound to be quicker”. When you start thinking like this you should ALWAYS make sure that you do the former rather than the latter, as much as it pains you to do so. Why? When you come back to the software for a later version you will be glad you did it, and also it never takes as long as you think to do it the correct way.

So how can we avoid the fatigue? I really don’t know, maybe YOU have some suggestions? What I tend to find helps is leaving the code for a day or so and not thinking about it. Why not write a script! Obviously this “leaving it for a day or so” may not sit so well with your employer, but if he only knew that it would mean fewer bugs in the long run then it may be a different story. I dunno, as this line of thought doesn’t always work.

What I think is a far better option is to have a good mentor. I would imagine that about 98% of the fatigue is caused by a lack of motivation to finish, or by being stuck on a certain problem. Talking to someone about the work and resolving points that you are stuck on can be empowering. You can often be even more productive than just before the fatigue set in after a simple chat with someone – I often experienced this after a meeting with my supervisor when doing my PhD. It’s important to have the right person doing the mentoring though – as there are people that can make the situation worse by just plain confusing you or demoralising you with their “superior” knowledge.

So in the style of a counselling group, my name is Gregg and I’m suffering from project fatigue. What are your suggestions?

perl is dead, long live…perl?

When I started writing this article it was going to be about the choice of language people use when they create a quick and dirty script for, say, a one-off task. For this type of thing I tend to find myself using Perl and since I thought Perl was maybe a little long in the tooth, I wondered what the “cool kids” were using these days. However, this got me thinking more about the benefits of such scripts both to the programmer and their employer.

Often I find myself needing to process/generate a file in some way. To do this kind of task I feel that Java, C#, C++ etc are just waaaay too heavyweight for the task – my default reaction is that you need an interpreted language. For me this is Perl. I know PHP pretty well but it just never enters my head to use it for a command-line task, Python I kind of know but not very well, and Ruby again I just don’t know well enough, but would probably only consider this an option for a web app like PHP – no reason why I only see Ruby and PHP as web app not command-line options. So what do people use for this kind of stuff? I’m interested.

Anyway, this line of thought led me to the following observation: writing small scripts to do simple file manipulation/generation and system tasks makes you a better programmer. Even if the task you are trying to complete seems small and probably a one off, I say “write a script”. Why?

First, this may give you the opportunity to stray away from the heavyweights like Java that I mentioned above, and learn something new (the more you know the better programmer you are. Right?). Your boss may not like the idea, but the thing you are doing is work related so you can tell him it will save time, which I promise you in the long run it will. Even if you find you run the script only once you will find the learning process will have stood you in good stead.

Second, it’s an ideal way to break the monotony of your everyday work cycle. You may have been working on a particular project for months and, as happens, you are beginning to hate the thought of even looking at the code, never mind writing more of it. So, if you think of a script that will help you in some way in your day to day work, and will benefit both you and the company, then write it. Not only does it break the monotony but when you finish it there is also a sense of achievement that you feel by actually completing something. This sense of purpose is then reflected back into the main project; I mean you want that sense of achievement again. Right? So everyone wins.

Third, well surely the above two are more than enough reason to do it. If not tell me some more 🙂 . You can always turn the little script writing into a competition within your work, i.e. who can come up with the best/most useful script! Don’t worry about finding these types of scripts to write – I generally think of a couple every day. Just look around you and you will be amazed at what you will find.

Over and out.

NP-completemess

Well yesterday I read a blog post by Jeff Atwood over at www.codinghorror.com about NP-completeness. In the comments to his post I felt that Jeff got a somewhat raw deal with the criticism he received. Ok, the details of his post were not exactly accurate to the nth degree, but his post may have identified a problem set that some of his many readers may never otherwise have been aware of. This has got to be a good thing, right?

So anyway, with a little experience on my side with these NP-complete problems maybe I will try to shed some further light on the details. Remember though, this is quite a complicated area and many books have been written about this stuff and many will save a whole chapter just as an introduction. Thus, I will try to keep this as light as possible and hopefully the detractors can fill in their own details where I leave them out.

First, one of the biggest questions in the field of computer science is: does P=NP? For those that wish to pursue research in this area, there is the Millennium Prize offering a cool $1million for a proof either way of this question. Those living in the UK will be hoping that it’s paid in dollars with the piss poor exchange rate 😉 .

Now what is P? Well P is the set of problems that can be solved in polynomial time. That is, a problem for which an algorithm is known that takes O(n^k) time, where n is the size of the input and k is some fixed constant; fixed meaning it doesn’t vary with the input size. An example of such an algorithm is Quicksort, whose worst-case complexity is O(n^2).

So what is NP? Well first, NP does NOT stand for Non-Polynomial. If you want to take anything from this just take that, as it will save you embarrassment when you say it does (sorry, I’ve seen this so many times). It actually stands for Non-deterministic Polynomial. Now when is a problem in NP?

For a problem to be in NP, the problem must be a decision problem. See, this is where it escalates trying to describe this stuff, as you now have to define a decision problem for all this to make sense. Anyway, a decision problem is a problem where the answer can be given as a simple ‘Yes’ or ‘No’. For example, given two cities is there a direct plane route between these cities. Clearly the answer is Yes or No. For a more computing/maths related example we can say that given a graph G, does there exist a path between two vertices, say u and v, in G.
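To make the graph example concrete, here is a small sketch of that decision problem as a method that returns true (Yes) or false (No). The class and method names are made up for illustration, and I’m assuming the graph is handed over as a directed adjacency list keyed by vertex name. It uses a plain breadth-first search, so this particular decision problem happens to be solvable in polynomial time.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PathDecision {
    // Decision problem: is there a path from u to v in the graph? Answer: Yes (true) or No (false).
    // The graph is assumed to be an adjacency list: vertex -> list of directly reachable vertices.
    static boolean pathExists(Map<String, List<String>> graph, String u, String v) {
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(u);
        queue.add(u);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(v)) {
                return true;
            }
            // Vertices with no outgoing edges may be missing from the map.
            for (String next : graph.getOrDefault(current, List.of())) {
                if (seen.add(next)) { // add returns false if we have visited it already
                    queue.add(next);
                }
            }
        }
        return false;
    }
}
```

Note that the method only answers Yes or No; it doesn’t hand back the path itself, which is exactly what makes it a decision problem.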

Now that we know what a decision problem is, we can now describe what the class of problems in NP look like. Mmmmm, if only. In actual fact, I now need to describe what a non-deterministic algorithm is. Sigh. Let’s keep it simple if not 100% accurate. A non-deterministic algorithm has two phases. The first phase guesses an answer (certificate) to our problem and the second phase verifies that this certificate satisfies the constraints of the problem in polynomial-time. For example, consider the problem of matching 3 sets of people (known as 3D matching). Let’s name these people men, women and dogs, with each set having size n. So what we want is a set of (man, woman, dog) triples, such that each man, woman and dog appear in exactly one triple and the matching is of size n, i.e. everyone is matched. So the decision problem here is: given such an instance does there exist a matching of size n? The non-deterministic algorithm is: guess a set of (man,woman,dog) triples, then first verify that our set of triples is of size n, by counting the triples, and then to ensure that it is a matching, we can check that no two triples have someone in common.
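The verification phase is the easy bit to write down, so here is a rough sketch of it. The names (MatchingVerifier, Triple, isPerfectMatching) are my own, and it needs Java 16 or later for the record. One detail I glossed over above: in the full 3D matching problem you are also given a set of allowed triples, and the matching has to be drawn from that set, so the sketch takes it as a parameter. The guessing phase is not shown, as that is the non-deterministic part.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MatchingVerifier {
    // One (man, woman, dog) triple; records give us equals/hashCode for free.
    record Triple(String man, String woman, String dog) {}

    // Verifies a guessed certificate in time polynomial in its size.
    // allowed: the triples permitted by the instance (assumed input, see the note above).
    // n: the size of each of the three sets.
    static boolean isPerfectMatching(List<Triple> certificate, Set<Triple> allowed, int n) {
        // Check 1: the certificate must contain exactly n triples.
        if (certificate.size() != n) {
            return false;
        }
        Set<String> men = new HashSet<>();
        Set<String> women = new HashSet<>();
        Set<String> dogs = new HashSet<>();
        for (Triple t : certificate) {
            // Each triple must be one the instance actually permits.
            if (!allowed.contains(t)) {
                return false;
            }
            // Check 2: no two triples may share a man, a woman or a dog.
            if (!men.add(t.man()) || !women.add(t.woman()) || !dogs.add(t.dog())) {
                return false;
            }
        }
        return true;
    }
}
```

The loop touches each triple once, so checking a guess is cheap; the hard part is that there are an awful lot of guesses to choose from.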

Phhhhewww, that was longer than I imagined, see why Jeff’s abbreviated version may be better if not so accurate. I mean I have not even described what an NP-complete problem is yet. We can however describe what the class of problems in NP now are.

The class NP is defined as the collection of all decision problems that are polynomial-time solvable using a non-deterministic algorithm. We note here that many of the problems we see in everyday (programming) life are defined as optimization problems rather than decision problems – an optimization problem is one where we seek to maximise or minimise some property, for example the Travelling Salesman Problem. However, it is known that all optimization problems can be re-written as a decision version of the same problem.

So can we now define what an NP-complete problem is? Mmmm just about. In fact, I will state it here and then describe the one further bit of information we need. A problem X is NP-complete if:

  1. X is in NP;
  2. X is NP-hard.

Often you hear people getting confused by NP-complete and NP-hard. They are different things: an NP-hard problem need not be NP-complete, as it need not even be in NP. Basically, NP-hard problems are at least as hard as the hardest problems in NP. Also, if a combinatorial decision problem is NP-complete we say that the corresponding optimisation problem is NP-hard.

So what does it mean for a problem to be NP-hard? Well to show that a problem is NP-hard we must show that an NP-complete problem is polynomial-time reducible to it (in fact, to be precise, what we have to show is that ALL problems in NP are reducible to it, however due to the transitivity of reductions we need only show that a single NP-complete problem is reducible to it). That is, we need to show that we can transform any instance of the known problem into an instance of our problem in polynomial time, such that a Yes answer in one problem maps to a Yes answer in the other and vice-versa, similarly with the No answers.

Now if only we had one problem that we know is NP-complete. Well we do! This was shown in Cook’s Theorem. So now all (and I say all very lightly) we have to do is show that satisfiability, or some other NP-complete problem, is polynomial-time reducible to our problem – satisfiability (SAT) is the problem Cook showed to be NP-complete. Another (classic) example of an NP-complete problem is the 3D matching problem example given above.

And there you have it, NP-completeness in just over 1000 words. Although I missed a few details out, I hope this is useful to someone. Incidentally I have met Stephen Cook (of Cook’s theorem fame) before and I’m pretty sure from experience he would struggle to tell you this in under 1000 words 😉 Actually maybe not………………..

Over and out.

being dynamic is overrated

I was going to post about a little problem that I experienced while undertaking a project I’m involved with but canned it thinking it was of limited interest. That was, however, until I made the same mistake again yesterday, so I thought I would document it to save my own sanity, even if there is no-one else who cares – not that anyone reads this anyway.

So to my faux pas, and believe it or not this cost me a few hours! The problem: well as part of my project (a bike shop website for those who really want to know – www.cyclewize.com) I’m programming in PHP using CakePHP. The advantages/disadvantages of PHP can be debated from dusk till dawn if that is your thing – just remember though, while you are debating, all those smart people are just getting on with doing stuff. That said, my problem was, in a sense, due to the use of PHP, or of dynamic languages in general.

Anyway, back on track, so the component I was working on simply uploads a set of photos and scales them accordingly. Simple as that. The problem that I kept seeing was that although the photographs were being uploaded ok, i.e. no errors were being reported, I was unable to copy the files from their temporary directory over to their new home. Since no errors were being returned I was stuck as to what was happening. I battled away putting print_r’s in here, there and everywhere to no avail. Then I just happened to check my call to move_uploaded_file. On inspection, I noticed I had a slight misspelling of a variable name in one of the function’s arguments, thus only at that point did the variable come into existence. Therefore what was getting passed to the function was the empty string and hence move_uploaded_file could not copy the file – you would have thought it should have raised an error but no. However, as it turns out, even if it had raised an error it wouldn’t have helped me much. Why? Because straight after the call to move_uploaded_file I was doing a redirect to a new page. Therefore, any errors related to the problem HTTP request were output to a screen which flashed by too quickly for me to notice.

So what is the root of this problem? Well if I was using a compiled language it is likely that the compiler would have warned me that the variable I was passing to move_uploaded_file was either not declared or at least had never been initialised. I’m pretty sure that a warning message would have been output (which I missed) by PHP’s interpreter but this is no substitute for a compile-time message. As much as I love the freedom of dynamic languages you have to be aware of this significant drawback, and people can go on about how this stuff gets picked up through thorough testing but I’m not sure that’s good enough. Also, the page redirect was a killer, as it made me miss all the output messages. It’s probably worthwhile not performing redirects if you find something isn’t working, just to ensure that you are seeing all the messages output by the interpreter (you can always switch it back on later). Finally, some sort of lint check should have picked this up. So from now on I think I will be running things through php -l to check that my syntax is good.

There you have it, a nice root cause analysis! I probably missed out the fact that it was my rather inane programming that was the main problem, but bugger it, I’m not really into damaging my ego.

programmer rehab

Faced with the task of learning a new programming language (and its associated libraries) I was left considering what is the best way to do this effectively. I will now describe what I do, and by no means am I advocating its use, be aware of the hidden dangers that lie with it.

My first port of call is normally to try and see if some book on Amazon is the overwhelming favourite of the masses and purchase it. I must admit though, I often get the book, flick through it, and then probably ignore just about everything in it. I do however normally remember the name of the book and the author and when someone asks me if there is a book I recommend I spout this information to them. So the moral of this is probably never to ask me to recommend a book as I’m probably grossly misinformed about what is actually in it.

Anyway, in the first week of learning the language I will try to map the syntax of the language to concepts I know, e.g. object-orientation. This probably presents the first problem: what if the language is not really object-oriented? My solution, make it object-oriented 🙂 This can prove tiresome in certain languages, and JavaScript in particular. However I’m not one for letting this type of hurdle get in the way, and tend to find a way to do this (with JavaScript this was the module pattern described by Douglas Crockford). By the second week I’m normally getting pissed off with my lack of understanding of the language’s features. I think this in part stems from the fact that my style of coding appears to be a copy-paste based system. What I mean by this is that quite often at the start I won’t even bother learning how to, say, define a class – I will find it out once then just copy-paste the class definition in from another class and change the name. A similar process is then used for other language features (I must point out I’m not copy-pasting the code itself though, that’s just waaay too 80’s). So those first couple of classes and library calls can be like kicking the crack pipe.

After this initial push though, it tends to get easier. I replace the crack induced cold turkey with a slightly more sociable nicotine based habit, but experience constant irritation at the need to stand out in the cold street whenever I want to spark up in public. At this stage I try to abstract most things away from the language itself and think more about how I can isolate change. In other words I tend to have the thought process where I ask myself, if someone asked me to change a requirement, how can I do it with just a change to one line; oh and preferably that one line I would have to change is at the top of the file so I don’t have to search through everything to find it. My motto is that not everyone cares enough to want to learn how something works and is happy if the process is simple and easy. Me, I prefer to waste a mountain of time trying to figure out every detail of how the code fits together – I really need to give up that pursuit, rehab or something. So at this point I start to trace into every language feature, not happy with it just working – tantamount to starting chain smoking. If I get through this stage then I will normally stick with the language; many however have fallen by the wayside after two weeks. More so learning specific frameworks though, as you start thinking “God I could do this far easier if I was writing ALL the code myself”. Remember if you get these thoughts they are normally bullshit though, try to ignore them, as you ARE wrong, you just don’t know it. It’s your mind trying to trick you out of a little hard work. All you will be is a hopeless quitter!

So only after I’ve kicked the crack pipe by weaning myself on to ciggies, and then developed a massive chain smoking habit, do I get the moment of enlightenment where I dump the ciggies and tag myself clean of any irritants (well apart from all those people who tell you that you have learned the wrong language, who all should go get a life). At this point I still occasionally use my copy-paste system described above but it’s becoming easier.

As I said at the beginning of this post I’m not sure this is the best way to go about this process. The main problem that I see is that I don’t learn all the language specific features (maybe this is a good thing). In most languages I work with a standard set of ideas that I apply to the specific language. This always leaves me thinking I could get more out of the language if I only pushed the boundaries a little further. For example, I always think that I’m not quite getting as much out of things like closures, lambdas, etc. I use these things but the way people rave about them I feel there must be more to it.

So that’s it. It usually takes me about a month or so to get comfortable with the language. I then use the language exclusively for around six months and then, for whatever reason, have to shift back to an old favourite which I have forgotten how to use. So I dust down the crack pipe and seek out the Rizlas and start all over again……