is ruby killing your career?

I’m probably at the point with Ruby where I consider it my programming language of choice (I program in both Ruby and C++ in my day job).

Over the last few years I’ve kind of grown to love Ruby but I’m not really one to get passionate over someone else’s choice of programming language – apart from Java, which, I’m sorry, I hate. However, when it comes to employment, there is no doubt in my mind that being competent in a particular programming language can strongly influence A) getting an interview and B) getting the job.

This is why Ruby developers, like me, are killing their careers. Sure, Ruby is cool and Rails is awesome, but do a quick check on job boards and see how many people are looking for a Ruby developer. Actually, let me save you the time: I’ve done some of the work already.

I’m not claiming this to be scientific in any way whatsoever, but it does warrant some thought. I only searched using the programming language as a keyword, which, I know, may not give the full story but should convince you there is some merit in the point that I’m trying to make. Additionally (and I suppose somewhat importantly) my search area was restricted to Scotland.

First up I carried out a search on s1jobs.com. The table below gives a summary of the results:

Language   Number of jobs matching keyword
Ruby       3
Java       18
C#         26
C++        9
PHP        7

I then tried cwjobs.co.uk:

Language   Number of jobs matching keyword
Ruby       2
Java       35
C#         45
C++        45
PHP        4

As you can see, the job prospects for Ruby developers here in Scotland are somewhat dire. Sure, people don’t always look for a particular programming language when employing someone (which is a decent policy) but, as I said above, it helps a lot.

I decided to take my crude search a little further as I thought “Hell, there will be waaaaaaay more cool Ruby jobs in London”. Below we have the results, just cwjobs this time:

Language   Number of jobs matching keyword
Ruby       57
Java       792
C#         838
C++        611
PHP        196

That was kind of disappointing! Ruby still doesn’t do that great – even worse when you realise there were over 200 jobs that mentioned Perl and 150 that mentioned Python. By the looks of it, if you want to maximise your chances of getting a job in the UK, and are already doing Java or C# in your day job, you’d be better off learning C/C++ in your spare time.

Is all this going to stop me coding in Ruby? Probably not. Is it worth thinking about for a minute? Yes, sure. If I was starting my own company and was hoping to get some developers in, then I’d likely be faced with a problem. Yes, you can train people up, but that costs time and money. When they leave it may be worse, as finding replacements at the required skill level will be difficult. Finding a Java/C#/C++ programmer is bound to be far easier.

So is it all bad news for us Ruby developers? Well, not if you plan to move to California – yeah yeah, I know I’ve gone on about it before. I’m not exactly sure of the popular job boards in the US so I went with the only one I knew off the top of my head, careers.stackoverflow.com. The results for the Bay Area are as follows:

Language   Number of jobs matching keyword
Ruby       27
Java       33
C#         10
C++        23
PHP        17

Maybe this was a skewed sample set, but it’s impressive all the same. So the moral of the story is: if you want to be a well-paid Ruby hacker, make sure you don’t stay in Scotland :-).

sticking with what you know

There comes a time in every programmer’s life when they have to learn new things and step outside the box. Yeah, it’s difficult, for sure. It’s all too easy to create the latest application in your software empire using a language you’ve been developing in for the last 10 years. However, the real problem is thinking this is the only choice. When is it time to abandon this certitude?

First, we cover the forced abandonment. This is when you are pushed kicking and screaming into pastures new, whether you like it or not, i.e. the new job. Here, not only is the new language curve ball thrown (viciously), but you also get a whole new set of business rules into the bargain. So what do you do? You program the new language like the old one, only translating the syntax in your head. This is not the best way to learn a language though. Why? Well, consider those C programmers trying to program imperatively in Java, Java programmers in JavaScript, C++ programmers in Ruby, and so on. When there is a change in paradigm this mapping strategy just doesn’t work – a similar situation exists with languages that contain a more powerful expression set. It also encourages people to learn just enough to get the job done, without understanding what is really happening, or that there may have been a better way using the language’s “unmappable” features. A better approach would be to write something small, and new, that allows you to explore the language’s features. I’m sure most people can think of something they could write. Furthermore, if you can make it useful to other people, or even your new employer, then everyone’s a winner! This is something I touched on before.

For many people though, this is the only time they will ever consider abandoning. This is sad, and a poor characteristic in a programmer. And to be honest, I just don’t understand it. That’s not to say that I don’t accept that people just do programming as a job, then go home and don’t think about it. However, it’s like most things in life, it’s nice to progress?

As a programmer there will also be other signs that the tide is turning, and you don’t have to be too alert to spot these. Previously I wrote “Perl is Dead, Long Live…Perl?” and being a big Perl fan it was sad to see the language apparently dying, so I know what it’s like. Some signs to look out for may be:

  • the language features are not moving on (Java watch your back) – the people who created it no longer care,
  • the community surrounding the language is dwindling – the people who use it no longer care,
  • there is little in the way of choice when selecting libraries/frameworks – the experts have fled,
  • other programmers have never heard of it – there is no buzz,
  • jobs using it are few and far between – businesses have given up on it, the death knell.

However, this is all not to say that you give up on your language just because it’s no longer cool – popularity is by no means a great indicator that something will suit your needs. It need not be the case that you give up on your language of choice, instead it could be that you contribute and drag the language forward. But be careful with this one.

Finally, any decent employer will want to see that you are continually developing your skill set – their business needs are continually evolving, so why aren’t you? You are much more likely to land a better job if you contribute to your own education in some way. It looks good and it’s also something to talk about.

So go out and learn something new today, and stop sticking with what you know.

generating a unique range of numbers

Often at work I find the need to generate a set of n unique integers in a specified range. In order to do this as efficiently and as easily as possible, I created a small C# class, which a colleague thought may be of general interest. Hence, I’m posting it here. I’m sure someone else has come up with a similar (or better) way to do this in the past, but I’m sharing my way regardless 😀 .

The code is shown below (and you can download the C# class here):

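 using System;

 // Hands out each integer in [start, start + size) exactly once, in random order.
 class UniqueSetGenerator
 {
      private int[] store_;
      private int size_;      // number of values not yet handed out
      private Random random_;
 
      public UniqueSetGenerator(int size, int start)
      {
          size_ = size;
          store_ = new int[size];
          random_ = new Random();
          PopulateArray(start);
      }
 
      // Fills the array with start, start + 1, ..., start + size - 1.
      private void PopulateArray(int start)
      {
          for (int i = 0; i < size_; i++)
              store_[i] = start++;
      }
 
      // Returns the value at pos and overwrites that slot with the last live
      // value, shrinking the live range by one. Nothing is physically removed.
      private int Delete(int pos)
      {
          int val = store_[pos];
          store_[pos] = store_[--size_];
          return val;
      }
 
      // Returns a value not returned before, or -1 once every value has been used.
      public int GetRandomNumber()
      {
          if (size_ <= 0)
             return -1;
 
          return Delete(random_.Next(size_));
      }
 }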

I think it’s pretty easy to see how this works, so I will not go into much detail, but the sequence of figures below shows the basic operations.

First populate the array with the values.



Use the Random class to obtain our first random position - in this case 6. Now copy the value at the last live position (size_ - 1) to position 6, and decrement size_. Note that we don't actually delete anything from the array.


We then use the Random class to generate another random position in our reduced range - in this case 3. We then copy the value at the end of the live range (i.e. 9) to position 3, and decrement the size_ count as before. This process continues until size_ is reduced to 0.

So when would you require something like this? Well, say you need to generate unique random IDs with values 1 to 10, then you can use the class as follows:

    UniqueSetGenerator uniqueSet = new UniqueSetGenerator(10, 1);
 
    // Draw all 10 ids (the values 1 to 10) in random order
    for(int i = 0; i < 10; i++)
    {
        int id = uniqueSet.GetRandomNumber();
 
        // Now do something with the id....
    }

For the moment I have only included a C# version but I will update this post with a Java version soon. Hope some of you find this useful.
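
As a rough idea of what a Java version could look like, here is a minimal sketch of the same swap-with-last technique. It is not the promised official port: the method names hasNext and next, the extra constructor taking a Random, and the choice to throw an exception instead of returning -1 are all assumptions made for this sketch.

```java
import java.util.NoSuchElementException;
import java.util.Random;

/**
 * Hands out each integer in [start, start + size) exactly once, in random order.
 * A sketch of a Java port of the C# class above, not the original author's code.
 */
class UniqueSetGenerator {
    private final int[] store;
    private final Random random;
    private int size; // number of values not yet handed out

    UniqueSetGenerator(int size, int start) {
        this(size, start, new Random());
    }

    // Extra constructor (an assumption of this sketch) so a seeded Random can be used.
    UniqueSetGenerator(int size, int start, Random random) {
        this.size = size;
        this.store = new int[size];
        this.random = random;
        for (int i = 0; i < size; i++) {
            store[i] = start + i;
        }
    }

    boolean hasNext() {
        return size > 0;
    }

    /** Returns a value not returned before, or throws if none remain. */
    int next() {
        if (size <= 0) {
            throw new NoSuchElementException("all values have been used");
        }
        int pos = random.nextInt(size);
        int value = store[pos];
        // Overwrite the chosen slot with the last live value, then shrink the live range.
        store[pos] = store[--size];
        return value;
    }
}
```

Throwing rather than returning -1 is only a design choice for the sketch: -1 could be a legitimate value if the range starts below zero. Typical use would be a loop of the form while (generator.hasNext()) { int id = generator.next(); ... }.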

programming language obsession makes you look stupid

It appears that many people seem to have too much of their life wrapped up in a particular programming language.  You only need to look over at dZone, Reddit or Digg to see this fandom in all its glory.  All too often we find articles about why such and such a programming language sucks.  However, just because a language sucks for one (or a couple) of particular reasons, it doesn’t mean it is not useful in general.  It’s like me saying computers suck because they crash.  However, just because my computer crashes from time to time doesn’t mean it’s not useful.

I just find the whole religious aspect to a language rather pathetic.  The result often leads to the inappropriate choice of language for development of an application, simply because the individual’s voice that is heard the loudest makes the decision.  Ok, if any language will do then just go with whatever you are comfortable with, but stop yourself bitching about other people’s choice of language.

For example, the number of times you hear people saying dynamic languages are no use, for a plethora of reasons, is stunning.  You would think that no one had ever developed anything of reasonable size and scale in these languages.  It’s not as if most of the largest websites on this planet have not been written in PHP/Python/Ruby, yet you still read articles where people are saying that such a feat is likely to lead to catastrophe.  Stop doing this; it makes you look stupid.

The same can be said for those that diss Java.  OK, I think it’s possibly a poor choice for those considering a startup web business (basically, if you are considering shared hosting, Java is nonexistent as an option on that platform), but there are many places where the use of existing libraries written in Java makes it the ideal choice for an application.  An example of this can be seen in what I’m currently working on, which is an application that uses constraint programming techniques.  There are a few such libraries in other languages, but the most mature and feature-rich (and free) are in Java, so sense dictates you use Java.

Essentially my bugbear boils down to people choosing a language for the wrong reasons, more often than not due to blind faith rather than education.  Don’t just use a language because it is popular; use it because it best fits the job needing done.  Popularity can come into it though, because at the end of the day you might wish to tap into a large set of existing programmers, or you may want to attract the brightest young talent who want to work in what’s popular/new.  Just don’t let it be the only thing that dictates your choice.

Unfortunately, regardless of however many blog posts or articles people read and write, I feel that we are never going to remove this inherent language evangelism.  Maybe the industry would be in a far better position if we were all language agnostic.  Can you imagine how much more work would get done if people spent the first two months of a project actually doing work rather than arguing about what language it should all be written in?

forgive me for i have sinned, it’s been two years since i last used an ide

Right, this ain’t as bad as it seems, honest! But in the last two years I have pretty much coded Java without the use of an IDE. Albeit I wasn’t doing too much Java; all the same though, I was doing pretty much everything by hand. Why, I hear you scream!

A little background first.  In my first job I used Visual Studio (6!) almost exclusively and, being just out of uni with no prior experience of any IDE, I thought it was brilliant.  Time has moved on since then and, if we all face facts, VS is still a top (the best?) developer tool.  On to my next job, where development was done in XEmacs and debugging with some Sun IDE; god, I can’t remember the name of it for the life of me, it was ancient at the time, Work(something) I think?  This was going back a step.  However, I got to love XEmacs, and within our organisation we had quite a few Emacs Lisp scripts that allowed us to automate adding things like header files or creating classes.  Then when I moved onto Java development near the end of my time there I just stuck with XEmacs – there wasn’t too much else around at the time (2002-2003) that wouldn’t run like a dog on a Sun U10.  After this I went back to uni to do a PhD (that’s back to Grad School for my readers over the Atlantic) where I did a fair bit of Java in my first year but largely went without programming after this.  I always think it seems bizarre that you can do a PhD in the area of Algorithms and most folk barely write a line of code – and some don’t even know a programming language, no joke!   Fortunately though I did a fair bit of contracting work throughout my graduate studies using Visual C++ (.NET 2003 then onto .NET 2008). Also, in the last year or so, I have managed to do quite a bit of JavaScript programming and some PHP stuff – again non-IDE based.

Right, not sure whether the background was needed but you got it anyway.  So where was I?  Yeah, I tried Eclipse at the start of my time as a graduate student and just thought it sucked!  It was slow, froze often and I ended up back with trusty old XEmacs.  I did miss a debugger though – especially as I was using VS for the contracting work.  I just couldn’t face Eclipse though.

Now skip forward around 4 years to a few months ago and I find myself writing loads of Java again.  By this time I had ditched XEmacs for the very nice e text editor – I saw a Google developer videocast and in it they were using TextMate and I thought it looked awesome, so I wanted a Windows version of it, which is what e promised, and delivered.  This was great, it did have some nice autocompletion stuff but I was still doing quite a bit manually.  I never really felt that it was a problem though, as I just got on with doing what I was doing.  Then after a couple of conversations with people I gradually started thinking that I may be missing out on stuff that the IDE provided, so I decided to give it a go.

I tried Eclipse and I still think it sucks.  It’s not that slow anymore but I can’t put my finger on why I don’t like it, just tastes I suppose.  Or maybe it was because I had tried NetBeans 6.5, which I think is rather excellent.  I have no doubt Eclipse does all the things that NetBeans does but I could “figure out” NetBeans quicker.

For starters, it made it pretty clear where I should put my unit test files: you simply click add on the “Test Sources” tree.  Easy.  When I added my existing source files it didn’t seem to have the same bizarre problem that Eclipse was giving me for my package name.  Then when I wanted to run one of my tests it seemed easy: I just right-clicked and selected Run File.  I mean, I’m not saying you can’t do these things in Eclipse, but it certainly wasn’t as easy as doing them in NetBeans, so why go with Eclipse?  I won’t go on further, but essentially getting started in NetBeans just seems easier than Eclipse, which has got to be the single most important thing for gaining new users.

All that said and done, my general point is that boy have I been waaaaaaay more productive with the IDE over using an editor and the old System.out.println style of debugging!  I would say the key elements to this improvement have been intellisense, automatic compilation (probably a fancier word for it), ease of unit testing and of course a nice shiny debugger (I had tried JSwat as a stand alone debugger at first but it constantly failed to repaint its window).

There are a few things that annoy me about both Eclipse and NetBeans; the first is the color themes.  With e it is so easy to change themes and there are plenty of themes available – I really need a dark background for developing.  However, it’s a royal pain in the ass with both of the above IDEs.  Furthermore, NetBeans gave me a laptop-destroying moment where I deleted files from the NetBeans project and expected it only to delete them from the actual project.  No, afraid not, it deleted them from the file system, without making it obvious it was going to do this.  I was seconds away from throwing my laptop against a wall, seriously!

Apart from that though it really has been a bonus switching to an IDE.  Those of you out there that are thinking a text editor and println will do are just kidding yourself on – like I was.  Those using an IDE are going to be way more productive than you are.  Now if we only had a decent IDE for every programming language, we all have a dream right?

7 ways to write beautiful code

I’ve noticed that the developer and designer community appear to be obsessed with lists. Is this for a reason that I have somehow missed? For example, over at dZone, 2 out of the top 3 most popular articles are lists: 15 CSS Tricks That Must be Learned and 10 Dirty Little Web Development Tricks – incidentally I liked both articles so I seem to have this obsession myself. OK this is not that many but hey I’ve seen loads of lists elsewhere. Anyway, I thought I would embrace this culture to feed my own addiction and I have detailed 7 (I was only going to do 5, and 6 was out as it’s not prime 😉 ) ways to write beautiful code.

First things first here, I’m talking about pure aesthetics, nothing else. As I have said previously, good code starts by being something other people find easy to read. In fact, Jeff Atwood had a blog post comparing coding to writing, citing several sources. I urge you to take a look.

Anyway on to my list.

  1. Return from if statements as quickly as possible.

    For example, consider the following JavaScript function, this just looks horrific:

    function findShape(flags, point, attribute, list) {
        if(!findShapePoints(flags, point, attribute)) {
            if(!doFindShapePoints(flags, point, attribute)) {
                if(!findInShape(flags, point, attribute)) {
                    if(!findFromGuide(flags, point)) {
                        if(list.count() > 0 && flags == 1) {
                            doSomething();
                        }
                    }
                }
            }
        }
    }

    Instead we can change the above to the following:

    function findShape(flags, point, attribute, list) {
        if(findShapePoints(flags, point, attribute)) {
            return;
        }
     
        if(doFindShapePoints(flags, point, attribute)) {
            return;
        }
     
        if(findInShape(flags, point, attribute)) {
            return;
        }
     
        if(findFromGuide(flags, point)) {
            return;
        }
     
        if(!(list.count() > 0 && flags == 1)) {
            return;
        }
     
        doSomething();
    }

    You probably wouldn’t even want a function like the second one, too much going on (see point 7), but it illustrates exiting as soon as you can from an if statement. The same can be said about avoiding unnecessary else statements.

  2. Don’t use an if statement when all you simply want to do is return the boolean from the condition of the if.

    Once again an example will better illustrate:

    function isStringEmpty(str){
        if(str === "") { 
            return true;
        }
        else {
            return false;
        }
    }

    Just remove the if statement completely:

    function isStringEmpty(str){
        return (str === "");
    }

  3. Please use whitespace, it’s free!

    You wouldn’t believe the number of people that just don’t use whitespace – you would think there was a tax associated with using it. Again another example, and I hesitate to say this, but this is from real live code (as was the first example); all I have done is change the programming language and some function names – to protect the guilty:

    function getSomeAngle() {
        // Some code here then
        radAngle1 = Math.atan(slope(center, point1));
        radAngle2 = Math.atan(slope(center, point2));
        firstAngle = getStartAngle(radAngle1, point1, center);
        secondAngle = getStartAngle(radAngle2, point2, center);
        radAngle1 = degreesToRadians(firstAngle);
        radAngle2 = degreesToRadians(secondAngle);
        baseRadius = distance(point, center);
        radius = baseRadius + (lines * y);
        p1["x"] = roundValue(radius * Math.cos(radAngle1) + center["x"]);
        p1["y"] = roundValue(radius * Math.sin(radAngle1) + center["y"]);
        pt2["x"] = roundValue(radius * Math.cos(radAngle2) + center["x"]);
        pt2["y"] = roundValue(radius * Math.sin(radAngle2) + center["y"]);
        // Now some more code
    }

    I mean I won’t bother putting an example of how it should be – it should just be sooo bloody obvious. That said, I see code like this ALL the time and so certain people do not find it that easy to judge how to use whitespace. Screw it, for them I will inject some whitespace into the example and it’s shown below.

    function getSomeAngle() {
        // Some code here then
        radAngle1 = Math.atan(slope(center, point1));
        radAngle2 = Math.atan(slope(center, point2));
     
        firstAngle = getStartAngle(radAngle1, point1, center);
        secondAngle = getStartAngle(radAngle2, point2, center);
     
        radAngle1 = degreesToRadians(firstAngle);
        radAngle2 = degreesToRadians(secondAngle);
     
        baseRadius = distance(point, center);
        radius = baseRadius + (lines * y);
     
        p1["x"] = roundValue(radius * Math.cos(radAngle1) + center["x"]);
        p1["y"] = roundValue(radius * Math.sin(radAngle1) + center["y"]);
     
        pt2["x"] = roundValue(radius * Math.cos(radAngle2) + center["x"]);
        pt2["y"] = roundValue(radius * Math.sin(radAngle2) + center["y"]);
        // Now some more code
    }

  4. Don’t have useless comments.

    This one can get quite irritating. Don’t point out the obvious in comments. In the example below everyone can see that we’re getting the student’s id; there is no need to point it out.

    function existsStudent(id, list) {
        for(i = 0; i < list.length; i++) {
           student = list[i];
     
           // Get the student's id
           thisId = student.getId();
     
           if(thisId === id) {
               return true;
           }
        }
        return false;   
    }

  5. Don’t leave code that has been commented out in the source file, delete it.

    If you are using version control, which hopefully you are – if not why not! – then you can always get that code back easily by reverting to a previous version. There is nothing more off-putting than looking through code and seeing a large commented-out block of code. Something like below, or even a large comment block within a function itself.

    //function thisReallyHandyFunction() {
    //      someMagic();
    //      someMoreMagic();
    //      magicNumber = evenMoreMagic();
    //      return magicNumber;
    //}

  6. Don’t have overly long lines.

    There is nothing worse than when you look at code that has lines that go on forever – especially with sample code on the internet. The number of times I see this and go ahhhhhhhhhh (I’ll switch to Java for this, as generics makes this particularly easy to do):

    public static EnumMap<Category, IntPair> getGroupCategoryDistribution(EnumMap<Category, Integer> sizes, int groups) {
            EnumMap<Category, IntPair> categoryGroupCounts = new EnumMap<Category,IntPair>(Category.class);
     
            for(Category cat : Category.values()) {
                categoryGroupCounts.put(cat, getCategoryDistribution(sizes.get(cat), groups));
            }

    I’m not suggesting the 80-character width that you had to stick to on old Unix terminals, but a sensible limit of, say, 120 characters makes things a little easier. Obviously if you are putting sample code on the internet and you have it within a fixed-width container, make it easier for people to read by actually having it fit in the container.

  7. Don’t have too many lines within a function/method.

    Believe it or not a few years ago now an old work colleague exclaimed that Visual C++ was “shit” as it didn’t allow you to have a method with more than 10,000 lines. I kid you not – well ok I can’t remember the exact number of lines but it was huge. I still see this time and time again where a function/method is at least 50 lines long. Can anyone tell me this is easy to follow? Not only that but it normally forms part of an if statement that you can never find the enclosing block because you are scrolling. To me anything over 30-35 lines is pretty hard to follow and requires scrolling. My recommendation is if it’s more than 10-15 lines consider splitting it up.
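
    To make point 6 a bit more concrete, here is one possible way to rewrap the Java snippet from that point so every line stays well under 100 columns. It is only a sketch: the original does not show Category, IntPair or getCategoryDistribution, so the definitions below (including the enum values, the division/remainder logic and the Distribution class name) are stand-ins invented so the example compiles.

```java
import java.util.EnumMap;
import java.util.Map;

// Stand-in types: the original snippet does not show these definitions.
enum Category { JUNIOR, SENIOR }

final class IntPair {
    final int first;
    final int second;

    IntPair(int first, int second) {
        this.first = first;
        this.second = second;
    }
}

class Distribution {
    // Hypothetical helper: `first` is the share per group, `second` the remainder.
    static IntPair getCategoryDistribution(int size, int groups) {
        return new IntPair(size / groups, size % groups);
    }

    // Same shape as the long-lined method, with the parameter list broken
    // across lines, the diamond operator, and a shorter local name.
    static Map<Category, IntPair> getGroupCategoryDistribution(
            Map<Category, Integer> sizes, int groups) {
        Map<Category, IntPair> counts = new EnumMap<>(Category.class);

        for (Category cat : Category.values()) {
            int size = sizes.getOrDefault(cat, 0);
            counts.put(cat, getCategoryDistribution(size, groups));
        }
        return counts;
    }
}
```

    Declaring the variables as Map rather than EnumMap is a choice made here purely to shorten the lines; whether that suits the real code depends on how callers use the result.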

This is by no means an exhaustive list and I could have gone on longer – in fact my abolish the switch statement motto would have been number 8 if I had not already mentioned it before. However, it has to end somewhere but feel free to state your own annoyances and maybe I can update the post with some more.

Over and out.

abolish the switch statement

I can’t explain how much I hate the switch statement. I often do everything and anything to avoid using it. I just think that it makes code very, very ugly. Moreover, you just wouldn’t believe the number of times I have seen people using it when they only have two cases! Let’s look at an example in Java:

switch (n)
{
    case 0: 
       System.out.println("I'm zero");
       break;
    case 1:
    case 3:
    case 5:
       System.out.println("I'm odd");
       break;
    case 2:
    case 4:
       System.out.println("I'm even");
}

Surely no-one can tell me this actually looks “nice”? I mean coding is all about looking nice and neat after all. You can give the compiler something that looks like dog meat and it won’t give a shit as long as it’s valid syntax. Code is ALL about us humans reading it. The above is also a simple example, and switch statements are normally way more complicated than this.

The normal argument for using a switch is that it looks better than loads of if-then-elses, which I completely agree with as that looks ugly as well. Also, with Java (and others) you can, at the time of writing, only switch on integer values, which is the height of annoyance. Furthermore, there can’t be a single person that has not suffered the “forgetting the break” problem on a switch statement. For the record, my main bugbear is the extra indentation the switch forces on me, along with the fact that scrolling through all the cases is frequently a nightmare.
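
For anyone who has somehow escaped it, the “forgetting the break” problem can be shown with a small sketch like the one below. The class and method names are made up for the illustration, and the missing break is deliberate.

```java
class FallThrough {
    // Deliberately missing a break after case 0, to show the fall-through behaviour.
    static String describe(int n) {
        StringBuilder out = new StringBuilder();
        switch (n) {
            case 0:
                out.append("zero ");
                // no break here, so execution carries on into case 1
            case 1:
                out.append("odd ");
                break;
            default:
                out.append("other ");
        }
        return out.toString().trim();
    }
}
```

Calling describe(0) returns "zero odd" rather than "zero", because control falls through into the next case; describe(1) returns "odd" and describe(2) returns "other".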

So what am I proposing? Well, what I tend to use to solve both the ugliness problem and the lack of support for switching on stuff other than an integer is a HashMap. I’m sure some are screaming “Am I hearing this idiot correctly?”!

I really do think this looks particularly nice when used in code and can reduce a massive sprawling switch statement in a method to one simple call. So what does this entail? First, let’s assume that we have a method that takes a string as an argument and, depending on the contents of this string, we perform a specific operation – this is a particularly common task. Let’s say we have a class like the following:

import java.util.HashMap;
import java.util.Map;

class Operations {
    // Operation is an interface with a run(OpContext) method that returns a String
    private final Map<String, Operation> mapper = new HashMap<String, Operation>();

    public Operations() {
        mapper.put("eyeSurgery", new EyeSurgery());
        mapper.put("heartSurgery", new HeartSurgery());
        mapper.put("heartKeyHoleSurgery", new HeartSurgery());
        // etc
    }

    /**
     * Carries out an operation, which is dependent on the value of op
     * and returns a string containing the results of this operation.
     * (Called perform here because do is a reserved word in Java.)
     */
    public String perform(OpContext op) {
        return mapper.get(op.toString()).run(op);
    }
}

Now surely the code in that last method looks better than a switch statement would? Obviously in this case you couldn’t even use a switch statement in Java, but even if the type of the HashMap key was Integer, I still proclaim this looks better. OK, you have the extra setup cost but it’s got to be worth it and should run fast enough for most needs. It also opens up the possibility of “switching” on any object-based type rather than just an integer – presuming a sensible hash function is declared for your object. Obviously this is a simple example but hopefully it illustrates what I’m trying to put across.
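
For a version that can be compiled and run on its own, the idea might be sketched as below. This assumes Java 8 or later (for lambdas and Map.getOrDefault), and everything in it is hypothetical: the Surgery class, the patient parameter, the returned strings and the fallback handler are invented for the illustration and are not part of the class above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class Surgery {
    // Maps an operation name to the code that carries it out.
    private static final Map<String, Function<String, String>> OPERATIONS = new HashMap<>();

    static {
        OPERATIONS.put("eyeSurgery", patient -> "eye surgery on " + patient);
        OPERATIONS.put("heartSurgery", patient -> "heart surgery on " + patient);
        // Two keys can share one handler, like stacked case labels in a switch.
        OPERATIONS.put("heartKeyHoleSurgery", OPERATIONS.get("heartSurgery"));
    }

    static String perform(String operation, String patient) {
        // The fallback plays the role of a switch's default branch.
        Function<String, String> handler =
                OPERATIONS.getOrDefault(operation, p -> "unknown operation for " + p);
        return handler.apply(patient);
    }
}
```

The fallback matters more than it looks: a plain mapper.get(key) returns null for a key that was never registered, so calling a method on the result would throw a NullPointerException. It is also worth knowing that Java 7 and later can switch on strings directly, so the map approach is now a matter of taste rather than necessity for string keys.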

I’m not sure if this was really obvious and everyone is doing this already. Hopefully someone will get something out of this. So, down with the switch statement, long live the hash map!

more json and java

My post yesterday was actually intended to be a placeholder where I was planning to include a build of Douglas Crockford’s Java JSON library that I have been using on some recent projects. However, I seemed to get carried away while writing and, as you can see, ended up rabbiting on more than I intended. The result of this was that by the time I got to the end of it, I could no longer see where my original point fitted into the discussion.

For the record, there are other Java JSON libraries (http://json-lib.sourceforge.net/ and http://oss.metaparadigm.com/jsonrpc/ to name two) but these are way way way too complicated for a simple application, and/or have far too many dependencies. The files in the jar file included at the end of this post work out of the box, no need for other libraries.

So, anyway, today I thought I may as well get round to putting this jar file up on this site. This is not something that you can’t get elsewhere, but I wanted it easily available for ME. Also if, like me, you are happy just to get a jar file and really don’t want the hassle of building the (small) library, then you can just get it from this ’ere blog. At the time I built this I could only seem to find the source, which is why I thought I may as well build it myself and host it. The javadoc for the code can be found at http://www.json.org/java/ and the jar file I created can be downloaded using the link below. If anyone has any issues with me hosting it then let me know and I will take it down; otherwise, enjoy it.

Java JSON Library jar file

regular expressions saved my life – again

Right, so, in my last entry I talked about how the wonders of regular expressions had saved my life, and therefore filled the world with utter joy. Once again, a few days later, I find myself faced with a similar problem and, you guessed it, regular expressions saved my life AGAIN.

The problem is essentially the same as before only this time I had a medium-sized database dump as a CSV file and once again I wanted to fill a Java array with the values from certain columns. For the record, as previously, this stuff was all for some JUnit tests I was running. A simplified example of what I was doing is shown below:

while(itor.hasNext()) {
    Student student = itor.next();
    assertTrue("Student " + student.getId() + " != " + idResults_[count],
                    student.getId() == idResults_[count]);
    count++;  // move on to the next expected id
}

Basically I have a list of Students that I have created whose ids I want to ensure correspond to what I expect them to be. To test this I have a data set of around 250 students (I’m not really that interested in checking the ids, it’s more the category a student is in, but the ids example was easier to show).

In the code above idResults_ corresponds to an int array that I would like to generate from a column in the CSV file. So idResults_ looks something like:

int [] idResults_ = {87868,78757,89987,......};

So how did I generate this array? Well, I extended the 5 lines in my last post into a slightly larger Perl script that takes some options and spits out the array initialiser. The actual script can be found HERE. The usage for this script is:

Usage: extract.pl -f <input_file> -c <id> -[hnwisro]
        -h Show this screen
        -n Show the column names in the file
        -w Separate on whitespace (default is a comma)
        -i Don't ignore first line, i.e. it contains the names of the columns
        -s Treat the data as a string, i.e. data in generated array is in 
           double quotes, defaults to an int array
        -r Treat the data as characters, i.e. data in generated array is
           in quotes
        -f  <input_file> Input file
        -c  <id> column to include in array (can be either a number, 
            zero based, or column name)
        -o  <output_file> File array is output to (any other content will be
            over-written)
 
   Outputs a Java/C# array initaliser with values from column <id> from file
 <input_file> and send it out to <output_file>

As you can see, I have extended this somewhat from my previous post into a full-blown utility (useful probably only to me, but hey, who cares). Instead of creating an int array from the data, you can use the -s flag to create a string array (e.g. {"Harry","Sally","Billy"}) or the -r option to create a character array. Furthermore, you can specify the column to create the array from. This can either be a zero-based integer or the name of the column; the latter presumes that the first line in your file contains the names of the columns (à la CSV file). Also, if the first line contains actual data and you do not want it to be treated as the column names, then you can specify the -i option to choose NOT to ignore the values contained on this line.
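For anyone who would rather stay in Java, the core of the idea can be sketched in a few lines. To be clear, this is not the Perl script: the ArrayInitialiser class and its build method are names I’ve made up for illustration, it only covers picking a column by name or zero-based number, it assumes the first line holds the column names, and it splits naively on commas, so quoted fields containing commas would break it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ArrayInitialiser {
    /**
     * Builds an array initialiser from one column of comma-separated lines.
     * column is either a zero-based number or a column name from the first line.
     * quote wraps each value: "" for ints, "\"" for strings, "'" for chars.
     */
    static String build(List<String> lines, String column, String quote) {
        String[] header = lines.get(0).split(",");
        int index = column.matches("\\d+")
                ? Integer.parseInt(column)
                : Arrays.asList(header).indexOf(column);
        if (index < 0 || index >= header.length) {
            throw new IllegalArgumentException("Unknown column: " + column);
        }

        List<String> values = new ArrayList<>();
        // Skip the first line, which is assumed to hold the column names.
        for (String line : lines.subList(1, lines.size())) {
            String[] fields = line.split(",");
            values.add(quote + fields[index].trim() + quote);
        }
        return "{" + String.join(",", values) + "}";
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the CSV dump.
        List<String> csv = Arrays.asList("id,name", "87868,Harry", "78757,Sally", "89987,Billy");
        System.out.println("int[] idResults_ = " + build(csv, "id", "") + ";");
        // prints: int[] idResults_ = {87868,78757,89987};
        System.out.println("String[] names = " + build(csv, "1", "\"") + ";");
        // prints: String[] names = {"Harry","Sally","Billy"};
    }
}
```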

Well that’s it. Hope someone else finds the script useful. Over and out.