
Comparing Collections

Posted Sunday, April 18, 2004  

After a long week, Achilles finds he has too much time on his hands. His friend the Tortoise takes pity and indulges him with a bit of IM'ing.

Achilles: I've done nothing but read blog entries this weekend.

Tortoise: You must be bored! Anything interesting?

Achilles: I just read [an entry](http://www.corvine.org/blog/archives/000002.html) that reminded me of some stuff I refactored during the week.

Tortoise: Do you ever get any _real_ work done?

Achilles: Now that Java has a `LinkedHashSet`, can you think of any reason to use a simple `List` except for "performance" reasons?

Tortoise: Won't it just look like a `List`?

Achilles: Sort of but, importantly, it's also a `Set`. Since when do you actually mean to add the same item to a `List` more than once? I'm being pedantic here.

Tortoise: But it happens, life is full of duplicates.

Achilles: I'm sure it does, but I can't think of many examples where that's actually what you want. It just seems that too often people use `List`s when they should actually be using a `Set`. Clearly, `List`s are useful, but `ArrayList` has to be the most abused `Collection` class around.

Tortoise: People generally think in terms of `Lists` - it's a simple concept.

Achilles: Yes, but people also think that AND and OR mean exactly the opposite. What we think and what we mean aren't always the same, and programming is about expressing what you mean.

Tortoise: Do people really think about the correct `Collection` type to use?

Achilles: No, they probably don't, but they should.

Tortoise: I try to, but I can't guarantee that I won't be lazy and default to `ArrayList`.

Achilles: Exactly! And then the code ends up iterating over stuff and assuming a particular order on things that have no order. I see it all the time, and this damn `CollectionUtils.isEquals(Collection, Collection)` just makes it worse. It's ludicrous. It basically allows you to compare a `List` with a `Set` and see if the contents are the same. Which is just wrong! A `List` and a `Set` are not the same thing. They are semantically very different, and thinking that it's just a matter of comparing the contents is, IMHO, flawed.

Tortoise: Which takes us back to your original question - if you want to allow duplicates then you can't use a `Set`, so when would you want to allow dups?

Achilles: Very rarely, I suspect. In fact, how often do you ever want to allow duplicates, and how often does order really matter? Part of the problem, I think, is a misunderstanding of what `equals(Object)` actually means. It implies substitutability and therefore must be symmetric. But many people don't realise that their `equals(Object)` method isn't, so we end up with `a.equals(b)` but not `b.equals(a)`.

Tortoise: I've not seen that happen.

Achilles: It usually happens with inheritance and using `instanceof` instead of class comparison.

Tortoise: You must have looked at a lot of shitty code!

Achilles: You mean you can't tell? Why do you think I bitch so much :-)

Tortoise: OK, what if I have a situation where it is possible to have more than one object of the same type and content? That `Collection` could not be stored in a `Set`, correct?

Achilles: Correct. So you just want a `Collection`, not a `List`. I repeat, NOT A `List`.

Tortoise: Then what implementation class do I use?

Achilles: The implementation can be a `List` but the variable should be a `Collection`, as in `Collection things = new ArrayList();`, because a `List` implies ordering and so far you haven't mentioned anything about order being important.

Tortoise: OK, so then I decide that ordering is important.

Achilles: Sure, make it a `List`, but the key thing is that you don't just assume that order is important, because then people will try and write tests assuming something about the order, and then they'll build screens assuming something about the order, etc. etc.

Tortoise: I've just remembered... I added a method to compare `Collections` (for that domain object) to see if there had been any changes - there is no check to see if they are the same implementation of `Collection`, so I could be iterating over a `List` and a `Set`.

Achilles: Why can't you just call `Collection.equals(Object)`? That's what it's for.

Tortoise: On the `Collection`?

Achilles: Yes. I see people writing "convenience" methods for comparing `Collections` all the time when they already have an `equals(Object)` method that does a perfectly good job.

Tortoise: I assumed that it wouldn't do a deep comparison.

Achilles: It iterates over the contents, calling `equals(Object)` and/or checking object identity (whatever is appropriate for the `Collection`). I use `assertEquals(Object, Object)` on `Collections` all the time.

Tortoise: Hmmm, that didn't get picked up in the tech review.

Achilles: Probably because everyone on the project uses `CollectionUtils.isEqual(Collection, Collection)`!
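
For anyone who wants to see the behaviour Achilles describes, here is a minimal sketch. It assumes a modern JDK with generics (the conversation predates them), and every class and method name in it (`CollectionEqualityDemo`, `Point`, `ColoredPoint`, `insertionOrdered`) is made up for illustration. It shows the built-in `equals(Object)` on the standard `List` and `Set` implementations, the way a `LinkedHashSet` keeps insertion order while dropping duplicates, and one way an `instanceof`-based `equals(Object)` can end up asymmetric under inheritance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; none of these names come from the original post.
class CollectionEqualityDemo {

    // Illustrative value class whose equals(Object) uses instanceof.
    static class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;        // accepts any subclass
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }
        @Override public int hashCode() { return 31 * x + y; }
    }

    // Subclass that adds state and also uses instanceof: symmetry is lost.
    static class ColoredPoint extends Point {
        final String color;
        ColoredPoint(int x, int y, String color) { super(x, y); this.color = color; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof ColoredPoint)) return false; // rejects a plain Point
            return super.equals(o) && color.equals(((ColoredPoint) o).color);
        }
    }

    // Returns the items in first-seen order with duplicates dropped.
    static Set<String> insertionOrdered(String... items) {
        return new LinkedHashSet<>(Arrays.asList(items));
    }

    public static void main(String[] args) {
        List<String> arrayList = new ArrayList<>(Arrays.asList("a", "b", "c"));
        List<String> linkedList = new LinkedList<>(Arrays.asList("a", "b", "c"));
        List<String> reordered = new ArrayList<>(Arrays.asList("c", "b", "a"));
        Set<String> hashSet = new HashSet<>(Arrays.asList("a", "b", "c"));
        Set<String> linkedHashSet = new LinkedHashSet<>(Arrays.asList("c", "b", "a"));

        // Lists compare element by element, in order, whatever the implementation.
        System.out.println(arrayList.equals(linkedList));  // true
        System.out.println(arrayList.equals(reordered));   // false

        // Sets compare membership only; order is irrelevant.
        System.out.println(hashSet.equals(linkedHashSet)); // true

        // A List and a Set are never equal, even with the same contents.
        System.out.println(arrayList.equals(hashSet));     // false
        System.out.println(hashSet.equals(arrayList));     // false

        // LinkedHashSet: looks like a list, behaves like a set.
        System.out.println(insertionOrdered("x", "y", "x")); // [x, y]

        // The asymmetric equals that Achilles complains about.
        Point p = new Point(1, 2);
        Point cp = new ColoredPoint(1, 2, "red");
        System.out.println(p.equals(cp));                  // true
        System.out.println(cp.equals(p));                  // false
    }
}
```

Comparing `getClass()` instead of using `instanceof` is one common way to restore symmetry here, at the cost of no longer treating subclasses as interchangeable with their parent.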


About Simon

Husband, Father, One-time Entrepreneur.

Aka Haruki Zaemon. Aka Sampy.

In my younger years I wanted to save the world; now I'm happy solving bigger problems than I create.

If I didn't need to work I'd be teaching Aikido and spending all my free time with my amazing wife and two children in Tylden, Victoria, Australia.
