We had a short discussion about a type of a certain collection. It seemed the situation we had was confusing to most of the people present. In this post I’ll outline my thinking with the reasons behind it.
We started with a trivial
ShoppingCart class that could only have one item:
1 2 3 4 5
We needed to change this class so it supports multiple items. It was agreed upon that the shopping cart needs to support duplicate items and that there shouldn’t be any semantics of ordering between the items.
We decided to implement this with the following code:
1 2 3 4 5
A friend later started a discussion about the type of the collection. In his
opinion the type should not be
there are collections like
HashSet<Integer> that break the abstraction by not
allowing duplicate elements:
I was against the
List<Integer> for the type as it implies ordering between
items in the shopping cart. Most people understood both points, but it was not
obvious what is better.
A case against the
If we are going to use an
ArrayList instance, there are two acceptable types
The following is not acceptable:
prices collection is internal to the
ShoppingCart class and so
ArrayList<Integer> can be used directly. This is the simplest possibility and
would not surprise a reader of the code. Most likely they would not even notice
the type on the left side. What about coding to an interface you ask? In our
current situation this doesn’t apply –
prices collection is just a private,
internal detail of the
Collection<Integer> is also ok, it only reveals that there can be more
than one item in a shopping cart, nothing more. Combined with instantiation
ArrayList we now know that ordering does not matter – otherwise
the type on the left side would be
List and not
Collection. Because a
list is exactly that – a collection that has an ordering of items.
We can actually infer a bit more. Given that we now know it’s a collection
where the ordering does not matter and we are still not using the
which prohibits duplicates – we can assume that the duplicates are allowed. In
reality almost nobody gives this much thought to a private field. With
experience we get to know that the
List is a wrong type to use here and we
develop the “intuition” that the
Collection works well.
This brings me to the problem with using the
List here: seeing an abstract
type in this situation implies that the implementer has thought a bit about the
types and has decided for undisclosed reasons to care about the type. After
all they did not go with the simplest possible scenario of just having the
ArrayList type. As the only thing that the
List type adds to the
Collection type is ordering, we are now led to believe that the ordering of
the items in a shopping cart matters and we have previously agreed that is not
Why is the
List so enticing?
There are two reasons why people feel like the
List could be used here.
First one1 is that lists are ubiquitously present in basically any codebase.
Different lists types are often a good choice of a collection from performance
perspective. They are used so often2 that most people think of a list when
you talk about a collection and ignore the ordering implication. And that can
be ok when you are using a concrete list type and are not saying that you
explicitly want to use the
The other reason you already read – the
List type can prevent bugs like:
Things get a bit more complicated here.
Can usage of the
List prevent all bugs that don’t allow duplicate items in
a shopping cart? The answer is no. Here is an example of a possible bug:
SetUniqueList is an example of a collection that conforms to the
type, but still only allows unique elements.
But really this should not matter at all. Even if it was possible to prevent
usage of a concrete collection that allows only unique elements, it should not
be done with an abstract type that implies false semantics. If this is a
problem, why not just have a unit test that ensures you can have duplicates?
You should have that test anyway and so the
List abstraction provides no
You could of course reasonably want to have this ensured on a type level. Then you should have your own collection type that is semantically sound.
It is interesting to note that, depending on your language, it might be impossible to use an abstract class or an interface and at the same time prevent bugs like the one shown above. The type would need to be both final and abstract at the same time. This is another reason for creating types to encapsulate your collections.
Try to show only the intent you have and do not imply more. Be aware that anything that’s not the simplest possible thing implies additional semantical meaning!
Don’t break the semantics of your types as a means to prevent possible future bugs – tests are perfect for that.
Note that the type systems currently in use are imperfect and are not expressive enough. Even more so in the case of mainstream languages like Java or C#. Luckily we have additional means to express intent – having good names, using instances of appropriate concrete types and having readable usage patterns.