Data Structures are fundamental to computer software. They allow the efficient storage, manipulation, and retrieval of data. Java's implementation of the canonical Data Structures is called Java Collections Framework. Google Collections, an open source contribution from Google extends the Java Collections Framework. Google Collections is not a replacement for the Java Collections Framework, but rather supplements and extends the Java Collections Framework to provide more power and flexibility to the programmer.
This article is preceded by the April 2008 edition of the Java News Brief: Google Collections article. This article may be read independently of the previous article, but newcomers to Google Collections are encouraged to read the previous article.
Disclaimer: Google Collections library is currently under development and continues to evolve as of the time of this writing, and some APIs may be subject to change.
The following topics demonstrate selected Google Collections features and review the Java Collections Framework upon which it is based. The full source code of the examples is available.
Think outside the List. Java Collections Framework and Google Collections provide many useful data structures beyond the List. Sometimes, a lot of code can be saved by using the correct collection.
Sets do not allow repeated elements and do not preserve order. This example uses the newHashSet(E...
elements) static
method of the Sets class that accepts a varargs array of elements. Note that you don't have to specify
the types on the right hand side of the assignment. Arrays or Lists that have uniqueness checks around every add
method are good candidates to be refactored to be Sets.
Set<String> dontForget = Sets.newHashSet("Keys", "Badge", "Wallet");
dontForget.add("Keys"); // ignored, set will still only contain one instance of "Keys"
Multisets (also known as bags) do allow repetition and do not preserve order. Subclass TreeMultiset keeps elements sorted according to their "natural" sort order (see http://java.sun.com/docs/books/tutorial/collections/interfaces/order.html) or can accept a Comparator. All Multisets know how many occurrences of each object are contained within and provide a count method. See the "Bag Groceries with Multiset" example in the previous Google Collections Java News Brief.
Multiset<String> groceries = Multisets.newHashMultiset("Apple", "Orange", "Orange", "Banana", "Pear");
System.out.printf("groceries contains %d oranges\n", groceries.count("Orange"));
The output is:
groceries contains 2 oranges
Lists allow repetition and preserve order. The following list contains the combination to a combination lock.
The newLinkedList(E...elems) method was deprecated between versions 0.6 and 0.8 of Google Collections
because of a conflict with another method of the same name that accepts an Iterator. The
newArrayList(E...elems) method may well be deprecated by 1.0 for the same reason. If this occurs,
Collections.addAll() provides an alternative easy way to create populated lists.
List<Integer> combination = Lists.newArrayList(36, 22, 36);
Maps don't allow duplicate keys, and each key maps to only one value. Multiple keys may refer to the same value.
The newHashMap(K, V), newHashMap(K, V, K, V), and newHashMap(K, V, K, V, K,
V)
methods were deprecated between 0.6 and 0.8 and will likely not return.
Map<String, Integer> numLegs = Maps.newHashMap();
numLegs.put("Horse", 4);
numLegs.put("Zebra", 4);
numLegs.put("Spider", 8);
numLegs.put("Monkey", 2);
BiMaps require unique keys and values but provide an inverse view. Review the example below, and also see the "Do Inverse Lookups with BiMap" example in the previous Google Collections Java News Brief.
BiMap<String, String> usCoinsFrontBack1998 = Maps.newHashBiMap();
usCoinsFrontBack1998.put("Abraham Lincoln", "Lincoln Memorial");
usCoinsFrontBack1998.put("Thomas Jefferson", "Monticello");
usCoinsFrontBack1998.put("Franklin D. Roosevelt", "Olive branch, Torch, Oak Branch");
usCoinsFrontBack1998.put("George Washington", "Eagle");
//usCoinsFrontBack1998.put("Not George Washington", "Eagle"); // java.lang.IllegalArgumentException
String query1 = "Franklin D. Roosevelt";
System.out.printf("front = %s, back = %s\n", query1, usCoinsFrontBack1998.get(query1));
String query2 = "Monticello";
System.out.printf("back = %s, front = %s\n", query2,
usCoinsFrontBack1998.inverse().get(query2)); // now use the inverse lookup capability
The output is:
front = Franklin D. Roosevelt, back = Olive branch, Torch, Oak Branch
back = Monticello, front = Thomas Jefferson
Multimap allows keys to map to multiple values. Pass a key into get()
on a Multimap and it will return a collection of all of the values associated with that key.
a collection. Google provides and implements three subinterfaces: ListMultimap, SetMultimap
, and SortedSetMultimap.
ListMultimap allows duplicate key-value pairs and preserves insertion order of values. SetMultimap disallows
multiple
key-value pairs. SortedSetMultimap is a SetMultimap that sorts the values for each key. The following example uses
HashMultimap, an implementation of SetMultimap.
One use of Multimap is to check the frequency of a particular key. See the "Check the Frequency with Multimap"
example in the previous Google Collections Java
News Brief.
Multimap<String, String> pets = Multimaps.newHashMultimap();
pets.put("John", "Cat");
pets.put("John", "Dog");
pets.put("Anne", "Fish");
pets.put("Anne", "Bird");
pets.put("Anne", "Bunny");
pets.put("Anne", "Cat");
pets.put("Dan", "Cat");
final Collection<String> annePets = pets.get("Anne");
System.out.printf("Anne's pets: %s\n", Join.join(", ", annePets));
The output is:
Anne's pets: Bunny, Bird, Cat, Fish
BiMap supports an inverse view, but Google Collections makes it easy to invert any Map or Multimap. Multimaps.forMap()
converts any Map to a Multimap. Once you have a Multimap, call Multimaps.inverseHashMultimap(),
Multimaps.inverseArrayListMultimap() or Multimaps.inverseTreeMultimap() for an inverted
view.
SetMultimap<String, Integer> numLegsMulti = Multimaps.forMap(numLegs);
HashMultimap<Integer, String> invNumLegsMulti = Multimaps.inverseHashMultimap(numLegsMulti);
int numLegsQuery = 4;
System.out.printf("animals with %d legs = %s\n", numLegsQuery,
Join.join(", ", invNumLegsMulti.get(numLegsQuery)));
The output is:
animals with 4 legs = Horse, Zebra
One criticism of Java generics is that type information is "erased" at runtime. Java Collections Framework as of
version 1.5 added the ability to create a "Checked" view of any Collection, List, Map, Set, SortedMap, or SortedSet.
Note
that the first example provides a compile warning but allows the put() at runtime.
Map<String, Integer> map = Maps.newHashMap();
Map mapRaw = map;
mapRaw.put("abc", "xyz"); // no RuntimeException
However, by calling Collections.checkedMap() we add runtime checking of the types. Note the runtime exception thrown on the put:
Map<String, Integer> map = Maps.newHashMap();
Map<String, Integer> mapChecked = Collections.checkedMap(map, String.class,
Integer.class);
Map mapCheckedRaw = mapChecked;
mapCheckedRaw.put("abc", "xyz"); // java.lang.ClassCastException
As of 0.8 Google Collections doesn't provide this capability for Multimap. It would be an easy exercise to add
this capability using ForwardedMultimap. BiMap and Multiset are accepted by Collections.checkedMap()
and
Collections.checkedCollection(), respectively.
Immutability is the inability to change. The first snippet shows a typical, mutable map:
Map<String, Integer> map = Maps.newHashMap();
map.put("abc", 123); // no RuntimeException
Now, we create a map using the ImmutableMap factory method, and notice that we get a runtime exception if we attempt to add to it.
Map<String, Integer> map = ImmutableMap.of("abc", 123);
map.put("def", 456); // java.lang.UnsupportedOperationException
The of() factory methods allow a max of three elements. Use the provided Builder to add more elements:
Map<String, Integer> map = new ImmutableMap.Builder<String, Integer>().put("abc", 123).build();
map.put("def", 456); // java.lang.UnsupportedOperationException
Sometimes we want to take an existing map and make it immutable:
Map<String, Integer> map = Maps.newHashMap();
map.put("abc", 123); // no RuntimeException
Map<String, Integer> immutableMap = ImmutableMap.copyOf(map);
immutableMap.put("def", 456); // java.lang.UnsupportedOperationException
Java Collections Framework provides this capability on its own without Google Collections:
Map<String, Integer> map = Maps.newHashMap();
map.put("abc", 123); // no RuntimeException
Map<String, Integer> immutableMap = Collections.unmodifiableMap(map);
immutableMap.put("def", 456); // java.lang.UnsupportedOperationException
Google Collections provides a Constraint interface and some factory methods to apply constraints to existing collections. For the first example, we take a HashSet and prove that adding null to it is allowed:
Set<String> set = Sets.newHashSet();
set.add("A");
set.add(null);
In this example, we constrain a set to not allow null additions using the built-in notNull() implementation of Constraint:
Set<String> set = Sets.newHashSet();
Set<String> constrainedSet = Constraints.constrainedSet(set, Constraints.notNull());
constrainedSet.add("A");
constrainedSet.add(null); // NullPointerException here
The Constraints class provides methods constrainedCollection(),
constrainedList(),
constrainedMultiset(),
constrainedSortedSet()
in addition to constrainedSet().
The examples in this article up to this point collect the built-in Java types, e.g. String, Integer, etc. Java Collections Framework and Google Collections demonstrate increased value when applied to more sophisticated user-defined types, i.e. types the programmer defines to represent abstractions in his/her problem domain.
The prerequisite to collecting a user-defined type is defining high quality equals() and hashCode()
methods. See
items #8 and #9 in Effective Java by Josh Bloch.
The following class has high quality equals() and hashCode() methods.
public class Combination {
private int a;
private int b;
private int c;
public Combination(int a, int b, int c) {
this.a = a;
this.b = b;
this.c = c;
}
@Override public boolean equals(Object o) {
if (o == this) {
return true;
}
if (!(o instanceof Combination)) {
return false;
}
Combination rhs = (Combination) o;
return rhs.a == a && rhs.b == b && rhs.c == c;
}
@Override public int hashCode() {
int result = 17;
result = 31 * result + a;
result = 31 * result + b;
result = 31 * result + c;
return result;
}
}
Now let's use Google Collections' BiMap to collect our combinations and map them to our lockers.
BiMap<String, Combination> locks = Maps.newHashBiMap();
locks.put("school", new Combination(36, 22, 36));
locks.put("ymca", new Combination(14, 7, 28));
locks.put("club", new Combination(3, 18, 6));
Weeks later a combination is found written on a slip of paper in the junk drawer. To which locker does it refer?
BiMap<Combination, String> locksInv = locks.inverse();
System.out.println(locksInv.get(new Combination(36, 22, 36)));
The output is:
school
This works because the Combination class provides an override for equals(). If it didn't,
the default definition inherited from Object would be used, and the query into the locksInv
Map would have returned null. The default implementation of equals()
within the Object class returns true only if both objects are the same object:
public boolean equals(Object obj) {
return (this == obj);
}
Also note that because Combination overrides equals(), it must also, according to contract, override
hashCode() as well.
From the Object.equals() javadocs:
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
Sorting is another area where user-defined types can leverage JCF and Google Collections. To take advantage,
implement the Comparable interface. As an example, let's consider a time slip. A time slip is a
paper that a driver receives after a drag race. It shows his elapsed time for the distance he/she raced. This
TimeSlip class implements
the Comparable interface and overrides equals() and hashCode():
public class TimeSlip implements Comparable<TimeSlip> {
String driver;
int elapsedTime;
public TimeSlip(String driver, int elapsedTime) {
this.driver = driver;
this.elapsedTime = elapsedTime;
}
@Override
public boolean equals(Object o) {
if (o == this) {
return true;
}
if (!(o instanceof TimeSlip)) {
return false;
}
TimeSlip rhs = (TimeSlip) o;
return rhs.driver.equals(driver) && rhs.elapsedTime == elapsedTime;
}
@Override
public int hashCode() {
int result = 17;
result = 31 * result + driver.hashCode();
result = 31 * result + elapsedTime;
return result;
}
public int compareTo(TimeSlip timeSlip) {
if (timeSlip == this) {
return 0;
}
if (elapsedTime < timeSlip.elapsedTime) {
return -1;
} else if (elapsedTime == timeSlip.elapsedTime) {
return 0;
} else {
return +1;
}
}
@Override
public String toString() {
return String.format("%s/%d", driver, elapsedTime);
}
}
Now TimeSlip instances can be collected with ordering collections. Using Google Collections' TreeMultimap, lets collect last weekend's drag racing results between Dan, Carey, and Dean. Because we have implemented Comparable, TreeMultimap will do the work of automatically sorting the TimeSlips as they are added. And because it's a Multimap, TreeMultimap will also allow multiple values per key.
Multimap<String, TimeSlip> weekend = Multimaps.newTreeMultimap();
weekend.put("Friday", new TimeSlip("Carey", 13));
weekend.put("Friday", new TimeSlip("Dan", 14));
weekend.put("Sunday", new TimeSlip("Dean", 10));
weekend.put("Friday", new TimeSlip("Dean", 12));
weekend.put("Saturday", new TimeSlip("Dan", 13));
weekend.put("Saturday", new TimeSlip("Dean", 11));
weekend.put("Sunday", new TimeSlip("Dan", 12));
weekend.put("Saturday", new TimeSlip("Carey", 12));
weekend.put("Sunday", new TimeSlip("Carey", 11));
System.out.printf("Friday's Times: %s\n", Join.join(", ", weekend.get("Friday")));
System.out.printf("Saturday's Times: %s\n", Join.join(", ", weekend.get("Saturday")));
System.out.printf("Sunday's Times: %s\n", Join.join(", ", weekend.get("Sunday")));
The output is:
Friday's Times: Dean/12, Carey/13, Dan/14
Saturday's Times: Dean/11, Carey/12, Dan/13
Sunday's Times: Dean/10, Carey/11, Dan/12
Observe that the results are in order, fastest to slowest, and grouped by day.
In this exercise Google Collections and Java Collections Framework are used in concert to explore the problem of matching blood donors to recipients. First, we enumerate the blood types:
enum BloodType {
OPOS, APOS, BPOS, ABPOS, ONEG, ANEG, BNEG, ABNEG
}
Now, we choose Multimap to store the compatibility matrix of blood types. The first putAll(K key, V...
values) encodes the rule
that ONEG can donate to ONEG, OPOS, ANEG, APOS, BNEG, BPOS, ABNEG, and ABPOS.
Let's make it immutable because this information will not change during the execution of the program and to reflect
the stability of these rules:
// donor -> recipient red blood cell compatibility (source = http://en.wikipedia.org/wiki/Blood_type)
Multimap<BloodType, BloodType> donorToRecipients = new ImmutableMultimap.Builder<BloodType, BloodType>().
putAll(ONEG, ONEG, OPOS, ANEG, APOS, BNEG, BPOS, ABNEG, ABPOS).
putAll(OPOS, OPOS, APOS, BPOS, ABPOS).
putAll(ANEG, ANEG, APOS, ABNEG, ABPOS).
putAll(APOS, APOS, ABPOS).
putAll(BNEG, BNEG, BPOS, ABNEG, ABPOS).
putAll(BPOS, BPOS, ABPOS).
putAll(ABNEG, ABNEG, ABPOS).
putAll(ABPOS, ABPOS).
build();
Later in the exercise we'll want to determine the compatible donors for a recipient. So, we invert the Multimap:
// recipient -> donor red blood cell compatibility
Multimap<BloodType, BloodType> recipientToDonor = Multimaps.inverseHashMultimap(donorToRecipients);
Now, let's collect some fictional people and their blood types, and create an inverse of this map as well.
// cast member -> blood type N -> 1
Map<String,BloodType> castBloodTypes = new ImmutableMap.Builder<String,BloodType>().
put("Fred", ONEG).
put("Wilma", APOS).
put("Pebbles", ANEG).
put("Barney", ABPOS).
put("Betty", OPOS).
put("Bamm-Bamm", ANEG).
build();
Multimap<BloodType, String> castBloodTypesInv = Multimaps.inverseHashMultimap(Multimaps.forMap(castBloodTypes));
Barney cuts his finger and needs blood, so let's do some lookups in our maps to find compatible donors from within the cast.
// Barney cuts his finger and needs blood. Who from the cast can donate? String injured = "Barney"; // create an empty set to hold the possible donors for later display Set<String> possibleDonors = new HashSet<String>(); // look up the blood type of the injured person BloodType injuredType = castBloodTypes.get(injured); // look up the compatible donor types for the injured person's blood type Collection<BloodType> compatibleDonorTypes = recipientToDonor.get(injuredType); // iterate over each compatible donor bood type for (BloodType compatibleDonorType : compatibleDonorTypes) { // find cast members with the compatible type Collection<String> castMembersWithType = castBloodTypesInv.get(compatibleDonorType); // add all cast members with the compatible type to the set of possible donors possibleDonors.addAll(castMembersWithType); } possibleDonors.remove(injured); // the donor shouldn't try to donate to him/herself possibleDonors.remove("Pebbles"); // too young to donate possibleDonors.remove("Bamm-Bamm"); // too young to donate System.out.printf("%s can donate to %s\n", Join.join(", ", possibleDonors), injured);
The output is:
Betty, Fred, Wilma can donate to Barney
Going further, let's capture the distribution of blood types for a reference population in another map. Let's use this information to choose donor based on the availability of people with the same blood type.
Disclaimer: this probably isn't the best algorithm, and may not even be a good one
// blood type -> percentage in USA population (source = http://en.wikipedia.org/wiki/Blood_type) Map<BloodType, Double> bloodTypeDistributionUSA = new ImmutableMap.Builder<BloodType, Double>() .put(OPOS, 37.4) .put(APOS, 35.7) .put(BPOS, 8.5) .put(ABPOS, 3.4) .put(ONEG, 6.6) .put(ANEG, 6.3) .put(BNEG, 1.5) .put(ABNEG, 0.6) .build(); // create a Function that looks up the percentage in population based on Blood Type Function<BloodType, Double> bloodTypePct = Functions.forMap(bloodTypeDistributionUSA); // create a Function that looks up the Blood Type for a person Function<String, BloodType> personToBloodType = Functions.forMap(castBloodTypes); // create a Function that looks up the percentage in population with the same Blood Type as a given person Function<String, Double> personToBloodTypePct = Functions.compose(bloodTypePct, personToBloodType); // user our composed lookups to create a Comparator that sorts by availability Comparator<String> personToBloodTypePctComparator = Comparators.fromFunction(personToBloodTypePct); // reverse the comparator to prefer blood types with higher availability (descending) Comparator<String> personToBloodTypePctComparatorDescend = Collections.reverseOrder(personToBloodTypePctComparator); // create a Set of donors that is sorted according to blood type availability, favoring higher availability Set<String> possibleDonorsOptimized = Sets.newTreeSet(personToBloodTypePctComparatorDescend, possibleDonors); System.out.printf("%s (sorted in order of preference) can donate to %s", Join.join(", ", possibleDonorsOptimized), injured);
Note that we use some of the built-in "Functions" that Google collections provides. Functions.forMap()
creates a
function that simply does a lookup in a Map. Functions.compose() composes two functions, e.g. f(g(x)).
Comparators.fromFunction() creates a Comparator given a Function, and we
convert the Set of possible donors to an array to facilitate sorting. Finally, we use Java Collections Framework
Collections.reverseOrder() to flip the sort order.
The output is:
Betty, Wilma, Fred (sorted in order of preference) can donate to Barney
Please consider the items listed below before adopting Google Collections. This will help you to avoid bugs and the need for refactoring later.
There are still no published tests for the library. An outstanding issue (#54) on this subject explains that the tests are present and being extricated from other non-open source dependencies at Google.
Google Collections target Java 1.5. kevinb9n (Kevin Bourrillion): "We will upgrade to requiring Java 6, but will create a Java 5-compatible branch and will include both forms in our release." (1 Nov. 2007) however, "this will probably not happen for a long time" (6 Jan. 2009)
The stated version of Google Collections is 0.8 alpha.
Apache License 2.0: http://www.apache.org/licenses/LICENSE-2.0 The Apache License 2.0 is approved by the OSI via the License Review Process. http://www.opensource.org/licenses/alphabetical
Apache Commons Collections - seeks to build upon the JDK classes by providing new interfaces, implementations and utilities.
Commons-Collections with Generics - A Java 5 generics-enabled version of the popular Jakarta Commons-Collections project.