![]() |
James Thornton |
| Internet Business Consultant |
| Home | Blog | Bio | Projects | Contact | Latest Blog (new site): How to Get to Genius |
|---|
|
[ Seminars ] [ Seminars on CD ROM ] [ Consulting ] Thinking in Java, 2nd edition, Revision 12©2000 by Bruce Eckel[ Previous Chapter ] [ Short TOC ] [ Table of Contents ] [ Index ] [ Next Chapter ]
9: Holding
|
|
boolean
add(Object) |
Ensures that the container holds the
argument. Returns false if it doesn’t add the argument. (This is an
“optional” method, described later in this
chapter.) |
|
boolean
|
Adds all the elements in the argument.
Returns true if any elements were added.
(“Optional.”) |
|
void
clear( ) |
Removes all the elements in the
container. (“Optional.”) |
|
boolean
|
true if the container holds the
argument. |
|
boolean
containsAll(Collection) |
true if the container holds all
the elements in the argument. |
|
boolean
isEmpty( ) |
true if the container has no
elements. |
|
Iterator
iterator( ) |
Returns an Iterator that you can
use to move through the elements in the container. |
|
boolean
|
If the argument is in the container, one
instance of that element is removed. Returns true if a removal occurred.
(“Optional.”) |
|
boolean
removeAll(Collection) |
Removes all the elements that are
contained in the argument. Returns true if any removals occurred.
(“Optional.”) |
|
boolean
retainAll(Collection) |
Retains only elements that are contained
in the argument (an “intersection” from set theory). Returns
true if any changes occurred. (“Optional.”) |
|
int size( ) |
Returns the number of elements in the
container. |
|
Object[]
toArray( ) |
Returns an array containing all the
elements in the container. |
|
Object[]
|
Returns an array containing all the
elements in the container, whose type is that of the array a rather than plain
Object (you must cast the array to the right type). |
Notice that there’s no
get( ) function for random-access element selection. That’s
because Collection also includes Set, which maintains its own
internal ordering (and thus makes random-access lookup meaningless). Thus, if
you want to examine all the elements of a Collection you must use an
iterator; that’s the only way to fetch things back.
The following example demonstrates all of
these methods. Again, these work with anything that inherits from
Collection, but an ArrayList is used as a kind of
“least-common denominator”:
[ Add Comment ]
//: c09:Collection1.java
// Things you can do with all Collections.
import java.util.*;
import com.bruceeckel.util.*;
public class Collection1 {
public static void main(String[] args) {
Collection c = new ArrayList();
Collections2.fill(c,
Collections2.countries, 10);
c.add("ten");
c.add("eleven");
System.out.println(c);
// Make an array from the List:
Object[] array = c.toArray();
// Make a String array from the List:
String[] str =
(String[])c.toArray(new String[1]);
// Find max and min elements; this means
// different things depending on the way
// the Comparable interface is implemented:
System.out.println("Collections.max(c) = " +
Collections.max(c));
System.out.println("Collections.min(c) = " +
Collections.min(c));
// Add a Collection to another Collection
Collection c2 = new ArrayList();
Collections2.fill(c2,
Collections2.countries, 10);
c.addAll(c2);
System.out.println(c);
c.remove(CountryCapitals.pairs[0][0]);
System.out.println(c);
c.remove(CountryCapitals.pairs[1][0]);
System.out.println(c);
// Remove all components that are in the
// argument collection:
c.removeAll(c2);
System.out.println(c);
c.addAll(c2);
System.out.println(c);
// Is an element in this Collection?
String val = CountryCapitals.pairs[3][0];
System.out.println(
"c.contains(" + val + ") = "
+ c.contains(val));
// Is a Collection in this Collection?
System.out.println(
"c.containsAll(c2) = "+ c.containsAll(c2));
Collection c3 = ((List)c).subList(3, 5);
// Keep all the elements that are in both
// c2 and c3 (an intersection of sets):
c2.retainAll(c3);
System.out.println(c);
// Throw away all the elements
// in c2 that also appear in c3:
c2.removeAll(c3);
System.out.println("c.isEmpty() = " +
c.isEmpty());
c = new ArrayList();
Collections2.fill(c,
Collections2.countries, 10);
System.out.println(c);
c.clear(); // Remove all elements
System.out.println("after c.clear():");
System.out.println(c);
}
} ///:~
ArrayLists are created containing
different sets of data and upcast to Collection objects, so it’s
clear that nothing other than the Collection interface is being used.
main( ) uses simple exercises to show all of the methods in
Collection.
[ Add Comment ]
The following sections describe the
various implementations of List, Set, and Map and indicate
in each case (with an asterisk) which one should be your default choice.
You’ll notice that the legacy classes Vector, Stack, and
Hashtable are not included because in all cases there are
preferred classes within the Java 2 Containers.
[ Add Comment ]
The basic List is quite simple to
use, as you’ve seen so far with ArrayList. Although most of the
time you’ll just use add( ) to insert objects,
get( ) to get them out one at a time, and iterator( ) to
get an Iterator to the sequence, there’s also a set of other
methods that can be useful.
[ Add Comment ]
In addition, there are actually two types
of List: the basic ArrayList, which excels at randomly accessing
elements, and the much more powerful LinkedList (which is not designed
for fast random access, but has a much more general set of
methods).
|
List (interface) |
Order is the most important feature of a
List; it promises to maintain elements in a particular sequence.
List adds a number of methods to Collection that allow insertion
and removal of elements in the middle of a List. (This is recommended
only for a LinkedList.) A List will produce a ListIterator,
and using this you can traverse the List in both directions, as well as
insert and remove elements in the middle of the List. |
|
ArrayList* |
A List implemented with an array.
Allows rapid random access to elements, but is slow when inserting and removing
elements from the middle of a list. ListIterator should be used only for
back-and-forth traversal of an ArrayList, but not for inserting and
removing elements, which is expensive compared to
LinkedList. |
|
LinkedList |
Provides optimal sequential access, with
inexpensive insertions and deletions from the middle of the List.
Relatively slow for random access. (Use ArrayList instead.) Also has
addFirst( ), addLast( ), getFirst( ),
getLast( ), removeFirst( ), and
removeLast( ) (which are not defined in any interfaces or base
classes) to allow it to be used as a stack, a queue, and a
deque. |
The methods in the following example each
cover a different group of activities: things that every list can do
(basicTest( )), moving around with an Iterator
(iterMotion( )) versus changing things with an
Iterator (iterManipulation( )), seeing the effects of
List manipulation (testVisual( )), and operations available
only to LinkedLists.
//: c09:List1.java
// Things you can do with Lists.
import java.util.*;
import com.bruceeckel.util.*;
public class List1 {
public static List fill(List a) {
Collections2.countries.reset();
Collections2.fill(a,
Collections2.countries, 10);
return a;
}
static boolean b;
static Object o;
static int i;
static Iterator it;
static ListIterator lit;
public static void basicTest(List a) {
a.add(1, "x"); // Add at location 1
a.add("x"); // Add at end
// Add a collection:
a.addAll(fill(new ArrayList()));
// Add a collection starting at location 3:
a.addAll(3, fill(new ArrayList()));
b = a.contains("1"); // Is it in there?
// Is the entire collection in there?
b = a.containsAll(fill(new ArrayList()));
// Lists allow random access, which is cheap
// for ArrayList, expensive for LinkedList:
o = a.get(1); // Get object at location 1
i = a.indexOf("1"); // Tell index of object
b = a.isEmpty(); // Any elements inside?
it = a.iterator(); // Ordinary Iterator
lit = a.listIterator(); // ListIterator
lit = a.listIterator(3); // Start at loc 3
i = a.lastIndexOf("1"); // Last match
a.remove(1); // Remove location 1
a.remove("3"); // Remove this object
a.set(1, "y"); // Set location 1 to "y"
// Keep everything that's in the argument
// (the intersection of the two sets):
a.retainAll(fill(new ArrayList()));
// Remove everything that's in the argument:
a.removeAll(fill(new ArrayList()));
i = a.size(); // How big is it?
a.clear(); // Remove all elements
}
public static void iterMotion(List a) {
ListIterator it = a.listIterator();
b = it.hasNext();
b = it.hasPrevious();
o = it.next();
i = it.nextIndex();
o = it.previous();
i = it.previousIndex();
}
public static void iterManipulation(List a) {
ListIterator it = a.listIterator();
it.add("47");
// Must move to an element after add():
it.next();
// Remove the element that was just produced:
it.remove();
// Must move to an element after remove():
it.next();
// Change the element that was just produced:
it.set("47");
}
public static void testVisual(List a) {
System.out.println(a);
List b = new ArrayList();
fill(b);
System.out.print("b = ");
System.out.println(b);
a.addAll(b);
a.addAll(fill(new ArrayList()));
System.out.println(a);
// Insert, remove, and replace elements
// using a ListIterator:
ListIterator x = a.listIterator(a.size()/2);
x.add("one");
System.out.println(a);
System.out.println(x.next());
x.remove();
System.out.println(x.next());
x.set("47");
System.out.println(a);
// Traverse the list backwards:
x = a.listIterator(a.size());
while(x.hasPrevious())
System.out.print(x.previous() + " ");
System.out.println();
System.out.println("testVisual finished");
}
// There are some things that only
// LinkedLists can do:
public static void testLinkedList() {
LinkedList ll = new LinkedList();
fill(ll);
System.out.println(ll);
// Treat it like a stack, pushing:
ll.addFirst("one");
ll.addFirst("two");
System.out.println(ll);
// Like "peeking" at the top of a stack:
System.out.println(ll.getFirst());
// Like popping a stack:
System.out.println(ll.removeFirst());
System.out.println(ll.removeFirst());
// Treat it like a queue, pulling elements
// off the tail end:
System.out.println(ll.removeLast());
// With the above operations, it's a dequeue!
System.out.println(ll);
}
public static void main(String[] args) {
// Make and fill a new list each time:
basicTest(fill(new LinkedList()));
basicTest(fill(new ArrayList()));
iterMotion(fill(new LinkedList()));
iterMotion(fill(new ArrayList()));
iterManipulation(fill(new LinkedList()));
iterManipulation(fill(new ArrayList()));
testVisual(fill(new LinkedList()));
testLinkedList();
}
} ///:~
In basicTest( ) and
iterMotion( ) the calls are simply made to show the proper syntax,
and while the return value is captured, it is not used. In some cases, the
return value isn’t captured since it isn’t typically used. You
should look up the full usage of each of these methods in the online
documentation from java.sun.com before you use them.
[ Add Comment ]
A stack is
sometimes referred to as a “last-in, first-out”
(LIFO) container. That is, whatever you
“push” on the stack last is the first item you can “pop”
out. Like all of the other containers in Java, what you push and pop are
Objects, so you must cast what you pop, unless you’re just using
Object behavior.
[ Add Comment ]
The LinkedList has methods that
directly implement stack functionality, so you can also just use a
LinkedList rather than making a stack class. However, a stack class can
sometimes tell the story better:
[ Add Comment ]
//: c09:StackL.java
// Making a stack from a LinkedList.
import java.util.*;
import com.bruceeckel.util.*;
public class StackL {
private LinkedList list = new LinkedList();
public void push(Object v) {
list.addFirst(v);
}
public Object top() { return list.getFirst(); }
public Object pop() {
return list.removeFirst();
}
public static void main(String[] args) {
StackL stack = new StackL();
for(int i = 0; i < 10; i++)
stack.push(Collections2.countries.next());
System.out.println(stack.top());
System.out.println(stack.top());
System.out.println(stack.pop());
System.out.println(stack.pop());
System.out.println(stack.pop());
}
} ///:~
If you want only stack behavior,
inheritance is inappropriate here because it would produce a class with all the
rest of the LinkedList methods (you’ll see later that this very
mistake was made by the Java 1.0 library designers with Stack).
[ Add Comment ]
A queue is
a “first-in, first-out” (FIFO)
container. That is, you put things in at one end, and pull them out at the
other. So the order in which you put them in will be the same order that they
come out. LinkedList has methods to support queue
behavior, so these can be used in a Queue class:
//: c09:Queue.java
// Making a queue from a LinkedList.
import java.util.*;
public class Queue {
private LinkedList list = new LinkedList();
public void put(Object v) { list.addFirst(v); }
public Object get() {
return list.removeLast();
}
public boolean isEmpty() {
return list.isEmpty();
}
public static void main(String[] args) {
Queue queue = new Queue();
for(int i = 0; i < 10; i++)
queue.put(Integer.toString(i));
while(!queue.isEmpty())
System.out.println(queue.get());
}
} ///:~
You can also easily create a deque
(double-ended queue) from a LinkedList. This is like a queue, but you can
add and remove elements from either end.
[ Add Comment ]
Set has exactly the same interface
as Collection, so there isn’t any extra functionality like there is
with the two different Lists. Instead, the Set is exactly a
Collection, it just has different behavior. (This is the ideal use of
inheritance and polymorphism: to express different behavior.) A Set
refuses to hold more than one instance of each object value (what constitutes
the “value” of an object is more complex, as you shall see).
|
Set (interface) |
Each element that you add to the
Set must be unique; otherwise the Set doesn’t add the
duplicate element. Objects added to a Set must define
equals( ) to establish object uniqueness. Set has exactly the
same interface as Collection. The Set interface does not guarantee
it will maintain its elements in any particular order. |
|
HashSet* |
For Sets where fast lookup time is
important. Objects must also define
hashCode( ). |
|
TreeSet |
An ordered Set backed by a tree.
This way, you can extract an ordered sequence from a
Set. |
The following example does not
show everything you can do with a Set, since the interface is the same as
Collection, and so was exercised in the previous example. Instead, this
demonstrates the behavior that makes a Set unique:
//: c09:Set1.java
// Things you can do with Sets.
import java.util.*;
import com.bruceeckel.util.*;
public class Set1 {
static Collections2.StringGenerator gen =
Collections2.countries;
public static void testVisual(Set a) {
Collections2.fill(a, gen.reset(), 10);
Collections2.fill(a, gen.reset(), 10);
Collections2.fill(a, gen.reset(), 10);
System.out.println(a); // No duplicates!
// Add another set to this one:
a.addAll(a);
a.add("one");
a.add("one");
a.add("one");
System.out.println(a);
// Look something up:
System.out.println("a.contains(\"one\"): " +
a.contains("one"));
}
public static void main(String[] args) {
System.out.println("HashSet");
testVisual(new HashSet());
System.out.println("TreeSet");
testVisual(new TreeSet());
}
} ///:~
Duplicate values are added to the
Set, but when it is printed you’ll see the Set has accepted
only one instance of each value.
[ Add Comment ]
When you run this program you’ll
notice that the order maintained by the HashSet is different from
TreeSet, since each has a different way of storing elements so they can
be located later. (TreeSet keeps them sorted, while HashSet uses a
hashing function, which is designed specifically for rapid lookups.) When
creating your own types, be aware that a Set needs a way to maintain a
storage order, which means you must implement the Comparable interface
and define the compareTo( ) method. Here’s an
example:
//: c09:Set2.java
// Putting your own type in a Set.
import java.util.*;
class MyType implements Comparable {
private int i;
public MyType(int n) { i = n; }
public boolean equals(Object o) {
return
(o instanceof MyType)
&& (i == ((MyType)o).i);
}
public int hashCode() { return i; }
public String toString() { return i + " "; }
public int compareTo(Object o) {
int i2 = ((MyType)o).i;
return (i2 < i ? -1 : (i2 == i ? 0 : 1));
}
}
public class Set2 {
public static Set fill(Set a, int size) {
for(int i = 0; i < size; i++)
a.add(new MyType(i));
return a;
}
public static void test(Set a) {
fill(a, 10);
fill(a, 10); // Try to add duplicates
fill(a, 10);
a.addAll(fill(new TreeSet(), 10));
System.out.println(a);
}
public static void main(String[] args) {
test(new HashSet());
test(new TreeSet());
}
} ///:~
The
form for the definitions for equals( ) and
hashCode( ) will be described later in this chapter. You must define
an equals( ) in both cases, but the hashCode( ) is
absolutely necessary only if the class will be placed in a HashSet (which
is likely, since that should generally be your first choice as a Set
implementation). However, as a programming style you should always override
hashCode( ) when you override equals( ). This process
will be fully detailed later in this chapter.
[ Add Comment ]
In the compareTo( ), note
that I did not use the “simple and obvious” form return
i-i2. Although this is a common programming error, it would only work
properly if i and i2 were “unsigned” ints (if
Java had an “unsigned” keyword, which it does not). It breaks
for Java’s signed int, which is not big enough to represent the
difference of two signed ints. If i is a large positive integer
and j is a large negative integer, i-j will overflow and return a
negative value, which will not work.
[ Add Comment ]
If you have a SortedSet (of which
TreeSet is the only one available), the elements are guaranteed to be in
sorted order which allows additional functionality to be provided with these
methods in the SortedSet interface:
[ Add Comment ]
Comparator comparator(): Produces
the Comparator used for this Set, or null for natural
ordering.
Object first(): Produces the
lowest element.
Object last(): Produces the
highest element.
SortedSet subSet(fromElement,
toElement): Produces a view of this Set with elements from
fromElement, inclusive, to toElement, exclusive.
SortedSet headSet(toElement):
Produces a view of this Set with elements less than toElement.
SortedSet tailSet(fromElement):
Produces a view of this Set with elements greater than or equal to
fromElement.
An ArrayList allows you to select
from a sequence of objects using a number, so in a sense it associates numbers
to objects. But what if you’d like to select from a sequence of objects
using some other criterion? A stack is an example: its selection criterion is
“the last thing pushed on the stack.” A powerful twist on this idea
of “selecting from a sequence” is alternately termed a
map, a dictionary,
or an associative array.
Conceptually, it seems like an ArrayList, but instead of looking up
objects using a number, you look them up using another object! This is
often a key process in a program.
[ Add Comment ]
The concept shows up in Java as the
Map interface. The put(Object key, Object value) method adds a
value (the thing you want), and associates it with a key (the thing you look it
up with). get(Object key) produces the value given the corresponding key.
You can also test a Map to see if it contains a key or a value with
containsKey( ) and containsValue( ).
[ Add Comment ]
The standard Java library contains two
different types of Maps: HashMap and TreeMap. Both have the
same interface (since they both implement Map), but they differ in one
distinct way: efficiency. If you look at what must be done for a
get( ), it seems pretty slow to search through (for example) an
ArrayList for the key. This is where HashMap speeds things up.
Instead of a slow search for the key, it uses a special value called a
hash code. The hash code is a way to take
some information in the object in question and turn it into a “relatively
unique” int for that object. All Java objects can produce a hash
code, and hashCode( ) is a method in the root
class Object. A HashMap takes the
hashCode( ) of the object and uses it to quickly hunt for the key.
This results in a dramatic performance
improvement[50].
|
Map (interface) |
Maintains key-value associations (pairs),
so you can look up a value using a key. |
|
HashMap* |
Implementation based on a hash table.
(Use this instead of Hashtable.) Provides constant-time performance for
inserting and locating pairs. Performance can be adjusted via constructors that
allow you to set the capacity and load factor of the hash
table. |
|
TreeMap |
Implementation based on a red-black tree.
When you view the keys or the pairs, they will be in sorted order (determined by
Comparable or Comparator, discussed later). The point of a
TreeMap is that you get the results in sorted order. TreeMap is
the only Map with the subMap( ) method, which allows you to
return a portion of the tree. |
Sometimes you’ll also need to know
the details of how hashing works, so we’ll look at that a little
later.
The following example uses the
Collections2.fill( ) method and the test data sets that were
previously defined:
[ Add Comment ]
//: c09:Map1.java
// Things you can do with Maps.
import java.util.*;
import com.bruceeckel.util.*;
public class Map1 {
static Collections2.StringPairGenerator geo =
Collections2.geography;
static Collections2.RandStringPairGenerator
rsp = Collections2.rsp;
// Producing a Set of the keys:
public static void printKeys(Map m) {
System.out.print("Size = " + m.size() +", ");
System.out.print("Keys: ");
System.out.println(m.keySet());
}
// Producing a Collection of the values:
public static void printValues(Map m) {
System.out.print("Values: ");
System.out.println(m.values());
}
public static void test(Map m) {
Collections2.fill(m, geo, 25);
// Map has 'Set' behavior for keys:
Collections2.fill(m, geo.reset(), 25);
printKeys(m);
printValues(m);
System.out.println(m);
String key = CountryCapitals.pairs[4][0];
String value = CountryCapitals.pairs[4][1];
System.out.println("m.containsKey(\"" + key +
"\"): " + m.containsKey(key));
System.out.println("m.get(\"" + key + "\"): "
+ m.get(key));
System.out.println("m.containsValue(\""
+ value + "\"): " +
m.containsValue(value));
Map m2 = new TreeMap();
Collections2.fill(m2, rsp, 25);
m.putAll(m2);
printKeys(m);
key = m.keySet().iterator().next().toString();
System.out.println("First key in map: "+key);
m.remove(key);
printKeys(m);
m.clear();
System.out.println("m.isEmpty(): "
+ m.isEmpty());
Collections2.fill(m, geo.reset(), 25);
// Operations on the Set change the Map:
m.keySet().removeAll(m.keySet());
System.out.println("m.isEmpty(): "
+ m.isEmpty());
}
public static void main(String[] args) {
System.out.println("Testing HashMap");
test(new HashMap());
System.out.println("Testing TreeMap");
test(new TreeMap());
}
} ///:~
The printKeys( ) and
printValues( ) methods are not only useful utilities, they also
demonstrate how to produce Collection views of a Map. The
keySet( ) method produces a Set backed by the keys in the
Map. Similar treatment is given to values( ), which produces
a Collection containing all the values in the Map. (Note that keys
must be unique, while values may contain duplicates.) Since these
Collections are backed by the Map, any changes in a
Collection will be reflected in the associated Map.
[ Add Comment ]
The rest of the program provides simple
examples of each Map operation, and tests each type of Map.
[ Add Comment ]
As an example of the use of a
HashMap, consider a program to check the randomness of Java’s
Math.random( ) method.
Ideally, it would produce a perfect distribution of random numbers, but to test
this you need to generate a bunch of random numbers and count the ones that fall
in the various ranges. A HashMap is perfect for this, since it associates
objects with objects (in this case, the value object contains the number
produced by Math.random( ) along with the number of times that
number appears):
//: c09:Statistics.java
// Simple demonstration of HashMap.
import java.util.*;
class Counter {
int i = 1;
public String toString() {
return Integer.toString(i);
}
}
public class Statistics {
public static void main(String[] args) {
HashMap hm = new HashMap();
for(int i = 0; i < 10000; i++) {
// Produce a number between 0 and 20:
Integer r =
new Integer((int)(Math.random() * 20));
if(hm.containsKey(r))
((Counter)hm.get(r)).i++;
else
hm.put(r, new Counter());
}
System.out.println(hm);
}
} ///:~
In main( ), each time a
random number is generated it is wrapped inside an Integer object so that
reference can be used with the HashMap. (You can’t use a primitive
with a container, only an object reference.) The containsKey( )
method checks to see if this key is already in the container. (That is, has the
number been found already?) If so, the get( )
method produces the associated value for the key, which in this case is a
Counter object. The value i inside the counter is incremented to
indicate that one more of this particular random number has been found.
[ Add Comment ]
If the key has not been found yet, the
method put( ) will place a new key-value pair
into the HashMap. Since Counter automatically initializes its
variable i to one when it’s created, it indicates the first
occurrence of this particular random number.
[ Add Comment ]
To display the HashMap, it is
simply printed. The HashMap toString( ) method moves through
all the key-value pairs and calls the toString( ) for each one. The
Integer.toString( ) is predefined, and you can see the
toString( ) for Counter. The output from one run (with some
line breaks added) is:
{19=526, 18=533, 17=460, 16=513, 15=521, 14=495,
13=512, 12=483, 11=488, 10=487, 9=514, 8=523,
7=497, 6=487, 5=480, 4=489, 3=509, 2=503, 1=475,
0=505}
You might wonder at the necessity of the
class Counter, which seems like it doesn’t even have the
functionality of the wrapper class Integer. Why not use int or
Integer? Well, you can’t use an int because all of the
containers can hold only Object references. After seeing containers the
wrapper classes might begin to make a little more sense to you, since you
can’t put any of the primitive types in containers. However, the only
thing you can do with the Java wrappers is to
initialize them to a particular value and read that value. That is,
there’s no way to change a value once a wrapper object has been created.
This makes the Integer wrapper immediately useless to solve our problem,
so we’re forced to create a new class that does satisfy the need.
[ Add Comment ]
If you have a SortedMap (of which
TreeMap is the only one available), the keys are guaranteed to be in
sorted order which allows additional functionality to be provided with these
methods in the SortedMap interface:
[ Add Comment ]
Comparator comparator(): Produces
the comparator used for this Map, or null for natural ordering.
Object firstKey(): Produces the
lowest key.
Object lastKey(): Produces the
highest key.
SortedMap subMap(fromKey, toKey):
Produces a view of this Map with keys from fromKey, inclusive, to
toKey, exclusive.
SortedMap headMap(toKey): Produces
a view of this Map with keys less than toKey.
In the previous example, a standard
library class (Integer) was used as a key for the HashMap. It
worked fine as a key, because it has all the necessary wiring to make it work
correctly as a key. But a common pitfall occurs with HashMaps when you
create your own classes to be used as keys. For example, consider a weather
predicting system that matches Groundhog objects to Prediction
objects. It seems fairly straightforward—you create the two classes, and
use Groundhog as the key and Prediction as the value:
[ Add Comment ]
//: c09:SpringDetector.java
// Looks plausible, but doesn't work.
import java.util.*;
class Groundhog {
int ghNumber;
Groundhog(int n) { ghNumber = n; }
}
class Prediction {
boolean shadow = Math.random() > 0.5;
public String toString() {
if(shadow)
return "Six more weeks of Winter!";
else
return "Early Spring!";
}
}
public class SpringDetector {
public static void main(String[] args) {
HashMap hm = new HashMap();
for(int i = 0; i < 10; i++)
hm.put(new Groundhog(i), new Prediction());
System.out.println("hm = " + hm + "\n");
System.out.println(
"Looking up prediction for Groundhog #3:");
Groundhog gh = new Groundhog(3);
if(hm.containsKey(gh))
System.out.println((Prediction)hm.get(gh));
else
System.out.println("Key not found: " + gh);
}
} ///:~
Each Groundhog is given an
identity number, so you can look up a Prediction in the HashMap by
saying, “Give me the Prediction associated with Groundhog
number 3.” The Prediction class contains a boolean that is
initialized using Math.random( ), and a toString( ) that
interprets the result for you. In main( ), a HashMap is
filled with Groundhogs and their associated Predictions. The
HashMap is printed so you can see that it has been filled. Then a
Groundhog with an identity number of 3 is used as a key to look up the
prediction for Groundhog #3 (which you can see must be in the
Map).
[ Add Comment ]
It seems simple enough, but it
doesn’t work. The problem is that Groundhog is inherited from the
common root class Object (which is what happens if you don’t
specify a base class, thus all classes are ultimately inherited from
Object). It is Object’s hashCode( ) method that
is used to generate the hash code for each object, and by default it just uses
the address of its object. Thus, the first instance of Groundhog(3) does
not produce a hash code equal to the hash code for the second instance of
Groundhog(3) that we tried to use as a lookup.
[ Add Comment ]
You might think that all you need to do
is write an appropriate override for
hashCode( ). But it still won’t work
until you’ve done one more thing: override the
equals( ) that is also part of Object.
This method is used by the HashMap when trying to determine if your key
is equal to any of the keys in the table. Again, the default
Object.equals( ) simply compares object addresses, so one
Groundhog(3) is not equal to another Groundhog(3).
[ Add Comment ]
Thus, to use your own classes as keys in
a HashMap, you must override both hashCode( ) and
equals( ), as shown in the following solution to the problem
above:
//: c09:SpringDetector2.java
// A class that's used as a key in a HashMap
// must override hashCode() and equals().
import java.util.*;
class Groundhog2 {
int ghNumber;
Groundhog2(int n) { ghNumber = n; }
public int hashCode() { return ghNumber; }
public boolean equals(Object o) {
return (o instanceof Groundhog2)
&& (ghNumber == ((Groundhog2)o).ghNumber);
}
}
public class SpringDetector2 {
public static void main(String[] args) {
HashMap hm = new HashMap();
for(int i = 0; i < 10; i++)
hm.put(new Groundhog2(i),new Prediction());
System.out.println("hm = " + hm + "\n");
System.out.println(
"Looking up prediction for groundhog #3:");
Groundhog2 gh = new Groundhog2(3);
if(hm.containsKey(gh))
System.out.println((Prediction)hm.get(gh));
}
} ///:~
Note that this uses the Prediction
class from the previous example, so SpringDetector.java must be compiled
first or you’ll get a compile-time error when you try to compile
SpringDetector2.java.
[ Add Comment ]
Groundhog2.hashCode( )
returns the groundhog number as an identifier. In this example, the programmer
is responsible for ensuring that no two groundhogs exist with the same ID
number. The hashCode( ) is not required to return a unique
identifier (something you’ll understand better later in this chapter), but
the equals( ) method must be able to strictly determine whether two
objects are equivalent.
[ Add Comment ]
Even though it appears that the
equals( ) method is only checking to see whether the argument is an
instance of Groundhog2 (using the instanceof keyword, which is
fully explained in Chapter 12), the instanceof actually quietly does a
second sanity check, to see if the object is null, since
instanceof produces false if the left-hand argument is
null. Assuming it’s the correct type and not null, the
comparison is based on the actual ghNumbers. This time, when you run the
program, you’ll see it produces the correct output.
[ Add Comment ]
When creating your own class to use in a
HashSet, you must pay attention to the same issues as when it is used as
a key in a HashMap.
[ Add Comment ]
The above example is only a start toward
solving the problem correctly. It shows that if you do not override
hashCode( ) and
equals( ) for your key, the hashed data
structure (HashSet or HashMap) will not be able to deal with your
key properly. However, to get a good solution for the problem you need to
understand what’s going on inside the hashed data structure.
[ Add Comment ]
First, consider the motivation behind
hashing: you want to look up an object using another
object. But you can accomplish this with a TreeSet or TreeMap,
too. It’s also possible to implement your own Map. To do so, the
Map.entrySet( ) method must be supplied to produce a set of
Map.Entry objects. MPair will be defined as the new type of
Map.Entry. In order for it to be placed in a
TreeSet it must implement equals( ) and be
Comparable:
//: c09:MPair.java
// A Map implemented with ArrayLists.
import java.util.*;
public class MPair
implements Map.Entry, Comparable {
Object key, value;
MPair(Object k, Object v) {
key = k;
value = v;
}
public Object getKey() { return key; }
public Object getValue() { return value; }
public Object setValue(Object v){
Object result = value;
value = v;
return result;
}
public boolean equals(Object o) {
return key.equals(((MPair)o).key);
}
public int compareTo(Object rv) {
return ((Comparable)key).compareTo(
((MPair)rv).key);
}
} ///:~
Notice that the comparisons are only interested in the keys, so duplicate values are perfectly acceptable.
The
following example implements a Map using a pair of ArrayLists:
[ Add Comment ]
//: c09:SlowMap.java
// A Map implemented with ArrayLists.
import java.util.*;
import com.bruceeckel.util.*;
public class SlowMap extends AbstractMap {
private ArrayList
keys = new ArrayList(),
values = new ArrayList();
public Object put(Object key, Object value) {
Object result = get(key);
if(!keys.contains(key)) {
keys.add(key);
values.add(value);
} else
values.set(keys.indexOf(key), value);
return result;
}
public Object get(Object key) {
if(!keys.contains(key))
return null;
return values.get(keys.indexOf(key));
}
public Set entrySet() {
Set entries = new HashSet();
Iterator
ki = keys.iterator(),
vi = values.iterator();
while(ki.hasNext())
entries.add(new MPair(ki.next(), vi.next()));
return entries;
}
public static void main(String[] args) {
SlowMap m = new SlowMap();
Collections2.fill(m,
Collections2.geography, 25);
System.out.println(m);
}
} ///:~
The put( ) method simply
places the keys and values in corresponding ArrayLists. In
main( ), a SlowMap is loaded and then printed to show that it
works.
This shows that it’s not that hard
to produce a new type of Map. But as the name suggests, a SlowMap
isn’t very fast, so you probably wouldn’t use it if you had an
alternative available. The problem is in the lookup of the key: there is no
order so a simple linear search is used, which is the slowest way to look
something up.
[ Add Comment ]
The whole point of hashing is speed:
hashing allows the lookup to happen quickly. Since the bottleneck is in the
speed of the key lookup, one of the solutions to the problem could be by keeping
the keys sorted and then using Collections.binarySearch( ) to
perform the lookup (an exercise at the end of this chapter will walk you through
this process).
[ Add Comment ]
Hashing goes further by saying that all
you want to do is to store the key somewhere so that it can be quickly
found. As you’ve seen in this chapter, the fastest structure in which to
store a group of elements is an array, so that will be used for representing the
key information (note carefully that I said “key information,” and
not the key itself). Also seen in this chapter was the fact that an array, once
allocated, cannot be resized, so we have a problem: we want to be able to store
any number of values in the Map, but if the number of keys is fixed by
the array size, how can this be?
[ Add Comment ]
The answer is that the array will not
hold the keys. From the key object, a number will be derived that will index
into the array. This number is the hash code,
produced by the hashCode( ) method (in computer science parlance,
this is the hash function) defined in
Object and presumably overridden by your class. To solve the problem of
the fixed-size array, more than one key may produce the same index. That is,
there may be collisions. Because of this, it
doesn’t matter how big the array is because each key object will land
somewhere in that array.
[ Add Comment ]
So the process of looking up a value
starts by computing the hash code and using it to index into the array. If you
could guarantee that there were no collisions (which could be possible if you
have a fixed number of values) then you’d have a
perfect hashing function,
but that’s a special case. In all other cases, collisions are handled by
external chaining: the array points not directly
to a value, but instead to a list of values. These values are searched in a
linear fashion using the equals( ) method. Of course, this aspect of
the search is much slower, but if the hash function is good there will only be a
few values in each slot, at the most. So instead of searching through the entire
list, you quickly jump to a slot where you have to compare a few entries to find
the value. This is much faster, which is why the HashMap is so quick.
[ Add Comment ]
Knowing the basics of hashing, it’s
possible to implement a simple hashed Map:
//: c09:SimpleHashMap.java
// A demonstration hashed Map.
import java.util.*;
import com.bruceeckel.util.*;
public class SimpleHashMap extends AbstractMap {
// Choose a prime number for the hash table
// size, to achieve a uniform distribution:
private final static int SZ = 997;
private LinkedList[] bucket= new LinkedList[SZ];
public Object put(Object key, Object value) {
Object result = null;
int index = key.hashCode() % SZ;
if(index < 0) index = -index;
if(bucket[index] == null)
bucket[index] = new LinkedList();
LinkedList pairs = bucket[index];
MPair pair = new MPair(key, value);
ListIterator it = pairs.listIterator();
boolean found = false;
while(it.hasNext()) {
Object iPair = it.next();
if(iPair.equals(pair)) {
result = ((MPair)iPair).getValue();
it.set(pair); // Replace old with new
found = true;
break;
}
}
if(!found)
bucket[index].add(pair);
return result;
}
public Object get(Object key) {
int index = key.hashCode() % SZ;
if(index < 0) index = -index;
if(bucket[index] == null) return null;
LinkedList pairs = bucket[index];
MPair match = new MPair(key, null);
ListIterator it = pairs.listIterator();
while(it.hasNext()) {
Object iPair = it.next();
if(iPair.equals(match))
return ((MPair)iPair).getValue();
}
return null;
}
public Set entrySet() {
Set entries = new HashSet();
for(int i = 0; i < bucket.length; i++) {
if(bucket[i] == null) continue;
Iterator it = bucket[i].iterator();
while(it.hasNext())
entries.add(it.next());
}
return entries;
}
public static void main(String[] args) {
SimpleHashMap m = new SimpleHashMap();
Collections2.fill(m,
Collections2.geography, 25);
System.out.println(m);
}
} ///:~
Because the “slots” in a hash
table are often referred to as buckets, the array that represents the
actual table is called bucket. To promote even distribution, the number
of buckets is typically a prime number. Notice that it is an array of
LinkedList, which automatically provides for collisions—each new
item is simply added to the end of the list.
[ Add Comment ]
The return value of put( ) is
null or, if the key was already in the list, the old value associated
with that key. The return value is result, which is initialized to
null, but if a key is discovered in the list then result is
assigned to that key.
[ Add Comment ]
For both put( ) and
get( ), the first thing that happens is that the
hashCode( ) is called for the key, and the result is forced to a
positive number. Then it is forced to fit into the bucket array using the
modulus operator and the size of the array. If that location is null, it
means there are no elements that hash to that location, so a new
LinkedList is created to hold the object that just did. However, the
normal process is to look through the list to see if there are duplicates, and
if there are, the old value is put into result and the new value replaces
the old. The found flag keeps track of whether an old key-value pair was
found and, if not, the new pair is appended to the end of the list.
[ Add Comment ]
In get( ), you’ll see
very similar code as that contained in put( ), but simpler. The
index is calculated into the bucket array, and if a LinkedList
exists it is searched for a match.
[ Add Comment ]
entrySet( ) must find and
traverse all the lists, adding them to the result Set. Once this method
has been created, the Map can be tested by filling it with values and
then printing them.
[ Add Comment ]
To understand the issues, some
terminology is necessary:
Initial
capacity: The number of buckets when the table is created.
HashMap and HashSet: have constructors that allow you to specify
the initial capacity.
Load
factor: size/capacity. A load factor of 0 is an empty table, 0.5
is a half-full table, etc. A lightly-loaded table will have few collisions and
so is optimal for insertions and lookups (but will slow down the process of
traversing with an iterator). HashMap and HashSet have
constructors that allow you to specify the load factor, which means that when
this load factor is reached the container will automatically increase the
capacity (the number of buckets) by roughly doubling it, and will redistribute
the existing objects into the new set of buckets (this is called
rehashing).
The default load factor used by
HashMap is 0.75 (it doesn’t rehash until the table is ¾ full).
This seems to be a good trade-off between time and space costs. A higher load
factor decreases the space required by the table but increases the lookup cost,
which is important because lookup is what you do most of the time (including
both get( ) and put( )).
[ Add Comment ]
If you know that you’ll be storing
many entries in a HashMap, creating it with an appropriately large
initial capacity will prevent the overhead of automatic rehashing.
[ Add Comment ]
Now that you understand what’s
involved in the function of the HashMap, the issues involved in writing a
hashCode( ) will make more sense.
[ Add Comment ]
First of all, you don’t have
control of the creation of the actual value that’s used to index into the
array of buckets. That is dependent on the capacity of the particular
HashMap object, and that capacity changes depending on how full the
container is, and what the load factor is. The value produced by your
hashCode( ) will be further processed in order to create the bucket
index (in SimpleHashMap the calculation is just a modulo by the size of
the bucket array).
[ Add Comment ]
The most important factor in creating a
hashCode( ) is that, regardless of when hashCode( ) is
called, it produces the same value for a particular object every time it is
called. If you end up with an object that produces one hashCode( )
value when it is put( ) into a HashMap, and another
during a get( ), you won’t be able to retrieve the objects. So
if your hashCode( ) depends on mutable data in the object the user
must be made aware that changing the data will effectively produce a different
key by generating a different hashCode( ).
[ Add Comment ]
In addition, you will probably not
want to generate a hashCode( ) that is based on unique object
information—in particular, the value of this makes a bad
hashCode( ) because then you can’t generate a new identical
key to the one used to put( ) the original key-value pair. This was
the problem that occurred in SpringDetector.java because the default
implementation of hashCode( ) does use the object address. So
you’ll want to use information in the object that identifies the object in
a meaningful way.
[ Add Comment ]
One example is found in the String
class. Strings have the special characteristic that if a program has
several String objects that contain identical character sequences, then
those String objects all map to the same memory (the mechanism for this
is described in Appendix A). So it makes sense that the hashCode( )
produced by two separate instances of new String(“hello”)
should be identical. You can see it by running this program:
//: c09:StringHashCode.java
public class StringHashCode {
public static void main(String[] args) {
System.out.println("Hello".hashCode());
System.out.println("Hello".hashCode());
}
} ///:~
For this to work, the
hashCode( ) for String must be based on the contents of the
String.
[ Add Comment ]
So for a hashCode( ) to be
effective, it must be fast and it must be meaningful: that is, it must generate
a value based on the contents of the object. Remember that this value
doesn’t have to be unique—you should lean toward speed rather than
uniqueness—but between hashCode( ) and equals( )
the identity of the object must be completely resolved.
[ Add Comment ]
Because the hashCode( ) is
further processed before the bucket index is produced, the range of values is
not important; it just needs to generate an int.
[ Add Comment ]
There’s one other factor: a good
hashCode( ) should result in an even distribution of values. If the
values tend to cluster, then the HashMap or HashSet will be more
heavily loaded in some areas and will not be as fast as it could be with an
evenly distributed hashing function.
[ Add Comment ]
Here’s an example that follows
these guidelines:
//: c09:CountedString.java
// Creating a good hashCode().
import java.util.*;
public class CountedString {
private String s;
private int id = 0;
private static ArrayList created =
new ArrayList();
public CountedString(String str) {
s = str;
created.add(s);
Iterator it = created.iterator();
// Id is the total number of instances
// of this string in use by CountedString:
while(it.hasNext())
if(it.next().equals(s))
id++;
}
public String toString() {
return "String: " + s + " id: " + id +
" hashCode(): " + hashCode() + "\n";
}
public int hashCode() {
return s.hashCode() * id;
}
public boolean equals(Object o) {
return (o instanceof CountedString)
&& s.equals(((CountedString)o).s)
&& id == ((CountedString)o).id;
}
public static void main(String[] args) {
HashMap m = new HashMap();
CountedString[] cs = new CountedString[10];
for(int i = 0; i < cs.length; i++) {
cs[i] = new CountedString("hi");
m.put(cs[i], new Integer(i));
}
System.out.println(m);
for(int i = 0; i < cs.length; i++) {
System.out.print("Looking up " + cs[i]);
System.out.println(m.get(cs[i]));
}
}
} ///:~
CountedString includes a
String and an id that represents the number of
CountedString objects that contain an identical String. The
counting is accomplished in the constructor by iterating through the static
ArrayList where all the Strings are stored.
[ Add Comment ]
Both hashCode( ) and
equals( ) produce results based on both fields; if they were just
based on the String alone or the id alone there would be duplicate
matches for distinct values.
[ Add Comment ]
Note how simple the
hashCode( ) is: String’s hashCode( ) is
multiplied by the id. Smaller is generally better (and faster) for
hashCode( ).
[ Add Comment ]
In main( ), a bunch of
CountedString objects are created, using the same String to show
that the duplicates create unique values because of the count id. The
HashMap is displayed so that you can see how it is stored internally (no
discernible orders) and then each key is looked up individually to demonstrate
that the lookup mechanism is working properly.
[ Add Comment ]
The java.lang.ref library contains
a set of classes that allow greater flexibility in garbage collection, which are
especially useful when you have large objects that may cause memory exhaustion.
There are three classes inherited from the abstract class
Reference:
SoftReference,
WeakReference, and
PhantomReference. Each of these provides a different
level of indirection for the garbage collector, if the object in question is
only reachable through one of these Reference objects.
[ Add Comment ]
If an object is
reachable it means that
somewhere in your program the object can be found. This could mean that you have
an ordinary reference on the stack that goes right to the object, but you might
also have a reference to an object that has a reference to the object in
question; there could be many intermediate links. If an object is reachable, the
garbage collector cannot release it because it’s still in use by your
program. If an object isn’t reachable, there’s no way for your
program to use it so it’s safe to garbage-collect that object.
[ Add Comment ]
You use Reference objects when you
want to continue to hold onto a reference to that object—you want to be
able to reach that object—but you also want to allow the garbage collector
to release that object. Thus, you have a way to go on using the object, but if
memory exhaustion is imminent you allow that object to
be released.
[ Add Comment ]
You accomplish this by using a
Reference object as an intermediary between you and the ordinary
reference, and there must be no ordinary references to the object (ones
that are not wrapped inside Reference objects). If the garbage collector
discovers that an object is reachable through an ordinary reference, it will not
release that object.
[ Add Comment ]
In the order SoftReference,
WeakReference, and PhantomReference, each one is “weaker”
than the last, and corresponds to a different level of reachability. Soft
references are for implementing memory-sensitive caches. Weak references are for
implementing “canonicalizing mappings”—where instances of
objects can be simultaneously used in multiple places in a program, to save
storage—that do not prevent their keys (or values) from being reclaimed.
Phantom references are for scheduling pre-mortem cleanup actions in a more
flexible way than is possible with the Java finalization mechanism.
[ Add Comment ]
With SoftReferences and
WeakReferences, you have a choice about whether to place them on a
ReferenceQueue (the device used for premortem cleanup actions), but a
PhantomReference can only be built on a ReferenceQueue.
Here’s a simple demonstration:
//: c09:References.java
// Demonstrates Reference objects
import java.lang.ref.*;
class VeryBig {
static final int SZ = 10000;
double[] d = new double[SZ];
String ident;
public VeryBig(String id) { ident = id; }
public String toString() { return ident; }
public void finalize() {
System.out.println("Finalizing " + ident);
}
}
public class References {
static ReferenceQueue rq= new ReferenceQueue();
public static void checkQueue() {
Object inq = rq.poll();
if(inq != null)
System.out.println("In queue: " +
(VeryBig)((Reference)inq).get());
}
public static void main(String[] args) {
int size = 10;
// Or, choose size via the command line:
if(args.length > 0)
size = Integer.parseInt(args[0]);
SoftReference[] sa =
new SoftReference[size];
for(int i = 0; i < sa.length; i++) {
sa[i] = new SoftReference(
new VeryBig("Soft " + i), rq);
System.out.println("Just created: " +
(VeryBig)sa[i].get());
checkQueue();
}
WeakReference[] wa =
new WeakReference[size];
for(int i = 0; i < wa.length; i++) {
wa[i] = new WeakReference(
new VeryBig("Weak " + i), rq);
System.out.println("Just created: " +
(VeryBig)wa[i].get());
checkQueue();
}
SoftReference s = new SoftReference(
new VeryBig("Soft"));
WeakReference w = new WeakReference(
new VeryBig("Weak"));
System.gc();
PhantomReference[] pa =
new PhantomReference[size];
for(int i = 0; i < pa.length; i++) {
pa[i] = new PhantomReference(
new VeryBig("Phantom " + i), rq);
System.out.println("Just created: " +
(VeryBig)pa[i].get());
checkQueue();
}
}
} ///:~
When you run this program (you’ll
want to pipe the output through a “more” utility so that you can
view the output in pages), you’ll see that the objects are
garbage-collected, even though you still have access to them through the
Reference object (to get the actual object reference, you use
get( )). You’ll also see that the ReferenceQueue always
produces a Reference containing a null object. To make use of
this, you can inherit from the particular Reference class you’re
interested in and add more useful methods to the new type of Reference.
[ Add Comment ]
The containers library has a special
Map to hold weak references: the
WeakHashMap. This class is designed to make the
creation of canonicalized mappings easier. In such a mapping, you are saving
storage by making only one instance of a particular value. When the program
needs that value, it looks up the existing object in the mapping and uses that
(rather than creating one from scratch). The mapping may make the values as part
of its initialization, but it’s more likely that the values are made on
demand.
[ Add Comment ]
Since this is a storage-saving technique,
it’s very convenient that the WeakHashMap allows the garbage
collector to automatically clean up the keys and values. You don’t have to
do anything special to the keys and values you want to place in the
WeakHashMap; these are automatically wrapped in WeakReferences by
the map. The trigger to allow cleanup is if the key is no longer in use, as
demonstrated here:
//: c09:CanonicalMapping.java
// Demonstrates WeakHashMap.
import java.util.*;
import java.lang.ref.*;
class Key {
String ident;
public Key(String id) { ident = id; }
public String toString() { return ident; }
public int hashCode() {
return ident.hashCode();
}
public boolean equals(Object r) {
return (r instanceof Key)
&& ident.equals(((Key)r).ident);
}
public void finalize() {
System.out.println("Finalizing Key "+ ident);
}
}
class Value {
String ident;
public Value(String id) { ident = id; }
public String toString() { return ident; }
public void finalize() {
System.out.println("Finalizing Value "+ident);
}
}
public class CanonicalMapping {
public static void main(String[] args) {
int size = 1000;
// Or, choose size via the command line:
if(args.length > 0)
size = Integer.parseInt(args[0]);
Key[] keys = new Key[size];
WeakHashMap whm = new WeakHashMap();
for(int i = 0; i < size; i++) {
Key k = new Key(Integer.toString(i));
Value v = new Value(Integer.toString(i));
if(i % 3 == 0)
keys[i] = k; // Save as "real" references
whm.put(k, v);
}
System.gc();
}
} ///:~
The Key class must have a
hashCode( ) and an equals( ) since it is being used as a
key in a hashed data structure, as described previously in this chapter.
[ Add Comment ]
When you run the program you’ll see
that the garbage collector will skip every third key, because an ordinary
reference to that key has also been placed in the keys array and thus
those objects cannot be garbage-collected.
[ Add Comment ]
We can now demonstrate the true power of
the Iterator: the ability to separate the
operation of traversing a sequence from the underlying structure of that
sequence. In the following example, the class PrintData uses an
Iterator to move through a sequence and call the
toString( ) method for every object. Two
different types of containers are created—an
ArrayList and a
HashMap—and they are each filled with,
respectively, Mouse and Hamster objects. (These classes are
defined earlier in this chapter.) Because an Iterator hides the structure
of the underlying container, PrintData doesn’t know or care what
kind of container the Iterator comes from:
//: c09:Iterators2.java
// Revisiting Iterators.
import java.util.*;
class PrintData {
static void print(Iterator e) {
while(e.hasNext())
System.out.println(e.next());
}
}
public class Iterators2 {
public static void main(String[] args) {
ArrayList v = new ArrayList();
for(int i = 0; i < 5; i++)
v.add(new Mouse(i));
HashMap m = new HashMap();
for(int i = 0; i < 5; i++)
m.put(new Integer(i), new Hamster(i));
System.out.println("ArrayList");
PrintData.print(v.iterator());
System.out.println("HashMap");
PrintData.print(m.entrySet().iterator());
}
} ///:~
For the HashMap, the
entrySet( ) method produces a Set of Map.entry
objects, which contain both the key and the value for each entry, so you see
both of them printed.
[ Add Comment ]
Note that PrintData.print( )
takes advantage of the fact that the objects in these containers are of class
Object so the call toString( ) by
System.out.println( ) is automatic. It’s more likely that in
your problem, you must make the assumption that your Iterator is walking
through a container of some specific type. For example, you might assume that
everything in the container is a Shape with a draw( ) method.
Then you must downcast from the Object that Iterator.next( )
returns to produce a Shape.
[ Add Comment ]
By now you should understand that there
are really only three container components: Map, List, and
Set, and only two or three implementations of each interface. If you need
to use the functionality offered by a particular interface, how do you
decide which particular implementation to use?
[ Add Comment ]
To understand the answer, you must be
aware that each different implementation has its own features, strengths, and
weaknesses. For example, you can see in the diagram that the
“feature” of Hashtable, Vector, and Stack is
that they are legacy classes, so that old code doesn’t break. On the other
hand, it’s best if you don’t use those for new (Java 2) code.
[ Add Comment ]
The distinction between the other
containers often comes down to what they are “backed by”; that is,
the data structures that physically implement your desired interface.
This means that, for example, ArrayList and LinkedList implement
the List interface so your program will produce the same results
regardless of the one you use. However, ArrayList is backed by an array,
while the LinkedList is implemented in the usual way for a doubly linked
list, as individual objects each containing data along with references to the
previous and next elements in the list. Because of this, if you want to do many
insertions and removals in the middle of a list, a LinkedList is the
appropriate choice. (LinkedList also has additional functionality that is
established in AbstractSequentialList.) If not,
an ArrayList is typically faster.
[ Add Comment ]
As another example, a Set can be
implemented as either a TreeSet or a HashSet. A TreeSet is
backed by a TreeMap and is designed to produce a constantly sorted set.
However, if you’re going to have larger quantities in your Set, the
performance of TreeSet insertions will get slow. When you’re
writing a program that needs a Set, you should choose HashSet by
default, and change to TreeSet when it's more important to have a
constantly sorted set.
[ Add Comment ]
The most convincing way to see the
differences between the implementations of List is with a performance
test. The following code establishes an inner base class to use as a test
framework, then creates an array of
anonymous
inner classes, one for each different test. Each of these inner classes is
called by the test( ) method. This approach allows you to easily add
and remove new kinds of tests.
//: c09:ListPerformance.java
// Demonstrates performance differences in Lists.
import java.util.*;
import com.bruceeckel.util.*;
public class ListPerformance {
private abstract static class Tester {
String name;
int size; // Test quantity
Tester(String name, int size) {
this.name = name;
this.size = size;
}
abstract void test(List a, int reps);
}
private static Tester[] tests = {
new Tester("get", 300) {
void test(List a, int reps) {
for(int i = 0; i < reps; i++) {
for(int j = 0; j < a.size(); j++)
a.get(j);
}
}
},
new Tester("iteration", 300) {
void test(List a, int reps) {
for(int i = 0; i < reps; i++) {
Iterator it = a.iterator();
while(it.hasNext())
it.next();
}
}
},
new Tester("insert", 5000) {
void test(List a, int reps) {
int half = a.size()/2;
String s = "test";
ListIterator it = a.listIterator(half);
for(int i = 0; i < size * 10; i++)
it.add(s);
}
},
new Tester("remove", 5000) {
void test(List a, int reps) {
ListIterator it = a.listIterator(3);
while(it.hasNext()) {
it.next();
it.remove();
}
}
},
};
public static void test(List a, int reps) {
// A trick to print out the class name:
System.out.println("Testing " +
a.getClass().getName());
for(int i = 0; i < tests.length; i++) {
Collections2.fill(a,
Collections2.countries.reset(),
tests[i].size);
System.out.print(tests[i].name);
long t1 = System.currentTimeMillis();
tests[i].test(a, reps);
long t2 = System.currentTimeMillis();
System.out.println(": " + (t2 - t1));
}
}
public static void testArray(int reps) {
System.out.println("Testing array as List");
// Can only do first two tests on an array:
for(int i = 0; i < 2; i++) {
String[] sa = new String[tests[i].size];
Arrays2.fill(sa,
Collections2.countries.reset());
List a = Arrays.asList(sa);
System.out.print(tests[i].name);
long t1 = System.currentTimeMillis();
tests[i].test(a, reps);
long t2 = System.currentTimeMillis();
System.out.println(": " + (t2 - t1));
}
}
public static void main(String[] args) {
int reps = 50000;
// Or, choose the number of repetitions
// via the command line:
if(args.length > 0)
reps = Integer.parseInt(args[0]);
System.out.println(reps + " repetitions");
testArray(reps);
test(new ArrayList(), reps);
test(new LinkedList(), reps);
test(new Vector(), reps);
}
} ///:~
The inner class Tester is
abstract, to provide a base class for the specific tests. It contains a
String to be printed when the test starts, a size parameter to be
used by the test for quantity of elements or repetitions of tests, a constructor
to initialize the fields, and an abstract method test( ) that
does the work. All the different types of tests are collected in one place, the
array tests, which is initialized with different anonymous inner classes
that inherit from Tester. To add or remove tests, simply add or remove an
inner class definition from the array, and everything else happens
automatically.
[ Add Comment ]
To compare array access to container
access (primarily against ArrayList), a special test is created for
arrays by wrapping one as a List using Arrays.asList( ). Note
that only the first two tests can be performed in this case, because you cannot
insert or remove elements from an array.
[ Add Comment ]
The List that’s handed to
test( ) is first filled with elements, then each test in the
tests array is timed. The results will vary from machine to machine; they
are intended to give only an order of magnitude comparison between the
performance of the different containers. Here is a summary of one
run:
|
Type |
Get |
Iteration |
Insert |
Remove |
|
array |
1430 |
3850 |
na |
na |
|
ArrayList |
3070 |
12200 |
500 |
46850 |
|
LinkedList |
16320 |
9110 |
110 |
60 |
|
Vector |
4890 |
16250 |
550 |
46850 |
As expected, arrays are faster than any
container for random-access lookups and iteration. You can see that random
accesses (get( )) are cheap for ArrayLists and expensive for
LinkedLists. (Oddly, iteration is faster for a
LinkedList than an
ArrayList, which is a bit counterintuitive.) On
the other hand, insertions and removals from the middle of a list are
dramatically cheaper for a LinkedList than for an
ArrayList—especially removals.
Vector is generally not as fast as
ArrayList, and it should be avoided; it’s only in the library for
legacy code support (the only reason it works in this program is because it was
adapted to be a List in Java 2). The best approach is probably to
choose an ArrayList as your default, and to change to a LinkedList
if you discover performance problems due to many insertions and removals from
the middle of the list. And of course, if you are working with a fixed-sized
group of elements, use an array.
You can choose between a
TreeSet and a
HashSet, depending on the size of the
Set (if you need to produce an ordered sequence
from a Set, use TreeSet). The following test program gives
an indication of this trade-off:
[ Add Comment ]
//: c09:SetPerformance.java
import java.util.*;
import com.bruceeckel.util.*;
public class SetPerformance {
private abstract static class Tester {
String name;
Tester(String name) { this.name = name; }
abstract void test(Set s, int size, int reps);
}
private static Tester[] tests = {
new Tester("add") {
void test(Set s, int size, int reps) {
for(int i = 0; i < reps; i++) {
s.clear();
Collections2.fill(s,
Collections2.countries.reset(),size);
}
}
},
new Tester("contains") {
void test(Set s, int size, int reps) {
for(int i = 0; i < reps; i++)
for(int j = 0; j < size; j++)
s.contains(Integer.toString(j));
}
},
new Tester("iteration") {
void test(Set s, int size, int reps) {
for(int i = 0; i < reps * 10; i++) {
Iterator it = s.iterator();
while(it.hasNext())
it.next();
}
}
},
};
public static void
test(Set s, int size, int reps) {
System.out.println("Testing " +
s.getClass().getName() + " size " + size);
Collections2.fill(s,
Collections2.countries.reset(), size);
for(int i = 0; i < tests.length; i++) {
System.out.print(tests[i].name);
long t1 = System.currentTimeMillis();
tests[i].test(s, size, reps);
long t2 = System.currentTimeMillis();
System.out.println(": " +
((double)(t2 - t1)/(double)size));
}
}
public static void main(String[] args) {
int reps = 50000;
// Or, choose the number of repetitions
// via the command line:
if(args.length > 0)
reps = Integer.parseInt(args[0]);
// Small:
test(new TreeSet(), 10, reps);
test(new HashSet(), 10, reps);
// Medium:
test(new TreeSet(), 100, reps);
test(new HashSet(), 100, reps);
// Large:
test(new TreeSet(), 1000, reps);
test(new HashSet(), 1000, reps);
}
} ///:~
The following table shows the results of
one run. (Of course, this will be different according to the computer and JVM
you are using; you should run the test yourself as well):
|
Type |
Test size |
Add |
Contains |
Iteration |
|
|
10 |
138.0 |
115.0 |
187.0 |
|
TreeSet |
100 |
189.5 |
151.1 |
206.5 |
|
|
1000 |
150.6 |
177.4 |
40.04 |
|
|
10 |
55.0 |
82.0 |
192.0 |
|
HashSet |
100 |
45.6 |
90.0 |
202.2 |
|
|
1000 |
36.14 |
106.5 |
39.39 |
The performance of HashSet is
generally superior to TreeSet for all operations (but in particular
addition and lookup, the two most important operations). The only reason
TreeSet exists is because it maintains its elements in sorted order, so
you only use it when you need a sorted
Set.
When choosing between implementations of
Map, the size of the Map is what most
strongly affects performance, and the following test program gives an indication
of this trade-off:
[ Add Comment ]
//: c09:MapPerformance.java
// Demonstrates performance differences in Maps.
import java.util.*;
import com.bruceeckel.util.*;
public class MapPerformance {
private abstract static class Tester {
String name;
Tester(String name) { this.name = name; }
abstract void test(Map m, int size, int reps);
}
private static Tester[] tests = {
new Tester("put") {
void test(Map m, int size, int reps) {
for(int i = 0; i < reps; i++) {
m.clear();
Collections2.fill(m,
Collections2.geography.reset(), size);
}
}
},
new Tester("get") {
void test(Map m, int size, int reps) {
for(int i = 0; i < reps; i++)
for(int j = 0; j < size; j++)
m.get(Integer.toString(j));
}
},
new Tester("iteration") {
void test(Map m, int size, int reps) {
for(int i = 0; i < reps * 10; i++) {
Iterator it = m.entrySet().iterator();
while(it.hasNext())
it.next();
}
}
},
};
public static void
test(Map m, int size, int reps) {
System.out.println("Testing " +
m.getClass().getName() + " size " + size);
Collections2.fill(m,
Collections2.geography.reset(), size);
for(int i = 0; i < tests.length; i++) {
System.out.print(tests[i].name);
long t1 = System.currentTimeMillis();
tests[i].test(m, size, reps);
long t2 = System.currentTimeMillis();
System.out.println(": " +
((double)(t2 - t1)/(double)size));
}
}
public static void main(String[] args) {
int reps = 50000;
// Or, choose the number of repetitions
// via the command line:
if(args.length > 0)
reps = Integer.parseInt(args[0]);
// Small:
test(new TreeMap(), 10, reps);
test(new HashMap(), 10, reps);
test(new Hashtable(), 10, reps);
// Medium:
test(new TreeMap(), 100, reps);
test(new HashMap(), 100, reps);
test(new Hashtable(), 100, reps);
// Large:
test(new TreeMap(), 1000, reps);
test(new HashMap(), 1000, reps);
test(new Hashtable(), 1000, reps);
}
} ///:~
Because the size of the map is the issue,
you’ll see that the timing tests divide the time by the size to normalize
each measurement. Here is one set of results. (Yours will probably be
different.)
|
Type |
Test size |
Put |
Get |
Iteration |
|---|---|---|---|---|
|
|
10 |
143.0 |
110.0 |
186.0 |
|
TreeMap |
100 |
201.1 |
188.4 |
280.1 |
|
|
1000 |
222.8 |
205.2 |
40.7 |
|
|
10 |
66.0 |
83.0 |
197.0 |
|
HashMap |
100 |
80.7 |
135.7 |
278.5 |
|
|
1000 |
48.2 |
105.7 |
41.4 |
|
|
10 |
61.0 |
93.0 |
302.0 |
|
Hashtable |
100 |
90.6 |
143.3 |
329.0 |
|
|
1000 |
54.1 |
110.95 |
47.3 |
As you might expect,
Hashtable performance is roughly equivalent to
HashMap. (You can also see that HashMap is generally a bit faster.
HashMap is intended to replace Hashtable.) The
TreeMap is generally slower than the
HashMap, so why would you use it? So you could use it not as a
Map, but as a way to create an ordered list. The behavior of a
tree is such that it’s always in order and doesn’t have to be
specially sorted. Once you fill a TreeMap, you can call
keySet( ) to get a Set view of the
keys, then toArray( ) to produce an array of
those keys. You can then use the static method
Arrays.binarySearch( ) (discussed later) to rapidly find objects in
your sorted array. Of course, you would probably only do this if, for some
reason, the behavior of a HashMap was unacceptable, since HashMap
is designed to rapidly find things. Also, you can easily create a HashMap
from a TreeMap with a single object creation In the end, when
you’re using a Map your first choice should be HashMap, and
only if you need a constantly sorted Map will you need
TreeMap.
Utilities to perform sorting and
searching for Lists have the same names and
signatures as those for sorting arrays of objects, but are static methods
of Collections instead of Arrays.
Here’s an example, modified from
ArraySearching.java:
//: c09:ListSortSearch.java
// Sorting and searching Lists with 'Collections.'
import com.bruceeckel.util.*;
import java.util.*;
public class ListSortSearch {
public static void main(String[] args) {
List list = new ArrayList();
Collections2.fill(list,
Collections2.capitals, 25);
System.out.println(list + "\n");
Collections.shuffle(list);
System.out.println("After shuffling: "+list);
Collections.sort(list);
System.out.println(list + "\n");
Object key = list.get(12);
int index =
Collections.binarySearch(list, key);
System.out.println("Location of " + key +
" is " + index + ", list.get(" +
index + ") = " + list.get(index));
AlphabeticComparator comp =
new AlphabeticComparator();
Collections.sort(list, comp);
System.out.println(list + "\n");
key = list.get(12);
index =
Collections.binarySearch(list, key, comp);
System.out.println("Location of " + key +
" is " + index + ", list.get(" +
index + ") = " + list.get(index));
}
} ///:~
The use of these methods is identical to
the ones in Arrays, but you’re using a List instead of an
array. Just like searching and sorting with arrays, if you sort using a
Comparator you must binarySearch( ) using the same
Comparator.
[ Add Comment ]
This program also demonstrates the
shuffle( ) method in Collections,
which randomizes the order of a List.
[ Add Comment ]
|
enumeration(Collection)
|
Produces an old-style Enumeration
for the argument. |
|
max(Collection) min(Collection) |
Produces the maximum or minimum element
in the argument using the natural comparison method of the objects in the
Collection. |
|
max(Collection, Comparator)
min(Collection,
Comparator) |
Produces the maximum or minimum element
in the Collection using the Comparator. |
|
reverse( ) |
Reverses all the elements in
place. |
|
copy(List dest, List
src) |
Copies elements from src to
dest. |
|
fill(List list, Object
o) |
Replaces all the elements of list with
o. |
|
nCopies(int n, Object o)
|
Returns an immutable List of size
n whose references all point to o. |
Note that min( ) and
max( ) work with Collection objects, not with Lists,
so you don’t need to worry about whether the Collection should be
sorted or not. (As mentioned earlier, you do need to sort( )
a List or an array before performing a binarySearch( ).)
[ Add Comment ]
Often it is convenient to create a
read-only version of a Collection or Map. The Collections
class allows you to do this by passing the original container into a method that
hands back a read-only version. There are four variations on this method, one
each for Collection (if you don’t want to treat a Collection
as a more specific type), List, Set, and Map. This
example shows the proper way to build read-only versions of
each:
//: c09:ReadOnly.java
// Using the Collections.unmodifiable methods.
import java.util.*;
import com.bruceeckel.util.*;
public class ReadOnly {
static Collections2.StringGenerator gen =
Collections2.countries;
public static void main(String[] args) {
Collection c = new ArrayList();
Collections2.fill(c, gen, 25); // Insert data
c = Collections.unmodifiableCollection(c);
System.out.println(c); // Reading is OK
c.add("one"); // Can't change it
List a = new ArrayList();
Collections2.fill(a, gen.reset(), 25);
a = Collections.unmodifiableList(a);
ListIterator lit = a.listIterator();
System.out.println(lit.next()); // Reading OK
lit.add("one"); // Can't change it
Set s = new HashSet();
Collections2.fill(s, gen.reset(), 25);
s = Collections.unmodifiableSet(s);
System.out.println(s); // Reading OK
//! s.add("one"); // Can't change it
Map m = new HashMap();
Collections2.fill(m,
Collections2.geography, 25);
m = Collections.unmodifiableMap(m);
System.out.println(m); // Reading OK
//! m.put("Ralph", "Howdy!");
}
} ///:~
In each case, you must fill the container
with meaningful data before you make it read-only. Once it is loaded, the
best approach is to replace the existing reference with the reference that is
produced by the “unmodifiable” call. That way, you don’t run
the risk of accidentally changing the contents once you’ve made it
unmodifiable. On the other hand, this tool also allows you to keep a modifiable
container as private within a class and to return a read-only reference
to that container from a method call. So you can change it from within the
class, but everyone else can only read it.
[ Add Comment ]
Calling the “unmodifiable”
method for a particular type does not cause compile-time checking, but once the
transformation has occurred, any calls to methods that modify the contents of a
particular container will produce an UnsupportedOperationException.
[ Add Comment ]
The
synchronized keyword is an important part of the
subject of multithreading, a more complicated
topic that will not be introduced until Chapter 14. Here, I shall note only that
the Collections class contains a way to automatically synchronize an
entire container. The syntax is similar to the “unmodifiable”
methods:
//: c09:Synchronization.java
// Using the Collections.synchronized methods.
import java.util.*;
public class Synchronization {
public static void main(String[] args) {
Collection c =
Collections.synchronizedCollection(
new ArrayList());
List list = Collections.synchronizedList(
new ArrayList());
Set s = Collections.synchronizedSet(
new HashSet());
Map m = Collections.synchronizedMap(
new HashMap());
}
} ///:~
In this case, you immediately pass the
new container through the appropriate “synchronized” method; that
way there’s no chance of accidentally exposing the unsynchronized version.
[ Add Comment ]
The Java containers also have a mechanism
to prevent more than one process from modifying the contents of a container. The
problem occurs if you’re iterating through a container and some other
process steps in and inserts, removes, or changes an object in that container.
Maybe you’ve already passed that object, maybe it’s ahead of you,
maybe the size of the container shrinks after you call
size( )—there are many scenarios for disaster. The Java
containers library incorporates a fail-fast mechanism that looks for any
changes to the container other than the ones your process is personally
responsible for. If it detects that someone else is modifying the container, it
immediately produces a ConcurrentModificationException. This is the
“fail-fast” aspect—it doesn’t try to detect a problem
later on using a more complex algorithm.
[ Add Comment ]
It’s quite easy to see the
fail-fast mechanism in operation—all you have to do is create an iterator
and then add something to the collection that the iterator is pointing to, like
this:
//: c09:FailFast.java
// Demonstrates the "fail fast" behavior.
import java.util.*;
public class FailFast {
public static void main(String[] args) {
Collection c = new ArrayList();
Iterator it = c.iterator();
c.add("An object");
// Causes an exception:
String s = (String)it.next();
}
} ///:~
The exception happens because something
is placed in the container after the iterator is acquired from the
container. The possibility that two parts of the program could be modifying the
same container produces an uncertain state, so the exception notifies you that
you should change your code—in this case, acquire the iterator
after you have added all the elements to the container.
[ Add Comment ]
Note that you cannot benefit from this
kind of monitoring when you’re accessing the elements of a List
using get( ).
[ Add Comment ]
It’s possible to turn an array into
a List with the Arrays.asList( ) method:
//: c09:Unsupported.java
// Sometimes methods defined in the
// Collection interfaces don't work!
import java.util.*;
public class Unsupported {
private static String[] s = {
"one", "two", "three", "four", "five",
"six", "seven", "eight", "nine", "ten",
};
static List a = Arrays.asList(s);
static List a2 = a.subList(3, 6);
public static void main(String[] args) {
System.out.println(a);
System.out.println(a2);
System.out.println(
"a.contains(" + s[0] + ") = " +
a.contains(s[0]));
System.out.println(
"a.containsAll(a2) = " +
a.containsAll(a2));
System.out.println("a.isEmpty() = " +
a.isEmpty());
System.out.println(
"a.indexOf(" + s[5] + ") = " +
a.indexOf(s[5]));
// Traverse backwards:
ListIterator lit = a.listIterator(a.size());
while(lit.hasPrevious())
System.out.print(lit.previous() + " ");
System.out.println();
// Set the elements to different values:
for(int i = 0; i < a.size(); i++)
a.set(i, "47");
System.out.println(a);
// Compiles, but won't run:
lit.add("X"); // Unsupported operation
a.clear(); // Unsupported
a.add("eleven"); // Unsupported
a.addAll(a2); // Unsupported
a.retainAll(a2); // Unsupported
a.remove(s[0]); // Unsupported
a.removeAll(a2); // Unsupported
}
} ///:~
You’ll discover that only a portion
of the Collection and List interfaces are actually implemented.
The rest of the methods cause the unwelcome appearance of something called an
UnsupportedOperationException. You’ll learn all about exceptions in
the next chapter, but the short story is that the Collection
interface—as well as some of the other interfaces in the
Java containers library—contain “optional” methods, which
might or might not be “supported” in the concrete class that
implements that interface. Calling an unsupported method causes an
UnsupportedOperationException to indicate a programming error.
[ Add Comment ]
“What?!?” you say,
incredulous. “The whole point of interfaces and base classes is
that they promise these methods will do something meaningful! This breaks that
promise—it says that not only will calling some methods not perform
a meaningful behavior, they will stop the program! Type safety was just thrown
out the window!”
[ Add Comment ]
It’s not quite that bad. With a
Collection, List, Set, or Map, the compiler still
restricts you to calling only the methods in that interface, so
it’s not like Smalltalk (in which you can call any method for any object,
and find out only when you run the program whether your call does anything). In
addition, most methods that take a Collection as an argument only read
from that Collection—all the “read” methods of
Collection are not optional.
[ Add Comment ]
This approach prevents an explosion of
interfaces in the design. Other designs for container libraries always seem to
end up with a confusing plethora of interfaces to describe each of the
variations on the main theme and are thus difficult to learn. It’s not
even possible to capture all of the special cases in interfaces, because
someone can always invent a new interface. The “unsupported
operation” approach achieves an important goal of the Java containers
library: the containers are simple to learn and use; unsupported operations are
a special case that can be learned later. For this approach to work, however:
[ Add Comment ]
In
the example above, Arrays.asList( ) produces
a List that is backed by a fixed-size array. Therefore it makes sense
that the only supported operations are the ones that don’t change the size
of the array. If, on the other hand, a new interface were required to
express this different kind of behavior (called, perhaps,
“FixedSizeList”), it would throw open the door to complexity
and soon you wouldn’t know where to start when trying to use the library.
[ Add Comment ]
The documentation for a method that takes
a Collection, List, Set, or Map as an argument
should specify which of the optional methods must be implemented. For example,
sorting requires the set( ) and Iterator.set( ) methods,
but not add( ) and remove( ).
[ Add Comment ]
Unfortunately, a lot of code was written
using the Java 1.0/1.1 containers, and even new code is sometimes written using
these classes. So although you should never use the old containers when writing
new code, you’ll still need to be aware of them. However, the old
containers were quite limited, so there’s not that much to say about them.
(Since they are in the past, I will try to refrain from overemphasizing some of
the hideous design decisions.)
[ Add Comment ]
The only self-expanding sequence in Java
1.0/1.1 was the Vector, and so it saw a lot of
use. Its flaws are too numerous to describe here (see the first edition of this
book, available on this book’s CD ROM and as a free download from
www.BruceEckel.com). Basically, you can think of it as an
ArrayList with long, awkward method names. In the Java 2 container
library, Vector was adapted so that it could fit as a Collection
and a List, so in the following example the
Collections2.fill( ) method is successfully used. This turns out to
be a bit perverse, as it may confuse some people into thinking that
Vector has gotten better, when it is actually included only to support
pre-Java 2 code.
[ Add Comment ]
The Java 1.0/1.1 version of the iterator
chose to invent a new name, “enumeration,” instead of using a term
that everyone was already familiar with. The
Enumeration interface is smaller than
Iterator, with only two methods, and it uses longer method names:
boolean hasMoreElements( ) produces true if this
enumeration contains more elements, and Object nextElement( )
returns the next element of this enumeration if there are any more
(otherwise it throws an exception).
[ Add Comment ]
Enumeration is only an interface,
not an implementation, and even new libraries sometimes still use the old
Enumeration—which is unfortunate but generally harmless. Even
though you should always use Iterator when you can in your own code, you
must be prepared for libraries that want to hand you an Enumeration.
[ Add Comment ]
In addition, you
can produce an Enumeration for any Collection by using the
Collections.enumeration( ) method, as seen in this
example:
//: c09:Enumerations.java
// Java 1.0/1.1 Vector and Enumeration.
import java.util.*;
import com.bruceeckel.util.*;
public class Enumerations {
public static void main(String[] args) {
Vector v = new Vector();
Collections2.fill(
v, Collections2.countries, 100);
Enumeration e = v.elements();
while(e.hasMoreElements())
System.out.println(e.nextElement());
// Produce an Enumeration from a Collection:
e = Collections.enumeration(new ArrayList());
}
} ///:~
The Java 1.0/1.1 Vector has only
an addElement( ) method, but fill( ) uses the
add( ) method that was pasted on as Vector was turned into a
List. To produce an Enumeration, you call elements( ),
then you can use it to perform a forward iteration.
[ Add Comment ]
The last line creates an ArrayList
and uses enumeration( ) to adapt an Enumeration from the
ArrayList Iterator. Thus, if you have old code that wants an
Enumeration, you can still use the new containers.
[ Add Comment ]
As you’ve seen in the performance
comparison in this chapter, the basic Hashtable
is very similar to the HashMap, even down to the method names.
There’s no reason to use Hashtable instead of HashMap in new
code.
[ Add Comment ]
The concept of the stack was introduced
earlier, with the LinkedList. What’s rather odd about the
Java 1.0/1.1 Stack is that instead of using a
Vector as a building block,
Stack is inherited from Vector. So
it has all of the characteristics and behaviors of a Vector plus some
extra Stack behaviors. It’s difficult to know whether the designers
explicitly decided that this was an especially useful way of doing things, or
whether it was just a naive design.
[ Add Comment ]
Here’s a simple demonstration of
Stack that pushes each line from a String array:
//: c09:Stacks.java
// Demonstration of Stack Class.
import java.util.*;
public class Stacks {
static String[] months = {
"January", "February", "March", "April",
"May", "June", "July", "August", "September",
"October", "November", "December" };
public static void main(String[] args) {
Stack stk = new Stack();
for(int i = 0; i < months.length; i++)
stk.push(months[i] + " ");
System.out.println("stk = " + stk);
// Treating a stack as a Vector:
stk.addElement("The last line");
System.out.println(
"element 5 = " + stk.elementAt(5));
System.out.println("popping elements:");
while(!stk.empty())
System.out.println(stk.pop());
}
} ///:~
Each line in the months array is
inserted into the Stack with push( ), and later fetched from
the top of the stack with a pop( ). To make a point, Vector
operations are also performed on the Stack object. This is possible
because, by virtue of inheritance, a Stack is a Vector.
Thus, all operations that can be performed on a Vector can also be
performed on a Stack, such as elementAt( ).
[ Add Comment ]
As mentioned earlier, you should use a
LinkedList when you want stack behavior.
[ Add Comment ]
A BitSet
is used if you want to efficiently store a lot of on-off information. It’s
efficient only from the standpoint of size; if you’re looking for
efficient access, it is slightly slower than using an array of some native type.
[ Add Comment ]
In addition, the minimum size of the
BitSet is that of a long: 64 bits. This implies that if
you’re storing anything smaller, like 8 bits, a BitSet will be
wasteful; you’re better off creating your own class, or just an array, to
hold your flags if size is an issue.
[ Add Comment ]
A normal container expands as you add
more elements, and the BitSet does this as well. The following example
shows how the BitSet works:
//: c09:Bits.java
// Demonstration of BitSet.
import java.util.*;
public class Bits {
static void printBitSet(BitSet b) {
System.out.println("bits: " + b);
String bbits = new String();
for(int j = 0; j < b.size() ; j++)
bbits += (b.get(j) ? "1" : "0");
System.out.println("bit pattern: " + bbits);
}
public static void main(String[] args) {
Random rand = new Random();
// Take the LSB of nextInt():
byte bt = (byte)rand.nextInt();
BitSet bb = new BitSet();
for(int i = 7; i >=0; i--)
if(((1 << i) & bt) != 0)
bb.set(i);
else
bb.clear(i);
System.out.println("byte value: " + bt);
printBitSet(bb);
short st = (short)rand.nextInt();
BitSet bs = new BitSet();
for(int i = 15; i >=0; i--)
if(((1 << i) & st) != 0)
bs.set(i);
else
bs.clear(i);
System.out.println("short value: " + st);
printBitSet(bs);
int it = rand.nextInt();
BitSet bi = new BitSet();
for(int i = 31; i >=0; i--)
if(((1 << i) & it) != 0)
bi.set(i);
else
bi.clear(i);
System.out.println("int value: " + it);
printBitSet(bi);
// Test bitsets >= 64 bits:
BitSet b127 = new BitSet();
b127.set(127);
System.out.println("set bit 127: " + b127);
BitSet b255 = new BitSet(65);
b255.set(255);
System.out.println("set bit 255: " + b255);
BitSet b1023 = new BitSet(512);
b1023.set(1023);
b1023.set(1024);
System.out.println("set bit 1023: " + b1023);
}
} ///:~
The random number generator is used to
create a random byte, short, and int, and each one is
transformed into a corresponding bit pattern in a BitSet. This works fine
because a BitSet is 64 bits, so none of these cause it to increase in
size. Then a BitSet of 512 bits is created. The constructor allocates
storage for twice that number of bits. However, you can still set bit 1024 or
greater.
[ Add Comment ]
To review the containers provided in the
standard Java library:
[ Add Comment ]
The
containers are tools that you can use on a day-to-day basis to make your
programs simpler, more powerful, and more effective.
[ Add Comment ]
Solutions to selected exercises
can be found in the electronic document The Thinking in Java Annotated
Solution Guide, available for a small fee from
www.BruceEckel.com.
[44]
It’s possible, however, to ask how big the vector is, and the
at( ) method does perform bounds checking.
[45]
This is one of the places where C++ is distinctly superior to Java, since C++
supports parameterized types with the template
keyword.
[46]
The C++ programmer will note how much the code could be collapsed with the use
of default arguments and templates. The Python programmer will note that this
entire library would be largely unnecessary in that language.
[47]
By Joshua Bloch at Sun.
[48]
This data was found on the Internet, then processed by creating a Python program
(see www.Python.org).
[49]
This is a place where operator overloading would be nice.
[50]
If these speedups still don’t meet your performance needs, you can further
accelerate table lookup by writing your own Map and customizing it to
your particular types to avoid delays due to casting to and from Objects.
To reach even higher levels of performance, speed enthusiasts can use Donald
Knuth’s The Art of Computer Programming, Volume 3: Sorting and
Searching, Second Edition to replace overflow bucket lists with arrays that
have two additional benefits: they can be optimized for disk storage
characteristics and they can save most of the time of creating and garbage
collecting individual records.
James Thornton, jamesthornton.com>Services: Search Engines Submissions |
Electric Speed: Internet Marketing Technologies |