Java Collections Framework: Maps, Lists, and Performance

The Java Collections Framework ships with every JDK, yet it is routinely misused in ways that cause 10–100× performance regressions. Picking LinkedList when you need ArrayList, or HashMap where LinkedHashMap is required, or a non-thread-safe collection in concurrent code—these are common mistakes with measurable consequences. This guide covers the internals, performance characteristics, and the decision logic for every major collection type.

The Collections Hierarchy

text
Collection
├── List          (ordered, duplicates allowed)
│   ├── ArrayList
│   ├── LinkedList
│   ├── Vector (legacy)
│   └── CopyOnWriteArrayList
├── Set           (no duplicates)
│   ├── HashSet
│   ├── LinkedHashSet
│   ├── TreeSet
│   └── CopyOnWriteArraySet
└── Queue / Deque
    ├── ArrayDeque
    ├── LinkedList
    ├── PriorityQueue
    └── BlockingQueue implementations

Map               (key-value, separate hierarchy)
├── HashMap
├── LinkedHashMap
├── TreeMap
├── Hashtable (legacy)
├── ConcurrentHashMap
└── EnumMap

ArrayList vs LinkedList

This is the most common wrong choice in Java codebases.

ArrayList Internals

ArrayList is backed by an Object[] array. When capacity is exceeded, a new array of 1.5× size is allocated and elements are copied.

java
// ArrayList growth: capacity → 10 → 15 → 22 → 33 → ...
ArrayList<String> list = new ArrayList<>(); // default capacity 10, allocated lazily on first add
ArrayList<String> presized = new ArrayList<>(1000); // skips ~12 grow-and-copy cycles

text
Memory layout:
┌───┬───┬───┬───┬───┬─────────┐
│ 0 │ 1 │ 2 │ 3 │ 4 │ (empty) │
└───┴───┴───┴───┴───┴─────────┘
  ↑ contiguous, cache-friendly

LinkedList Internals

LinkedList is a doubly-linked list. Each element is a Node object with prev, next, and item fields.

text
Memory layout:
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│ prev │ item  │ ──► │ prev │ item  │ ──► │ prev │ item  │
│ next │       │ ◄── │ next │       │ ◄── │ next │       │
└──────────────┘     └──────────────┘     └──────────────┘
  ↑ scattered in heap, cache-unfriendly

Each Node adds roughly 24 bytes of overhead on a typical 64-bit JVM with compressed oops (object header plus the prev and next references). A LinkedList<Integer> of 1M elements uses ~3× more memory than an ArrayList<Integer>.
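The node structure described above can be sketched as follows. Field names match the description (item, next, prev); this is an illustrative reconstruction, not the JDK source, which keeps Node as a private nested class:

```java
// Simplified sketch of LinkedList's internal node. Each element costs a
// full object like this, allocated separately somewhere on the heap.
class Node<E> {
    E item;        // the stored element
    Node<E> next;  // following node, or null at the tail
    Node<E> prev;  // preceding node, or null at the head
}
```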

Performance Comparison

| Operation | ArrayList | LinkedList | Notes |
| --- | --- | --- | --- |
| get(index) | O(1) | O(n) | LL walks from head or tail |
| add(last) | O(1) amortized | O(1) | AL may resize |
| add(first) | O(n) | O(1) | AL shifts all elements |
| add(middle) | O(n) | O(n) | LL still must traverse to find the position |
| remove(index) | O(n) | O(n) | Both shift/traverse |
| contains() | O(n) | O(n) | Linear scan |
| Memory | Compact | +24 B/element | LL loses to CPU cache |

In practice: use ArrayList in almost all cases. LinkedList is only better when you have an iterator already positioned and do frequent insertion/removal at that exact position—a rare use case. Even "add to front" is typically faster with ArrayDeque.

java
// Almost always correct
List<String> items = new ArrayList<>();

// Only for queue/deque operations
Deque<String> queue = new ArrayDeque<>();

// Rarely correct
List<String> rarelyUseful = new LinkedList<>(); // ← usually a mistake

HashMap Internals

Hashing and Bucketing

HashMap stores entries in an array of buckets. The bucket index is computed as:

java
index = (n - 1) & hash(key)
// where n = number of buckets (always power of 2)
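As a worked sketch (assuming the default 16-bucket table), the index is just the low bits of the mixed hash:

```java
// Reproducing the bucket-index computation by hand for a 16-bucket table.
int n = 16;                      // bucket count, always a power of 2
int h = "hello".hashCode();
int mixed = h ^ (h >>> 16);      // same bit mixing as HashMap.hash()
int index = (n - 1) & mixed;     // keeps only the low 4 bits: 0..15
```

Because n is a power of two, `(n - 1) & hash` is equivalent to `hash % n` but cheaper, which is why HashMap capacities are always powers of two.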

Java's HashMap.hash() applies additional bit mixing to reduce collisions from poor hashCode() implementations:

java
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

Collision Handling: Chaining and Treeification

When multiple keys hash to the same bucket, they form a linked list within that bucket. When a bucket's list exceeds 8 entries AND the total map has ≥64 buckets, the list is converted to a red-black tree (O(log n) lookup instead of O(n)).

text
Bucket array:
[0] → null
[1] → Entry(k1) → Entry(k2) → null       (chain)
[2] → TreeNode(k3) ← TreeNode(k4) → ...  (treeified at ≥8 entries)
[3] → Entry(k5)
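To see that collisions hurt speed but not correctness, here is a sketch using a made-up key type whose hashCode is deliberately constant:

```java
import java.util.HashMap;
import java.util.Map;

// Every BadKey hashes to the same bucket, so all 100 entries share one
// chain (treeified once the thresholds above are met). Lookups still
// succeed because equals() distinguishes keys within the bucket.
record BadKey(int id) {
    @Override public int hashCode() { return 42; } // worst case on purpose
}

Map<BadKey, Integer> m = new HashMap<>();
for (int i = 0; i < 100; i++) m.put(new BadKey(i), i);
// m.get(new BadKey(7)) returns 7 — just more slowly than a well-hashed key
```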

Load Factor and Capacity

Default capacity is 16 buckets, default load factor is 0.75. When size > capacity × loadFactor, the map doubles in size and rehashes all entries.

java
// Default: resizes at 12 entries, then 24, 48, 96...
HashMap<String, Integer> map = new HashMap<>();

// Pre-size to avoid rehashing: initialCapacity = expectedSize / 0.75 + 1
HashMap<String, Integer> optimized = new HashMap<>((int)(expectedSize / 0.75) + 1);

// Example: expecting 1000 entries
HashMap<String, Integer> presized = new HashMap<>(1334);

HashMap vs LinkedHashMap vs TreeMap

java
// HashMap: O(1) get/put, no ordering guarantee
Map<String, User> byId = new HashMap<>();

// LinkedHashMap: O(1) get/put, insertion-order iteration
// Use for: LRU cache, preserving insertion order in JSON output
Map<String, User> ordered = new LinkedHashMap<>();

// LRU cache using LinkedHashMap
Map<String, User> lruCache = new LinkedHashMap<>(100, 0.75f, true) {
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, User> eldest) {
        return size() > 100;
    }
};

// TreeMap: O(log n) get/put, sorted by key natural order or Comparator
// Use for: sorted iteration, range queries, ceiling/floor/subMap
TreeMap<String, User> sorted = new TreeMap<>();
String firstKey = sorted.firstKey();
Map<String, User> range = sorted.subMap("a", "m"); // keys between "a" and "m"
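A quick demonstration of the access-order LRU pattern above, using a capacity of 3 for brevity:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// accessOrder = true moves entries to the back on get(); the override
// evicts the front (least-recently-used) entry once size exceeds 3.
Map<String, Integer> lru = new LinkedHashMap<>(16, 0.75f, true) {
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
        return size() > 3;
    }
};
lru.put("a", 1); lru.put("b", 2); lru.put("c", 3);
lru.get("a");     // touch "a": internal order is now b, c, a
lru.put("d", 4);  // evicts "b", the least-recently-used entry
```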

| Map Type | Get/Put | Ordering | Memory | Use Case |
| --- | --- | --- | --- | --- |
| HashMap | O(1) avg | None | Low | Default choice |
| LinkedHashMap | O(1) avg | Insertion/access | Medium | LRU cache, ordered output |
| TreeMap | O(log n) | Sorted | Medium | Range queries, sorted iteration |
| EnumMap | O(1) | Enum ordinal | Very low | Enum keys only |
| ConcurrentHashMap | O(1) avg | None | Medium+ | Multi-threaded access |

HashSet and its Cousins

HashSet is internally a HashMap<E, Object> in which every value is the same shared dummy object (PRESENT). All of HashMap's performance characteristics therefore apply.

java
// HashSet: O(1) add/contains/remove, no order
Set<String> ids = new HashSet<>();

// LinkedHashSet: O(1), insertion order preserved
Set<String> ordered = new LinkedHashSet<>();

// TreeSet: O(log n), sorted order, supports floor/ceiling/subSet
TreeSet<String> sorted = new TreeSet<>();
String justBelow = sorted.floor("target"); // greatest element ≤ "target"
Set<String> tail = sorted.tailSet("m");    // all elements ≥ "m"

Queue and Deque

java
// ArrayDeque: resizable array, faster than LinkedList for queue operations
// Use as: Stack (push/pop), Queue (offer/poll), Deque
Deque<String> stack = new ArrayDeque<>();
stack.push("first");
String top = stack.pop();

Queue<String> queue = new ArrayDeque<>();
queue.offer("task");
String next = queue.poll(); // null if empty, unlike remove()

// PriorityQueue: min-heap, O(log n) add/poll, O(1) peek
PriorityQueue<Integer> minHeap = new PriorityQueue<>();
PriorityQueue<Integer> maxHeap = new PriorityQueue<>(Comparator.reverseOrder());

// Custom priority
PriorityQueue<Task> byDeadline = new PriorityQueue<>(
    Comparator.comparing(Task::getDeadline)
);
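One subtlety worth demonstrating: a PriorityQueue's iteration order is not sorted; only poll() and peek() honor the priority:

```java
import java.util.List;
import java.util.PriorityQueue;

PriorityQueue<Integer> pq = new PriorityQueue<>(List.of(5, 1, 3));
int first = pq.poll();   // always the smallest remaining element: 1
int second = pq.poll();  // then 3
// Iterating pq directly walks the backing heap array, which is only
// partially ordered — collect and sort if you need sorted output.
```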

Concurrent Collections

Do not use Collections.synchronizedList() or Vector for new code. They synchronize on a single lock, serializing all access.

java
// Wrong: synchronized wrapper still requires external sync for compound ops
List<String> syncList = Collections.synchronizedList(new ArrayList<>());
// This check-then-act is still a race condition:
if (!syncList.contains(item)) {
    syncList.add(item); // ❌ not atomic
}

// CopyOnWriteArrayList: write creates a new array copy
// Read (including iteration) is lock-free — reads never block
// Use when: reads >> writes (observer lists, routing tables)
List<String> listeners = new CopyOnWriteArrayList<>();

// ConcurrentHashMap: fine-grained locking (segments before Java 8;
// Java 8+: CAS plus per-bucket synchronization)
// putIfAbsent, compute, merge are atomic
ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
counts.merge(word, 1, Integer::sum); // atomic increment
counts.computeIfAbsent(key, k -> new ArrayList<>()).add(value);

// ConcurrentSkipListMap: sorted, concurrent, O(log n)
// Use when you need TreeMap + thread safety
ConcurrentNavigableMap<String, User> concurrent = new ConcurrentSkipListMap<>();

// BlockingQueue: producer-consumer with blocking semantics
BlockingQueue<Task> workQueue = new LinkedBlockingQueue<>(1000);
workQueue.put(task);    // blocks if full
Task t = workQueue.take(); // blocks if empty

// ArrayBlockingQueue: bounded, uses single lock
BlockingQueue<String> bounded = new ArrayBlockingQueue<>(100);
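A small sketch of the atomic-update point above: merge() performs the read-modify-write for each key atomically, so a word count never loses increments even with many writer threads (shown single-threaded here for brevity):

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
for (String w : List.of("a", "b", "a")) {
    counts.merge(w, 1, Integer::sum);  // insert 1, or add 1 to the old value
}
// counts is now {a=2, b=1}; a plain HashMap with get-then-put would race
```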

Writing Correct hashCode and equals

HashMap and HashSet correctness depends on proper hashCode / equals contracts:

  1. If a.equals(b) then a.hashCode() == b.hashCode() — required
  2. If a.hashCode() == b.hashCode() then a.equals(b) may be false — collisions are allowed
  3. hashCode() must return the same value for the same object state within a run

java
// Manual (pre-Java 7)
public class Point {
    private final int x, y;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y); // combines fields correctly
    }
}

// Java 16+ record: equals, hashCode, toString generated automatically
record Point(int x, int y) {}

// Common mistake: mutable field in hashCode
// Once put in a HashMap, mutating the key makes it "lost"
public class MutableKey {
    public String value; // ← mutable
    
    @Override
    public int hashCode() { return value.hashCode(); } // ← danger
}
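The lost-key hazard can be demonstrated directly. This sketch fleshes out the MutableKey class above (adding a constructor and equals so two keys with the same value compare equal):

```java
import java.util.HashMap;
import java.util.Map;

class MutableKey {
    String value;
    MutableKey(String value) { this.value = value; }
    @Override public boolean equals(Object o) {
        return o instanceof MutableKey k && value.equals(k.value);
    }
    @Override public int hashCode() { return value.hashCode(); }
}

Map<MutableKey, String> m = new HashMap<>();
MutableKey k = new MutableKey("a");
m.put(k, "v");
k.value = "b";  // hashCode changes; the entry still sits in the old bucket
m.get(k);       // almost certainly null: the lookup searches the wrong bucket
// The entry is still in the map (size() == 1) but unreachable by key.
```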

Utility Methods: Collections and Arrays

java
// Sorting
List<Integer> nums = new ArrayList<>(List.of(3, 1, 4, 1, 5));
Collections.sort(nums);                          // natural order
nums.sort(Comparator.reverseOrder());            // reverse
nums.sort(Comparator.comparingInt(Math::abs));   // custom

// Searching (requires sorted list)
int idx = Collections.binarySearch(nums, 4);    // O(log n)

// Immutable views (modifications throw UnsupportedOperationException)
List<String> fixed = List.of("a", "b", "c");          // Java 9+
Map<String, Integer> fixedMap = Map.of("k1", 1, "k2", 2);
Set<String> fixedSet = Set.copyOf(existingSet);

// Frequency, disjoint, min, max
int count = Collections.frequency(list, "target");
boolean noOverlap = Collections.disjoint(listA, listB);
String max = Collections.max(list, Comparator.naturalOrder());

// Shuffle, fill, copy, reverse
Collections.shuffle(list);
Collections.fill(list, "default");
Collections.reverse(list);
Collections.rotate(list, 3); // rotate right by 3 positions

Performance Decision Guide

text
Need a list?
  → Default: ArrayList
  → Need O(1) add/remove at both ends: ArrayDeque
  → Thread-safe, reads dominate: CopyOnWriteArrayList
  → LinkedList: almost never

Need a set?
  → Default: HashSet
  → Need insertion order: LinkedHashSet
  → Need sorting/range queries: TreeSet

Need a map?
  → Default: HashMap (pre-size if you know count)
  → Need insertion order: LinkedHashMap
  → Need sorted keys/range queries: TreeMap
  → Enum keys: EnumMap
  → Thread-safe: ConcurrentHashMap

Need a queue?
  → Stack: ArrayDeque
  → FIFO queue: ArrayDeque
  → Priority: PriorityQueue
  → Thread-safe producer-consumer: LinkedBlockingQueue or ArrayBlockingQueue

Benchmark: ArrayList vs LinkedList

Measured on JDK 21, Apple M2, 1M element operations:

| Operation | ArrayList | LinkedList | Winner |
| --- | --- | --- | --- |
| Sequential iteration | 4 ms | 18 ms | ArrayList (4.5×) |
| get(random index) | 1 ms | 520 ms | ArrayList (520×) |
| add(last) | 6 ms | 7 ms | Tie |
| add(first) | 980 ms | 3 ms | LinkedList (327×) |
| add(middle via iterator) | 490 ms | 260 ms | LinkedList (1.9×) |
| Memory (1M Integer) | 16 MB | 40 MB | ArrayList (2.5×) |

The add(first) case looks like a LinkedList win, but ArrayDeque.addFirst() takes 4 ms and has better cache behavior—LinkedList almost never wins the benchmark that matters in practice.


Frequently Asked Questions

Q: Why does HashMap allow null keys but Hashtable and ConcurrentHashMap do not?

HashMap was designed to allow null keys and null values for flexibility. Hashtable (older, synchronized) was designed before null safety was a concern and throws NullPointerException to prevent silent errors. ConcurrentHashMap explicitly prohibits null keys and values because get(key) returning null would be ambiguous in a concurrent context: it could mean the key is absent, or the stored value is null, and there is no way to distinguish without a separate containsKey check—which would be a race condition.

Q: When does HashMap become O(n) instead of O(1)?

In two scenarios: (1) if all keys hash to the same bucket (a degenerate case, usually due to a broken hashCode that returns a constant), lookup degrades to O(n); Java 8 mitigates this by treeifying buckets with ≥8 entries, bringing lookup to O(log n). (2) During a resize (rehash), all entries are re-inserted, which is O(n). Pre-sizing (e.g. new HashMap<>((int)(expectedSize / 0.75) + 1), as shown earlier) avoids in-operation resizes entirely.

Q: Is ConcurrentHashMap.size() accurate?

Not guaranteed to be exact under concurrent modification. ConcurrentHashMap.size() returns an estimate that may not reflect concurrent puts or removes happening at the exact same moment. For precise counting under high concurrency, use LongAdder or AtomicLong separately. If you need an exact snapshot, lock the map externally, but this defeats the purpose of using ConcurrentHashMap.
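The LongAdder suggestion above can be sketched like this, counting only first-time insertions (a hypothetical usage, shown single-threaded for brevity):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Maintain an exact element count next to the map: putIfAbsent() returns
// null only for the thread that actually inserted the key, so each key
// is counted exactly once even under contention.
ConcurrentHashMap<String, Boolean> seen = new ConcurrentHashMap<>();
LongAdder count = new LongAdder();
for (String id : new String[] {"x", "y", "x"}) {
    if (seen.putIfAbsent(id, Boolean.TRUE) == null) {
        count.increment();
    }
}
// count.sum() == 2: "x" is only counted the first time
```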

Q: What is the difference between fail-fast and fail-safe iterators?

Fail-fast iterators (used by ArrayList, HashMap, etc.) throw ConcurrentModificationException if the collection is structurally modified during iteration outside of the iterator's own remove() method; they detect this via an internal modCount counter. Fail-safe iterators do not throw: CopyOnWriteArrayList iterates over an immutable snapshot, and ConcurrentHashMap's iterators are weakly consistent. Both may fail to reflect modifications made after iteration started. Use fail-safe collections when you need to modify a collection while another thread iterates it.
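The fail-fast behavior is easy to reproduce in a single thread:

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

List<String> l = new ArrayList<>(List.of("a", "b"));
boolean threw = false;
try {
    for (String s : l) {
        l.add("c");  // structural modification mid-iteration bumps modCount
    }
} catch (ConcurrentModificationException e) {
    threw = true;    // the iterator's next() detects the stale modCount
}
// threw is now true
```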