### slycelote's blog

By slycelote, 8 years ago,

Update: Turns out that both C# and Mono actually have the same problem as Java. I don't know C# though, so it would be great if someone reviewed the code of the benchmark (bottom of the post).

Recently I noticed that switching from HashSet to TreeSet in a Java program increased performance by several times. Since it didn't make any sense, I decided to investigate the issue, and here is what I found.

In my program I extracted an arbitrary member from a set in the following manner:

Integer elem = set.iterator().next();
set.remove(elem);


It turns out that implementation of HashSet.iterator() method in Java is poor: it always scans the bucket table from the very beginning. An excerpt from JDK code:

        HashIterator() {
expectedModCount = modCount;
if (size > 0) { // advance to first entry
Entry[] t = table;
while (index < t.length && (next = t[index++]) == null)
;
}
}


In my situation the total number of buckets was 218, and each time a new element was extracted some part of the table had to be traversed to find the first non-empty bucket... you get the idea. As far as I understand, the implementation is the same in Java 6, 7 and 8.

I did benchmarks for C++ and C# as well.

• Visual C++ 2012, Visual C# 2012 and Mono 2.10 all have the same problem.
• g++ 4.6.3 doesn't. Looking at the code, they maintain the index of the first non-empty bucket. In certain situations (e.g. if we have a long sequence of insertions followed by a long sequence of removals) it guarantees that each bucket will be traversed only once.

The code of benchmarks is below. What puzzles me further about Java version is that if HashSet is tested after TreeSet, the performance is even worse than when tested in isolation. I'm using OpenJDK 1.7 if that matters.

import java.util.*;

public class TestSets {
public static void main(String[] args) {
testSet(new TreeSet<Integer>());
testSet(new HashSet<Integer>());
}

static void testSet(Set<Integer> set) {
long start = System.currentTimeMillis();

final int N = 100000;
for (int i = 0; i < N; ++i)
for (int i = 0; i < N; ++i) {
Integer elem = set.iterator().next();
set.remove(elem);
}

long end = System.currentTimeMillis();
double elapsed = (end - start) * 0.001;
System.out.println("Elapsed time: " + elapsed + "s");
}
}

#include <chrono>
#include <iostream>
#include <set>
#include <unordered_set>

template<typename Set>
void testSet() {
auto start = std::chrono::system_clock::now();

Set set;
const int N = 600000;
for (int i = 0; i < N; ++i)
set.insert(i);

for (int i = 0; i < N; ++i) {
auto elem = *set.begin();
set.erase(elem);
}

auto end = std::chrono::system_clock::now();
std::chrono::duration<double> elapsed_seconds = end - start;
std::cerr << "Elapsed time: " << elapsed_seconds.count() << "s\n";
}

int main() {
testSet< std::set<int> >();
testSet< std::unordered_set<int> >();
return 0;
}

using System;
using System.Collections.Generic;

class TestSets
{
static void testSet (ISet<int> set)
{
var start = System.DateTime.Now;

const int N = 50000;
for (int i = 0; i < N; ++i)
for (int i = 0; i < N; ++i) {
var e = set.GetEnumerator();
e.MoveNext();
var elem = e.Current;
set.Remove(elem);
}

var end = System.DateTime.Now;
var elapsed = end - start;
System.Console.Out.WriteLine ("Elapsed time: " + elapsed);
}

public static void Main (string[] args)
{
testSet (new SortedSet<int> ());
testSet (new HashSet<int> ());
}
}


• +35

 » 8 years ago, # | ← Rev. 2 →   0 Scanning the bucket form the beginning is just… Genious! When I heard that in GCC list.size() had been implemented in O(N) (now it's changed) I thought there could be nothing more strange, but that one successfully overpassed it. Can anyone explain me what could be a purpose of that?
 » 8 years ago, # |   0 Switching to java.util.LinkedHashSet helps also.
 » 8 years ago, # |   0 Update: Turns out that both C# and Mono actually have the same problem as Java. I don't know C# though, so it would be great if someone reviewed the code of the benchmark (bottom of the post).
 » 8 years ago, # |   0 Iterator is for iterating. How would you iterate faster not storing O(N) additional memory?
•  » » 8 years ago, # ^ |   0 Performance of the first iteration step can be improved, see note on g++.
 » 8 years ago, # |   0 Why do you need to remove the first (actually random) element of a hashset? It's a rhetorical question, because you don't :)
•  » » 8 years ago, # ^ |   +8 Here is a 'real' example where I needed it: 6118155 (Not the clearest implementation of course, but still...)