Fastest Scanner in Java

Revision en1, by wrick, 2015-10-22 08:32:45

In my previous post, I discussed a fast Scanner for Scala.

To benchmark it, I wrote the fastest possible Scanner I could in Java:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

/**
 * Hand built using a char buffer
 */
public class ArrayBufferScanner extends AbstractScanner {
  private char[] buffer = new char[1<<4];
  private int pos = 0;   // if negative, nothing else to read

  private BufferedReader reader;

  public ArrayBufferScanner(BufferedReader reader) {
    super(reader);
    this.reader = reader;
  }

  @Override
  public boolean hasNext() {
    return pos != -1;
  }

  private void loadBuffer() throws IOException {
    pos = 0;
    while(true) {
      int i = reader.read();
      if (i == -1) {
        pos = -1;
        break;
      }
      char c = (char) i;
      if (c != ' ' && c != '\n' && c != '\t' && c != '\r' && c != '\f') {
        if (pos == buffer.length) {
          buffer = Arrays.copyOf(buffer, 2 * pos);
        }
        buffer[pos++] = c;
      } else if (pos != 0) {
        break;
      }
    }
  }

  @Override
  public String next() {
    try {
      loadBuffer();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return String.copyValueOf(buffer, 0, pos);
  }

  @Override
  public String nextLine() {
    try {
      return reader.readLine();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  @Override
  public int nextInt() {
    return Integer.parseInt(next());
  }
}

Is this the best I can do?

It is barely faster than Java's StreamTokenizer (NOT StringTokenizer) inspite of being much simpler than it: http://docs.oracle.com/javase/8/docs/api/java/io/StreamTokenizer.html

Java source: https://github.com/pathikrit/better-files/blob/master/benchmarks/src/main/java/better/files/ArrayBufferScanner.java

Other Scanners: https://github.com/pathikrit/better-files/blob/master/benchmarks/src/main/scala/better/files/Scanners.scala

Benchmark results: https://github.com/pathikrit/better-files/tree/master/benchmarks

Tags java, scala, scanner, input reading

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en3 English wrick 2015-10-25 10:15:43 245
en2 English wrick 2015-10-25 10:08:36 625
en1 English wrick 2015-10-22 08:32:45 2270 Initial revision (published)