Long story short, I've created a class called BigInt purely written in Java, which is pretty neat and outperforms
BigInteger. There are most likely (many) bugs, but since it's pretty basic at the moment, it'll be a great opportunity for the Codeforces community to learn how to do arbitrary-precision arithmetic. ^^
The standard library
BigInteger class in Java is known to be notoriously slow (at least for Java 7 and earlier, and I doubt Java 8 is any different). There are faster alternatives like Gnu Classpath providing support for GMP (which is optimized beyond infinity), but this is probably not a viable option for competitive programming. When it comes to pure Java we also have the
Apint class provided by the Apfloat library and the
LargeInteger class provided by the JScience library, but these two are probably not only clunky for competitive programming usage, they are also not that much faster than
BigInteger (in many cases they're slower, see section comparison below).
As a matter of fact I once had a conversation regarding the Quadratic Sieve with Torbjörn Granlund (the principal author of GMP) where we touched upon the issues of implementing QS in Java since
BigInteger is so slow. Upon this he said something like "Yeah BigInteger is really shitty. I can't understand why no one makes something better. I mean it would only take about 4 to 5 hours". This might be true for Torbjörn, but most likely not for the rest of us. However, the thought of creating my own kick-ass
BigInt class stayed with me since that day, and now I've finally (kind of) done it.
In my experience, in competitive programming or problem solving like Project Euler, we come in contact with big numbers in the form of cumulative results caused by many operations with operands of small magnitude (e.g. 1337 factorial). So my first design choice with this in mind, and in direct contrast to the
BigInteger class, was to make my
BigInt class mutable.
Let's have a look at how to calculate
BigInteger fac = BigInteger.ONE; for(int i = 1; i<=1337; i++) fac = fac.multiply(BigInteger.valueOf(i)); BigInt myFac = new BigInt(1); for(int i = 1; i<=1337; i++) myFac.umul(i); //umul is unsigned multiplication
Not only is the ways of
BigInt less verbose, it's also way faster. Since
BigInteger is immutable it must allocate a new array for the results, convert the
int to a
BigInteger object, perform the multiplication (using a general algorithm), create a new object and return it. The overhead is massive and unnecessary.
BigInt on the other hand just performs the operation directly on the internal array used to store the digits (in base 232) and grows the array (similar to
ArrayList) should it run out of capacity. (There are better algorithms for calculating big factorials, but that's not the point in this example...)
BigInteger is not suitable for many small operations. Is it suitable for operations using larger operands? As it turns out, no it's not.
The following time measurements are a results of some benchmarks run on my shitty 1.65GHz Dual Core computer. ^^
- Parsing: Converting two strings of length 500000 representing numbers in base 10 to the internal representation.
- Add: Adding a 100000 decimal digit number to another equally sized number, 100000 times. (MidBig-size cumulative benchmark.)
- Sub: Same as the addition experiment, but with subtractions.
- MidMul: Multiplying a 300 decimal digit number to a growing product, initially a 300 decimal digit number. (Mid-size cumulative benchmark.)
- TinyMul: The naive straightforward way to calculate 50000! (Small-size cumulative benchmark.)
- BigMul: Multiplication of two 500000 decimal digit numbers. (Big-size benchmark.)
- MidDiv: Dividing a 400000 decimal digit number by a 4000 decimal digit number a 1000 times. (Mid-size cumulative benchmark.)
- BigDiv: Dividing a 400000 decimal digit number by a 200000 decimal digit number. (Big-size benchmark.)
- toString: Converting the internal representation (having in decimal 213237 digits) to a decimal number string.
Test BigInteger BigInt Apint LargeInteger ---------------------------------------------------------- Parsing 31.602s 8.497s 0.049s 14.054s Add 6.394s 3.006s 27.279s 6.322s Sub 5.618s 2.243s 24.35s 6.026s MidMul 2.676s 2.259s 40.433s 3.176s TinyMul 10.683s 1.468s 35.535s 6.879s BigMul 12.332s 0.655s 0.289s 1.266s MidDiv 9.936s 7.558s 297.022s 3239.432s BigDiv 3.647s 2.536s 0.563s 5.335s toString 16.912s 4.614s 0.029s 15.116s
- Parsing: Apint is by far the fastest parser, this because it internally (probably) uses a power of 10 as base. BigInteger is by far the slowest.
- Add: BigInt is fastest due to its mutability. Apint is by far the slowest, since it uses a power of 10 base.
- Sub: The same goes here. Apint is slow when it comes to simple arithmetic operations.
- MidMul: A bit surprisingly Apint is the slowest here, while BigInt is the fastest.
- TinyMul: Not surprisingly BigInt is the fastest (since it's designed around this user case) and Apint the slowest.
- BigMul: BigInteger is by far the slowest since it utilizes a naive O(n2) algorithm, whereas BigInt uses (a parallel) Karatsuba, and Apint probably uses FFT making it the fastest.
- MidDiv: What the heck happened to LargeInteger!? 50 minutes runtime!? Well that's simply unacceptable. BigInt and BigInetger are the fastest here. Apint (and definitely LargeInteger) probably makes unwise algorithm choices.
- BigDiv: LargeInteger is the slowest, but by no means bad. Apint is once again fastest, this is really the area where it shines.
- toString: Both BigInteger and LargeInteger are just slow. LargeInteger also uses for some unknown reason a recursive approach, which could cause Stackoverflow if you don't increase your stack size (terrible design choice indeed...). Once again Apint is the fastest due to its internal representation.
This survey is by no means comprehensive, but it gives a hint and its verdict is clear.
BigInteger is good if you want a bug free (I guess ^^) and foolproof class.
LargeInteger isn't good for anything.
Apint is good if your program frequently parses and prints decimal numbers, it's also as of today the fastest implementation when it comes to multiplication and division of big numbers (although that will probably change soon ;D ). Otherwise
BigInt really is your choice, sure there's not much else to it than the "4 basic operations" and since it's mutable it's a bit risky (please be careful if start putting it into trees etc.).
The code for
BigInt can be found through the link at the top leading to the "Huldra project" (a reference to a forest creature in Scandinavian folklore) and I'll hope you might learn something from it. Most tutorials out there just teach you how to do it in base 10^9 which is a bit noobish.