I'm writing this both to help others and to test myself, so I will try to explain everything at a basic level. Any feedback is appreciated.
A Fenwick Tree (a.k.a. Binary Indexed Tree, or BIT) is a fairly common data structure. BITs are used to efficiently answer certain types of range queries, namely prefix queries that run from the first index up to some index x. They also allow quick updates on individual data points.
An example of a range query would be this: "What is the sum of the numbers at indices in [1,x]?"
An example of an update would be this: "Increase the number indexed by x by v."
A BIT can perform both of these operations in O(log N) time, and takes O(N) memory.
So how does this work?
BITs take advantage of the fact that ranges can be broken down into other ranges, and combined quickly. Adding the numbers 1 through 4 to the numbers 5 through 8 is the same as adding the numbers 1 through 8. Basically, if we can precalculate the range query for a certain subset of ranges, we can quickly combine them to answer any [1,x] range query.
The binary number system helps us here. Every positive number N can be represented in about log N binary digits (exactly floor(log2 N) + 1). We can use these digits to construct a tree like so:
The length of the interval that ends at index I is the value of the least significant set bit (the lowest 1 bit) of I in binary. (We exclude zero as its binary representation doesn't have any ones.) For example, the interval ending at 7 (111) has a length of 1, the one ending at 4 (100) has a length of 4, and the one ending at 6 (110) has a length of 2.
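As a small sketch of this rule in Python (the helper names `lowbit` and `interval_ending_at` are my own, chosen just for illustration), the lowest set bit can be isolated with `i & -i`, which gives the length of the interval ending at `i`:

```python
def lowbit(i: int) -> int:
    """Return the value of the least significant set bit of i (assumes i >= 1)."""
    return i & -i


def interval_ending_at(i: int) -> tuple:
    """Return the 1-indexed, inclusive interval (start, end) that ends at index i."""
    return (i - lowbit(i) + 1, i)


# List the interval stored at each index from 1 to 8.
for i in range(1, 9):
    start, end = interval_ending_at(i)
    print(f"{i} ({i:04b}): covers [{start},{end}], length {lowbit(i)}")
```

Running it shows, for instance, that index 6 covers [5,6] and index 8 covers [1,8].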
This gives the tree some interesting properties which make O(log N) querying and updating possible.
- Every index has exactly one interval ending there. This is obvious from the way we constructed the tree.
- Every range [1,x] is constructable from the intervals given, and every range decomposes into at most log N ranges. (This will be proved below)
- Every index is included in at most log N intervals. (This will also be proved below)
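To make these properties concrete, here is a minimal sketch of a sum BIT in Python. The class and method names (`FenwickTree`, `update`, `query`) are my own assumptions, not a fixed API, and the sketch assumes 1-indexed positions and integer values. The query walks through the intervals that make up [1,x], and the update walks through the intervals that contain index x.

```python
class FenwickTree:
    """Sum Fenwick tree over indices 1..n (illustrative sketch)."""

    def __init__(self, n: int) -> None:
        self.n = n
        # tree[i] holds the sum over the interval ending at i; tree[0] is unused.
        self.tree = [0] * (n + 1)

    def update(self, x: int, v: int) -> None:
        """Increase the number indexed by x by v."""
        while x <= self.n:
            self.tree[x] += v
            x += x & -x  # move to the next interval that contains the original index

    def query(self, x: int) -> int:
        """Return the sum of the numbers indexed from 1 to x."""
        total = 0
        while x > 0:
            total += self.tree[x]
            x -= x & -x  # jump to the end of the preceding interval
        return total


bit = FenwickTree(8)
for index, value in enumerate([5, 2, 9, 1, 7, 3, 4, 6], start=1):
    bit.update(index, value)

print(bit.query(4))  # 5 + 2 + 9 + 1 = 17
bit.update(3, 10)
print(bit.query(4))  # 27
```

Both loops change x by its lowest set bit on every step, which is where the O(log N) bound comes from.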
Proof that every range [1,x] is constructable from the intervals given.
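A range query can be defined recursively: [1,x] = [1,a-1] + [a,x], where [a,x] is the interval ending at x, so a = x - (least significant set bit of x) + 1. Values of x which are powers of two are base cases, as the interval ending there is exactly [1,x] and is already precalculated. a is never below 1, because the least significant set bit of x is at most x: subtracting it from x leaves either zero (a base case, where a = 1) or a smaller positive number, to which the same argument applies.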
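The recursion can be traced with a short Python sketch. The function name `decompose` is hypothetical, and it is written as a loop rather than a recursive call. It lists the stored intervals that make up [1,x]:

```python
def decompose(x: int) -> list:
    """Split [1,x] into stored intervals, following [1,x] = [1,a-1] + [a,x]."""
    intervals = []
    while x > 0:
        a = x - (x & -x) + 1  # start of the interval ending at x
        intervals.append((a, x))
        x = a - 1  # continue with the remaining prefix [1,a-1]
    return intervals[::-1]  # reverse so the intervals read left to right


print(decompose(7))   # [(1, 4), (5, 6), (7, 7)]
print(decompose(8))   # [(1, 8)]
print(decompose(13))  # [(1, 8), (9, 12), (13, 13)]
```

Note that a power of two such as 8 finishes in a single step, matching the base case in the proof.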