Weramajstor's blog

By Weramajstor, history, 8 years ago, In English

Hello, I was just wondering if someone has seen some of the mainstream algorithms(let's say not completely trivial ones) in real computer applications like:Binary indexed tree,Segment tree,Convex Hull,Meet in the midde,etc.

Is search of strings when we press ctrl+F implemented with KMP string finding algorithm,questions like that? I find this topic interesting so I'm asking,obviously.

  • Vote: I like it
  • +18
  • Vote: I do not like it

»
8 years ago, # |
  Vote: I like it +4 Vote: I do not like it

Aho-Corasick string matching algorithm is used as the basis of the linux grep command.

»
8 years ago, # |
Rev. 2   Vote: I like it +1 Vote: I do not like it

If you use any kind of Unix, the GNU grep command uses the well known Boyer-Moore pattern matching algorithm.

Source: https://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html

»
8 years ago, # |
  Vote: I like it +74 Vote: I do not like it

Yes, there are some.

Let's start with the basics. You can find heaps and linked lists in very different pieces of software, starting with Linux kernel (memory management as an example). Binary search appears in many places; I'm pretty sure that some databases use it underneath. I wouldn't even mention sorting.

When spatial problems arise, R-trees do the job. "OK Google, where is the nearest gas station?" There are millions of them, and you should find a closest one within a few milliseconds. Okay, R-tree is not really a competitive programming thing; how about Dijkstra's algorithm? Routing algorithms for road graphs are based on it. Of course it is not basic Dijkstra's under the hood, it is tweaked in several unbelievable ways, but be sure: when you ask your GPS navigator to build a route from A to B, it makes a call to this well-known algorithm. I do know what I say, I deal with routing algorithms for road graphs at work.

Want something less trivial? Sure. One very popular website with millions of users stores some entries within a treap. Several-hundred-million-nodes treap, distributed among many servers, but still. How about flows? Here we go: a problem of optimal graph partitioning (splitting a graph into several parts such that there are as few as possible edges between parts) is NP-complete but there are powerful heuristics which use network flows. ilyaraz can say more, he worked in a team with Andrew Goldberg where they developed a state-of-the-art algorithm which does the thing. The problem of graph partitioning arises in some spatial tasks, yet again.

String algorithms and suffix structures are widely used in bioinformatics. I'm not a specialist in this topic but have heard that some well-known (and even more unheard) algorithms were developed especially for analyzing ACGT-strings. Indeed, you have strings of unimaginable length on an alphabet of four, and you need match substrings and compute some other stuff, I don't really know much. Have you already come up with suffix-something?

There are more things which are widely used in practice and less known in olympiads. Real world give as some constraints and some reliefs. On the one hand, there are no "bad tests" for your solution. Most data conform to some distribution, and algorithms exploit it. No one makes an anti-hash test for you, no one creates an instance of a problem where your algorithm's running time degrades to O(n2), that's why heuristic algorithms are widely used. On the other hand, you have to deal with several lines of caching (L1-2-3, RAM, HDD, other data centers...). This gave birth to B-trees and a very large number of cache-oblivious algorithms.

Computer graphics involves many geometric algorithms. Again, I'm not a specialist in this field so know only a little. Ray tracing algorithm, for instance, is used in 3D rendering software. Delaunay triangulation is also "somewhere here", though now I cannot remember any exact cases where it's used.

I'm pretty sure that this is only the tip of the iceberg. I don't know any examples of using segment trees -- probably, scheduling? Also, there is a bunch of math algorithms, both number theory and calculus, used in cryptography and computational mathematics.

Whatever, it is unlikely that you will have to implement any of these at work: (almost) everything you might ever need is already written, tested, documented and open-sourced. Most likely it will be the part of boost.

(In the last sentence I'm kidding of course. I'll never forget how I tried to find a simple Primal-Dual algorithm (min-weight max-matching in general graph) readable implementation in C++, failed, found one in Python and spent a whole night translating it. That code still visits me in nightmares.)