Last year, I published ratings using a contest rating system that I had developed at the end of 2015. Back then, I promised to eventually write in detail about the system's inner workings.
Over the past week, I've cleaned up and optimized the code: it now takes 24 minutes to process the entire history of Codeforces on my small laptop!!!
More importantly, I cleaned up the paper. Please ignore the last sections for now, as they're incomplete, but the main sections that explain how the rating system was derived are now ready! I claim my Elo-R is a more principled extension of Elo/Glicko to the programming contest setting, with nicer properties than the systems that contest sites currently use. You can read it here.
The main work that remains to be done are quantitative empirical studies comparing the properties of the different ratings systems. Since this is just my hobby project, I might not have the time to do all of it alone. If anyone wants to help run experiments, let's chat about it!