
afcruzs's blog

By afcruzs, history, 5 years ago, In English,


I'm developing a code analysis project and I'd like to use Codeforces source code from a bunch of submissions.

Is there any way to access the source code of submissions directly via the Codeforces API? Any other ideas are welcome :).



5 years ago, # |

Unfortunately, the API does not have such a method :( To get the submission code, I parsed the HTML. If you use Python, just copy-paste it from my plugin's code; otherwise, good luck :)

5 years ago, # |

You can parse HTML easily using something like Cheerio for Node.js. Look at the code.

If you use another language, libraries like this for parsing HTML are very common.
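For Python, here is a minimal sketch of the same idea using only the standard library's `html.parser`. It assumes the submission source sits inside a `<pre>` element, which is what the XPath in the script further down this thread suggests. The names `PreTextExtractor` and `extract_pre_text` and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class PreTextExtractor(HTMLParser):
    """Collects the text found inside <pre> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0      # how many <pre> tags we are currently inside
        self.chunks = []    # text pieces seen while inside <pre>

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract_pre_text(page_html):
    """Return the concatenated text of every <pre> element in page_html."""
    parser = PreTextExtractor()
    parser.feed(page_html)
    parser.close()
    return "".join(parser.chunks)


# Made-up markup standing in for a downloaded submission page.
sample = '<div id="pageContent"><pre>int main() { return a &lt; b; }</pre></div>'
print(extract_pre_text(sample))
```

A dedicated library with XPath or CSS selectors is usually more convenient when you need to pick out one specific element among many.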

5 years ago, # |

Parsing HTML is not the only option. If you go to the status page of any previous contest ( for example), open your browser's developer tools, and click on a submission id, you will see the submission details and its code. In the network monitor you can see that your browser made a POST request to a CF URL with submissionId and csrf_token passed as POST parameters. The response is a JSON object containing the source code, so you don't need to parse HTML. It is still a bit complicated to automate fully, because the csrf_token expires and you have to authenticate again, but it is doable and looks more like an API (for example, you won't depend on the markup).
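A rough sketch of that request in Python, under stated assumptions: the endpoint URL is not given in this thread, so it is a parameter you would copy from the network monitor, and the `"source"` key in the reply is a guess to be checked against the real response. `fetch_submission_json`, `FakeSession`, and `FakeResponse` are hypothetical names; the fake session only lets the sketch run offline, and a `requests.Session()` could be passed instead.

```python
import json


def fetch_submission_json(session, url, submission_id, csrf_token):
    """POST the two parameters named above and decode the JSON reply.

    `session` is anything with a requests-style post(url, data=...) method
    whose result has a .text attribute, e.g. requests.Session().
    `url` is an assumption: copy the real one from the network monitor.
    """
    payload = {"submissionId": str(submission_id), "csrf_token": csrf_token}
    response = session.post(url, data=payload)
    return json.loads(response.text)


class FakeResponse:
    """Offline stand-in for a requests response."""

    def __init__(self, text):
        self.text = text


class FakeSession:
    """Offline stand-in for requests.Session; records the last request."""

    def __init__(self):
        self.last_request = None

    def post(self, url, data=None):
        self.last_request = (url, data)
        # The "source" key is an assumption; inspect the real reply to confirm.
        return FakeResponse(json.dumps({"source": "print(42)"}))


session = FakeSession()
reply = fetch_submission_json(
    session, "https://example.invalid/endpoint", 12345, "token-from-page"
)
print(reply["source"])
```

Since the token expires, a real script would have to re-read it from a freshly loaded page before retrying a failed request.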

  • »
    5 years ago, # ^ |

    I once wanted to store the code of all my accepted submissions, and I ran into this problem too. It turns out to be easy to get the code without anything fancy like Selenium, or even the Codeforces API. Just fetch the contest status pages. You can get the round id from the problem name (in personal submissions) or from the web address (for any submission). Then use that data to go to the actual submission page and get the code.

    Here is the code I just wrote (Python 2):

    import requests
    from lxml import html

    # Rows of the status table, skipping the header row.
    rows = '//*[@id="pageContent"]/div[4]/div[6]/table/tr[position() > 1]'
    # Check the first and last status pages and set the range accordingly.
    for page_number in range(1, 6):
        # Change the address below accordingly.
        page = requests.get('//' + str(page_number))
        tree = html.fromstring(page.text)
        verdict = tree.xpath(rows + '/td[6]/span/span/text()')
        # Column 4 holds the problem name, from which the contest id is derived.
        contestID = tree.xpath(rows + '/td[4]/a/text()')
        # Despite its name, problemID holds the value used in the /submission/ address.
        problemID = tree.xpath(rows + '/td[1]/a/text()')
        for i in range(len(contestID)):
            # Take the first token of the name and drop its last character.
            contestID[i] = contestID[i].split()[0][:-1]
        for i in range(len(verdict)):
            # Change the verdict to filter on, or drop the check entirely.
            if verdict[i] == 'Accepted':
                solutionPage = requests.get('//' + str(contestID[i]) + '/submission/' + str(problemID[i]))
                tree2 = html.fromstring(solutionPage.text)
                # Join the text nodes of the <pre> element into one string.
                code = ''.join(tree2.xpath('//*[@id="pageContent"]/div[3]/pre/text()'))
                # code now holds the source; you can do anything with it
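The script above derives the contest id by dropping the last character of the first token of the problem name. That would go wrong if the problem index is longer than one character (for example an index such as `F2`). A slightly more defensive sketch uses a regular expression instead. It assumes names of the form "digits, one letter, optional digits, then the title"; the helper name `split_problem_name` and the sample names are made up for illustration.

```python
import re

# Leading digits = contest id; one letter plus optional digits = problem index.
PROBLEM_NAME = re.compile(r"^\s*(\d+)([A-Za-z]\d*)\b")


def split_problem_name(name):
    """Return (contest_id, problem_index), or None if the name does not match."""
    match = PROBLEM_NAME.match(name)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_problem_name("431A - Some Title"))
print(split_problem_name("1000F2 - Another Title"))
```

Returning `None` for names that do not match lets the caller skip such rows instead of requesting a malformed address.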