Community Participation for Combating Plagiarism in Contests.

What I've observed over the few months that I've been a member here is that most of the people who participate in contests do it simply because they enjoy Competitive Programming; most of us aren't here to just inflate our ratings.
What if we could put in place a system based on 'trusted members' (like trusted participants, but a bit stricter) who volunteer to personally inspect a few randomly assigned anonymous submissions and flag them for things such as code obfuscation, which can easily get through today's Plagiarism Detection System?

Because there are tens of thousands of active members who would satisfy the above criteria, every member would have to scrutinize at most a couple of accepted solutions and so this system should be easy to scale. Because the entries would be anonymous and people would have no way of knowing which users' submissions they'll have to scrutinize next, the chances of concerted mob behavior to 'game' the system would also be very, very low...
What to do with the submissions which have been flagged a certain number of times on a few different parameters is bound to remain an open question for at least a while and it would be great if you could share your suggestions...

Edit:
For scrutinizing submissions, a new section could be added on the official website (like contests and problemset) that is accessible to only the trusted members (most of us), and people would be able to flag submissions only from this section. All submissions are public for viewing, but when they are to be checked for plagiarism, on this section, they'll appear as randomly assigned anonymous entries.
Further, the Plag. Detection System could flag suspicious submissions for things such as copious amounts of comments, and only those submissions would appear on this section.
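
As a rough illustration of the pre-filter described above, here is a minimal sketch that measures how much of a C/C++-style submission consists of comments. The function names and the 0.5 threshold are assumptions made up for this example, not anything the existing detection system is known to use, and the regex will miscount comment markers that appear inside string literals.

```python
import re

# Matches C/C++-style comments: "//" to end of line, or "/* ... */" blocks.
# Assumption: comment markers inside string literals are rare enough to ignore.
COMMENT_RE = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)


def comment_ratio(source: str) -> float:
    """Fraction of non-whitespace characters that sit inside comments."""
    total = sum(1 for ch in source if not ch.isspace())
    if total == 0:
        return 0.0
    in_comments = sum(
        1
        for match in COMMENT_RE.finditer(source)
        for ch in match.group()
        if not ch.isspace()
    )
    return in_comments / total


def is_suspicious(source: str, threshold: float = 0.5) -> bool:
    """Hypothetical rule: queue the submission for human review when at
    least `threshold` of its non-whitespace characters are comments."""
    return comment_ratio(source) >= threshold
```

Only submissions for which `is_suspicious` returns True would then be shown to volunteers, which keeps the review queue short.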

Comments (12)


MohamedAboOkail

Good idea worth burying, Lol

2 years ago, # ^ |

That is what is going to happen most probably anyway...

qpwoeirut


Seems like an interesting idea, but I have a question.

Given a submission, how do you know if it's plagiarized or not? There's definitely some red flags you can look for (if (false), while (false), etc.) but from my understanding there's plenty of things that slip through which are a bit less obvious. Is there some good way of telling if a submission is plagiarized without comparing it to another submission?
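
To make the "red flags" concrete, here is a small sketch that scans a single submission for branches that can never run. The pattern covers `if (false)` and `while (false)` from the comment above; treating a literal `0` condition the same way is an extra assumption, and the function name is made up for this example. It only catches the obvious cases, which is exactly the limitation being discussed.

```python
import re

# Conditions that can never be true. "false" comes from the discussion;
# the literal 0 is an added assumption for this sketch.
DEAD_BRANCH_RE = re.compile(r"\b(if|while)\s*\(\s*(false|0)\s*\)")


def find_red_flags(source: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for every line with a dead branch."""
    flags = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if DEAD_BRANCH_RE.search(line):
            flags.append((lineno, line.strip()))
    return flags
```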

It would be interesting to instead do it with pairs of submissions, based on some edit distance heuristic. Then, rather than checking all pairs, check random pairs, often enough to make cheaters scared of getting caught.
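
A minimal sketch of that idea, under a few assumptions: submissions are plain strings keyed by a hypothetical submission id, whitespace is stripped before comparing, and a pair is reported when its Levenshtein distance is at most 20% of the longer source. The threshold and all names are made up for illustration; a real checker would probably compare tokens rather than characters.

```python
import random


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance, keeping only one DP row in memory."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def normalize(source: str) -> str:
    """Drop all whitespace so reformatting alone does not hide a copy."""
    return "".join(source.split())


def sample_similar_pairs(submissions, n_pairs, max_ratio=0.2, seed=None):
    """Check `n_pairs` random pairs instead of all pairs.

    Returns (id_a, id_b, ratio) for pairs whose edit distance divided by
    the longer length is at most `max_ratio`.
    """
    rng = random.Random(seed)
    ids = sorted(submissions)
    if len(ids) < 2:
        return []
    n_pairs = min(n_pairs, len(ids) * (len(ids) - 1) // 2)
    chosen = set()
    while len(chosen) < n_pairs:
        a, b = rng.sample(ids, 2)
        chosen.add((min(a, b), max(a, b)))
    hits = []
    for a, b in sorted(chosen):
        sa, sb = normalize(submissions[a]), normalize(submissions[b])
        ratio = edit_distance(sa, sb) / (max(len(sa), len(sb)) or 1)
        if ratio <= max_ratio:
            hits.append((a, b, ratio))
    return hits
```

Because the distance computation is quadratic in source length, sampling pairs is what keeps this affordable for a contest with thousands of accepted solutions.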

And of course there's the whole issue of whether the CF team is able/willing to implement this.

There will always be submissions that slip through, but all the plagiarized submissions that I've viewed were easy to discern (I know, Survivorship Bias); but still, considering how many I've viewed and how many blogs are being published on these daily, even if we could deal with only these submissions, I think it would make a significant difference.

bugdone

Because the entries would be anonymous

They wouldn't, all submissions are public; you can find the user of the submission.

You could somewhat mitigate this issue with a weighted score for each volunteer based on his accuracy (the way they do it in CSGO's Overwatch system).
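
One possible shape for such a weighted score is sketched below. This is not how Overwatch actually computes anything (that formula is not known here). It simply assumes each volunteer's past verdicts can later be marked right or wrong, and weights their vote by a smoothed accuracy. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Volunteer:
    correct: int = 0  # past verdicts later confirmed as right
    total: int = 0    # all past verdicts that have been resolved

    @property
    def weight(self) -> float:
        # Laplace-smoothed accuracy: a new volunteer starts at 0.5
        # instead of 0 or 1, and the weight is always positive.
        return (self.correct + 1) / (self.total + 2)


def weighted_flag_score(votes: dict[str, bool],
                        volunteers: dict[str, Volunteer]) -> float:
    """Share of reviewer weight that voted 'plagiarized' (True)."""
    total = sum(volunteers[name].weight for name in votes)
    if total == 0:
        return 0.0
    flagged = sum(volunteers[name].weight
                  for name, is_flag in votes.items() if is_flag)
    return flagged / total
```

With this, one flag from a volunteer who is usually right can outweigh a "clean" vote from someone who is usually wrong, which blunts careless or malicious voting.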

Still, I doubt anything will be done since cheating using obfuscation has happened for years and they don't seem to care.

For scrutinizing submissions, a new section could be added (like contests and problemset), and people would be able to view and flag submissions only from this section.
All submissions are public for viewing, but when they are to be checked for plagiarism, they'll appear as anonymous entries.

You can get the code from the anonymous submission and compare it against all submissions from the contest (or all the submissions on codeforces if the contest is not specified), and you'll get the username.
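
To show how little effort this takes, here is a sketch of the lookup, assuming someone has already collected the public submissions as (username, source) pairs. How they are collected is left out, and the function names are made up for this example.

```python
import hashlib


def fingerprint(source: str) -> str:
    """Hash of the source with all whitespace removed."""
    normalized = "".join(source.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_index(public_submissions):
    """Map fingerprint -> usernames, from an iterable of (username, source)."""
    index = {}
    for username, source in public_submissions:
        index.setdefault(fingerprint(source), []).append(username)
    return index


def deanonymize(anonymous_source: str, index) -> list[str]:
    """Usernames whose public code matches the 'anonymous' entry."""
    return index.get(fingerprint(anonymous_source), [])
```

So anonymity in the review section only holds if the same code is not also publicly attributable elsewhere on the site.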

codemastercpp


While there are thousands of people qualified to be a "trusted member", I doubt nearly enough of them will want to waste time doing that.

High-rated people are not affected by cheaters, and low-rated people are better off practicing than wasting time scrutinizing submissions. So I doubt many people would want to do that.

This is a valid point, but because the process takes at most a minute or two, some people definitely will (just look at the number of blogs being posted on this). And even if just 5-10% of the users decide to participate, this would mean at least a few thousand AC submissions being checked every single day (which is more than enough, as usually only 2-3 contests are held every week).
And it isn't necessary for all submissions to be checked anyway. The more people participate, the better the system will perform.

vivekdhir77

I don't think this is necessary. If a person cheats in contests, he can only raise his rating up to a certain threshold; beyond that point, it becomes tough to increase it any further. He will then leave competitive programming, because his goal was never interest in it but "rating".

Recyclops's blog