What should be the strategy of using automatic system to detect code plagiarism? - Codeforces

hieu_2004's blog

What should be the strategy of using automatic system to detect code plagiarism?

By hieu_2004, history, 4 years ago, In English


Recently, I have seen a lot of blogs about cheaters, so I have been thinking about using an automatic system to catch them.

Currently, the most well-known automatic system for helping to detect plagiarism is MOSS (from Stanford). At first, I asked myself why Codeforces does not use it. However, I looked at the number of participants in each contest, and it can be as high as about 30000. With $$$n = 30000$$$ submissions, there are $$$n(n-1)/2 \approx 4.5 \cdot 10^8$$$ pairs of source code to compare!

Assuming that the system can check $$$10^4$$$ pairs per second, we would need about $$$45000$$$ seconds, which is just over half a day (12.5 hours), roughly the length of the hacking phase of Educational Rounds. But I believe the real throughput is much lower (I have not used MOSS myself).
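To double-check the estimate above, here is a minimal sketch of the arithmetic. The participant count and the throughput are the assumptions stated in the post, not measured values.

```python
from math import comb

participants = 30_000        # upper bound quoted in the post (assumption)
pairs_per_second = 10**4     # throughput assumed in the post, not measured

# Number of unordered pairs of submissions: n * (n - 1) / 2
pairs = comb(participants, 2)

seconds = pairs / pairs_per_second
hours = seconds / 3600

print(f"{pairs} pairs, about {seconds:.0f} seconds, about {hours:.1f} hours")
```

Note that this counts the pairs for a single problem; checking several problems would multiply the total accordingly.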

Is there any tool, if not MOSS, that could run that fast? Is there any approach that brings the complexity below $$$O(n^2 \cdot t)$$$ (where $$$t$$$ is the time to compare one pair of submissions)?


Comments (1)

4 years ago

We could do the same thing on a random contest from each two-month period, with a lower similarity cutoff, so that more people get caught.

Technically, we could build a relation tree of the variables, and then use something like an automaton or suffix sorting to split the submissions into smaller groups, skipping the pairs whose code definitely differs.
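One way to picture the "smaller groups" idea is the rough sketch below. It is not how MOSS or Codeforces actually works, and it uses a simpler technique than the automaton or suffix sorting mentioned above: a plain index of k-character fragments. Identifiers are replaced by a placeholder so that renaming variables changes nothing, and only pairs that share enough fragments become candidates for a full comparison. The function names and the parameters `k` and `min_shared` are made up for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations


def normalize(source: str) -> str:
    """Replace every identifier or keyword with 'v' and drop whitespace."""
    source = re.sub(r"[A-Za-z_]\w*", "v", source)
    return re.sub(r"\s+", "", source)


def fingerprints(source: str, k: int = 8) -> set[str]:
    """All distinct k-character fragments of the normalized source."""
    text = normalize(source)
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def candidate_pairs(submissions: dict[str, str], k: int = 8, min_shared: int = 5):
    """Return the pairs of submissions that share at least min_shared fragments."""
    # Inverted index: fragment -> names of the submissions containing it.
    index = defaultdict(set)
    for name, source in submissions.items():
        for fragment in fingerprints(source, k):
            index[fragment].add(name)

    # Only submissions that meet in some bucket are ever paired up.
    shared = defaultdict(int)
    for names in index.values():
        for a, b in combinations(sorted(names), 2):
            shared[(a, b)] += 1

    return {pair for pair, count in shared.items() if count >= min_shared}


submissions = {
    "a": "int main(){int x=0;for(int i=0;i<n;i++)x+=i;return x;}",
    "b": "int main(){int total=0;for(int j=0;j<n;j++)total+=j;return total;}",
    "c": "print(sum(range(10)))",
}
print(candidate_pairs(submissions))
```

Here "a" and "b" differ only in variable names, so they end up as a candidate pair, while "c" shares no fragment with them and is never compared. In a real contest, fragments coming from common templates would fill huge buckets, so they would probably have to be filtered out first.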

Also, I guess MOSS only does text-based comparison. Does it also compare machine-level code?
