Median works but mean does not

#	User	Rating
1	ecnerwala	3649
2	Benq	3581
3	orzdevinwang	3570
4	Geothermal	3569
4	cnnfls_csy	3569
6	tourist	3565
7	maroonrk	3531
8	Radewoosh	3521
9	Um_nik	3482
10	jiangly	3468

#	User	Contrib.
1	maomao90	174
2	awoo	164
3	adamant	162
4	TheScrasse	159
5	nor	158
6	maroonrk	156
7	-is-this-fft-	151
8	SecondThread	147
9	orz	146
10	pajenegod	145

OK so there is this problem of CSES, Stick Lengths

Brief of problem: Given some numbers, you have to make all numbers equal. You can do + or — any number to any number in array and the overall cost is the sum of all the number changes you make. Example, 1 4 10. If we think to make all elements to 5 cost is abs(1-5)+abs(4-5)+abs(10-5). We have to minimize this cost.

Solution: You find the median and compute the answer. This means that you make all elements equal to median. For example in 1 4 10 => it is 4. So minimum answer is: abs(1-4)+abs(4-4)+abs(10-4) = 3+0+6 => 10

What I thought: Let all the numbers be on x axis. Now I want all the points to be at the smallest distance from one central point so that the overall sum should be minimum. Then I thought that this must be mean. Because mean is the middle value of all the elements, converging all the points to mean would be correct. But no it's not mean, it's median.

My Question: Why median works and mean does not. Please give me an easy explanation. A mathematical reasoning which I can build on my way of thinking. Because I am not able to understand why it doesn't work for mean.

Counter example: Let's say we have, 1 2 1005 The mean is = (1+2+1005)/3 => 336. So cost is abs(1-336)+abs(2-336)+abs(1005-336) => 1338 The median is 2. So cost is abs(1-2)+abs(2-2)+abs(1005-2) => 1004 So median works and that is how I basically cheated and solved the question but it doesn't satisfy that inner feeling of why.

PS: Please help and please explain in a light language.

Comments (4)

Show archived | Write comment?

Pensive

3 years ago, # |

← Rev. 2 →

Try the test case: 1 1 1 100000000

What do you observe if you use the mean? What do you observe when using the median?

Try solving for some p q r (sorted order). Think of them as points on the coordinate axis. What's the problem asking you to do? It's about finding a point such that sum of absolute differences between each pair of points is minimized. If you choose q to be that point, the answer is (q - p) + (r - q) + (q - q). If you choose some other point, let's say q + x for any value x, then the answer is ((q + x) - p) + (r - (q + x)) + (q + x - q) == (q - p) + (r - q) + x. Hence, the optimal point must be q.

Try solving for some case p q r s and notice that any value x satisfying q <= x <= r works. After solving these two cases, it's easy to show that median works.

→ Reply

case3

3 years ago, # ^ |

Thank you. You saved my CPU from getting blown up. And also I am sorry to ask such a simple thing but maybe after understanding it feels easier.

whohet

This is because the distance between the median and the rest of the values is less than the distance from any other point, While, mean is just a point of average value of the set.

Yes intuitively it happened to be mean for me. But now I got it. Thanks.

case3's blog