Why's my Convex Hull Trick so Slow? - Codeforces

minimario's blog

Why's my Convex Hull Trick so Slow?

By minimario, 7 years ago

Hi all,

I was solving this problem: Link. It's fairly easy, so if you don't really want to solve it, I'll go ahead and write the dp recurrence here:

dp[i][k] = max over j < i of (dp[j][k - 1] + p[j] * (p[i] - p[j]))

A pretty obvious Convex Hull Trick (CHT) problem. But I'm getting TLE on the last test case. I opened some AC codes, and I didn't see much difference between our codes, so I'm wondering what makes mine so slow and the other one so fast!
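For readers who want to see why the recurrence fits CHT: expanding it gives p[j] * p[i] + (dp[j][k - 1] - p[j] * p[j]), so each j contributes a line with slope p[j] and intercept dp[j][k - 1] - p[j]^2, queried at x = p[i]. Below is a minimal sketch of a monotone CHT for maximum queries. It is not the code from the post; the name `MonotoneCHT` and its members are made up for illustration, and it assumes that slopes are added in non-decreasing order, that queries are non-decreasing (which would hold if p is a prefix sum of non-negative values), and that all evaluated values fit in 64 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Monotone convex hull trick for maximum queries (illustrative sketch).
// Assumes slopes are added in non-decreasing order and query points are
// non-decreasing. MonotoneCHT and its members are hypothetical names.
struct MonotoneCHT {
    struct Line {
        std::int64_t m, b;  // y = m * x + b
    };
    std::vector<Line> hull;  // upper envelope, slopes strictly increasing
    std::size_t ptr = 0;     // index of the line that won the last query

    static std::int64_t eval(const Line& l, std::int64_t x) { return l.m * x + l.b; }

    // With a.m < b.m < c.m: true if c overtakes a no later than b does,
    // so b is never strictly the maximum and can be dropped.
    static bool useless(const Line& a, const Line& b, const Line& c) {
        // __int128 is a GCC/Clang extension, used here so the
        // cross-multiplication cannot overflow.
        return static_cast<__int128>(a.b - c.b) * (b.m - a.m) <=
               static_cast<__int128>(a.b - b.b) * (c.m - a.m);
    }

    void add(std::int64_t m, std::int64_t b) {
        assert(hull.empty() || hull.back().m <= m);
        if (!hull.empty() && hull.back().m == m) {
            if (hull.back().b >= b) return;  // existing line dominates the new one
            hull.pop_back();                 // new line dominates the old one
        }
        const Line l{m, b};
        while (hull.size() >= 2 && useless(hull[hull.size() - 2], hull.back(), l)) {
            hull.pop_back();
        }
        hull.push_back(l);
        if (ptr >= hull.size()) ptr = hull.size() - 1;  // keep the pointer valid
    }

    std::int64_t query(std::int64_t x) {
        assert(!hull.empty());
        // Queries are non-decreasing, so the best line index never moves left.
        while (ptr + 1 < hull.size() && eval(hull[ptr + 1], x) >= eval(hull[ptr], x)) {
            ++ptr;
        }
        return eval(hull[ptr], x);
    }
};
```

Under those assumptions, one layer k of the DP would call `query(p[i])` to get dp[i][k] and `add(p[j], dp[j][k - 1] - p[j] * p[j])` for each candidate j, giving amortized O(1) per operation.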

If anyone has some insights, please comment (or PM me) with details!

(P.S.: If anyone has a fairly fast implementation of CHT that would like to challenge the 436 ms, go ahead and submit it :))

Thanks so much,

-minimario

Tags

cht, optimization, apio

Comments (3)

7 years ago (+65)

It seems you have fallen victim to locality of reference (wiki).

I swapped the two indices of the array `last` in your code, and now it passes in 513 ms.

7 years ago (0)

orz, thanks so much! Needless to say, you have my upvote :)

So from my understanding, since I am accessing last[k][1], last[k][2], ..., last[k][k], the computer "mindlessly" expects me to access last[k][stuff] every time, whereas accessing last[1][k], last[2][k], ... causes a lot of cache misses.

So then, if I'm right, why didn't much change when I tried the same thing for the "dp" array? It seems to be accessing dp[1][k%2], dp[2][k%2], ..., and I hope that based on what I wrote before, dp[k%2][1], dp[k%2][2], ..., would be a faster option.
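One way to make the comparison concrete is to compute how far apart consecutive accesses land in memory. C++ stores a 2D array in row-major order, so stepping the last index moves by one element while stepping the first index moves by a whole row. The sketch below is only illustrative: `gap_bytes` is a hypothetical helper, and `N` and `K` are assumed sizes, since the post does not give the real dimensions of `last`.

```cpp
#include <cassert>

// Hypothetical dimensions, chosen only for illustration.
constexpr long long N = 100005;  // number of positions i
constexpr long long K = 205;     // number of layers k

// Byte distance between a[r][c] and a[r + dr][c + dc] in a row-major
// array of int that has `cols` columns.
long long gap_bytes(long long cols, long long dr, long long dc) {
    assert(cols > 0);
    return (dr * cols + dc) * static_cast<long long>(sizeof(int));
}

// last[k][i] -> last[k][i + 1]  (array shape [K][N]): gap_bytes(N, 0, 1)
// last[i][k] -> last[i + 1][k]  (array shape [N][K]): gap_bytes(K, 1, 0)
// dp[i][k%2] -> dp[i + 1][k%2]  (array shape [N][2]): gap_bytes(2, 1, 0)
// dp[k%2][i] -> dp[k%2][i + 1]  (array shape [2][N]): gap_bytes(N, 0, 1)
```

With these assumed sizes, the step for last[i][k] is K ints, while the step for dp[i][k%2] is only 2 ints. So even in the "wrong" order, consecutive dp accesses stay close together, which may be part of why swapping the dp indices changed so little.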

7 years ago (+15), rev. 5

Perhaps "mindlessly" is not the right word.

My (admittedly limited) understanding is that the processor loads a contiguous chunk of memory from RAM into cache (which is uber fast), so when you access last[k][2] after last[k][1], it is most likely already in cache and the access is fast.
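The two traversal orders being discussed can be sketched as follows. Both functions read exactly the same elements and return the same sum; they differ only in whether consecutive reads are adjacent in memory. The function names and the flat `std::vector` layout are assumptions made for illustration, not the code from the post.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// `a` holds a rows x cols matrix in row-major order, the same layout
// C++ uses for int a[rows][cols].

// Inner loop walks along a row: consecutive reads are adjacent in memory.
long long sum_row_by_row(const std::vector<int>& a, std::size_t rows, std::size_t cols) {
    assert(a.size() == rows * cols);
    long long s = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            s += a[r * cols + c];
    return s;
}

// Inner loop walks down a column: consecutive reads are `cols` elements apart.
long long sum_col_by_col(const std::vector<int>& a, std::size_t rows, std::size_t cols) {
    assert(a.size() == rows * cols);
    long long s = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            s += a[r * cols + c];
    return s;
}
```

On large matrices the row-by-row version usually runs faster for the reason described above, though the actual gap depends on the hardware and the array size.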

But your question still remains valid, and to resolve that, note that sizeof(dp) = 2 * 100005 * 4 bytes < 1 KB, whereas modern processors have L1 cache (the super fast kind) on the order of 32 KB, so my guess is that the processor loads the entire array into cache, so it doesn't miss at all.

EDIT: I can't multiply for shit, it's < 1 MB, which is larger than size of L1 cache, but still in the right order of magnitude for total cache size.
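The corrected arithmetic can be checked directly. This sketch assumes 4-byte elements, as in the comment, and uses the 32 KB L1 figure quoted there; `DP_BYTES`, `L1_BYTES`, and `fits` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Size of a 2 x 100005 array of 4-byte integers, as in the comment above.
constexpr std::size_t DP_BYTES = 2 * 100005 * sizeof(std::int32_t);

// L1 size quoted in the comment; real sizes vary by processor.
constexpr std::size_t L1_BYTES = 32 * 1024;

// True if an array of `array_bytes` could sit entirely in a cache of `cache_bytes`.
bool fits(std::size_t array_bytes, std::size_t cache_bytes) {
    assert(cache_bytes > 0);
    return array_bytes <= cache_bytes;
}
```

Here `DP_BYTES` is 800040 bytes, roughly 781 KiB: under 1 MB, but well over the quoted L1 size, which matches the EDIT.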
