color_me_red's blog

By color_me_red, history, 7 years ago, In English

With an array of size ≤ 10^5, and with every a[i] in the range 1 ≤ a[i] ≤ 10^5, there are ≤ 10^5 queries.

Each query is of the form [l, r, k], and the output should be the number of elements that appear ≥ k times in a[l]..a[r]. I've thought about this for a while and can't get any idea how to solve it. I've seen a similar problem before but can't find the statement anywhere.

I'm thinking square-root decomposition could work, but have no clue how to proceed. Thanks for any help!

  • Vote: I like it
  • +15
  • Vote: I do not like it

»
7 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

You can do this with Mo's algorithm; you can read more about it here. Note how the very problem they use there to explain the algorithm is similar to this one :)

  • »
    »
    7 years ago, # ^ |
      Vote: I like it +3 Vote: I do not like it

    In this particular case the complexity would be O(n*sqrt(n)*log^2(n)). Kinda sad imo :(

    • »
      »
      »
      7 years ago, # ^ |
      Rev. 2   Vote: I like it +5 Vote: I do not like it

      Oops. I didn't notice k will vary for each query here. Sorry about that.

      You can still do it in O(n*sqrt(n)*log(n)) though. You can maintain a segment tree (or BIT) over the frequencies of elements.

      Suppose the current frequency of the element you want to insert is f. Then you increase the value of the (f+1)-th node in the segment tree by 1, and decrease the value of the f-th node by 1.

      In the same way, you do +1 to the (f-1)-th node and -1 to the f-th node while removing an element.

      While querying, the sum of the range from k to the end is the answer.

      But this takes around 5*10^8 operations which is still pretty costly :/

      Edit: pranet described the O(n*sqrt(n)) solution, no need for any log factor at all.
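
      As a sketch of that idea, here is a Fenwick tree (BIT) used in place of the segment tree, with the same O(log n) per operation; the names and structure are mine, not the commenter's code:

```python
class Fenwick:
    """Point updates / prefix sums over frequency values 1..n."""
    def __init__(self, n):
        self.n = n
        self.t = [0] * (n + 1)

    def update(self, i, delta):
        while i <= self.n:
            self.t[i] += delta
            i += i & -i

    def prefix(self, i):
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & -i
        return s

def add_element(bit, cnt, x):
    f = cnt[x]
    if f:
        bit.update(f, -1)      # leave the old frequency bucket
    cnt[x] = f + 1
    bit.update(f + 1, +1)      # enter the new one

def remove_element(bit, cnt, x):
    f = cnt[x]
    bit.update(f, -1)
    cnt[x] = f - 1
    if f > 1:
        bit.update(f - 1, +1)

def count_at_least(bit, k):
    # number of elements appearing >= k times = suffix sum from k
    return bit.prefix(bit.n) - bit.prefix(k - 1)
```

      Each add/remove inside Mo's costs O(log n) here, which is where the log factor in O(n*sqrt(n)*log(n)) comes from.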

      • »
        »
        »
        »
        7 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Ah, my bad. I initially thought about maintaining a segment tree (on the numbers themselves, not frequencies) with treaps inside. But your approach is much better :)

        • »
          »
          »
          »
          »
          7 years ago, # ^ |
          Rev. 2   Vote: I like it 0 Vote: I do not like it

          I think it's possible to do this in O(N sqrt(N)). Through Mo's you need to maintain two arrays, app[] and kfreq[]: the former indicates, for every element i, how many appearances of that element there are in the current segment; on the other hand, kfreq[i] will hold how many elements j there are such that app[j] >= i. Both array updates can be done in O(1). Then, for a query (L, R, K) you just need to consult kfreq[K].

»
7 years ago, # |
Rev. 2   Vote: I like it +22 Vote: I do not like it

O(n * sqrt(n)) solution

Run MO's. Maintain two arrays, arr[] and cnt[]

arr[i] = number of elements with frequency at least i

cnt[i] = frequency of element i

When you add an element x, ++cnt[x], ++arr[cnt[x]];

When you delete an element, --arr[cnt[x]], --cnt[x];

Answer for a query = arr[k]
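
A minimal runnable sketch of this in Python (the function name `solve_queries` and the 1-indexed `(l, r, k)` query format are my assumptions; the `cnt`/`arr` names follow the comment):

```python
from math import isqrt

def solve_queries(a, queries):
    """Answer (l, r, k) queries offline with Mo's algorithm.
    queries are 1-indexed inclusive ranges."""
    n = len(a)
    block = max(1, isqrt(n))
    cnt = [0] * (max(a) + 2)  # cnt[x]: occurrences of value x in window
    arr = [0] * (n + 2)       # arr[f]: how many values occur >= f times

    def add(x):
        cnt[x] += 1
        arr[cnt[x]] += 1

    def remove(x):
        arr[cnt[x]] -= 1
        cnt[x] -= 1

    # sort queries by (block of l, r) -- standard Mo's ordering
    order = sorted(range(len(queries)),
                   key=lambda i: (queries[i][0] // block, queries[i][1]))
    ans = [0] * len(queries)
    cl, cr = 0, -1            # current window a[cl..cr], initially empty
    for qi in order:
        l, r, k = queries[qi]
        l -= 1; r -= 1        # to 0-indexed inclusive
        while cl > l: cl -= 1; add(a[cl])   # expand before shrinking
        while cr < r: cr += 1; add(a[cr])
        while cl < l: remove(a[cl]); cl += 1
        while cr > r: remove(a[cr]); cr -= 1
        ans[qi] = arr[k] if k <= n else 0
    return ans
```

For example, `solve_queries([1, 1, 2, 1, 2], [(1, 5, 2)])` returns `[2]`, since both 1 and 2 appear at least twice in the whole array.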

  • »
    »
    7 years ago, # ^ |
      Vote: I like it +5 Vote: I do not like it

    Should be arr[k] prob?

    Yet I don't see how you're gonna update everything in arr in constant time. It's >=, not just =.

    • »
      »
      »
      7 years ago, # ^ |
        Vote: I like it +5 Vote: I do not like it

      After you add +1 to cnt[x], there is one more number whose frequency is equal to or larger than cnt[x], so you just ++arr[cnt[x]]. When cnt[x] decreases, that number no longer counts there, so he is doing --arr[cnt[x]] before decreasing.

      Here's an example.

      Suppose whole array is 1 1 2 and first range contains 1 1.

      When you get the first 1, cnt[1] becomes 1 and arr[cnt[1]], i.e. arr[1], becomes 1. When you get the second 1, cnt[1] becomes 2 and arr[2] becomes 1.

      So if you query k = 1 now, you will find answer as arr[1], if k = 2, answer will be arr[2].

      If the next range is 1 2, you have to remove the first 1, so you have to do --cnt[1]. But then arr[2] should no longer be 1; it needs to decrease too, since cnt[1] will no longer be 2. So before doing --cnt[1], you first decrease arr[cnt[1]], i.e. arr[2], so it becomes 0; then you do --cnt[1] and it becomes 1.

      So now if you query k = 2, you will see arr[2] is 0, i.e. no number has frequency >= 2.
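
      The walkthrough above can be checked with a tiny runnable trace (the defaultdicts and the helper names add/remove are mine, just for illustration):

```python
from collections import defaultdict

cnt = defaultdict(int)   # cnt[x]: frequency of x in the current range
arr = defaultdict(int)   # arr[f]: number of values with frequency >= f

def add(x):
    cnt[x] += 1
    arr[cnt[x]] += 1

def remove(x):
    arr[cnt[x]] -= 1
    cnt[x] -= 1

# range [1, 1] of the array 1 1 2: add both 1s
add(1); add(1)
assert arr[1] == 1 and arr[2] == 1   # k=1 -> 1, k=2 -> 1

# move to range [1, 2]: drop the first 1, add the 2
remove(1); add(2)
assert arr[2] == 0                   # no value appears >= 2 times now
assert arr[1] == 2                   # both 1 and 2 appear >= 1 time
```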

»
7 years ago, # |
Rev. 3   Vote: I like it 0 Vote: I do not like it

UPD: I misread. Sorry for the inconvenience.

  • Use Mo's algorithm.
  • Let a[i] be the number of occurrences of element i in the current range.
  • For each transition, you only need to [ add 1 to a[i] ] or [ add -1 to a[i] ]. But you should count the number of elements i with a[i] ≥ k, so you can use a BIT (Binary Indexed Tree).
  • Each BIT operation costs O(log n), and the number of transitions in Mo's algorithm is n*sqrt(n), so the solution takes O(n*sqrt(n)*log(n)) time.

Sorry for my poor English.
  • »
    »
    2 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Actually we don't need to use BIT.

    Let p[i] be the number of x such that a[x] >= i; we can maintain it in O(1) time per update.

    For instance, if we need to add +1 to a[i] , simply making p[a[i]+1]++ is enough.

    For a query the answer will just be p[k] .

    The total time complexity is O(q*sqrt(n)), without the log.

    (Sorry for my bad English >_< )

»
2 years ago, # |
  Vote: I like it -59 Vote: I do not like it

Here I have a Python ONE LINER solution for the same. Assuming each query x is [l, r, k] = [x[0], x[1], x[2]]:

return [sum(1 for y in Counter(arr[x[0]-1:x[1]]).values() if y >= x[2]) for x in query]

  • »
    »
    2 years ago, # ^ |
    Rev. 2   Vote: I like it +68 Vote: I do not like it

    No you don't.

    The only reason this question was asked (4 years ago!) is because it's not clear how to do it with good time complexity. Naive solutions, however short, are not relevant to this discussion.

    • »
      »
      »
      2 years ago, # ^ |
        Vote: I like it -72 Vote: I do not like it

      So what, dude? I am just sharing my solution. That's my personal choice. Why do you have to break your head over that? Just calm down! Ok bro? The post was not about finding an optimal solution, it was about him not being able to solve the problem. I am sorry if I am being rude here. But I am getting so many downvotes from the community for sharing my solution with this world. Quite discouraging!!!

      • »
        »
        »
        »
        2 years ago, # ^ |
          Vote: I like it +11 Vote: I do not like it

        The post was not about finding an optimal solution, it was about him not being able to solve the problem.

        Considering that OP mentions square-root decomposition in the post, they definitely had time complexity in mind.

        Also, the downvotes are in part due to this post being 4 years old.

        • »
          »
          »
          »
          »
          2 years ago, # ^ |
            Vote: I like it -50 Vote: I do not like it

          OP also mentioned in the previous line that "I've thought about this for a while, and can't get any idea how to solve this", which means he didn't know how to solve it but just had an option in his mind. But I didn't do anything wrong by sharing my one-liner code. I thought Pythonic people like me searching for an approach would actually benefit from my code. But instead I got so many downvotes here. Felt really bad for stepping out to help. :(

          • »
            »
            »
            »
            »
            »
            2 years ago, # ^ |
            Rev. 4   Vote: I like it +30 Vote: I do not like it

            The limits of the inputs are given, which is a clear indication that execution speed was the most important aspect. The solution you provided obviously does not meet that requirement. I'm not even mentioning how short, Pythonic code can in some cases do more harm than good.

            This leads to another problem: you failed to provide a good solution, while the post is 4 years old. You can literally see this warning right above the "Post" button:

            This post was published a long time ago. Please refrain from commenting unless you have a really reasonable cause to leave a comment. Necroposting is discouraged by the community.

            I believe if you do the same thing on other sites/forums, you will most likely receive the same responses.