Hello everyone.
Reference: here
Problem statement:
An array is good if every subarray contains at least one element whose frequency within that subarray is 1.
Good: 1,2,1
Good: 1,2,5,2,4,3,4
Bad: 1,2,3,1,2,3
An O(N^2) solution is easy, but can anyone share a better approach for this question?
Thank you.
For each index i, maintain the right-most index p such that p < i and A[p] = A[i]; similarly, maintain the left-most index q such that q > i and A[q] = A[i]. If the union of all n intervals [p+1, q-1] over all indices i equals [1..n], then the array is good; otherwise it is not.
Wrong approach! Thanks @ScoobyDoobyDoo
[3 2 4 2 4 3 2 4 2 4]: the subarray of size 5 is good, but the one of size 4 is not.
By good subarray I mean that it contains at least one unique element. The whole array needs to be good, so it has to contain a unique element. Find this element; now each subarray containing it is good, so we can recursively repeat the algorithm for the elements to its left and to its right. This is pretty easy to do in $$$O(n \log n)$$$ complexity.
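The recursion above can be sketched directly. This is my own illustrative code, not Anadi's; it recounts frequencies on every call, so it is not yet the $$$O(n \log n)$$$ version:

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>
using namespace std;

// Recursive check: find an element unique on [l, r], then recurse on
// both sides. Frequencies are recounted per call, so if the unique
// element always sits near a border the total work degrades to O(n^2).
bool goodNaive(const vector<int>& a, int l, int r) {
    if (l >= r) return true;                 // empty or one element: trivially good
    unordered_map<int, int> freq;
    for (int i = l; i <= r; ++i) ++freq[a[i]];
    for (int i = l; i <= r; ++i)
        if (freq[a[i]] == 1)                 // unique on [l, r]
            return goodNaive(a, l, i - 1) && goodNaive(a, i + 1, r);
    return false;                            // no unique element: [l, r] itself is bad
}

bool goodNaive(const vector<int>& a) {
    return goodNaive(a, 0, (int)a.size() - 1);
}
```

On the examples from the post, this returns true for 1,2,1 and 1,2,5,2,4,3,4, and false for 1,2,3,1,2,3.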
But won't it be quadratic if the unique element is never near the middle? Like quickselect, whose worst case is O(n^2) when a good pivot is not chosen.
I think it may not be possible to get a guaranteed time complexity of O(n log n). Also, the reference link explicitly says *expected* time complexity, so it may be referring to the complexity of a randomized algorithm. Still, I haven't thought about it much, so a guaranteed O(n log n) bound might be possible.
For the time being, this quicksort-style solution would do.
Yes, but there is a quite well-known trick: search simultaneously from the left and right borders, so each step costs O(smaller part), which sums up to $$$O(n \log n)$$$.
Hi Anadi,
We can maintain a min array and a right-bound array (let's say mn[n] and R[n]). While checking the i-th index, we find the previous element with the same value and the next one with the same value (let's say Prev[i] and Next[i], computed in O(n log n)). Then we update all elements in the range [Prev[i], i] with max(R[j], Next[i] - 1) for each j in the range. In the min array we maintain the left-most index with R[j] = i, i.e., mn[i] = j. So, while checking the i-th element, Prev[i] should be less than or equal to mn[i - 1]; otherwise the subarray [mn[i - 1], i] would not have any unique element. We can do this iteratively in O(n log n) time. I would love for you to check this approach.
I am not sure; I probably don't understand your approach. I cannot see why it wouldn't fail on some easy test like $$$2$$$, $$$3$$$, $$$2$$$, $$$3$$$.
Not sure, but it seems a good implementation can guarantee an O(n log n) solution.
If there exist multiple unique elements, select the one closest to the middle index.
Otherwise, suppose there is only one unique element and it is at the leftmost index; then it is obvious that we get the result just after one recursion, as we can't find any unique element in the next recursive call.
In general we can guarantee that if such a unique element is found at the k-th index, the recursion depth will not exceed k. So if k < log n, the complexity is obviously O(n log n).
I know this seems quite abstract, but maybe there is a mathematical proof of it.
Here's a counterexample where the recursion depth is $$$n/2$$$:
2 1 3 2 4 3 5 4 6 5 7 6 7
After each step, there's a unique value at the second index.
Expanding a little bit more on the following statement:
Yes, but there is a quite well-known trick: search simultaneously from the left and right borders, so each step costs O(smaller part), which sums up to $$$O(n \log n)$$$.
Argument:
You start looking for a unique element from both ends simultaneously: $$$1, n, 2, n - 1, 3, n - 2...$$$.
Suppose you spot the unique element at position $$$i$$$. At this step you did $$$2 \cdot \min(i, n - i)$$$ operations to check, for each element, whether it is unique on this interval or not.
Then you run the algorithm recursively to the left and to the right. Notice that $$$\min(i, n - i) \leq \min(|left|, |right|)$$$, so the cost of each step is proportional to the smaller resulting interval.
Now to compute the complexity, suppose that every time you inspect an element you add a coin on it if it belongs to the minimal interval in that iteration. The number of check operations you will do in the end is proportional to the number of coins (in particular close to twice the number of coins).
Each time a coin is added on top of an element, it means the interval containing that element was merged with a bigger interval, so its size at least doubled. Hence the number of coins $$$k$$$ on any single element must satisfy $$$2^k \leq n$$$, i.e. $$$k \leq \log_2 n$$$. In total the number of coins is therefore $$$O(n \cdot \log n)$$$, which is indeed the complexity of this algorithm.
Notice that this analysis is pretty much the same as the technique of merging smaller to larger (usually on trees), just backwards.
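For concreteness, here is a sketch of the whole algorithm with the both-ends search (the function names are mine, not from the thread):

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>
using namespace std;

// prv[i] / nxt[i]: index of the previous / next occurrence of a[i]
// (-1 / n when there is none). a[i] is unique on [l, r] iff both
// of those indices fall outside [l, r].
bool goodRec(const vector<int>& prv, const vector<int>& nxt, int l, int r) {
    if (l >= r) return true;                  // empty or one element: trivially good
    // Look for a unique element from both ends simultaneously, paying
    // O(min(left part, right part)) operations before recursing.
    for (int k = 0; l + k <= r - k; ++k) {
        int cand[2] = {l + k, r - k};
        for (int t = 0; t < 2; ++t) {
            int i = cand[t];
            if (prv[i] < l && nxt[i] > r)     // unique on [l, r]
                return goodRec(prv, nxt, l, i - 1) && goodRec(prv, nxt, i + 1, r);
        }
    }
    return false;                             // [l, r] has no unique element
}

bool good(const vector<int>& a) {
    int n = a.size();
    vector<int> prv(n, -1), nxt(n, n);
    unordered_map<int, int> last;
    for (int i = 0; i < n; ++i) {             // nearest equal index to the left
        if (last.count(a[i])) prv[i] = last[a[i]];
        last[a[i]] = i;
    }
    last.clear();
    for (int i = n - 1; i >= 0; --i) {        // nearest equal index to the right
        if (last.count(a[i])) nxt[i] = last[a[i]];
        last[a[i]] = i;
    }
    return goodRec(prv, nxt, 0, n - 1);
}
```

The coin argument above bounds the total work of the two-sided scans by $$$O(n \log n)$$$.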
Can somebody explain this part? What I have understood so far (maybe wrongly) is that we search from the left and right borders, i.e., checking the first i elements and the last i elements. But how can we check that the element at position i is unique in the complete array if we have only traversed the first i and last i elements?
Can somebody please clear my doubt?
You can for instance precompute for each element the indices of its previous and next identical elements. Then to check that an element is unique in a subrange, it suffices to check that the precomputed indices fall outside of that subrange.
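A small sketch of that precomputation (the struct and names are illustrative, not from the thread):

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>
using namespace std;

// For every position, precompute the index of the previous and next
// occurrence of the same value (-1 / n when there is none). Then
// a[i] is unique within [l, r] iff both neighbours fall outside [l, r].
struct EqualNeighbours {
    vector<int> prv, nxt;
    EqualNeighbours(const vector<int>& a) {
        int n = a.size();
        prv.assign(n, -1);
        nxt.assign(n, n);
        unordered_map<int, int> last;
        for (int i = 0; i < n; ++i) {          // previous equal index
            if (last.count(a[i])) prv[i] = last[a[i]];
            last[a[i]] = i;
        }
        last.clear();
        for (int i = n - 1; i >= 0; --i) {     // next equal index
            if (last.count(a[i])) nxt[i] = last[a[i]];
            last[a[i]] = i;
        }
    }
    bool uniqueIn(int i, int l, int r) const { return prv[i] < l && nxt[i] > r; }
};
```

With this, each uniqueness query during the search is O(1), and the precomputation is a single O(n) pass (with a hash map).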
Not getting this!
"So the number of coins k must satisfy that 2^k ≤ n, then k ≤ log2n."
Shouldn't it be:
every time a coin is added, we have at least one mirror element in the larger interval,
so $$$k$$$ coins imply $$$2k \leq n$$$, i.e. $$$k \leq n/2$$$?
Your reasoning is not wrong, but doesn't provide a good upper bound. You can do better.
Suppose an element has $$$k$$$ coins. It means it belonged $$$k$$$ times to the smaller side. The last time you added a coin to it, the length of the interval containing it was $$$a_k$$$ and the length of the other side was $$$b_k \geq a_k$$$; but then in the previous step all $$$a_k + b_k$$$ elements were on one side, so when we added the previous coin the size of the interval was $$$a_{k-1} \geq a_k + b_k \geq 2 \cdot a_k$$$. So the sizes of the intervals, as we walk backwards through the coin additions, at least double each time. This argument is enough to prove that the maximum number of coins added to a single element is $$$O(\log_2 n)$$$.
Got it, nice intuition, thanks! It would help if there were some more problems of the same type (in terms of analysing time complexity).
I believe we can create a segment tree which stores all the unique elements for a given range, and if this segment tree contains at least one unique element for every range, the array is good.
There are n * (n - 1) / 2 different ranges.
Actually, let me rephrase it.
We can use divide and conquer to solve this problem. How?
To find the unique elements in the current array: divide the array into two halves and get the set of unique elements in the left and right subarrays.
Now, binary search for each unique element obtained from the left half in the (sorted) right half and vice versa, eliminating those that are found. We will be left with the elements that are unique in the current array.
How do we get the unique elements of the left and right subarrays? By applying the same steps recursively.
While following this approach, if for every such subarray we get at least one unique element, the array is good.
What's the constraint on the array elements? I mean, what is the range of A[i]?
Whatever they may be, we can use coordinate compression to bring them into the range [1, n].
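A minimal sketch of coordinate compression, in case it helps (the function name is mine):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>
using namespace std;

// Coordinate compression: map arbitrary values into [1, n] while
// preserving equality (and relative order), so the value range
// stops mattering for frequency-based arguments.
vector<int> compress(const vector<int>& a) {
    vector<int> vals(a);
    sort(vals.begin(), vals.end());
    vals.erase(unique(vals.begin(), vals.end()), vals.end());  // distinct sorted values
    vector<int> res(a.size());
    for (size_t i = 0; i < a.size(); ++i)
        // rank of a[i] among the distinct values, 1-based
        res[i] = (int)(lower_bound(vals.begin(), vals.end(), a[i]) - vals.begin()) + 1;
    return res;
}
```

For example, {1000000000, 5, 1000000000, -7} compresses to {3, 2, 3, 1}.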
No, that's not my point. I mean, suppose N is on the order of 10^5 and A[i] is in the range [1, 100]; then we can get some idea about how many elements are repeated, and so on.
gotcha.
I misinterpreted the problem as: find the longest subarray that has an element with frequency 1 within the subarray.
We can sort the array in O(N log N) time, and then, if some number has count = 1 (checked by keeping track of the previous number), we can say it is a good array; otherwise it is bad.
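For what it's worth, a tiny sketch of that check (my own code); note that it detects a *globally* unique element, which fits the misread problem rather than the original one:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>
using namespace std;

// After sorting, an element has global count 1 iff it differs from
// both of its sorted neighbours.
bool hasGloballyUnique(vector<int> a) {
    sort(a.begin(), a.end());
    int n = a.size();
    for (int i = 0; i < n; ++i) {
        bool prevSame = i > 0 && a[i] == a[i - 1];
        bool nextSame = i + 1 < n && a[i] == a[i + 1];
        if (!prevSame && !nextSame) return true;  // count == 1
    }
    return false;
}
```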
might be useful: cs stackexchange answer
similar to one of the mentioned solutions.
Hi! Could you share your Google interview experience!!
This question was asked to someone who posted about it on LeetCode, and when I saw it I couldn't come up with anything efficient, so I added a blog here for help.
Create an array of the next occurrence of each element; then, using segment tree descent, we can find the first unique element in [l, r] in O(log n). The recurrence will be T(n) = T(n-1) + O(log n), which is O(n log n).
How are you going to find the first unique element in O(log N)?
In the array of next occurrences, this is equivalent to finding the first element greater than r in the range [l, r]. This can be achieved by building a max segment tree on the occurrence array and descending the tree accordingly.
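A sketch of that descent (the structure and names are mine): it returns the leftmost index in [l, r] whose next occurrence lies beyond r, pruning subtrees whose maximum does not exceed r. Note that on its own this checks only the next occurrence; the index it finds may still have an earlier equal element inside [l, r], so a full uniqueness test also needs the previous-occurrence side:

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>
using namespace std;

// Max segment tree over nxt[], with a descent that returns the leftmost
// index i in [l, r] such that nxt[i] > bound, or -1 if none exists.
struct SegTree {
    int n;
    vector<int> mx;
    SegTree(const vector<int>& a) : n(a.size()), mx(4 * n) { build(1, 0, n - 1, a); }
    void build(int v, int lo, int hi, const vector<int>& a) {
        if (lo == hi) { mx[v] = a[lo]; return; }
        int mid = (lo + hi) / 2;
        build(2 * v, lo, mid, a);
        build(2 * v + 1, mid + 1, hi, a);
        mx[v] = max(mx[2 * v], mx[2 * v + 1]);
    }
    int descend(int v, int lo, int hi, int l, int r, int bound) {
        if (hi < l || r < lo || mx[v] <= bound) return -1;  // out of range or no value > bound
        if (lo == hi) return lo;
        int mid = (lo + hi) / 2;
        int res = descend(2 * v, lo, mid, l, r, bound);     // prefer the left side
        return res != -1 ? res : descend(2 * v + 1, mid + 1, hi, l, r, bound);
    }
    int firstGreater(int l, int r, int bound) { return descend(1, 0, n - 1, l, r, bound); }
};

// nxt[i]: index of the next occurrence of a[i], or n if there is none.
vector<int> nextOccurrence(const vector<int>& a) {
    int n = a.size();
    vector<int> nxt(n, n);
    unordered_map<int, int> last;
    for (int i = n - 1; i >= 0; --i) {
        if (last.count(a[i])) nxt[i] = last[a[i]];
        last[a[i]] = i;
    }
    return nxt;
}

// Convenience wrapper: leftmost index i in [l, r] with nxt[i] > r.
int firstNextBeyond(const vector<int>& a, int l, int r) {
    SegTree st(nextOccurrence(a));
    return st.firstGreater(l, r, r);
}
```

Each query visits O(log n) nodes: pruned subtrees cost O(1), and only the first successful subtree is descended to a leaf.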
We can use a map for storing frequencies, which takes O(n), then sort by the second element (the frequency) in increasing order, and check whether the frequency of the first entry is 1 or not.
O(N log N) solution; please check the link below: https://leetcode.com/playground/JJD5246B
say arr = [1,2,5,2,4,3,4]

number : frequency
1 : 1
2 : 2
3 : 1
4 : 2
5 : 1

index  : 0 1 2 3 4 5 6
array  : 1 2 5 2 4 3 4
freq=1 : ^   ^     ^

If the indices of the elements with frequency 1 are (r1, r2, r3, ..., rn), and there is no repeated element between r1 and r2, between r2 and r3, between r3 and r4, and so on, then there will always be one element with frequency 1 in each subarray.
bool isGoodArray(vector<int> v) {
    unordered_map<int, vector<int>> m_pos; // key --> that key's positions in the array
    for (int i = 0; i < (int)v.size(); ++i)
        m_pos[v[i]].push_back(i);

    /* Collect the positions of the frequency-1 elements.
     * If no element has frequency 1, the subarray which starts at the first
     * element and ends at the last element (that is, the entire array)
     * is not a good array, so return false here. */
    vector<int> uniquePos;
    for (auto& kv : m_pos)
        if (kv.second.size() == 1)
            uniquePos.push_back(kv.second[0]);
    if (uniquePos.empty()) return false;
    sort(uniquePos.begin(), uniquePos.end());

    // Between any two consecutive occurrences of a repeated element
    // there must be the position of some frequency-1 element.
    for (auto& kv : m_pos)
        for (size_t j = 1; j < kv.second.size(); ++j) {
            auto it = upper_bound(uniquePos.begin(), uniquePos.end(), kv.second[j - 1]);
            if (it == uniquePos.end() || *it >= kv.second[j]) return false;
        }
    return true;
}

int main() {
    //int input[] = {1,2,5,2,4,3,4};
    //int input[] = {1,2,3,1,2,3};
    int input[] = {1,2,1};
    vector<int> v(input, input + sizeof(input) / sizeof(int));
    cout << isGoodArray(v);
}
Step 1: Traverse the array and store the occurrences of each element in an unordered map, where the element is the key and it maps to the positions of that element in the given array. This operation takes O(N).
Step 2: If the positions of the elements with frequency 1 are (r1, r2, r3, ..., rn), and there is no repeated element between r1 and r2, between r2 and r3, between r3 and r4, and so on, then there will always be one element with frequency 1 in each subarray.
For example, in the array [1, 2, 5, 2, 4, 3, 4], the elements 1, 5, 3 have frequency 1, and their positions are 0, 2, 5. The other elements, 2 and 4, have frequency > 1. So if neither 2 nor 4 occurs more than once within the subarrays [0 to 2], [2 to 5], [5 to 6], then no bad subarray is possible.
To implement step 2, we can store the positions (indices) of the single-frequency elements in a sorted data structure and then search for the occurrences (indices) of each multiple-frequency element in that sorted array; upper_bound() can be used here. If two consecutive occurrences have the same upper_bound, then a bad subarray is possible. Each search in the sorted structure takes O(log N), so in total: O(N) + O(N log N) ===> O(N log N).
Can there be any O(N) solutions?
Checking for the existence of a unique element itself would probably take $$$O(n \log n)$$$, or at least $$$\omega(n)$$$ time (deterministically), so if you want to avoid randomized solutions (including hash tables), the answer is probably no.
Yeah, I'm facing a problem similar to this; I tried the O(N log N) way but it turned out to be TLE.
link to question
Very well explained editorial
Thank you, got it AC ^_^
You might be finding the frequency each time (for checking if an element is unique), which would give TLE, since you do $$$O(n)$$$ work instead of $$$O(\min(i, n - i))$$$ work. As pointed out in one of the above comments, you'll need to maintain the indices of the previous equal element and the next equal element.
It might also be the case that you're not performing constant factor optimizations (or inadvertently doing too many unnecessary memory accesses — for instance, while using map).
:D thank you so much man, that really helped me.