Fastest way to know if pair of integers are present in input

→ Pay attention

Before contest
Codeforces Round 940 (Div. 2) and CodeCraft-23
3 days
Register now »

*has extra registration

→ Top rated

#	User	Rating
1	ecnerwala	3648
2	Benq	3580
3	orzdevinwang	3570
4	cnnfls_csy	3569
5	Geothermal	3568
6	tourist	3565
7	maroonrk	3530
8	Radewoosh	3520
9	Um_nik	3481
10	jiangly	3467

Countries | Cities | Organizations

View all →

→ Top contributors

#	User	Contrib.
1	maomao90	174
2	adamant	164
2	awoo	164
4	TheScrasse	160
5	nor	159
6	maroonrk	156
7	-is-this-fft-	150
8	SecondThread	147
9	orz	146
10	pajenegod	145

View all →

→ Find user

→ Recent actions

Detailed →

Mike_Mirzayanov_Is_Pussy's blog

Fastest way to know if pair of integers are present in input

By Mike_Mirzayanov_Is_Pussy, 3 years ago, In English

Suppose we are given $$$n$$$ integers (anywhere from $$$1$$$ to $$$1e5$$$). We are then asked query weather if a number $$$x$$$ was given in input. Fastest way to solve this is using frequency array say $$$a$$$. We can do $$$++a[x]$$$ during input and check if $$$a[x]!=0$$$ while answering the query. Each query here can be answered in $$$O(1)$$$.

Now suppose I am given $$$n$$$ pairs of integers instead, let's say $$$n=5$$$ (their value can be from $$$1$$$ to $$$1e5$$$) and input are $$$(1,2),(10,11),(14,14),(1000,1320),(456,321)$$$. Now I am asked if $$$(10,11)$$$ was in the input, what is the fastest way to do it ? We can use map of pairs and we can answer each query in $$$O(logn)$$$. I wonder if there is any $$$O(1)$$$ method ?

Comments (13)

Write comment?

LanceTheDragonTrainer

3 years ago, # |

← Rev. 3 →

You can "lock" one of the dimensions and the problem becomes the same as the 1d version.

This means, you can use a table whose index represents the value of the left value in your pair. Then for each index in the table, you keep a hashmap which stores the right value in your pair.

For example, for (1, 2), (1, 3), (1, 4), (2, 3), (4, 5)

Our table will look like the following:

0 -> ()
1 -> (2, 3, 4)
2 -> (3)
3 -> ()
4 -> (5)
5 -> ()

So, in order to perform lookup for a query $$$(L, R)$$$, we just need to access index $$$L$$$ of the table and query the hashmap in there for value $$$R$$$. Since both data structures support $$$\mathcal{O}(1)$$$ amortized lookup, you have an $$$\mathcal{O}(1)$$$ lookup operation.

→ Reply

Mike_Mirzayanov_Is_Pussy

3 years ago, # ^ |

In worst case it can lead to $$$O(n)$$$ and unordered_map won't work for sure since it can be hacked i think.

→ Reply

LanceTheDragonTrainer

3 years ago, # ^ |

← Rev. 2 →

There are ways to avoid being hacked when you are using HashMap/unordered_map. Also, it is not that easy to make HashMap/unordered_map exhibit $$$\mathcal{O}(n)$$$ behaviour (you really have to try to make a adversarial input). In practice, these data structures (more often than not) stay true to their advertisement of amortized $$$\mathcal{O}(1)$$$ lookup.

→ Reply

LanceTheDragonTrainer

3 years ago, # ^ |

If you dislike amortized guarantees, I have another solution that will fulfill your requirement of $$$\mathcal{O}(1)$$$ answer to queries albeit incurring a slight increase in your overall time complexity.

If you are able to, sort both queries and input pairs by increasing order of left value then by increasing order of right value.

Thereafter place a pointer to the queries and the input pairs. Let's call the pointer the the queries pointerQ and the pointer to the input pairs pointerI.

Let the left and right values of the current query be valueQL and valueQR respectively.

Let the left and right values of the current input pair be valueIL and valueIR respectively.

If valueQL < valueIL, the answer for this query is NO (and pointerQ is incremented). Else if valueQL > valueIL, increment pointerI.

Else valueQL == valueIL holds true, so we compare by right value as follows:

If valueQR < valueIR, the answer for this query is NO (and pointerQ is incremented). Else if valueQR > valueIR, increment pointerI.

The only case possible now is that valueQR == valueIR. So the answer to this query is YES (and pointerQ is incremented).

→ Reply

Wind_Eagle

3 years ago, # ^ |

You can write your own hash for hashmap to store pairs.

→ Reply

Dan4Life

3 years ago, # |

← Rev. 5 →

+19

Assuming the biggest number is $$$x$$$, you can represent all pairs of numbers in base $$$x+1$$$ e.g $$$(5,6) = 5*7+6$$$(in base 7 since biggest number is 6) etc. Then the problem degenerates to the 1 integer case. This works for triples and so on (e.g $$$(5,4,6) = 5*(x+1)*(x+1)+4*(x+1)+6$$$ ($$$x=6$$$ in this case)).

Also, $$$x$$$ has to be small enough to fit in array limits if you want to store the converted number in an array) otherwise unordered map(with custom hashing, refer here) has to be used instead of an array to get that $$$O(1)$$$ time complexity.

Not sure if this works directly for negative numbers tho(too noob at base conversion lol). For negative numbers, say the minimum negative number is $$$-y$$$, just add offset of $$$y$$$ to all numbers then apply base conversion then hash. To look for a pair, add the offset to the pair, convert to the right base, then search.

In the problem given, the max number is $$$1e5$$$, so its enough to represent every pair of integers in base $$$1e5+1$$$ where the first number is the tens digit and the second one is the units digit

→ Reply

Mike_Mirzayanov_Is_Pussy

3 years ago, # ^ |

In the problem given, the max number is $$$1e5$$$, so its enough to represent every pair of integers in base $$$1e5+1$$$ where the first number is the tens digit and the second one is the units digit

counterexample: If input is $$$(1e5,1)$$$, then in base $$$1e5+1$$$ it would be $$$1e5*(1e5+1)+1$$$ i.e greater than $$$1e10$$$ and we can't store it in array. So I think your solution won't work for given constraints.

→ Reply

Dan4Life

3 years ago, # ^ |

Also, x has to be small enough to fit in array limits if you want to store the converted number in an array) otherwise unordered map(with custom hashing, refer here) has to be used instead of an array to get that O(1) time complexity.

→ Reply

grapo_Oranges

3 years ago, # |

+13

I'm not sure about the performance of hashmap, if you're going to test it on all possible combinations within the given constraints i.e. $$$[1, 10^{5}]$$$. I'm actually being serious about how to solve this in $$$O(1)$$$ time complexity. Use std :: bitset of size $$$10^{10}$$$.

Use the the following indexing idx = first.value * 10^5 + second.value

Also declare your bitset using this way, as many compilers don't allow direct declaration due to large memory allocation.

std::vector<std::bitset<1000000000UL>> wrapper(1); auto &hash = wrapper[0];

→ Reply

lis05

3 years ago, # ^ |

1.25gb, it's too much.

→ Reply

defnotmee

3 years ago, # |

-8

The easy method is parsing the pair into a string (like transforming {1,3} into "1///3") and using unordered_map on top of that. Hacks are a problem but you can actually scale that for bigger constraints unlike bitset.

The best way to do this if you dont mind being hacked is by hashing the number like this: multiply the first number by whatever the maximum bound is + 1 and then sum it with the second number. It will give you a unique number for each of them. Then, take that modulo a big prime number, but smaller than what you can allocate in memory. Then do a marcation array for that hash, and when you want to see if a pair is in the array, you check for the hash instead. If youre worried it may not pass the test cases because of collisions, do like 2 or 3 primes and/or use a random prime and it would be comically difficult to fail randomly

→ Reply

lis05

3 years ago, # |

if hashmap works in $$$O(1)$$$ for you, you can hash pair $$$a,b$$$ as $$$2e5*a+b$$$.

→ Reply

jeroenodb

3 years ago, # |

← Rev. 2 →

A way to get O(1) worst case with a low constant, but with a certain error probability is to use a bloomfilter.

A bloomfilter is essentially a bitset and multiple (randomly chosen) hash functions. If you have a pair, you can calculate (for example) 3 hash functions, and set the bits at those positions to true. When checking if a pair is in the bitset, calculate all 3 hash functions of this pair, and check if all bits at these positions are set to true.

If you have an appropriate family of hash functions that hashes pairs of integers to integers from 0 to M-1 (Where M is the size of bitset), you can achieve a really low false positive error probability.

This approach is O(n) preprocessing and O(1) querying.

→ Reply