### kingmoshe's blog

By kingmoshe, history, 4 months ago,

Hello everyone,

Since the practice round doesn't seem to provide a scoreboard, I'd like to open this thread to discuss scores and possible approaches for the problem. We got the following scores after some ideas:

A: 2

B: 5

C: 5

D: 1802

E: 2083

Total: 3897

We also tryied to calculate an upper bound for some of the tasks:

D upper bound: 1900

E upper bound: 2804 (probably can be significantly improved)

I will post my ideas in the comments. Did anyone managed to get a better sccore? or had an insight to the problem?

• +27

 » 4 months ago, # |   0 Auto comment: topic has been updated by kingmoshe (previous revision, new revision, compare).
 » 4 months ago, # |   0 Auto comment: topic has been updated by kingmoshe (previous revision, new revision, compare).
 » 4 months ago, # | ← Rev. 2 →   +8 Upper bound explanation(I worked on the problem with kingmoshe):First we need to explain the graph of clients. We represent each client as a vertex in a graph. Two clients or vertexs has an edge between them if there is no pizza that can satisfy both clients(for exmple one cllients like cheese and the other doesnt like cheese). It is easy to see that if you take a group of vertexs with no edges between each two vertex, then there is a pizza that satisfy does clients vertexes. Also it is very easy to see, that no pizza could satisfy 2 vertexes with an edge between them.Therefore we can make a reduction to an max anti-clique problem(which is NP).We did the upper bound by: 1. Search for the biggest clique that we can find. 2. We deduce that only one vertex from the clique could be chosen. 3. Remove the clique from the clients and start again. Then the number of cliques we find is a good upper bound.Where in d we found 1900 cliques. And in e we found 2804 cliques. This is approach for a good upper bound that could obviously be reduced more.
 » 4 months ago, # |   +8 Static solution for e that reached 2025 points(to get to 2078 we did a lot of dynamic approach of approving a given output). 1. Calculate the degree of each vertex. 2. Chosen the vertex of the lowest degree. 3. Remove the vertex and his neighbors of the graph. 4. go to step 1.
•  » » 4 months ago, # ^ |   0 How did you improve the score from 2025 to 2078?
•  » » » 4 months ago, # ^ |   0 We had two main *dynamic improvments1) We allready have a function for finding a large anti-clique so lets try the next improvment: a) take randomly half of you anti-clique b) remove from the graph all nodes in that half and all of their neighbor c) calculate the large anti-clique of the remaining graph d) if the new anti-clique plus the old half is larger than the old anti-clique keep it, otherwise remain with old anti-clique e) go back to part a for 1000 times2) lets call the selected anti-clique C, and the rest of the graph G a) create a new list, lets call it K b) pick at random a node from G that has a minimal amount of neighbors in C, and add it to K c) remove that node, and any of its neighbors (in G) from G d) remove all of the node neighbors from C e) if the amount of nodes in K is larger than the amount of nodes removed from C than K is better than the removed nodes in C ( and we increased the solution) f) repeat those steps 1000 timesboth of this improvments helped allot, but using both of them is what really improved, also after trieng a little bit more random, we manged to increase our score to 2803 dynamic improvment — take your solution try a small change if the scores increase keep that changed, otherwise dont keep it, try it again and again untill no improvment is found
•  » » » » 3 months ago, # ^ |   0 how the function that calculate large anti-clique works ? its already an NP problem
•  » » » » » 3 months ago, # ^ |   0 basicly, every time take the node with the least amount of neighbors and after taking it delete it's neigbors, if two nodes exists with same amount of nodes, then look at the degrees of its neighbors, you want does degrees to be as big as posible (or more precisley we maximized on the minum degree of the neighbor, the idea is that a neighbor with a small degree as a good chance to be picked later), and of course we added some random along the line, so we could run the process many times and get different outcomes.
•  » » 3 months ago, # ^ |   +3 hey so I am a beginner and I have absolutely no idea what we have to do so if you don't mind could you explain? I am a student from first year so I don't know dsa or dynamic programming yet.
• »
»
»
3 months ago, # ^ |
+14

Hey, no dynamic programming or complicated dsa required yet (although in the qualification round some may take advantage of it) but little knowledge on graph might help.

First, if it's your question and you are interested, if you didn't already do it find a team of between 2 and 4 people (including you) and register for google Hash code.

Second, you will be able to submit to the training problem which I explain below in spoiler in order that the post appears shorter.

subject
dumb solution
my strategy, ingredient-wise
best solution explained so far by kingmoshe

I wrote 500 lines by trying this problem Hope it'll help you and that I did not too much state the obvious, don't hesitate if you have any further question which are not complete solution of the problem

•  » » » » 3 months ago, # ^ |   +5 what a great explanation!
 » 4 months ago, # |   +8 Insighets on the problem. Test case e biggest clique is only 3. And after some removel of clique of size 3(~60), the biggest clique is of size 2. Also there are 68 clients of degree 0 which should allways be chosen.
 » 4 months ago, # |   +5 A: 2B: 5C: 5D: 1804E: 2083Total: 3899
•  » » 4 months ago, # ^ |   0 cool, what was your solution?
•  » » » 4 months ago, # ^ |   +5 Update: D is 1805 points now.Nothing interesting. A lot of random swaps in 10 threads for ~30 minutes
 » 4 months ago, # |   +5 A — 2 b- 5 C — 4 D — 1697 E — 799 total — 2507
 » 4 months ago, # |   +24 The optimal solution of D is 1805, which can be calculated using Binary Linear Programming.However, E is too large. It cost me about 1h to solve D but I failed to solve E with a night.
•  » » 4 months ago, # ^ |   0 How do you calculate it with binary linear programming? And can you calculate this for c and e as well?
•  » » » 4 months ago, # ^ | ← Rev. 3 →   +9 $choose[i]$: if we choose ingredient $i$.$satistied[i]$: if person $i$'s conditions are satisfied.maximize $\sum satistied[j]$s.t. $\sum_{j~\text{like}~i} choose[i] + \sum_{j~\text{dislike}~i}~(1-choose[i]) \geq cnt_j \times satisfied[j]$ for $j = 1...n$, $cnt_j$ is number of $j$'s like and dislike.when $j$ is served($satisfied[j]=1$), that means $\sum_{j~\text{like}~i} choose[i] + \sum_{j~\text{dislike}~i}~(1-choose[i]) \geq cnt_j$ so for $j~\text{like}~i,choose[i]=1$,for $j~\text{dislike}~i,choose[i]=0$ must hold.C's answer is 5. E has too many ingredients so it will consume too much time...
•  » » » » 3 months ago, # ^ |   +7 How did you specifically solved it ? I mean, in BIP you cannot drop some constraints as some of your constraints are going to be unsatisfied (we can't serve all the clients). And also, how did you calculate your objective in terms of variables ? Could you please give us more details about your approach ?
•  » » » » » 3 months ago, # ^ |   +5 Sorry, there are some mistakes in my formula. I have fixed it and added some explanation. Thank you very much.
 » 4 months ago, # |   0 I Myself had the following scores: 2 5 4 1697 805Here is my approch, i created two list wich store liked ingredients and disliked ingredients. Then created a dictionary for both, the key is the string for that ingredient, and the value would be the amount of each ingredient in both of the list. Then i compared the keyvaluepair and if the amount of ingredient in like dictionary is less or equal than in the dislike dictionary, i simply remove that ingredient from the firt liked list. after removing the duplicated string in liked list, i finally get the final output list.What do you thing of that approach. FYI i use c# as my favourite coding language
•  » » 3 months ago, # ^ |   0 I had same approach and got points : 2 + 5 + 4 + 1708 + 1004 = 2723looking at the discussion above I think they are solving each test case separately. So will try that too. For now the above points are from a single solution.
 » 4 months ago, # |   0 Our Scores: 2, 5, 5, 1687, 1849. Our Approach: Randomized Greedy.
 » 3 months ago, # | ← Rev. 2 →   +26 Our scores: A: 2 B: 5 C: 5 D: 1805 E: 2085 Total: 3902We first did a greedy graph algorithm for finding an approximate independent set (picking minimum degree node). Then we iteratively improved the solution with simulated annealing (basically random swapping and checking if the result becomes better). It found the solution of 2084 in E in 10 minutes, while A through D were found in a minute. During a testrun of the code, we scored 2085 in E, but we didn't implement writing the best solutions to a file, so this solution is lost forever.Edit: Found another 2085 solution and saved the solution this time.
•  » » 3 months ago, # ^ |   +8 How do you do swaps? Randomly picking a vertex you selected and one you did not select probably will give you too often a combination that is invalid?
•  » » » 3 months ago, # ^ |   +8 For the swaps we actually did two different things: Swap a random pizza ingredient from "present on pizza" to "not present on pizza" or vice versa. Pick a random person that is currently not satisfied and swap all ingredients which prevent him from being satisfied. We did a 50/50 split between these two kinds of swaps, which seemed to get the best result.
•  » » 3 months ago, # ^ |   0 what is your overall time complexity came?
 » 3 months ago, # |   -8 I have only 3398pts
 » 3 months ago, # | ← Rev. 2 →   0 A: 2 B: 5 C: 5 D: 1805 E: 2047Total: 3864
 » 3 months ago, # |   +19 My hashcode team required one more person. If anyone wants to participate google hashcode but till now not getting any team. So DM me!
 » 3 months ago, # |   0 I accidentally misread the statement and thought of a variant of the problem where one person likes the pizza if it contains at least one of its favourite ingredients. Is this version easier or has an interesting solution?
 » 3 months ago, # |   0 Looking for 1-2 last-minute replacements who can actually participate during the timeslot (apparently, timezones are hard). Python preferred but ultimately optional, visualization skills a plus! Got a score of 8,926,023 for last year's qualifying round which is definitely not a brag, but maybe that can speak to expectations (upper bound on stress level, let's say). Get in touch however you like, I will edit/update here if/when the spots are filled.
•  » » 3 months ago, # ^ |   0 Got a score of 8,926,023 for last year's qualifying round Care to explain how?
•  » » » 3 months ago, # ^ |   +5 If you're familiar with the problem (archived on kaggle?), then you'll know that despite the number of digits, it's a fairly minimal score. I'd have more useful to say about what not to do... don't team up with random people off the competition's facebook page don't solve an incorrect reading of the problem for a majority of the time don't prematurely optimize out of 'slow python' fears also don't be afraid of rewriting, especially if the thing that 'works' took you less than a majority of the time...? Swung a bit too far the opposite way this year, reached out early to people I knew, and welp...
•  » » » » 3 months ago, # ^ |   +5 Thank you!
 » 3 months ago, # | ← Rev. 2 →   0 A- 2, B- 5, C- 4, D- 1697, E- 799 Total : 2507 Using Hashmap for storing the frequency.
 » 3 months ago, # |   0 D-1539, E-1683..... total=3234, i approach this problem in this way, let us a definite person is in the group (final ans), so mapping its likes and dislikes in HashSet, and then check by including this person, how many other persons can join the group(this is done by checking likes of person can't be in that definite person dislike and vice-versa). and repeat this all for every person
 » 3 months ago, # |   0 Hey !I got 2 — 5 — 5 — 1805 — 2059 with Localsolver and a model based on a set.A, B and C are proven optimal and I have bounds for the others : 4684 for D (~200s) and 2529 for E (~300s).The score for E was still slowly increasing but I got bored.