kingmoshe's blog

By kingmoshe, history, 4 months ago, In English

Hello everyone,

Since the practice round doesn't seem to provide a scoreboard, I'd like to open this thread to discuss scores and possible approaches for the problem. We got the following scores after some ideas:

A: 2

B: 5

C: 5

D: 1802

E: 2083

Total: 3897

We also tryied to calculate an upper bound for some of the tasks:

D upper bound: 1900

E upper bound: 2804 (probably can be significantly improved)

I will post my ideas in the comments. Did anyone managed to get a better sccore? or had an insight to the problem?

 
 
 
 
  • Vote: I like it
  • +27
  • Vote: I do not like it

»
4 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by kingmoshe (previous revision, new revision, compare).

»
4 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by kingmoshe (previous revision, new revision, compare).

»
4 months ago, # |
Rev. 2   Vote: I like it +8 Vote: I do not like it

Upper bound explanation(I worked on the problem with kingmoshe):

First we need to explain the graph of clients. We represent each client as a vertex in a graph. Two clients or vertexs has an edge between them if there is no pizza that can satisfy both clients(for exmple one cllients like cheese and the other doesnt like cheese). It is easy to see that if you take a group of vertexs with no edges between each two vertex, then there is a pizza that satisfy does clients vertexes. Also it is very easy to see, that no pizza could satisfy 2 vertexes with an edge between them.

Therefore we can make a reduction to an max anti-clique problem(which is NP).

We did the upper bound by: 1. Search for the biggest clique that we can find. 2. We deduce that only one vertex from the clique could be chosen. 3. Remove the clique from the clients and start again. Then the number of cliques we find is a good upper bound.

Where in d we found 1900 cliques. And in e we found 2804 cliques.

This is approach for a good upper bound that could obviously be reduced more.

»
4 months ago, # |
  Vote: I like it +8 Vote: I do not like it

Static solution for e that reached 2025 points(to get to 2078 we did a lot of dynamic approach of approving a given output). 1. Calculate the degree of each vertex. 2. Chosen the vertex of the lowest degree. 3. Remove the vertex and his neighbors of the graph. 4. go to step 1.

  • »
    »
    4 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    How did you improve the score from 2025 to 2078?

    • »
      »
      »
      4 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      We had two main *dynamic improvments

      1) We allready have a function for finding a large anti-clique so lets try the next improvment: a) take randomly half of you anti-clique b) remove from the graph all nodes in that half and all of their neighbor c) calculate the large anti-clique of the remaining graph d) if the new anti-clique plus the old half is larger than the old anti-clique keep it, otherwise remain with old anti-clique e) go back to part a for 1000 times

      2) lets call the selected anti-clique C, and the rest of the graph G a) create a new list, lets call it K b) pick at random a node from G that has a minimal amount of neighbors in C, and add it to K c) remove that node, and any of its neighbors (in G) from G d) remove all of the node neighbors from C e) if the amount of nodes in K is larger than the amount of nodes removed from C than K is better than the removed nodes in C ( and we increased the solution) f) repeat those steps 1000 times

      both of this improvments helped allot, but using both of them is what really improved, also after trieng a little bit more random, we manged to increase our score to 2803

      • dynamic improvment — take your solution try a small change if the scores increase keep that changed, otherwise dont keep it, try it again and again untill no improvment is found
      • »
        »
        »
        »
        3 months ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        how the function that calculate large anti-clique works ?
        its already an NP problem

        • »
          »
          »
          »
          »
          3 months ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          basicly, every time take the node with the least amount of neighbors and after taking it delete it's neigbors, if two nodes exists with same amount of nodes, then look at the degrees of its neighbors, you want does degrees to be as big as posible (or more precisley we maximized on the minum degree of the neighbor, the idea is that a neighbor with a small degree as a good chance to be picked later), and of course we added some random along the line, so we could run the process many times and get different outcomes.

  • »
    »
    3 months ago, # ^ |
      Vote: I like it +3 Vote: I do not like it

    hey so I am a beginner and I have absolutely no idea what we have to do so if you don't mind could you explain? I am a student from first year so I don't know dsa or dynamic programming yet.

    • »
      »
      »
      3 months ago, # ^ |
        Vote: I like it +14 Vote: I do not like it

      Hey, no dynamic programming or complicated dsa required yet (although in the qualification round some may take advantage of it) but little knowledge on graph might help.

      First, if it's your question and you are interested, if you didn't already do it find a team of between 2 and 4 people (including you) and register for google Hash code.

      Second, you will be able to submit to the training problem which I explain below in spoiler in order that the post appears shorter.

      subject
      reading, scoring, printing
      dumb solution
      my strategy, ingredient-wise
      best solution explained so far by kingmoshe

      I wrote 500 lines by trying this problem Hope it'll help you and that I did not too much state the obvious, don't hesitate if you have any further question which are not complete solution of the problem

»
4 months ago, # |
  Vote: I like it +8 Vote: I do not like it

Insighets on the problem. Test case e biggest clique is only 3. And after some removel of clique of size 3(~60), the biggest clique is of size 2. Also there are 68 clients of degree 0 which should allways be chosen.

»
4 months ago, # |
  Vote: I like it +5 Vote: I do not like it

A: 2
B: 5
C: 5
D: 1804
E: 2083
Total: 3899

  • »
    »
    4 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    cool, what was your solution?

    • »
      »
      »
      4 months ago, # ^ |
        Vote: I like it +5 Vote: I do not like it

      Update: D is 1805 points now.
      Nothing interesting. A lot of random swaps in 10 threads for ~30 minutes

»
4 months ago, # |
  Vote: I like it +5 Vote: I do not like it

A — 2 b- 5 C — 4 D — 1697 E — 799 total — 2507

»
4 months ago, # |
  Vote: I like it +24 Vote: I do not like it

The optimal solution of D is 1805, which can be calculated using Binary Linear Programming.

However, E is too large. It cost me about 1h to solve D but I failed to solve E with a night.

  • »
    »
    4 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    How do you calculate it with binary linear programming? And can you calculate this for c and e as well?

    • »
      »
      »
      4 months ago, # ^ |
      Rev. 3   Vote: I like it +9 Vote: I do not like it

      $$$choose[i]$$$: if we choose ingredient $$$i$$$.

      $$$satistied[i]$$$: if person $$$i$$$'s conditions are satisfied.

      maximize $$$\sum satistied[j]$$$

      s.t. $$$\sum_{j~\text{like}~i} choose[i] + \sum_{j~\text{dislike}~i}~(1-choose[i]) \geq cnt_j \times satisfied[j]$$$ for $$$j = 1...n$$$, $$$cnt_j$$$ is number of $$$j$$$'s like and dislike.

      when $$$j$$$ is served($$$satisfied[j]=1$$$), that means $$$\sum_{j~\text{like}~i} choose[i] + \sum_{j~\text{dislike}~i}~(1-choose[i]) \geq cnt_j$$$ so for $$$j~\text{like}~i,choose[i]=1$$$,for $$$j~\text{dislike}~i,choose[i]=0$$$ must hold.

      C's answer is 5. E has too many ingredients so it will consume too much time...

      • »
        »
        »
        »
        3 months ago, # ^ |
          Vote: I like it +7 Vote: I do not like it

        How did you specifically solved it ? I mean, in BIP you cannot drop some constraints as some of your constraints are going to be unsatisfied (we can't serve all the clients). And also, how did you calculate your objective in terms of variables ? Could you please give us more details about your approach ?

        • »
          »
          »
          »
          »
          3 months ago, # ^ |
            Vote: I like it +5 Vote: I do not like it

          Sorry, there are some mistakes in my formula. I have fixed it and added some explanation. Thank you very much.

»
4 months ago, # |
  Vote: I like it 0 Vote: I do not like it

I Myself had the following scores: 2 5 4 1697 805

Here is my approch, i created two list wich store liked ingredients and disliked ingredients. Then created a dictionary for both, the key is the string for that ingredient, and the value would be the amount of each ingredient in both of the list. Then i compared the keyvaluepair and if the amount of ingredient in like dictionary is less or equal than in the dislike dictionary, i simply remove that ingredient from the firt liked list. after removing the duplicated string in liked list, i finally get the final output list.

What do you thing of that approach. FYI i use c# as my favourite coding language

  • »
    »
    3 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I had same approach and got points : 2 + 5 + 4 + 1708 + 1004 = 2723

    looking at the discussion above I think they are solving each test case separately. So will try that too. For now the above points are from a single solution.

»
4 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Our Scores: 2, 5, 5, 1687, 1849.
Our Approach: Randomized Greedy.

»
3 months ago, # |
Rev. 2   Vote: I like it +26 Vote: I do not like it

Our scores: A: 2 B: 5 C: 5 D: 1805 E: 2085 Total: 3902

We first did a greedy graph algorithm for finding an approximate independent set (picking minimum degree node). Then we iteratively improved the solution with simulated annealing (basically random swapping and checking if the result becomes better). It found the solution of 2084 in E in 10 minutes, while A through D were found in a minute. During a testrun of the code, we scored 2085 in E, but we didn't implement writing the best solutions to a file, so this solution is lost forever.

Edit: Found another 2085 solution and saved the solution this time.

  • »
    »
    3 months ago, # ^ |
      Vote: I like it +8 Vote: I do not like it

    How do you do swaps? Randomly picking a vertex you selected and one you did not select probably will give you too often a combination that is invalid?

    • »
      »
      »
      3 months ago, # ^ |
        Vote: I like it +8 Vote: I do not like it

      For the swaps we actually did two different things:

      1. Swap a random pizza ingredient from "present on pizza" to "not present on pizza" or vice versa.
      2. Pick a random person that is currently not satisfied and swap all ingredients which prevent him from being satisfied.

      We did a 50/50 split between these two kinds of swaps, which seemed to get the best result.

  • »
    »
    3 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    what is your overall time complexity came?

»
3 months ago, # |
  Vote: I like it -8 Vote: I do not like it

I have only 3398pts

»
3 months ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

A: 2 B: 5 C: 5 D: 1805 E: 2047

Total: 3864

»
3 months ago, # |
  Vote: I like it +19 Vote: I do not like it

My hashcode team required one more person. If anyone wants to participate google hashcode but till now not getting any team. So DM me!

»
3 months ago, # |
  Vote: I like it 0 Vote: I do not like it

I accidentally misread the statement and thought of a variant of the problem where one person likes the pizza if it contains at least one of its favourite ingredients. Is this version easier or has an interesting solution?

»
3 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Looking for 1-2 last-minute replacements who can actually participate during the timeslot (apparently, timezones are hard). Python preferred but ultimately optional, visualization skills a plus! Got a score of 8,926,023 for last year's qualifying round which is definitely not a brag, but maybe that can speak to expectations (upper bound on stress level, let's say). Get in touch however you like, I will edit/update here if/when the spots are filled.

  • »
    »
    3 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Got a score of 8,926,023 for last year's qualifying round

    Care to explain how?

    • »
      »
      »
      3 months ago, # ^ |
        Vote: I like it +5 Vote: I do not like it

      If you're familiar with the problem (archived on kaggle?), then you'll know that despite the number of digits, it's a fairly minimal score. I'd have more useful to say about what not to do...

      • don't team up with random people off the competition's facebook page
      • don't solve an incorrect reading of the problem for a majority of the time
      • don't prematurely optimize out of 'slow python' fears
      • also don't be afraid of rewriting, especially if the thing that 'works' took you less than a majority of the time...?

      Swung a bit too far the opposite way this year, reached out early to people I knew, and welp...

»
3 months ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

A- 2, B- 5, C- 4, D- 1697, E- 799 Total : 2507 Using Hashmap for storing the frequency.

»
3 months ago, # |
  Vote: I like it 0 Vote: I do not like it

D-1539, E-1683..... total=3234, i approach this problem in this way, let us a definite person is in the group (final ans), so mapping its likes and dislikes in HashSet, and then check by including this person, how many other persons can join the group(this is done by checking likes of person can't be in that definite person dislike and vice-versa). and repeat this all for every person

»
3 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Hey !

I got 2 — 5 — 5 — 1805 — 2059 with Localsolver and a model based on a set.

A, B and C are proven optimal and I have bounds for the others : 4684 for D (~200s) and 2529 for E (~300s).

The score for E was still slowly increasing but I got bored.