kingmoshe's blog

By kingmoshe, history, 2 years ago, In English

Hello everyone,

Since the practice round doesn't seem to provide a scoreboard, I'd like to open this thread to discuss scores and possible approaches for the problem. After trying a few ideas, we got the following scores:

A: 2

B: 5

C: 5

D: 1802

E: 2083

Total: 3897

We also tried to calculate upper bounds for some of the tasks:

D upper bound: 1900

E upper bound: 2804 (probably can be significantly improved)

I will post my ideas in the comments. Did anyone manage to get a better score, or have any insights into the problem?

»
2 years ago, # |

Upper bound explanation (I worked on the problem with kingmoshe):

First we need to explain the client graph. We represent each client as a vertex; two clients (vertices) have an edge between them if no pizza can satisfy both of them (for example, one client likes cheese and the other dislikes cheese). It is easy to see that if you take a set of vertices with no edge between any two of them, then there is a pizza that satisfies all of those clients. It is also easy to see that no pizza can satisfy two vertices joined by an edge.

Therefore the problem reduces to the maximum anti-clique (independent set) problem, which is NP-hard.

We computed the upper bound as follows (a sketch follows below):

1. Find the biggest clique we can.
2. Observe that at most one vertex of that clique can be chosen.
3. Remove the clique from the graph and start again.

The number of cliques found this way is a valid upper bound, since each clique contributes at most one satisfied client. In D we found 1900 cliques, and in E we found 2804.

This gives a good upper bound that could obviously be tightened further.
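
A minimal sketch of this clique-cover bound, assuming the conflict graph is a dict `adj` mapping each client to its set of conflicting clients (the representation and the greedy clique-growing order are my assumptions, not part of the original comment):

```python
def greedy_clique(adj, alive):
    # Grow a clique greedily, starting from the highest-degree live vertex.
    v = max(alive, key=lambda u: len(adj[u] & alive))
    clique = {v}
    for u in sorted(alive - {v}, key=lambda u: -len(adj[u] & alive)):
        if clique <= adj[u]:  # u conflicts with every vertex already in the clique
            clique.add(u)
    return clique

def clique_cover_upper_bound(adj):
    alive = set(adj)
    cliques = 0
    while alive:
        alive -= greedy_clique(adj, alive)  # at most one client per clique can be served
        cliques += 1
    return cliques
```

Since every vertex ends up in exactly one clique (possibly a singleton) and an independent set can use at most one vertex per clique, the count is a valid upper bound.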

»
2 years ago, # |

Static solution for E that reached 2025 points (to get to 2078 we applied a lot of dynamic improvement to a given output; see the sketch below):

1. Calculate the degree of each vertex.
2. Choose the vertex with the lowest degree.
3. Remove that vertex and its neighbors from the graph.
4. Go to step 1.
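
A minimal sketch of this greedy, assuming the same `adj` dict-of-neighbor-sets representation as above (an assumption on my part):

```python
def greedy_independent_set(adj):
    alive = {v: set(adj[v]) for v in adj}  # mutable working copy of the graph
    chosen = set()
    while alive:
        v = min(alive, key=lambda u: len(alive[u]))  # steps 1-2: lowest live degree
        chosen.add(v)
        removed = alive[v] | {v}
        for u in removed:                            # step 3: drop v and its neighbors
            alive.pop(u, None)
        for w in alive:                              # keep the live degrees consistent
            alive[w] -= removed
    return chosen
```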

  • »
    »
    2 years ago, # ^ |

    How did you improve the score from 2025 to 2078?

    • »
      »
      »
      2 years ago, # ^ |

      We had two main *dynamic improvements*:

      1) We already have a function for finding a large anti-clique, so we tried the following improvement (a sketch follows after this list):
      a) randomly take half of your anti-clique;
      b) remove from the graph all nodes in that half and all of their neighbors;
      c) compute a large anti-clique of the remaining graph;
      d) if the new anti-clique plus the old half is larger than the old anti-clique, keep it; otherwise keep the old anti-clique;
      e) go back to step a, for 1000 iterations.

      2) Call the selected anti-clique C and the rest of the graph G:
      a) create a new list, call it K;
      b) pick at random a node from G that has a minimal number of neighbors in C, and add it to K;
      c) remove that node and all of its neighbors in G from G;
      d) remove all of that node's neighbors from C;
      e) if K ends up with more nodes than were removed from C, then K is better than the removed nodes of C (and we have increased the solution);
      f) repeat these steps 1000 times.

      Both of these improvements helped a lot, but using them together is what really made the difference. After adding a bit more randomness, we managed to increase our score to 2083.

      • *dynamic improvement — take your solution and try a small change; if the score increases, keep the change, otherwise discard it. Repeat again and again until no improvement is found.
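
      A sketch of improvement 1, with assumed names: `adj` is the conflict graph as a dict of neighbor sets, and `rebuild(adj, banned)` stands for any greedy anti-clique routine run on the graph minus `banned`, e.g. the min-degree greedy sketched earlier in this thread.

```python
import random

def destroy_and_repair(adj, solution, rebuild, rounds=1000):
    best = set(solution)
    for _ in range(rounds):
        half = set(random.sample(sorted(best), len(best) // 2))  # a) keep a random half
        banned = set(half)
        for v in half:                                           # b) ban the half and its neighbors
            banned |= adj[v]
        candidate = half | rebuild(adj, banned)                  # c) greedily refill the rest
        if len(candidate) > len(best):                           # d) keep only if larger
            best = candidate
    return best
```

      Improvement 2 fits the same accept-if-better skeleton, with the "trade a few nodes of C for more new nodes in K" exchange in place of the half-rebuild.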
      • »
        »
        »
        »
        2 years ago, # ^ |

        How does the function that calculates a large anti-clique work?
        It's an NP-hard problem, after all.

        • »
          »
          »
          »
          »
          2 years ago, # ^ |

          Basically, every time take the node with the smallest number of neighbors, and after taking it delete its neighbors. If two nodes have the same degree, look at the degrees of their neighbors: you want those degrees to be as big as possible (more precisely, we maximized the minimum degree among the neighbors; the idea is that a neighbor with a small degree has a good chance to be picked later). And of course we added some randomness along the way, so we could run the process many times and get different outcomes.
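
          This tie-breaking can be expressed as a selection key (a sketch; `alive` is assumed to map each remaining node to its set of remaining neighbors):

```python
import random

def pick_next(alive):
    def key(v):
        min_nb_deg = min((len(alive[u]) for u in alive[v]), default=0)
        # smallest degree first; among ties, largest minimum neighbor degree;
        # a final random component makes repeated runs produce different outcomes
        return (len(alive[v]), -min_nb_deg, random.random())
    return min(alive, key=key)
```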

  • »
    »
    2 years ago, # ^ |

    Hey, I'm a beginner and I have absolutely no idea what we have to do, so if you don't mind could you explain? I'm a first-year student, so I don't know DSA or dynamic programming yet.

    • »
      »
      »
      2 years ago, # ^ |

      Hey, no dynamic programming or complicated DSA is required yet (although in the qualification round some may take advantage of them), but a little knowledge of graphs might help.

      First, if you are interested and haven't already done so, find a team of 2 to 4 people (including you) and register for Google Hash Code.

      Second, you will be able to submit to the training problem, which I explain below in spoilers so the post appears shorter:

      • subject
      • reading, scoring, printing
      • dumb solution
      • my strategy, ingredient-wise
      • best solution explained so far by kingmoshe

      I wrote 500 lines trying this problem. I hope it helps you, and that I didn't state the obvious too much. Don't hesitate to ask if you have any further questions that aren't a complete solution of the problem.

»
2 years ago, # |

Insights on the problem: in test case E the biggest clique is only of size 3, and after removing the cliques of size 3 (~60 of them), the biggest remaining clique is of size 2. Also, there are 68 clients of degree 0, which should always be chosen.

»
2 years ago, # |

A: 2
B: 5
C: 5
D: 1804
E: 2083
Total: 3899

  • »
    »
    2 years ago, # ^ |

    Cool, what was your solution?

    • »
      »
      »
      2 years ago, # ^ |

      Update: D is 1805 points now.
      Nothing interesting: a lot of random swaps in 10 threads for ~30 minutes.

»
2 years ago, # |

A: 2, B: 5, C: 4, D: 1697, E: 799, Total: 2507

»
2 years ago, # |

The optimal solution for D is 1805, which can be calculated using binary linear programming.

However, E is too large: it took me about an hour to solve D, but I failed to solve E even after a whole night.

  • »
    »
    2 years ago, # ^ |

    How do you calculate it with binary linear programming? And can you calculate this for C and E as well?

    • »
      »
      »
      2 years ago, # ^ |

      $$$choose[i]$$$: whether we choose ingredient $$$i$$$.

      $$$satisfied[j]$$$: whether person $$$j$$$'s conditions are satisfied.

      maximize $$$\sum_j satisfied[j]$$$

      s.t. $$$\sum_{i \in \text{like}(j)} choose[i] + \sum_{i \in \text{dislike}(j)} (1-choose[i]) \geq cnt_j \cdot satisfied[j]$$$ for $$$j = 1 \dots n$$$, where $$$cnt_j$$$ is the number of $$$j$$$'s likes and dislikes combined.

      When $$$j$$$ is served ($$$satisfied[j]=1$$$), the constraint becomes $$$\sum_{i \in \text{like}(j)} choose[i] + \sum_{i \in \text{dislike}(j)} (1-choose[i]) \geq cnt_j$$$, which forces $$$choose[i]=1$$$ for every $$$i$$$ that $$$j$$$ likes and $$$choose[i]=0$$$ for every $$$i$$$ that $$$j$$$ dislikes.

      C's answer is 5. E has too many ingredients, so it consumes too much time...
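
      A sketch of this program using PuLP (the solver choice and the `likes[j]` / `dislikes[j]` input layout are my assumptions, not part of the original comment):

```python
import pulp

def solve_bip(likes, dislikes, n_ingredients):
    prob = pulp.LpProblem("pizza", pulp.LpMaximize)
    choose = [pulp.LpVariable(f"choose_{i}", cat="Binary") for i in range(n_ingredients)]
    satisfied = [pulp.LpVariable(f"sat_{j}", cat="Binary") for j in range(len(likes))]
    prob += pulp.lpSum(satisfied)  # objective: number of satisfied clients
    for j in range(len(likes)):
        cnt = len(likes[j]) + len(dislikes[j])
        # if satisfied[j] = 1, all of j's likes must be chosen and none of j's dislikes
        prob += (pulp.lpSum(choose[i] for i in likes[j])
                 + pulp.lpSum(1 - choose[i] for i in dislikes[j])
                 >= cnt * satisfied[j])
    prob.solve()
    return [i for i in range(n_ingredients) if choose[i].value() > 0.5]
```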

      • »
        »
        »
        »
        2 years ago, # ^ |

        How did you specifically solve it? I mean, in BIP you cannot simply drop constraints, yet some of your constraints are going to be unsatisfied (we can't serve all the clients). Also, how did you express your objective in terms of the variables? Could you please give us more details about your approach?

        • »
          »
          »
          »
          »
          2 years ago, # ^ |

          Sorry, there were some mistakes in my formula. I have fixed them and added some explanation. Thank you very much.

»
2 years ago, # |

Our Scores: 2, 5, 5, 1687, 1849.
Our Approach: Randomized Greedy.

»
2 years ago, # |

Our scores: A: 2 B: 5 C: 5 D: 1805 E: 2085 Total: 3902

We first ran a greedy graph algorithm to find an approximate maximum independent set (repeatedly picking the minimum-degree node). Then we iteratively improved the solution with simulated annealing (basically random swapping and checking whether the result gets better). It found the 2084 solution for E in 10 minutes, while A through D were solved within a minute. During a test run of the code we scored 2085 on E, but we hadn't implemented writing the best solutions to a file, so that solution is lost forever.

Edit: Found another 2085 solution and saved the solution this time.

  • »
    »
    2 years ago, # ^ |

    How do you do the swaps? Randomly picking a vertex you selected and one you did not select will probably give you an invalid combination too often?

    • »
      »
      »
      2 years ago, # ^ |

      For the swaps we actually did two different things:

      1. Swap a random pizza ingredient from "present on pizza" to "not present on pizza" or vice versa.
      2. Pick a random person that is currently not satisfied and flip all ingredients that prevent them from being satisfied.

      We did a 50/50 split between these two kinds of swaps, which seemed to get the best result.
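
      A hill-climbing sketch of these two moves (the data layout is an assumption: `pizza` is a set of ingredient indices, and `likes[j]` / `dislikes[j]` are sets per client; the actual simulated annealing would also accept some worsening moves with decaying probability):

```python
import random

def score(pizza, likes, dislikes):
    # a client is satisfied iff all their likes are on the pizza and no dislike is
    return sum(likes[j] <= pizza and not (dislikes[j] & pizza)
               for j in range(len(likes)))

def improve(pizza, likes, dislikes, n_ingredients, iters=100_000):
    best = score(pizza, likes, dislikes)
    for _ in range(iters):
        cand = set(pizza)
        if random.random() < 0.5:  # move 1: flip one random ingredient on/off
            cand ^= {random.randrange(n_ingredients)}
        else:                      # move 2: fully satisfy one unhappy client
            unsat = [j for j in range(len(likes))
                     if not (likes[j] <= cand and not (dislikes[j] & cand))]
            if not unsat:
                break              # everyone is satisfied already
            j = random.choice(unsat)
            cand |= likes[j]
            cand -= dislikes[j]
        s = score(cand, likes, dislikes)
        if s >= best:              # accept non-worsening candidates
            pizza, best = cand, s
    return pizza, best
```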

»
2 years ago, # |

A: 2 B: 5 C: 5 D: 1805 E: 2047

Total: 3864

»
2 years ago, # |

My Hash Code team needs one more person. If anyone wants to participate in Google Hash Code but hasn't found a team yet, DM me!

»
2 years ago, # |

I accidentally misread the statement and thought of a variant of the problem where a person likes the pizza if it contains at least one of their favourite ingredients. Is this version easier, or does it have an interesting solution?

»
2 years ago, # |

Looking for 1-2 last-minute replacements who can actually participate during the timeslot (apparently, timezones are hard). Python preferred but ultimately optional, visualization skills a plus! Got a score of 8,926,023 for last year's qualifying round which is definitely not a brag, but maybe that can speak to expectations (upper bound on stress level, let's say). Get in touch however you like, I will edit/update here if/when the spots are filled.

  • »
    »
    2 years ago, # ^ |

    "Got a score of 8,926,023 for last year's qualifying round"

    Care to explain how?

    • »
      »
      »
      2 years ago, # ^ |

      If you're familiar with the problem (archived on Kaggle?), then you'll know that despite the number of digits, it's a fairly minimal score. I'd have more useful things to say about what not to do...

      • don't team up with random people off the competition's Facebook page
      • don't solve an incorrect reading of the problem for a majority of the time
      • don't prematurely optimize out of 'slow Python' fears
      • also, don't be afraid of rewriting, especially if the thing that 'works' took you less than a majority of the time...?

      Swung a bit too far the opposite way this year, reached out early to people I knew, and welp...