Блог пользователя scott_wu

Автор scott_wu, 10 лет назад, По-английски

Hi everyone! Addepar will be hosting an 8-hour long contest on Friday, June 27th at 4PM PDT. The contest will feature eight algorithmic problems, and prizes will be awarded to the top scorers, including iPad Minis for the top five, Addepar t-shirts for the top fifty, and a free trip to California to meet with Addepar cofounder Joe Lonsdale for the winner. You can register and find out more about the contest here.

Addepar is a tech startup in Mountain View, California that is building the next generation of financial software for institutions. It’s my second month here and it’s been a lot of fun so far -- we have tough challenges to work on and a strong, friendly community. We are actively recruiting and will be in touch with the top contestants about opportunities at Addepar. See our careers page for more info!

Problems were prepared by me (scott_wu), aime15, alex_wang, srnth, and Vladimir Novakovski. Good luck! We hope you enjoy the contest. :)

P.S.: Stay tuned for a possible Codeforces Addepar contest sometime in the near future!

UPD: Congratulations to the top five contestants, who will all be winning iPad Minis:

  1. Alex_2oo8
  2. stevenkplus
  3. random.johnnyh
  4. nika
  5. hogloid

Special congratulations to Alex_2oo8, who will win a free trip to Mountain View to meet with Joe Lonsdale. Thanks to everyone for participating!

  • Проголосовать: нравится
  • +118
  • Проголосовать: не нравится

»
10 лет назад, # |
  Проголосовать: нравится +118 Проголосовать: не нравится

We will be glad to host Codeforces Addepar! )

»
10 лет назад, # |
  Проголосовать: нравится -20 Проголосовать: не нравится

Is this an Art of Problem Solving contest? :)

»
10 лет назад, # |
  Проголосовать: нравится +6 Проголосовать: не нравится

What languages can we submit solutions in?

»
10 лет назад, # |
  Проголосовать: нравится +11 Проголосовать: не нравится

Why so unconvinient time for Russia? =)

  • »
    »
    10 лет назад, # ^ |
      Проголосовать: нравится +12 Проголосовать: не нравится

    More like for Europe in general, it's totally through the night.

  • »
    »
    10 лет назад, # ^ |
      Проголосовать: нравится +32 Проголосовать: не нравится

    Because it is not on codeforces where all contests are convenient for Russia/Europe only

    • »
      »
      »
      10 лет назад, # ^ |
        Проголосовать: нравится +3 Проголосовать: не нравится

      I often participate in contests on hackerrank, usually they runs in convinient time for Russia and Europe.

  • »
    »
    10 лет назад, # ^ |
      Проголосовать: нравится +15 Проголосовать: не нравится

    They're just letting European IOI teams try out what it feels like to write a contest when it's the middle of the night in your "native" time zone (since IOI is taking place in Taiwan this year) :)

    • »
      »
      »
      10 лет назад, # ^ |
        Проголосовать: нравится +14 Проголосовать: не нравится

      :D

      What we (the Slovak team) did last year was arriving 3 days earlier to get used to the jetlag — and take a trip to the ocean. I managed to dip my leg in a roadside swamp on that trip...

      • »
        »
        »
        »
        10 лет назад, # ^ |
          Проголосовать: нравится +23 Проголосовать: не нравится

        Lucky you. We usually arrive at the last possible moment and then sleep at the opening ceremony, because that's how our state funding works.

»
10 лет назад, # |
  Проголосовать: нравится +8 Проголосовать: не нравится

Contest is live here. Good luck!!

»
10 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Judge of "Engulf" seems to be broken. I can't get the results.

»
10 лет назад, # |
  Проголосовать: нравится +11 Проголосовать: не нравится

Will full score be available shortly?

»
10 лет назад, # |
  Проголосовать: нравится +5 Проголосовать: не нравится

I loved this contest so much! Thanks a lot for the night of fun ^_^

»
10 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

The problemset was so nice !! Both normal tasks and mini-marathon tasks were interesting.

I especially like Yaxis. It seems to be geometric problem, but actually data-structure problem.

  • »
    »
    10 лет назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    What was the solution for that problem?

  • »
    »
    10 лет назад, # ^ |
      Проголосовать: нравится +31 Проголосовать: не нравится

    One can leverage a hidden C++ datastructure to solve this problem with very little code

    #include <stdio.h>
    #include <ext/pb_ds/assoc_container.hpp>
    #include <ext/pb_ds/tree_policy.hpp>
    #include <vector>
    #include <algorithm>
    
    using namespace std;
    
    typedef __gnu_pbds::tree< int, __gnu_pbds::null_type, std::less<int>, __gnu_pbds::rb_tree_tag, __gnu_pbds::tree_order_statistics_node_update> ordered_set;
    vector<pair<int, int> > v;
    
    int main()
    {
        ordered_set st;
    
        int n;
        scanf("%d", &n);
    
        for (int i = 0; i < n; ++ i)
        {
            int m, b;
            scanf("%d %d", &m, &b);
            v.push_back(make_pair(b, m));
        }
    
        sort(v.begin(), v.end());
    
        long long ans = 0;
        for (int i = 0; i < n; ++ i)
        {
            long long order = st.order_of_key(v[i].second);
            ans += order * (st.size() - order);
            st.insert(v[i].second);
        }
        st.clear();
        for (int i = n - 1; i >= 0; -- i)
        {
            long long order = st.order_of_key(v[i].second);
            ans += order * (st.size() - order);
            st.insert(v[i].second);
        }
    
        printf("%lld\n", ans);
    
        return 0;
    }
    
»
10 лет назад, # |
  Проголосовать: нравится +5 Проголосовать: не нравится

how to solve "Computing Analytics"?

  • »
    »
    10 лет назад, # ^ |
    Rev. 2   Проголосовать: нравится +24 Проголосовать: не нравится

    Step 1. If for a certain coordinate there's less than sqrt(n) cache entries, that touch it, just iterate over every pair of cache entries and remember that their other ends are connected (for example into a set of pairs).

    Step 2. If for a certain coordinate there's more than sqrt(n) cache entries, that touch it, store that coordinate into a vector. That vector, obviously, will have no more that sqrt(n) coordinates.

    Step 3. For every query, first check if it is in the set of pairs from the first step. If it is not, then iterate over all sqrt(n) bad coordinates from step 2, and check if any of them is connected to both ends of the query.

  • »
    »
    10 лет назад, # ^ |
    Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

    I stored every value that is in a pair with a in a set called adj[a], then for every query (a,b) checked to see if the intersection between adj[a] and adj[b] was not empty and memoized the result:

    if (adj[a].sz > adj[b].sz) swap(a,b);
    bool ok = false;
    if (adj[a].count(b)) ok = true;
    else {
    	for (set<int>::iterator it = adj[a].begin(); it != adj[a].end(); it++) {
    		if (adj[b].count(*it)) {
    			ok = true;
    			break;
    		}
    	}
    }
    ans[make_pair(a,b)]=ok;
    

    Now, the daunting question is... Does that actually work or did I cheat the test cases? :D

    • »
      »
      »
      10 лет назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      In the worst case left end of every query and left end of all the cache enties will be in the same point, in which case your solution will be O(n^2 log n). Apparently they didn't have such a test case.

      • »
        »
        »
        »
        10 лет назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        You may have seen the solution before the edit, but what about with the swap(a,b) line?

        • »
          »
          »
          »
          »
          10 лет назад, # ^ |
            Проголосовать: нравится +8 Проголосовать: не нравится

          With swap I guess the worst case is when all the queries have both ends at coordinates that have sqrt(n) cache entries touching them, in which case the complexity is fast enough.

    • »
      »
      »
      10 лет назад, # ^ |
        Проголосовать: нравится +8 Проголосовать: не нравится

      This was actually one of the judge solutions. AlexSkidanov's solution above is the other intended solution.

  • »
    »
    10 лет назад, # ^ |
    Rev. 2   Проголосовать: нравится -8 Проголосовать: не нравится

    Can be solved using bit optimization: for each coordinate store a bit-set of opposite cache entry ends. Query is answered by intersecting two bit-sets. Complexity is Q * N / 64.

    **P.S. This solution exceeds memory limit, but the tests were week enough to get passed**

»
10 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Luckily, I made it to top 50 (42, to be precise). It's a bit frustrated that I can't solve "Guessing numbers", which seemed to be a moderate problem. Can anyone give me the solution?

  • »
    »
    10 лет назад, # ^ |
      Проголосовать: нравится +16 Проголосовать: не нравится

    Pick largest 2^k — 1 numbers out of N and print their sum. If 2^k — 1 > N then answer is 1.

  • »
    »
    10 лет назад, # ^ |
      Проголосовать: нравится +13 Проголосовать: не нравится

    HellKitsune's solution is correct, but if you need to see why, represent your guesses as a balanced binary search tree with 2^k nodes. Each guess tells you on which branch the answer is going to be, so if the answer is any of the nodes of this tree, you will get it right.

»
10 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

i finished in 28 th position (Programmers' Day). any "lucky" prize for me? :D

»
10 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

I didn't find anyone solving the last problem because I didn't see the problem statement through, sigh... I thought the last problem is not solvable at the contest.

Now can anyone provide thoughts about the last problem? Thank you.

  • »
    »
    10 лет назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    The strategy in the sample case (nested rectangles) is optimal. However, it is not always possible to build such nested rectangles (for instance, if the very first color is not the same as the very last one, you will not be able to build the outermost rectangle). For the first test case I got the highest score (25pts), the strategy was to first send a run with an assert to verify that first color is indeed equal to the last color (and it is). Then it is seems reasonable that with two colors the perfect configuration is possible (if you land the very first brick into the second column, then you have plenty of buffer for extra bricks of the first color in the first column and plenty of buffer for extra bricks of the second color in the second column). Naive approach was still hitting the issue when all the columns expect the brick of one color, and the next brick has the other color. To avoid it I was building the answer from both ends, merging in the middle.

    For other cases the my strategy was similar, except that now building the perfect nested rectangles is very unlikely, so one can allocate several rightmost columns for garbage, and try to build the perfect configuration in the remaining columns. Whenever you have a brick that you don't need in any column of your configuration at the moment, just send it to one of the garbage columns.

»
10 лет назад, # |
  Проголосовать: нравится +16 Проголосовать: не нравится

The problems were very interesting. I also liked the combination of standard algorithmic problems (with a known optimal/correct answer) with "partial score" problems (Risk Management and Engulf). I have only encountered this combination in Codechef long contests before and I was happy to see it in the Addepar Hackathon, too.

I am wondering if the submissions made during the contest will become visible (this occurs for most other Hackerrank contests, but it seems that for the Addepar Hackathon this isn't the case, yet). I am particularly interested in reading the code of the top scoring solutions for the above mentioned "partial score" problems.

I will take this opportunity to also provide some feedback on the "Risk Management" scoring function. The fact that any value lower than 90% of the best value found by the problem setters got a score of 0 made it rather difficult to assess improvements — I made many submissions which got 0 points on some test cases and there was no way to know if I was close to the 90% boundary on those test cases or very far away. Maybe this was intended, but in case it wasn't, a scoring function which also provides some non-zero score for submissions obtaining less than the 90% limit would have been better for the contestants. Anyway, eventually I managed to obtain non-zero scores on all the test cases and from then on improvements were easier to assess.

  • »
    »
    10 лет назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    You have a high score on the Risk Management problem. Can you describe your approach? Did you use some properties of covariance matrix?

    I tried to run Newton's method from random vectors, because it was easy to calculate Hessian matrix here (just the same matrix as in input but with diagonal multiplied by 2). But it seems that local searches are not very good here. So I scored only ~17 points for this problem.

    • »
      »
      »
      10 лет назад, # ^ |
        Проголосовать: нравится +11 Проголосовать: не нравится

      No, I didn't use any properties of the covariance matrix. I started with a graph-based approach which only assigned -1 or +1 to the variables — I sorted the entries of the matrix in decreasing order of their absolute value. Then, for each value A(i,j), if A(i,j)>0 I tried to add i and j in the same component of the graph ; otherwise, if A(i,j) < 0 I tried to add them into different components. Since I only assigned -1 and +1, the graph was always bipartite and whenever I tried to add two vertices in the same component or in different components I only did it if the graph could be maintained bipartite after that. This got me good points on the last 3 test cases, but scored 0 on many of the other test cases.

      For the second part I used an incremental improvement approach. As long as I still had time left I chose a variable randomly — say i. Then I selected the best new value to assign to variable i, under the condition that the values of all the other variables remain the same. This is actually a 2nd degree equation. To keep things simple, however, I didn't use the properties of 2nd degree equations. Instead I iterated through possible values between -1 and +1 with a step of 0.005 and chose the value which brought the largest improvement (if any) this way.

»
10 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

How to solve Consistent Salaries ? It seems to be a segment tree, as the there's non sense to make the queries in O(N), but I haven't found a way to build such a tree the ways the problem needs, so I didn't even started to code a solution in contest time.

  • »
    »
    10 лет назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Short answer: you need to keep track the difference in salaries between adjacent vertices. If there are some vertex u such the difference between salaries of parent[u] and u is positive then it's BAD else it's GOOD. And we are able to not care about all other pairs of vertices.

    More detailed:

    You need to figure out what the ways exists how we can present our tree and current salaries.

    Let's discuss the following subproblem: We are required to have such queries:
    1) Add to value val to the subtree of u
    2) Calculate current value of u.

    One of the ways is to write down an array of the vertices during dfs in order of their visit time. Now each subtree is a segment in this array, so you need to add value on the segment and be able to get the current value of some element. But this doesn't leads to the solution. There are some other difficulties we can't handle.

    The other way is to think where the vertices u are placed which influence a change of the salary of v if we add a value to the subtree of u. They are parents of v. So they are placed on the path from the root to the v. So we can come up with an another idea of solving this subproblem (another way to represent our tree and salaries):

    We can write on the edges (parent[u], u) the difference between salaries of u and parent[u]. Then the cost of the path from root to u will be equals to the actual salary of u. And the path cost from some vertex u to v (let's call it pathCost(u, v)) will be equals to the difference salary[v] - salary[u].

    We are able to maintain such thing during processing our queries. To process query 1 we just need to add a value to the edge (parent[u], u). And to answer query 2 we need to calculate cost of the path from root to specific vertex.

    So we should answer BAD if there are exist a pair of vertices u and v that pathCost(u, v) > 0. But to have pathCost(u, v) > 0 we need to have at least one edge with a positive value. But if have a positive value on the some edge (parent[w], w) it's already guarantee that at least salary of w is greater than salary of parent[w].

    So we have to answer GOOD only if we don't have positive edges in the tree and if there are at least one positive edge then answer is BAD. As the result, the only thing we should track is the number of positive edges. So no data structures are actually needed.

    • »
      »
      »
      10 лет назад, # ^ |
        Проголосовать: нравится +14 Проголосовать: не нравится

      I spent quite some time trying to come up with some data structures-based solutions for this problem, without success. Eventually, the obvious and very simple solution that you described occurred to me and I was pleasantly surprised that the problem had such a simple solution :) [ given that, on the surface, it seemed to require some kind of non-trivial data structures ]

      • »
        »
        »
        »
        10 лет назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        Talking to a friend, he advised me to use an Link-Cut tree, is possible to model a solution for this problem using this ?

        • »
          »
          »
          »
          »
          10 лет назад, # ^ |
            Проголосовать: нравится +6 Проголосовать: не нравится

          Why would you want to switch an extremely simple array-only solution to a complicated data structure? Your friend is having some weird ideas...

          • »
            »
            »
            »
            »
            »
            10 лет назад, # ^ |
              Проголосовать: нравится 0 Проголосовать: не нравится

            It's obvious this solution is cleaner and better, but it's always good and to learn a new data structure.

            • »
              »
              »
              »
              »
              »
              »
              10 лет назад, # ^ |
                Проголосовать: нравится 0 Проголосовать: не нравится

              Yes, but it's better to learn the data structure in question on a problem that should be solved by it, not on one that shouldn't. It is, after all, important to learn which approaches are appropriate where, not just using cannons for everything (with no time left for the remaining problems...).

              • »
                »
                »
                »
                »
                »
                »
                »
                10 лет назад, # ^ |
                  Проголосовать: нравится +8 Проголосовать: не нравится

                Ok, but then could you suggest some problems that can only be solved using Link-Cut tree? I think those are very very rare. I was only able to find one problem in the past but I forgot the link :)

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  10 лет назад, # ^ |
                    Проголосовать: нравится 0 Проголосовать: не нравится

                  That's why I didn't say "solved only by it", but "solved by it". Finding a problem that's truly cut out only for a particular algorithm is hard regardless of which algorithm it is :D

                  I suppose a good example of a problem that's quite suitable for link-cut tree is its motivation problem (implementing link, cut, find).

»
10 лет назад, # |
  Проголосовать: нравится +3 Проголосовать: не нравится

where can i find the editorial for this contest?

»
10 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

How to solve the problem Guessing numbers?