Addepar Hackathon - Codeforces

#	User	Rating
1	tourist	3880
2	jiangly	3669
3	ecnerwala	3654
4	Benq	3627
5	orzdevinwang	3612
6	Geothermal	3569
6	cnnfls_csy	3569
8	jqdai0815	3532
9	Radewoosh	3522
10	gyh20	3447

#	User	Contrib.
1	awoo	161
2	maomao90	160
3	adamant	156
4	maroonrk	153
5	atcoder_official	148
5	-is-this-fft-	148
5	SecondThread	148
8	Petr	147
9	nor	144
10	TheScrasse	142

Hi everyone! Addepar will be hosting an 8-hour long contest on Friday, June 27th at 4PM PDT. The contest will feature eight algorithmic problems, and prizes will be awarded to the top scorers, including iPad Minis for the top five, Addepar t-shirts for the top fifty, and a free trip to California to meet with Addepar cofounder Joe Lonsdale for the winner. You can register and find out more about the contest here.

Addepar is a tech startup in Mountain View, California that is building the next generation of financial software for institutions. It’s my second month here and it’s been a lot of fun so far -- we have tough challenges to work on and a strong, friendly community. We are actively recruiting and will be in touch with the top contestants about opportunities at Addepar. See our careers page for more info!

Problems were prepared by me (scott_wu), aime15, alex_wang, srnth, and Vladimir Novakovski. Good luck! We hope you enjoy the contest. :)

P.S.: Stay tuned for a possible Codeforces Addepar contest sometime in the near future!

UPD: Congratulations to the top five contestants, who will all be winning iPad Minis:

Special congratulations to Alex_2oo8, who will win a free trip to Mountain View to meet with Joe Lonsdale. Thanks to everyone for participating!

#include <stdio.h> #include <ext/pb_ds/assoc_container.hpp> #include <ext/pb_ds/tree_policy.hpp> #include <vector> #include <algorithm> using namespace std; typedef __gnu_pbds::tree< int, __gnu_pbds::null_type, std::less<int>, __gnu_pbds::rb_tree_tag, __gnu_pbds::tree_order_statistics_node_update> ordered_set; vector<pair<int, int> > v; int main() { ordered_set st; int n; scanf("%d", &n); for (int i = 0; i < n; ++ i) { int m, b; scanf("%d %d", &m, &b); v.push_back(make_pair(b, m)); } sort(v.begin(), v.end()); long long ans = 0; for (int i = 0; i < n; ++ i) { long long order = st.order_of_key(v[i].second); ans += order * (st.size() - order); st.insert(v[i].second); } st.clear(); for (int i = n - 1; i >= 0; -- i) { long long order = st.order_of_key(v[i].second); ans += order * (st.size() - order); st.insert(v[i].second); } printf("%lld\n", ans); return 0; }

if (adj[a].sz > adj[b].sz) swap(a,b); bool ok = false; if (adj[a].count(b)) ok = true; else { for (set<int>::iterator it = adj[a].begin(); it != adj[a].end(); it++) { if (adj[b].count(*it)) { ok = true; break; } } } ans[make_pair(a,b)]=ok;

Comments (54)

Write comment?

MikeMirzayanov

10 years ago, # |

+118

We will be glad to host Codeforces Addepar! )

→ Reply

AkshajK

-20

Is this an Art of Problem Solving contest? :)

sreddy2

What languages can we submit solutions in?

srnth

10 years ago, # ^ |

A list of supported languages can be found at https://hackerrank.com/environment.

zeliboba

+11

Why so unconvinient time for Russia? =)

Xellos

+12

More like for Europe in general, it's totally through the night.

Coder

+32

Because it is not on codeforces where all contests are convenient for Russia/Europe only

I often participate in contests on hackerrank, usually they runs in convinient time for Russia and Europe.

popoffka

+15

They're just letting European IOI teams try out what it feels like to write a contest when it's the middle of the night in your "native" time zone (since IOI is taking place in Taiwan this year) :)

+14

What we (the Slovak team) did last year was arriving 3 days earlier to get used to the jetlag — and take a trip to the ocean. I managed to dip my leg in a roadside swamp on that trip...

+23

Lucky you. We usually arrive at the last possible moment and then sleep at the opening ceremony, because that's how our state funding works.

aime15

Contest is live here. Good luck!!

hogloid

Judge of "Engulf" seems to be broken. I can't get the results.

Egor

Will full score be available shortly?

scott_wu

Yes, scores should be up in about 15 or 20 minutes.

Thanks for participating in the contest!

AlexSkidanov

We can just share the results for the last problem. I have 874, 3K, 15K and 20K.

Alex_2oo8

I have 774, 7018, 35821, 35819

mmaxio

Are these in order? I have 21k for #3, but only 8k for #4 :D

HellKitsune

740 5670 54678 34850

I have 708,5.8K, 52.5K,43K

I loved this contest so much! Thanks a lot for the night of fun ^_^

Glad you enjoyed it. Thanks for participating!

The problemset was so nice !! Both normal tasks and mini-marathon tasks were interesting.

I especially like Yaxis. It seems to be geometric problem, but actually data-structure problem.

xennygrimmato

What was the solution for that problem?

+31

One can leverage a hidden C++ datastructure to solve this problem with very little code

oversolver

how to solve "Computing Analytics"?

← Rev. 2 →

+24

Step 1. If for a certain coordinate there's less than sqrt(n) cache entries, that touch it, just iterate over every pair of cache entries and remember that their other ends are connected (for example into a set of pairs).

Step 2. If for a certain coordinate there's more than sqrt(n) cache entries, that touch it, store that coordinate into a vector. That vector, obviously, will have no more that sqrt(n) coordinates.

Step 3. For every query, first check if it is in the set of pairs from the first step. If it is not, then iterate over all sqrt(n) bad coordinates from step 2, and check if any of them is connected to both ends of the query.

ffao

I stored every value that is in a pair with a in a set called adj[a], then for every query (a,b) checked to see if the intersection between adj[a] and adj[b] was not empty and memoized the result:

Now, the daunting question is... Does that actually work or did I cheat the test cases? :D

In the worst case left end of every query and left end of all the cache enties will be in the same point, in which case your solution will be O(n^2 log n). Apparently they didn't have such a test case.

You may have seen the solution before the edit, but what about with the swap(a,b) line?

With swap I guess the worst case is when all the queries have both ends at coordinates that have sqrt(n) cache entries touching them, in which case the complexity is fast enough.

This was actually one of the judge solutions. AlexSkidanov's solution above is the other intended solution.

indy256

-8

Can be solved using bit optimization: for each coordinate store a bit-set of opposite cache entry ends. Query is answered by intersecting two bit-sets. Complexity is Q * N / 64.

**P.S. This solution exceeds memory limit, but the tests were week enough to get passed**

Aquacloud

Luckily, I made it to top 50 (42, to be precise). It's a bit frustrated that I can't solve "Guessing numbers", which seemed to be a moderate problem. Can anyone give me the solution?

+16

Pick largest 2^k — 1 numbers out of N and print their sum. If 2^k — 1 > N then answer is 1.

+13

HellKitsune's solution is correct, but if you need to see why, represent your guesses as a balanced binary search tree with 2^k nodes. Each guess tells you on which branch the answer is going to be, so if the answer is any of the nodes of this tree, you will get it right.

JuanMata

i finished in 2⁸ th position (Programmers' Day). any "lucky" prize for me? :D

ysymyth

I didn't find anyone solving the last problem because I didn't see the problem statement through, sigh... I thought the last problem is not solvable at the contest.

Now can anyone provide thoughts about the last problem? Thank you.

The strategy in the sample case (nested rectangles) is optimal. However, it is not always possible to build such nested rectangles (for instance, if the very first color is not the same as the very last one, you will not be able to build the outermost rectangle). For the first test case I got the highest score (25pts), the strategy was to first send a run with an assert to verify that first color is indeed equal to the last color (and it is). Then it is seems reasonable that with two colors the perfect configuration is possible (if you land the very first brick into the second column, then you have plenty of buffer for extra bricks of the first color in the first column and plenty of buffer for extra bricks of the second color in the second column). Naive approach was still hitting the issue when all the columns expect the brick of one color, and the next brick has the other color. To avoid it I was building the answer from both ends, merging in the middle.

For other cases the my strategy was similar, except that now building the perfect nested rectangles is very unlikely, so one can allocate several rightmost columns for garbage, and try to build the perfect configuration in the remaining columns. Whenever you have a brick that you don't need in any column of your configuration at the moment, just send it to one of the garbage columns.

mugurelionut

The problems were very interesting. I also liked the combination of standard algorithmic problems (with a known optimal/correct answer) with "partial score" problems (Risk Management and Engulf). I have only encountered this combination in Codechef long contests before and I was happy to see it in the Addepar Hackathon, too.

I am wondering if the submissions made during the contest will become visible (this occurs for most other Hackerrank contests, but it seems that for the Addepar Hackathon this isn't the case, yet). I am particularly interested in reading the code of the top scoring solutions for the above mentioned "partial score" problems.

I will take this opportunity to also provide some feedback on the "Risk Management" scoring function. The fact that any value lower than 90% of the best value found by the problem setters got a score of 0 made it rather difficult to assess improvements — I made many submissions which got 0 points on some test cases and there was no way to know if I was close to the 90% boundary on those test cases or very far away. Maybe this was intended, but in case it wasn't, a scoring function which also provides some non-zero score for submissions obtaining less than the 90% limit would have been better for the contestants. Anyway, eventually I managed to obtain non-zero scores on all the test cases and from then on improvements were easier to assess.

cmd

You have a high score on the Risk Management problem. Can you describe your approach? Did you use some properties of covariance matrix?

I tried to run Newton's method from random vectors, because it was easy to calculate Hessian matrix here (just the same matrix as in input but with diagonal multiplied by 2). But it seems that local searches are not very good here. So I scored only ~17 points for this problem.

No, I didn't use any properties of the covariance matrix. I started with a graph-based approach which only assigned -1 or +1 to the variables — I sorted the entries of the matrix in decreasing order of their absolute value. Then, for each value A(i,j), if A(i,j)>0 I tried to add i and j in the same component of the graph ; otherwise, if A(i,j) < 0 I tried to add them into different components. Since I only assigned -1 and +1, the graph was always bipartite and whenever I tried to add two vertices in the same component or in different components I only did it if the graph could be maintained bipartite after that. This got me good points on the last 3 test cases, but scored 0 on many of the other test cases.

For the second part I used an incremental improvement approach. As long as I still had time left I chose a variable randomly — say i. Then I selected the best new value to assign to variable i, under the condition that the values of all the other variables remain the same. This is actually a 2nd degree equation. To keep things simple, however, I didn't use the properties of 2nd degree equations. Instead I iterated through possible values between -1 and +1 with a step of 0.005 and chose the value which brought the largest improvement (if any) this way.

aajjbb

How to solve Consistent Salaries ? It seems to be a segment tree, as the there's non sense to make the queries in O(N), but I haven't found a way to build such a tree the ways the problem needs, so I didn't even started to code a solution in contest time.

Short answer: you need to keep track the difference in salaries between adjacent vertices. If there are some vertex u such the difference between salaries of parent[u] and u is positive then it's BAD else it's GOOD. And we are able to not care about all other pairs of vertices.

More detailed:

You need to figure out what the ways exists how we can present our tree and current salaries.

Let's discuss the following subproblem: We are required to have such queries:
1) Add to value val to the subtree of u
2) Calculate current value of u.

One of the ways is to write down an array of the vertices during dfs in order of their visit time. Now each subtree is a segment in this array, so you need to add value on the segment and be able to get the current value of some element. But this doesn't leads to the solution. There are some other difficulties we can't handle.

The other way is to think where the vertices u are placed which influence a change of the salary of v if we add a value to the subtree of u. They are parents of v. So they are placed on the path from the root to the v. So we can come up with an another idea of solving this subproblem (another way to represent our tree and salaries):

We can write on the edges (parent[u], u) the difference between salaries of u and parent[u]. Then the cost of the path from root to u will be equals to the actual salary of u. And the path cost from some vertex u to v (let's call it pathCost(u, v)) will be equals to the difference salary[v] - salary[u].

We are able to maintain such thing during processing our queries. To process query 1 we just need to add a value to the edge (parent[u], u). And to answer query 2 we need to calculate cost of the path from root to specific vertex.

So we should answer BAD if there are exist a pair of vertices u and v that pathCost(u, v) > 0. But to have pathCost(u, v) > 0 we need to have at least one edge with a positive value. But if have a positive value on the some edge (parent[w], w) it's already guarantee that at least salary of w is greater than salary of parent[w].

So we have to answer GOOD only if we don't have positive edges in the tree and if there are at least one positive edge then answer is BAD. As the result, the only thing we should track is the number of positive edges. So no data structures are actually needed.

I spent quite some time trying to come up with some data structures-based solutions for this problem, without success. Eventually, the obvious and very simple solution that you described occurred to me and I was pleasantly surprised that the problem had such a simple solution :) [ given that, on the surface, it seemed to require some kind of non-trivial data structures ]

Talking to a friend, he advised me to use an Link-Cut tree, is possible to model a solution for this problem using this ?

Why would you want to switch an extremely simple array-only solution to a complicated data structure? Your friend is having some weird ideas...

It's obvious this solution is cleaner and better, but it's always good and to learn a new data structure.

Yes, but it's better to learn the data structure in question on a problem that should be solved by it, not on one that shouldn't. It is, after all, important to learn which approaches are appropriate where, not just using cannons for everything (with no time left for the remaining problems...).

I_love_Hoang_Yen

Ok, but then could you suggest some problems that can only be solved using Link-Cut tree? I think those are very very rare. I was only able to find one problem in the past but I forgot the link :)

That's why I didn't say "solved only by it", but "solved by it". Finding a problem that's truly cut out only for a particular algorithm is hard regardless of which algorithm it is :D

I suppose a good example of a problem that's quite suitable for link-cut tree is its motivation problem (implementing link, cut, find).

space_man

where can i find the editorial for this contest?

coolbun

How to solve the problem Guessing numbers?

By reading the comments above.

scott_wu's blog