Maximum harmonic mean subarray

→ Pay attention

Before contest
Educational Codeforces Round 165 (Rated for Div. 2)
29:45:25
Register now »

→ Streams

CF Edu Round #165 Solution Discussion

By aryanc403

Before stream 31:55:23

Educational Round 165 Solution Discussion (with myself)

By Shayan

Before stream 31:55:23

View all →

→ Top rated

#	User	Rating
1	ecnerwala	3649
2	Benq	3581
3	jiangly	3578
4	orzdevinwang	3570
5	Geothermal	3569
5	cnnfls_csy	3569
7	tourist	3565
8	maroonrk	3531
9	Radewoosh	3521
10	Um_nik	3482

Countries | Cities | Organizations

View all →

→ Top contributors

#	User	Contrib.
1	maomao90	174
2	awoo	164
3	adamant	161
4	TheScrasse	159
5	nor	158
6	maroonrk	156
7	-is-this-fft-	152
8	SecondThread	146
8	orz	146
10	pajenegod	145

View all →

→ Find user

→ Recent actions

Detailed →

zetaminusone's blog

Maximum harmonic mean subarray

By zetaminusone, history, 4 years ago, In English

Let there be a sequence of integers $$$x_1, x_2, ... x_n$$$. Then, the harmonic mean is defined as

$$$\frac{n}{\sum_{i=1}^n \frac{1}{x_i}}$$$

Given an array of positive and negative integers, what is the most efficient way to find the subarray with the maximum harmonic mean? Thanks!

zetaminusone
4 years ago
7

Comments (7)

Write comment?

clyring

4 years ago, # |

What is the source of this problem? Are there any other constraints?

→ Reply

zetaminusone

4 years ago, # ^ |

This question came as a random thought as I was trying to create more questions for my school's judge. It is purely a matter of interest ^^

→ Reply

steiner

4 years ago, # ^ |

Wouldn't it be just the maximum number in array?

→ Reply

clyring

4 years ago, # ^ |

OK. That explains my weird feeling reading the problem. With no further constraints, there can be division by zero in calculating the harmonic mean of some subarrays due to negative inputs, for example with 1 -1. Even if input guarantees that this cannot happen or it is clarified that this doesn't count, the correct output can be large when the sum in the denominator only approximately cancels out. The correct output for 8 -9 -9 10 is 1440. If there's a particularly efficient way to search for such near-cancellations, it's at least not obvious to me.

If instead all input is positive, then (unless you want the problem to be 'print the maximum of the input' in disguise) some constraint on the size of the subarrays to be considered seems appropriate.

→ Reply

zetaminusone

4 years ago, # ^ |

Thank you! I honestly had not realised at first and you pointed out many insightful observations. That resolves the problem :)

→ Reply

collapse

4 years ago, # |

← Rev. 2 →

There is a stupid systematic way in $$$O(N \log^3 N)$$$. Assume without loss of generality that there is at least one positive element.

Let $$$a_i = \frac 1{x_i}$$$ and $$$p_i$$$ be $$$\sum_{j = 1}^i a_j$$$ (prefix sum array of $$$a_i$$$).

We wish to pick two endpoints $$$l$$$ and $$$r$$$ such that $$$\frac{r - l + 1}{\sum_{i=l}^r \frac 1{x_i}}$$$ is maximized.

$$$ \begin{align} &\phantom{=.}\frac{r - l + 1}{\sum_{i=l}^r \frac 1{x_i}} \\ &= \frac{r - (l - 1)}{\sum_{i=(l - 1) + 1}^r \frac 1{x_i}} \\ \end{align} $$$

Let $$$l_0 = l - 1$$$,

$$$ \begin{align} &= \frac{r - l_0}{\sum_{i=l_0 + 1}^r \frac 1{x_i}} \\ &= \frac{r - l_0}{p_r - p_{l_0}} \\ \end{align} $$$

We wish to maximize this value. We note for later that $$$p_r - p_{l_0}$$$ must be positive. We binary search on $$$k$$$ to find how many $$$0 \leq l_0 \leq r \leq N$$$ satisfy

$$$ \begin{align} &= \frac{r - l_0}{p_r - p_{l_0}} \geq k \\ &= r - l_0 \geq p_rk - p_{l_0}k \\ &= r - p_rk \geq l - p_{l_0}k \\ \end{align} $$$

Let $$$b_i = i - p_ik$$$, then we want to check how many pairs $$$0 \leq l_0 \leq r \leq N$$$ satisfy $$$b_r \geq b_l$$$ AND $$$p_r > p_{l_0}$$$. This is a 3D partial order problem which can be solved using CDQ in $$$O(N \log^2 N)$$$. (Tutorial)

For each iteration of the binary search, if no pairs $$$l_0, r$$$ are found, then $$$k$$$ is too high, otherwise, it's at least that value of $$$k$$$.

Floating point accuracy is likely a big issue. The case where all elements are negative can be handled in a similar way.

→ Reply

clyring

4 years ago, # ^ |

← Rev. 2 →

Binary searching over $$$k$$$ probably requires some care, especially when you seem to claim $$$O(\log n)$$$ iterations as opposed to something like $$$O(\log \frac{1}{\varepsilon})$$$ iterations. The former is certainly achievable, for example by modifying your CDQ queries to provide a random sample and using that $$$(l_0, r)$$$ pair's highest possible value of $$$k$$$ as your next pivot, but isn't entirely obvious.

Floating point accuracy is a very big issue. Unless the bounds are very small, I would prefer to do all arithmetic for this problem in exact arithmetic rather than risk running headfirst toward the limits of what floating-point types the compiler might provide. Obviously this makes arithmetic operations rather expensive, but this seems to be a necessary price to pay. I was already able to craft a testcase that will run afoul of the precision available in a double with $$$n = 8$$$ and $$$| a_i | \leq 1000$$$:

When a = [1, 1, -979, -846, -729, 761, 884, 885], $$$p_8 = 2 + \frac{1}{6656 823 096 297 660} > p_2 = 2$$$, but when computed in doubles in the obvious way on my machine, $$$p_8 = p_2$$$.

Come to think of it, for a fixed $$$l_0$$$, the largest harmonic mean is achieved by minimizing the slope $$$\frac{p_r - p_{l_0}}{r - l_0}$$$ among points $$$r > l_0$$$ with $$$p_r > p_{l_0}$$$, which is a convex-hull query, and can be answered in $$$O((\log n)^2)$$$ arithmetic operations per query by maintaining a segment tree of dynamic (lower) convex hulls and adding points $$$(x, p_x)$$$ in decreasing order of $$$p_x$$$.

→ Reply