AI outperforms IMO gold medalists in geometry (this isn't surprising, relevant details in the post)

»

123gjweq2

5 недель назад, # |

+48

first they came for our geometry problems...

→ Ответить

»

AvaraKedavra

5 недель назад, # ^ |

+8

then they came for our adhoc problems...

→ Ответить

»

WeaponizedAutist

5 недель назад, # |

-30

who cares

AI waifus when?

→ Ответить

»

Guuber

5 недель назад, # ^ |

+33

What good are AI waifus if they can't beat me at IMO number theory

→ Ответить

»

WeaponizedAutist

5 недель назад, # ^ |

-34

you are just a blue, intelligence should not be very high on your priorities

→ Ответить

»

Guuber

5 недель назад, # ^ |

+21

:D. That logic doesn't make any sense. If AI is worse than me, I can complain that it's too bad. The complaining makes a lot more sense than if I was red.

If someone is very poor they have much more reason to say "I hope my future partner isn't even poorer than me!" than a billionaire would

→ Ответить

»

WeaponizedAutist

5 недель назад, # ^ |

-53

redditor detected, opinion rejected

→ Ответить

»

Red0

5 недель назад, # ^ |

0

you are an inactive purple :)

→ Ответить

»

shiven

5 недель назад, # |

+25

Imagine not tagging :|

This definitely doesn't mean that AI has 'conquered math', or even different domains of IMO. You're correct that the work speaks for the narrow domain of Geometry in math competitions.

The important highlight is the potential of symbolic methods and their huge gains in efficiency. They manage to solve 21/30 problems, close to the performance of the average silver medalist, under a time limit of 5 minutes on a laptop CPU. Most take under 5 seconds. I invite you to contrast that to the amount of compute it takes to train and infer with LLMs in the pipeline (e.g. AlphaGeometry) and you'll be surprised :)

→ Ответить

»

Guuber

5 недель назад, # ^ |

0

That is very true! I mostly saw people reacting to the 27 / 30 number and maybe wrote the post too much from that context. I should have written the post from a more neutral viewpoint!

→ Ответить

»

Guuber

5 недель назад, # ^ |

0

(Also didn't realize I could / should check if the authors of the paper were on codeforces...)

→ Ответить

»

hashman

5 недель назад, # ^ |

0

Also, geometry is much more mechanical than algebra, combinatorics, or NT. In my opinion (from my fairly limited MO experience), the order this will go is: geometry -> algebra -> NT -> combinatorics, because combinatorics is the least mechanical out of the four. So, when AI solves combinatorics is when we should really be worried.

→ Ответить

»

shiven

5 недель назад, # ^ |

0

Indeed, there are several things that make Geometry a good target for mechanical provers. We highlight some of the reasons for this in our paper.

As for ranking others, although I see where you're coming from, I don't think I can make a confident guess sadly. There may even exist pecularities that make some parts easier. For example, I can see many combinatorics problem which address a small enough finite set to be brute-forceable by computers, without using any 'reasoning' as such. Consider something about counting the number of ways to do X such that property Y is satisfied. I would assume quite a few LLMs would be able to write brute-force code that iterates over all ways to do X and checks Y. Similar to how ChatGPT can write inefficient solutions for CP problems. Sure, the computer didn't solve it in the way you'd expect. But MO problems just weren't designed to handle this amount of compute. Humans need to resort to elegant reasoning to solve those puzzles, but computers just might not need to :) Of course, that is a very specific example that you can't extend to different problems, but it's just meant to point out that the computational ways to address these problems might be different from how humans think about them.

→ Ответить

»

hashman

5 недель назад, # ^ |

0

Good points. But the thing I was talking about was problems of the form "prove something general about this infinite set of objects (could be anything, sets, numbers, boards, etc)", for which there is no finite case check approach, and you're forced to actually reason your way to the solution.

→ Ответить

»