Introduction to Reinforcement Learning.


Revision en1, by bhikkhu, 2023-01-01 10:29:11

We would like to create a model that, given a game state, predicts the best move.

Let's say our game is simple Tic Tac Toe. It is a small game, and we can train an AI for it in a handful of minutes.

Here is our example neural network, with the number of hidden layers reduced to avoid clutter.

In the above network, the inputs are board states.
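One possible way to turn a board into network inputs is sketched below. The encoding is an assumption, not something the post confirms: the two players are stored as -1 and 1, empty cells as 0, and the 3x3 grid is flattened row by row into 9 numbers. The name `encode_board` and the choice of which symbol maps to -1 are hypothetical.

```python
# Assumed encoding (not confirmed by the post):
# -1 and 1 are the two players, 0 is an empty cell.
EMPTY, PLAYER_A, PLAYER_B = 0, -1, 1


def encode_board(rows):
    """Flatten a 3x3 board written with 'X', 'O', '.' into 9 numbers (row-major)."""
    symbol_to_value = {"X": PLAYER_A, "O": PLAYER_B, ".": EMPTY}
    if len(rows) != 3 or any(len(row) != 3 for row in rows):
        raise ValueError("expected 3 rows of 3 cells")
    return [symbol_to_value[cell] for row in rows for cell in row]


board = encode_board(["X.O", ".X.", "..O"])
print(board)  # [-1, 0, 1, 0, -1, 0, 0, 0, 1]
```

With this layout the network would have 9 input values, one per cell.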

Let's assume the neural network always predicts from the perspective of player -1, that is, as if it is always player -1's turn.

If we can build such a neural network, we can just flip the board (swap the two players' pieces) and predict for the opposite player, easy peasy.
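The flipping trick can be sketched as follows, assuming the board is a flat list of 9 cells holding -1, 0, or 1, so that flipping is just negating every cell. All names here are hypothetical, and `first_empty_cell` is only a stand-in for the trained network, not the author's model.

```python
def flip_board(board):
    """Swap the two players by negating every cell; empty cells (0) stay 0."""
    return [-cell for cell in board]


def first_empty_cell(board):
    """Stand-in for the trained network: returns the index of the first empty cell."""
    return board.index(0)


def predict_move(board, player, predict=first_empty_cell):
    """Pick a move for either player with a predictor that only understands player -1."""
    if player not in (-1, 1):
        raise ValueError("player must be -1 or 1")
    # Player 1 sees the flipped board, so its own pieces look like player -1's.
    view = board if player == -1 else flip_board(board)
    return predict(view)


board = [-1, 0, 1, 0, -1, 0, 0, 0, 1]
print(flip_board(board))       # [1, 0, -1, 0, 1, 0, 0, 0, -1]
print(predict_move(board, 1))  # 1
```

Flipping does not move any cell, so a move index chosen on the flipped board applies directly to the original board.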

reinforcement learning, ai, self-play, tic tac toe

