Posts tagged 'rl'

RL Researcher

MAB Problem

1. Multi-Armed Bandit Problem The MAB problem is stated as follows. Consider the following learning problem. You are faced repeatedly with a choice among k different options, or actions. After each choice you receive a numerical reward chosen from a stationary probability distribution that depends on the action you selected. Your objective is to maximize the expected total reward over some time period. Expected total ..

Reinforcement Learning 2021. 2. 19. 02:27
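
The problem statement above can be made concrete with a small simulation. The following is a minimal sketch, not taken from the post: it assumes Gaussian reward distributions with fixed means (one way to satisfy "stationary"), and it uses an epsilon-greedy agent with sample-average value estimates purely as an illustrative strategy. The names `KArmedBandit` and `run_epsilon_greedy` are hypothetical.

```python
import random


class KArmedBandit:
    """k actions, each with a fixed (stationary) reward distribution.

    Assumption for illustration: rewards are Gaussian with unit variance
    around a per-action mean that never changes.
    """

    def __init__(self, k, rng):
        self.rng = rng
        self.means = [rng.gauss(0.0, 1.0) for _ in range(k)]

    def pull(self, action):
        # The reward depends only on the selected action.
        return self.rng.gauss(self.means[action], 1.0)


def run_epsilon_greedy(bandit, steps, epsilon, rng):
    """Play `steps` rounds and return (total reward, estimates, counts)."""
    k = len(bandit.means)
    estimates = [0.0] * k  # running estimate of each action's mean reward
    counts = [0] * k       # how many times each action was chosen
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(k)  # explore: random action
        else:
            action = max(range(k), key=lambda a: estimates[a])  # exploit
        reward = bandit.pull(action)
        counts[action] += 1
        # Incremental sample average: new = old + (reward - old) / n
        estimates[action] += (reward - estimates[action]) / counts[action]
        total += reward
    return total, estimates, counts


rng = random.Random(0)
bandit = KArmedBandit(k=10, rng=rng)
total, estimates, counts = run_epsilon_greedy(bandit, steps=2000, epsilon=0.1, rng=rng)
print("total reward:", total)
print("most-pulled action:", counts.index(max(counts)))
```

Here `total` is the quantity the agent tries to make large in expectation. The exploration rate of 0.1 and the horizon of 2000 steps are arbitrary choices for the sketch.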

Human-level control through deep reinforcement learning

This paper uses the DQN algorithm. I plan to keep updating this post. DQN stands for Deep Q-Network, an agent built on an artificial neural network, specifically a deep neural network (Deep NN). The algorithm uses layers of tiled convolutional filters to mimic the effects of receptive fields. The agent's goal is to select actions in a way that maximises cumulative reward. A deep convolutional neural network is used to approximate the optimal value function. $$Q^{*}(s,a) = \underset{\pi}{\max}E[r_{t} + \gamma r..

Reinforcement Learning/Paper Review 2021. 2. 15. 11:23
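
As a rough illustration of the "cumulative reward" the agent is said to maximise, the sketch below computes a discounted sum of rewards with discount factor γ. It does not attempt to complete the truncated formula in the excerpt. The function name `discounted_return` and the sample reward sequence are assumptions made for the example.

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma * r_1 + gamma**2 * r_2 + ... for a finite reward list.

    Assumption: rewards[i] is the reward received i steps after the
    starting time step, and 0 <= gamma <= 1.
    """
    g = 0.0
    # Work backwards so each step is a single multiply-add: G_t = r_t + gamma * G_{t+1}
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Three rewards of 1.0 with gamma = 0.5: 1 + 0.5 + 0.25
print(discounted_return([1.0, 1.0, 1.0], 0.5))  # 1.75
```

With γ = 1 this reduces to the plain sum of rewards. Smaller values of γ weight near-term rewards more heavily.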
