List: k-armed bandit (1)
RL Researcher

1. Multi-Armed Bandit Problem. The MAB problem is stated as follows. Consider the following learning problem. You are faced repeatedly with a choice among k different options, or actions. After each choice you receive a numerical reward chosen from a stationary probability distribution that depends on the action you selected. Your objective is to maximize the expected total reward over some time period. Expected total ..
Reinforcement Learning
2021. 2. 19. 02:27
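
The problem statement in the excerpt above can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the class `KArmedBandit`, the helper `total_reward`, the Gaussian reward model with unit variance, and the three example means are all hypothetical and do not come from the post. It compares the total reward of a uniformly random choice with that of always choosing the arm with the highest true mean, which a real learner would not know in advance.

```python
import random


class KArmedBandit:
    """Hypothetical k-armed bandit with stationary Gaussian rewards (assumed model)."""

    def __init__(self, means, seed=0):
        self.means = list(means)        # true expected reward of each action
        self.rng = random.Random(seed)  # private RNG so runs are reproducible

    @property
    def k(self):
        return len(self.means)

    def pull(self, action):
        # The reward distribution depends only on the chosen action
        # and does not change over time (stationary).
        return self.rng.gauss(self.means[action], 1.0)


def total_reward(bandit, policy, steps):
    """Sum the rewards collected by calling policy(t) at each time step t."""
    return sum(bandit.pull(policy(t)) for t in range(steps))


means = [0.1, 0.5, 0.9]  # assumed example values
steps = 1000

# Baseline: pick an action uniformly at random at every step.
chooser = random.Random(1)
random_total = total_reward(
    KArmedBandit(means), lambda t: chooser.randrange(len(means)), steps
)

# Oracle: always pick the arm with the highest true mean.
best_arm = max(range(len(means)), key=lambda a: means[a])
oracle_total = total_reward(KArmedBandit(means), lambda t: best_arm, steps)

print(f"random policy total reward: {random_total:.1f}")
print(f"oracle policy total reward: {oracle_total:.1f}")
```

The gap between the two totals is what a bandit algorithm tries to close: it has to estimate which action is best from the rewards it observes while still collecting as much reward as possible.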