Sparse Reinforcement Learning in High Dimensions

Published on 2012-04-25
3699 Views

Presentation

Sparse Reinforcement Learning in High Dimensions 00:00
Project’s Description (1) 01:56:16
Project’s Description (2) 05:58:26
Publications (1) 21:27:37
Publications (2) 24:52:42
Sequential Decision-Making under Uncertainty 29:55:35
Reinforcement Learning (RL) 45:26:01
Markov Decision Process 64:05:33
Value Function 90:26:14
Optimal Value Function and Optimal Policy 119:46:44
Properties of Bellman Operators 132:05:59
Dynamic Programming Algorithms (1) 140:11:18
Dynamic Programming Algorithms (2) 152:24:42
Approximate Dynamic Programming (ADP) 172:25:29
ADP Algorithms (1) 186:41:22
ADP Algorithms (2) 194:21:18
Curse of Dimensionality 202:42:12
Motivation for Studying RL in High Dimensions 214:34:16
Value Function Approximation (VFA) 246:51:55
Feature Selection (1) 261:48:27
Feature Selection in Value Function Approximation 312:32:35
Feature Selection (2) 327:07:13
Adaptive Bases for Reinforcement Learning (ECML 2010) / Adaptive Bases for Q-learning (CDC 2010) 331:20:52
Adaptive Bases for Reinforcement Learning (ECML 2010) / Adaptive Bases for Q-learning (CDC 2010): Summary 335:12:14
Adaptive Bases for Reinforcement Learning (ECML 2010) / Adaptive Bases for Q-learning (CDC 2010): Feature Selection 365:11:42
Using Random Projections in RL and Regression 416:00:24
Compressed Least-Squares Regression (NIPS 2009) 419:54:51
Compressed Least-Squares Regression: Summary 421:36:46
LSTD with Random Projections (NIPS 2010) 492:06:03
LSTD with Random Projections: Problem 495:43:52
LSTD with Random Projections: Results (1) 505:48:05
LSTD with Random Projections: Results (2) 523:23:36
Bandit Theory Meets Compressed Sensing for High-Dimensional Stochastic Linear Bandit (AISTATS 2012) 551:17:03
Bandit Theory Meets Compressed Sensing for High-Dimensional Stochastic Linear Bandit: Summary 553:49:50
Using Sparsity in Value Function Approximation 569:36:22
Finite-Sample Analysis of Lasso-TD (ICML 2011) 574:24:27
Finite-Sample Analysis of Lasso-TD: Summary (1) 575:59:10
Finite-Sample Analysis of Lasso-TD: Summary (2) 646:03:59
Project’s Achievements 668:04:59
Towards Adaptive RL Algorithms (1) 705:42:44
Towards Adaptive RL Algorithms (2) 733:15:43
Future Work 774:41:42
Thank You! 801:35:12