Sparse Reinforcement Learning in High Dimensions

Published on 2012-04-25, 3699 views

Workshops 2012 - Cumberland Lodge

Sparse Reinforcement Learning in High Dimensions 00:00

Project’s Description (1) 01:56:16

Project’s Description (2) 05:58:26

Publications (1) 21:27:37

Publications (2) 24:52:42

Sequential Decision-Making under Uncertainty 29:55:35

Reinforcement Learning (RL) 45:26:01

Markov Decision Process 64:05:33

Value Function 90:26:14

Optimal Value Function and Optimal Policy 119:46:44

Properties of Bellman Operators 132:05:59

Dynamic Programming Algorithms (1) 140:11:18

Dynamic Programming Algorithms (2) 152:24:42

Approximate Dynamic Programming (ADP) 172:25:29

ADP Algorithms (1) 186:41:22

ADP Algorithms (2) 194:21:18

Curse of Dimensionality 202:42:12

Motivation for Studying RL in High Dimensions 214:34:16

Value Function Approximation (VFA) 246:51:55

Feature Selection (1) 261:48:27

Feature Selection in Value Function Approximation 312:32:35

Feature Selection (2) 327:07:13

Adaptive Bases for Reinforcement Learning (ECML 2010) / Adaptive Bases for Q-learning (CDC 2010) 331:20:52

Adaptive Bases for Reinforcement Learning (ECML 2010) / Adaptive Bases for Q-learning (CDC 2010): Summary 335:12:14

Adaptive Bases for Reinforcement Learning (ECML 2010) / Adaptive Bases for Q-learning (CDC 2010): Feature Selection 365:11:42

Using random projections in RL and regression 416:00:24

Compressed Least-Squares Regression (NIPS 2009) 419:54:51

Compressed Least-Squares Regression: Summary 421:36:46

LSTD with Random Projections (NIPS 2010) 492:06:03

LSTD with Random Projections: Problem 495:43:52

LSTD with Random Projections: Results (1) 505:48:05

LSTD with Random Projections: Results (2) 523:23:36

Bandit Theory Meets Compressed Sensing for High-Dimensional Stochastic Linear Bandit (AISTATS 2012) 551:17:03

Bandit Theory Meets Compressed Sensing for High-Dimensional Stochastic Linear Bandit: Summary 553:49:50

Using sparsity in value function approximation 569:36:22

Finite-Sample Analysis of Lasso-TD (ICML 2011) 574:24:27

Finite-Sample Analysis of Lasso-TD: Summary (1) 575:59:10

Finite-Sample Analysis of Lasso-TD: Summary (2) 646:03:59

Project’s Achievements 668:04:59

Towards Adaptive RL Algorithms (1) 705:42:44

Towards Adaptive RL Algorithms (2) 733:15:43

Future Work 774:41:42

Thank You! 801:35:12