
Natural Language Understanding

Published on 2017-07-27 · 10,405 views

Presentation

Natural Language Processing, Language Modelling and Machine Translation
Natural Language Processing
Language models
History: cryptography
Language models - 1
Language models - 2
Language models - 3
Evaluating a Language Model
Language Modelling Data - 1
Language Modelling Data - 2
Language Modelling Overview
N-Gram Models: The Markov Chain Assumption
N-Gram Models: Estimating Probabilities
N-Gram Models: Back-Off
N-Gram Models: Interpolated Back-Off
Provisional Summary
Outline - 1
Neural Language Models - 1
Neural Language Models - 2
Neural Language Models: Sampling - 1
Neural Language Models: Sampling - 2
Neural Language Models: Training - 1
Neural Language Models: Training - 2
Neural Language Models: Training - 3
Comparison with Count Based N-Gram LMs
Recurrent Neural Network Language Models - 1
Recurrent Neural Network Language Models - 2
Recurrent Neural Network Language Models - 3
Recurrent Neural Network Language Models - 4
Recurrent Neural Network Language Models - 5
Recurrent Neural Network Language Models - 6
Recurrent Neural Network Language Models - 7
Comparison with N-Gram LMs
Language Modelling: Review
Gated Units: LSTMs and GRUs
Deep RNN LMs - 1
Deep RNN LMs - 2
Deep RNN LMs - 3
Deep RNN LMs - 4
Deep RNN LM - 1
Deep RNN LM - 2
Scaling: Large Vocabularies - 1
Scaling: Large Vocabularies - 2
Scaling: Large Vocabularies - 3
Scaling: Large Vocabularies - 4
Scaling: Large Vocabularies - 5
Scaling: Large Vocabularies - 6
Scaling: Large Vocabularies - 7
Sub-Word Level Language Modelling
Regularisation: Dropout - 1
Regularisation: Dropout - 2
Regularisation: Bayesian Dropout (Gal)
Evaluation: hyperparameters are a confounding factor
Summary
Intro to MT
Parallel Corpora
MT History: Statistical MT at IBM - 1
MT History: Statistical MT at IBM - 2
Models of translation - 1
Models of translation - 2
IBM Model 1: The first translation attention model!
Encoder-Decoders
Recurrent Encoder-Decoders for MT - 1
Recurrent Encoder-Decoders for MT - 2
Recurrent Encoder-Decoders for MT - 3
Attention Models for MT - 1
Attention Models for MT - 2
Attention Models for MT - 3
Attention Models for MT - 4
Attention Models for MT - 5
Attention Models for MT - 6
Returning to the Noisy Channel - 1
Returning to the Noisy Channel - 2
Decoding
Decoding: Direct vs. Noisy Channel - 1
Decoding: Direct vs. Noisy Channel - 2
Decoding: Noisy Channel Model
Segment to Segment Neural Transduction
Noisy Channel Decoding
Relative Performance
The End