NIPS Workshop on Learning from Multiple Sources, Whistler 2008

12 Lectures · Dec 13, 2008

About

While the machine learning community has primarily focused on analysing the output of a single data source, there have been relatively few attempts to develop a general framework, or heuristics, for analysing several data sources in terms of a shared dependency structure. Learning from multiple data sources (alternatively, the data fusion problem) is a timely research area. Due to the increasing availability and sophistication of data recording techniques and advances in data analysis algorithms, there exist many scenarios in which it is necessary to model multiple, related data sources, e.g. in fields such as bioinformatics, multi-modal signal processing, information retrieval and sensor networks.

The open question is how to analyse data that consist of more than one set of observations (or views) of the same phenomenon. In general, existing methods take a discriminative approach, in which a set of features for each data set is found in order to explicitly optimise some dependency criterion. However, a discriminative approach may result in an ad hoc algorithm, may require regularisation to ensure that erroneous shared features are not discovered, and makes it difficult to incorporate prior knowledge about the shared information. A possible way to overcome these problems is a generative probabilistic approach, which models each data stream as a sum of a shared component and a private component that models the within-set variation.
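
To make the shared/private decomposition concrete, here is a minimal Python/NumPy sketch of such a generative model; all dimensions and loading matrices below are hypothetical illustrations, not parameters taken from any particular lecture.

import numpy as np

rng = np.random.default_rng(0)
n, d_shared, d_private = 500, 2, 3   # hypothetical latent sizes
d_x, d_y = 10, 8                     # observed dimensions of each view

# One shared latent variable underlies both views; each view also has a
# private latent that models its within-set variation.
z = rng.standard_normal((n, d_shared))
z_x = rng.standard_normal((n, d_private))
z_y = rng.standard_normal((n, d_private))

# Loading matrices (in a real model these would be learned, e.g. by EM).
W_x = rng.standard_normal((d_shared, d_x))
W_y = rng.standard_normal((d_shared, d_y))
B_x = rng.standard_normal((d_private, d_x))
B_y = rng.standard_normal((d_private, d_y))

# Each observed view = shared component + private component + noise.
X = z @ W_x + z_x @ B_x + 0.1 * rng.standard_normal((n, d_x))
Y = z @ W_y + z_y @ B_y + 0.1 * rng.standard_normal((n, d_y))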

In practice, related data sources may exhibit complex co-variation (for instance, audio and visual streams related to the same video), so it is necessary to develop models that impose structured variation within and between data sources, rather than assuming a so-called 'flat' data structure. Additional methodological challenges include determining what 'useful' information to extract from the multiple data sources, and building models for predicting one data source given the others. Finally, as well as learning from multiple data sources in an unsupervised manner, there is the closely related problem of multitask learning, or transfer learning, where a task is learned with the help of other related tasks.
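
As a concrete illustration of extracting shared information and of predicting one view from another, here is a minimal sketch using scikit-learn's CCA on synthetic two-view data; the library choice and all sizes are assumptions for illustration, not the method of any specific talk.

import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n = 500
z = rng.standard_normal((n, 2))   # shared signal behind both views
X = z @ rng.standard_normal((2, 10)) + 0.1 * rng.standard_normal((n, 10))
Y = z @ rng.standard_normal((2, 8)) + 0.1 * rng.standard_normal((n, 8))

# Fit CCA on a training split to find maximally correlated projections.
cca = CCA(n_components=2).fit(X[:400], Y[:400])

# The canonical variates act as shared features for each view.
X_c, Y_c = cca.transform(X[400:], Y[400:])
print([np.corrcoef(X_c[:, i], Y_c[:, i])[0, 1] for i in range(2)])

# CCA also supports regression: predicting one view from the other.
Y_pred = cca.predict(X[400:])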

More information about the workshop: http://web.mac.com/davidrh/LMSworkshop08/

Uploaded videos:

Multiview Clustering via Canonical Correlation Analysis
Karen Livescu · 28:05 · Dec 20, 2008 · 7322 views

Multi-View Dimensionality Reduction via Canonical Correlation Analysis
Sham M. Kakade · 26:57 · Dec 20, 2008 · 5757 views

The Double-Barrelled LASSO (Sparse Canonical Correlation Analysis)
David R. Hardoon · 22:56 · Dec 20, 2008 · 4837 views

Learning Shared and Separate Features of Two Related Data Sets using GPLVMs
Gayle Leen · 01:43 · Dec 20, 2008 · 4767 views

Multiview Fisher Discriminant Analysis
Tom Diethe · 03:25 · Dec 20, 2008 · 6392 views

Selective Multitask Learning by Coupling Common and Private Representations
Jaisiel Madrid-Sanchez · 02:31 · Dec 20, 2008 · 3228 views

Regression Canonical Correlation Analysis
Jan Rupnik · 02:49 · Dec 20, 2008 · 5943 views

Multiple kernel learning for multiple sources
Francis R. Bach · 44:51 · Dec 20, 2008 · 9340 views

GP-LVM for Data Consolidation
Neil D. Lawrence · 27:25 · Dec 20, 2008 · 5302 views

Two-level infinite mixture for multi-domain data
Simon Rogers · 19:52 · Dec 20, 2008 · 3029 views

Probabilistic Models for Data Combination in Recommender Systems
Sinead Williamson · 17:46 · Dec 20, 2008 · 9326 views

Discussion & Future Directions
15:39 · Dec 20, 2008 · 3144 views