Kernel methods are well-established machine learning approaches that can learn complex relationships in data by means of high-dimensional Hilbert spaces. Operator-valued kernel (OVK) methods extend these approaches to multi-dimensional output domains, where several target functions are predicted jointly.
OPERA-lib will provide various machine learning methods for structured learning utilising OVKs. The library is a Python module built on the standard open-source libraries NumPy, SciPy and Matplotlib, and is designed for compatibility with the Scikit-learn machine learning library.
Let there be an N-sized dataset $\{(x_i, y_i)\}_{i=1}^{N}$ with inputs $x_i \in \mathcal{X}$ and outputs $y_i \in \mathbb{R}^p$. We wish to estimate a p-dimensional target function

$$f_j : \mathcal{X} \to \mathbb{R}, \qquad j = 1, \ldots, p,$$

which is in vector notation

$$f(x) = (f_1(x), \ldots, f_p(x))^\top \in \mathbb{R}^p.$$

The full operator-valued kernel over the dataset is the block matrix

$$\mathbf{K} = \begin{pmatrix} K(x_1, x_1) & \cdots & K(x_1, x_N) \\ \vdots & \ddots & \vdots \\ K(x_N, x_1) & \cdots & K(x_N, x_N) \end{pmatrix} \in \mathbb{R}^{Np \times Np},$$

where each block consists of $p \times p$ kernel values, for instance over the "components" $i, j$ of a decomposable kernel

$$K(x, x')_{ij} = k(x, x')\, A_{ij}, \qquad i, j = 1, \ldots, p,$$

where $p$ is the number of targets, $k$ is a scalar kernel and $A$ is a $p \times p$ positive semi-definite matrix. A common scalar kernel is the Gaussian kernel

$$k(x, x') = \exp\!\left(-\frac{\|x - x'\|^2}{2\sigma^2}\right).$$
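As an illustrative sketch (not OPERA-lib's API), the full block Gram matrix of a decomposable kernel $k(x, x')\,A$ with a Gaussian scalar kernel can be assembled in NumPy with a Kronecker product:

```python
import numpy as np

def gaussian_kernel(X, sigma=1.0):
    """N x N scalar Gaussian Gram matrix k(x_i, x_j) over the rows of X."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma**2))

def ovk_gram(X, A, sigma=1.0):
    """Np x Np block Gram matrix of the decomposable OVK k(x, x') * A."""
    k = gaussian_kernel(X, sigma)   # N x N scalar kernel values
    return np.kron(k, A)            # block (i, j) equals k(x_i, x_j) * A

# Example: N = 3 inputs in R^2, p = 2 targets, independent targets (A = I)
X = np.random.RandomState(0).randn(3, 2)
K = ovk_gram(X, np.eye(2))
print(K.shape)                      # (6, 6)
```

With `A = np.eye(p)` the targets are modelled independently; a non-diagonal positive semi-definite `A` couples them.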
The OVK method extends kernel ridge regression and classification in a straightforward way.
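A minimal sketch of this extension, again assuming a decomposable kernel with a Gaussian scalar kernel (an illustration, not OPERA-lib's implementation): the stacked coefficients solve the ridge system $(\mathbf{K} + \lambda \mathbf{I})\,\mathbf{c} = \mathbf{y}$, and predictions are $f(x) = \sum_i k(x, x_i)\, A\, c_i$.

```python
import numpy as np

def fit_ovk_ridge(X, Y, A, sigma=1.0, lam=1e-3):
    """Solve (K + lam * I) c = vec(Y) for the N x p coefficient matrix."""
    N, p = Y.shape
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    k = np.exp(-d2 / (2.0 * sigma**2))          # N x N scalar Gram matrix
    K = np.kron(k, A)                           # Np x Np operator-valued Gram
    c = np.linalg.solve(K + lam * np.eye(N * p), Y.ravel())
    return c.reshape(N, p)                      # row i is c_i

def predict_ovk_ridge(X_train, C, A, x_new, sigma=1.0):
    """f(x_new) = sum_i k(x_new, x_i) * A @ c_i."""
    d2 = np.sum((X_train - x_new)**2, axis=1)
    k = np.exp(-d2 / (2.0 * sigma**2))          # k(x_new, x_i) for each i
    return (k[:, None] * (C @ A.T)).sum(axis=0)

# Example: N = 5 points, p = 3 targets; small lam nearly interpolates
rng = np.random.RandomState(0)
X, Y = rng.randn(5, 2), rng.randn(5, 3)
C = fit_ovk_ridge(X, Y, np.eye(3), lam=1e-6)
print(predict_ovk_ridge(X, C, np.eye(3), X[0]))  # close to Y[0]
```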
To install OPERA-lib, use the Python pip package manager. To update OPERA-lib to the most recent version, also use pip. For instructions on installing pip itself, see https://pip.pypa.io/en/latest/installing.html
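The commands would look as follows, assuming the package is published on PyPI under the name `operalib` (the actual package name may differ):

```shell
# Install OPERA-lib (package name assumed; adjust if it differs on PyPI)
pip install operalib

# Update to the most recent version
pip install --upgrade operalib
```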
Input-Output Kernel Regression (IOKR)
Link prediction is addressed as an output kernel learning task through semi-supervised Output Kernel Regression. Working in the framework of RKHS theory with vector-valued functions, we establish a new representer theorem devoted to semi-supervised least squares regression. We then apply it to obtain a new model (POKR: Penalized Output Kernel Regression) and show its relevance using numerical experiments on artificial networks and two real applications, using a very low percentage of labeled data in a transductive setting.
- Céline Brouard, Florence d'Alché-Buc and Marie Szafranski (2011): Semi-Supervised Penalized Output Kernel Regression for Link Prediction. In ICML 2011.
Reverse engineering of gene regulatory networks remains a central challenge in computational systems biology, despite recent advances facilitated by benchmark in silico challenges that have aided in calibrating their performance. Nonlinear dynamical models are particularly appropriate for this inference task, given the generation mechanism of the time-series data. We have introduced a novel nonlinear autoregressive model based on operator-valued kernels. A flexible boosting algorithm (OKVAR-Boost) that shares features from L2-boosting and randomization-based algorithms is developed to perform the tasks of parameter learning and network inference for the proposed model.
- Lim et al. (2013): OKVAR-Boost: a novel boosting algorithm to infer nonlinear dynamics and interactions in gene regulatory networks. Bioinformatics 29 (11):1416-1423.
Large-scale OVK learning
OVK ODE models
- Vazquez and Walter (2003): Multi-output support vector regression. In IFAC Symposium on System Identification.
- Early paper connecting Gaussian processes, kriging and kernel methods over multi-dimensional outputs.
- Caponnetto, Micchelli, Pontil and Ying (2008): Universal multi-task kernels. JMLR 9:1615-1646
- Analysis and examples of classes of positive semi-definite and universal operator-valued kernels.
- Micchelli and Pontil (2005): On learning vector-valued functions. Neural Computation 17(1):177-204.
- Theory and practical considerations for OVKs.
- Kadri, Ghavamzadeh and Preux (2013): A generalized kernel approach to structured output learning. ICML
- Structured output learning with empirical covariance OVKs with various image recognition applications and experiments.
- Lim, Senbabaoglu, Michailidis and d'Alché-Buc (2013): OKVAR-Boost: a novel boosting algorithm to infer nonlinear dynamics and interactions in gene regulatory networks. Bioinformatics 29 (11):1416-1423.
- Auto-regressive time-series models using operator-valued kernels and boosting.
- Brouard, d'Alché-Buc and Szafranski (2011): Semi-Supervised Penalized Output Kernel Regression for Link Prediction. In ICML 2011.
- Operator-valued output kernels, with semi-supervised extensions and a network inference application.