Research areas
Metric and Representation Learning
The objective of representation and metric learning is to build new representation spaces that improve the performance of classification, regression, or clustering algorithms, either from distance constraints or by exploiting a fine-grained decomposition of instances in complete samples. We address this topic from both a theoretical and a practical standpoint, providing new methods and algorithms in metric learning and deep learning.
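To illustrate what learning a metric can mean, here is a minimal sketch assuming a Mahalanobis-style distance, where a linear map `L` reshapes the space before the Euclidean distance is taken. The function name `mahalanobis` and the matrices below are illustrative assumptions, not the team's method; in practice `L` would be learned from distance constraints rather than fixed by hand.

```python
import math

def mahalanobis(x, y, L):
    """Euclidean distance between x and y after applying the linear map L.

    This equals sqrt((x - y)^T M (x - y)) with M = L^T L.
    """
    diff = [a - b for a, b in zip(x, y)]
    # Each row of L produces one coordinate of the projected difference.
    projected = [sum(l_ij * d_j for l_ij, d_j in zip(row, diff)) for row in L]
    return math.sqrt(sum(p * p for p in projected))

# Hypothetical maps: the identity recovers the plain Euclidean distance,
# while the second map stretches the first axis and shrinks the second.
identity = [[1.0, 0.0], [0.0, 1.0]]
stretch_first = [[2.0, 0.0], [0.0, 0.5]]

x, y = [0.0, 0.0], [3.0, 4.0]
d_euclid = mahalanobis(x, y, identity)        # plain Euclidean distance
d_learned = mahalanobis(x, y, stretch_first)  # distance in the reshaped space
```

A metric learning algorithm would choose `L` so that pairs constrained to be similar end up close and pairs constrained to be dissimilar end up far apart.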
Transfer Learning and Domain Adaptation
Transfer learning, or domain adaptation, addresses the problem of transferring knowledge or models from a given source task to a related but different target task. The Data Intelligence team works on several subproblems, including avoiding negative transfer, scalability, lifelong learning, and adaptive online learning.
Machine Learning for Fraud and Anomaly Detection
We develop machine learning methods dedicated to fraud and anomaly detection. A particularity of this topic is the need to deal with highly unbalanced datasets in a context where fraudsters' strategies evolve over time. Our work targets applications in bank fraud detection, anomaly detection in medicine, and accident prevention.
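One common way to cope with highly unbalanced datasets is to reweight classes by inverse frequency, so that rare fraudulent examples count more during training. The sketch below is a generic heuristic given for illustration only, not the team's approach; the function name `balanced_class_weights` and the toy labels are assumptions.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Return one weight per class: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Hypothetical toy dataset: 98 legitimate transactions (0) and 2 frauds (1).
labels = [0] * 98 + [1] * 2
weights = balanced_class_weights(labels)
# The rare class receives a much larger weight than the frequent one.
```

With these weights, each class contributes the same total weight to a weighted loss, regardless of how many examples it has.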
Machine Learning for Computer Vision Applications
We develop machine learning approaches dedicated to computer vision applications. In particular, we focus on finding relevant representations for describing images or videos, on instance-based scene labeling, and on color constancy.
Machine Learning for Natural Language Processing
Natural Language Processing (NLP) now relies heavily on machine learning techniques across a variety of tasks. Our team contributes to this field in three main directions. First, we develop new learning models for language learning that take contextual and semantic information into account to facilitate and improve learning. Second, we focus on methods for learning new word embeddings (word representations) that can be used by NLP methods. Third, we investigate deep learning techniques for text summarization.
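As a small illustration of how word embeddings are typically used, the sketch below compares words through the cosine similarity of their vectors. The three-dimensional vectors are made-up values for demonstration, not learned embeddings, and `cosine_similarity` is a hypothetical helper name.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings; real ones have hundreds of dimensions.
embeddings = {
    "cat": [1.0, 0.0, 0.0],
    "kitten": [2.0, 0.0, 0.0],
    "car": [0.0, 3.0, 0.0],
}

sim_close = cosine_similarity(embeddings["cat"], embeddings["kitten"])
sim_far = cosine_similarity(embeddings["cat"], embeddings["car"])
```

In this toy setting, vectors pointing in the same direction score higher than orthogonal ones, which is the property downstream NLP methods rely on when they treat nearby embeddings as related words.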
Data Mining for complex data: documents, graphs, social networks
Information collected in digital form is growing not only in quantity but also in complexity: in many domains, we try to capture different aspects of real-world objects. Some aspects may be represented in tabular form, others as unstructured text, as graphs modeling relationships between objects, or as images of varying complexity (schemas or photographs, for instance), and the information may be spread across different media. This requires revisiting the usual mining methods so that they can analyze large collections of complex data. We address this issue by finding relevant representations and similarity measures, and by designing methods well suited to efficiently solving mining tasks such as prediction, clustering, classification, or pattern extraction.
Data Mining for Image and Video Analysis
The extraction of relevant visual patterns in images and videos is essential to characterize the structure of the object considered. These patterns can be used to define new representations, to propose data summarizations or characterizations, to detect particular elements in images or videos, or even to perform classification or tracking tasks. We are particularly interested in the extraction of subsequences of graph-based patterns for object tracking in videos.
Social and Personalized Information Retrieval
The quantity of information available on the Web (the concept of Big Data) and the presence of dynamic web data require fine-tuned information retrieval systems that can propose relevant documents in response to a specific user query. However, it is generally difficult for classic systems to index the whole Web efficiently, and their answers are not specific enough to user needs. We propose to investigate Social and Personalized Information Retrieval to tackle these drawbacks.