Similarity of Locally Structured Data in Computer Vision

SoLSTiCe is a four-year project, launched in February 2014 and funded by the French National Research Agency (ANR).

SoLSTiCe is a fundamental research project that aims to design new models and tools for representing and managing images and videos, for example to retrieve images or videos similar to a query image or video, recognize objects in images, track objects in videos, or detect typical activities in videos. One way to tackle these applications is to describe images by global features such as histograms or eigen-features. A major current trend is to use bag-of-visual-words (BoVW) models, whose basic idea is to extract local features from small image regions so that images are mapped into a vector space of visual words. However, BoVW models, like many other global models proposed in the literature, do not integrate structural information such as the spatial or temporal relationships holding between local features, which hinders their applicability to realistic problems requiring high discriminative power. The lack of structural information can be an advantage, since it makes the models easier to render invariant to a large class of transformations. The drawback is that such models cannot capture the geometrical and temporal relationships between parts of objects and actions, which complex applications require.
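To make the BoVW idea concrete, here is a minimal sketch of how local descriptors might be mapped to a histogram of visual words. It is an illustration under stated assumptions, not the project's implementation: the function name `bovw_histogram`, the toy two-dimensional descriptors and the three-word codebook are all hypothetical, and nearest-neighbour assignment with Euclidean distance is only one common choice.

```python
import numpy as np

def bovw_histogram(descriptors, codebook):
    """Map local descriptors to a normalized histogram of visual words."""
    # Squared Euclidean distance from every descriptor to every codebook entry.
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    # Each descriptor votes for its nearest visual word.
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

# Hypothetical codebook of three visual words in a 2-D descriptor space.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
# Four toy descriptors, standing in for features from four small image regions.
descriptors = np.array([[0.1, 0.0], [0.9, 0.1], [1.0, -0.1], [0.0, 0.8]])

hist = bovw_histogram(descriptors, codebook)
# Reordering the descriptors, i.e. discarding where they occurred, changes nothing.
shuffled_hist = bovw_histogram(descriptors[::-1], codebook)
```

The two histograms are identical, which illustrates both sides of the trade-off discussed above: the representation is invariant to any rearrangement of the local features, and for the same reason it carries no information about their spatial layout.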

In this project, we aim to explore locally structured data (LSD), which combine visual features (such as interest points, segmented regions or visual words) with discrete structures (such as strings, trees, combinatorial maps or, more generally, graphs) in order to model the local (spatio-temporal) relationships holding between these features. Using LSD for classification, recognition or indexing tasks leads us to study three main issues:

  • Extracting LSD from images and videos: We extract relevant visual features and structure them w.r.t. spatial and temporal relationships.
  • Measuring the similarity of LSD: We design relevant similarity measures for comparing LSD, and efficient algorithms for computing these measures.
  • Mining LSD: We characterize LSD by means of frequently (or infrequently) occurring patterns (itemsets, sequences or graphs) and use them to create discriminative features for solving computer vision tasks.
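
As a rough illustration of what comparing LSD can look like, the sketch below represents each image as a small graph whose nodes carry visual-word labels and whose edges link spatially adjacent features, then compares two graphs by the Jaccard index of their labeled edges. This is a deliberately simple stand-in chosen for brevity, not one of the similarity measures designed in the project; the names `edge_label_set` and `jaccard_similarity` and the toy labels are hypothetical.

```python
def edge_label_set(labels, edges):
    """Return the set of unordered label pairs joined by an edge."""
    return {tuple(sorted((labels[u], labels[v]))) for u, v in edges}

def jaccard_similarity(graph_a, graph_b):
    """Jaccard index of the labeled-edge sets of two graphs."""
    pairs_a = edge_label_set(*graph_a)
    pairs_b = edge_label_set(*graph_b)
    union = pairs_a | pairs_b
    if not union:
        # Two graphs without edges are treated as identical.
        return 1.0
    return len(pairs_a & pairs_b) / len(union)

# Each graph: (node id -> visual-word label, edges between adjacent nodes).
# Labels are toy values chosen for readability.
image_a = ({0: "wheel", 1: "frame", 2: "wheel"}, [(0, 1), (1, 2)])
image_b = ({0: "wheel", 1: "frame", 2: "seat"}, [(0, 1), (1, 2)])

similarity = jaccard_similarity(image_a, image_b)
```

Unlike a bag-of-words histogram, this comparison depends on which features are adjacent to which. It also ignores larger substructures, which is where richer tools such as graph matching or pattern mining come in.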

Laboratoire Hubert Curien (LaHC) CNRS UMR 5516. Contact: Elisa Fromont (Elisa dot Fromont at univ-st-etienne dot fr).
Laboratoire d’InfoRmatique en Image et Systèmes d’information (LIRIS) CNRS UMR 5205. Contact: Christine Solnon (christine dot solnon at insa-lyon dot fr).

Website of the project.