Strings And Trees for Thumbnail Images Classification

"Strings And Trees for Thumbnail Images Classification" (SATTIC), ANR Blanc 07-1__184534, is a three-year project launched in January 2008. SATTIC is funded by the French National Research Agency (ANR) and is partially supported by the IST Programme of the European Community, under the PASCAL 2 Network of Excellence, IST-2006-216886.

To manage some of the huge data sets that are now available, and more particularly to classify, recognize or search through them, one needs a representation system that is rich enough to describe the data while allowing an efficient and mathematically well-understood exploitation. Such a representation is both well defined and easily computed when the data are numerical values or, more generally, vectors of numerical values. However, many objects are poorly modelled by such vectors, which cannot express notions such as sequentiality or relationships between attributes.

In particular, this project aims at representing and exploiting thumbnail images such as those returned by search engines like Google. Although much work has been done on high-definition images, none addresses the question of filtering these small images, whose definition is too low to allow a segmentation into regions and/or the exploitation of wide-support local measures. An appealing alternative lies in modelling images by extracting and symbolically structuring salient points. Salient points, which correspond to the high-contrast points of the image, may be easily detected in thumbnail images; we propose to structure them by means of strings, trees, or more generally graphs, in order to integrate information on saliency degree or spatial relationships.

We propose in this project to study the capabilities of such salient point structuring to model and exploit thumbnail images. This goal implies the definition of a new paradigm for analysing and statistically characterizing symbolic structured data, in contrast to the classical approaches used for numerical data.
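
As an illustration of the idea of structuring salient points as a string, here is a minimal Python sketch. It is not the project's method: the contrast measure (largest absolute difference with the four neighbours), the grid-cell alphabet, the ordering by decreasing saliency, and the use of edit distance to compare two strings are all assumptions chosen for simplicity, and every function name is hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows, values in 0..255


def contrast(img: Image, r: int, c: int) -> int:
    """Assumed saliency measure: largest absolute difference with a 4-neighbour."""
    h, w = len(img), len(img[0])
    best = 0
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < h and 0 <= cc < w:
            best = max(best, abs(img[r][c] - img[rr][cc]))
    return best


def salient_points(img: Image, k: int) -> List[Tuple[int, int]]:
    """Return the k highest-contrast pixels, most salient first."""
    scored = [(contrast(img, r, c), r, c)
              for r in range(len(img)) for c in range(len(img[0]))]
    # Sort by decreasing contrast; ties are broken by row, then column.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return [(r, c) for _, r, c in scored[:k]]


def to_string(img: Image, points: List[Tuple[int, int]], grid: int = 2) -> str:
    """Encode each point by the letter of the grid cell it falls in."""
    h, w = len(img), len(img[0])
    symbols = []
    for r, c in points:
        cell = (r * grid // h) * grid + (c * grid // w)
        symbols.append(chr(ord("a") + cell))
    return "".join(symbols)


def edit_distance(s: str, t: str) -> int:
    """Levenshtein distance with unit costs (assumed comparison measure)."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        cur = [i]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (a != b)))
        prev = cur
    return prev[-1]
```

In this sketch the position in the string carries the saliency rank and the symbol carries a coarse spatial location, so two thumbnails can be compared through the distance between their strings. Trees or graphs would be needed to express richer spatial relationships between points.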

Laboratoire Hubert Curien (LaHC) CNRS UMR 5516, including ex-EURISE, Saint-Etienne. Contact: Jean-Christophe Janodet (Jean dot Christophe dot Janodet at univ-st-etienne dot fr)
Laboratoire d’InfoRmatique des Images et des Systèmes d’information (LIRIS), CNRS UMR 5205, Lyon. Contact: Christine Solnon (Christine dot Solnon at liris dot cnrs dot fr)