PhD student position
Location: Laboratoire d’Informatique de Grenoble
http://lig.imag.fr/
Group: Multimedia Information Modeling and Retrieval
http://mrim.imag.fr/
Funding: 36-month fixed-term contract, about 1380 euros net per month.
Application deadline: 30 September 2009.
Starting date: 1st November or 1st December 2009.
Supervisors: Georges Quénot (Researcher at CNRS)
Philippe Mulhem (Researcher at CNRS)
Contact: Georges.Quenot@imag.fr, Philippe.Mulhem@imag.fr.
Title: Using context for semantic indexing of image and video documents
Automated indexing of image and video documents is a difficult problem
because of the “distance” between the arrays of numbers encoding these
documents and the concepts (e.g. people, places, events or objects)
with which we wish to annotate them. Methods exist for this but their
results are far from satisfactory in terms of generality and accuracy.
They generally operate by supervised or semi-supervised learning: the
system learns to recognize concepts from positive and negative examples;
it “generalizes” from these examples. Existing methods typically use a
single set of such examples and consider it as uniform. This is not
optimal because the same concept may appear in various contexts and its
appearance may be very different depending upon these contexts. The
context may be: the type of broadcast (television news, fiction,
entertainment, advertising, etc.), the date, place, country or culture
of broadcasting or production, or the modalities present or absent (for
documents in black and white and/or without sound, for instance).
The context may generally be regarded as another concept or as a set
of other concepts. The concepts and relations between them can be
represented in ontologies. A relationship within an ontology can be
interpreted as indicating whether the related elements are likely to
appear together in an image or a video shot, and this information can
be used for their automatic annotation.
The proposed subject concerns the use of the context to improve the
performance of classifiers. The main idea is to consider, for each
concept to be recognized, a number of contexts in which it may appear
and to train a classifier for each of these contexts. During the
recognition, the appropriate classifier is used according to the
identified context. Alternatively, a weighted combination (fusion)
of classification results can be used if we only have probabilities
of being in a given context. Such an approach presents several
difficulties. The first one is the identification of context during
the recognition: in some cases, it may be known explicitly (from
metadata, for example) but, in general, it is actually another
concept, which also has to be recognized. The second difficulty is
the need for a very large total volume of training data so that,
for each context, there are enough examples to properly train a
classifier. There is also the complexity of simultaneously managing
the tuning of multiple classifiers for each concept. The
third difficulty concerns the problem of merging the outputs of
different classifiers in the frequent case in which there are
uncertainties about the context actually present during the
recognition. The implementation may be based on the use of
networks of operators (feature extractors, classifiers and merge
modules), of ontologies to manage relationships between concepts
and of active learning for automatic training data collection.
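
The two recognition strategies described above (selecting the classifier
of a known context, or fusing per-context results weighted by context
probabilities) can be illustrated with a minimal Python sketch. It is
not the method that the thesis will develop. All names in it
(fuse_context_scores, select_by_context, toy_classifiers, the "news"
and "fiction" contexts and their constant scores) are hypothetical and
chosen only for illustration. It assumes that each per-context
classifier returns a score in [0, 1] and that the context probabilities
are already available.

```python
from typing import Callable, Dict, Sequence

Features = Sequence[float]
Scorer = Callable[[Features], float]


def select_by_context(
    features: Features, context: str, classifiers: Dict[str, Scorer]
) -> float:
    """Known context (e.g. from metadata): use that context's classifier."""
    return classifiers[context](features)


def fuse_context_scores(
    features: Features,
    context_probs: Dict[str, float],
    classifiers: Dict[str, Scorer],
) -> float:
    """Uncertain context: weighted sum of the per-context scores.

    Computes sum_c P(c | x) * score_c(x) over the contexts that have a
    trained classifier, renormalizing the weights over those contexts.
    """
    usable = {c: w for c, w in context_probs.items() if c in classifiers}
    total_weight = sum(usable.values())
    if total_weight <= 0.0:
        raise ValueError("no usable context has a positive probability")
    weighted = sum(w * classifiers[c](features) for c, w in usable.items())
    return weighted / total_weight


# Hypothetical stand-ins for trained per-context classifiers of one concept.
toy_classifiers: Dict[str, Scorer] = {
    "news": lambda x: 0.9,     # the concept is easy to spot in news shots
    "fiction": lambda x: 0.3,  # and harder to spot in fiction
}

# Context known explicitly.
print(select_by_context([0.0], "news", toy_classifiers))

# Context only known as probabilities: 0.75 * 0.9 + 0.25 * 0.3.
print(fuse_context_scores([0.0], {"news": 0.75, "fiction": 0.25}, toy_classifiers))
```

Renormalizing over the contexts that actually have a classifier is one
possible design choice among several. It keeps the fused score in the
same range as the individual scores when some contexts lack training
data, which relates to the second difficulty mentioned above.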
The developed methods will be evaluated in the context of national
and international campaigns like TRECVID
(http://www-nlpir.nist.gov/projects/trecvid/). The work will be
done in the context of the Quaero program (http://www.quaero.org).
This will, among other things, give access to a large volume of
annotated image and video data.
http://mrim.imag.fr/georges.quenot/postes/These-Quaero-2-FR.pdf
http://mrim.imag.fr/georges.quenot/postes/These-Quaero-2-EN.pdf
September 11, 2009
PhD position at LIG research lab, Grenoble, France