-
Create semantic representations (ontologies) and semantic structures that generalize across languages,
media, and modalities.
-
Investigate various input modalities and information sources (speech, natural language text, query text,
eye-tracking, user preferences, past user behavior) and their synergies in multimedia information retrieval tasks.
-
Investigate various output modalities (text, graphics, video, audio) to visualize multimedia query results
and help users navigate a multimedia database.
-
Build natural and efficient multimodal, multi-turn human-computer interfaces (combining the three goals above) that advance the state of the art
and demonstrate the main concepts of our research.