BabyRobot: Child-Robot Communication (2016-2018)

I am the technical coordinator of the EU-IST H2020 BabyRobot project. In BabyRobot we model human-robot communication as a three-step process: sharing attention, establishing common ground and forming shared goals. Our main goal is to create robots that analyze and track human behavior over time, in the context of their surroundings, using audio-visual monitoring, in order to establish common ground and intention-reading capabilities. We focus on typically developing children and children on the autism spectrum, defining, implementing and evaluating child-robot interaction scenarios for developing specific socio-affective, communication and collaboration skills. Breakthroughs in core robotic technologies are needed to support this research, mainly in the areas of motion planning and control in constrained spaces, gestural kinematics, and sensorimotor learning and adaptation. BabyRobot's ambition is to create robots that can establish communication protocols and form collaboration plans on the fly; such robots will have impact beyond the consumer and healthcare application markets addressed here.
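As a rough illustration of the three-step model, the sketch below renders it as a simple phase machine driven by audio-visual evidence. The percept names and thresholds are hypothetical placeholders, not the project's actual architecture.

    from enum import Enum, auto

    class Phase(Enum):
        """The three communication phases named above."""
        SHARING_ATTENTION = auto()
        COMMON_GROUND = auto()
        SHARED_GOALS = auto()

    def advance(phase: Phase, percepts: dict) -> Phase:
        """Advance the interaction phase from audio-visual evidence.

        `percepts` is a hypothetical dict of monitoring outputs, e.g.
        {"mutual_gaze": 0.9, "referent_agreed": True}; the keys and
        thresholds are illustrative, not from the project.
        """
        if phase is Phase.SHARING_ATTENTION and percepts.get("mutual_gaze", 0.0) > 0.8:
            return Phase.COMMON_GROUND  # joint attention established
        if phase is Phase.COMMON_GROUND and percepts.get("referent_agreed", False):
            return Phase.SHARED_GOALS   # agreement on referents reached
        return phase                    # otherwise stay in the current phase

    phase = advance(Phase.SHARING_ATTENTION, {"mutual_gaze": 0.92})
    print(phase)  # Phase.COMMON_GROUND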

SpeDial: Spoken Dialogue Analytics (2013-2015)

I am the coordinator of the EU-IST FP7 SpeDial project. In SpeDial we propose a process for spoken dialogue service development, enhancement and customization of deployed services, where data logs are analyzed and used to enhance the service in a semi-automated fashion. A set of mature technologies will be used to: 1) identify hot-spots in the dialogue and propose alternative call-flow structures, 2) select among candidate prompts to reach target KPIs, 3) update grammars using transcribed service data, and 4) customize the application for specific user populations. Specifically, the technologies used will be: affective modeling of spoken dialogue, call-flow/discourse analysis, machine translation, crowd-sourcing, grammar induction and user modeling. These technologies will be integrated in a service-doctoring platform that enhances deployed services. Our business model is quick deployment of a prototype service, followed by service enhancement using our platform. The reduced development time and time-to-market will provide significant differentiation for SMEs in the speech services area, as well as for end-users. The business opportunity is significant, especially given the consolidation of the speech services industry and the lack of major competition.
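As an illustration of step 1, a hot-spot detector can be as simple as ranking dialogue states by their failure rate in the service logs. The sketch below assumes a hypothetical (state, outcome) log format and illustrative thresholds; it is not the actual SpeDial platform logic.

    from collections import defaultdict

    def find_hotspots(turns, min_visits=20, max_fail_rate=0.25):
        """Flag dialogue states with high failure rates as hot-spots.

        `turns` is a list of (state, outcome) pairs, where outcome is
        "ok" or a failure such as "no-match", "no-input" or "reprompt".
        """
        visits, failures = defaultdict(int), defaultdict(int)
        for state, outcome in turns:
            visits[state] += 1
            if outcome != "ok":
                failures[state] += 1
        flagged = [s for s in visits
                   if visits[s] >= min_visits
                   and failures[s] / visits[s] > max_fail_rate]
        # worst offenders first, as candidates for call-flow redesign
        return sorted(flagged, key=lambda s: failures[s] / visits[s], reverse=True)

    logs = [("ask_date", "no-match")] * 8 + [("ask_date", "ok")] * 12 + [("greet", "ok")] * 30
    print(find_hotspots(logs))  # ['ask_date']  (40% failure rate)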

BabyAffect: Affective and behavioral modeling of early lexicalizations of ASD and TD children (2014-2015)

I am the coordinator of the Greek SRT Aristeia II BabyRobot project (research excellence grant). The main scientific proposition behind BabyAffect is that the extra-lexical and extra-linguistic stream in child-caregiver communication, e.g., affect and communicative intent, is an important source of (often complementary) information that significantly enhances the lexical acquisition process in early childhood, both in terms of quality (e.g., semantic categorization ability) and quantity (rate of learning, vocabulary spurt). We intend to demonstrate this both experimentally, using statistical information extracted from audio-visual recordings of infants and their caregivers, and formally, using cognitive models of the lexical acquisition process based on parallel distributed models and semantic networks. BabyAffect's main goals are: 1) to develop a computational model for early vocabulary development using multimodal data conveying emotions and communicative functions from typical and atypical populations; 2) to collect and make available to different disciplines (AI, psycholinguistics, developmental psychology, human language technology) a large amount of multimodal data from Greek-speaking children at the one-word stage in natural environments; and 3) to investigate the ability of typical and atypical children to express emotions and communicative functions through distinct acoustic patterns, in order to develop an automatic screening tool for detecting children with autism and language delay.
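Goal 3 is at its core a classification problem: predict the communicative function of an utterance from its acoustic patterns. Below is a minimal scikit-learn sketch with random placeholder features and labels, standing in for the prosodic measurements and annotations of the actual recordings.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    # Placeholder per-utterance features (e.g., pitch mean/range, energy,
    # duration) and communicative-function labels; both are random here.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))
    y = rng.integers(0, 3, size=200)  # three hypothetical functions

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    scores = cross_val_score(clf, X, y, cv=5)  # chance level is ~0.33 here
    print(f"cross-validated accuracy: {scores.mean():.2f}")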

PortDial: Language Resources for Portable Multilingual Dialogue Systems (2012-2014)

I am the coordinator of the EU-IST FP7 PortDial project. PortDial applies grammar induction and semantic web technologies to the creation of domain-specific multilingual spoken dialogue system (SDS) resources, specifically data-linked ontologies and grammars. The main goal of PortDial is to design machine-aided methods for creating, cleaning up and publishing multilingual domain ontologies and grammars for spoken dialogue system prototyping in various application domains. The project aims to deliver a commercial platform for quick prototyping of interactive spoken dialogue applications in new domains and languages, together with multilingual collections of resources for specific application domains and a multilingual linked-data ontological corpus that can be freely used for SDS research and prototyping for non-commercial purposes. Through these contributions, partners expect to save up to 50% of development time, significantly improve grammar coverage, and lower the barrier to entry for speech services prototyping by introducing data-populated ontologies and grammar induction. The application domains of PortDial include entertainment, travel, finance and customer service.
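To illustrate the idea of data-populated grammars, the sketch below expands invented rule templates with instances drawn from a toy ontology; PortDial's actual resources are multilingual and far larger.

    # Toy travel-domain ontology and grammar templates, invented for illustration.
    ontology = {
        "City": ["Athens", "Paris", "Madrid"],
        "Airline": ["Aegean", "Lufthansa"],
    }
    templates = [
        "fly from <City> to <City>",
        "book a flight with <Airline>",
    ]

    def expand(template, ontology):
        """Replace the first concept slot and recurse until none remain."""
        for concept, instances in ontology.items():
            slot = f"<{concept}>"
            if slot in template:
                return [phrase
                        for value in instances
                        for phrase in expand(template.replace(slot, value, 1), ontology)]
        return [template]

    for t in templates:
        print(expand(t, ontology))  # 9 and 2 surface phrases, respectively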

CogniMuse: Multimodal Signal and Event Processing in Perception and Cognition (2012-2015)

I am a collaborator on the Greek SRT Aristeia research excellence grant CogniMuse. Motivated by the grand challenge to endow computers with human-like abilities for multimodal sensory information processing, perception and cognitive attention, CogniMuse undertakes fundamental research in modeling multisensory and sensory-semantic integration via a synergy between system theory, computational algorithms and human cognition. It focuses on integrating three modalities (audio, vision and text) toward detecting salient perceptual events and combining them with semantics to build higher-level stable events through controlled attention mechanisms. My main contribution to CogniMuse is on the text modality, as well as on the fusion of low-level (sensory) and high-level (semantic) information.
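To give a flavor of the fusion problem, here is a minimal sketch of late linear fusion of per-frame saliency curves from the three modalities; the weights and signals are illustrative, and the actual CogniMuse models are considerably more elaborate.

    import numpy as np

    def fuse_saliency(audio, visual, text, weights=(0.4, 0.4, 0.2)):
        """Late linear fusion of per-frame saliency curves (toy weights)."""
        curves = np.vstack([audio, visual, text])
        lo = curves.min(axis=1, keepdims=True)
        hi = curves.max(axis=1, keepdims=True)
        curves = (curves - lo) / np.maximum(hi - lo, 1e-9)  # per-modality [0, 1]
        return np.asarray(weights) @ curves

    def top_segments(saliency, k=3):
        """Indices of the k most salient frames, e.g. summary candidates."""
        return np.argsort(saliency)[-k:][::-1]

    t = np.linspace(0, 1, 100)
    fused = fuse_saliency(np.sin(6 * t), np.cos(4 * t), t)  # toy signals
    print(top_segments(fused))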

USC/ISI Collaborative Projects

I am involved in a variety of joint research efforts with my long-time collaborator Prof. Shri Narayanan at the SAIL lab at USC. These include: speech analysis and recognition of children's speech, analysis of narratives of autistic children, application of network-based DSMs and semantic-affective models to computational politics and the legal domain, as well as automatic analysis of movie content.

Past Projects: 2000-2010

  • I was the project coordinator for the Greek SRT PENED project on aerodynamic modeling of the vocal tract during speech production, running from 2006 to 2009. This was a three-way collaboration between our team at Tech. Univ. of Crete and two groups at NTUA (fluid dynamics, speech processing). For some relevant results see my joint work with Dr. Pirros Tsiakoulis in my AM-FM modulation related publications.
  • I led the effort for the TSI-TUC team for the EU-IST FP6 Network of Excellence MUSCLE on multimedia understanding from 2004 to 2008. Notable outputs from this project include our collaborative work with NTUA on saliency-based movie summarization; see also the relevant publications here.
  • I led the effort for the TSI-TUC team for the EU-IST FP6 STREP project HIWIRE on robust speech recognition from 2004 to 2007. One of the main outputs of the project was the HIWIRE front-end, which reduced word error rate by 20% relative compared to the ETSI advanced standard front-end (see the sketch after this list for what a relative reduction means). For more details on the HIWIRE project and the HIWIRE database see my robust ASR publications.
  • I was the principal investigator of the spoken dialogue team at Bell Labs, Lucent Technologies for the DARPA Communicator project on spoken dialogue systems (2000-2001). I worked together with Eric Fosler-Lussier, Egbert Ammicht, Jeff Kuo and many others towards modular and generalizable information-seeking dialogue systems. For our papers on semantic and pragmatic processing see the multimodal dialogue systems section of my publications.
  • I have also participated in various research efforts as a consultant, including the EU-FP6 FET project ASPI on speech inversion, the EU-FP7 STREP project DictaSign on sign language recognition, and the Greek SRT PENED project on multimodal interfaces, among others.
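A note on the 20% relative improvement quoted for HIWIRE above: a relative word-error-rate reduction is measured against the baseline's own error rate, not in absolute percentage points. A quick sketch with made-up numbers:

    def relative_wer_reduction(baseline_wer, new_wer):
        """Relative reduction: improvement as a fraction of the baseline WER."""
        return (baseline_wer - new_wer) / baseline_wer

    # Made-up numbers: if the baseline were at 10.0% WER, a 20% relative
    # reduction corresponds to 8.0% WER, i.e. only 2 absolute points.
    print(relative_wer_reduction(10.0, 8.0))  # 0.2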