Figure Credits
Figure 1.1 From: S. L. Oviatt, R. Lunsford, and R. Coulston. 2005. Individual differences in multimodal integration patterns: What are they and why do they exist? In Proc. of the Conference on Human Factors in Computing Systems (CHI ’05), CHI Letters, pp. 241–249. Copyright © 2005 ACM. Used with permission.
Figure 1.2 From: S. Oviatt and P. Cohen. 2015. The Paradigm Shift to Multimodality in Contemporary Computer Interfaces. Morgan & Claypool Synthesis Series. San Rafael, CA. Copyright © 2015 Morgan & Claypool Publishers. Used with permission.
Figure 1.3 From: M. Ernst and H. Bulthoff. 2004. Merging the senses into a robust percept. Trends in Cognitive Sciences, 8(4):162–169. Copyright © 2004 Elsevier Ltd. Used with permission.
Figure 1.4 (left) From: S. Oviatt, A. Cohen, A. Miller, K. Hodge, and A. Mann. 2012b. The impact of interface affordances on human ideation, problem solving and inferential reasoning. In ACM Transactions on Computer-Human Interaction. Copyright © 2012 ACM. Used with permission.
Figure 2.1 From: B. E. Stein, T. R. Stanford, and B. A. Rowland. 2014. Development of multisensory integration from the perspective of the individual neuron. Nature Reviews Neuroscience, 15(8):520–535. Copyright © 2014 Macmillan Publishers Ltd. Used with permission.
Figure 2.2 From: K. H. James, G. K. Humphrey, and M. A. Goodale. 2001. Manipulating and recognizing virtual objects: where the action Is. Canadian Journal of Experimental Psychology, 55(2):111–120. Copyright © 2001 Canadian Psychological Association. Used with permission.
Figure 2.3 From: The Richard D. Walk papers, courtesy Drs. Nicholas and Dorothy Cummings Center for the History of Psychology, The University of Akron.
Figure 2.4 From: A. F. Pereira, K. H. James, S. S. Jones, and L. B. Smith. 2010. Early biases and developmental changes in self-generated object views. Journal of Vision, 10(11):22:1–13. Copyright © 2010 Association for Research in Vision and Ophthalmology. Used with permission.
Figure 2.6 From: A. J. Butler and K. H. James. 2013. Active learning of novel sound-producing objects: motor reactivation and enhancement of visuo-motor connectivity. Journal of Cognitive Neuroscience, 25(2):203–218. Copyright © 2013 Massachusetts Institute of Technology.
Figure 2.7 From: A. J. Butler and K. H. James. 2013. Active learning of novel sound-producing objects: motor reactivation and enhancement of visuo-motor connectivity. Journal of Cognitive Neuroscience, 25(2):203–218. Copyright © 2013 Massachusetts Institute of Technology.
Figure 2.8 From: K. H. James and I. Gauthier. 2006. Letter processing automatically recruits a sensory-motor brain network. Neuropsychologia, 44(14):2937–2949. Copyright © 2006 Elsevier Ltd. Used with permission.
Figure 2.9 From: K. H. James and T. Atwood. 2009. The role of sensorimotor learning in the perception of letter-like forms: Tracking the causes of neural specialization for letters. Cognitive Neuropsychology, 26(1):91–110. Copyright © 2009 Taylor & Francis. Used with permission.
Figure 2.10 From: K. H. James and S. N. Swain. 2011. Only self-generated actions create sensori-motor systems in the developing brain. Developmental Science, 14(4):673–687. Copyright © 2011 John Wiley & Sons Inc. Used with permission.
Figure 2.11 From: K. H. James and L. Engelhardt. 2012. The effects of handwriting experience on functional brain development in pre-literate children. Trends in Neuroscience and Education, 1:32–42. Copyright © 2012 Elsevier Ltd. Used with permission.
Figure 2.13 From: A. F. Pereira, K. H. James, S. S. Jones, and L. B. Smith. 2010. Early biases and developmental changes in self-generated object views. Journal of Vision, 10(11):22:1–13. Copyright © 2010 Association for Research in Vision and Ophthalmology. Used with permission.
Figure 2.15 From: E. G. Boring. 1964. Size-constancy in a picture. The American Journal of Psychology, 77(3):494–498. Copyright © 1964 University of Illinois Press. Used with permission.
Figure 2.16 From: K. H. James and I. Gauthier. 2009. When writing impairs reading: Letter perception’s susceptibility to motor interference. Journal of Experimental Psychology: General, 138(3):416. Copyright © 2009 American Psychological Association. Used with permission.
Figure 3.2 (bottom left) From: O. S. Schneider and K. E. MacLean. 2016. Studying Design Process and Example Use with Macaron, a Web-based Vibrotactile Effect Editor. In Proceedings of the Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems—HAPTICS ’16. Copyright © 2016 IEEE. Used with permission.
Figure 3.2 (bottom right) Courtesy of Anton Håkanson.
Figure 3.4 Based on: B. Buxton. 2007. Sketching User Experiences: Getting the Design Right and the Right Design. Morgan Kaufmann Publishers Inc.
Figure 4.1 From: K. Hinckley, J. Pierce, M. Sinclair, and E. Horvitz. 2000. Sensing techniques for mobile interaction. In Proceedings of the 13th Annual ACM Symposium on User Interface Software and Technology, pp. 91–100. San Diego, CA. Copyright © 2000 ACM. Used with permission.
Figure 4.2 Based on: W. Buxton. 1995. Integrating the periphery and context: A new taxonomy of telematics. In Proceedings of Graphics Interface ’95. Quebec City, Quebec, Canada.
Figure 4.3 Based on: W. Buxton. 1995. Integrating the periphery and context: A new taxonomy of telematics. In Proceedings of Graphics Interface ’95. Quebec City, Quebec, Canada.
Figure 4.4 Image courtesy of iStock.com/monsitj. Webpage © Bill Buxton http://www.billbuxton.com/multitouchOverview.html.
Figure 4.5 (video) From: M. Wu and R. Balakrishnan. 2003. Multi-finger and whole hand gestural interaction techniques for multi-user tabletop displays. In Proceedings of the 16th Annual ACM Symposium on User Interface Software and Technology (UIST ’03), pp. 193–202, Vancouver, BC, Canada. Copyright © 2003 ACM. Used with permission.
Figure 4.6 (video) From: C. Harrison, R. Xiao, J. Schwarz, and S. E. Hudson. 2014. TouchTools: leveraging familiarity and skill with physical tools to augment touch interaction. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’14). Copyright © 2014 ACM. Used with permission.
Figure 4.7 From: M. Annett, A. Gupta, and W. F. Bischof. 2014. Exploring and Understanding Unintended Touch during Direct Pen Interaction. ACM Transactions on Computer-Human Interaction 21(5): Article 28 (39pp). Copyright © 2014 ACM. Used with permission.
Figure 4.8 Image courtesy of Julia Schwarz, http://juliaschwarz.net.
Figure 4.8 (video) From: J. Schwarz, R. Xiao, J. Mankoff, S. E. Hudson and C. Harrison. 2014. Probabilistic palm rejection using spatiotemporal touch features and iterative classification. In CHI’14. Toronto, Canada. Copyright © 2014 ACM. Used with permission.
Figure 4.9 Video courtesy of P. Brandl, J. Leitner, T. Seifried, M. Haller, B. Doray, and P. To. Used with permission.
Figure 4.10 (video) From: I. Siio and H. Tsujita. 2006. Mobile interaction using paperweight metaphor. In Proceedings of the 19th Annual ACM Symposium on User Interface Software and Technology (UIST ’06). Copyright © 2006 ACM. Used with permission.
Figure 4.11 (video) Courtesy of M. Annett, A. Gupta, and W. F. Bischof. Used with permission.
Figure 4.12 Based on: K. Hinckley and H. Song. 2011. Sensor synaesthesia: Touch in motion, and motion in touch. CHI ’11. Vancouver, BC, Canada. ACM, New York.
Figure 4.13 (video) From: K. Hinckley and H. Song. 2011. Sensor synaesthesia: Touch in motion, and motion in touch. CHI ’11. Vancouver, BC, Canada. Copyright © 2011 ACM. Used with permission.
Figure 4.14 (video) From: M. Goel, J. Wobbrock, and S. Patel. 2012b. GripSense: Using built-in sensors to detect hand posture and pressure on commodity mobile phones. In Proceedings of the 25th Annual ACM Symposium on User Interface Software and Technology (UIST ’12). Copyright © 2012 ACM. Used with permission.
Figure 4.15 (video) Courtesy of A. Roudaut, M. Baglioni, and E. Lecolinet. Used with permission.
Figure 4.16 (video) From: L.-P. Cheng, H.-S. Liang, C.-Y. Wu, and M. Y. Chen. 2013b. iGrasp: grasp-based adaptive keyboard for mobile devices. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’13). Copyright © 2013 ACM. Used with permission.
Figure 4.17 (video) From: M. F. M. Noor, A. Ramsay, S. Hughes, S. Rogers, J. Williamson, and R. Murray-Smith. 2014. 28 frames later: predicting screen touches from back-of-device grip changes. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’14). Copyright © 2014 ACM. Used with permission.
Figure 4.18 From: K. Hinckley and B. Buxton. 2016. Inking outside the box: How context sensing affords more natural pen (and touch) computing. In T. Hammond, editor. Revolutionizing Education with Digital Ink. Springer International Publishing, Switzerland. Copyright © 2016 Springer. Used with permission.
Figure 4.19 (video) Courtesy of D. Yoon, K. Hinckley, H. Benko, F. Guimbretière, P. Irani, M. Pahud, and M. Gavriliu. Used with permission.
Figure 4.20 Based on: K. Hinckley. 1997. Haptic issues for virtual manipulation. Department of Computer Science, University of Virginia, Charlottesville, VA.
Figure 4.21 (video) Courtesy of Bill Buxton.
Figure 4.22 (video) Courtesy of Bill Buxton.
Figure 4.24 (video) Courtesy of K. Hinckley, M. Pahud, N. Coddington, J. Rodenhouse, A. Wilson, H. Benko, and B. Buxton. Used with permission.
Figure 4.26 Based on: K. Hinckley, M. Pahud, H. Benko, P. Irani, F. Guimbretière, M. Gavriliu, X. Chen, F. Matulic, B. Buxton, and A. Wilson. 2014. Sensing techniques for tablet+stylus interaction. In Proceedings of the 27th Annual ACM Symposium on User Interface Software and Technology (UIST ’14), Honolulu, HI. ACM, New York.
Figure 6.1 Based on: P. Wagner, Z. Malisz, and S. Kopp. 2014. Gesture and Speech in Interaction: An Overview. Speech Communication, 57(Special Issue):209–232.
Figure 6.2 Based on: P. Wagner, Z. Malisz, and S. Kopp. 2014. Gesture and Speech in Interaction: An Overview. Speech Communication, 57(Special Issue):209–232.
Figure 6.3 Based on: S. Kopp, K. Bergmann, and S. Kahl. 2013. A spreading-activation model of the semantic coordination of speech and gesture. In Proceedings of the 35th Annual Meeting of the Cognitive Science Society, pp. 823–828. Cognitive Science Society, Austin, TX.
Figure 6.4 Based on: S. Kopp, K. Bergmann, and S. Kahl. 2013. A spreading-activation model of the semantic coordination of speech and gesture. In Proceedings of the 35th Annual Meeting of the Cognitive Science Society, pp. 823–828. Cognitive Science Society, Austin, TX.
Figure 6.7 From: K. Bergmann, S. Kahl, and S. Kopp. 2013. Modeling the semantic coordination of speech and gesture under cognitive and linguistic constraints. In R. Aylett, B. Krenn, C. Pelachaud, and H. Shimodaira, eds., Proceedings of the 13th International Conference on Intelligent Virtual Agents, pp. 203–216. Copyright © 2013 Springer-Verlag Berlin Heidelberg. Used with permission.
Figure 6.8 From: K. Bergmann, S. Kahl, and S. Kopp. 2013. Modeling the semantic coordination of speech and gesture under cognitive and linguistic constraints. In R. Aylett, B. Krenn, C. Pelachaud, and H. Shimodaira, eds., Proceedings of the 13th International Conference on Intelligent Virtual Agents, pp. 203–216. Copyright © 2013 Springer-Verlag Berlin Heidelberg. Used with permission.
Figure 7.3 (video) From: G. Wilson, G. Davidson, and S. Brewster. 2015. In the Heat of the Moment: Subjective Interpretations of Thermal Feedback During Interaction. In Proceedings of CHI ’15, pp. 2063–2072. Copyright © 2015 ACM. Used with permission.
Figure 7.4 From: David K. McGookin and Stephen A. Brewster. 2006. SoundBar: exploiting multiple views in multimodal graph browsing. In Proceedings of the 4th Nordic conference on Human-computer interaction: changing roles (NordiCHI ’06), Anders Mørch, Konrad Morgan, Tone Bratteteig, Gautam Ghosh, and Dag Svanaes (Eds.), 145–154. Copyright © 2006 ACM. Used with permission.
Figure 7.5 (video) From: Ross McLachlan, Daniel Boland, and Stephen Brewster. 2014. Transient and transitional states: pressure as an auxiliary input modality for bimanual interaction. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’14). Copyright © 2014 ACM. Used with permission.
Figure 7.6 (video) Courtesy of David McGookin, Euan Robertson, and Stephen Brewster. Used with permission.
Figure 7.7 (left) Video courtesy of David McGookin and Stephen Brewster. Used with permission.
Figure 7.7 (right) Video courtesy of David McGookin and Stephen Brewster. Used with permission.
Figure 7.8 (left) From: B. Plimmer, P. Reid, R. Blagojevic, A. Crossan, and S. Brewster. 2011. Signing on the tactile line. ACM Transactions on Computer-Human Interaction, 18(3):1–29. Copyright © 2011 ACM. Used with permission.
Figure 7.8 (right) From: W. Yu and S. Brewster. 2002. Comparing two haptic interfaces for multimodal graph rendering. In Proceedings of HAPTICS ’02, pp. 3–9. Copyright © 2002 IEEE. Used with permission.
Figure 7.9 (video) From: Beryl Plimmer, Peter Reid, Rachel Blagojevic, Andrew Crossan, and Stephen Brewster. 2011. Signing on the tactile line: A multimodal system for teaching handwriting to blind children. ACM Transactions on Computer-Human Interaction 18, 3, Article 17 (August 2011), 29 pages. Copyright © 2011 ACM. Used with permission.
Figure 7.10 (right) From: Euan Freeman, Stephen Brewster, and Vuokko Lantz. 2014. Tactile Feedback for Above-Device Gesture Interfaces: Adding Touch to Touchless Interactions. In Proceedings of the 16th International Conference on Multimodal Interaction (ICMI ’14), 419–426. Copyright © 2014 ACM. Used with permission.
Figure 7.11 (left) Video courtesy of Euan Freeman. Used with permission.
Figure 7.11 (right) Video courtesy of Euan Freeman. Used with permission.
Figure 7.12 (video) From: Ioannis Politis, Stephen A. Brewster, and Frank Pollick. 2014. Evaluating multimodal driver displays under varying situational urgency. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’14). New York, NY, USA, 4067–4076. Copyright © 2014 ACM. Used with permission.
Figure 8.1 Based on: A. H. Maslow. 1954. Motivation and personality. Harper and Row.
Figure 8.2 (left) From: D. McColl, W.-Y. G. Louie, and G. Nejat. 2013. Brian 2.1: A socially assistive robot for the elderly and cognitively impaired. IEEE Robotics & Automation Magazine, 20(1): 74–83. Copyright © 2013 IEEE. Used with permission.
Figure 8.2 (right) From: P. Bovbel and G. Nejat, 2014. Casper: An Assistive Kitchen Robot to Promote Aging in Place. Journal of Medical Devices, 8(3), p.030945. Copyright © 2014 ASME. Used with permission.
Figure 8.2 (video) Courtesy of the Autonomous Systems and Biomechatronics Laboratory (ASBLab) at the University of Toronto.
Figure 8.3 From: M. Nilsson, J. Ingvast, J. Wikander, and H. von Holst. 2012. The soft extra muscle system for improving the grasping capability in neurological rehabilitation. In Proceedings of the 2012 IEEE EMBS Conference on Biomedical Engineering and Sciences (IECBES), pp. 412–417. Copyright © 2012 IEEE. Used with permission.
Figure 8.4 From: T. Visser, M. Vastenburg, and D. Keyson. 2010. Snowglobe: the development of a prototype awareness system for longitudinal field studies. In Proc. 8th ACM Conference on Designing Interactive Systems, pp. 426–429. Copyright © 2010 ACM. Used with permission.
Figure 8.5 (right) Video courtesy of Cosmin Munteanu & Albert Ali Salah. Used with permission.
Figure 8.6 From: C. G. Pires, F. Pinto, V. D. Teixeira, J. Freitas, and M. S. Dias. 2012. Living home center–a personal assistant with multimodal interaction for elderly and mobility impaired e-inclusion. Computational Processing of the Portuguese Language: 10th International Conference, PROPOR 2012, Coimbra, Portugal, April 17–20, 2012, Proceedings. Copyright © 2012 Springer-Verlag Berlin Heidelberg. Used with permission.
Figure 8.7 From: L.-P. Morency, G. Stratou, D. DeVault, A. Hartholt, M. Lhommet, G. M. Lucas, F. Morbini, K. Georgila, S. Scherer, J. Gratch, and S. Marsella. 2015. SimSensei Demonstration: A Perceptive Virtual Human Interviewer for Healthcare Applications. In Proceedings of the AAAI Conference on Artificial Intelligence, pp. 4307–4308. Copyright © 2015 AAAI Press. Used with permission.
Figure 8.7 (video) Courtesy of USC Institute for Creative Technologies. Principal Investigators: Albert (Skip) Rizzo and Louis-Philippe Morency.
Figure 8.8 From: F. Ferreira, N. Almeida, A. F. Rosa, A. Oliveira, J. Casimiro, S. Silva, and A. Teixeira. 2014. Elderly centered design for interaction–the case of the s4s medication assistant. Procedia Computer Science, 27: 398–408. Copyright © 2014 Elsevier. Used with permission.
Figure 8.9 Courtesy of Jocelyn Ford.
Figure 8.11 Courtesy of © Toyota Motor Sales, U.S.A., Inc.
Figure 8.12 Courtesy of © 2016 ANSA.
Figure 8.12 (video) Courtesy of Robot-Era Project, The BioRobotics Institute, Scuola Superiore Sant’Anna, Italy.
Figure 8.13 From: R. Shilkrot, J. Huber, W. Meng Ee, P. Maes, and S. C. Nanayakkara. 2015. FingerReader: a wearable device to explore printed text on the go. ACM Transactions on Computer-Human Interaction, pp. 2363–2372. Copyright © 2015 ACM. Used with permission.
Figure 8.14 Adapted from: B. Görer, A. A. Salah, and H. L. Akin. 2016. An autonomous robotic exercise tutor for elderly people. Autonomous Robots.
Figure 8.14 (video) Courtesy of Binnur Görer.
Figure 8.15 From: B. Görer, A. A. Salah, and H. L. Akin. 2016. An autonomous robotic exercise tutor for elderly people. Autonomous Robots. Copyright © 2016 Springer Science+Business Media New York. Used with permission.
Figure 9.3 From: P. Qvarfordt and S. Zhai. 2009. Gaze-aided human-computer and human-human dialogue. In B. Whitworth and A. de Moor, eds., Handbook of Research on Socio-Technical Design and Social Networking Systems, chapter 35, pp. 529–543. Copyright © 2009 IGI Global. Reprinted by permission of the copyright holder.
Figure 9.4 From: P. Qvarfordt and S. Zhai. 2009. Gaze-aided human-computer and human-human dialogue. In B. Whitworth and A. de Moor, eds., Handbook of Research on Socio-Technical Design and Social Networking Systems, chapter 35, pp. 529–543. Copyright © 2009 IGI Global. Reprinted by permission of the copyright holder.
Figure 9.7 From: P. Qvarfordt and S. Zhai. 2005. Conversing with the user based on eye-gaze patterns. In Proc. of the SIGCHI Conf. on Human Factors in Computing Systems (CHI ’05), pp. 221–230. Copyright © 2005 ACM. Used with permission.
Figure 10.1 From: S. Oviatt, R. Lunsford, and R. Coulston. 2005. Individual differences in multimodal integration patterns: What are they and why do they exist? In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 241–249. Copyright © 2005 ACM. Used with permission.
Figure 10.5 (left) From: P. R. Cohen, M. Johnston, D. R. McGee, S. L. Oviatt, J. Pittman, I. Smith, L. Chen, and J. Clow. 1997. QuickSet: Multimodal interaction for distributed applications. In Proceedings of the Fifth ACM International Conference on Multimedia, pp. 31–40. Copyright © 1997 ACM. Used with permission.
Figure 10.6 Video courtesy of Phil Cohen. Used with permission.
Figure 10.7 From: P. R. Cohen, D. R. McGee, and J. Clow. 2000. The efficiency of multimodal interaction for a map-based task. In Proceedings of the Sixth Conference on Applied Natural Language Processing, Association for Computational Linguistics, pp. 331–338. Copyright © 2000 Association for Computational Linguistics. Used with permission.
Figure 10.7 (video) Video courtesy of Phil Cohen. Used with permission.
Figure 10.8 Video courtesy of Phil Cohen. Used with permission.
Figure 10.9 From: D. R. McGee, P. R. Cohen, M. Wesson, and S. Horman. 2002. Comparing paper and tangible, multimodal tools. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 407–414. Copyright © 2002 ACM. Used with permission.
Figure 10.10 Video courtesy of Phil Cohen. Used with permission.
Figure 10.11 From: P. Ehlen and M. Johnston. 2012. Multimodal interaction patterns in mobile local search. In Proceedings of ACM Conference on Intelligent User Interfaces, pp. 21–24. Copyright © 2012 ACM. Used with permission.
Figure 10.12 Based on: M. Johnston, J. Chen, P. Ehlen, H. Jung, J. Lieske, A. Reddy, E. Selfridge, S. Stoyanchev, B. Vasilieff, and J. Wilpon. 2014. MVA: The Multimodal Virtual Assistant. In Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), Association for Computational Linguistics, pp. 257–259.
Figure 10.13 Based on: W. Wahlster. 2002. SmartKom: Fusion and Fission of Speech, Gestures, and Facial Expressions. In Proc. of the 1st International Workshop on Man-Machine Symbiotic Systems, Kyoto, Japan, pp. 213–225. Used with permission.
Figure 10.14 Courtesy of openstream.com. Used with permission.
Figure 10.15 (right) From: P. R. Cohen, E. C. Kaiser, M. C. Buchanan, S. Lind, M. J. Corrigan, and R. M. Wesson. 2015. Sketch-thru-plan: a multimodal interface for command and control. Communications of the ACM, 58(4):56–65. Copyright © 2015 ACM. Used with permission.
Figure 10.16 From: P. R. Cohen, E. C. Kaiser, M. C. Buchanan, S. Lind, M. J. Corrigan, and R. M. Wesson. 2015. Sketch-thru-plan: a multimodal interface for command and control. Communications of the ACM, 58(4):56–65. Copyright © 2015 ACM. Used with permission.
Figure 11.1 From: R. A. Bolt. 1980. Put-that-there: Voice and gesture at the graphics interface. ACM SIGGRAPH Computer Graphics, 14(3): 262–270. Copyright © 1980 ACM. Used with permission.
Figure 11.1 (video) Courtesy of Chris Schmandt, MIT Media Lab Speech Interface group.
Figure 11.3 From: P. Maragos, V. Pitsikalis, A. Katsamanis, G. Pavlakos, and S. Theodorakis. 2016. On shape recognition and language. In M. Breuss, A. Bruckstein, P. Maragos, and S. Wuhrer, eds., Perspectives in Shape Analysis. Springer. Copyright © 2016 Springer International Publishing Switzerland. Used with permission.
Figure 11.4a (video) Courtesy of Botsquare.
Figure 11.4b (video) Courtesy of Leap Motion.
Figure 11.5 Based on: N. Krahnstoever, S. Kettebekov, M. Yeasin, and R. Sharma. 2002. A real-time framework for natural multimodal interaction with large screen displays. In Proceedings of the International Conference on Multimodal Interfaces, p. 349.
Figure 11.6 Based on: L. Pigou, S. Dieleman, P.-J. Kindermans, and B. Schrauwen. 2015. Sign language recognition using convolutional neural networks. In L. Agapito, M. M. Bronstein, and C. Rother, eds., Computer Vision—ECCV 2014 Workshops, volume LNCS 8925, pp. 572–578.
Figure 11.7 Based on: N. Neverova, C. Wolf, G. W. Taylor, and F. Nebout. 2015. Multi-scale deep learning for gesture detection and localization. In L. Agapito, M. M. Bronstein, and C. Rother, editors, Computer Vision—ECCV 2014 Workshops, volume LNCS 8925, pp. 474–490.
Figure 11.8 Based on: D. Yu and L. Deng. 2011. Deep learning and its applications to signal and information processing [exploratory DSP]. IEEE Signal Processing Magazine, 28(1): 145–154.
Figure 11.11 From: G. Pavlakos, S. Theodorakis, V. Pitsikalis, A. Katsamanis, and P. Maragos. 2014. Kinect-based multimodal gesture recognition using a two-pass fusion scheme. In Proceedings of the International Conference on Image Processing, pp. 1495–1499. Copyright © 2014 IEEE. Used with permission.
Figure 11.11 (video) Courtesy of Stavros Theodorakis. Used with permission.
Figure 12.1 Based on: G. Potamianos, C. Neti, G. Gravier, A. Garg, and A. W. Senior. 2003. Recent advances in the automatic recognition of audio-visual speech. Proceedings of the IEEE, 91(9):1306–1326.
Figure 12.2a From: J. Huang, G. Potamianos, J. Connell, and C. Neti. 2004. Audio-visual speech recognition using an infrared headset. Speech Communication, 44(4): 83–96. Copyright © 2004 Elsevier B.V. Used with permission.
Figure 12.2b (top) Courtesy of iStock.com/kursatunsal.
Figure 12.2b (middle) Courtesy of iStock.com/Stratol.
Figure 12.2b (bottom) Courtesy of FLIR Systems, Inc.
Figure 12.6 Based on: G. Potamianos, C. Neti, G. Gravier, A. Garg, and A. W. Senior. 2003. Recent advances in the automatic recognition of audio-visual speech. Proceedings of the IEEE, 91(9):1306–1326.
Figure 12.7 Based on: E. Marcheret, G. Potamianos, J. Vopicka, and V. Goel. 2015b. Scattering vs. discrete cosine transform features in visual speech processing. In Proceedings of the International Joint Conference on Facial Analysis, Animation, and Auditory-Visual Speech Processing (FAAVSP), pp. 175–180.
Figure 12.9 Based on: S. Thermos and G. Potamianos. 2016. Audio-visual speech activity detection in a two-speaker scenario incorporating depth information from a profile or frontal view. In Proceedings of the IEEE Spoken Language Technology Workshop (SLT), pp. 579–584.
Figure 12.10 Based on: E. Marcheret, G. Potamianos, J. Vopicka, and V. Goel. 2015b. Scattering vs. discrete cosine transform features in visual speech processing. In Proceedings of the International Joint Conference on Facial Analysis, Animation, and Auditory-Visual Speech Processing (FAAVSP), pp. 175–180.