References
2023
Identifying Multimodal Context Awareness Requirements for Supporting User Interaction with Procedural Videos
Georgianna Lin, Jin Yi Li, Afsaneh Fazly, and 2 more authors
In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 2023
Following along with how-to videos requires alternating focus between understanding procedural video instructions and performing them. Examining how to support these continuous context switches for the user has been largely unexplored. In this paper, we describe a user study with thirty participants who performed an hour-long cooking task while interacting with a Wizard-of-Oz hands-free interactive system that is aware of both their cooking progress and environment contexts. Through analysis of the session scripts, we identify a dichotomy between participant query differences and workflow alignment similarities, under-studied interactions that require AI functionality beyond video navigation alone, and queries that call for multimodal sensing of a user’s environment. By understanding the assistant experience through the participants’ interactions, we identify design implications for a smart assistant that can discern a user’s task completion flow and personal characteristics, accommodate requests within and external to the task domain, and support non-voice-based queries.