Madaari

Madaari (noun) /Mada:ri/ Hindi term for a street performer or storyteller who uses puppets or animals in the act.

An Authoring Tool for Digital Storytelling

Roles

UX Research, Engineering

Timeline

~ 4 months

Location

Weave Lab, IIIT-D

Overview

Modern forms of storytelling, from theatre to Disney shows, leverage technology and multimedia such as audio, animations, animatronics, lighting, and more to create an engaging and immersive experience. Translating such experiences into everyday storytelling for parents and teachers is, however, an arduous task requiring multiple technical skills. Madaari is a tool built to enable anyone to create such digital storytelling experiences easily and rapidly.

Goals and Features

The goal was to enable users to incorporate different modalities of digital storytelling into their personal storytelling experience. Taking inspiration from modern shows, theatre, theme parks, and 4D cinema, we listed the following output capabilities for our tool, allowing storytellers to experiment with their creativity:

Next, we considered how users might want to trigger these actions while reading the story. On reviewing some typical storytelling sessions, we realised storytellers often use various forms of expression besides speech, such as facial expressions and gestures. We saw an opportunity to use these elements as cues with which to associate multiple actions.

As an example, a storyteller could say "It was a sunny day", which would change the room's lighting to bright yellow and play audio of chirping birds in the background. Or a sad facial expression from the storyteller could cue a rain animation and turn on the fan to simulate wind.

We decided to add the following sensing capabilities to our tool:

Note: Adding different modalities to our tool in no way implies that the user must use all of these features. These functionalities are simply options for a user to choose from based on their requirements.

Design Considerations

After establishing the technological goals for our tool, we moved on to the design process and implementation. We started with a user persona having no prior programming experience and the motivation to narrate a story supported by the above-mentioned multimedia modalities. We settled on a workflow with three clear stages for the user.

The user journey would start by selecting the central aspect of the tool — the story. Next, the user would program the different actions relevant to the selected story. Finally, one would simply read the story as intended while observing the programmed actions come to life.

The Implementation

We treated the interface-building part of the process as iterative. It started with building modules from the base wireframes but evolved constantly as we added small functionalities that felt necessary while the tool materialized into a real, usable interface. As the user journey suggested, the interface consists of three main screens, with a common header providing navigation while highlighting the step-wise process.

Selecting Stories

Programming

Reading

Selecting Stories

We identified multiple ways in which a user might wish to enter a story. One could simply type a story or paste one from the internet. One might also want to photograph a page from a storybook, upload a text file, or even narrate the story aloud. We implemented all these options using popular speech recognition, OCR, and text-extraction APIs. The design of this screen is essentially a text editor, placing all the emphasis on the task at hand. In addition, we provided Madaari with its own database of stories for users to select from, saving them the trouble of browsing external sources to get started.
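As an illustration, here is a minimal sketch of the photograph-to-text path, assuming Tesseract.js for OCR; this write-up does not name the specific APIs Madaari uses, so the library choice, element IDs, and helper names here are hypothetical.

```javascript
// Hypothetical sketch: extract story text from a photographed storybook page.
// Assumes Tesseract.js and jQuery are loaded via <script> tags.
function storyFromImage(imageFile) {
  // Tesseract.js resolves with { data: { text } } for the recognized page.
  return Tesseract.recognize(imageFile, 'eng').then(({ data }) => data.text.trim());
}

// Populate the story editor once OCR finishes.
$('#photo-input').on('change', function () {
  const file = this.files[0];
  if (!file) return;
  storyFromImage(file).then((text) => $('#story-editor').val(text));
});
```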

To increase the usability of a new system like Madaari, we provided a wizard tutorial as a quick walkthrough guide. Later, the same wizard modal is used during the programming step to succinctly describe the functionalities Madaari provides.

Programming using a block-based environment

This step enables a user to be creative and connect the elements of multimedia storytelling with their chosen story. To lower the barrier to developing such complex modules while providing programming-like freedom, we took inspiration from block-based visual programming. Tools like MIT's Scratch, Blockly, and other block-based authoring tools are often used to help novice programmers and children utilize programming capabilities in an interactive, visual construct.

Designing blocks and their functions

We represent output capabilities as action blocks in the programming space, while the sensing capabilities are cue blocks. Control blocks like if-then and repeat are provided to connect action and cue blocks together and to extend control over actions. Further, functionalities like programming a sequential set of actions or combining two cues together are covered through self-evident blocks such as a 'wait for x seconds' action and an 'or' cue block. For ease of use, certain blocks like animation, music, and robot are accompanied by panels that provide a preview for testing actions. All of this is tied to the story, which is always accessible as a reference and can be used to generate keywords as cues.
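To make the block vocabulary concrete, here is a sketch of how a single rule might look as data; the shapes and field names are illustrative, not Madaari's actual schema.

```javascript
// A cue block: fires when the keyword is spoken OR the expression is detected.
const cue = {
  type: 'or',
  cues: [
    { type: 'keyword', word: 'sunny' },
    { type: 'expression', expression: 'smile' },
  ],
};

// A sequence of action blocks; 'wait' inserts a delay between actions.
const actions = [
  { type: 'light', color: 'yellow' },
  { type: 'wait', seconds: 2 },
  { type: 'audio', clip: 'birds.mp3' },
];

// An if-then control block ties the cue to its actions.
const rule = { cue: cue, actions: actions };
```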

The user experience of programming through blocks

We designed the user experience to correctly guide the user, preventing possible errors instead of correcting them. For example, the user cannot drag an action block into the if-condition space or use blocks with blank values. In addition, a compiler-like check was implemented to filter logical errors before proceeding to the reading environment.
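A minimal sketch of what such a compiler-like pass could look like, reusing the illustrative rule shape from above; Madaari's actual checks are not enumerated here.

```javascript
// Validate a block program before entering the reading environment.
function validateProgram(rules) {
  const errors = [];
  rules.forEach((rule, i) => {
    if (!rule.cue) errors.push('Rule ' + (i + 1) + ': missing cue block');
    if (!rule.actions || rule.actions.length === 0) {
      errors.push('Rule ' + (i + 1) + ': no actions attached');
    }
    (rule.actions || []).forEach((action) => {
      // Blocks with blank values are caught here as a second line of defence.
      if (action.type === 'wait' && !(action.seconds > 0)) {
        errors.push('Rule ' + (i + 1) + ": 'wait' block has a blank value");
      }
    });
  });
  return errors; // an empty array means the user may proceed to reading
}
```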

We also provided subtle feedback in response to the user's actions. For example, adding a keyword cue in the programming space highlights those words in the reference story. Generating a word from the helper panel emphasizes the newly generated block through a glow animation, reinforcing the success of the action.
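Both feedback behaviours reduce to small jQuery routines; the sketch below assumes the reference story renders each word in a span, and the element IDs and class names are hypothetical.

```javascript
// Highlight matching words in the reference story when a keyword cue is added.
function highlightKeyword(word) {
  $('#story-reference span').each(function () {
    if ($(this).text().toLowerCase() === word.toLowerCase()) {
      $(this).addClass('keyword-highlight');
    }
  });
}

// Briefly glow a newly generated block to reinforce the success of the action.
function glowBlock($block) {
  $block.addClass('glow');
  setTimeout(() => $block.removeClass('glow'), 1000);
}
```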

How was this made?

All the logic for implementing this interface is written in vanilla JavaScript and jQuery (a bit old school, eh?) from scratch (not MIT Scratch). The program created using blocks is translated into arrays of JavaScript objects referring to actions and cues, linked by index, and uploaded to Firebase for real-time access on multiple devices.
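A sketch of that serialization step, using the namespaced (v8-era) Firebase SDK that a jQuery-based project of this vintage would likely pair with; the database path and field names are assumptions.

```javascript
// Serialize the block program into index-linked arrays and push to Firebase.
function uploadProgram(storyId, rules) {
  const cues = rules.map((r) => r.cue);        // cues[i] pairs with actions[i]
  const actions = rules.map((r) => r.actions);
  return firebase.database()
    .ref('programs/' + storyId)
    .set({ cues: cues, actions: actions });
}

// Any device (e.g., the reading screen) can subscribe for real-time updates.
function loadProgram(storyId, onChange) {
  firebase.database()
    .ref('programs/' + storyId)
    .on('value', (snapshot) => onChange(snapshot.val()));
}
```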

Reading the Story

Finally, the user can experience the digital storytelling performance they programmed using Madaari. Since we incorporated multiple modalities, the reading mode is designed to be dynamic. First, the user receives a prompt to check the connected devices based on the modalities used in the program. Next, the user sees the reading screen with the story set in a large, readable font. Keywords in the story are highlighted according to the program and light up when the particular word is spoken, indicating that the cue was detected. Other dynamic feedback windows, such as an animation preview and an expression detector displaying the live feed from the front camera, are placed in a collapsible panel on the right. We designed the reading screen so the user is comfortable reading the story from their laptop, a tablet, or even a storybook with Madaari running on the side. A dark mode supports night-time stories read by a parent to a child, and the user can enable autoscroll and control its speed while reading.
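Autoscroll is the kind of small touch that is simple to sketch; the element ID and the speed-to-pixels mapping below are hypothetical.

```javascript
// User-controlled autoscroll for the reading screen.
let autoScrollTimer = null;

function setAutoScroll(speed) { // speed: 0 disables; higher values scroll faster
  clearInterval(autoScrollTimer);
  if (speed > 0) {
    autoScrollTimer = setInterval(() => window.scrollBy(0, speed), 50);
  }
}

$('#autoscroll-speed').on('input', function () {
  setAutoScroll(Number(this.value));
});
```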

Detecting Cues

We use the Web Speech API to detect speech and Affectiva's canvas-based API implementation for expression detection. The real-time results from these are matched against the programmed cues loaded from Firebase, and the corresponding actions are executed when a match occurs.
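The speech half of this loop can be sketched with the Web Speech API (exposed as webkitSpeechRecognition in Chrome); keywordCues, actions, and runActions below stand in for the Firebase-loaded program and an action executor from the earlier sketches.

```javascript
// Continuously listen for speech and fire actions on keyword matches.
const recognition = new webkitSpeechRecognition();
recognition.continuous = true;
recognition.interimResults = true;

recognition.onresult = (event) => {
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const transcript = event.results[i][0].transcript.toLowerCase();
    keywordCues.forEach((cue, index) => {
      if (transcript.includes(cue.word.toLowerCase())) {
        runActions(actions[index]); // execute the index-linked actions
      }
    });
  }
};

recognition.start();
```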

User Studies & Limitations

To validate our work, we conducted user studies with 14 adults between the ages of 19 and 45. The study included four tasks: three goal-oriented tasks and a fourth open-ended activity. Feedback was collected through pre-, mid-, and post-study questionnaires. The study allowed us to analyse the strengths and limitations of Madaari. Though most users were able to complete all the tasks and reported Madaari as fairly easy to use, we observed the following limitations with the tool:

Final thoughts and Future

Though it is still a working prototype, I'm satisfied with the current version of Madaari and the tool's ability to enable anyone to author digital storytelling experiences. Madaari could certainly be scaled to a community of storytellers sharing their programmed stories. Additionally, tools like Madaari can be used for teaching children, and multi-sensory engagement through stories can enhance communication with autistic children. I plan to revisit this project soon with changes that allow it to be used effectively by the open-source community.

Madaari was conceptualized and developed in collaboration with Weave Lab, IIIT-D. The project was also submitted as a research paper to CHI 2020.