Dialogue-based tutoring at scale: Design and Challenges

Maria Chang, Matthew Ventura, Jae-wook Ahn, Peter Foltz, Tengfei Ma, Tejas I. Dhamecha, Smit Marvaniya, Patrick Watson, Cassius D’helon, Amy Wetzel, Andy Packard Haas, Kaitlyn Banaszynski, John Behrens, Gailene Nelson, Sharad C. Sundararajan, Ravi Tejwani, Shazia Afzal, Nirmal Mukhi

semanticscholar(2018)

Abstract
In 2016, IBM and Pearson announced a partnership to deliver a next-generation learning service in the form of a dialogue-based tutor. Dialogue-based tutoring systems have demonstrated efficacy, but they are difficult to design and scale across domains. We have developed a framework for enabling digital courseware with a dialogue-based tutoring experience that can be applied to new domains with additional domain-specific content, but without re-design of the conversation flow or use case. The framework uses a content model that’s consistent across domains, which enables a general dialogue-based tutoring strategy. We identify several challenges to this approach and offer recommendations for future work.

The Pearson-IBM partnership

In 2016, IBM and Pearson announced a partnership to deliver next-generation learning services in the form of a digital tutoring system. The aim of this partnership was to create learning experiences powered by Pearson’s high-quality content and IBM’s Watson technologies. While there have been many successful intelligent tutoring systems (ITSs), our challenge was scaling the process across multiple disciplines and titles.

This paper is organized into four parts. First, we briefly review the value of dialogue-based tutoring systems. Second, we describe what we call the Watson dialogue-based tutor. Third, we describe some of our approaches to scaling and evaluating the Watson dialogue-based tutor. Finally, we describe the limitations of our approach and recommendations for future work.

Dialogue-based tutoring systems

Dialogue-based tutoring systems (DBTs) are an approach to ITSs that create a learning experience driven by natural language dialogue and classification of student natural language responses (e.g., Graesser, 2011). DBT conversations can be described as Socratic because the tutor guides the student through concepts via dialogue moves, which can include questions, hints, and other prompts.

DBTs have been claimed to support a variety of learning principles and strategies, like encouraging constructive behaviors and self-explanations (Chi, 2009), deep reasoning questions (e.g., Graesser & Person, 1994), and conceptual understanding through scaffolding (e.g., VanLehn, 2011). DBTs require students to construct natural language responses, which can have a positive impact on memory and comprehension of source text (e.g., McNamara, 1992). DBTs can also provide immediate feedback to facilitate learning (Shute, 2008).

For example, AutoTutor is a DBT, i.e., an ITS that initiates discourse with a student. The discourse patterns of the earliest AutoTutor were inspired by analyses of approximately 100 hours of non-expert human tutoring interactions (Graesser, 2011), which showed that students in need of tutoring are not active, self-regulated learners, and are not aware of their knowledge deficits. This affects how students converse: they do not effectively take command of the tutorial agenda and typically ask only 6-8 genuine information-seeking questions per hour. In contrast, tutors set 100% of the agenda, introduced 93% of the topics, presented 82% of examples, and asked 80% of the questions. Tutors do this by invoking a curriculum script of topics, problems, questions, and examples to drive a Socratic tutoring dialogue with students.

Based on this tutor analysis, AutoTutor was designed to control the conversation through an expectation-misconception discourse model of tutoring (Graesser, 2011). This consists of a set of anticipated correct ideal answers (expectations) and a set of invalid answers frequently expressed by students (misconceptions). AutoTutor follows this design in a five-step tutoring framework: (1) tutor poses a question/problem, (2) student attempts to answer, (3) tutor provides brief evaluation as feedback, (4) collaborative interaction to improve the answer, (5) tutor checks if student understands.
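
The evaluation step of an expectation-misconception model can be illustrated with a small sketch. It is not AutoTutor's implementation: the names `Item`, `covers`, and `evaluate` are hypothetical, the word-overlap matcher is a toy stand-in for a real natural language response classifier, and the 0.6 threshold is arbitrary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Item:
    """One tutoring problem (hypothetical structure, not AutoTutor's schema)."""
    question: str
    expectations: List[str]  # anticipated components of an ideal answer
    misconceptions: List[str] = field(default_factory=list)  # common invalid answers


def covers(response: str, target: str, threshold: float = 0.6) -> bool:
    """Toy matcher: fraction of the target's words that appear in the response.

    Stands in for a real response classifier; the threshold is an assumption.
    """
    target_words = set(target.lower().split())
    response_words = set(response.lower().split())
    if not target_words:
        return False
    return len(target_words & response_words) / len(target_words) >= threshold


def evaluate(item: Item, response: str) -> Dict[str, List[str]]:
    """Sort a student answer into met expectations, missing ones, and misconceptions."""
    met = [e for e in item.expectations if covers(response, e)]
    missing = [e for e in item.expectations if e not in met]
    flagged = [m for m in item.misconceptions if covers(response, m)]
    return {"met": met, "missing": missing, "misconceptions": flagged}
```

In a full tutor, the `missing` list would drive the collaborative step (4), and any flagged misconception would trigger corrective feedback before the tutor checks understanding in step (5).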

Efficacy of DBTs

Steenbergen-Hu and Cooper (2014) conducted a meta-analysis of 39 studies evaluating the use of ITSs (including DBTs) in higher education. The researchers found an overall, moderate, positive effect (g = .35) favoring the use of ITSs over other instructional conditions. When compared specifically to alternatives that were either “self-reliant learning activities” or no-treatment conditions, the use of ITSs appeared to offer a large advantage (g = .86). AutoTutor in particular has shown significant learning gains over non-interactive learning materials in a variety of math and science domains: computer literacy, physics, biology, and critical thinking (Graesser, 2011). Typically, higher gains were found for more complex questions, such as “how” and “why” questions, versus shallow questions, such as “who” or “what” questions (Nye, Graesser, & Hu, 2014).

Scalability of DBTs

While DBTs have demonstrated a wide range of possible behaviors and pedagogical strategies, building a DBT for a new domain, course, or textbook is a non-trivial task. Even when the use case and learning goals are clearly defined, creating the necessary domain models for these tutoring systems can be very challenging for domain experts. This is a general problem for ITSs, which is why the researchers behind the most widely adopted tutoring systems have also developed authoring tools (e.g., Aleven, McLaren, Sewall, & Koedinger, 2009).

Watson DBT

The Watson dialogue-based tutor (WDBT) follows the design of AutoTutor in many respects, but with a content creation and iterative design cycle to support application to new domains. WDBT begins with deep reasoning questions and then provides hints to help students give a response that matches a set of assertions or knowledge components. An example transcript from an interaction with WDBT is shown in Table 1.
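
The flow just described, a main question followed by hints aimed at assertions the student has not yet matched, can be sketched as follows. The names `Assertion`, `MainQuestion`, and `next_move` are hypothetical, response classification is assumed to happen elsewhere, and offering hints in authoring order is an assumption: the source does not state WDBT's selection policy. The example content is taken from the exchange in Table 1.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Assertion:
    """One knowledge component of the ideal answer, with the hint that elicits it."""
    text: str
    hint_question: str


@dataclass
class MainQuestion:
    prompt: str
    assertions: List[Assertion]


def next_move(question: MainQuestion, matched: Set[str]) -> Optional[str]:
    """Return the hint for the first assertion not yet matched, or None when all are.

    `matched` holds the texts of assertions that a classifier (not shown)
    has already credited to the student.
    """
    for assertion in question.assertions:
        if assertion.text not in matched:
            return assertion.hint_question
    return None


# Example content drawn from Table 1.
SEX = "Sex refers to the biological differences between men and women."
GENDER = "Gender refers to socially created differences between men and women."
EXAMPLE = MainQuestion(
    prompt="What is the nature of gender differences?",
    assertions=[
        Assertion(SEX, "How do sociologists define sex?"),
        Assertion(GENDER, "What is gender?"),
    ],
)
```

Calling `next_move(EXAMPLE, set())` yields the first hint question; once the classifier credits an assertion, adding its text to `matched` moves the dialogue to the next hint, and `None` signals that the main question is complete.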

WDBT is made up of six components to achieve this functionality: (1) Domain Model, (2) Dialogue Content, (3) Natural Language Response Classification, (4) Question Answering, (5) Learner Modeling, and (6) Dialogue Management.

Domain Model

The domain model defines the knowledge and skills we want students to learn. We create a domain model for a specific title (i.e., textbook) that breaks down the knowledge into educational objectives consisting of learning objectives and enabling objectives. Learning objectives are broad learning goal statements, e.g., “Analyze physical changes that occur in middle adulthood.” Enabling objectives are more granular learning goal statements that support the learning objective, e.g., “Identify the physical benchmarks of change in middle adulthood.” Educational objectives serve as the foundation for creating content and assessment in Pearson, so the same framework was used to enable WDBT. The domain model also contains misconception statements that are aligned to educational objectives.

Domain experts create the domain models, choosing learning and enabling objectives that are particularly difficult for students or that would benefit from a conversational learning experience. For each title, domain models typically contain 24 learning objectives and 110 enabling objectives (about 5 enabling objectives per learning objective). Because the learning objectives follow the general organization of the title, the domain model acts as an extension of the core content in the title.

Dialogue Content

The dialogue content for WDBT consists of a mix of content created manually by domain experts and automatically extracted content. Subject matter experts (SMEs) are responsible for authoring:

• Main question and answer pairs aligned to both learning and enabling objectives. The questions are broad, high-level conceptual questions that assess deep comprehension of source text. WDBT delivers these questions to students. The corresponding answers represent ideal responses and are used to evaluate student answer attempts.
• Hint questions and assertions for every main question and answer pair. Assertions are independent knowledge components that make up a main answer. Hints are intended to elicit an assertion from a student. WDBT delivers these hint questions to students.
• Misconception statements and corresponding feedback messages for every applicable learning and enabling objective. If response classification detects that a student is expressing a misconception, WDBT delivers the corresponding feedback message.

Other forms of dialogue content are automatically or semi-automatically extracted:

• Question-answer pairs based on key terms and other sources within the textbook.
• Concept groupings based on key concepts found in the unstructured text. These groupings are automatically extracted but validated by a subject matter expert.
• Fill-in-the-blank prompts that are dynamically created within the dialogue using an assertion and response classification gap analysis.

This approach to content organization is applied across titles to facilitate faster adaptation to new titles, but it is not without its limitations. This process still requires a considerable amount of human effort, although much less than would be required if new conversational scripts (and associated content) were created for each domain. This points to the inherent trade-off between human effort savings and dialogue richness, which can be addressed by better automatic content extraction techniques.

Table 1: An excerpt of an interaction between a learner and the Watson Dialogue-Based Tutor (WDBT). Turns are labeled by speaker.

WDBT: What is the nature of gender differences?
Learner: its is common for gender to be defined by sex
WDBT: Looks like I have something different... How do sociologists define sex?
Learner: biological differneces
WDBT: Well said. Sex refers to the biological differences between men and women. There are 2 key facts left in this review. What is gender?
Learner: idk
WDBT: No problem. I’ll help you get this! Gender refers to socially created differences between men and women. How are gender diff