Augmented AI for ESL Learning

Written by sinyaa | Published 2020/09/07
Tech Story Tags: teaching-english-online | english-as-secondary-language | english | microsoft-azure | phaser3 | open-source | machine-learning | augmentation

TLDR In traditional forms of classroom ESL training, the emphasis is placed on syntax, critical vocabulary, and building transitions from the original known language to English. With forward chaining, an AI system can create a new fact, based on knowledge of an old truth, and rules about how facts interact. The problem with a bot approach to ESL lies in the fact that conversation is highly variable and achieving an end can be accomplished in numerous ways. Scripting only a single ideal interaction between a bot and a human is not sufficient. The bot needs to know, or learn, how to deal with exceptions.

With the development of new technologies in artificial intelligence, there is an opportunity to introduce Guided Conversational Learning into language acquisition. JAAS Foundation, a non-profit organization, provides opportunities for learners worldwide to build a foundation for their future with free educational games powered by state-of-the-art e-Learning technologies.
JAAS Foundation developed and maintains an open-source system for guided conversation in ESL learning. Results from a cohort of nearly 4,000 learners show that well-designed interactive guided conversation games are engaging, self-reinforcing, and broadly accepted by learners. More than 31% of learners completed all activities and scenario steps.
In traditional forms of classroom ESL training, the emphasis is placed on syntax, critical vocabulary, and building transitions from the original known language to English. In a legacy ESL course, instructors often rely on repetitive grammar practice, reading, and exercises from books, flashcards, and other passive methods (Zaban, 2018). Keeping students motivated in such a lethargic learning process is a challenge. Zaban contends that polyglot learning is hampered by a lack of compelling content, dynamic presentation, and learner involvement. Judgment, whether imagined or real, is also seen as a potential barrier, often making the learner reluctant to attempt to speak the new language.
All of these concerns can be addressed with the use of well-designed interactive conversation agents.
The current study shows how bots can be programmed to be dynamic, engaging, visually stimulating, and supportive of repetition with spaced-interval learning while being completely devoid of the judgment that might discourage an ESL learner.
The problem with a bot approach to ESL lies in the fact that conversation is highly variable and achieving an end can be accomplished in numerous ways. Scripting only a single ideal interaction between a bot and a human is not sufficient.
The bot needs to know, or learn, how to deal with exceptions. Some exceptions may be valid even though they are not ideal. A learner may express something valid, logical, and grammatically correct that makes intuitive sense but was not anticipated by the program. Other responses may contain appropriate vocabulary but be nonsensical due to grammar issues, construction faults, or other problems. Finally, some answers will simply be incorrect.
A solution to the variability-of-speech problem includes the design of an AI conversational model based on flexible templates that can be delivered in a highly governed methodology. In AI research, this form of directive pathway through a knowledge base is referred to as forward chaining, or forward reasoning (Irawen, n.d.). With forward chaining, an AI system can create a new fact based on knowledge of an old truth and rules about how facts interact. Manisekaran (2018) offers an example in which the chaining takes the logical form of:
Conclude from “A”, given that “A implies B”, that “B” is true.
Example:
A: It is raining
A implies B: If it is raining, then the street is wet 
B: The street is wet.
Based on knowledge of the fact “A” and the rule of interaction “A implies B,” the program was able to state a new fact, “B.” While this is an elementary example, it serves to show the nature of AI forward chaining as a deductive inference. It also leads to the characterization of reverse chaining as an abductive logical inference, where an observed conclusion “B,” combined with the rule “A implies B,” suggests that “A” may be true.
In the simple example above:
Conclude from “B,” given that “A implies B,” that “A” is true.
Example:
B: The street is wet.
A implies B: If it is raining, then the street is wet
A: It is raining.
One rule, “If it is raining, the street is wet,” leads to two AI chaining outcomes based on known or learned facts.
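To make the two directions concrete, here is a minimal rule-engine sketch in JavaScript. The rule format and the function names are purely illustrative and are not taken from the DTML code base.

```javascript
// A minimal rule-engine sketch of forward and reverse chaining.
// The rule format and function names are illustrative only; they are not DTML code.
const rules = [
  { if: ["it is raining"], then: "the street is wet" },
];

// Forward chaining: start from known facts and keep applying rules
// until no new fact can be derived.
function forwardChain(facts, rules) {
  const known = new Set(facts);
  let changed = true;
  while (changed) {
    changed = false;
    for (const rule of rules) {
      if (rule.if.every((f) => known.has(f)) && !known.has(rule.then)) {
        known.add(rule.then); // deduce a new fact, e.g. "the street is wet"
        changed = true;
      }
    }
  }
  return known;
}

// Reverse (abductive) step: given an observed fact, list the premises
// that could explain it under the known rules.
function explain(observation, rules) {
  return rules.filter((rule) => rule.then === observation).map((rule) => rule.if);
}

console.log(forwardChain(["it is raining"], rules)); // Set { "it is raining", "the street is wet" }
console.log(explain("the street is wet", rules));    // [ [ "it is raining" ] ]
```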
As the knowledge base increases, the complexity of implications and valid results grows exponentially. The nature of authentic patterns for an ESL dialog can be extremely complex, and this variability can be normalized through the use of highly orchestrated, directed chaining. In learning theory, this is known as Guided Conversation (Steward, Keegan, Holmberg, Beare, & Smith, 1983). Guided conversation in all forms of distance education is seen as a directed means to transfer knowledge, permitting natural language variability while not straying too far from the patterned template for specific lesson content.

In JAAS Foundation's open-source project, the DTML (Distance Teaching and Mobile Learning) platform, a conversational learning system is represented as a finite state machine, and the state machine execution environment is represented as a game and a set of back-end services.
It is a finite state machine in which each state is described by:
  • q, a trigger phrase
  • W, a set of candidate words, where W = {w1, w2, …, wn}
  • P, a set of solution phrases, where P is a set of key-value pairs {(pi, metadata)} with each phrase pi ⊆ W. The metadata provide additional information about a word in the phrase and are used in solution mapping.
  • S, a solution provided by a user, where S ⊆ W
A state therefore flows as q → W → P.
Based on the vocabulary and relationships introduced in the scripts, a finite number of solutions can be constructed from the candidate words to form a solution set. Each solution in the set would, in turn, trigger a state transition. The collection of solution phrases contains optional metadata describing the properties of specific words in the phrase.
For example, the phrase “Hello, how are you doing?” can be encoded into a solution phrase as “[hello] how are you [doing]”, indicating that the words “hello” and “doing” are optional. Therefore, solutions such as “how are you?” or “how are you doing?” will be considered valid. Figure 1 depicts a simple, successful transition, in which the learner has selected the correct vocabulary words S and strung them together into a valid response matching a phrase in P, triggering a transition into another state.
Figure 1. DTML Finite State Machine
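As a concrete illustration, a single state of this kind might be encoded roughly as follows. This is only a sketch: the field names (trigger, candidateWords, next) and the bracket-expansion helper are hypothetical and do not reflect the actual DTML JSON schema.

```javascript
// A hypothetical encoding of one state: trigger phrase q, candidate words W,
// and solution phrases P with optional words marked in [brackets].
const state = {
  trigger: "Hello, how are you doing?",                                        // q
  candidateWords: ["hello", "how", "are", "you", "doing", "fine", "thanks"],   // W
  solutions: [
    { phrase: "[hello] how are you [doing]", next: "state_ask_table" },        // P
  ],
};

// Expand "[hello] how are you [doing]" into the set of phrases it accepts.
function expandSolution(pattern) {
  const tokens = pattern.split(" ");
  let variants = [""];
  for (const token of tokens) {
    const optional = token.startsWith("[") && token.endsWith("]");
    const word = optional ? token.slice(1, -1) : token;
    variants = variants.flatMap((v) =>
      optional ? [v, (v + " " + word).trim()] : [(v + " " + word).trim()]
    );
  }
  return variants;
}

// Check whether a learner's answer S triggers the transition.
function matchSolution(state, answer) {
  const normalized = answer.toLowerCase().replace(/[?.,!]/g, "").trim();
  return state.solutions.find((s) => expandSolution(s.phrase).includes(normalized));
}

console.log(matchSolution(state, "How are you?"));            // matches, optional words omitted
console.log(matchSolution(state, "Hello how are you doing")); // matches with optional words
```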
In the case of a failed transition, the application must handle a state transition that prompts the user to try again. Retrying until successful is an essential element of the DTML learning approach and serves to reinforce ESL learning. This approach resembles natural learning, in which parents or teachers correct a student by providing the right answer.
The system invokes the suggestion service, which dynamically analyzes the set of possible phrases P and the solution S provided by the user by calculating similarity metrics based on Levenshtein distance (Navarro, 2001) and a set of heuristic rules. Figure 2 shows a simplified state example with a single failure and a directed prompt for a second user input.
Figure 2. DTML State Machine with Input Validation
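A minimal version of such a similarity check, using plain Levenshtein distance with a simple edit-distance threshold in place of DTML's full heuristic rule set, might look like this:

```javascript
// Classic dynamic-programming Levenshtein distance between two strings.
function levenshtein(a, b) {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,                                       // deletion
        dp[i][j - 1] + 1,                                       // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)      // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// Suggest the closest solution phrase when the learner's answer is not an exact match.
// The threshold of 3 edits is an arbitrary illustration, not a DTML constant.
function suggestClosest(answer, solutionPhrases, maxDistance = 3) {
  let best = null;
  for (const phrase of solutionPhrases) {
    const d = levenshtein(answer.toLowerCase(), phrase.toLowerCase());
    if (d <= maxDistance && (best === null || d < best.distance)) {
      best = { phrase, distance: d };
    }
  }
  return best; // null means "try again" with a generic hint
}

console.log(suggestClosest("how are yuo", ["how are you", "how are you doing"]));
// → { phrase: "how are you", distance: 2 }
```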
The DTML system architecture comprises a Conversation Editor and a Game Execution Environment, stacked on a JavaScript SDK and supported by a REST API. All of this is open-source code available from DTML via GitHub.

The game execution environment was developed with the Phaser.js framework in JavaScript. Phaser.js is an open-source HTML5 framework for the development of 2D games (Marín-Vega, 2016). Execution is completely client-side, meaning that all code runs in the browser and the only input needed is the game configuration file, represented as JSON. This setup allows the application to cache the game executor on a CDN for faster delivery and avoids the compatibility problems of Flash (Santally, 2016).
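As a rough sketch of this setup (the scene code, config file path, and state fields below are illustrative and not the actual DTML game executor):

```javascript
// Minimal Phaser 3 bootstrap: everything runs in the browser, and the only
// game-specific input is a JSON configuration file.
const game = new Phaser.Game({
  type: Phaser.AUTO,
  width: 800,
  height: 600,
  scene: {
    preload() {
      // Fetch the conversation script; this is the single JSON input mentioned above.
      this.load.json("conversation", "configs/restaurant.json");
    },
    create() {
      const script = this.cache.json.get("conversation");
      // Render the first trigger phrase of the (hypothetical) state machine schema.
      this.add.text(40, 40, script.states[0].trigger, { fontSize: "24px" });
    },
  },
});
```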
The DTML Conversation Editor, also developed in JavaScript, includes an instance of Talkit, a web-based, non-linear game dialogue editor. Talkit offers an intuitive, powerful visual GUI, and the scripts and game content it exports are automatically generated and encoded as JSON. In addition, there is a validator, a configuration extension for the game, and a module import/export capability. The editor is intended to be used by an educator creating scripts for the bot to execute.
The DTML conversation script editor employs a simple and intuitive block-based GUI, making it easy to build a game script in stages. Figure 3 shows an elementary game script example:
Figure 3. DTML Conversation Editor
The simple and intuitive nature of the editor makes it suitable for teachers and game designers without formal knowledge of any programming language. The short feedback loop afforded by the editor allows for fast building, testing, revising, and publishing of game scripts. The dialogues can be as straightforward or as complex as needed for a particular scenario, or to target the needs of a specific audience. The machine learning technologies that make up the DTML solution are applied to help create the most effective and efficient dialogues.
Finally, the DTML Game Execution Environment features Phaser, an HTML5 game framework that produces games which are completely cross-platform and cross-browser compatible. The execution environment also features a robust conversation finite state machine, user context control, a speech processor, and a game configuration module.
The DTML JavaScript SDK is a lightweight, client-side SDK that provides the functionality required by guided conversation games, including methods for starting and stopping a game, obtaining lists of vocabulary words, scoring responses, recording telemetry events, and so on. Figure 4 shows the overall architecture.
Figure 4. DTML System Architecture
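By way of illustration only, a caller of such an SDK might look roughly like the following. The method names (startGame, getVocabulary, scoreResponse, trackEvent, stopGame) are hypothetical stand-ins for the capabilities listed above, not the documented DTML SDK surface.

```javascript
// Hypothetical usage of a guided-conversation SDK of the kind described above.
// All method names here are illustrative placeholders, not the real DTML API.
async function playRestaurantScenario(sdk) {
  const session = await sdk.startGame("conversation_restaurant");

  // Pull the candidate vocabulary for the current state.
  const words = await sdk.getVocabulary(session.id);
  console.log("Candidate words:", words);

  // Score a learner's attempt and record a telemetry event for it.
  const result = await sdk.scoreResponse(session.id, "how are you doing");
  sdk.trackEvent("answer_submitted", { sessionId: session.id, correct: result.correct });

  await sdk.stopGame(session.id);
}
```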
The DTML online platform has, to date, been very successful, reaching learners in 177 countries around the world. More than 50,000 sessions run on the platform every month. Learners exposed to the ESL learning games and other DTML bots rate the experience, on average, at 4 out of 5 stars.
In the thirty days following the introduction of the latest generation of DTML Guided Conversation Games, 3,855 students started a new Guided Conversation ESL game. Of this population, almost all used the program's repeat functionality to reinforce guided ESL learning. More than half of the cohort used hints provided as scaffolding in the learning process. More than 1,180 learners, or 31%, reached the end of the game; 680 learners, or 18%, ran out of lives during the game; and 2,010 learners, or 51%, dropped off at some point without completing the entire scenario.
You can try one of the conversational games published on the DTML platform here: https://dtml.org/esl/conversation_restaurant, or look at the code in this GitHub repository: https://github.com/distance-teaching-and-mobile-learning/DTML.Games
Written by:
Dr. Aleksey Sinyagin, Executive Director, JAAS Foundation
Prof. Devan Shepherd, Assistant Professor, School of Technology, Rasmussen College

Written by sinyaa | Aleksey is a Sr. Engineering Manager with over 15 years of experience delivering software solutions.
Published by HackerNoon on 2020/09/07