UQFact on Smart Phone
Abstract
Answering the same questions every day is tedious and a waste of human resources. Traditional QA (Question Answering) systems are designed to analyse natural language and retrieve answers automatically. With the growing popularity of chatbots, this project brings a traditional QA system to Android devices so that students can ask a wide range of questions about UQ facts. The project not only helps reduce UQ receptionists' workload, but also allows students to ask questions at any time, from anywhere.
Chapter 1: Introduction
This project builds a QA (Question Answering) chatbot on Android devices that can hold conversations with students and answer questions about UQ.
Figure 1 shows the basic architecture of UQFact.
Figure 1: Basic Architecture of UQFact
1.1 The Goal of UQFact
Students can simply talk to their mobile phone and get an answer back. The knowledge base mainly focuses on UQ facts, such as program information, course profiles, location details, common knowledge about UQ, etc. It can answer questions like: "What is the cost of the Master of Computer Science?", "What is the length of the Bachelor of Information Technology?", "What is the course description of Information Systems?", "Where is the School of Business?", "Where can I get a student ID card?", etc.
We understand that Australians like to chit-chat when they first meet, so simple casual chat is also supported. Greetings, farewells, emotions, hobbies, and even some fun phrases can be handled by UQFact. For example, you can send queries like: "How are you?", "Do you have a hobby?", "Are you real?", "Where do you live?", "Where do you work?", etc. And remember that UQFact keeps learning every day!
Figure 2 shows a simple QA chat scenario on UQFact.
Figure 2: Simple Chat Scenario
1.2 The Scope of UQFact
The main logic of UQFact is to understand the user's intention and fetch the answer from our relational database. An Android application has been designed and developed to provide the interaction platform for users.
This system focuses on integrating different technologies, platforms and algorithms to build an intelligent chatbot that answers questions and holds conversations with end users. Speech recognition and speech synthesis are integrated into the Android application; pre-defined conversation trees power the dialogue mode; intent recognition and natural language processing are used to understand questions; a model-driven methodology is used to generate labelled data for supervised learning; a Tornado web server handles HTTP requests; and a MySQL database stores UQ knowledge, including all courses, programs, etc. To build the knowledge base, two web crawlers have been programmed to fetch UQ facts from UQ websites: one retrieves information from static web pages, and the other fetches data from dynamic websites.
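As a rough illustration of the server side, a Tornado handler of the following shape can accept a question over HTTP and return an answer. This is only a minimal sketch: the /ask route, the port, and the lookup_answer function are illustrative assumptions, not the project's actual code.

import tornado.ioloop
import tornado.web

def lookup_answer(question):
    # placeholder for the real pipeline: question understanding + database lookup
    return "Sorry, we could not answer this question."

class AskHandler(tornado.web.RequestHandler):
    def get(self):
        question = self.get_argument("q", "")
        # Tornado serialises a dict into a JSON response automatically
        self.write({"question": question, "answer": lookup_answer(question)})

def make_app():
    return tornado.web.Application([(r"/ask", AskHandler)])

if __name__ == "__main__":
    make_app().listen(8888)
    tornado.ioloop.IOLoop.current().start()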
The most powerful feature of UQFact is its self-learning capability. If a user asks a question that UQFact cannot answer, the chatbot will seek help by replying "Sorry, we could not answer this question. Would you be able to tell me the correct answer?". This is particularly useful for human receptionists, because they know which questions are asked most often. They can use this feature to simulate question answering scenarios and train UQFact by voice.
A deeper and more detailed explanation of how UQFact is built can be found in Chapter 3, Design and Implementation. A full list of UQFact's current features can be viewed in Appendix I.
1.3 Structure of the Thesis
The thesis is organized as follows. Chapter 2 introduces and discusses the research background in detail. Chapter 3 describes how the system is designed and what has been done to implement UQFact. Chapter 4 examines test results, current problems and challenges. The thesis ends with a discussion of how to make UQFact better in the future.
Chapter 2: Research Background
In this chapter, technical concepts and related work will be reviewed. This project involves a wide range of techniques, including question answering systems, chatbots, speech recognition and speech synthesis, natural language processing, model-driven development, etc.
2.1 Question Answering System
Research in Question Answering (QA) is not new; the QA problem has been addressed in the literature since the beginning of computing machines. Two early QA systems were BASEBALL and LUNAR, both very effective in their chosen domains[1]. In the 1960s and early 1970s, many other question answering programs were introduced in various narrow domains of knowledge. These systems all relied heavily on carefully constructed and organized knowledge bases[1], which essentially consisted of pre-defined questions paired with pre-defined answers. Building this kind of knowledge base is time consuming and does not scale to wide-domain questions: it requires more space to store sentences (or phrases), and the time cost of matching sentences increases dramatically. Moreover, this kind of sentence-to-sentence mapping fails easily, because different people often ask the same question in different ways.
In the late 1970s and 1980s, computational linguistics developed rapidly, and natural language processing techniques were introduced and widely used in QA systems. Systems like PLANES[2] used a number of augmented transition networks to match phrases with specific meanings, which was especially useful for interpreting vague or ill-formed queries. From the perspective of current research in question answering, the key limitation of this kind of work is that the knowledge base is still restricted to a limited domain, rather than being an open-ended collection of unstructured text that can answer a wider range of questions[3].
Modern QA systems typically include a document retrieval module that uses search engines to gather documents likely to contain the answer, and several NLP modules that filter those documents down to a short but explicit answer[1]. Combined with supervised machine learning techniques and ever more training data, applications like Siri and Google Now are able to answer questions from almost any domain[4].
Figure 3 shows the general architecture of a question answering system. The six main tasks are: Question Analysis, Document Collection Process, Candidate Document Selection, Candidate Document Analysis, Answer Extraction, and Response Generation.
Figure 3: General Architecture for a Question Answering System[3]
2.2 Chatbot
Since Alan Turing introduced the famous Turing Test[5] as a replacement for the question "Can machines think?", his ideas have been widely discussed, attacked and defended over and over. For some people, Turing's paper represents the beginning of artificial intelligence, and the Turing Test has become their ultimate goal[6]. Socially intelligent robots will provide both a natural human-machine interface and a mechanism for more complete, human-like behavior[7]. However, requiring a robot to physically act like a human is much harder than making it think and talk like one. The easiest way to mimic a human is therefore to use a smart dialogue system.
2016 is shaping up to be a remarkable year for social robots. Both start-up companies and large organizations are entering this market and pushing the capabilities of social robots forward. To build a chatbot, three issues need to be considered: question range, answer type, and conversation length[8].
2.2.1 Closed Domain vs. Open Domain
In a closed domain system, inputs and outputs are somewhat limited, and the robot is only expected to answer a certain range of questions. Customer support and food ordering robots are two typical closed domain chatbots. A food ordering system can answer questions like "What is the best food in your restaurant?", but it is not expected to talk about sports or politics.
In open domain systems, users can talk about almost any topic and expect a relevant response. Apple Siri, Amazon Alexa, and Google Assistant are three famous open domain chatbots. Although users can chat with them about any topic, the responses are not always satisfactory. The infinite number of possible topics makes this a hard problem, and it remains an active research area.
2.2.2 Retrieval-Based Answers vs. Generative Answers
Retrieval-based answering requires a pre-defined knowledge base from which an appropriate response is picked based on the user's question and context. As a result, a retrieval-based robot's performance depends heavily on the size of its database.
Generative answering normally builds the response from scratch. If an answer cannot be found in the database, the system tries to fetch results from a web search engine and then "translate" the long result list into a single sentence or short paragraph to send back to the user. Searching for relevant information is easy, but converting a long result list into an appropriate response is hard.
2.2.3 Long Conversations vs. Short Conversations
The length of conversation plays an important role in building a chatbot: the longer the conversations you wish to support, the harder the agent is to build. One study that used the ALICE system to help Chinese university students practise English shows that 62% of users chatted for fewer than 10 lines, and that 8.5% of the time the ALICE bot could not find a pattern matching the user's input and had to fall back on a root-level default response[9]. For all conversational chatbots, one thing is common: maintaining a dialogue for a sustained period of time is hard.
A common solution for keeping track of what has been said and what information has been exchanged is to embed the conversation into a vector or a dialogue tree. However, doing that for long conversations is still challenging.
2.3 Question Analysis and Natural Language Processing
Natural language processing (NLP) is a field of computer science and linguistics concerned with the interactions between computers and humans using natural languages[10]. Challenges in NLP include natural language understanding (enabling computers to derive meaning from human language input), automatic summarization, and many others[11].
In this project, we mainly focus on the question analysis problem, which is essential because a good understanding of the question helps us select a good answer from the knowledge database. Currently, there are two major approaches to question understanding: simple rule-based pattern matching, or intent recognition classifiers built with machine learning algorithms.
2.3.1 Rule-based understanding
One of the most famous rule-based matching methods is AIML[12], an XML-based description language used to develop natural language software agents. It was the basis for the world-famous ALICE bot and many other modern chatbots. AIML is a rule-based language, which means the program contains a set of rules, each consisting of two parts: a condition and an action. The system works by selecting a rule whose condition matches the user's input and then executing that rule's action. A simple AIML file is shown below:
<aiml>
  <category>
    <pattern>Hi, My name is *</pattern>
    <template>Hi, Nice to meet you
      <set name="userName"><star/></set>
    </template>
  </category>
</aiml>
This file contains a single greeting rule. For example, if the user inputs "Hi, My name is Jerry", the system will reply "Hi, Nice to meet you Jerry".
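To illustrate how such a rule can be executed, the short Python sketch below matches the wildcard pattern above with a regular expression. It only mimics the basic condition/action idea described here; it is not how a real AIML interpreter such as the ALICE engine works internally.

import re

RULES = [
    # (pattern with "*" wildcard, response template using the captured text)
    ("Hi, My name is *", "Hi, Nice to meet you {star}"),
]

def respond(user_input):
    for pattern, template in RULES:
        # turn the "*" wildcard into a regex capture group
        regex = re.escape(pattern).replace(r"\*", "(.+)")
        match = re.fullmatch(regex, user_input, re.IGNORECASE)
        if match:
            return template.format(star=match.group(1))
    return "Sorry, I do not understand."

print(respond("Hi, My name is Jerry"))   # -> Hi, Nice to meet you Jerry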
Rule-based understanding allows programmers to create a natural language interface while keeping the implementation simple, easy to understand and maintainable. The learning curve of the rule-based method is quite flat, and it is easy to collaborate on within a group. Therefore, rule-based understanding methods are still very popular.
2.3.2 Intent Recognition
For conversational agents, another widely used approach to understanding the user's query is intent recognition. It tries to detect the intent behind the user's utterance and extract the relevant entities[13]. What is an intent? An intent can be scheduling a meeting, booking a flight ticket, finding a location, etc. What are entities? Entities are the elements that carry the essential information needed to execute an action, such as the time of the meeting, the flight number for the ticket booking, or the name of the location to find.
With the development of machine learning and natural language processing techniques, a number of language understanding platforms now provide services based on intent recognition and entity extraction, such as Api.ai, Wit.ai, Amazon Alexa, and Microsoft LUIS. Amazon Alexa, for example, requires developers to define a voice interface[14] that specifies a mapping from the user's input to intents the system can handle. A voice interface has two inputs: an intent schema in JSON format, and some sample utterances as labelled data. Below is an example of how to build an intent that replies with a course description to the user's query.
Firstly, a "CourseDescriptionIntent" needs to be created; you can think of a slot as an entity.
{
  "intent": "CourseDescriptionIntent",
  "slots": [
    {
      "name": "courseName",
      "type": "COURSE"
    }
  ]
}
Secondly, entity “COURSE” needs to be defined (e.g. data mining, software process).
COURSE artificial intelligence | the software process | data mining
And lastly, some labelled data needs to be fed into the system.
CourseDescriptionIntent tell me something about {courseName}
CourseDescriptionIntent tell me about {courseName}
CourseDescriptionIntent what is the course description of {courseName}
Now, if the user inputs "what is the course description of artificial intelligence?", the system will understand that this query asks for a brief description of the artificial intelligence course.
Although intent recognition sounds complicated, it is essentially a combination of tokenization, part-of-speech tagging, phrase correlation, and named-entity extraction. Therefore, developers can certainly implement their own intent recognition classifier using natural language processing packages such as Stanford CoreNLP[15] for Java, or NLTK[16] for Python.
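As a rough illustration of this idea, the sketch below combines NLTK tokenization and part-of-speech tagging with naive keyword-based intent scoring and substring entity matching. The intent names, keyword sets and course list are made up for the example and are far simpler than a production classifier.

# A minimal home-made intent recognition sketch, assuming NLTK is installed (pip install nltk)
import nltk

# one-off downloads of the tokenizer and tagger models
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

KNOWN_COURSES = ["artificial intelligence", "data mining", "the software process"]  # hypothetical

INTENT_KEYWORDS = {
    "CourseDescriptionIntent": {"description", "about", "describe"},
    "CourseLocationIntent": {"where", "location"},
}

def recognise(query):
    tokens = [t.lower() for t in nltk.word_tokenize(query)]
    tags = nltk.pos_tag(tokens)            # part-of-speech tags (not used further in this sketch)
    # naive intent scoring: count keyword overlaps with the query tokens
    scores = {intent: len(kw & set(tokens)) for intent, kw in INTENT_KEYWORDS.items()}
    intent = max(scores, key=scores.get)
    # naive entity extraction: longest known course name contained in the query
    entities = [c for c in KNOWN_COURSES if c in query.lower()]
    entity = max(entities, key=len) if entities else None
    return intent, entity, tags

if __name__ == "__main__":
    print(recognise("What is the course description of artificial intelligence?"))
    # -> ('CourseDescriptionIntent', 'artificial intelligence', [...])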
In this project, Api.ai is integrated as one of the question understanding tools; details will be covered in a later section.
2.4 Model Driven Development and Template
The model-driven approach can increase development productivity and quality by describing important aspects with human-friendly abstractions and by generating common fragments using templates[17]. Model-driven methodology is normally used in application development; for example, a student management system can easily be written this way. The model of such a system would contain students, courses, grades, and the relationships between each class. The figure below shows the basic idea of model-driven development.
Figure: Basic idea of model-driven development
How is model-driven development used in building a question answering system? As described in the previous section, defining an intent requires feeding in entities. Since different platforms have different entity formats, copying and pasting them by hand is very time consuming. Hence, creating a template based on each platform's format and then generating the entity file from the database models is a great solution.
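The sketch below shows the flavour of this approach: course titles that would normally be read from the MySQL course table (replaced here by a hard-coded list so the example runs standalone) are pushed through string templates to produce the pipe-separated entity file and a sample utterance in the format shown earlier. File names and templates are illustrative only.

from string import Template

# stand-in for rows selected from the database model
course_titles = ["artificial intelligence", "the software process", "data mining"]

ENTITY_TEMPLATE = Template("$entity_name $values\n")
UTTERANCE_TEMPLATE = Template("$intent what is the course description of {$slot}\n")

# generate the entity definition file from the course models
with open("entities.txt", "w") as f:
    f.write(ENTITY_TEMPLATE.substitute(
        entity_name="COURSE",
        values=" | ".join(course_titles)))

# generate a labelled sample utterance file
with open("utterances.txt", "w") as f:
    f.write(UTTERANCE_TEMPLATE.substitute(
        intent="CourseDescriptionIntent", slot="courseName"))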
2.5 Speech Recognition and Speech Synthesis
The task of speech recognition is to convert speech into a sequence of words by computer program. Since speech is the most natural communication modality for humans, the ultimate dream of speech recognition is to enable people to communicate with machines more naturally and effectively. In the early 2000s, speech recognition was still dominated by traditional approaches such as Hidden Markov Models combined with feedforward neural networks[18]. Following the success of deep feedforward neural networks in large-vocabulary speech recognition in 2010[19], almost all major commercial speech recognition systems (e.g. Skype Translator, Google Now, Apple Siri, Microsoft Cortana) are now based on deep learning methods[20].
The task of speech synthesis is to convert text to speech, in other words, to let machines talk like human beings. Kurzweil predicted in 2005 that, as the cost-performance ratio improved, speech synthesizers would become cheaper and more accessible[21]. Today, various voice cloud services are available to developers at no cost. More and more programs integrate speech services, changing the mode of interaction from text commands to voice input.
In this project, UQFact is powered by Iflytek’s world leading interactive voice engine[22], which provides simple and powerful natural voice control ability.
Chapter 3: Design and Implementation
In this chapter, we are going to take a look at how UQFact is built. I will break the complex system into pieces and go through each of them in detail.
3.1 Web Crawler
To build the UQ knowledge base, two web crawlers have been created: one fetches data from static web pages, and the other fetches information from dynamic web pages. All UQ courses, programs, locations, faculties, and some general knowledge data have been crawled. Python is the programming language used to create the crawlers, and PyCharm is the development environment.
3.1.1 Static Web Page Crawler
PyQuery is the core library used to retrieve useful data from web documents. It allows you to make jQuery-style queries on XML documents[23], which saves a lot of time otherwise spent writing regular expressions. You simply pass in a URL and then use the jQuery-like API to analyse the HTML content. Below is a code snippet that uses PyQuery to fetch all UQ schools' information:
Figure: PyQuery code snippet for fetching all UQ schools' information
The logic flow is fairly simple: first get the URL links of all UQ schools, then iterate over each link and extract information from each HTML page. (Note: this is not the complete code file; the data manipulation and database insertion parts are not shown.)
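Since the original snippet is shown as a figure, the following is only a hedged reconstruction of the same idea; the index URL and the link/field selectors are placeholders and would need to be adjusted to the real UQ pages.

from pyquery import PyQuery as pq

INDEX_URL = "https://www.uq.edu.au/departments/"   # hypothetical index page

def crawl_schools():
    index = pq(url=INDEX_URL)                      # download and parse the index page
    schools = []
    for link in index("a").items():                # jQuery-like traversal of anchors
        href = link.attr("href")
        name = link.text().strip()
        # keep only absolute links that look like school pages (illustrative filter)
        if not href or not href.startswith("http") or "school" not in href.lower():
            continue
        page = pq(url=href)                        # follow the link and parse the school page
        schools.append({
            "name": name,
            "heading": page("h1").text(),          # example field pulled from the page
        })
    return schools

if __name__ == "__main__":
    for school in crawl_schools():
        print(school)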
3.1.2 Dynamic Web Page Crawler
A lot of modern web pages (e.g. the UQ future students website shown below) use AJAX requests and heavy JavaScript, which means parts of a page can be updated without reloading the whole page. Furthermore, these content updates are normally triggered when the user clicks a button or tab, so we need a way to simulate the end user's behaviour in a browser.
Figure: UQ program web page
Selenium[24] allows for automated control of real browsers on real operating systems, which means there is almost no difference from real user interaction. The obvious downside of Selenium is that it requires a full graphical desktop and a real browser. As a result, the execution time will be longer.
On the opposite side of this spectrum is PhantomJS[25], a headless browser running a WebKit engine with full JavaScript access. It is easy to set up, runs on any machine, and is significantly faster.
PhantomJS has bindings for Selenium, so Selenium can control it in the same way it controls any other browser. We can therefore combine the benefits of the two: Selenium provides powerful APIs, and PhantomJS provides a fast headless browser. Below is a code snippet for fetching UQ program information:
Figure: Selenium and PhantomJS code snippet for fetching UQ program information
(Note: this is not the complete code file; it just shows how to use Selenium and PhantomJS to fetch information from a JavaScript-heavy website.)
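Again, because the original snippet is a figure, the code below is only a sketch of the approach using the Selenium API of that era (webdriver.PhantomJS has since been deprecated in newer Selenium releases); the URL and selectors are placeholders, not the real UQ page structure.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

PROGRAM_URL = "https://future-students.uq.edu.au/study/program/example"  # hypothetical

driver = webdriver.PhantomJS()          # headless WebKit browser driven by Selenium
try:
    driver.get(PROGRAM_URL)
    # click the tab that triggers the AJAX call, just like a real user would
    tab = WebDriverWait(driver, 10).until(
        EC.element_to_be_clickable((By.CSS_SELECTOR, "a[href='#fees']")))
    tab.click()
    # wait until the AJAX-loaded content appears, then read it
    fees = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, ".fees-content")))
    print(fees.text)
finally:
    driver.quit()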
To better understand the user's question, UQFact combines several question understanding techniques; these are described in Section 3.4.
3.2 Answering Service
3.2.1 Structured Facts
UQ facts are divided into different types (e.g. LOCATION, COURSE, PROGRAM, SCHOOL) and stored in dedicated data tables. This information is used to answer questions that have clear intentions. For example, "Where is the School of Education?" asks for the location of the School of Education, and "What is the course description of Engineering Design?" asks for a summary of the Engineering Design course. How each field is extracted from the user's query will be discussed in a later section.
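As an illustration of how such typed fact tables are queried, the snippet below uses the standard-library sqlite3 module as a stand-in for the project's MySQL database; the table layout and the inserted row are placeholders, not the actual UQFact schema or data.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE location (name TEXT, building TEXT, campus TEXT)")
# placeholder row standing in for a crawled LOCATION fact
conn.execute("INSERT INTO location VALUES (?, ?, ?)",
             ("School of Education", "Example Building", "St Lucia"))

def answer_location(entity_name):
    # fetch the LOCATION fact that matches the recognised entity
    row = conn.execute("SELECT building, campus FROM location WHERE name = ?",
                       (entity_name,)).fetchone()
    if row is None:
        return "Sorry, we could not answer this question."
    return "%s is located in %s, %s campus." % (entity_name, row[0], row[1])

print(answer_location("School of Education"))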
3.2.2 General Questions and Answers
To answer more general, wider-range queries, we use the UQ Answers website, which has 428 frequently asked questions together with their answers. The figure below is a screenshot of UQ Answers.
Figure: Screenshot of the UQ Answers website
Each question and answer has its own static page, so a traditional web crawler can easily fetch the information. We stored these questions and answers in the "general_question" table, which has four columns: id, question, answer, and keywords. A screenshot of this table is shown below:
Figure: The general_question table
The keywords column stores the keywords extracted from each question. For the question "In what month do semesters at UQ start?", the extracted keywords are "month, semester, start". How these keywords are extracted is explained in a later section, Keywords Extraction and Sentence Similarity.
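As a placeholder until that section, the sketch below shows one simple way such keywords could be extracted and compared, using NLTK's English stopword list and a Jaccard overlap; the actual method used by UQFact may differ.

import re
import nltk
from nltk.corpus import stopwords

nltk.download("stopwords", quiet=True)
STOPWORDS = set(stopwords.words("english"))

def extract_keywords(sentence):
    # lowercase words with stopwords removed
    words = re.findall(r"[a-z]+", sentence.lower())
    return {w for w in words if w not in STOPWORDS}

def similarity(query, stored_keywords):
    # Jaccard overlap between the query's keywords and the stored keyword set
    q = extract_keywords(query)
    return len(q & stored_keywords) / max(len(q | stored_keywords), 1)

stored = extract_keywords("In what month do semesters at UQ start?")
print(stored)                                        # e.g. {'month', 'semesters', 'uq', 'start'}
print(similarity("When does semester start at UQ?", stored))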
3.2.3 User Information
Since domestic and international students differ considerably when enrolling in a program, a "user" table is created to record whether a student is an international student. This table has three columns: id, device_id, and nationality. A screenshot of this table is shown below:
Figure: The user table
3.3 Android Development
UQFact is a native Android application developed in the Java programming language. It provides a clean user interface and allows users to interact with the robot easily. The UQFact Android application is developed in Android Studio, as shown in the figure below.
Figure: The UQFact project in Android Studio
3.3.1 User Interface
Native Android user interfaces are written in XML. Below is an example of the main conversation layout in UQFact:
Figure: XML layout of the main conversation page
The Android SDK provides many default application themes, UI widgets, layouts, etc. In this project, all user interfaces are designed using Android SDK libraries. The final app contains three main interfaces, shown in the figure below: a user guide page, a send feedback page, and the main conversation page.
Figure: The three main interfaces of UQFact
The user guide page and the send feedback page are fairly simple. The first contains only a text view inside a scroll view (so that the text does not overflow on small-screen phones). The second contains two edit text fields to collect the user's email address and feedback, and a "SEND FEEDBACK" button that invokes the send email function.
The main conversation page is a little more complex. It consists of a list view that displays the conversation between the user and the robot, a "CLICK TO SPEAK" button that records the user's speech while it is held down, an edit text that lets the user correct the speech recognition result if it does not match the actual voice input, and a "GO" button that sends the query text to the web server.
3.3.2 Iflytek Voice Service
The Iflytek cloud voice service[22] is used to develop the voice interaction functions. It converts speech to text and text to speech. Below are the key steps for integrating the voice service (speech recognition plus speech synthesis) into the UQFact application.
- Create a developer account on http://open.voicecloud.cn
- Create an application in this account to get the app_id
- Download Iflytek's Android SDK from the developer console, and import Msc.jar and libmsc.so into the Android project's libs folder (note: Iflytek's SDK is updated frequently, so the latest version may differ slightly)
- Since this is a cloud service, the application needs the internet permission, and to record the user's voice input it also needs the audio recording permission. Application permissions are declared in the "AndroidManifest.xml" file; the figure below shows UQFact's manifest file and all the required permissions.
Figure: UQFact's AndroidManifest.xml with the required permissions
- The voice service is integrated into the main conversation activity, so we need to initialize the speech utility (app_id is the value obtained in step 2) when the activity is created, using the following code:
// SpeechUtility initialization
SpeechUtility.createUtility(MainActivity.this, "appid=56b0d819");
- Voice synthesis is easier to integrate. A private function is created to encapsulate the speech synthesis logic, so later we can simply call this function and pass in the text input. Creating the SpeechSynthesizer object and then setting a few key parameters, including the speaker agent, the speed and sound of the speech, and the speech engine type, is enough to make it work. A full list of SpeechConstant parameters can be found in the downloaded SDK documentation.
Figure: Speech synthesis code snippet
- Speech recognition is a bit more complicated, since we need to create a RecognizerListener and also decide when to invoke it. To write a RecognizerListener we need to override six functions: onBeginOfSpeech tells you the SDK's internal recorder is ready to accept the user's voice; onError tells you something went wrong (most likely a missing permission); onVolumeChanged tells you the user is speaking and at what volume; onEvent allows you to handle special speech events; onResult allows you to extract the recognised text from a JSON object; and finally onEndOfSpeech indicates the end of speech recognition. In most situations, developers only need to care about onResult(RecognizerResult results, boolean isLast), which is the core function for retrieving the recognised text. By default, RecognizerResult is in JSON format and contains the following fields:
JSON field | Full name | Type | Meaning
sn | sentence | number | which sentence
ls | last sentence | boolean | whether this is the last sentence
bg | begin | number | start position
ed | end | number | end position
ws | words | array | word list
cw | Chinese word | array | Chinese word
w | word | string | a single word
sc | score | number | score
Here is an example of RecognizerResult:
{"sn":1, "ls":false, "bg":0, "ed":0, "ws":[
    {"bg":0, "cw":[{"sc":0, "w":"Hello"}]},
    {"bg":0, "cw":[{"sc":0, "w":" good"}]},
    {"bg":0, "cw":[{"sc":0, "w":" morning"}]}]}
The following code shows how to write a RecognizerListener:
private RecognizerListener mRecognizerListener = new RecognizerListener() {
    String voiceResult = "";

    @Override
    public void onBeginOfSpeech() {
        // SDK internal recorder is ready
        System.out.println("starts speaking");
    }

    @Override
    public void onError(SpeechError error) {
        // error message
        System.out.println(error.getPlainDescription(true));
    }

    @Override
    public void onEndOfSpeech() {
        // end of speech recognition
        System.out.println("ends");
    }

    @Override
    public void onResult(RecognizerResult results, boolean isLast) {
        String ans = "";
        try {
            JSONObject result = new JSONObject(results.getResultString());
            JSONArray jsonArray = result.getJSONArray("ws");
            for (int i = 0; i < jsonArray.length(); i++) {
                JSONObject object = jsonArray.getJSONObject(i);
                JSONArray array = object.getJSONArray("cw");
                for (int j = 0; j < array.length(); j++) {
                    JSONObject obj = array.getJSONObject(j);
                    String tmp = obj.getString("w");
                    ans = ans + tmp;
                }
            }
        } catch (JSONException e) {
            e.printStackTrace();
        }
        voiceResult += ans;
        if (isLast) {
            // last sentence
            System.out.println("request string: " + voiceResult);
            requestTextView.setText(voiceResult);
            voiceResult = "";
        }
    }

    @Override
    public void onVolumeChanged(int volume, byte[] data) {
        // current volume
        System.out.println("Current Volume: " + volume);
    }

    @Override
    public void onEvent(int eventType, int arg1, int arg2, Bundle obj) {
        // extension functionality, left blank
    }
};
The last step in making speech recognition work is to start the listener when the user presses the "CLICK TO SPEAK" button. UQFact keeps recording the user's voice as long as the button is pressed and held, and stops listening when the button is released. The code below shows this logic.
Figure: Press-and-hold logic for the "CLICK TO SPEAK" button
3.3.3 Sending Email
Feedback is always important for learning which functionality is useful to users and which is not. Therefore, a send-feedback-by-email feature was developed to help collect users' opinions. The user simply types in an email address and the feedback, then clicks the "SEND FEEDBACK" button to send the opinion to the UQFact Gmail account.
The user receives a confirmation email, as shown in the first figure below, and UQFact receives an email containing the user's feedback, as shown in the second figure.
Figure: Confirmation email received by the user
Figure: Feedback email received by the UQFact account
The easiest way to send email in Java is to use the Javax Mail API. To use this API, you can either download the required jar files from https://code.google.com/archive/p/javamail-android/downloads and import them into the libs folder, or simply add the following line to your "build.gradle" file.
compile 'javax.mail:javax.mail-api:1.5.3'
Below is the code used to send emails in a background asynchronous task.
Figure: Code snippet for sending emails in a background task
3.3.4 Integrate Api.ai
3.4 Question Understanding – Intent Recognition
Chapter 4: Results and Discussion
The ultimate goal of this research is to make UQFact live and easy to use for all UQ students and staff. Furthermore, if UQFact is successfully used in a real-life setting, the development approach can potentially be applied by any organization to build its own chatbot and to receive visitors in the first instance when a human receptionist is unavailable.
4.2 Improvements
Since this is the first stable version of the UQFact system, there are quite a few improvements that can be made in the future. I will start with some easy implementation improvements and then discuss more advanced ones.
4.2.1 User Experience
Bibliography
[1] “Question answering,” Wikipedia, the free encyclopedia. 22-Jul-2016.
[2] D. L. Waltz, “An English Language Question Answering System for a Large Relational Database,” Commun ACM, vol. 21, no. 7, pp. 526–539, Jul. 1978.
[3] L. Hirschman and R. Gaizauskas, “Natural language question answering: the view from here,” Nat. Lang. Eng., vol. 7, no. 4, pp. 275–300, 2001.
[4] S. Lohr, “The age of big data,” N. Y. Times, vol. 11, 2012.
[5] A. M. Turing, “Computing machinery and intelligence,” Mind, vol. 59, no. 236, pp. 433–460, 1950.
[6] A. P. Saygin, I. Cicekli, and V. Akman, “Turing Test: 50 Years Later,” in The Turing Test, J. H. Moor, Ed. Springer Netherlands, 2003, pp. 23–78.
[7] C. Breazeal and B. Scassellati, “A context-dependent attention system for a social robot,” rn, vol. 255, p. 3, 1999.
[8] S. Kojouharov, “Ultimate Guide to Leveraging NLP & Machine Learning for your Chatbot,” Chatbot’s Life, 18-Sep-2016. [Online]. Available: https://chatbotslife.com/ultimate-guide-to-leveraging-nlp-machine-learning-for-you-chatbot-531ff2dd870c. [Accessed: 04-Jun-2017].
[9] J. Jia, “The study of the application of a keywords-based chatbot system on the teaching of foreign languages,” ArXiv Prepr. Cs0310018, 2003.
[10] A. Reshamwala, D. D. Mishra, and P. Pawar, “REVIEW ON NATURAL LANGUAGE PROCESSING,” ResearchGate, vol. 3, no. 1, pp. 113–116, Feb. 2013.
[11] “Natural language processing,” Wikipedia, the free encyclopedia. 20-Aug-2016.
[12] R. S. Wallace, “AIML overview,” ALICE AI Found., 2003.
[13] M. McTear, Z. Callejas, and D. Griol, “Implementing Spoken Language Understanding,” in The Conversational Interface, Springer, 2016, pp. 187–208.
[14] “Define the Interaction Model in JSON and Text – Amazon Apps & Services Developer Portal.” [Online]. Available: https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/docs/defining-the-voice-interface. [Accessed: 04-Jun-2017].
[15] “Stanford CoreNLP – Core natural language software | Stanford CoreNLP.” [Online]. Available: https://stanfordnlp.github.io/CoreNLP/. [Accessed: 04-Jun-2017].
[16] S. Bird, “NLTK: the natural language toolkit,” in Proceedings of the COLING/ACL on Interactive presentation sessions, 2006, pp. 69–72.
[17] S. Sendall and W. Kozaczynski, “Model transformation: the heart and soul of model-driven software development,” IEEE Softw., vol. 20, no. 5, pp. 42–45, Sep. 2003.
[18] H. A. Bourlard and N. Morgan, Connectionist speech recognition: a hybrid approach, vol. 247. Springer Science & Business Media, 2012.
[19] D. Yu, L. Deng, and G. Dahl, “Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition,” in Proc. NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2010.
[20] L. Deng and D. Yu, “Deep Learning,” Signal Process., vol. 7, pp. 3–4, 2014.
[21] R. Kurzweil, The singularity is near: When humans transcend biology. Penguin, 2005.
[22] “Speech Engine_IFLYTEK CO.,LTD.” [Online]. Available: http://www.iflytek.com/en/audioengine/index.html. [Accessed: 04-Jun-2017].
[23] G. Pasgrimaud, pyquery: A jquery-like library for Python.
[24] “Selenium – Web Browser Automation.” [Online]. Available: http://www.seleniumhq.org/. [Accessed: 04-Jun-2017].
[25] “PhantomJS | PhantomJS.” [Online]. Available: http://phantomjs.org/. [Accessed: 04-Jun-2017].