A chatbot is an AI-backed software tool that can simulate a real-time conversation with a human user. Over the last decade, chatbot technology has gained huge momentum, with technology giants such as Google, Apple, and Microsoft working constantly to improve their own chatbots and to help the developer community build and refine chatbots of its own. A few notable examples are Apple’s Siri, released in 2011, Google Assistant, released in 2016, and Amazon’s Alexa.
Developer communities, along with engineers backed by the technology giants, have also created a large collection of tools and environments for building or imitating such products at an individual level. Today, we will use one such tool, AIML (Artificial Intelligence Markup Language), to develop a chatbot. The chatbot can run at the terminal prompt as well as in a GUI or a web environment. For the sake of simplicity for beginners, we begin with the terminal prompt.
Prerequisite knowledge:
AIML: Artificial Intelligence Markup Language is an XML-based markup language designed to make Natural Language Processing tasks simpler to handle. Because its syntax follows XML, it is easy to learn and modify. AIML files are the main source of data for the chatbot being built: every response it makes to user input is stored systematically in them. An AIML file always has the extension ‘.aiml’.
std-startup.xml: In developing this application, we use a central XML file named std-startup.xml that lists all the AIML files in the directory. Its sole purpose is to tell the main chatbot program which AIML files to use when replying to the user, once the user input satisfies certain conditions.
bot_brain.brn: Like any other AI-based product, a chatbot needs a large amount of data processed by a Machine Learning algorithm, followed by training, validation, and testing on that data until the accuracy signals readiness for deployment. For this project, we use Intent Classification, a Machine Learning technique that takes the existing data (the AIML files in our case), cleans, lemmatizes, and encodes it, and then trains, validates, and tests on it so that the model learns to match likely user questions. The mathematical patterns learned in this way are stored in the file named bot_brain.brn. It is this “brain” file that powers our bot and helps it select the type of answer to present for a given user query. The Intent Classification code itself is not shown here.
Flask: Flask is a web framework written in Python that makes it quicker and easier to serve Python programs through a web browser.
Step 1: Creating an AIML file
AIML follows the syntax of ordinary XML files, so creating an AIML file is quite simple.
A basic functional AIML file needs little more than two tags. The user’s input is always compared with the content inside <pattern>…</pattern>, and the response is produced from the matching <template>…</template> block.
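The pattern and template pairing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the AIML content, the names SAMPLE_AIML, load_categories and reply, and the fallback sentence are all made up for this example, and the lookup is a simplified exact match rather than the full matching a real AIML interpreter performs.

```python
import xml.etree.ElementTree as ET

# Illustrative AIML content (an assumption, not the original file of this
# tutorial). On disk it would be saved with the '.aiml' extension.
SAMPLE_AIML = """<aiml version="1.0.1">
    <category>
        <pattern>HELLO</pattern>
        <template>Hi there!</template>
    </category>
    <category>
        <pattern>WHAT IS YOUR NAME</pattern>
        <template>I am a demo bot.</template>
    </category>
</aiml>
"""


def load_categories(aiml_text):
    """Return {PATTERN: template} for every <category> in the AIML text."""
    root = ET.fromstring(aiml_text)
    categories = {}
    for category in root.iter("category"):
        pattern = category.findtext("pattern", default="").strip().upper()
        template = category.findtext("template", default="").strip()
        categories[pattern] = template
    return categories


def reply(categories, user_input, fallback="I do not understand."):
    """Look up the template whose pattern equals the normalised input."""
    # Simplified normalisation: uppercase, drop punctuation, squeeze spaces.
    cleaned = "".join(
        ch for ch in user_input.upper() if ch.isalnum() or ch.isspace()
    )
    key = " ".join(cleaned.split())
    return categories.get(key, fallback)


if __name__ == "__main__":
    demo = load_categories(SAMPLE_AIML)
    print(reply(demo, "Hello!"))
```

Each <category> contributes one entry to the dictionary, which is why adding a new question and answer to the bot only means adding one more category to the file.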
Writing your own AIML files can be fun, but it takes a lot of hard work and time. To save that effort, one can always refer to an online directory of ready-made AIML files.
Step 2: Defining the std-startup.xml file
The std-startup.xml file is an important component of this chatbot program. It contains links to all the AIML files needed to power the chatbot, and it is made up of several sub-components:
- <category>…</category>: This creates an atomic unit of AIML content, inside which we write the pattern and its response.
- <pattern>…</pattern>: This section holds the text that the chatbot compares with the user’s input from the chatbot interface in order to pick the right response.
- <template>…</template>: This section holds the responses that the chatbot outputs when the input matches the content of <pattern>…</pattern>.
- <learn>…</learn>: This section names an AIML file that the chatbot must load.
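To make the four elements concrete, here is a hedged sketch of what a std-startup.xml might contain, embedded in a short Python script that reads it back. The command text LOAD AIML B, the file names basic_chat.aiml and greetings.aiml, and the helper startup_summary are assumptions chosen for illustration; the real file should list whatever AIML files sit in your project directory.

```python
import xml.etree.ElementTree as ET

# Hypothetical std-startup.xml content. The pattern is the command the main
# program sends; each <learn> names one AIML file to load in response.
STARTUP_XML = """<aiml version="1.0.1" encoding="UTF-8">
    <category>
        <pattern>LOAD AIML B</pattern>
        <template>
            <learn>basic_chat.aiml</learn>
            <learn>greetings.aiml</learn>
        </template>
    </category>
</aiml>
"""


def startup_summary(xml_text):
    """Map each load command to the AIML files it asks the bot to learn."""
    root = ET.fromstring(xml_text)
    summary = {}
    for category in root.iter("category"):
        command = category.findtext("pattern", default="").strip()
        files = [
            learn.text.strip()
            for learn in category.iter("learn")
            if learn.text
        ]
        summary[command] = files
    return summary


if __name__ == "__main__":
    for command, files in startup_summary(STARTUP_XML).items():
        print(command, "->", files)
```

Keeping the file list in one place means the main program only has to know a single command, and new AIML files can be added without touching the Python code.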
Step 3: Creating a Web Application for the Chatbot
Depending on the requirements and the type of chatbot we want to build, we can integrate the entire model into a webpage or another software application. For this project, we stick to creating a chatbot web application. First of all, we need a webpage containing a message box that lets the user connect with the chatbot and send and receive messages, just as in an instant messenger application.
Step 4: Coding the Bot
Here comes the fun part. Now we code the main program that powers the entire chatbot.
The first half of the code lets the program determine whether the weights calculated after training the Sequence Model are already present or need to be updated. Once this is done, the chatbot kernel is ready for use, and the second half of the code provides the program’s other functions, such as quit and save. The weight data inside bot_brain.brn keeps changing with the number of times this program is executed. In simple words, this file, named main-raw.py here, is a training file that trains the model every time it is executed.
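The two halves described above might look roughly like the sketch below. It assumes the python-aiml package, whose Kernel class offers bootstrap, saveBrain and respond, and in which the brain file is a saved snapshot of the loaded kernel. The helper names make_kernel, load_brain, handle and chat_loop, the command LOAD AIML B, and the reply strings are hypothetical, so treat this as one possible layout for main-raw.py rather than the original code.

```python
import os

BRAIN_FILE = "bot_brain.brn"
STARTUP_FILE = "std-startup.xml"
# Assumption: this must equal the <pattern> used in std-startup.xml.
LOAD_COMMAND = "LOAD AIML B"


def make_kernel():
    """Create a kernel; requires the python-aiml package to be installed."""
    import aiml  # imported lazily so the helpers below work without it

    return aiml.Kernel()


def load_brain(kernel, brain_file=BRAIN_FILE, startup_file=STARTUP_FILE):
    """First half: reuse the saved brain if present, else build and save it."""
    if os.path.isfile(brain_file):
        kernel.bootstrap(brainFile=brain_file)
        return "loaded"
    kernel.bootstrap(learnFiles=startup_file, commands=LOAD_COMMAND)
    kernel.saveBrain(brain_file)
    return "trained"


def handle(kernel, message, brain_file=BRAIN_FILE):
    """Second half: return (reply, keep_running) for one line of input."""
    if message == "quit":
        return "Goodbye.", False
    if message == "save":
        kernel.saveBrain(brain_file)
        return "Brain saved.", True
    return kernel.respond(message), True


def chat_loop(kernel):
    """Read from the terminal prompt until the user types 'quit'."""
    running = True
    while running:
        answer, running = handle(kernel, input("You > "))
        print("Bot > " + answer)


# Typical use at the terminal prompt:
#     kernel = make_kernel()
#     load_brain(kernel)
#     chat_loop(kernel)
```

Separating handle from chat_loop keeps the quit and save logic independent of the terminal, so the same function can later be reused behind a web page.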
Step 5: Flask Integration and Running the Model on the Web
This program is the backend code for the chatbot. Its main task is to integrate the Flask module with the chatbot so that the entire Sequence Model can run on the virtual server. It covers the testing phase of the model and also uses the weight data in bot_brain.brn.
To run the chatbot in the web application we created, we need to run this program, which makes the local server host the page containing our chatbot. It is highly recommended to use the default URL, http://127.0.0.1:5000.
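A minimal sketch of such a backend is shown below, assuming Flask is installed and that the kernel object exposes a respond method as in python-aiml. The route /get, the query parameter msg, the tiny HTML page, and the helper names reply_for and create_app are all assumptions made for this example; the original page and routes may differ.

```python
# Hypothetical single-page front end: a message box that sends the text to
# the /get route as the query parameter 'msg'.
PAGE = """<!doctype html>
<title>Chatbot</title>
<form action="/get" method="get">
  <input name="msg" placeholder="Type a message" autofocus>
  <button type="submit">Send</button>
</form>
"""


def reply_for(kernel, message):
    """Return the bot's answer, or a prompt when the message is empty."""
    message = (message or "").strip()
    if not message:
        return "Please type a message."
    return kernel.respond(message)


def create_app(kernel):
    """Build the Flask app around an already loaded chatbot kernel."""
    from flask import Flask, request  # lazy import; needs Flask installed

    app = Flask(__name__)

    @app.route("/")
    def home():
        # Serve the page that contains the message box.
        return PAGE

    @app.route("/get")
    def get_reply():
        # Pass the user's text to the kernel and return its answer.
        return reply_for(kernel, request.args.get("msg"))

    return app


# Typical use, with 'kernel' loaded from bot_brain.brn beforehand:
#     app = create_app(kernel)
#     app.run(host="127.0.0.1", port=5000)
```

With the server running, opening http://127.0.0.1:5000 in a browser shows the message box, and each submitted message is answered by the kernel.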
Once the page is up and running, the chatbot can connect and chat with the user just like a personal assistant. The only difference lies in how much data the bot has been trained on: the more data, the closer its output comes to the expected answer, although a perfect 100% accuracy is never achievable.
The chatbot created here is purely for educational purposes, and it is bound to produce some inaccurate answers because the model underfits the data. Moreover, this short documentation is meant for absolute beginners who want motivation to work toward becoming experienced data scientists.