Very Intelligent Booking Chat Interface. It can answer your queries related to rooms, bookings, requests and hotels.
With fewer than 1k training examples, neural nets or sequential models seemed like overkill, so we started with a basic featurisation technique: TfidfVectorizer. We also decided to compare Tf-Idf against the average of the GloVe (Global Vectors for Word Representation) vectors of the words in each question. We used LabelEncoder to convert the intents to numeric labels.
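As a rough illustration of the two featurisation options, here is a minimal sketch. The questions, the intent names, the helper functions `load_glove` and `average_vector`, and the tiny 3-dimensional vectors are all made up for the example; the real project would read glove.6B.100d.txt and use its own training data.

```python
import io

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data; the real questions and intent names may differ.
questions = [
    "is breakfast included with the room",
    "i want to book a room for two nights",
    "can i get an extra pillow",
    "book a double room for friday",
]
intents = ["rooms", "booking", "requests", "booking"]

# Option 1: sparse TF-IDF features, one column per vocabulary term.
tfidf = TfidfVectorizer()
X_tfidf = tfidf.fit_transform(questions)

# Intents become integer labels; encoder.classes_ maps them back to names.
encoder = LabelEncoder()
y = encoder.fit_transform(intents)


def load_glove(handle):
    """Parse the GloVe text format: a word followed by its vector components."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors


def average_vector(question, vectors, dim):
    """Option 2: mean of the GloVe vectors of the known words in a question."""
    known = [vectors[w] for w in question.lower().split() if w in vectors]
    if not known:
        # No known word: fall back to a zero vector of the right size.
        return np.zeros(dim, dtype=np.float32)
    return np.mean(known, axis=0)


# Tiny 3-dimensional stand-in for glove.6B.100d.txt (assumption for the demo).
fake_glove = io.StringIO("book 1.0 0.0 0.0\nroom 0.0 1.0 0.0\n")
glove = load_glove(fake_glove)
X_glove = np.vstack([average_vector(q, glove, dim=3) for q in questions])

print(X_tfidf.shape, X_glove.shape, list(encoder.classes_))
```

Either feature matrix can then be passed to a classifier together with `y`. Averaging discards word order, which is usually acceptable for short intent-style questions.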
An upside of having little training data was that we were free to explore all types of classifiers, such as Logistic Regression, k-Nearest Neighbours, Naive Bayes, SVM, SGD Classifier and XGBoost. We also carried out extensive experiments to fine-tune the hyperparameters and find the best configuration for each classifier.
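The kind of experiment described above could look roughly like the following sketch, which runs a small grid search per classifier on top of a TF-IDF pipeline. The toy questions, the parameter grids, the `f1_macro` scoring and the subset of classifiers are assumptions chosen for illustration, not the project's actual settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical toy data: six questions per intent.
samples = {
    "booking": [
        "i want to book a room", "book a suite for two nights",
        "reserve a room for friday", "make a booking for tonight",
        "can i book a double room", "reserve a suite for next week",
    ],
    "rooms": [
        "does the room have a balcony", "how big is the double room",
        "is there wifi in the room", "do rooms have air conditioning",
        "what is the view from the suite", "is the room sea facing",
    ],
    "requests": [
        "please send an extra pillow", "can i get more towels",
        "i need a wake up call", "send an extra blanket please",
        "please bring more towels", "i need an iron sent up",
    ],
}
questions = [q for qs in samples.values() for q in qs]
labels = [intent for intent, qs in samples.items() for _ in qs]

# Each candidate pairs an estimator with a small, illustrative grid.
candidates = {
    "logreg": (LogisticRegression(max_iter=1000), {"clf__C": [0.1, 1, 10]}),
    "svm": (SVC(), {"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]}),
    "knn": (KNeighborsClassifier(), {"clf__n_neighbors": [1, 3]}),
    "nb": (MultinomialNB(), {"clf__alpha": [0.1, 1.0]}),
}

results = {}
for name, (clf, grid) in candidates.items():
    pipe = Pipeline([("tfidf", TfidfVectorizer()), ("clf", clf)])
    # 3-fold stratified cross-validation over every grid combination.
    search = GridSearchCV(pipe, grid, cv=3, scoring="f1_macro")
    search.fit(questions, labels)
    results[name] = (search.best_score_, search.best_params_)

for name, (score, params) in results.items():
    print(f"{name}: best mean CV f1_macro={score:.3f} with {params}")
```

Keeping the vectorizer inside the pipeline means it is refit on each training fold, so vocabulary from the validation fold does not leak into the features.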
Clearly, SVM and Logistic Regression are the top two classifiers on this training data. Although SVM has the best F1 score and test score, its mean cross-validation score is very low. This suggests that its performance varies widely across different splits of this data. Hence we chose Logistic Regression.
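To make that comparison concrete, the sketch below reports the mean cross-validation score next to the held-out test score and F1 for two classifiers. It uses synthetic numeric data from `make_classification` as a stand-in for the featurised questions, and the weighted F1 average is an assumption, so the numbers it prints say nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC

# Synthetic 3-class data standing in for featurised questions (assumption).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {"logreg": LogisticRegression(max_iter=1000), "svm": SVC()}

report = {}
for name, clf in models.items():
    # Cross-validation on the training split only; the estimator is cloned.
    cv_scores = cross_val_score(clf, X_train, y_train, cv=5)
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    report[name] = {
        "cv_mean": cv_scores.mean(),
        "cv_std": cv_scores.std(),
        "test_accuracy": clf.score(X_test, y_test),
        "test_f1": f1_score(y_test, y_pred, average="weighted"),
    }

for name, scores in report.items():
    print(name, {k: round(float(v), 3) for k, v in scores.items()})
```

A model whose test score sits well above its mean cross-validation score, or whose fold scores have a large spread, may simply have been lucky on one split, which is the kind of behaviour that argues for the steadier classifier.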
- Download the GloVe vectors from this link. Unzip the archive and keep the file glove.6B.100d.txt in the models folder.
- Run
conda env create -f environment.yml
- Run
conda activate botEnv
- Run the bot Flask application with
python app.py
- If you face any issues with the existing models, you can train afresh by deleting the .joblib files in the models folder and running
python botmodel.py
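The retraining step above fits a common load-or-train pattern: reuse a saved .joblib file when it exists, otherwise train and save a new one. The sketch below shows that pattern under assumptions. The function `load_or_train`, the file name `intent_clf.joblib` and the toy training function are hypothetical and are not taken from botmodel.py.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.linear_model import LogisticRegression


def load_or_train(model_path, train_fn):
    """Return the saved model if present, else train, save and return one."""
    path = Path(model_path)
    if path.exists():
        return joblib.load(path)
    model = train_fn()
    path.parent.mkdir(parents=True, exist_ok=True)
    joblib.dump(model, path)
    return model


calls = []  # records how many times training actually ran


def train_fn():
    # Hypothetical stand-in for the real training code.
    calls.append(1)
    return LogisticRegression().fit([[0.0], [1.0]], [0, 1])


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "models" / "intent_clf.joblib"
    first = load_or_train(path, train_fn)   # no file yet: trains and saves
    second = load_or_train(path, train_fn)  # file exists: loads from disk
    path.unlink()                           # deleting the file forces retraining
    third = load_or_train(path, train_fn)

print("training runs:", len(calls))
```

Files written by joblib should only be loaded from trusted sources, since loading them can execute arbitrary code.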