Student starter code (30% baseline)
- `index.html`: Main HTML page
- `script.js`: JavaScript logic
- `styles.css`: Styling and layout
- `package.json`: Dependencies
- `setup.sh`: Setup script
- `README.md`: Instructions (below)

💡 Download the ZIP, extract it, and follow the instructions below to get started!
By completing this project, you will:
- Difficulty Level: Intermediate
- Estimated Time: 5-7 hours (Part 1: 3-4h, Part 2: 2-3h)
- Prerequisites: Completed Activities 07 (Classification) and 10 (Text Preprocessing)
Every sentence we speak carries intent! A chatbot uses NLP to:
| User Message | Intent | Bot Response |
|---|---|---|
| "Hi there!" | Greeting | "Hello! How can I help you?" |
| "What time is it?" | TimeQuery | "It's 3:00 PM" |
| "Thank you!" | Thanks | "You're welcome!" |
- `v1-baseline-100percent.ipynb` in Google Colab or Jupyter
- `project-04-chatbot.ipynb`

Part 1 - Jupyter Notebook (65% complete):
Part 2 - Streamlit App (files in streamlit/ folder):
- `chatbot.py` with all model functions
- `intents.json` dataset
- `app.py` template with UI structure

| TODO | Task | Difficulty | Estimated Time |
|---|---|---|---|
| 1 | Complete text_preprocessing function | Medium | 15 min |
| 2 | Process dataset and fill train_data/train_label | Medium | 15 min |
| 3 | Create vocabulary and Bag of Words | Easy | 10 min |
| 4 | Train all 3 classifiers | Easy | 10 min |
| 5 | Preprocess and predict test sentence | Medium | 10 min |
| 6 | Complete bot_respond function | Hard | 20 min |

| TODO | Task | Difficulty | Estimated Time |
|---|---|---|---|
| 7 | Create sidebar with page navigation | Medium | 15 min |
| 8 | Implement Chatbot page with text input | Medium | 15 min |
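
TODO 2 asks you to fill `train_data` and `train_label` from the dataset. The sketch below shows one way this could look. It assumes a hypothetical `intents.json` layout in which each intent has a `tag` and a list of example `patterns`; the real file may be organized differently. `build_training_lists` is an illustrative name, not part of the starter code, and `str.lower` stands in for the real preprocessing function.

```python
import json

# Hypothetical dataset layout; check the real intents.json before relying on it.
SAMPLE_JSON = """
{
  "intents": [
    {"tag": "Greeting", "patterns": ["Hi there!", "Hello"]},
    {"tag": "Thanks", "patterns": ["Thank you!"]}
  ]
}
"""


def build_training_lists(dataset, preprocess=str.lower):
    """Return parallel lists: one preprocessed sentence and one label per pattern."""
    train_data, train_label = [], []
    for intent in dataset["intents"]:
        for pattern in intent["patterns"]:
            train_data.append(preprocess(pattern))
            train_label.append(intent["tag"])
    return train_data, train_label


dataset = json.loads(SAMPLE_JSON)
train_data, train_label = build_training_lists(dataset)
print(train_data)   # ['hi there!', 'hello', 'thank you!']
print(train_label)  # ['Greeting', 'Greeting', 'Thanks']
```

Keeping the two lists parallel matters: position `i` in `train_label` must be the intent of sentence `i` in `train_data`, because the classifiers are later fitted on both together.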
You've successfully completed this project when:
def text_preprocessing(sentence):
    tokens = nltk.word_tokenize(sentence)  # Tokenize
    stem_tokens = []
    for token in tokens:
        stem_tokens.append(stemmer.stem(token.lower()))  # Stem
    # Remove punctuation, return joined string
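
The last step of `text_preprocessing` (remove punctuation, return a joined string) could be sketched as follows. This is one possible approach, not necessarily the one used in the reference solution: it drops tokens that consist only of punctuation characters. The helper name `join_without_punctuation` is hypothetical, and the token list is written out by hand so the sketch runs without NLTK.

```python
import string


def join_without_punctuation(stem_tokens):
    """Drop tokens made up only of punctuation, then join the rest with spaces."""
    kept = [
        token for token in stem_tokens
        if not all(ch in string.punctuation for ch in token)
    ]
    return " ".join(kept)


# Tokens as they might look after tokenizing, lowercasing, and stemming.
stem_tokens = ["we", "all", "agre", ",", "it", "was", "a", "magnific", "even", "."]
print(join_without_punctuation(stem_tokens))  # we all agre it was a magnific even
```

Note that this keeps tokens that mix letters and punctuation (for example `n't`); if every punctuation character has to go, strip characters inside each token instead of filtering whole tokens.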
from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer()
vectorizer.fit(train_data) # Build vocabulary
train_data_bow = vectorizer.transform(train_data) # Convert to vectors
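
To see what the fit and transform steps produce, here is a simplified pure-Python stand-in for the Bag of Words idea. It is not how `CountVectorizer` is implemented, and the function names are hypothetical; the real class also returns a sparse matrix and, with default settings, lowercases text and ignores single-character tokens.

```python
def build_vocabulary(sentences):
    """Map each distinct word to a column index, in alphabetical order."""
    words = sorted({word for sentence in sentences for word in sentence.split()})
    return {word: index for index, word in enumerate(words)}


def to_bag_of_words(sentence, vocabulary):
    """Count how often each vocabulary word occurs; unknown words are ignored."""
    vector = [0] * len(vocabulary)
    for word in sentence.split():
        if word in vocabulary:
            vector[vocabulary[word]] += 1
    return vector


train_data = ["hello there", "thank you", "hello hello you"]
vocabulary = build_vocabulary(train_data)
train_data_bow = [to_bag_of_words(s, vocabulary) for s in train_data]
print(vocabulary)      # {'hello': 0, 'thank': 1, 'there': 2, 'you': 3}
print(train_data_bow)  # [[1, 0, 1, 0], [0, 1, 0, 1], [2, 0, 0, 1]]
```

The vocabulary is built from the training sentences only, so a word that first appears in a user query has no column and contributes nothing to the query vector.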
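# Train multiple classifiers
# Assumption: default constructors; the notebook may use other settings or another Naive Bayes variant.
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import MultinomialNB
clf_knn = KNeighborsClassifier()
clf_dt = DecisionTreeClassifier()
clf_nb = MultinomialNB()
clf_knn.fit(train_data_bow, train_label)
clf_dt.fit(train_data_bow, train_label)
clf_nb.fit(train_data_bow, train_label)
# Predict intent
predicted = clf_nb.predict(user_query_bow)  # Returns an array such as ['Greeting']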
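
For intuition about how a classifier turns a Bag of Words vector into an intent, the sketch below implements the nearest-neighbour idea with k = 1 in plain Python. It is an illustration only, not scikit-learn's KNN implementation; the function names and the toy vectors are assumptions made for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length count vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_nearest(train_vectors, train_labels, query_vector):
    """Return the label of the training vector closest to the query (k = 1)."""
    best_index = min(
        range(len(train_vectors)),
        key=lambda i: squared_distance(train_vectors[i], query_vector),
    )
    return train_labels[best_index]


# Columns: hello, thank, there, you (toy vocabulary for illustration)
train_vectors = [[1, 0, 1, 0], [0, 1, 0, 1]]
train_labels = ["Greeting", "Thanks"]
prediction = predict_nearest(train_vectors, train_labels, [1, 0, 0, 0])
print(prediction)  # Greeting
```

The query vector `[1, 0, 0, 0]` (just "hello") is at squared distance 1 from the greeting example and 3 from the thanks example, so the greeting label wins.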
def bot_respond(user_query):
    # 1. Preprocess query
    # 2. Convert to bag of words
    # 3. Predict intent
    # 4. Select random response for that intent
    return responses[predicted[0]][random_index]
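
The four steps of `bot_respond` can be sketched end to end as below. This is a minimal sketch under stated assumptions, not the reference solution: the preprocessing function, vectorizer, classifier, and `responses` dictionary are passed in as parameters so the example runs on its own, the `AlwaysGreeting` class is a stand-in for a trained model, and the fallback message is invented for illustration.

```python
import random


def bot_respond(user_query, preprocess, vectorize, classifier, responses,
                fallback="Sorry, I did not understand that."):
    """Preprocess, vectorize, predict the intent, then pick a random reply."""
    cleaned = preprocess(user_query)            # 1. Preprocess query
    query_bow = vectorize([cleaned])            # 2. Convert to bag of words
    predicted = classifier.predict(query_bow)   # 3. Predict intent
    candidates = responses.get(predicted[0])
    if not candidates:                          # Unknown intent: avoid a KeyError
        return fallback
    return random.choice(candidates)            # 4. Random response for that intent


class AlwaysGreeting:
    """Stand-in classifier so the sketch runs without a trained model."""

    def predict(self, rows):
        return ["Greeting" for _ in rows]


responses = {"Greeting": ["Hello! How can I help you?"]}
reply = bot_respond("Hi there!", str.lower, lambda rows: rows,
                    AlwaysGreeting(), responses)
print(reply)  # Hello! How can I help you?
```

Using `random.choice` avoids computing an index by hand, and looking the intent up with `.get` means a predicted label that is missing from `responses` yields the fallback instead of raising an exception.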
Personalized Responses
- `<HUMAN>` placeholder in responses

Expand Knowledge Base
- `intents.json` (e.g., "FavoriteFood", "Weather")

Model Comparison

Context Awareness
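
For the Personalized Responses idea, one simple approach is to substitute the `<HUMAN>` placeholder just before a response is shown. The sketch below assumes the user's name is already known; the function name, the default word used when no name is available, and the response template are all hypothetical.

```python
def personalize(response, user_name, placeholder="<HUMAN>"):
    """Replace the placeholder with the user's name, or a neutral word if unknown."""
    return response.replace(placeholder, user_name or "there")


print(personalize("Nice to meet you, <HUMAN>!", "Sam"))  # Nice to meet you, Sam!
print(personalize("Nice to meet you, <HUMAN>!", ""))     # Nice to meet you, there!
```

Responses without the placeholder pass through unchanged, so the function can be applied to every reply.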
Solution: Make sure you have declared `snowballStemmer = snowball.SnowballStemmer("english")`
Solution: Check that you are (1) lowercasing, (2) stemming, and (3) removing ALL punctuation
Solution: Ensure the predicted intent exists as a key in the `responses` dictionary
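Solution: Install the dependencies with `pip install streamlit nltk scikit-learn pandas numpy` (the package is named `scikit-learn` on PyPI, although it is imported as `sklearn`)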
Input: 'We all agreed, it was a magnificent evening.'
Output: 'we all agre it was a magnific even'
Test: "Hello there"
KNN Prediction: Greeting
Decision Tree Prediction: Greeting
Naive Bayes Prediction: Greeting
You: Hello there
Alex: Hi human, please tell me your Alex user
You: What time is it?
Alex: It's 15:30:45
You: Thank you
Alex: You're welcome!
project-04-chatbot/
├── README.md # This file
├── project-04-chatbot.ipynb # Student template (65-70%)
├── v1-baseline-100percent.ipynb # Complete solution
├── intents.json # Training dataset
└── streamlit/
    ├── chatbot.py # Core chatbot functions
    └── app.py # Streamlit UI template
Ready to build your chatbot? Open project-04-chatbot.ipynb and start building!