Student starter code (30% baseline)
- index.html - Main HTML page
- script.js - JavaScript logic
- styles.css - Styling and layout
- package.json - Dependencies
- setup.sh - Setup script
- README.md - Instructions (below)

💡 Download the ZIP, extract it, and follow the instructions below to get started!
By completing this project, you will:
- Difficulty Level: Intermediate
- Estimated Time: 2-3 hours
- Prerequisites: Completed Activity 08 (Clustering)
You own a supermarket mall and have customer data from membership cards:
Goal: Group customers into segments for targeted marketing strategies!
v1-baseline-100percent.ipynb in Google Colab or Jupyter

project-03-mall-segmentation.ipynb

✅ Data Download & Loading
✅ Data Exploration (Phase 2)
✅ K-Means Model Setup (Phase 3)
✅ Visualization Template (Phase 4)
| TODO | Task | Difficulty | Estimated Time |
|---|---|---|---|
| 1 | Create and fit KMeans model (n_clusters=5) | Medium | 10 min |
| 2 | Get inertia and cluster centers | Easy | 5 min |
| 3 | Predict clusters for new customers | Medium | 8 min |
Expected Inertia: ~23838.25
| TODO | Task | Difficulty | Estimated Time |
|---|---|---|---|
| 4 | Create cluster scatter plot with legend | Medium | 15 min |
| 5 | Implement Elbow Method (K=1 to 10) | Hard | 15 min |
| 6 | Identify optimal K and rebuild model | Medium | 10 min |
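For TODO 6, the elbow can also be read off numerically instead of by eye. The sketch below is one simple heuristic (the point farthest below the straight line joining the first and last points of the inertia curve). It is not part of the starter notebook: the function name `elbow_k` and the inertia values are made up for illustration, and it assumes inertia is largest at the smallest K and smallest at the largest K.

```python
import numpy as np


def elbow_k(ks, inertias):
    """Return the K whose (K, inertia) point lies farthest below the chord
    joining the first and last points of a decreasing inertia curve."""
    x = np.asarray(ks, dtype=float)
    y = np.asarray(inertias, dtype=float)
    # Rescale both axes to [0, 1] so neither one dominates the comparison.
    x = (x - x.min()) / (x.max() - x.min())
    y = (y - y.min()) / (y.max() - y.min())
    # After rescaling, the chord runs from (0, 1) to (1, 0), i.e. y = 1 - x.
    # The vertical gap below the chord is therefore 1 - x - y.
    gap = 1.0 - x - y
    return int(ks[int(np.argmax(gap))])


# Hypothetical inertia values for K = 1..10 (NOT the mall dataset).
ks = list(range(1, 11))
inertias = [100, 50, 20, 18, 17, 16.5, 16, 15.8, 15.6, 15.5]
best_k = elbow_k(ks, inertias)
print("Suggested K:", best_k)  # Suggested K: 3
```

Treat the result as a suggestion to compare against the plot, not as a definitive answer.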
You've successfully completed this project when:
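```python
from sklearn.cluster import KMeans

# X is the 2-D feature array extracted during data exploration
model = KMeans(n_clusters=5, random_state=42)
model.fit(X)
labels = model.labels_            # cluster index for each row of X
centers = model.cluster_centers_  # one row of coordinates per cluster
```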
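The snippet above stops at labels and centers. A minimal sketch of the remaining pieces (reading `inertia_` and calling `predict` on new customers) could look like the following. The data here is synthetic, generated only so the example runs on its own, and the choice of three clusters and `n_init=10` are assumptions for the demo, not values from the project.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for an (Age, Spending) feature matrix; NOT the mall data.
rng = np.random.default_rng(0)
blob_centers = np.array([[25, 80], [45, 40], [60, 15]])
X = np.vstack([c + rng.normal(scale=3.0, size=(30, 2)) for c in blob_centers])

model = KMeans(n_clusters=3, n_init=10, random_state=42)
model.fit(X)

# Inertia: sum of squared distances from each point to its assigned center.
print("Inertia:", model.inertia_)
print("Cluster centers:\n", model.cluster_centers_)

# Assign unseen customers to the nearest learned center.
new_customers = np.array([[24, 78], [61, 12]])
new_labels = model.predict(new_customers)
print("New customer clusters:", new_labels)
```

Cluster numbers are arbitrary, so the same customer may land in "Cluster 0" on one run and "Cluster 2" on another if `random_state` changes.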
```python
inertias = []
for k in range(1, 11):
    # Fit one model per candidate K and record its inertia
    model = KMeans(n_clusters=k, random_state=42)
    model.fit(X)
    inertias.append(model.inertia_)
# Plot and find the "elbow" point
```
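The plotting step mentioned in the last comment could be sketched as below. This is an illustration on synthetic data, not the notebook's solution: the blob positions, the `n_init=10` setting, the `Agg` backend line, and the output filename `elbow.png` are all assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs outside a notebook; drop it in Colab/Jupyter
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the feature matrix; NOT the mall data.
rng = np.random.default_rng(0)
blob_centers = np.array([[25, 80], [45, 40], [60, 15], [30, 20]])
X = np.vstack([c + rng.normal(scale=3.0, size=(30, 2)) for c in blob_centers])

ks = list(range(1, 11))
inertias = []
for k in ks:
    model = KMeans(n_clusters=k, n_init=10, random_state=42)
    model.fit(X)
    inertias.append(model.inertia_)

# Inertia against K: look for the point where the curve stops dropping sharply.
fig, ax = plt.subplots()
ax.plot(ks, inertias, marker="o")
ax.set_xlabel("Number of clusters (K)")
ax.set_ylabel("Inertia")
ax.set_title("Elbow Method")
ax.set_xticks(ks)
fig.savefig("elbow.png")  # hypothetical output path
```

In a notebook, replacing the `savefig` call with `plt.show()` displays the curve inline.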
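```python
import matplotlib.pyplot as plt

n_clusters = 5  # must match the fitted model and the number of colors
colors = ['red', 'blue', 'green', 'orange', 'purple']
for i in range(n_clusters):
    cluster_data = X[labels == i]  # rows assigned to cluster i
    plt.scatter(cluster_data[:, 0], cluster_data[:, 1], c=colors[i], label=f'Cluster {i}')
plt.legend()
```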
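A common extension of the scatter plot is to overlay the cluster centers so each segment's "typical customer" is visible. The sketch below does this on synthetic data. The axis labels, marker style, three-cluster setup, and the filename `clusters.png` are assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs outside a notebook; drop it in Colab/Jupyter
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the feature matrix; NOT the mall data.
rng = np.random.default_rng(0)
blob_centers = np.array([[25, 80], [45, 40], [60, 15]])
X = np.vstack([c + rng.normal(scale=3.0, size=(30, 2)) for c in blob_centers])

n_clusters = 3
model = KMeans(n_clusters=n_clusters, n_init=10, random_state=42).fit(X)
labels = model.labels_
centers = model.cluster_centers_

colors = ["red", "blue", "green"]
fig, ax = plt.subplots()
for i in range(n_clusters):
    cluster_data = X[labels == i]
    ax.scatter(cluster_data[:, 0], cluster_data[:, 1], c=colors[i], label=f"Cluster {i}")

# Draw the centers on top with a distinct marker.
ax.scatter(centers[:, 0], centers[:, 1], c="black", marker="X", s=150, label="Centers")
ax.set_xlabel("Age")
ax.set_ylabel("Spending Score")
ax.legend()
fig.savefig("clusters.png")  # hypothetical output path
```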
Once you complete all TODOs, try these:
Challenge 1: Different Features
Challenge 2: Kneed Library
Use kneed.KneeLocator to automatically find the optimal K!
pip install kneed
Challenge 3: 3D Clustering
Challenge 4: Customer Profiling
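One possible starting point for customer profiling is to attach the cluster labels to the table and summarize each segment. The sketch below uses a tiny made-up DataFrame. The column names `Age` and `Spending` and the two-cluster setup are assumptions for illustration, so adapt them to the columns in your notebook.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Tiny made-up customer table; NOT the mall data.
customers = pd.DataFrame({
    "Age": [20, 21, 22, 60, 61, 62],
    "Spending": [80, 82, 84, 20, 22, 24],
})

model = KMeans(n_clusters=2, n_init=10, random_state=42)
customers["Cluster"] = model.fit_predict(customers[["Age", "Spending"]])

# Average feature values and size of each segment.
profile = customers.groupby("Cluster")[["Age", "Spending"]].mean()
profile["Count"] = customers.groupby("Cluster").size()
print(profile)
```

From a table like this you can give each segment a descriptive name, for example "young, high spenders".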
Solution: Check that you extracted numerical columns only, not 'Gender' or 'CustomerID'
Solution: Try different random_state values or verify feature extraction
Solution: The elbow is usually around K=3-5 for this dataset
Solution: Reduce n_clusters or try different random_state
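The first fix above (numerical columns only, without 'Gender' or 'CustomerID') can be sketched with pandas as follows. The three-row table and the column name `Spending` are made up for the example, so check the actual column names in your DataFrame.

```python
import pandas as pd

# Made-up rows that mimic the shape of a customer table; NOT the mall data.
df = pd.DataFrame({
    "CustomerID": [1, 2, 3],
    "Gender": ["Male", "Female", "Female"],
    "Age": [19, 35, 52],
    "Spending": [39, 81, 6],
})

# Keep numeric columns only, which drops the text column 'Gender'.
numeric = df.select_dtypes(include="number")
# 'CustomerID' is numeric but is an identifier, not a feature, so drop it too.
features = numeric.drop(columns=["CustomerID"])

X = features.to_numpy()
print(features.columns.tolist())  # ['Age', 'Spending']
print(X.shape)                    # (3, 2)
```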
Inertia of Model with K=5: 23838.24882164186
Cluster Centers:
[[58.44444444, 50.52777778],
[41.48484848, 37. ],
[30.1754386 , 82.35087719],
[25.4 , 52.68571429],
[43.28205128, 11.84615385]]
Customer A (Age: 20, Spending: 42) -> Cluster 3
Customer B (Age: 65, Spending: 81) -> Cluster 0
Customer C (Age: 44, Spending: 100) -> Cluster 2
Customer D (Age: 59, Spending: 23) -> Cluster 4
The elbow point is typically around K=3 or K=4
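The customer assignments in the expected output above can be checked by hand: K-Means prediction assigns each point to the nearest center by Euclidean distance. The sketch below recomputes this with NumPy, using the printed cluster centers and the four customers (Age, Spending) exactly as listed.

```python
import numpy as np

# Cluster centers copied from the expected output.
centers = np.array([
    [58.44444444, 50.52777778],
    [41.48484848, 37.0],
    [30.1754386, 82.35087719],
    [25.4, 52.68571429],
    [43.28205128, 11.84615385],
])

# Customers A, B, C, D as (Age, Spending).
customers = np.array([[20, 42], [65, 81], [44, 100], [59, 23]])

# Distance from every customer to every center, shape (4, 5).
dists = np.linalg.norm(customers[:, None, :] - centers[None, :, :], axis=2)
nearest = dists.argmin(axis=1)
print(nearest)  # [3 0 2 4]
```

If your cluster numbers differ but the groupings match, the model is likely equivalent. Label order depends on initialization.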
Ready to begin? Open project-03-mall-segmentation.ipynb and start clustering your customers! 🛒