Practice and reinforce the concepts from Lesson 16
In this activity, you will:
import numpy as np
import pandas as pd
:bulb: Make sure to run the import cell before starting the exercises. You'll see a green checkmark when the cell runs successfully.
Question One: Student Scores Analysis (20 minutes)
Setup
- Download the students_score.csv file here to your computer
Step One: Upload the CSV file
- Copy and paste the following code into a new cell:
from google.colab import files
uploaded = files.upload()
- Run the cell (press Shift+Enter)
- Click the "Choose files" button that appears
- Select the students_score.csv file from your computer
:bulb: Tip After uploading, you should see a message showing the file name and size. If you don't see this, try uploading again.
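For reference, files.upload() returns a dictionary that maps each uploaded file name to its contents as bytes. The sketch below uses a small hand-made stand-in for that dictionary (an assumption, so that it also runs outside Colab; the two data rows are copied from the expected output in this exercise) to show one way to confirm what was uploaded.

```python
import io
import pandas as pd

# Stand-in for the dictionary returned by files.upload() in Colab:
# keys are file names, values are the raw file contents as bytes.
uploaded = {
    "students_score.csv": (
        b"Students,Gender,Mathematics,Science,English\n"
        b"Adam,M,87,78,90\n"
        b"Bob,M,42,52,66\n"
    )
}

# List what was uploaded and how large each file is.
for name, data in uploaded.items():
    print(f"{name}: {len(data)} bytes")

# The bytes can also be parsed directly, without relying on the saved file.
preview = pd.read_csv(io.BytesIO(uploaded["students_score.csv"]))
print(preview.shape)
```

In Colab you would keep the real uploaded variable from the upload cell and skip the stand-in.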
students_score = pd.read_csv('students_score.csv')
students_score.head()
students_score.columns
students_score.index
students_score.info()
Expected outputs:
  Students Gender  Mathematics  Science  English
0     Adam      M           87       78       90
1      Bob      M           42       52       66
2  Crystal      F           87       89       83
3    David      M           99       89       83
4   Edmund      M           53       70       91
Index(['Students', 'Gender', 'Mathematics', 'Science', 'English'], dtype='object')
RangeIndex(start=0, stop=7, step=1)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 5 columns):
 #   Column       Non-Null Count  Dtype
---  ------       --------------  -----
 0   Students     7 non-null      object
 1   Gender       7 non-null      object
 2   Mathematics  7 non-null      int64
 3   Science      7 non-null      int64
 4   English      7 non-null      int64
dtypes: int64(3), object(2)
memory usage: 408.0+ bytes
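To see what each inspection call reports without needing the CSV file, you can rebuild the first five rows by hand. This is only a sketch: the DataFrame named sample is a hypothetical stand-in typed from the expected output above, not the uploaded file.

```python
import pandas as pd

# Hand-typed copy of the five rows shown by students_score.head().
sample = pd.DataFrame({
    "Students": ["Adam", "Bob", "Crystal", "David", "Edmund"],
    "Gender": ["M", "M", "F", "M", "M"],
    "Mathematics": [87, 42, 87, 99, 53],
    "Science": [78, 52, 89, 89, 70],
    "English": [90, 66, 83, 83, 91],
})

print(sample.head(3))        # first 3 rows instead of the default 5
print(list(sample.columns))  # column labels as a plain list
print(sample.index)          # RangeIndex(start=0, stop=5, step=1)
print(sample.shape)          # (rows, columns) -> (5, 5)
sample.info()                # dtypes and non-null counts per column
```

Because sample holds only five rows, its index stops at 5, while the full file has 7 rows and stops at 7.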
new_students_score = students_score[['Students', 'Mathematics', 'Science', 'English']]
Expected output:
  Students  Mathematics  Science  English
0     Adam           87       78       90
1      Bob           42       52       66
2  Crystal           87       89       83
3    David           99       89       83
4   Edmund           53       70       91
5    Fiona           95       89       90
6    Grace           89       90       90
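The double square brackets matter in the selection step. The outer pair indexes the DataFrame and the inner pair is a Python list of column names, so the result is a new DataFrame. A single pair with one name returns a Series instead. The sketch below uses a hand-typed two-row stand-in (scores, a hypothetical name) and also shows drop(columns=...) as another way to leave out Gender.

```python
import pandas as pd

scores = pd.DataFrame({
    "Students": ["Adam", "Bob"],
    "Gender": ["M", "M"],
    "Mathematics": [87, 42],
    "Science": [78, 52],
    "English": [90, 66],
})

# A list of names inside the brackets keeps those columns, in that order.
subset = scores[["Students", "Mathematics", "Science", "English"]]
print(type(subset).__name__)                 # DataFrame

# A single name (no inner list) returns one column as a Series.
print(type(scores["Mathematics"]).__name__)  # Series

# Dropping the unwanted column gives the same result here.
same = scores.drop(columns=["Gender"])
print(subset.equals(same))                   # True
```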
new_students_score.to_csv('new_students_score.csv', index=False)
:bulb: Tip The index=False parameter prevents the row numbers from being saved in the CSV file.
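To see the effect of index=False without creating any files, you can call to_csv() with no file name, in which case it returns the CSV text as a string. The sketch below compares both settings on a small hand-typed stand-in (scores, a hypothetical name).

```python
import pandas as pd

scores = pd.DataFrame({
    "Students": ["Adam", "Bob"],
    "Mathematics": [87, 42],
})

# With no file path, to_csv() returns the CSV text instead of writing a file.
with_index = scores.to_csv()
without_index = scores.to_csv(index=False)

print(with_index)     # header starts with an empty field, rows start with 0, 1
print(without_index)  # header and rows contain only the data columns
```

If the version with the index were saved and read back with pd.read_csv(), the row numbers would come back as an extra column named "Unnamed: 0".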
files.download('new_students_score.csv')
from google.colab import files
uploaded = files.upload()
sports_medals = pd.read_csv('sports_medals.csv')
sports_medals
sports_medals.columns
sports_medals.index
sports_medals.info()
Expected outputs:
       Houses  Gold  Silver  Bronze  Total
0  Gryffindor     5       4       5     14
1   Ravenclaw     7       5       3     15
2  Hufflepuff     6       5       4     15
3   Slytherin     2       6       8     16
Index(['Houses', 'Gold', 'Silver', 'Bronze', 'Total'], dtype='object')
RangeIndex(start=0, stop=4, step=1)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Houses  4 non-null      object
 1   Gold    4 non-null      int64
 2   Silver  4 non-null      int64
 3   Bronze  4 non-null      int64
 4   Total   4 non-null      int64
dtypes: int64(4), object(1)
memory usage: 288.0+ bytes
total_sports_medals = sports_medals[['Houses', 'Total']]
Expected output:
       Houses  Total
0  Gryffindor     14
1   Ravenclaw     15
2  Hufflepuff     15
3   Slytherin     16
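As an optional sanity check before saving, you could confirm that the Total column really equals Gold + Silver + Bronze for every house. The sketch below rebuilds the table by hand from the expected output above (medals is a hypothetical stand-in for sports_medals) so that it runs without the CSV file.

```python
import pandas as pd

medals = pd.DataFrame({
    "Houses": ["Gryffindor", "Ravenclaw", "Hufflepuff", "Slytherin"],
    "Gold": [5, 7, 6, 2],
    "Silver": [4, 5, 5, 6],
    "Bronze": [5, 3, 4, 8],
    "Total": [14, 15, 15, 16],
})

# Add the three medal columns row by row and compare with Total.
computed = medals[["Gold", "Silver", "Bronze"]].sum(axis=1)
print((computed == medals["Total"]).all())  # True

# Same two-column selection as in the exercise.
totals = medals[["Houses", "Total"]]
print(totals.shape)                         # (4, 2)
```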
total_sports_medals.to_csv('total_sports_medals.csv', index=False)
files.download('total_sports_medals.csv')
:warning: Warning Before submitting:
- Make sure all code cells have been run successfully
- Verify that both CSV files have been downloaded to your computer
- Check that your outputs match the expected outputs shown in the exercise
File upload not working?
Getting errors when reading CSV?
pd.read_csv()
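One possible cause of a read error (an assumption, since the causes are not listed here) is that the file name passed to pd.read_csv() does not match any file in the notebook's working directory. A minimal sketch of a check, using a hypothetical helper named read_if_present:

```python
from pathlib import Path
import pandas as pd

def read_if_present(filename):
    """Read a CSV if it exists; otherwise print the CSV files that do exist."""
    path = Path(filename)
    if not path.exists():
        available = sorted(p.name for p in Path(".").glob("*.csv"))
        print(f"{filename} not found. CSV files here: {available}")
        return None
    return pd.read_csv(path)

# Returns a DataFrame if the upload worked, otherwise None plus a hint.
students_score = read_if_present("students_score.csv")
```

If the printed list shows a slightly different name, such as a copy with a number appended, use that exact name or upload the file again.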
Download not starting?