Practice and reinforce the concepts from Lesson 18
Time estimate: 10 minutes
Instructions:
:information_source: Info :link: Access the Exercise Form Here
Time estimate: 30-40 minutes
:warning: Important Before starting, save a copy of the template to your Google Drive:
:link: Copy the Colab Template Here
Click "File" -> "Save a copy in Drive" to create your own version
:bulb: If you get an import error, try restarting the runtime from the menu: Runtime -> Restart runtime
Step 2: Loading Your Dataset (5 minutes)
- Click the folder icon on the left sidebar in Colab
- Upload your CSV file using the upload button
- Use pd.read_csv() to read the file into a DataFrame
- Assign it to a variable (e.g., df)
:bulb: Common Challenge If your file has special characters, try adding encoding='utf-8' to your read_csv() call:
df = pd.read_csv('filename.csv', encoding='utf-8')
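As a minimal sketch of the loading step, the example below writes a tiny made-up CSV to a temporary file and reads it back with encoding='utf-8'. The file contents and the column names (city, sales) are invented for illustration; in the exercise you would pass the name of your uploaded file instead.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data containing non-ASCII characters
csv_text = "city,sales\nZürich,120\nSão Paulo,95\n"

# Write the sample to a temporary file so read_csv has something to load
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(csv_text)
    csv_path = tmp.name

# Read the file into a DataFrame, stating the encoding explicitly
df = pd.read_csv(csv_path, encoding="utf-8")
os.remove(csv_path)  # clean up the temporary file

print(df)
```

If a file was not saved as UTF-8, this call can still fail with a decoding error; in that case another encoding such as 'latin-1' may be worth trying.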
df.head()
df.columns
df.info()
df.isnull().sum()
df.drop(columns=['column_name'])
:bulb: Helpful Hint To see more rows, use df.head(10), or use df.tail() to see the last rows
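The inspection commands above can be tried on a small DataFrame before running them on your own data. This is only a sketch: the columns (name, score, notes) and their values are made up, and the choice of which column to drop is arbitrary.

```python
import pandas as pd

# Hypothetical data with one missing value in each of two columns
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", None],
    "score": [88.0, None, 92.5, 79.0],
    "notes": ["a", "b", "c", "d"],
})

print(df.head(2))        # first two rows
print(df.tail(1))        # last row
print(list(df.columns))  # column names
df.info()                # dtypes and non-null counts (prints directly)

# Count missing values in each column
missing = df.isnull().sum()
print(missing)

# drop() must be told what to remove; here, a single column.
# It returns a new DataFrame and leaves df unchanged.
trimmed = df.drop(columns=["notes"])
print(trimmed)
```

Note that df.drop() does not modify df in place by default, so the result needs to be assigned to a variable if you want to keep it.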
df.sort_values('column_name')
df.sort_values('column_name', ascending=False)
df.sort_values(['col1', 'col2'], ascending=[True, False])
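To see how the three sort_values() forms differ, here is a small sketch on invented data (the region and sales columns are placeholders, not part of the exercise dataset).

```python
import pandas as pd

# Hypothetical sales data
df = pd.DataFrame({
    "region": ["West", "East", "West", "East"],
    "sales": [200, 150, 300, 150],
})

# Ascending order by one column (the default)
by_sales = df.sort_values("sales")

# Descending order by one column
by_sales_desc = df.sort_values("sales", ascending=False)

# Sort by region A-Z, then by sales from highest to lowest within each region
by_region_then_sales = df.sort_values(
    ["region", "sales"], ascending=[True, False]
)

print(by_region_then_sales)
```

Each call returns a new, sorted DataFrame; df itself keeps its original row order unless you assign the result back to it.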
df.groupby('column_name').size()
df.groupby('category')['numeric_column'].sum()
df.groupby('category')['numeric_column'].mean()
df.groupby(['col1', 'col2']).size()
grouped_data.sort_values(ascending=False)
:bulb: Tip Use .reset_index() after grouping to convert the result back to a regular DataFrame
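The grouping commands can be sketched end to end on a toy dataset. The category and amount columns below are assumptions made for the example; substitute the column names from your own file.

```python
import pandas as pd

# Hypothetical purchases
df = pd.DataFrame({
    "category": ["fruit", "veg", "fruit", "veg", "fruit"],
    "amount": [10, 4, 6, 8, 2],
})

# Number of rows in each category
counts = df.groupby("category").size()

# Total and average amount per category
totals = df.groupby("category")["amount"].sum()
means = df.groupby("category")["amount"].mean()

# Rank the categories by total, largest first
ranked = totals.sort_values(ascending=False)

# Turn the grouped Series back into a regular DataFrame,
# with 'category' as an ordinary column instead of the index
totals_df = totals.reset_index()

print(ranked)
print(totals_df)
```

Before reset_index(), the group labels live in the index of the result; afterwards they are a normal column, which is usually easier to sort, filter, or export.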
df[df['column'] > value]
df[(df['col1'] > value1) & (df['col2'] == 'value2')]
filtered_df = df[df['column'] == 'specific_value']
:warning: Remember Always use parentheses around each condition when combining multiple filters!
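A short sketch of the filters above, using invented product data (the product, price, and status columns are placeholders):

```python
import pandas as pd

# Hypothetical product list
df = pd.DataFrame({
    "product": ["pen", "book", "lamp", "desk"],
    "price": [2, 15, 40, 120],
    "status": ["in_stock", "in_stock", "sold_out", "in_stock"],
})

# Single condition: rows where price is greater than 10
expensive = df[df["price"] > 10]

# Two conditions joined with &; each one is wrapped in parentheses
expensive_in_stock = df[(df["price"] > 10) & (df["status"] == "in_stock")]

# Keep a filtered copy under its own name for later steps
sold_out = df[df["status"] == "sold_out"]

print(expensive_in_stock)
```

The parentheses matter because Python evaluates & before > and ==. Without them, the expression is grouped differently from what you intended and typically raises an error.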
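DataFrame not showing?
- Run df on its own in a cell
Column not found error?
- Check the names in df.columns - they are case-sensitive
- Strip stray whitespace from the names: df.columns = df.columns.str.strip()
Memory error with large datasets?
- Load only the columns you need: pd.read_csv('file.csv', usecols=['col1', 'col2'])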
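The column-name and memory fixes can be sketched as follows. To keep the example self-contained it reads from an in-memory string through io.StringIO rather than an uploaded file, and the column names (Name, Age, City) are made up.

```python
import io

import pandas as pd

# Hypothetical CSV whose first header has stray spaces around it
csv_text = " Name ,Age,City\nAna,31,Lima\nBen,27,Oslo\n"

df = pd.read_csv(io.StringIO(csv_text))
raw_columns = list(df.columns)
print(raw_columns)  # the first name still carries its spaces

# Remove leading and trailing whitespace from every column name
df.columns = df.columns.str.strip()
names = df["Name"]  # works now that the name is clean

# Load only the columns you need to reduce memory use
small = pd.read_csv(io.StringIO(csv_text), usecols=["Age", "City"])
print(small)
```

With a real file you would pass its path in place of io.StringIO(csv_text). Note that usecols must match the header names exactly as they appear in the file, including any stray spaces.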
:warning: Important - Submit Your Work! Before submitting:
- Review all your code outputs
- Make sure all steps are completed
- Save your Colab notebook (File -> Save)
- Download a copy for your records (File -> Download -> Download .ipynb)
:link: Submit your completed exercise here
Your submission helps track your progress and understanding of pandas data manipulation!