By the end of this lesson, you will be able to:
:information_source: Object Detection is a computer vision technique that helps computers identify and locate objects within images or videos. It's like teaching a computer to "see" and recognize things the way humans do!
Object detection helps us:
:information_source: YOLO (You Only Look Once) is a super-fast object detection algorithm that can find and identify multiple objects in images or videos in real-time. It looks at the entire image just once to detect all objects!
YOLO was created in 2015 by researchers Joseph Redmon, Santosh Divvala, Ross Girshick and Ali Farhadi. Here's what makes it special:
:bulb: Remember: YOLO gets its name because it only needs to look once at an image to find all the objects. This makes it much faster than older methods that had to scan the image multiple times!
Comparing Haar Cascades and YOLO
Haar Cascades
Haar Cascades work by:
- Using many simple patterns called Haar features
- Scanning the image step by step
- Looking for specific features in each area
YOLO Approach
YOLO is different because it:
- Looks at the entire image at once
- Uses deep learning to understand the whole picture
- Finds all objects in one quick scan
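To get a feel for why scanning step by step is costly, here is a minimal sketch that counts how many window positions a sliding-window detector would check on a single image. The window size, step, and scale factor are illustrative assumptions, not the settings of any real Haar cascade, and `count_sliding_windows` is a hypothetical helper name.

```python
def count_sliding_windows(img_w, img_h, win=24, step=4, scale=1.25):
    """Count window positions a sliding-window scan visits across scales.

    All defaults are illustrative assumptions, not real detector settings.
    """
    total = 0
    size = float(win)
    # Grow the window until it no longer fits inside the image.
    while size <= min(img_w, img_h):
        s = int(size)
        stride = max(1, int(step * size / win))  # the step grows with the window
        cols = (img_w - s) // stride + 1
        rows = (img_h - s) // stride + 1
        total += cols * rows
        size *= scale
    return total


windows = count_sliding_windows(640, 480)
print(f"Sliding-window positions on a 640x480 image: {windows}")
print("Single-pass detector: 1 forward pass over the whole image")
```

Each of those positions is a separate check, while a YOLO-style model processes the whole image in one pass (although that single pass is itself a heavier computation).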
Quick Comparison
| Features | Haar Cascades | YOLO |
| --- | --- | --- |
| Speed | Slower ⏱️ | Fast :zap: |
| Accuracy | Good for simple objects | Better, especially for complex scenes |
| Computer Power Needed | Less demanding :computer: | Needs more power |

:books: YOLO Version History
YOLO has improved over the years! Here are the different versions:
- YOLOv1 (2015) - The original version
- YOLOv2/9000 (2016) - Faster and more accurate
- YOLOv3 (2018) - Better at detecting small objects
- YOLOv4 (2020) - Even more improvements
- YOLOv5 (2020), YOLOv6 and YOLOv7 (2022) - Multiple versions with different strengths
- YOLOv8 (2023) - The most recent version in this list
Note: We'll use YOLOv8 in this lesson because it's:
YOLO helps us in many ways:
:bulb: Want to learn more? Check out this article by DataCamp for extra details!
:computer: Let's Code with YOLOv8!
We'll build our object detection program in three easy steps:
- Install the tools - Get YOLOv8 and OpenCV ready
- Load a trained model - Use a pre-trained YOLO model
- Detect objects - Find objects in images and videos
Step One: Install and Import Ultralytics :package:
First, we need to install the Ultralytics package that contains YOLOv8. Choose the right command for your computer:
- For Windows users:
pip install ultralytics
Or
py -m pip install ultralytics
- For Mac users:
python3 -m pip install ultralytics
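Note: Having trouble? Download the requirements.txt file, put it in your project folder, and run:
- For Windows users:
py -m pip install -r requirements.txt
- For Mac users:
python3 -m pip install -r requirements.txt
- In a Jupyter or Google Colab notebook, install the package directly from a cell instead:
!pip install ultralytics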
After installation is complete, import the libraries we need:
from ultralytics import YOLO
import cv2 as cv
import math  # used later when building the label text
Now we'll load a pre-trained YOLO model. Think of this like downloading a brain that already knows how to recognize objects!
# Load trained model
model = YOLO('yolov8n.pt')
Next, let's get the list of objects this model can recognize:
# Load the labels
class_names = model.names
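`model.names` is a dictionary that maps each numeric class ID to a readable label. The sketch below uses a small hand-written stand-in with only three entries (assumed to match the first labels of the pre-trained model) so it runs without downloading anything. `label_for` is a hypothetical helper, not part of Ultralytics.

```python
# Stand-in for model.names: a dict of class ID -> label.
# Only three entries are shown; they are assumed to match the model's labels.
class_names = {0: "person", 1: "bicycle", 2: "car"}


def label_for(class_id, names):
    """Return the label for a class ID, or a placeholder if it is unknown."""
    return names.get(int(class_id), f"class {int(class_id)}")


print(label_for(2, class_names))    # car
print(label_for(2.0, class_names))  # car (float IDs are converted to int first)
print(label_for(99, class_names))   # class 99
```

With the real model, printing `class_names` shows the full list of objects it can recognize.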
Let's use our YOLO model to find objects in an image! First, we load the image:
img = cv.imread("image.jpg")
Then, we ask YOLO to find objects in the image:
# Detect objects
results = model.predict(source=img)
Now comes the fun part - drawing boxes around the objects YOLO found:
# Draw bounding boxes
for result in results:
    boxes = result.boxes
    for box in boxes:
        x1, y1, x2, y2 = box.xyxy[0]
        x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
        cv.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
        # Put classname with confidence
        text = class_names[math.floor(box.cls)] + " " + str(math.floor(box.conf * 100)) + "%"
        cv.putText(img, text, (x1, y1), cv.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
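The label line packs several conversions into one expression. As a rough sketch, the same logic can be written as a small helper that takes plain Python numbers. `format_label` is a hypothetical name, and the dictionary is a stand-in for `model.names`.

```python
import math


def format_label(class_id, confidence, names):
    """Build a label such as 'car 87%' from a class ID and a 0-1 confidence."""
    percent = math.floor(confidence * 100)  # drop the decimals, as in the lesson
    return f"{names[class_id]} {percent}%"


names = {0: "person", 2: "car"}  # stand-in for model.names
print(format_label(2, 0.875, names))  # car 87%
print(format_label(0, 0.5, names))    # person 50%
```

With a real detection you could pass `int(box.cls[0])` and `float(box.conf[0])` as the first two arguments.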
Finally, let's show the image with all the detected objects:
# Display the resulting frame
cv.imshow('Image', img)
Keep the window open until a key is pressed:
# Keep image window open until user presses a key
cv.waitKey(0)
Here's the complete program that detects objects in an image:
from ultralytics import YOLO
import cv2 as cv
import math
model = YOLO("yolov8n.pt")
class_names = model.names
img = cv.imread("image.jpg")
results = model.predict(source=img)
# Draw bounding boxes
for result in results:
    boxes = result.boxes
    for box in boxes:
        x1, y1, x2, y2 = box.xyxy[0]
        x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
        cv.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
        # Put classname with confidence
        text = class_names[math.floor(box.cls)] + " " + str(math.floor(box.conf * 100)) + "%"
        cv.putText(img, text, (x1, y1), cv.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
# Display the resulting frame
cv.imshow('Image', img)
cv.waitKey(0)
We can also detect objects in videos! YOLO will analyze each frame of the video. First, let's load a video:
# Capture video from a file
video = cv.VideoCapture("video.mp4")
# Or capture from the default webcam instead (use only one of these two lines)
video = cv.VideoCapture(0)
Now we'll process the video frame by frame:
while True:
    # Capture frame-by-frame
    ret, frame = video.read()
    # Quit when no more frame
    if not ret:
        print("Video Ended")
        break
    # Run object detector
    results = model.predict(source=frame)
    # Draw bounding boxes
    for result in results:
        boxes = result.boxes
        for box in boxes:
            x1, y1, x2, y2 = box.xyxy[0]
            x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
            cv.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            # Put classname with confidence
            text = class_names[math.floor(box.cls)] + " " + str(math.floor(box.conf * 100)) + "%"
            cv.putText(frame, text, (x1, y1), cv.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
    # Display the resulting frame
    cv.imshow('frame', frame)
    # stop when Q is pressed
    if cv.waitKey(25) == ord("q"):
        break
Don't forget to clean up when done:
# When everything done, release the capture
video.release()
cv.destroyAllWindows()
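The `cv.waitKey(25)` call in the video loop pauses for about 25 milliseconds per frame, which caps playback at roughly 40 frames per second before detection time is counted. Here is a minimal sketch of choosing that delay from a target frame rate. `delay_for_fps` is a hypothetical helper, not part of OpenCV.

```python
def delay_for_fps(fps):
    """Return the per-frame wait in milliseconds for a target frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    # waitKey(0) would block until a key is pressed, so never go below 1 ms.
    return max(1, round(1000 / fps))


print(delay_for_fps(40))  # 25
print(delay_for_fps(30))  # 33
```

In the loop you could then write `cv.waitKey(delay_for_fps(30))` in place of the fixed value.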
In this lesson, you learned:
YOLO has many real-world uses, from helping doctors analyze medical scans to enabling self-driving cars to "see" the road!
Ready to explore more? Try these AI prompts to deepen your understanding: