Week 5: Diving Into Pose Detection with MediaPipe
April 4, 2024
Hello again, and welcome back to my senior project blog! This week, I’m thrilled to share the first working version of the pose detection program for the virtual personal trainer app: it can detect pose landmarks in an input image using the MediaPipe framework. Let’s dive into the details of setting up the project environment and understanding the code.
Setting Up the Environment
Before diving into the code, let’s talk about the setup. Ensuring compatibility and a smooth-running environment is crucial for any project.
First, make sure your Python version is between 3.9 and 3.12, since MediaPipe only supports a specific range of Python releases.
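If you want a quick sanity check before going any further, a few lines of Python will confirm what you’re running (this snippet is my own addition, using the version range mentioned above):
import sys

# Print the interpreter version and warn if it falls outside the 3.9-3.12
# range mentioned above (adjust if MediaPipe's requirements change).
print("Running Python", sys.version.split()[0])
if not ((3, 9) <= sys.version_info[:2] <= (3, 12)):
    print("Warning: this Python version may not be supported by MediaPipe.")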
I started by creating a new directory with a virtual environment named mediapipe-env in my terminal:
mkdir mediapipe
cd mediapipe
python3 -m venv mediapipe-env
.\mediapipe-env\Scripts\activate
After activating the environment (the command above is the Windows form; on macOS or Linux, run source mediapipe-env/bin/activate instead), move your desired input image into the directory. The next step is to install MediaPipe and download a specific pose landmarker model:
pip install mediapipe
wget -O pose_landmarker.task https://storage.googleapis.com/mediapipe-models/pose_landmarker/pose_landmarker_heavy/float16/1/pose_landmarker_heavy.task
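Note that wget isn’t installed by default on Windows, so if that command fails, the same model file can be fetched with a few lines of Python instead. This is just an alternative I’m noting here, not part of the official setup:
# Fetch the pose landmarker model using only the standard library.
import urllib.request

MODEL_URL = ("https://storage.googleapis.com/mediapipe-models/pose_landmarker/"
             "pose_landmarker_heavy/float16/1/pose_landmarker_heavy.task")
urllib.request.urlretrieve(MODEL_URL, "pose_landmarker.task")
print("Saved pose_landmarker.task")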
The Pose Detection Program
MediaPipe and OpenCV Initialization: The program begins by importing the necessary libraries from MediaPipe and OpenCV, setting the stage for pose detection and image manipulation:
import cv2
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2
import numpy as np
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision
Pose Detection Setup: The program defines a helper function that draws the detected landmarks onto a copy of the image, then creates a PoseLandmarker detector from the downloaded pose_landmarker.task model:
def draw_landmarks_on_image(rgb_image, detection_result):
    pose_landmarks_list = detection_result.pose_landmarks
    annotated_image = np.copy(rgb_image)

    # Looping through the detected poses to visualize.
    for idx in range(len(pose_landmarks_list)):
        pose_landmarks = pose_landmarks_list[idx]

        # Drawing the pose landmarks.
        pose_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
        pose_landmarks_proto.landmark.extend([
            landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z)
            for landmark in pose_landmarks
        ])
        solutions.drawing_utils.draw_landmarks(
            annotated_image,
            pose_landmarks_proto,
            solutions.pose.POSE_CONNECTIONS,
            solutions.drawing_styles.get_default_pose_landmarks_style())
    return annotated_image
# Creating a PoseLandmarker object
base_options = python.BaseOptions(model_asset_path='pose_landmarker.task')
options = vision.PoseLandmarkerOptions(
    base_options=base_options,
    output_segmentation_masks=True)
detector = vision.PoseLandmarker.create_from_options(options)
Loading and Processing the Input Image: An input image is loaded and processed to detect human poses:
# Loading the input image
image = mp.Image.create_from_file("imagen.jpg")
# Detecting pose landmarks from the input image.
detection_result = detector.detect(image)
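Before drawing anything, it can be helpful to peek at what the detector actually returns. detection_result.pose_landmarks is a list with one entry per detected person, and each entry is a list of landmarks with the same x, y, and z fields the drawing function above uses. A small, optional inspection snippet might look like this:
# Optional: inspect the raw detection result.
print(f"Detected {len(detection_result.pose_landmarks)} person(s)")
for person_landmarks in detection_result.pose_landmarks:
    # Landmark 0 is the nose in MediaPipe's pose model; x and y are
    # normalized to [0, 1] relative to image width and height.
    nose = person_landmarks[0]
    print(f"Nose at x={nose.x:.2f}, y={nose.y:.2f}, z={nose.z:.2f}")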
Visualization: The detected landmarks are then drawn onto the image and displayed in an OpenCV window:
# Visualizing pose landmarks
annotated_image = draw_landmarks_on_image(image.numpy_view(), detection_result)
cv2.imshow("image", cv2.cvtColor(annotated_image, cv2.COLOR_RGB2BGR))
cv2.waitKey(0)
# closing all open windows
cv2.destroyAllWindows()
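If you’re working somewhere without a display (for example, over SSH), cv2.imshow won’t be able to open a window. In that case, saving the annotated image to disk is an easy alternative; this is just my own variation on the script above:
# Alternative: save the annotated image instead of opening a window.
cv2.imwrite("annotated_imagen.jpg", cv2.cvtColor(annotated_image, cv2.COLOR_RGB2BGR))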
Looking Ahead
As we move forward, the next step is comparing the detected pose in an input image to a reference pose. The aim is not just to detect poses but to analyze them in the context of exercise form, providing users with personalized recommendations for improvement.
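To make that concrete, one approach I’m considering (just a sketch of the idea, not final code) is to reduce each pose to joint angles and compare them against a reference. For example, the left elbow angle can be computed from the shoulder, elbow, and wrist landmarks (indices 11, 13, and 15 in MediaPipe’s pose numbering) and then checked against a target angle; the reference angle and tolerance below are hypothetical placeholders, and in practice the reference would come from running the detector on a reference image.
import numpy as np

def joint_angle(a, b, c):
    # Angle at point b (in degrees) formed by landmarks a-b-c,
    # using only the normalized x/y coordinates.
    a, b, c = (np.array([p.x, p.y]) for p in (a, b, c))
    ba, bc = a - b, c - b
    cos_angle = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

# Left elbow angle: shoulder (11), elbow (13), wrist (15).
user_pose = detection_result.pose_landmarks[0]
user_elbow = joint_angle(user_pose[11], user_pose[13], user_pose[15])

reference_elbow = 90.0   # hypothetical target angle for the exercise
tolerance = 15.0         # hypothetical allowed deviation in degrees
if abs(user_elbow - reference_elbow) > tolerance:
    print(f"Elbow angle is {user_elbow:.0f} degrees; aim for about {reference_elbow:.0f}.")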
Stay tuned for more updates as I continue developing this program. Your support is greatly appreciated.
See you in the next post!