Week 5: Diving Into Pose Detection with MediaPipe
April 4, 2024
Hello again, and welcome back to my senior project blog! This week, I’m thrilled to share the first working version of the pose detection program for the virtual personal trainer app: it can detect pose landmarks in an input image using the MediaPipe framework. Let’s dive into the details of setting up the project environment and understanding the code.
Setting Up the Environment
Before diving into the code, let’s talk about the setup. Ensuring compatibility and a smooth-running environment is crucial for any project.
First, make sure your Python version is between 3.9 and 3.12, since MediaPipe only supports a specific range of Python releases.
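If you want a quick sanity check before going any further, a few lines of Python will confirm what you’re running (this snippet is my own addition, using the version range mentioned above):
import sys

# Print the interpreter version and warn if it falls outside the 3.9-3.12
# range mentioned above (adjust if MediaPipe's requirements change).
print("Running Python", sys.version.split()[0])
if not ((3, 9) <= sys.version_info[:2] <= (3, 12)):
    print("Warning: this Python version may not be supported by MediaPipe.")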
I started by creating a new directory with a virtual environment named mediapipe-env in my terminal:
mkdir mediapipe
cd mediapipe
python3 -m venv mediapipe-env
.\mediapipe-env\Scripts\activate
After activating the environment (the command above is the Windows form; on macOS or Linux, run source mediapipe-env/bin/activate instead), move your desired input image into the directory. The next step is to install MediaPipe and download a specific pose landmarker model:
pip install mediapipe
wget -O pose_landmarker.task https://storage.googleapis.com/mediapipe-models/pose_landmarker/pose_landmarker_heavy/float16/1/pose_landmarker_heavy.task
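Note that wget isn’t installed by default on Windows, so if that command fails, the same model file can be fetched with a few lines of Python instead. This is just an alternative I’m noting here, not part of the official setup:
# Fetch the pose landmarker model using only the standard library.
import urllib.request

MODEL_URL = ("https://storage.googleapis.com/mediapipe-models/pose_landmarker/"
             "pose_landmarker_heavy/float16/1/pose_landmarker_heavy.task")
urllib.request.urlretrieve(MODEL_URL, "pose_landmarker.task")
print("Saved pose_landmarker.task")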
The Pose Detection Program
MediaPipe and OpenCV Initialization: The program begins by importing the necessary libraries from MediaPipe and OpenCV, setting the stage for pose detection and image manipulation:
import cv2
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2
import numpy as np
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision
Pose Detection Setup: The program defines a helper function that draws the detected landmarks onto a copy of the image, then creates a PoseLandmarker detector from the downloaded pose_landmarker.task model:
def draw_landmarks_on_image(rgb_image, detection_result):
    pose_landmarks_list = detection_result.pose_landmarks
    annotated_image = np.copy(rgb_image)

    # Looping through the detected poses to visualize.
    for idx in range(len(pose_landmarks_list)):
        pose_landmarks = pose_landmarks_list[idx]

        # Drawing the pose landmarks.
        pose_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
        pose_landmarks_proto.landmark.extend([
            landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z)
            for landmark in pose_landmarks
        ])
        solutions.drawing_utils.draw_landmarks(
            annotated_image,
            pose_landmarks_proto,
            solutions.pose.POSE_CONNECTIONS,
            solutions.drawing_styles.get_default_pose_landmarks_style())
    return annotated_image
# Creating a PoseLandmarker object
base_options = python.BaseOptions(model_asset_path='pose_landmarker.task')
options = vision.PoseLandmarkerOptions(
    base_options=base_options,
    output_segmentation_masks=True)
detector = vision.PoseLandmarker.create_from_options(options)
Loading and Processing the Input Image: An input image is loaded and processed to detect human poses:
# Loading the input image
image = mp.Image.create_from_file("imagen.jpg")
# Detecting pose landmarks from the input image.
detection_result = detector.detect(image)
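Before drawing anything, it can be helpful to peek at what the detector actually returns. detection_result.pose_landmarks is a list with one entry per detected person, and each entry is a list of landmarks with the same x, y, and z fields the drawing function above uses. A small, optional inspection snippet might look like this:
# Optional: inspect the raw detection result.
print(f"Detected {len(detection_result.pose_landmarks)} person(s)")
for person_landmarks in detection_result.pose_landmarks:
    # Landmark 0 is the nose in MediaPipe's pose model; x and y are
    # normalized to [0, 1] relative to image width and height.
    nose = person_landmarks[0]
    print(f"Nose at x={nose.x:.2f}, y={nose.y:.2f}, z={nose.z:.2f}")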
Visualization: The detected landmarks are then drawn onto the image and displayed in an OpenCV window:
# Visualizing pose landmarks
annotated_image = draw_landmarks_on_image(image.numpy_view(), detection_result)
cv2.imshow("image", cv2.cvtColor(annotated_image, cv2.COLOR_RGB2BGR))
cv2.waitKey(0)
# closing all open windows
cv2.destroyAllWindows()
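If you’re working somewhere without a display (for example, over SSH), cv2.imshow won’t be able to open a window. In that case, saving the annotated image to disk is an easy alternative; this is just my own variation on the script above:
# Alternative: save the annotated image instead of opening a window.
cv2.imwrite("annotated_imagen.jpg", cv2.cvtColor(annotated_image, cv2.COLOR_RGB2BGR))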
Looking Ahead
As we move forward, the next step is comparing the detected pose in an input image to a reference pose. The aim is not just to detect poses but to analyze them in the context of exercise form, providing users with personalized recommendations for improvement.
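To make that concrete, one approach I’m considering (just a sketch of the idea, not final code) is to reduce each pose to joint angles and compare them against a reference. For example, the left elbow angle can be computed from the shoulder, elbow, and wrist landmarks (indices 11, 13, and 15 in MediaPipe’s pose numbering) and then checked against a target angle; the reference angle and tolerance below are hypothetical placeholders, and in practice the reference would come from running the detector on a reference image.
import numpy as np

def joint_angle(a, b, c):
    # Angle at point b (in degrees) formed by landmarks a-b-c,
    # using only the normalized x/y coordinates.
    a, b, c = (np.array([p.x, p.y]) for p in (a, b, c))
    ba, bc = a - b, c - b
    cos_angle = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

# Left elbow angle: shoulder (11), elbow (13), wrist (15).
user_pose = detection_result.pose_landmarks[0]
user_elbow = joint_angle(user_pose[11], user_pose[13], user_pose[15])

reference_elbow = 90.0   # hypothetical target angle for the exercise
tolerance = 15.0         # hypothetical allowed deviation in degrees
if abs(user_elbow - reference_elbow) > tolerance:
    print(f"Elbow angle is {user_elbow:.0f} degrees; aim for about {reference_elbow:.0f}.")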
Stay tuned for more updates as I continue developing this program. Your support is greatly appreciated.
See you in the next post!