Python - Facial and hand recognition using MediaPipe Holistic

Last Updated : 4 Jan, 2023

What is MediaPipe:

Object Detection is one of the leading and most popular use cases in the domain of computer vision. Several object detection models are used worldwide for their particular use case applications. Many of these models have been used as an independent solution to a single computer vision task with its own fixed application. Combining several of these tasks into a single end-to-end solution, in real-time, is exactly what MediaPipe does.

MediaPipe is an open-source, cross-platform Machine Learning framework used for building complex and multimodal applied machine learning pipelines. It can be used to make cutting-edge Machine Learning Models like face detection, multi-hand tracking, object detection, and tracking, and many more. MediaPipe basically acts as a mediator for handling the implementation of models for systems running on any platform which helps the developer focus more on experimenting with models, than on the system.

Possibilities with MediaPipe:

Human Pose Detection and Tracking High-fidelity human body pose tracking, inferring a minimum of 25 2D upper-body landmarks from RGB video frames
Face Mesh 468 face landmarks in 3D with multi-face support
Hand Tracking 21 landmarks in 3D with multi-hand support, based on high-performance palm detection and hand landmark model
Holistic Tracking Simultaneous and semantically consistent tracking of 33 pose, 21 per-hand, and 468 facial landmarks
Hair Segmentation Super realistic real-time hair recoloring
Object Detection and Tracking Detection and tracking of objects in the video in a single pipeline
Face Detection Ultra-lightweight face detector with 6 landmarks and multi-face support
Iris Tracking and Depth Estimation Accurate human iris tracking and metric depth estimation without specialized hardware. Tracks iris, pupil, and eye contour landmarks.
3D Object Detection Detection and 3D pose estimation of everyday objects like shoes and chairs

MediaPipe Holistic:

Mediapipe Holistic is one of the pipelines which contains optimized face, hands, and pose components which allows for holistic tracking, thus enabling the model to simultaneously detect hand and body poses along with face landmarks. one of the main usages of MediaPipe holistic is to detect face and hands and extract key points to pass on to a computer vision model.

Detect face and hands using Holistic and extract key points

The following code snippet is a function to access image input from system web camera using OpenCV framework, detect hand and facial landmarks and extract key points.