A Practical Hybrid Active Learning Approach for Human Pose Estimation

Sinan Kaplan, Joni Juvonen and Lasse Lensu

Human Pose Estimation

  • Estimate certain keypoints on the human body.

Problem

  • Keypoint annotation is costly.
  • The amount of data required for the task is large.

Solution

  • Apply Active Learning (AL) to train a model iteratively.

AIMS OF THE STUDY

  • Apply Active Learning for Human Pose Estimation in an online environment.
  • Propose a cost-effective hybrid sampling strategy: uncertainty and diversity.

An Overview of The Proposed Method

  • A hybrid approach with uncertainty and diversity sampling.

Baseline Method and Evaluation Metric

  • Random Sampling as a baseline method.
  • Person count accuracy:
      \[ \text{PC\_ACC} = \dfrac{\text{Number of detected persons}}{\text{Total number of persons}} \]
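
The metric above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' evaluation code: the function name, the per-image count inputs, and the choice to cap detections at the true count in each image (so false positives are not credited) are all assumptions.

```python
def person_count_accuracy(detected_counts, true_counts):
    """PC_ACC = detected persons / total persons, pooled over all images."""
    if len(detected_counts) != len(true_counts):
        raise ValueError("one detected count is needed per ground-truth count")
    total = sum(true_counts)
    if total == 0:
        raise ValueError("ground truth contains no persons")
    # Assumption: detections beyond the true count of an image are not credited.
    detected = sum(min(d, t) for d, t in zip(detected_counts, true_counts))
    return detected / total
```

For example, with detected counts [2, 1, 0] and true counts [2, 2, 1], three of five persons are found, so the function returns 0.6.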

Uncertainty Sampling Module

  • Model-based uncertainty sampling.
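
One possible way to turn model output into an uncertainty ranking is sketched below. It assumes that the pose model produces one heatmap per keypoint with values in [0, 1], and that the mean of the per-keypoint peak values serves as the image confidence; the exact score used in the study is not stated here, and both function names are hypothetical.

```python
def image_confidence(heatmaps):
    """Mean of the per-keypoint peak heatmap values (assumed to lie in [0, 1])."""
    peaks = [max(max(row) for row in heatmap) for heatmap in heatmaps]
    return sum(peaks) / len(peaks)


def select_most_uncertain(heatmaps_by_image, k):
    """Return the ids of the k images with the lowest confidence."""
    ranked = sorted(
        heatmaps_by_image,
        key=lambda image_id: image_confidence(heatmaps_by_image[image_id]),
    )
    return ranked[:k]
```

Here `heatmaps_by_image` maps an image id to a list of 2D heatmaps (lists of rows). Images whose keypoint peaks are weak are ranked first and become candidates for annotation.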

Feature Extraction Module

  • Taking advantage of Transfer Learning.

Diversity Sampling Module

  • Use approximate nearest neighbors to reduce sampling cost.
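
To illustrate what diversity sampling over feature vectors can look like, the sketch below uses greedy farthest-point selection with exact brute-force distances. This is a stand-in chosen for clarity: the method described above uses approximate nearest neighbors to lower the cost, and the specific index and selection rule are not given here, so the function name and the greedy rule are assumptions.

```python
import math


def farthest_point_sample(features, k, start=0):
    """Greedily pick k indices, each one farthest from the already-picked set."""
    if not 0 < k <= len(features):
        raise ValueError("k must be between 1 and the number of feature vectors")
    selected = [start]
    # Distance from every point to its nearest selected point so far.
    nearest = [math.dist(f, features[start]) for f in features]
    while len(selected) < k:
        idx = max(range(len(features)), key=nearest.__getitem__)
        selected.append(idx)
        # Exact distances here; an approximate nearest-neighbor index
        # would replace this full pass on a large pool.
        nearest = [min(d, math.dist(f, features[idx]))
                   for d, f in zip(nearest, features)]
    return selected
```

Each pass over the pool costs time proportional to the pool size, which is the cost an approximate index is meant to reduce.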

Experiments

Data

  • Provided by PintaWorks Oy.
  • Environment-dependent variations in the data:
    • camera angle, lighting.
  • Consists of 368x368 grayscale images.
  • Applied augmentations:
    • rotation, translation, scaling, blurring, brightness and contrast

Model

Training Details

  • The TensorFlow stack is used.
  • Five training iterations are conducted for each method (AL and the baseline):
    • At each iteration, 1K samples are selected by the proposed AL strategy and annotated by an oracle (human).
  • Hardware:
    • NVIDIA GeForce GTX 1060 6GB
    • CUDA 10.1 and cuDNN 7.6
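
The iterative procedure in the training details (five rounds, 1K samples per round, annotation by a human oracle) can be sketched as a generic loop. The function and its `select`, `annotate`, and `train` callables are hypothetical placeholders rather than the authors' code, and the sketch assumes that pool items are hashable ids and that no model exists before the first round.

```python
def active_learning_loop(pool, select, annotate, train,
                         iterations=5, batch_size=1000):
    """Run select -> annotate -> retrain for a fixed number of rounds."""
    labeled = []
    model = None  # Assumption: no trained model before the first round.
    for _ in range(iterations):
        batch = select(model, pool, batch_size)  # sampling strategy
        labeled.extend(annotate(batch))          # oracle (human) labels
        chosen = set(batch)
        pool = [item for item in pool if item not in chosen]
        model = train(labeled)                   # retrain on all labels so far
    return model, labeled, pool
```

Passing a random sampler as `select` gives the baseline under the same annotation budget, which keeps the comparison between the two methods fair.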

Validation of AL Strategy

  • Samples with high and low heatmap confidence scores.

Validation of AL Strategy

  • Samples from COCO-val selected for annotation.

Tests

  • Comparison of the methods on the test set.
Test Time Augmentation (TTA) Tests

DISCUSSION

  • Pros:
    • The proposed method improves the pose model significantly.
    • The AL method is able to select diverse samples.
  • Cons:
    • Adversarial samples in which some objects resemble a human shape.
    • Small person size and occlusions are difficult for the pose model.

FUTURE WORK

  • Improve uncertainty sampling module:
    • Eliminate adversarial samples
    • Proposal: use average of Test Time Augmentations
  • Improve Diversity Sampling Module:
    • Hierarchical clustering
    • Combine local features and visual features, possibly image hashes.
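
The proposal to average over Test Time Augmentations could look roughly like the sketch below. It assumes that a confidence function for a single image is available and that each augmentation is a callable returning a transformed image; all names are hypothetical, and whether such averaging actually removes adversarial samples is left open as future work.

```python
def tta_confidence(image, augmentations, confidence_fn):
    """Average a confidence score over several augmented views of one image."""
    if not augmentations:
        raise ValueError("at least one augmentation is required")
    scores = [confidence_fn(augment(image)) for augment in augmentations]
    return sum(scores) / len(scores)
```

The intuition is that a real person should stay confidently detected under mild transformations, whereas a human-shaped object may not, so its averaged score would drop.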

Thank you!

Sinan Kaplan

sinan.kaplan@pintaworks.fi