Gloria Zen

Home  Research  Publications  Datasets 

>> EMD    

Earth Mover's Prototypes:
A Convex Learning Approach for Discovering Activity Patterns in Dynamic Scenes

We present a novel approach for automatically discovering spatio-temporal patterns in complex dynamic scenes. Similarly to recent non-object centric methods, we use low level visual cues to detect atomic activities and then construct clip histograms. Differently from previous works, we formulate the task of discovering high level activity patterns as a prototype learning problem where the correlation among atomic activities is explicitly taken into account when grouping clip histograms. Interestingly at the core of our approach there is a convex optimization problem which allows us to efficiently extract patterns at multiple levels of detail. The effectiveness of our method is demonstrated on publicly available datasets.

Download paper: [cvpr2011.pdf], video demo: [demo.avi], presentation: [slides].

Data and code for convex learning: [].

Unsupervised Convex Clustering


  • The [] code contains the implementation of unsupervised convex clustering based on:
    i) EMD with L1 distance over bins as ground distance (EMD-L1)
    ii) L1 distance (L1).
  • In order to use this code, you need to install the GLPK (GNU Linear Programming Kit) package, available at:
  • Sample data from the Junction dataset (40 clips) is provided

Problem formulation

Given a set of histograms ,
we aim to learn N representative prototypes ,
where D is the number of atomic activities.

This task is formalized as a convex optimization problem:

Loss: similarity between prototype pi and associated histogram hi
Regularization: smoothness among neighboring prototypes ( pi is fused into pj )
ηij = {0, 1} indicates prototype's neighborhood.

An efficient variation of the Earth's Mover Distance (EMD) is adopted in the Loss. This allows to consider similarity between atomic activities when comparing histograms.

In practise...

Given a set of histograms, it is possible to obtain a multiscale clustering by simply varying the value of parameter λ.
(see sample figure below, with N=4 and D=8)