Self-Supervised Ethogram Discovery

Wild Griffon vulture (Gyps fulvus) at Gamla Nature Reserve, Israel

Project Summary

As animal-attached tags (‘biologgers’) have become smaller and lighter, there has been an explosion of data collected from wild animals with microphones, accelerometers, and other sensors. This project aims to:

  1. Build and test self-supervised models for the automatic discovery and labeling of behavioral motifs in animal body motion.
  2. Make the data and code publicly available for others to use and contribute to, in the expectation that this will accelerate further research in the field of computational ethology.
“Crowned Elephant Seal” by Etienne Pauthenet

In Detail:

To understand an animal’s behavior, scientists construct an inventory of what types of actions it performs. This inventory, called an ‘ethogram’, is then used to classify observed actions. One can then quantify how frequently an animal performs a specific type of action, and how this rate varies with other factors.
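Once each observation window has been assigned an ethogram category, quantifying behavior rates reduces to counting labels. The sketch below is a minimal illustration of that bookkeeping; the action names and the `observed_actions` sequence are hypothetical, not drawn from any real dataset.

```python
from collections import Counter

# Hypothetical sequence of classified actions for one animal,
# one label per fixed-length observation window.
observed_actions = [
    "foraging", "resting", "foraging", "flying",
    "foraging", "resting", "flying", "foraging",
]

counts = Counter(observed_actions)
total = len(observed_actions)

# Rate of each ethogram category as a fraction of observed windows.
rates = {action: n / total for action, n in counts.items()}
print(rates)  # {'foraging': 0.5, 'resting': 0.25, 'flying': 0.25}
```

In a real analysis these rates would then be compared across individuals, seasons, or environmental conditions to see how behavior varies with other factors.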

Minke Whale by Ari Friedlaender

Goals and Objectives

This project has three main objectives:

  1. The assembly of a public collection of benchmark biologger datasets.
  2. The development of an open-source, self-supervised ML model that discovers biologically meaningful behavioral motifs from unlabeled motion data.
  3. The public release of these data and model code, to serve as a foundation for future research in this area.
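To make the motif-discovery objective concrete, here is a deliberately simple baseline under stated assumptions: synthetic data stands in for a tri-axial accelerometer trace, windows are summarized with hand-crafted statistics, and k-means clusters the windows into candidate motifs. The project's actual model would learn its representation from the data (e.g. with a self-supervised encoder) rather than use these fixed features.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for a tri-axial accelerometer trace (N samples x 3 axes);
# a real record would come from the benchmark biologger datasets.
signal = rng.normal(size=(6000, 3))

# 1. Slice the stream into fixed-length windows (here 50 samples, e.g. 1 s at 50 Hz).
win = 50
windows = signal[: len(signal) // win * win].reshape(-1, win, 3)

# 2. Summarize each window with simple per-axis statistics
#    (a learned self-supervised encoder would replace this step).
feats = np.concatenate([windows.mean(axis=1), windows.std(axis=1)], axis=1)

# 3. Cluster the windows; each cluster is a candidate behavioral motif.
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(feats)
print(labels[:10])
```

The cluster count, window length, and feature set are all illustrative choices; discovering how many motifs a species actually exhibits is part of the research problem itself.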


To build a collection of benchmark biologger datasets, we have identified several open-source data sources (e.g. Jeantet et al., 2020; Vehkaoja et al., 2022) and are working closely with our partners, Dr. Christian Rutz and Dr. Ari Friedlaender, to source additional datasets from their labs and from researchers across the ethology community.


By developing a public benchmark for our ethogram discovery task, we aim to push research at the intersection of ML and biology toward shared benchmarks and evaluation metrics, setting a common standard for future ethogram-discovery models. Just as common benchmarks have accelerated progress in computer vision (Russakovsky et al., 2015), we expect them to have a similar effect in computational ethology.
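One evaluation metric such a benchmark could use is normalized mutual information (NMI) between discovered motif IDs and expert annotations. NMI is useful here because discovered clusters carry arbitrary IDs: a model that recovers the expert categories exactly but with permuted labels still scores perfectly. The label sequences below are hypothetical; the benchmark's actual metrics are still to be defined.

```python
from sklearn.metrics import normalized_mutual_info_score

# Hypothetical expert annotations and model-discovered motif IDs
# for the same sequence of observation windows.
expert = [0, 0, 1, 1, 2, 2, 0, 1]
discovered = [1, 1, 0, 0, 2, 2, 1, 0]

# 'discovered' is a pure relabeling of 'expert' (0 and 1 swapped),
# so the score is 1.0 despite the mismatched IDs.
score = normalized_mutual_info_score(expert, discovered)
print(score)  # 1.0
```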




Earth Species Project

We are an open-source collaborative and nonprofit dedicated to decoding non-human language.