Rocío Joo


Summarized scientific path

With a background in statistics, my research has been mainly focused on methods for the study of movement, mostly in fisheries and quite recently on seabirds (I’m excited to start collaborating on raccoons now, and not only for fieldwork!).

From my interactions not only with fisheries scientists and ecologists, but also with biologists, geologists, oceanographers, statisticians, computer scientists and signal processing experts, I have learned that interdisciplinarity is the key to moving forward in science, and that sustained effort is needed to keep communication flowing and projects working. As a product of my travels (undergraduate and graduate training between France and Peru, postdocs in France and the US, the latter in the Seabirdsound multinational project with people from the Netherlands, the UK and South Africa), I have also learned to consider cultural differences when working in a multinational team.

Hearing ranges

More details on my past and current research projects, groups, publications and students can be found below.

Recent available presentations

  • Navigating through the R packages for movement. useR!2019 Conference. Toulouse, France. July 9th to 12th, 2019. Check it out here.

  • Reviewing a decade of movement ecology for conservation. Greater Everglades Ecosystem Restoration Conference. Coral Springs, USA. April 22nd to 25th, 2019. Check it out here.

  • Metrics for describing dyadic joint movement. International Statistical Ecology Conference, July 2nd to 6th, 2018. Check it out here.

More presentations can be found here

Recent available publications

  • Joo, Boone, Clay, Patrick, Clusella-Trullas, Basille. Navigating through the R packages for movement. Journal of Animal Ecology (accepted). Pre-print here

  • Joo, Etienne, Bez, Mahevas. Metrics for describing dyadic movement: a review. 2018. Movement Ecology 6 (26). Available here

Research topics

Here are some of the topics I’ve been working on:

Machine learning and stochastic processes to identify behaviors within trajectories

  • Tracking devices make it possible to follow the movement of humans and animals, in many cases around the globe. Because we can often record movement but not the behaviors behind it, we expect that patterns of movement (or of its absence) will provide information to identify these behaviors. For fisheries, identifying the behavioral modes or activities performed during fishing trips leads to better quantification of the spatial effort deployed, which can serve as input for management decisions and even for changing the fishing quota (Poos et al. 2010, Hornborg et al. 2017). In ecology, identifying behaviors in animal trajectories makes it possible to assess changes in migration or foraging strategies and in the deployment of effort. If those changes are associated with recognizable anthropogenic or environmental effects, this could lead to changes in conservation strategies. In this section, I present several statistical methods I have used to identify behavioral modes in fisheries and animal tracking data. The works are presented chronologically, starting with my undergraduate internship at Instituto del Mar del Perú (IMARPE).

  • IMARPE had two data sources to help characterize the spatial distribution of fishing effort: on one hand, geolocation data from the vessel monitoring system (VMS, one record per hour) available for the whole fishing fleet (tracking every single fishing trip), and on the other hand, data from an on-board observer program sampling ~1% of the fishing trips performed by the fleet (recording time and location of fishing, searching and cruising activities, and a description of catches). In many fisheries around the world, a speed threshold is used to detect fishing activities from VMS data, leading to an overestimation of fishing sets. For the Peruvian anchovy fishery, the overestimation was ~182% (Bertrand et al. 2008). Instead of relying on a speed threshold, I calibrated artificial neural networks with the on-board observer data to identify fishing sets from the VMS data (time of day, speed and turning angle were derived from the data). This new method reduced the overestimation to 1% (Joo 2009 bachelor thesis, and Joo et al. 2011).

  • In order to infer the sequences of activities performed during fishing trips (i.e. fishing, searching and cruising; Fig. 1), and not only fishing sets, I compared two types of models: on one hand, hidden Markov and semi-Markov models (HMMs and HSMMs, respectively), and on the other, discriminative machine learning methods such as random forests, support vector machines and artificial neural networks (Joo 2013 PhD thesis, and Joo et al. 2013). Here too, parameter estimation and model validation were performed using on-board data. The HSMMs had the best performance (80% accuracy). The main scientific contributions of this work were: 1) the first application of hidden semi-Markov models to fishing trajectories, 2) the first comparison of models to infer behavioral modes (i.e. activities) in movement ecology (animals/humans), and 3) with ~3000 tracks with groundtruthed (observed) data, the largest sample in the literature for objectively validating behavioral mode identification in fisheries and ecology.

Fig.1: Inferred behavioral modes. Left panel: fishing vessel track (Joo et al. 2013). Right panel: seabird track (Clay et al. in prep).

  • Since it is unusual to have groundtruthed data on behavioral modes (e.g. on-board data in fisheries science, or video data in animal ecology), Markovian models are commonly used in a non-supervised framework, with Expectation-Maximization algorithms for parameter estimation. Using the on-board observer data collected by IMARPE, I evaluated HMMs within supervised and non-supervised frameworks, and showed that, for the Peruvian anchovy fishing trips, the supervised method (i.e. parameter estimation using groundtruthed data) provided better results than the non-supervised one (+10% accuracy for fishing and +25% for searching activities). Results are sensitive to the number of tracks with on-board data (presented at ISEC as Joo et al. 2014). These findings highlight the importance of on-board observer programs in fisheries for accurately estimating fishing spatial effort and, for animal ecology, the need to put more effort into recording groundtruthed behavioral data.

  • I collaborated with colleagues from Mexico and Brazil to adapt some of the above-mentioned methods to artisanal fisheries in their countries, to identify fishing activities from GPS tracks: random forests for the Mexican Yucatan fishery (presented at an international conference on coastal fisheries as Torres et al. 2014; article in prep.) and HSMMs for several fisheries in the North-East of Brazil (Santos Da Silva 2017, supervision of Bachelor thesis, main advisor M. Thomé de Souza). I also collaborated in the AMPED project for marine protected areas (D. Kaplan, IRD), where there was a need to characterize tuna fishing strategies associated with changing the location of fish aggregating devices (FADs; floating objects deployed to attract the fish). I implemented a hybrid HSMM-random-forest model on trajectory data from FADs to identify the segments of their tracks that corresponded to being on board vessels, and matched those with the VMS data to identify these vessels. The model reached 97% accuracy. Part of that work was published in Mauffroy et al. (2015).

  • I am currently working within the Seabirdsound team investigating the role of infrasound and meteorological conditions in seabird navigation, leading the movement ecology axis of the project. In a first stage of the project, we are investigating the role of wind in the behavior of wandering albatrosses during their foraging trips. Wandering albatrosses are soaring birds, flying great distances almost without flapping their wings. Thus, wind should play a great role in their movement. We are fitting HMMs to GPS tracks to assess the effect of wind speed and direction on behavioral changes (between directed flight, searching and resting; Fig. 1, right panel). Paper to come!

  • Jenicca Poongavanan, our lab’s new master’s student, has just started working with me on a generalized version of HSMMs that could be fitted to different types of tracking data, with an R package as a potential end product.
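The supervised framework described in the list above can be sketched in Python under strong simplifying assumptions: a plain HMM rather than the HSMMs that performed best, each observation reduced to a single discrete symbol (for instance a speed class) instead of the speed, turning angle and time of day used in the original work, and count-based parameter estimates with additive smoothing. The function names, the smoothing constant and the toy data are hypothetical and only meant to illustrate the idea.

```python
import math

# Activity labels taken from the text; everything else is illustrative.
STATES = ["fishing", "searching", "cruising"]


def fit_supervised(tracks, n_symbols, alpha=1.0):
    """Estimate HMM parameters by counting over labelled tracks.

    tracks: list of tracks, each a list of (state, symbol) pairs, where
    symbol is an integer in range(n_symbols) (e.g. a speed class).
    alpha: additive smoothing so unseen events keep a non-zero probability.
    """
    init = {s: alpha for s in STATES}
    trans = {s: {t: alpha for t in STATES} for s in STATES}
    emit = {s: [alpha] * n_symbols for s in STATES}
    for track in tracks:
        if not track:
            continue
        init[track[0][0]] += 1
        for state, symbol in track:
            emit[state][symbol] += 1
        for (state, _), (next_state, _) in zip(track, track[1:]):
            trans[state][next_state] += 1
    # Turn counts into probabilities.
    total = sum(init.values())
    init = {s: v / total for s, v in init.items()}
    for s in STATES:
        row_total = sum(trans[s].values())
        trans[s] = {t: v / row_total for t, v in trans[s].items()}
        emit_total = sum(emit[s])
        emit[s] = [v / emit_total for v in emit[s]]
    return init, trans, emit


def viterbi(symbols, init, trans, emit):
    """Most likely state sequence for the observed symbols (log domain)."""
    if not symbols:
        return []
    score = {s: math.log(init[s]) + math.log(emit[s][symbols[0]]) for s in STATES}
    backpointers = []
    for symbol in symbols[1:]:
        new_score, pointers = {}, {}
        for t in STATES:
            best = max(STATES, key=lambda s: score[s] + math.log(trans[s][t]))
            pointers[t] = best
            new_score[t] = (score[best] + math.log(trans[best][t])
                            + math.log(emit[t][symbol]))
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Toy labelled trip: symbol 0 = slow, 1 = medium, 2 = fast (hypothetical classes).
trip = [("cruising", 2), ("cruising", 2), ("searching", 1), ("searching", 1),
        ("fishing", 0), ("fishing", 0), ("cruising", 2), ("cruising", 2)]
init, trans, emit = fit_supervised([trip] * 5, n_symbols=3)
decoded = viterbi([2, 2, 1, 0, 0, 2], init, trans, emit)
```

In a non-supervised setting the counting step would be replaced by Expectation-Maximization, since no state labels are available; the decoding step stays the same.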

Multivariate analysis and geostatistics for fishing spatial effort

A mix of multivariate methods (e.g. principal components, coinertia, hierarchical clustering) allowed us to identify tactics and strategies in the Peruvian industrial anchovy fishery, and their associations with abiotic and biotic conditions (e.g. fish spatial distribution and abundance, sea surface temperature, oxycline depth, El Niño Southern Oscillation) as well as with characteristics of the vessels and the fishers themselves (Joo et al. 2014 and 2015). For those studies, data from multiple sources were used (on-board observers, raw VMS data, behavioral modes obtained with HSMMs, acoustic data from scientific surveys and satellite data). It required interdisciplinary work with oceanographers, biologists and ecologists to understand the data and the processes in the ecosystem. Two undergraduate students, Marissela Pozada and Omar Salcedo, did their internships within the framework of this research.

Fig.2: Idealized 3D representation of ecological conditions in a given scenario, and associations with fishers’ spatial behavior (Joo et al. 2014).

Spatial effort from a predator can be indicative of the relative abundance and spatial distribution of their prey. With that in mind, I used the behavioral mode information from anchovy fishers to build a proxy of anchovy presence. The activities were aggregated in space and time (cells of ~25 km, 1 month) and used for variogram fitting and kriging interpolation to obtain anchovy presence maps (Joo 2013 PhD thesis). Spatial descriptors were used to compare these maps to maps made from acoustic biomass of anchovy from concomitant scientific surveys (Camasca 2015, bachelor thesis under my supervision). Positive spatial covariations between the two random fields were obtained at coarse scales, but no conclusive results were observed for fine scales (article in prep.).
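As an illustration of the variogram step that precedes kriging, here is a minimal Python sketch of an empirical semivariogram. It assumes the classical (Matheron) estimator, planar coordinates and a single time slice; the source does not state which estimator or software was used, and the function name and toy data are hypothetical.

```python
import math
from itertools import combinations


def empirical_variogram(points, values, bin_width, n_bins):
    """Classical semivariance per distance bin.

    points: list of (x, y) cell centres; values: one value per point
    (e.g. an aggregated activity index). Returns (gamma, counts), where
    gamma[b] is None when no pair of points falls in bin b.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i, j in combinations(range(len(points)), 2):
        distance = math.dist(points[i], points[j])
        b = int(distance // bin_width)
        if b < n_bins:
            sums[b] += (values[i] - values[j]) ** 2
            counts[b] += 1
    gamma = [s / (2 * c) if c else None for s, c in zip(sums, counts)]
    return gamma, counts


# Toy transect: four cells one unit apart with a linear trend in the values.
gamma, counts = empirical_variogram(
    points=[(0, 0), (1, 0), (2, 0), (3, 0)],
    values=[0.0, 1.0, 2.0, 3.0],
    bin_width=1.0,
    n_bins=4,
)
```

A parametric variogram model would then be fitted to these binned semivariances and used as the spatial covariance structure for the kriging interpolation.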

A generalized approach to the random walk debate for human and animal movement

There have been (and still are) multiple debates about the choice of the best random walk model for trajectories of organisms, mainly between Brownian motion and Lévy walk supporters. Generally speaking, the methodological (empirical) part of the debate consisted in comparing the goodness-of-fit of the tail of the distribution of step lengths to Gaussian or power law distributions (corresponding to Brownian and Lévy, respectively). Instead of comparing two possibilities, in Bertrand et al. (2015) we proposed an approach that allowed the most plausible model to emerge from the data, fitting the tail of the distribution of step lengths to a Generalized Pareto distribution that included Gaussian and power law as particular cases. The method was applied to Peruvian fishing vessel and seabird tracking data. The estimated parameters of scale and diffusion from the Generalized Pareto distribution provided information on the foraging strategies of each individual.

Fig.3: Generalized Pareto Distribution, a continuum from Exponential-Poisson to Power-Lévy walk patterns: parameter k of the Generalized Pareto distribution defines a continuum of distributions from light-tailed (k<0) to heavy-tailed (k>0.5). (Bertrand et al. 2015).
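To make the continuum concrete, the sketch below evaluates the tail (survival) function of a Generalized Pareto distribution in Python. It assumes the standard textbook parametrization, in which the shape k = 0 gives an exponential tail, k > 0 a power-law tail and k < 0 a tail with a finite upper bound; the parametrization used in Bertrand et al. (2015) may differ, and the function name is hypothetical.

```python
import math


def gpd_survival(x, k, sigma):
    """P(X > x) for an exceedance x >= 0, with shape k and scale sigma > 0."""
    if x < 0 or sigma <= 0:
        raise ValueError("x must be >= 0 and sigma must be > 0")
    if abs(k) < 1e-12:
        # Limit k -> 0: exponential tail.
        return math.exp(-x / sigma)
    base = 1.0 + k * x / sigma
    if base <= 0.0:
        # Only possible for k < 0: x lies beyond the upper end of the support.
        return 0.0
    return base ** (-1.0 / k)


# Same scale, three shapes: bounded, exponential and power-law tails.
tails = {k: gpd_survival(5.0, k, sigma=1.0) for k in (-0.5, 0.0, 0.5)}
```

Under this convention the probability of a very long step grows with k, which is why an estimated shape parameter can be read as a position along the continuum rather than as a choice between two competing models.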

Metrics for assessing dyadic movement

From a review of collective movement literature, I identified that most data-driven works have relied on metrics that were not measuring what was intended, probably due to an absence of investigations on the theoretical properties of those metrics. During my postdoc at IFREMER, my collaborators and I focused on pairwise joint-movement behavior, where individuals move together during at least a segment of their path. We investigated the adequacy of twelve metrics introduced in the literature for assessing joint movement by analysing their theoretical properties and confronting them with contrasting case scenarios. Two criteria were taken into account for review of those metrics: 1) practical use, and 2) dependence on parameters and underlying assumptions. We also evaluated the ability of each metric to assess specific aspects of joint-movement behavior: proximity (closeness in space-time) and coordination (synchrony) in direction and speed. We found that some metrics are better suited to assess proximity and others are more sensitive to coordination (Joo et al. 2018).

Fig.4: Representation of metrics in terms of their distance relative to proximity and coordination, obtained by studying their mathematical properties. (Joo et al. 2018).
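The distinction between proximity and coordination can be illustrated with two simple indices written in Python. These are generic examples chosen for clarity, not necessarily among the twelve metrics reviewed: the share of simultaneous fixes within a distance threshold, and the mean cosine of the difference between the two headings. The function names, the threshold and the toy tracks are hypothetical, and both tracks are assumed to be sampled at the same times.

```python
import math


def proximity(track_a, track_b, delta):
    """Share of simultaneous fixes where the two individuals are within delta."""
    if len(track_a) != len(track_b) or not track_a:
        raise ValueError("tracks must be non-empty and of equal length")
    close = sum(1 for p, q in zip(track_a, track_b) if math.dist(p, q) <= delta)
    return close / len(track_a)


def heading_coordination(track_a, track_b):
    """Mean cosine of the heading difference: 1 = same direction, -1 = opposite."""
    def headings(track):
        # A zero-length step gets heading 0 from atan2(0, 0); a real analysis
        # would need to handle stationary steps explicitly.
        return [math.atan2(y1 - y0, x1 - x0)
                for (x0, y0), (x1, y1) in zip(track, track[1:])]

    ha, hb = headings(track_a), headings(track_b)
    if len(ha) != len(hb) or not ha:
        raise ValueError("tracks must have equal length and at least two fixes")
    return sum(math.cos(a - b) for a, b in zip(ha, hb)) / len(ha)


a = [(0, 0), (1, 0), (2, 0)]
b = [(0, 1), (1, 1), (2, 1)]   # travels alongside a
c = [(2, 1), (1, 1), (0, 1)]   # crosses a while heading the opposite way

together = (proximity(a, b, delta=1.5), heading_coordination(a, b))
crossing = (proximity(a, c, delta=1.5), heading_coordination(a, c))
```

The two toy dyads show why both aspects matter: the crossing pair is close at one fix out of three yet moves in opposite directions, so a proximity index alone would not separate a brief encounter from joint movement.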

Based on this review, we selected a few metrics and used them to identify dyadic (i.e. pairwise) behavior in fishing vessels using VMS data from several fishing fleets around the world: pelagic and demersal trawlers in the English Channel and the Celtic Sea, pelagic purse-seiners in the Pacific Ocean, and pelagic purse-seiners in the Indian Ocean. For each case study, we applied the same standard statistical method to identify types of interactions (presented at ICES as Joo et al. 2018; article in prep.).

The field of movement ecology has experienced unprecedented growth in the last decade: technological advances have enabled a wide range of sensors to be used by ecologists, and analytical and programming tools have been developed to aid data processing and analysis. Aiming at a comprehensive view of the state of the field, I am currently leading a synthetic and quantitative review of the scientific publications in movement ecology. We focused on the ten-year time span between the formal introduction of the movement ecology paradigm by Nathan et al. (2008) and the end of 2018. We searched the Web of Science to select publications in the field of movement ecology by applying a hierarchy of keyword-based filters and obtained a database of 4417 peer-reviewed papers. We then used a text mining approach to extract text from the publications and assess the number of papers investigating the components of the paradigm (motion, navigation, internal state and external factors), the biologging devices used, the focal taxon, the analytical methods applied and the software used. This work aims at providing an overview of what has been achieved, the directions that are being taken in the field, and the questions, species or methods that are being neglected (presented at GRC as Joo et al. 2019; article in prep.).

In our review, we found that the most used programming software for movement analysis is the free, open source R software. I then identified 59 R packages created for processing or analysis of tracking data. My collaborators and I reviewed and described each package based on a workflow centered around tracking data, broken down into three stages: pre-processing, post-processing and analysis (data visualization, track description, path reconstruction, behavioral pattern identification, space use characterization, trajectory simulation and others) (Joo et al. preprint). Links between packages were assessed through a network graph analysis, which showed that one third of the packages worked in isolation, reflecting a fragmentation in the R movement-ecology programming community. We also provided recommendations for users to choose packages and for developers to maximize the usefulness of their contributions and strengthen the links within the programming community. Throughout this investigation we came to realize that there is still a need for standardized R data classes for trajectory data, and we are currently working on a project to create a package for that purpose. The project team is mainly composed of developers of some of the R packages reviewed, and we plan to extend the discussions on the structure of the classes to the larger community of developers and users, so that it can be relevant for all types of tracking data.
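The notion of a package working in isolation can be expressed as a node with no links in the package network. The Python sketch below assumes that a link is any declared relation between two packages (for instance a dependency or a suggestion); the exact link definition used in the review is not given here, and the package names and function name are placeholders.

```python
def isolated_share(packages, links):
    """Return the isolated packages and their share of all packages.

    packages: iterable of package names.
    links: iterable of (package, other_package) pairs; direction is ignored
    and self-links do not count as connections.
    """
    packages = list(packages)
    if not packages:
        raise ValueError("at least one package is required")
    connected = set()
    for source, target in links:
        if source != target:
            connected.update((source, target))
    isolated = sorted(p for p in packages if p not in connected)
    return isolated, len(isolated) / len(packages)


# Placeholder names, not real R packages.
packages = ["pkgA", "pkgB", "pkgC", "pkgD", "pkgE", "pkgF"]
links = [("pkgA", "pkgB"), ("pkgB", "pkgC"), ("pkgD", "pkgA")]
isolated, share = isolated_share(packages, links)
```

In this toy network two of the six packages have no link to any other, which gives the same one-third share of isolated packages as reported above.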

Fig.5: Proportion of movement ecology papers in each year that used each software (only the 5 most used are shown). (Joo et al. in prep).

Research Groups and Projects



