An extension of the probGLS algorithm to allow for global use and deal with formatting issues between different device manufacturers.

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md
danielpetterson/probGLSAlgorithm

probGLSAlgorithm

Motivation

The primary motivation behind this package was to eliminate friction points, streamline data formatting and expand the capabilities of the existing packages used to estimate locations from light-level loggers, also known as global location sensor (GLS) tags. Because of hardware limitations in some GLS tag models, namely their inability to record light intensity beyond a small range, the algorithm used in probGLS (Merkel et al. 2016) provides more reliable estimates than the methods available in other packages when tracking species that live in areas of low shading. The SEATRACK project uses this method to great effect to map the distribution and flight patterns of a variety of seabirds native to the North Atlantic and neighbouring areas. Last summer I worked with the Department of Conservation to assess the viability of using this method to estimate the location and frequency of artificial light events for a variety of species native to New Zealand. While the algorithm has been shown to provide accurate estimates under the right conditions, limitations in its functionality, combined with applying it to species with a much larger range than it had previously been used for, led to issues.

Issues with existing packages

Land mask

The land mask is essentially a filter that prevents location estimates from being generated within specified boundaries. In its traditional usage, this means that no location estimates fall on land, but the mask can also be reversed so that estimates only occur on land. probGLS::prob_algorithm offers some preset extensions to the land mask: users can add the Mediterranean, Black, Baltic and Caspian seas. One of the goals of this package was to expand that functionality and allow users to supply custom land mask augmentations. probGLSAlgorithm::modify.land.mask features the following additional preset options:

  • Arctic Ocean
  • North Atlantic Ocean
  • South Atlantic Ocean
  • North Pacific Ocean
  • South Pacific Ocean
  • Southern Ocean

In addition to these, the function also accepts custom coordinates defined by minimum longitude, maximum longitude, minimum latitude and maximum latitude. The output of probGLSAlgorithm::GLS.prob.algorithm includes a map showing the coverage of the applied land mask, so that users can visually confirm the area covered by a custom land mask. The land mask matters because of how the algorithm works: it takes the geographic median of each new set of location estimates as the next point in each track/iteration, and the most probable path is calculated as the median path of all tracks/iterations. During the aforementioned study, location estimates occurred on the other side of North America, which is impossible. This happened because there was no way to restrict estimate locations other than narrowing the bounding box, that is, the expected range of the birds.

The attached image displays the issue clearly. The bounding box needed to allow for tracking around the southern tip of South America, where the birds were known to have travelled previously, but this also allowed estimates to be generated off the east coast of North America. With the ability to fine-tune the allowed area via probGLSAlgorithm::modify.land.mask, this is no longer an issue and skewing is less problematic.
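The idea of restricting estimates with custom coordinate boxes can be sketched in base R. This is only an illustration of the concept, not the implementation of modify.land.mask: the helper `in_excluded_box`, the column names and all coordinates are hypothetical, and the sketch does not handle boxes that cross the antimeridian.

```r
# Hypothetical helper: TRUE for points that fall inside any excluded box.
# `boxes` is a data frame with columns min_lon, max_lon, min_lat, max_lat.
in_excluded_box <- function(lon, lat, boxes) {
  inside <- rep(FALSE, length(lon))
  for (i in seq_len(nrow(boxes))) {
    b <- boxes[i, ]
    inside <- inside |
      (lon >= b$min_lon & lon <= b$max_lon &
         lat >= b$min_lat & lat <= b$max_lat)
  }
  inside
}

# Candidate location estimates (made-up coordinates for illustration).
candidates <- data.frame(
  lon = c(-70, -60, 175, -75),
  lat = c(35, -55, -40, 40)
)

# A rough box off the east coast of North America (illustrative values only).
excluded <- data.frame(min_lon = -80, max_lon = -50, min_lat = 25, max_lat = 50)

# Drop every candidate that falls inside the excluded box.
kept <- candidates[!in_excluded_box(candidates$lon, candidates$lat, excluded), ]
```

Here the two candidates near South America and New Zealand are kept, while the two off North America are dropped before any median is taken.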

Differences in formatting between devices and manufacturers

The frequency of recordings and the recorded values differ between devices, and it's not uncommon for studies to use a mixture of models at times when availability was limited by legal and manufacturing issues. The read.x series of functions was written to simplify converting raw data into the format required for location estimation. This may be especially useful because documentation on older models is hard to come by and often lacking.

Development Process

Initial concept

The package was inspired by my previous work tracking seabirds. It seeks to remedy a few issues that may become apparent to someone trying to do similar work outside of Eurasia. I had planned to include a much larger number of read.x functions for specific device models but realised that I could condense them quite easily. Some devices haven’t been available for sale in over a decade and, given their non-user-replaceable batteries, it is unlikely that their specific formatting still needs to be covered. I wanted the main function GLS.prob.algorithm to be easily interchangeable with probGLS::prob_algorithm, to minimise the work required if a user wanted to migrate from the original version to take advantage of some of the new functionality.

Implementation of ideas

This part was relatively straightforward, with the exception of the sliding bounding box in the main function, which I couldn’t manage to implement in a time- and processing-efficient manner. A much quicker fix was to subset the output of GLS.prob.algorithm and generate a new geographic median value that only takes into account locations within a given hemisphere over a user-specified time frame. This functionality is found in the geo.median.track function. The method is less than ideal because, instead of using the median of \(n\) iterations, the median is often calculated from \(\ll n\) values.
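The hemisphere-restricted median can be sketched as follows. This is a simplified illustration rather than the code of geo.median.track: it uses the coordinate-wise median as a stand-in for a true geographic median, and the data frame layout (`step`, `lon`, `lat`) and all values are assumptions made for the example.

```r
# Simulated algorithm output: one row per iteration at a single time step
# (made-up values; the real output format may differ).
estimates <- data.frame(
  step = c(1, 1, 1, 1, 1),
  lon  = c(170, 172, 174, -70, -72),
  lat  = c(-42, -44, -46, 38, 40)
)

# Keep only southern-hemisphere estimates.
south <- estimates[estimates$lat < 0, ]

# Coordinate-wise median per time step (stand-in for the geographic median).
track <- aggregate(cbind(lon, lat) ~ step, data = south, FUN = median)

# Number of estimates that contributed at each step.
n_used <- aggregate(lat ~ step, data = south, FUN = length)
```

In this toy case only three of the five iterations contribute to the median, which shows why the result rests on fewer than \(n\) values.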

Deployment

The package has been updated and can be found here, but users still need to download a handful of auxiliary data files. These were not included because of their size and because the data files must match the time period covered by the recordings.

Difficulties with the package development

Lack of open-access data

Data for package testing needed to be generated from samples, as it wasn’t possible to find publicly available data that fit the requirements. This is partially due to the very niche purpose of this type of equipment and partially due to it not often being used to track species that have a large latitudinal range. Reference data sets are available in probGLS but are small and already formatted for input into prob_algorithm.

Note: The read.sensor function will process C65-SUPER tag data and will probably work on other models by the same company but files were not available for testing.

Dependencies

I attempted to minimise the number of dependencies, adding them only where they showed a clear benefit, such as terra/tidyterra, which are significantly faster than the raster alternatives. A handful of operations were problematic to get working in base R, so dplyr was required. This seemed a reasonable trade-off, as the tidyverse has continued support and most R users would already have dplyr installed. I opted to use the base R pipe operator (|>) instead of the more commonly seen magrittr pipe (%>%), which does mean that a more recent version of R (4.1.0 or later) is required.
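As a small illustration of the pipe choice, the base R pipe needs no package but only parses on R 4.1.0 or later. The values below are arbitrary and not taken from the package.

```r
# Base R (native) pipe: no package required, available from R 4.1.0.
total_native <- c(1, 4, 9) |> sqrt() |> sum()

# Equivalent nested call, which also runs on older versions of R.
total_nested <- sum(sqrt(c(1, 4, 9)))
```

Both expressions compute the same value; the pipe form only changes how the calls are written.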

Feel free to reach out to me if you have any questions.

Merkel, Benjamin, Richard A Phillips, Sébastien Descamps, Nigel G Yoccoz, Børge Moe, and Hallvard Strøm. 2016. “A Probabilistic Algorithm to Process Geolocation Data.” Movement Ecology 4 (1): 1–11.
