The primary motivation behind this package was to eliminate friction points, streamline data formatting and expand the capabilities of the existing packages used to estimate locations from light-level loggers/global location sensor (GLS) tags. Due to the hardware limitations of some GLS tag models, namely their inability to record light intensity beyond a small range, the algorithm used in probGLS (Merkel et al. 2016) provides more reliable estimates than the methods available in other packages when tracking species that live in areas of low shading. This method is used to great effect by the SEATRACK project to map the distribution and flight patterns of a variety of seabirds native to the North Atlantic and neighbouring areas. Last summer I worked with the Department of Conservation to assess the viability of using this method to estimate the location and frequency of artificial light events for a variety of species native to New Zealand. While the algorithm has been shown to provide accurate estimates under the right conditions, limitations in its functionality, combined with applying it to species with a much larger range than it had previously been used for, led to issues.
The land mask is essentially a filter that prevents location estimates from being generated within specified boundaries. In its traditional usage this means that no location estimates fall on any land mass, but it can also be reversed so that estimates occur only on land. probGLS::prob_algorithm allows for some preset extensions to the land mask: users can add the Mediterranean, Black, Baltic and Caspian seas. One of the goals of this package was to expand that functionality and allow users to input custom land mask augmentations. probGLSAlgorithm::modify.land.mask features the following additional preset options:
- Arctic Ocean
- North Atlantic Ocean
- South Atlantic Ocean
- North Pacific Ocean
- South Pacific Ocean
- Southern Ocean
In addition to these, the function also accepts custom coordinates defined by minimum longitude, maximum longitude, minimum latitude and maximum latitude. The output of probGLSAlgorithm::GLS.prob.algorithm includes a map displaying the coverage of the applied land mask, so that the area covered by a custom land mask can be confirmed visually if it has been altered. The land mask is important because of the nature of the algorithm: the geographic median of each new set of location estimates is taken as the next point in each track/iteration, and the most probable path is calculated as the median path of all tracks/iterations. During the aforementioned study we had issues with location estimates occurring on the other side of North America, which is impossible. This occurred because there was no way to restrict estimate locations other than narrowing the bounding box (the expected range of the birds).
The attached image displays the issue clearly. The bounding box needed to allow for tracking around the southern tip of South America, where the birds were known to have previously travelled, but this also allowed estimates to be generated off the east coast of North America. With the ability to fine-tune the allowed area via probGLSAlgorithm::modify.land.mask, this is no longer an issue and skewing is less problematic.
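The effect of masking an impossible region can be shown with a minimal base R sketch. This is not the package's implementation: the candidate points are made up, in_box() and simple_median() are hypothetical helpers, the box values are illustrative, and a coordinate-wise median stands in for the geographic median.

```r
# Made-up candidate estimates (longitude, latitude) for a single time step:
# three in the South Pacific and three stray points off eastern North America.
candidates <- data.frame(
  lon = c(-150, -140, -130, -70, -65, -60),
  lat = c( -45,  -50,  -48,  38,  40,  42)
)

# Hypothetical helper: TRUE for points inside a box given as
# minimum longitude, maximum longitude, minimum latitude, maximum latitude.
in_box <- function(lon, lat, box) {
  lon >= box[1] & lon <= box[2] & lat >= box[3] & lat <= box[4]
}

# Illustrative custom mask region covering the western North Atlantic.
mask_box <- c(-80, -40, 20, 60)

# Coordinate-wise median, a simplification of the geographic median.
simple_median <- function(pts) c(lon = median(pts$lon), lat = median(pts$lat))

# Without the mask, the median is pulled between the two clusters.
unmasked <- simple_median(candidates)

# With the mask, stray points are dropped before the median is taken.
keep   <- !in_box(candidates$lon, candidates$lat, mask_box)
masked <- simple_median(candidates[keep, ])

print(unmasked)  # lon -100, lat -3.5: far from either cluster
print(masked)    # lon -140, lat -48: inside the South Pacific cluster
```

In this toy case the unmasked median lands in a location that none of the estimates support, which is the kind of skew a custom mask region is meant to prevent.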
The frequency of recordings and the recorded values differ between devices, and it’s not uncommon for studies to use a mixture of models at times when availability was limited due to legal and manufacturing issues. The read.x series of functions was written to simplify converting the raw data into the format required for location estimation. This may be especially useful because documentation on older models is difficult to come by and often lacking.
The package was inspired by my previous work tracking seabirds. This work seeks to remedy a few issues that may become apparent to someone trying to do similar work outside of Eurasia. I had planned to include a much larger number of read.x functions for specific device models but realised that I could condense them quite easily. Some devices haven’t been available for sale in over a decade and, given their non-user-replaceable batteries, it was unlikely that there was a need to cover their specific formatting. I wanted the main function, GLS.prob.algorithm, to be easily swappable with probGLS::prob_algorithm to minimise the work required if a user wanted to migrate from the original version to take advantage of some of the new functionality.
This part was relatively straightforward, with the exception of the sliding bounding box in the main function, which I couldn’t manage to implement in a time- and processing-efficient manner. A much quicker fix was to subset the output of GLS.prob.algorithm and generate a new geographic median value that only took into account locations within a given hemisphere over a user-specified time frame. This functionality is found in the geo.median.track function. This method is less than ideal because, instead of using the median of \(n\) iterations, the median is often calculated from \(\ll n\) values.
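The idea of subsetting and re-taking the median can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code in geo.median.track: the data frame layout, the column names, the use of latitude sign to define the hemisphere, the time window and the coordinate-wise median are all hypothetical simplifications.

```r
# Made-up output: 5 iterations (tracks) at each of 2 time steps.
est <- data.frame(
  step = rep(1:2, each = 5),
  iter = rep(1:5, times = 2),
  lon  = c(170, 172, 175, -60, -62,  168, 171, 174, 176, -58),
  lat  = c(-44, -46, -45,  35,  37,  -43, -47, -45, -46,  36)
)

# Hypothetical user-specified time frame, expressed as a range of steps.
window <- c(1, 2)

# Keep only southern-hemisphere estimates that fall inside the time frame.
south <- est[est$lat < 0 & est$step >= window[1] & est$step <= window[2], ]

# Coordinate-wise median per time step (a stand-in for the geographic median).
track <- aggregate(cbind(lon, lat) ~ step, data = south, FUN = median)

# Number of iterations that actually contributed to each median.
track$n_used <- as.vector(table(south$step))

print(track)
```

Here each median is built from 3 or 4 of the 5 iterations, which is the drawback described above: the more estimates fall outside the chosen hemisphere, the fewer values support each point on the track.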
The package has been updated and can be found here, but users still need to download a handful of auxiliary data files that were not included because of their size and because the data files must match the time period covered by the recordings.
Data for package testing needed to be generated from samples, as it wasn’t possible to find publicly available data sources that fit the requirements. This is partially due to the very niche purpose of this type of equipment and partially due to it not often being used to track species that have a large latitudinal range. Reference data sets are available in probGLS but are small and already formatted for input into prob_algorithm.
Note: The read.sensor function will process C65-SUPER tag data and will probably work on other models by the same company, but files from those models were not available for testing.
I attempted to minimise the number of dependencies, adding them only where they showed a clear benefit, such as terra/tidyterra, which are significantly faster than the raster alternatives. A handful of operations were problematic to get working in base R, so dplyr was required. This seemed a reasonable trade-off as the tidyverse has continued support and most R users would already have dplyr installed. I opted to use the base R pipe (|>) instead of the more commonly seen magrittr pipe (%>%), which does mean that a more recent version of R is required (the native pipe was introduced in R 4.1.0).
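For readers unfamiliar with the base R pipe, here is a small generic example, unrelated to the package's own code. It runs on R 4.1.0 or later without attaching any package.

```r
# Native pipe: the left-hand value becomes the first argument of the next call.
native <- c(1, 4, 9) |> sqrt() |> sum()

# The same computation written as nested calls.
nested <- sum(sqrt(c(1, 4, 9)))

print(native)  # 6
```

The magrittr equivalent would need library(magrittr) or a package that re-exports its pipe, which is the extra dependency being avoided here.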
Feel free to reach out to me if you have any questions.