The Modelling of Movement of Multiple Animals that Share Behavioural Features

Abstract

In this work, we propose a model that can be used to infer the behaviour of multiple animals. Our proposal is defined as a set of hidden Markov models that are based on the sticky hierarchical Dirichlet process, with a shared base-measure, and a step and turn with an attractive point (STAP) emission distribution. The latent classifications are representative of the behaviour assumed by the animals, which is described by the STAP parameters. Given the latent classifications, the animals are independent. As a result of the way we formalize the distribution over the STAP parameters, the animals may share, in different behaviours, the set or a subset of the parameters, thereby allowing us to investigate the similarities between them. The hidden Markov models, based on the Dirichlet process, allow us to estimate the number of latent behaviours for each animal, as a model parameter. This proposal is motivated by a real data problem, where the global positioning system (GPS) coordinates of six Maremma Sheepdogs have been observed. Among the other results, we show that four dogs share most of the behaviour characteristics, while two have specific behaviours.

attractive-point, directional-persistence, Maremma Sheepdogs, STAP density

1 INTRODUCTION

Movement data are often based on a time-series of two-dimensional spatial coordinates recorded using a global positioning system (GPS) device attached to an animal. Since the first paper by Dunn and Gipson (1977), the statistical models used to analyse such data have become increasingly popular, and are used to understand different aspects of the movement of animals, ranging from habitat selection (Hebblewhite & Merrill, 2008) to behaviour analysis (Anderson & Lindzey, 2003; Maruotti et al., 2016; Mastrantonio, 2018; Merrill & David Mech, 2000); for a detailed review, the reader may refer to Hooten et al. (2017).

Three major categories of movement-description models can be identified in the behaviour modelling framework: biased random walks (BRWs), correlated random walks (CRWs), and biased and correlated random walks (BCRWs). In a BRW, the animal movement is attracted (or biased) toward a point in space, which is called the center-of-attraction (see, e.g., Blackwell, 1997; Dunn & Gipson, 1977). The center-of-attraction can be interpreted as a proxy of the home-range (Christ et al., 2008) or it can describe a movement toward a patch of space (McClintock et al., 2012). In a CRW, the movement direction, at any given time, depends on the previous direction. This characteristic is called directional persistence (Jonsen et al., 2005) and it is useful to describe a constant change in direction between consecutive observations. If both directional persistence and attractors are used to describe a movement, the model is a BCRW (Codling et al., 2008; Fortin et al., 2005; McClintock et al., 2012). A movement-description model is generally used as the emission distribution of a mixture-type model, where a latent cluster-membership variable is used to identify the behaviour assumed by an animal. If the observed time-window is wide enough, the use of a mixture-type model is justified by the assumption that an animal exhibits more than one behaviour during the day (Patterson et al., 2008), e.g., sleeping and hunting. The switching between behaviours is often temporally structured and, if formulated in a discrete-time framework, the model is usually a hidden Markov model (HMM) (Langrock et al., 2012; Michelot et al., 2016).

The literature on the modelling of multiple animals is not as extensive as that on single individuals, even though coordinates of different animals are often recorded. Nonetheless, interest in this topic is increasing (see, e.g., Westley et al., 2018) since, as shown by Jonsen (2016), the joint modelling of multiple animals often increases the precision of the estimates. By adopting the classification given by Scharf and Buderman (2020), it is possible to model multiple animals using two approaches. In the first approach, called indirect, the parameters that govern the behaviours are seen as random effects across animals, that is, they come from a common distribution, whose parameters must be estimated, and the animals are conditionally independent (see, e.g., Buderman et al., 2018; McClintock et al., 2013; Michelot et al., 2017). In the second approach, called direct, the dependence between animals is described by an unobserved graph or social network (see, e.g., Calabrese et al., 2018; Hooten et al., 2018; Milner et al., 2021; Niu et al., 2020).

We here propose a Bayesian model which can be used to describe multiple animals that share certain movement characteristics, observed over different time-windows. The model is based on the hierarchical Dirichlet process (HDP) (Teh et al., 2006) and it is a generalization of the sticky hierarchical Dirichlet process HMM (sHDP-HMM) of Fox et al. (2011). In the model, given the latent classification and likelihood parameters, the animals are independent and the behaviours are described by the five parameters of the step and turn with an attractive point (STAP) distribution, which is a BCRW emission-distribution that has recently been proposed by Mastrantonio (2020). The main contributions of the present proposal are the possibility of estimating the number of latent behaviours of each animal as model parameters, and of introducing the sharing of parameters between behaviours and animals in HMMs based on Dirichlet processes (DPs). The former is a by-product of the DP modelling, which allows us to avoid the use of information criteria to select the number of behaviours, which has been shown to be problematic in this context (Pohle et al., 2017), or a trans-dimensional Markov chain Monte Carlo (MCMC) algorithm, such as the reversible jump MCMC (RJMCMC), which presents challenges in its implementation (Hastie & Green, 2012). The sharing of parameters is introduced in the lower level of the model hierarchy, where the distribution over the STAP parameters is defined by combining five different DPs. This distribution is discrete, with a countable number of atoms being defined so that they can share some of their multivariate components. This approach is similar to the one proposed by Mastrantonio et al. (2021), which was used to model climate data in a change-point framework. Therefore, the behaviours within or between animals can have the same value of an STAP parameter, and this allows us to investigate similarities and differences between the analysed animals. 
Other approaches also exist that have the sharing of parameters as one of their characteristics (see, e.g., Jonsen (2016) or Milner et al. (2021)), but they require one to select, a priori, what parameters are allowed to change and the number of values that a parameter can assume. However, in our proposal, everything is done during the model fitting and driven by the information within the data.

Our proposal has been used to model the trajectories of six Maremma Sheepdogs, observed in Australia, whose coordinates were recorded every 30 min. These dogs are used all over Europe and Asia to protect livestock from possible predators and, in recent years, also in Australia (see, e.g., van Bommel & Johnson, 2016; Gehring et al., 2017). Maremma Sheepdogs are able to work in synergy with the shepherd to keep the stock together, but this is not always possible when the property is too large. For this reason, the dogs are often left alone and are rarely visited by the shepherd. The owner has no supervision over the dogs and it is therefore interesting to analyse and understand their behaviour. The dataset used was taken from the Movebank repository (www.movebank.org) and is described in detail in van Bommel and Johnson (2014a) and van Bommel and Johnson (2014b). With our model, we have identified many similarities and some specific features between dogs, which are easy to interpret and give a better insight into the behaviour of the dogs. Two competing models have also been estimated on the same data and the results are compared with our proposal.

The paper is organized as follows. We introduce the STAP density in Section 2, and the hierarchical formalization of our proposal in Section 3, while Section 4 contains the results of the real data application. The paper ends with some concluding remarks in Section 5. The Web-based Supporting Materials, available on the web page of the journal, contain details of the MCMC algorithm and of the results of the competing models.

2 THE STAP DISTRIBUTION

With the aim of better understanding the results of the real data application (Section 4) and the formalization of our proposal, we briefly describe the STAP distribution and its parameters; for a more detailed description, the reader may refer to Mastrantonio (2020), where the distribution was introduced.

We assume we have a time-series of two-dimensional spatial locations $s = (s_{t_{1}}, \dots, s_{t_{T}})^{'}$ that represent an animal's path, where $s_{t_{i}} = (s_{t_{i}, 1}, s_{t_{i}, 2}) \in D \subset \mathbb{R}^{2}$, and $t_{i}$ is a temporal index. The coordinates are recorded without any measurement error and the time difference between consecutive points is constant. In order to formalize the STAP, we introduce the bearing angle

$$ϕ_{t_{i}} = \text{atan}^{*} (s_{t_{i + 1}, 2} - s_{t_{i}, 2}, s_{t_{i + 1}, 1} - s_{t_{i}, 1}) \in [- π, π),$$

and the rotation matrix

$$R (x) = \begin{pmatrix} \cos (x) & - \sin (x) \\ \sin (x) & \cos (x) \end{pmatrix},$$

where $\text{atan}^{*} (\cdot)$ is the two-argument inverse tangent function (Jammalamadaka & Kozubowski, 2004). The bearing angle measures the direction of the movement between time $t_{i}$ and $t_{i + 1}$, and the rotation matrix can be used to perform a rotation in a two-dimensional space, so that if it multiplies a two-dimensional vector, the vector is rotated anti-clockwise by an angle x. The conditional distribution of $s_{t_{i + 1}}$ is assumed to be second-order Markovian, with the following specification:

$$\begin{aligned}
s_{t_{i + 1}} | s_{t_{i}}, s_{t_{i - 1}} & \sim N (s_{t_{i}} + M_{t_{i}}, V_{t_{i}}), \quad i \in \{1, \dots, T - 1\}, \\
M_{t_{i}} & = (1 - ρ) τ (μ - s_{t_{i}}) + ρ R (ϕ_{t_{i - 1}}) η, \\
V_{t_{i}} & = R (ρ ϕ_{t_{i - 1}}) Σ R^{'} (ρ ϕ_{t_{i - 1}}),
\end{aligned} \tag{1}$$

where $μ, η \in \mathbb{R}^{2}$, τ ∈ (0, 1), ρ ∈ [0, 1], and Σ is a two-dimensional covariance matrix. The location $s_{t_{1}}$ is fixed, and $s_{t_{0}}$ is an additional parameter that is needed to compute $ϕ_{t_{0}}$ in the conditional distribution of $s_{t_{2}}$. If the path follows Equation (1), we write $s_{t_{i + 1}} | s_{t_{i}}, s_{t_{i - 1}} \sim \text{STAP} (θ)$, with θ = (μ, η, Σ, τ, ρ).

The movement described by the STAP can have directional persistence and attraction to a point in space; therefore, the STAP is a BCRW. To better understand these two properties and how they are formalized in Equation (1), we introduce the vector ${\vec{F}}_{t_{i}}$, which has initial and terminal points equal to $s_{t_{i}}$ and $s_{t_{i}} + M_{t_{i}}$, respectively. This vector represents the expected movement between time $t_{i}$ and $t_{i + 1}$, since its initial point is the previously observed location $s_{t_{i}}$ and the terminal one is equal to $E (s_{t_{i + 1}} | s_{t_{i}})$. If ρ = 0, the STAP reduces to a two-dimensional AR(1) (a BRW), and ${\vec{F}}_{t_{i}}$ points to the spatial location μ, which is therefore the attractor. The length of ${\vec{F}}_{t_{i}}$ is $τ \| μ - s_{t_{i}} \|$, which shows that τ measures how much of the total distance between the last observation ($s_{t_{i}}$) and the attractor (μ) is expected to be covered or, in other words, how strong the attraction to μ is. If ρ = 1, the STAP reduces to a CRW, based on a normal density. In this case, the direction of ${\vec{F}}_{t_{i}}$ is the same as the direction of $R (ϕ_{t_{i - 1}}) η$, which depends on the previous bearing angle $ϕ_{t_{i - 1}}$, and thus induces a directional correlation between consecutive points. If ρ ∈ (0, 1), ${\vec{F}}_{t_{i}}$ is a weighted mean between its value in a BRW and a CRW, with weights given by (1 − ρ) and ρ, respectively. The covariance matrix of the conditional distribution of $s_{t_{i + 1}}$ is fixed in a BRW ($\text{Cov} (s_{t_{i + 1}} | s_{t_{i}}) = Σ$), while it rotates with the bearing angle for any ρ > 0 ($\text{Cov} (s_{t_{i + 1}} | s_{t_{i}}) = R (ρ ϕ_{t_{i - 1}}) Σ R^{'} (ρ ϕ_{t_{i - 1}})$); for more details, the reader may refer to Mastrantonio (2020).
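As an illustration of Equation (1) and of the two limiting cases discussed above, the following is a minimal Python sketch that computes the expected displacement $M_{t_{i}}$ and the covariance $V_{t_{i}}$ from the two previous locations. The function names (`rotation`, `bearing`, `stap_moments`) and the example locations are our own illustrative choices, not part of the original implementation; the bearing is computed with `math.atan2`, which returns values in [−π, π] rather than [−π, π).

```python
import math

def rotation(x):
    """Anti-clockwise 2x2 rotation matrix R(x), stored as nested lists."""
    c, s = math.cos(x), math.sin(x)
    return [[c, -s], [s, c]]

def matvec(A, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]

def bearing(s_prev, s_curr):
    """Bearing angle of the move s_prev -> s_curr (y increment first, then x)."""
    return math.atan2(s_curr[1] - s_prev[1], s_curr[0] - s_prev[0])

def stap_moments(s_prev, s_curr, mu, eta, Sigma, tau, rho):
    """Return (M, V) of Equation (1) for the step that follows s_curr."""
    phi = bearing(s_prev, s_curr)
    attraction = [tau * (mu[d] - s_curr[d]) for d in range(2)]   # BRW part
    persistence = matvec(rotation(phi), eta)                     # CRW part
    M = [(1 - rho) * attraction[d] + rho * persistence[d] for d in range(2)]
    R = rotation(rho * phi)
    V = matmul(matmul(R, Sigma), transpose(R))
    return M, V

# Hypothetical locations; mu, eta, tau and Sigma are the values quoted for Figure 1.
s_prev, s_curr = (4.0, 0.0), (4.0, 2.0)
mu, eta, tau = (0.0, 0.0), (0.0, 6.0), 0.25
Sigma = [[0.2, 0.0], [0.0, 1.0]]
M_brw, V_brw = stap_moments(s_prev, s_curr, mu, eta, Sigma, tau, rho=0.0)
M_crw, V_crw = stap_moments(s_prev, s_curr, mu, eta, Sigma, tau, rho=1.0)
```

With ρ = 0 the displacement is τ(μ − s) and the covariance is Σ itself; with ρ = 1 the displacement is η rotated by the previous bearing, and Σ is rotated by the same angle.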

We show examples of STAP densities, and the associated BRW (ρ = 0) and CRW (ρ = 1), in Figure 1, to better understand the differences between the BRW, CRW and BCRW. The dashed arrow in the figure is the movement between times $t_{i - 1}$ and $t_{i}$, the solid arrow is ${\vec{F}}_{t_{i}}$, and the ellipse is a contour of the conditional distribution of $s_{t_{i + 1}}$ with a constant density containing 95% of the total probability mass. From Figure 1a, we can see that the direction of ${\vec{F}}_{t_{i}}$ and the ellipse change according to $ϕ_{t_{i - 1}}$ in a CRW. However, ${\vec{F}}_{t_{i}}$ and the ellipse are independent of the previous direction in a BRW (Figure 1b), and ${\vec{F}}_{t_{i}}$ points to the spatial attractor $μ = (0, 0)^{'}$. When ρ ∈ (0, 1), the ellipse and ${\vec{F}}_{t_{i}}$ depend on both the previous direction and the spatial attractor, see Figures 1c and 1d.

FIGURE 1

Graphical representation of the conditional distribution of $s_{t_{i + 1}}$ ((a) CRW, (b) BRW, (c), (d) BCRW), for different possible values of $s_{t_{i}}$ and the previous directions. The dashed arrow represents the movement between $s_{t_{i - 1}}$ and $s_{t_{i}}$. The solid arrow is ${\vec{F}}_{t_{i}}$, while the ellipse is the area containing 95% of the probability mass of the conditional distribution of $s_{t_{i + 1}}$. $μ = (0, 0)^{'}$, $η = (0, 6)^{'}$, τ = 0.25, $Σ = \begin{pmatrix} 0.2 & 0 \\ 0 & 1 \end{pmatrix}$ in all figures, and the central dot is the location μ, which is the attractor in the biased random walks and the biased and correlated random walks

3 THE PROPOSED MODEL

In this section, we introduce the components of the model and how they are used to introduce the characteristics of our proposal. We extend the notation of the previous section to describe the path of m animals, and to allow changes in the behaviour to be considered.

We indicate the path of the jth animal with $s_{j} = (s_{j, t_{j, 1}}, s_{j, t_{j, 2}}, \dots, s_{j, t_{j, T_{j}}})$, where j = 1, …, m, and $\mathcal{T}_{j} \equiv (t_{j, 1}, t_{j, 2}, \dots, t_{j, T_{j}})$ is the set of $T_{j}$ temporal points, equally spaced in time, where the position of the jth animal is recorded. The sets $\mathcal{T}_{j}$ and $\mathcal{T}_{j^{'}}$ can contain different time-points, but the time difference must be constant across animals, that is, $t_{j, i + 1} - t_{j, i} = c$ for all j = 1, …, m and $i = 1, \dots, T_{j} - 1$. We introduce a discrete random variable $z_{j, t_{j, i}} \in N$ to represent the animal behaviour at time $t_{j, i}$, where $z_{j, t_{j, i}} = k$ indicates that animal j follows behaviour k at time $t_{j, i}$. Given the behaviour assumed by each animal, the paths are independent and

$$s_{j, t_{j, i + 1}} | s_{j, t_{j, i}}, s_{j, t_{j, i - 1}}, z_{j, t_{j, i}} \sim \text{STAP} (θ_{z_{j, t_{j, i}}}),$$

where $θ_{k} = (μ_{k}, η_{k}, Σ_{k}, τ_{k}, ρ_{k})$. In other words, if the jth animal is following the kth behaviour at time $t_{j, i}$ (i.e. $z_{j, t_{j, i}} = k$), the path is described by the set $θ_{k}$ of STAP parameters. It should be noted that the kth behaviour is represented by the same set of parameters $θ_{k}$ for all animals.

Let $s = \{s_{j}\}_{j = 1}^{m}$, $z_{j} = \{z_{j, t}\}_{t \in \mathcal{T}_{j}}$, $z = \{z_{j}\}_{j = 1}^{m}$, and $θ = \{θ_{k}\}_{k \in N}$; then the model we propose is

$$f (s | θ, z) = \prod_{j = 1}^{m} \prod_{i = 1}^{T_{j} - 1} f (s_{j, t_{j, i + 1}} | s_{j, t_{j, i}}, s_{j, t_{j, i - 1}}, θ_{z_{j, t_{j, i}}}), \tag{2}$$

$$s_{j, t_{j, i + 1}} | s_{j, t_{j, i}}, s_{j, t_{j, i - 1}}, z_{j, t_{j, i}}, θ_{z_{j, t_{j, i}}} \sim \text{STAP} (θ_{z_{j, t_{j, i}}}), \quad s_{j, t_{j, 0}} \sim \text{Unif} (D), \tag{3}$$

$$\begin{aligned}
z_{j, t_{j, i}} | z_{j, t_{j, i - 1}}, π_{j, z_{j, t_{j, i - 1}}} & \sim \text{Multinomial} (1, π_{j, z_{j, t_{j, i - 1}}}), \\
z_{j, t_{j, 0}} & \sim \text{Geom} (ε),
\end{aligned} \tag{4}$$

$$π_{j, l} | α, ν, β \sim DP \left( α + ν, \frac{α β + ν δ_{l}}{α + ν} \right), \tag{5}$$

$$\{β_{k}\}_{k \in N} = C_{1} (β_{μ}^{*}, β_{η}^{*}, β_{Σ}^{*}, β_{τ}^{*}, β_{ρ}^{*}), \tag{6}$$

$$\{θ_{k}\}_{k \in N} = C_{2} (μ^{*}, η^{*}, Σ^{*}, τ^{*}, ρ^{*}), \tag{7}$$

$$\begin{aligned}
β_{μ}^{*} | γ_{μ} \sim \text{GEM} (γ_{μ}), \quad β_{η}^{*} | γ_{η} & \sim \text{GEM} (γ_{η}), \quad β_{Σ}^{*} | γ_{Σ} \sim \text{GEM} (γ_{Σ}), \\
β_{τ}^{*} | γ_{τ} & \sim \text{GEM} (γ_{τ}), \quad β_{ρ}^{*} | γ_{ρ} \sim \text{GEM} (γ_{ρ}),
\end{aligned} \tag{8}$$

$$\begin{aligned}
μ_{p}^{*} | H_{μ} \sim H_{μ}, \quad η_{p}^{*} | H_{η} & \sim H_{η}, \quad Σ_{p}^{*} | H_{Σ} \sim H_{Σ}, \\
τ_{p}^{*} | H_{τ} & \sim H_{τ}, \quad ρ_{p}^{*} | H_{ρ} \sim H_{ρ},
\end{aligned} \tag{9}$$

where $p \in N$, $l \in N$, and $i = 1, \dots, T_{j} - 1$. A full description of the components of the model and how they are used to introduce the main novelties of our model is given below.
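To make the generative structure of Equations (2)–(4) concrete, the sketch below simulates one path from a simplified version of the model. It is only an illustration under strong assumptions that are ours and not the paper's: the number of behaviours is truncated to a finite K, the transition matrix `Pi` and the parameter sets `thetas` are fixed by hand rather than drawn from the DPs of Equations (5)–(9), and the starting locations and initial state are supplied by the user. All names are hypothetical.

```python
import math
import random

def stap_step(s_prev, s_curr, theta, rng):
    """Draw the next location from STAP(theta) given the two previous ones."""
    mu, eta, Sigma, tau, rho = theta
    phi = math.atan2(s_curr[1] - s_prev[1], s_curr[0] - s_prev[0])
    c, s = math.cos(phi), math.sin(phi)
    # Expected displacement: (1 - rho) * tau * (mu - s) + rho * R(phi) * eta.
    m0 = (1 - rho) * tau * (mu[0] - s_curr[0]) + rho * (c * eta[0] - s * eta[1])
    m1 = (1 - rho) * tau * (mu[1] - s_curr[1]) + rho * (s * eta[0] + c * eta[1])
    # Covariance V = R(rho * phi) Sigma R(rho * phi)', written out entry by entry.
    c, s = math.cos(rho * phi), math.sin(rho * phi)
    a, b, d = Sigma[0][0], Sigma[0][1], Sigma[1][1]
    v11 = c * c * a - 2 * c * s * b + s * s * d
    v12 = c * s * a + (c * c - s * s) * b - c * s * d
    v22 = s * s * a + 2 * c * s * b + c * c * d
    # Bivariate normal draw through the Cholesky factor of V.
    l11 = math.sqrt(v11)
    l21 = v12 / l11
    l22 = math.sqrt(max(v22 - l21 * l21, 0.0))
    e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (s_curr[0] + m0 + l11 * e1, s_curr[1] + m1 + l21 * e1 + l22 * e2)

def simulate(thetas, Pi, T, s0, s1, z0, seed=0):
    """Simulate s_{t_0}, ..., s_{t_T} and z_{t_1}, ..., z_{t_{T-1}} for one animal."""
    rng = random.Random(seed)
    path, z, state = [s0, s1], [], z0
    for _ in range(T - 1):
        # z_{t_i} given z_{t_{i-1}}: one draw from row `state` of Pi.
        state = rng.choices(range(len(Pi)), weights=Pi[state])[0]
        z.append(state)
        path.append(stap_step(path[-2], path[-1], thetas[state], rng))
    return path, z

# Two hypothetical behaviours: a pure BRW (rho = 0) and a pure CRW (rho = 1).
thetas = [
    ((0.0, 0.0), (0.0, 0.0), [[0.1, 0.0], [0.0, 0.1]], 0.5, 0.0),
    ((0.0, 0.0), (1.0, 0.0), [[0.05, 0.0], [0.0, 0.05]], 0.1, 1.0),
]
Pi = [[0.9, 0.1], [0.2, 0.8]]
path, z = simulate(thetas, Pi, T=50, s0=(0.0, 0.0), s1=(0.1, 0.0), z0=0, seed=1)
```

Running `simulate` once per animal with the same `thetas` but animal-specific `Pi` would mimic the conditional independence of the paths given the shared atoms.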

The DPs. In order to simplify the description of the lower levels of the model hierarchy, we use χ as a variable that can be μ, η, Σ, τ, or ρ, and it is used whenever what is described applies to any of the five parameters. In Equation (9), the values $χ_{p}^{*}$ of the STAP parameter are sampled from the distribution $H_{χ}$, and the infinite-dimensional vector of probabilities $β_{χ}^{*} = \{β_{χ, p}^{*}\}_{p \in N}$, associated with parameter χ, is GEM distributed (Gnedin et al., 2001) with scaling parameter $γ_{χ}$ (Equation (8)). The scaling parameter can easily be interpreted with the stick-breaking representation of the GEM distribution, defined as follows:

$$\begin{aligned}
β_{χ, 1}^{*} & \sim B (1, γ_{χ}), \\
\frac{β_{χ, p}^{*}}{1 - \sum_{h = 1}^{p - 1} β_{χ, h}^{*}} & \sim B (1, γ_{χ}), \quad p \neq 1 .
\end{aligned} \tag{10}$$

From Equation (10), we see that the smaller $γ_{χ}$ is, the smaller is the number of elements of $β_{χ}^{*}$ that contain most of the probability mass, with $\lim_{γ_{χ} \to 0} β_{χ, 1}^{*} = 1$ and $\lim_{γ_{χ} \to 0} β_{χ, p}^{*} = 0$ for all p ≠ 1. The vectors $β_{χ}^{*}$ and $χ^{*} = \{χ_{p}^{*}\}_{p \in N}$ can be used to define the discrete distribution

$$G_{χ} = \sum_{p \in N} β_{χ, p}^{*} δ_{χ_{p}^{*}}, \tag{11}$$

where $δ_{\cdot}$ is the Dirac delta function. Equation (11) is a draw from a $DP (γ_{χ}, H_{χ})$, and thus we can equivalently describe $β_{χ}^{*}$ and $χ^{*}$ as the components of a sample from $DP (γ_{χ}, H_{χ})$, or through Equations (8) and (9). The vectors $χ^{*}$ and $β_{χ}^{*}$ contain, respectively, the values that the parameters can assume ($χ_{p}^{*}$) and the ‘base’ probability ($β_{χ, p}^{*}$) that a particular value of the parameter is selected in a behaviour (see Equation (15) below).
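The stick-breaking construction of Equation (10), and the resulting discrete distribution of Equation (11), can be sketched in a few lines of Python. This is a truncated approximation, since only the first `n_atoms` weights are generated; the truncation level, the function names and the uniform base distribution used in the example are assumptions made for illustration.

```python
import random

def stick_breaking(gamma, n_atoms, rng):
    """First n_atoms weights of a GEM(gamma) draw, following Equation (10)."""
    weights, remaining = [], 1.0
    for _ in range(n_atoms):
        v = rng.betavariate(1.0, gamma)   # proportion of the remaining stick
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights

def draw_dp(gamma, base_sampler, n_atoms, seed=0):
    """Truncated draw from DP(gamma, H): weights and atoms of Equation (11)."""
    rng = random.Random(seed)
    weights = stick_breaking(gamma, n_atoms, rng)
    atoms = [base_sampler(rng) for _ in range(n_atoms)]
    return weights, atoms

# Example with a U(0, 1) base distribution, as a stand-in for H_tau.
weights, atoms = draw_dp(1.0, lambda rng: rng.uniform(0.0, 1.0), n_atoms=50)

# Average first weight when gamma is small: almost all the mass sits on one atom.
rng = random.Random(1)
mean_first = sum(stick_breaking(0.001, 10, rng)[0] for _ in range(200)) / 200
```

Small values of `gamma` concentrate the mass on very few atoms, which matches the limiting behaviour stated after Equation (10).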

The functions $C_{1} (\cdot)$ and $C_{2} (\cdot)$. The functions $C_{1} (\cdot)$ and $C_{2} (\cdot)$ (Equations (6) and (7)) introduce the sharing of parameters between behaviours, which is one of the novelties of our proposal. The role of function $C_{2} (\cdot)$ is to produce the set of STAP parameters $\{θ_{k}\}_{k \in N}$, where we remind the reader that $θ_{k} = (μ_{k}, η_{k}, Σ_{k}, τ_{k}, ρ_{k})$. The set $\{θ_{k}\}_{k \in N}$ comprises all the possible combinations of the five STAP parameters, without repetitions. This means that $θ_{k} \neq θ_{k^{'}}$ if $k \neq k^{'}$, but a subset of their elements can have the same value, for example, $τ_{k} \equiv τ_{k^{'}}$. Hence, since each behaviour selects its STAP parameters in $θ = \{θ_{k}\}_{k \in N}$, different behaviours can share parameters, even though they are described by a different $θ_{k}$.

Function $C_{1} (\cdot)$ is closely related to $C_{2} (\cdot)$ since, if we introduce the new variables $λ_{μ, k}$, $λ_{η, k}$, $λ_{Σ, k}$, $λ_{τ, k}$ and $λ_{ρ, k}$ that indicate which component values are in $θ_{k}$, that is,

$$μ_{k} = μ_{λ_{μ, k}}^{*}, \quad η_{k} = η_{λ_{η, k}}^{*}, \quad Σ_{k} = Σ_{λ_{Σ, k}}^{*}, \quad τ_{k} = τ_{λ_{τ, k}}^{*}, \quad ρ_{k} = ρ_{λ_{ρ, k}}^{*}, \tag{12}$$

we can associate a value $β_{k}$ to $θ_{k}$, which is computed as

$$β_{k} = β_{μ, λ_{μ, k}}^{*} β_{η, λ_{η, k}}^{*} β_{Σ, λ_{Σ, k}}^{*} β_{τ, λ_{τ, k}}^{*} β_{ρ, λ_{ρ, k}}^{*} .$$

The set $\{β_{k}\}_{k \in N}$ is the output of $C_{1} (\cdot)$ and it is a probability vector, since $\sum_{k \in N} β_{k} = 1$ and $β_{k} \in (0, 1)$. We can define the discrete distribution

$$G_{0} = \sum_{k \in N} β_{k} δ_{θ_{k}}, \tag{13}$$

where, similarly to Equation (11), its atoms contain all the possible values that $θ_{k}$ can assume and $β_{k}$ is connected to the expected value of the probability of selecting $θ_{k}$ as the vector of parameters of a behaviour (see Equation (14) below). This way of defining the distribution $G_{0}$ is closely related to the shared base-distribution of the change-point model of Mastrantonio et al. (2021).
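A finite sketch may help to see how $C_{1} (\cdot)$ and $C_{2} (\cdot)$ act. Assuming, for illustration only, that each of the five component distributions is truncated to a handful of atoms, the code below enumerates every combination of component indices, builds the corresponding $θ_{k}$ and multiplies the component probabilities to obtain $β_{k}$. The function name `combine`, the dictionary layout and the toy values are hypothetical.

```python
from itertools import product

def combine(weights, values):
    """Truncated analogue of C_1 and C_2.

    weights, values: dicts keyed by parameter name; for each name, a list of
    component probabilities and the list of component atoms of the same length.
    Returns the product weights beta_k, the atoms theta_k and the index maps
    lambda_k telling which component value each theta_k uses.
    """
    names = list(weights)
    beta, theta, lam = [], [], []
    for idx in product(*(range(len(weights[n])) for n in names)):
        b = 1.0
        for n, p in zip(names, idx):
            b *= weights[n][p]            # product of the component weights
        beta.append(b)
        theta.append(tuple(values[n][p] for n, p in zip(names, idx)))
        lam.append(dict(zip(names, idx)))
    return beta, theta, lam

# Toy example with two atoms per parameter (covariances given as labels).
weights = {"mu": [0.7, 0.3], "eta": [0.6, 0.4], "Sigma": [0.5, 0.5],
           "tau": [0.9, 0.1], "rho": [0.2, 0.8]}
values = {"mu": [(0.0, 0.0), (1.0, 1.0)], "eta": [(0.0, 6.0), (0.0, 1.0)],
          "Sigma": ["Sigma_1", "Sigma_2"], "tau": [0.25, 0.8], "rho": [0.0, 1.0]}
beta, theta, lam = combine(weights, values)
```

In this toy case there are 2^5 = 32 distinct atoms; any two of them differ in at least one component, yet many pairs share, for instance, the same attractor.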

Behaviour switching. Let $Π_{j}$ be the matrix that has $π_{j, l} = \{π_{j, l, k}\}_{k \in N}$ as its lth row. Matrix $Π_{j}$ governs the switching between the behaviours of animal j (Equation (4)): if the jth animal is following behaviour l at time $t_{j, i - 1}$, the probability of switching to behaviour k is given by the element of $Π_{j}$ in row l and column k. Hence, the time evolution of $z_{j, t_{j, i}}$ is modelled by a discrete first-order Markov process, which defines an HMM with transition matrix $Π_{j}$ and initial state $z_{j, t_{j, 0}}$. The initial state is drawn from a Geometric distribution with parameter ε, defined here as the distribution of the number of Bernoulli trials needed to obtain one success. The row $π_{j, l}$ is DP distributed (see Equation (5)) and the expected value of $π_{j, l, k}$ is equal to

$$E (π_{j, l, k} | α, ν, β) = \frac{α β_{k} + ν δ (l, k)}{α + ν} \tag{14}$$

(see Fox et al., 2011), where δ(l, k) is equal to 1 if l = k, and 0 otherwise. From Equation (14), we can see that the kth element of β is associated with the expected value of $π_{j, l, k}$, for all the animals (j = 1, …, m) and $l \in N$. Hence, a larger $β_{k}$ increases the probability of switching from any behaviour l to the kth, described by $θ_{k}$. However, Equation (14) can also be stated as

$$E (π_{j, l, k} | α, ν, β) = \frac{α β_{μ, λ_{μ, k}}^{*} β_{η, λ_{η, k}}^{*} β_{Σ, λ_{Σ, k}}^{*} β_{τ, λ_{τ, k}}^{*} β_{ρ, λ_{ρ, k}}^{*} + ν δ (l, k)}{α + ν}, \tag{15}$$

which highlights how the value $β_{χ, p}^{*}$ is connected to all $π_{j, l, k}$, with $l, k \in N$ and j = 1, …, m, such that $λ_{χ, k} = p$. Therefore, a larger $β_{χ, p}^{*}$ increases the expected values of all these probabilities, and for this reason we call $β_{χ, p}^{*}$ the ‘base probability’ of $χ_{p}^{*}$. The variable α is the scaling parameter of the DP of Equation (5), which has the same interpretation as $γ_{χ}$, while ν is a weight that is added to the self-transitions $π_{j, l, l}$ to increase their expected value (see Equation (14)), which in turn reduces the tendency of the HDP-HMM to create redundant behaviours, that is, behaviours with similar parameter vectors. For a more detailed description of parameters α and ν, the reader may refer to Fox et al. (2011).
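As a small worked example of Equation (14), the following sketch computes the expected transition matrix for a finite probability vector β. The truncation to three behaviours and the numerical values of α and ν are illustrative assumptions; the function name is ours.

```python
def expected_transition(beta, alpha, nu):
    """Expected transition matrix from Equation (14) for a finite beta."""
    K = len(beta)
    return [[(alpha * beta[k] + nu * (1.0 if l == k else 0.0)) / (alpha + nu)
             for k in range(K)]
            for l in range(K)]

beta = [0.5, 0.3, 0.2]
sticky = expected_transition(beta, alpha=1.0, nu=4.0)      # strong self-transition weight
non_sticky = expected_transition(beta, alpha=1.0, nu=0.0)  # every row equals beta
```

With ν = 4 and α = 1, the expected self-transition probability of the first behaviour is (0.5 + 4)/5 = 0.9, whereas with ν = 0 every row reduces to β.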

It should be noted that, in most applications (see, e.g., McClintock et al., 2012; Leos-Barajas et al., 2017), $z_{j, t_{j, i}} \in \{1, 2, \dots, K^{*}\}$ is assumed, where $K^{*}$ indicates the maximum number of behaviours, while we have $z_{j, t_{j, i}} \in N$ in this work, since we define the HMM using DPs. Thus, the model assumes a countably infinite number of possible behaviours for each animal but, since we have a finite number of observed time-points, only a finite number $K_{j}$ of them can be ‘occupied’; these are generally called ‘non-empty states’ (Frühwirth-Schnatter & Malsiner-Walli, 2019) or, in this context, ‘non-empty behaviours’. The random variable $K_{j}$ is used to estimate the number of latent behaviours of the jth animal. Parameters α, ν and γ (through β) determine the number of non-empty states $K_{j}$, since they are responsible for the total mass associated with each of the elements of $Π_{j}$.
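The number of non-empty behaviours is simply the number of distinct labels in the latent sequence of an animal. A minimal sketch, assuming that posterior samples of $z_{j}$ are available as lists of integer labels (the variable `z_samples` and its values are made up for illustration), is:

```python
from collections import Counter

def non_empty_count(z):
    """K_j for one sample: number of distinct behaviour labels in z_j."""
    return len(set(z))

def posterior_of_K(z_samples):
    """Empirical posterior distribution of K_j over a list of sampled z_j."""
    counts = Counter(non_empty_count(z) for z in z_samples)
    n = len(z_samples)
    return {k: c / n for k, c in sorted(counts.items())}

# Four made-up posterior samples of the latent sequence of one animal.
z_samples = [[1, 1, 2], [1, 3, 2], [5, 5, 5], [1, 2, 2]]
post_K = posterior_of_K(z_samples)
```

The labels themselves are arbitrary; only the number of distinct values in each sample matters for $K_{j}$.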

The emission-distribution. The model specification is concluded with the emission distribution, which is given by Equations (2) and (3). It should be noted that, given the latent behaviours, we consider the animals independent but, since they share the same set of atoms $\{θ_{k}\}_{k \in N}$, the behaviours of the different animals can be described by the same STAP distribution. From Equation (12), we know that $θ_{k}$ and $θ_{k^{'}}$ can have common components. Therefore, when an animal changes behaviour, it is not necessary for all the parameters to change and, more importantly, we can also identify the features that two animals share for different behaviours; for example, two behaviours can have the same attractive-point ($μ_{p}^{*}$), even though the strength of attraction ($τ_{p}^{*}$) is different. This feature is one of the main novelties of our proposal. Although other approaches have a similar characteristic (see, e.g., Jonsen, 2016; Milner et al., 2021), they do not allow the number of latent behaviours to be estimated (the other main novelty of our proposal), nor do they make it possible to determine, during model fitting, which parameters are allowed to change or how many values a parameter can assume. The sharing of a subset of the parameters between states is new in the context of HMMs based on DPs.

Connection to the sHDP-HMM. To conclude this section, we would like to show that our proposal can be considered as a generalization of the sHDP-HMM. The model of Fox et al. (2011) is defined for a single time-series, and $G_{0}$ is a draw from a DP. It is easy to see that, if we consider only one animal and use only one multivariate parameter in Equation (9), for example, $θ_{p}^{*} = (μ_{p}^{*}, η_{p}^{*}, Σ_{p}^{*}, τ_{p}^{*}, ρ_{p}^{*})$, with the associated vector of probabilities $β_{θ}^{*}$, the distribution $G_{0}$ is a draw from a DP, since $β_{k} = β_{θ, p}^{*}$ and $θ_{k} = θ_{p}^{*}$. Hence, the model reduces to an sHDP-HMM.

4 REAL DATA APPLICATION

We have the recorded coordinates of six dogs, taken every 30 min at the Heatherlie property in Australia, between 2012-08-02 15:30 and 2012-11-10 15:30. The data¹ consist of 4801 observations for each dog, with less than 1% of the points missing. To facilitate the specification of the prior, the coordinates are centered using the bivariate sample mean and scaled with a common standard deviation, computed using both the X and Y coordinates, to maintain the relative scale between the two coordinates; the recorded locations are shown in Figure 2. The dogs are called Woody, Sherlock, Alvin, Rosie, Bear, and Lucy. Rosie and Lucy are female, while the other four are male. Woody, Sherlock, Bear and Lucy form a cohesive group, which is responsible for livestock protection, while Rosie, due to her advanced age, is solitary, and Alvin suffers from social exclusion, which restricted his movement (van Bommel & Johnson, 2016).
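The two preprocessing facts stated above can be checked with a short sketch. The first function counts the fixes on a regular 30-minute grid between 2 August 2012 15:30 and 10 November 2012 15:30, which gives the 4801 observations per dog. The second function centres and rescales the coordinates; the exact formula for the common standard deviation is not given in the text, so the pooled version used here (and the absence of missing points) is an assumption, as are the function names.

```python
import math
from datetime import datetime, timedelta

def n_fixes(start, end, step_minutes=30):
    """Number of points on a regular time grid from start to end (inclusive)."""
    return (end - start) // timedelta(minutes=step_minutes) + 1

def standardize(coords):
    """Centre with the bivariate mean and divide both axes by one common sd.

    Assumption: the common sd pools the centred X and Y values together.
    """
    n = len(coords)
    mean_x = sum(x for x, _ in coords) / n
    mean_y = sum(y for _, y in coords) / n
    ss = sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in coords)
    sd = math.sqrt(ss / (2 * n - 2))
    return [((x - mean_x) / sd, (y - mean_y) / sd) for x, y in coords]

n_obs = n_fixes(datetime(2012, 8, 2, 15, 30), datetime(2012, 11, 10, 15, 30))

# Made-up coordinates: the X range is ten times the Y range.
raw = [(0.0, 0.0), (10.0, 1.0), (20.0, 0.0), (30.0, 1.0)]
scaled = standardize(raw)
```

Because a single scale factor is applied to both axes, ratios of distances along X and Y are unchanged by the transformation.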

FIGURE 2

The observed spatial locations of the six dogs

Maremma Sheepdogs originate from Europe, and have been used for centuries to protect livestock from potential predators (Gehring et al., 2017). They are trained to live with the livestock from birth and, as a result, they develop a strong bond with them and an instinct to protect them. They can be fence-trained, but are generally allowed to move freely. The use of livestock guardian dogs is relatively new outside Europe, especially in Australia, and, due to their effectiveness, interest in their use is increasing (van Bommel & Invasive Animals Cooperative Research Centre, 2010; van Bommel & Johnson, 2016). Since the extension of properties in Australia can be as much as several thousand hectares, it is hard for the owner to supervise the dogs (van Bommel & Johnson, 2012) and to have information about their behaviour (van Bommel & Invasive Animals Cooperative Research Centre, 2010).

4.1 Comparison of the model and implementation details

We compare the predictive performance of our proposal (M1) with that of two competing models (M2 and M3), using the integrated completed likelihood (ICL) (Biernacki et al., 2000) and two versions of the deviance information criterion (DIC), $\text{DIC}_{5}$ and $\text{DIC}_{7}$ (Celeux et al., 2006). In the first competing model (M2), we assume that only the entire vector of STAP parameters can be shared between animals, which means that each time-series is an sHDP-HMM with a shared base distribution. In terms of model formalization, we assume

$$\begin{aligned}
\{β_{k}\}_{k \in N} | γ & \sim \text{GEM} (γ), \\
θ_{k} | H_{μ}, H_{η}, H_{Σ}, H_{τ}, H_{ρ} & \sim H_{μ} \times H_{η} \times H_{Σ} \times H_{τ} \times H_{ρ},
\end{aligned}$$

for the set of atoms and weights of $G_{0}$ (Equation (13)). In the second competitor (M3), each animal follows the model of Mastrantonio (2020), which means they are completely independent, and there is no sharing of parameters. Hence, we substitute $G_{0}$ with the animal-specific distribution

$$G_{0, j} = \sum_{k \in N} β_{j, k} δ_{{\tilde{θ}}_{j, k}},$$

where

$$\begin{aligned}
\{β_{j, k}\}_{k \in N} | γ_{j} & \sim \text{GEM} (γ_{j}), \\
{\tilde{θ}}_{j, k} | H_{μ}, H_{η}, H_{Σ}, H_{τ}, H_{ρ} & \sim H_{μ} \times H_{η} \times H_{Σ} \times H_{τ} \times H_{ρ}, \\
{\tilde{θ}}_{j, k} & = (μ_{j, k}, η_{j, k}, Σ_{j, k}, τ_{j, k}, ρ_{j, k}),
\end{aligned}$$

and then

$$s_{j, t_{j, i + 1}} | s_{j, t_{j, i}}, s_{j, t_{j, i - 1}}, z_{j, t_{j, i}}, {\tilde{θ}}_{j, z_{j, t_{j, i}}} \sim \text{STAP} ({\tilde{θ}}_{j, z_{j, t_{j, i}}}) .$$

By changing the way distribution $G_{0}$ is defined, we aim to show that the main feature of our proposal, that is, the sharing of sets and subsets of parameters, improves the model fitting and leads to a better description of the data.

The models are implemented assuming $H_{μ} \equiv N (0, 20 I)$, $H_{η} \equiv N (0, 20 I)$, $H_{τ} \equiv U (0, 1)$, and $H_{Σ} \equiv IW (3, I)$. The distribution $H_{ρ}$ is assumed to be a mixture of a U(0, 1) and two point masses at 0 and 1, with the three mixture weights equal to 1/3. This allows $ρ_{k}$ (in M1 and M2) and $ρ_{j, k}$ (in M3) to be, a posteriori, equal to 0 or 1 with probability greater than 0, which allows us to detect a pure BRW (ρ = 0) or CRW (ρ = 1) behaviour. We assume $α + ν, γ_{μ}, γ_{η}, γ_{Σ}, γ_{ρ}, γ_{τ}, γ, γ_{1}, \dots, γ_{m} \sim G (0.01, 0.01)$ and ν/(α + ν) ∼ B(1, 1), which allows us to easily sample from their full conditionals, see Fox et al. (2011) and Section A of the Web-based Supporting Materials. All the distributions are chosen to be weakly informative. The domain $D$ is the square [−5, 5] × [−5, 5] and the parameter ε of the Geometric distribution is equal to 0.00001, which means that the distribution over the initial state, $z_{j, t_{j, 0}}$, is approximately uniform over the positive integers. Posterior estimates are obtained with 15,000 iterations, a burn-in of 7500 and a thinning of 3; thus, 2500 samples are available for posterior inference. Convergence has been checked by means of a visual inspection of the posterior chains and using the $\hat{R}$ statistic (Gelman et al., 2013). Details on the MCMC algorithm, implemented in Julia 1.3 (Bezanson et al., 2017), can be found in the Web-based Supporting Materials, Section A, and the codes used to replicate the results, tables, and figures are available at https://github.com/GianlucaMastrantonio/multiple_animals_movement_model.
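Two of the implementation details above lend themselves to a quick sketch: the three-component prior $H_{ρ}$ and the number of retained MCMC samples. The code is written in Python purely for illustration (the original implementation is in Julia), and the function names are hypothetical.

```python
import random

def sample_rho_prior(rng):
    """Draw from H_rho: point masses at 0 and 1 and a U(0, 1), each with weight 1/3."""
    u = rng.random()
    if u < 1.0 / 3.0:
        return 0.0            # pure BRW
    if u < 2.0 / 3.0:
        return 1.0            # pure CRW
    return rng.random()       # BCRW with rho strictly inside the unit interval

def n_retained(iterations, burnin, thin):
    """Posterior samples left after discarding the burn-in and thinning."""
    return (iterations - burnin) // thin

rng = random.Random(0)
draws = [sample_rho_prior(rng) for _ in range(30000)]
share_zero = sum(d == 0.0 for d in draws) / len(draws)
share_one = sum(d == 1.0 for d in draws) / len(draws)
kept = n_retained(15000, 7500, 3)
```

About one third of the prior draws are exactly 0 and one third exactly 1, which is what lets the posterior place positive probability on the pure BRW and CRW cases.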

In Table 1, we can see that the three indices indicate that our model is the one with the best fit,² model M2 is the second, while M3 is always the last. Therefore, the joint modelling of the six dogs improves the performances of the model (since M2 is always preferable to M3), but the sharing of a subset of parameters also leads to a better description of the data (since M1 is better than M2). We provide a description of the results obtained with M2 and M3 in the Web-based Supporting Materials, Section B.

TABLE 1

Information criteria for the proposed model (M1), sHDP-HMMs with a common $G_{0}$ (M2), and sHDP-HMMs with animal-specific $G_{0, j}$ (M3). Each of the three indices selects M1

	M1	M2	M3
ICL	209593	201880	192145
DIC5	−457502	−441264	−407823
DIC7	−417544	−400786	−376037
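
The ranking of the models can be reproduced directly from the values in Table 1. The short Python sketch below assumes only the convention that a higher ICL and a lower DIC indicate a better fit; the variable and function names are our own and purely illustrative.

```python
# Values copied from Table 1.
criteria = {
    "ICL": {"M1": 209593, "M2": 201880, "M3": 192145},
    "DIC5": {"M1": -457502, "M2": -441264, "M3": -407823},
    "DIC7": {"M1": -417544, "M2": -400786, "M3": -376037},
}

# Assumed convention: a higher ICL is better, a lower DIC is better.
HIGHER_IS_BETTER = {"ICL": True, "DIC5": False, "DIC7": False}


def rank_models(values, higher_is_better):
    """Return the model names ordered from best to worst for one index."""
    return sorted(values, key=values.get, reverse=higher_is_better)


rankings = {
    name: rank_models(values, HIGHER_IS_BETTER[name])
    for name, values in criteria.items()
}

for name, order in rankings.items():
    print(name, "->", " > ".join(order))
```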

4.2 Description and interpretation of the output

Using the algorithm proposed by Wade and Ghahramani (2018), we find a representative behaviour ${\hat{z}}_{j, t_{j, i}}$ for each animal and time point, which we refer to as the MAP behaviour. We denote the kth behaviour of the jth dog, based on ${\hat{z}}_{j, t_{j, i}}$, by $B_{jk}$, and we let $n_{j, k}$ be the number of times that ${\hat{z}}_{j, t_{j, i}} = k$. Without any loss of generality, we assume $n_{j, 1} > n_{j, 2} > \dots$, that is, the behaviours are ordered with respect to the number of times they are observed. It should be noted that $B_{jk}$ is not the same as $B_{j^{'} k}$ if $j \neq j^{'}$; therefore, to avoid confusion, we indicate the vector of STAP parameters for $B_{jk}$ with $θ_{j, k} = (μ_{j, k}, η_{j, k}, \Sigma_{j, k}, τ_{j, k}, ρ_{j, k})$. For ease of interpretation, we only discuss behaviours that have been observed, on average, at least once a day ($n_{j, k} > 100$). This leaves 3 behaviours for each dog, with the exception of Rosie (dog 4), who has 2 behaviours; see Tables B.1–B.6 in the Web-based Supporting Materials, where the posterior means (denoted by $\hat{\cdot}$) and credible intervals (CIs) of the STAP parameters, $n_{j, k}$, and the transition probabilities are shown for all the dogs and behaviours. Using pictures similar to the ones in Figure 1, we show a graphical description of the behaviours found by the model in Figures 3 and 4; the behaviours are represented on different spatial scales.

From the model output, we computed the posterior mean of the variable $δ (χ_{j, k}, χ_{j^{'}, k^{'}})$, which is the posterior probability that a STAP parameter χ assumes the same value in $B_{jk}$ and $B_{j^{'} k^{'}}$. These probabilities are depicted in Figure 5 for all possible combinations of behaviours and animals. To take into account that $μ_{j, k}$ and $η_{j, k}$ are only identifiable if $ρ_{j, k} < 1$ and $ρ_{j, k} > 0$, respectively (see Equation (1)), we set $δ (μ_{j, k}, μ_{j^{'}, k^{'}}) = 0$ if $ρ_{j, k} = 1$ or $ρ_{j^{'}, k^{'}} = 1$, and $δ (η_{j, k}, η_{j^{'}, k^{'}}) = 0$ if $ρ_{j, k} = 0$ or $ρ_{j^{'}, k^{'}} = 0$, for $(j, k) \neq (j^{'}, k^{'})$.
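
As an illustration of how such a probability can be estimated from MCMC output, the Python sketch below averages, over the posterior draws, an indicator that two behaviours hold exactly the same parameter value, and sets the indicator to 0 whenever ρ makes the parameter non-identifiable. The function name, the argument layout and the toy draws are hypothetical; the sketch assumes that parameters shared between behaviours are stored as exactly equal values in each draw.

```python
def posterior_share_probability(chi_a, chi_b, rho_a, rho_b, excluded_rho=None):
    """Monte Carlo estimate of the posterior mean of delta(chi_a, chi_b).

    chi_a, chi_b : per-draw values of one STAP parameter in two behaviours.
    rho_a, rho_b : per-draw values of rho in the same two behaviours.
    excluded_rho : value of rho for which the parameter is not identifiable
                   (1.0 for mu, 0.0 for eta, None for the other parameters).
    """
    n_draws = len(chi_a)
    if not (len(chi_b) == len(rho_a) == len(rho_b) == n_draws) or n_draws == 0:
        raise ValueError("all inputs must be non-empty and of equal length")
    hits = 0
    for xa, xb, ra, rb in zip(chi_a, chi_b, rho_a, rho_b):
        if excluded_rho is not None and excluded_rho in (ra, rb):
            continue  # delta is set to 0 for this draw
        if xa == xb:
            hits += 1
    return hits / n_draws


# Toy draws (hypothetical values, four MCMC iterations).
mu_a = [(0.1, 0.2), (0.1, 0.2), (0.3, 0.4), (0.1, 0.2)]
mu_b = [(0.1, 0.2), (0.5, 0.6), (0.3, 0.4), (0.1, 0.2)]
rho_a = [0.0, 0.0, 1.0, 0.2]
rho_b = [0.0, 0.5, 0.3, 0.2]

prob_mu = posterior_share_probability(mu_a, mu_b, rho_a, rho_b, excluded_rho=1.0)
prob_plain = posterior_share_probability(mu_a, mu_b, rho_a, rho_b)
print(prob_mu, prob_plain)
```

In the toy example, the third draw has equal attractors but ρ = 1 in the first behaviour, so it does not count toward the estimate for μ.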

FIGURE 3

Graphical representation of the conditional distribution of $s_{j, t_{j, i + 1}}$ for the first three dogs ((a)(b)(c) Woody, (d)(e)(f) Sherlock, (g)(h)(i) Alvin), for different possible values of $s_{j, t_{j, i}}$ and previous directions. The images have been obtained by using the posterior values that maximize the data likelihood of each animal, given the representative clusterization ${\hat{z}}_{j, t_{j, i}}$. The dashed arrow represents the movement between $s_{j, t_{j, i - 1}}$ and $s_{j, t_{j, i}}$. The solid arrow is ${\vec{F}}_{j, t_{j, i}}$, while the ellipse is an area containing 95% of the probability mass of the conditional distribution of $s_{j, t_{j, i + 1}}$. The asterisk represents the attractor, and it is only shown for behaviours that have posterior values of $ρ_{j, k} < 0.9$ and $τ_{j, k} > 0.1$ [Colour figure can be viewed at https://dbpia.nl.go.kr]

FIGURE 4

Graphical representation of the conditional distribution of $s_{j, t_{j, i + 1}}$ for the last three dogs ((a)(b) Rosie, (c)(d)(e) Bear, (f)(g)(h) Lucy), for different possible values of $s_{j, t_{j, i}}$ and previous directions. The images have been obtained by using the posterior values that maximize the data likelihood of each animal, given the representative clusterization ${\hat{z}}_{j, t_{j, i}}$. The dashed arrow represents the movement between $s_{j, t_{j, i - 1}}$ and $s_{j, t_{j, i}}$. The solid arrow is ${\vec{F}}_{j, t_{j, i}}$, while the ellipse is an area containing 95% of the probability mass of the conditional distribution of $s_{j, t_{j, i + 1}}$. The asterisk represents the attractor, and it is only shown for behaviours that have posterior values of $ρ_{j, k} < 0.9$ and $τ_{j, k} > 0.1$ [Colour figure can be viewed at https://dbpia.nl.go.kr]

FIGURE 5

Graphical representation of the posterior mean of δ(·,·), which represents the probability of the parameters in its argument ((a) $μ_{j, k}$, (b) $η_{j, k}$, (c) $τ_{j, k}$, (d) $\Sigma_{j, k}$, (e) $ρ_{j, k}$) having the same value for two behaviours [Colour figure can be viewed at https://dbpia.nl.go.kr]

Similarities between the MAP behaviours. One of the assumptions of our proposal is that the temporal evolutions of the behaviours of different animals are independent. To check a posteriori whether this hypothesis is reasonable, we computed the Adjusted Rand Index (Gates & Ahn, 2017) for the MAP behaviours of each pair of animals. The Adjusted Rand Index, which is a measure of similarity (or agreement) between two clusterizations, has a value close to one if there is strong agreement, while its value is close to zero (or even negative) if the two clusterizations are very dissimilar. It should be noted that we are able to compute the index because the animals are observed over the same temporal window. The results in Table 2 show that the values of the index are very low, with the exception of dogs 1 and 6 (Woody and Lucy), for which the index is 0.414. We can conclude that our hypothesis is reasonable.

TABLE 2

The Adjusted Rand Index for all the pairs of dogs

	Woody	Sherlock	Alvin	Rosie	Bear	Lucy
Woody	1.000	0.094	0.006	0.030	0.181	0.414
Sherlock	0.094	1.000	−0.004	0.015	0.125	0.081
Alvin	0.006	−0.004	1.000	0.047	−0.002	0.003
Rosie	0.030	0.015	0.047	1.000	0.021	0.021
Bear	0.181	0.125	−0.002	0.021	1.000	0.159
Lucy	0.414	0.081	0.003	0.021	0.159	1.000
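
For readers who wish to compute a comparable index on their own label sequences, the Python sketch below implements the Adjusted Rand Index in its standard contingency-table form (the permutation-model adjustment). Whether this is exactly the variant used for Table 2 is an assumption on our part, since Gates and Ahn (2017) discuss several random models; the function name and the toy labels are illustrative.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two clusterizations of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("the two label sequences must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("at least two items are needed")

    # Pairs of items placed together in both, in the first, in the second.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        # Degenerate case: both clusterizations are trivial and identical.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy MAP sequences for two hypothetical animals over the same time points.
z_first = [0, 0, 0, 1, 1, 1]
z_second = [0, 0, 1, 1, 2, 2]
print(round(adjusted_rand_index(z_first, z_second), 4))
```

The index is invariant to a relabelling of the behaviours, which matters here because $B_{jk}$ and $B_{j^{'} k}$ do not denote the same behaviour for different dogs.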

The dogs in the cohesive group. We can clearly see, from Figures 3 and 4, that the four dogs that form a cohesive group (dog 1 Woody, dog 2 Sherlock, dog 5 Bear and dog 6 Lucy) have similar behaviours. In behaviour $B_{j 1}$, the length of ${\vec{F}}_{j, t_{j, i}}$ is ≈0, which means that the distribution of $s_{j, t_{j, i + 1}}$ is centered on the previous location ($s_{j, t_{j, i}}$). There is no preferred direction, since the ellipses are very close to circles, the speed is very low (see the size of the ellipses), and there is no attractor. Hence, the movement is only determined by the covariance matrix $\Sigma_{j, 1}$, which is the same for all the first behaviours ($B_{j 1}$); see Figure 5d. The first behaviour of the cohesive group can easily be interpreted as a boundary-patrolling or scent-marking behaviour, which is common in this dog breed and has already been observed with similar movement characteristics (Mastrantonio, 2020; McGrew & Blakesley, 1982).

As in $B_{j 1}$, the length of ${\vec{F}}_{j, t_{j, i}}$ is ≈0 in $B_{j 2}$ and there is no attractor but, since the ellipses rotate according to the previous directions, there is directional persistence. The major and minor axes of the ellipses have different lengths, with the major one in the same direction as $ϕ_{j, t_{j, i - 1}}$, which means that we can expect to observe more movements along a straight line, that is, in the same direction as the previous bearing angle $ϕ_{j, t_{j, i - 1}}$, or in the opposite direction, $ϕ_{j, t_{j, i - 1}} - π$. The strength of directional persistence, measured by parameter $ρ_{j, 2}$, is very similar for all the $B_{j 2}$, as we can see from Figure 5e, where the probability values are close to 1. The movement speed in $B_{j 2}$ is higher than in $B_{j 1}$, and $B_{j 2}$ is fully characterized by $\Sigma_{j, 2}$ and $ρ_{j, 2}$. The strong directional persistence, the movement along a straight line, and the higher speed can lead to $B_{j 2}$ being interpreted as a defending behaviour, in which the dog defends the territory and livestock from predators, that is, mostly wild dogs and foxes that are present in the area (see Brook et al., 2012; Walton et al., 2017), or as an exploring behaviour (van Bommel & Invasive Animals Cooperative Research Centre, 2010; Mastrantonio, 2020).

The third behaviour is a BRW, since the CIs of $ρ_{j, 3}$ are very close to 0 and the ellipses are independent of the previous direction (see Figures 3 and 4). The CIs of $τ_{j, 3}$, which lie within [0.1, 0.28], indicate a moderate attraction to $μ_{j, 3}$; see Tables B.1, B.2, B.5 and B.6 in the Web-based Supporting Materials. The four dogs have the same spatial attractor, $μ_{j, 3}$, the same covariance matrix, $\Sigma_{j, 3}$, the same parameter $ρ_{j, 3}$ and, with the exception of the second dog, the same $τ_{j, 3}$, as we can see from Figures 3–5. Because of the large variance of the movement (the size of the ellipses), the spatial attractor can be considered as a tendency of these dogs to move toward the central patch of the space. Since we can see from figure c2 of van Bommel and Johnson (2012) that the attractor is close to where the livestock is, it is easy to interpret this behaviour as the dog attending livestock.

It is interesting to note that the parameters that define the three behaviours, that is, $\Sigma_{j, k}$ in $B_{j 1}$, ($\Sigma_{j, k}, ρ_{j, k}$) in $B_{j 2}$, and ($μ_{j, k}, τ_{j, k}, \Sigma_{j, k}, ρ_{j, k}$) in $B_{j 3}$, have a high probability of being the same between dogs (see Figure 5), which means that the dogs behave in a similar manner. In terms of the transition matrix, we can see, from Tables B.1, B.2, B.5 and B.6 in the Web-based Supporting Materials, that ${\hat{π}}_{j, 1, 2} > {\hat{π}}_{j, 1, 3}$ and ${\hat{π}}_{j, 3, 2} > {\hat{π}}_{j, 3, 1}$. Thus, after patrolling ($B_{j 1}$), it is more probable that the dog begins to explore the space, or to defend the property from predators spotted during patrolling ($B_{j 2}$), than to guard livestock ($B_{j 3}$). Similarly, after attending livestock, it is more probable that the dog switches to $B_{j 2}$ than to $B_{j 1}$.
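
The ${\hat{π}}$ values discussed above are posterior means obtained from the MCMC output. As a rough and purely illustrative analogue, the Python sketch below relabels a sequence of MAP behaviours by decreasing frequency, as done for $B_{jk}$, and tabulates the empirical transition frequencies between consecutive time points. The function names and the toy sequence are hypothetical, and empirical frequencies of this kind should not be expected to coincide with the posterior estimates reported in the Supporting Materials.

```python
from collections import Counter


def relabel_by_frequency(states):
    """Relabel states as 1, 2, ... in order of decreasing frequency."""
    counts = Counter(states)
    # Ties are broken by the original label to keep the result deterministic.
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    mapping = {old: new for new, old in enumerate(ordered, start=1)}
    return [mapping[s] for s in states]


def empirical_transitions(states):
    """Empirical frequency of moving from state a to state b in one step."""
    pair_counts = Counter(zip(states[:-1], states[1:]))
    out_counts = Counter(states[:-1])
    return {(a, b): c / out_counts[a] for (a, b), c in pair_counts.items()}


# Toy MAP sequence with arbitrary original labels (hypothetical data).
z_map = [7, 7, 7, 2, 2, 7, 7, 5, 2, 2, 7, 7, 7, 5, 5, 2]
z_ordered = relabel_by_frequency(z_map)
trans = empirical_transitions(z_ordered)

for (a, b), p in sorted(trans.items()):
    print(f"{a} -> {b}: {p:.3f}")
```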

The socially excluded dog and the old one. The socially excluded dog (j = 3) has 3 behaviours. One of them, $B_{31}$, is different from all the other dogs’ behaviours, while the other two, $B_{32}$ and $B_{33}$, are similar to those of the cohesive group: $B_{32}$ is similar to $B_{j 2}$ and $B_{33}$ is similar to $B_{j 1}$, with j = 1, 2, 5, 6; see Figures 3 and 4. $B_{33}$ is only characterized by the covariance matrix $\Sigma_{3, 3}$ since, as for $B_{j 1}$ in the cohesive group, ${\vec{F}}_{3, t_{3, i}}$ is ≈0 and there is no attractor or directional persistence. Therefore, due to the high probability of $\Sigma_{3, 3}$ having the same value as $\Sigma_{j, 1}$, with j = 1, 2, 5, 6 (see Figure 5d), $B_{33}$ may be interpreted in the same way as the first behaviours of the cohesive group, that is, as a boundary-patrolling behaviour.

$B_{32}$ is similar to the second behaviour of the cohesive group in terms of the covariance matrix (see Figure 5d), but there is a lack of directional persistence and a slight bias toward an attractor located in the central area; see the direction of ${\vec{F}}_{3, t_{3, i}}$. With this behaviour, the dog is exploring but, at the same time, staying in the proximity of the sheep paddock; see Figure 2c.

The CI of $τ_{3, 1}$ is very close to 1 and $B_{31}$ therefore has a strong attraction to the coordinates ${\hat{μ}}_{3, 1} = {(0.575, - 0.381)}^{'}$, as we can see from Table B.3 of the Web-based Supporting Materials. The CIs of $μ_{3, 1, 1}$ and $μ_{3, 1, 2}$ are very narrow and equal to (0.575, 0.576) and (−0.381, −0.380), respectively, which means that the attractor is well localized in space. We can see from figure c2 of van Bommel and Johnson (2012) that ${\hat{μ}}_{3, 1}$ is close to the owner's homestead. This behaviour cannot be interpreted as the dog attending livestock since, once it reaches the spatial attractor, it does not move very much, that is, the sizes of the ellipses are very small. Hence, this is probably a behaviour in which the dog stays close to the owner's house and rests.

For the last dog, that is, the old one (Rosie, dog 4), the model has found only two behaviours. $B_{41}$ has the same characteristics as the first behaviour of the cohesive group, with the same values of the covariance matrix, while $B_{42}$ is similar to the second behaviour of the cohesive group, with the same covariance matrix and similar directional persistence; see Figures 3–5. Hence, they can be interpreted in the same way as the first and second behaviours of the cohesive group. It is interesting to note that, since this dog is very old, she is no longer able to attend livestock, which is the activity that requires the most energy (higher speed), but she is still working, that is, checking the boundaries ($B_{41}$), exploring the space, and defending the territory ($B_{42}$).

5 FINAL REMARKS

In this work, we have proposed a new HMM which can be used to model trajectory-tracking data of multiple animals and which, according to the classification given by Scharf and Buderman (2020), is part of the indirect approach. Our model allows subsets of parameters to be shared between animals and behaviours, and the number of latent behaviours to be selected during the model fitting. The emission distribution is the STAP, but other distributions can be used by changing the model formalization accordingly. The model was used to help understand the behaviour of 6 Maremma Sheepdogs, observed on a property in Australia. The results show that there are many common features between the animals, such as the attractive point, and that most of the animals share the same number of behaviours as well as the same parameter values. The obtained results are easily interpretable, and the rich output offers an insight into the similarities between animals.

As a possible extension, we are currently exploring the use of covariates to model the probability that behaviours share parameters, and we are working on a different formalization that makes the model able to detect whether different animals tend to follow the same behaviours at the same time-points.

DATA AVAILABILITY STATEMENT

The dataset is freely available from the movebank repository https://www.datarepository.movebank.org/handle/10255/move.395

SUPPORTING INFORMATION

Additional supporting information may be found in the online version of the article at the publisher's website.

² It should be noted that a higher ICL and a lower DIC indicate a better fit.

ACKNOWLEDGEMENTS

The authors thank the Editor-in-Chief, the Associate Editor and the two anonymous reviewers for their comments that have greatly improved the manuscript. The work of the author has partially been developed under the MIUR grant Dipartimenti di Eccellenza 2018—2022 (E11G18000350001), conferred to the Dipartimento di Scienze Matematiche—DISMA,

REFERENCES

Anderson C.R., Lindzey F.G. (2003) Estimating cougar predation rates from GPS location clusters. The Journal of Wildlife Management, 307–316.

Bezanson, Edelman, Karpinski, Shah V.B. (2017) Julia: a fresh approach to numerical computing. SIAM Review.

Biernacki, Celeux, Govaert (2000) Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Transactions on Pattern Analysis and Machine Intelligence, 719–725.

Blackwell (1997) Random diffusion models for animal movement. Ecological Modelling, 100, –102.

van Bommel, Invasive Animals Cooperative Research Centre (2010) Guardian dogs: best practice manual for the use of livestock guardian dogs. Invasive Animals Cooperative Research Centre.

van Bommel, Johnson C.N. (2012) Good dog! Using livestock guardian dogs to protect livestock from predators in Australia's extensive grazing systems. Wildlife Research, 220–229.

van Bommel, Johnson (2014a) Data from: where do livestock guardian dogs go? Movement Patterns of Free-Ranging Maremma Sheepdogs. https://doi.org/10.5441/001/1.pv048q7v

van Bommel, Johnson C.N. (2014b) Where do livestock guardian dogs go? Movement patterns of free-ranging maremma sheepdogs. PLoS ONE.

van Bommel, Johnson C.N. (2016) Livestock guardian dogs as surrogate top predators? How Maremma sheepdogs affect a wildlife community. Ecology and Evolution, 6702–6711.

Brook L.A., Johnson C.N., Ritchie E.G. (2012) Effects of predator control on behaviour of an apex predator and indirect consequences for mesopredator suppression. Journal of Applied Ecology, 1278–1286.

Buderman F.E., Hooten M.B., Alldredge M.W., Hanks E.M., Ivan J.S. (2018) Time-varying predatory behavior is primary predictor of fine-scale movement of wildland-urban cougars. Movement Ecology.

Calabrese J.M., Fleming C.H., Fagan W.F., Rimmler, Kaczensky, Bewick et al. (2018) Disentangling social interactions and environmental drivers in multi-individual wildlife tracking data. Philosophical Transactions of the Royal Society B: Biological Sciences, 373(1746), 20170007.

Celeux, Forbes, Robert C.P., Titterington D.M., Futurs, Rhône-alpes (2006) Deviance information criteria for missing data models. Bayesian Analysis, 651–674.

Christ, Hoef J.V., Zimmerman D.L. (2008) An animal movement model incorporating home range and habitat selection. Environmental and Ecological Statistics.

Codling E.A., Plank M.J., Benhamou (2008) Random walk models in biology. Journal of the Royal Society Interface, 813–834.

Dunn J.E., Gipson P.S. (1977) Analysis of radiotelemetry data in studies of home range. Biometrics, –101.

Fortin, Morales J.M., Boyce M.S. (2005) Elk winter foraging at fine scale in Yellowstone National Park. Oecologia, 145, 334–342.

Fox E.B., Sudderth E.B., Jordan M.I., Willsky A.S. (2011) A sticky HDPHMM with application to speaker diarization. The Annals of Applied Statistics, 1020–1056.

Frühwirth-Schnatter, Malsiner-Walli (2019) From here to infinity: sparse finite versus Dirichlet process mixtures in model-based clustering. Advances in Data Analysis and Classification.

Gates A.J., Ahn Y.-Y. (2017) The impact of random models on clustering similarity. Journal of Machine Learning Research.

Gehring T.M., VerCauteren K.C., Cellar A.C. (2017) Good fences make good neighbors: implementation of electric fencing for establishing effective livestock-protection dogs. Human-Wildlife Interactions, 106–111.

Gelman, Carlin J.B., Stern H.S., Rubin D.B. (2013) Bayesian data analysis, 3rd edition. Boca Raton, FL: Chapman and Hall/CRC.

Gnedin, Kerov (2001) A characterization of GEM distributions. Combinatorics, Probability and Computing, 213–217.

Hastie D.I., Green P.J. (2012) Model choice using reversible jump Markov chain Monte Carlo. Statistica Neerlandica, 309–338.

Hebblewhite, Merrill (2008) Modelling wildlife and human relationships for social species with mixed-effects resource selection models. Journal of Applied Ecology, 834–844.

Hooten, Johnson, McClintock, Morales (2017) Animal movement: statistical models for telemetry data. Boca Raton, FL: CRC Press.

Hooten M.B., Scharf H.R., Hefley T.J., Pearse A.T., Weegman M.D. (2018) Animal movement models for migratory individuals and groups. Methods in Ecology and Evolution, 1692–1705.

Jammalamadaka S.R., Kozubowski T.J. (2004) New families of wrapped distributions for modeling skew circular data. Communications in Statistics - Theory and Methods, 2059–2074.

Jonsen (2016) Joint estimation over multiple individuals improves behavioural state inference from animal movement data. Scientific Reports, 20625.

Jonsen I.D., Flemming J.M., Myers R.A. (2005) Robust state-space modeling of animal movement data. Ecology, 2874–2880.

Langrock, King, Matthiopoulos, Thomas, Fortin, Morales J.M. (2012) Flexible and practical modeling of animal telemetry data: hidden Markov models and extensions. Ecology, 2336–2342.

Leos-Barajas, Gangloff E.J., Adam, Langrock, van Beest F.M., Nabe-Nielsen et al. (2017) Multi-scale modeling of animal movement and general behavior data using hidden Markov Models with hierarchical structures. Journal of Agricultural, Biological and Environmental Statistics, 232–248.

Maruotti, Punzo, Mastrantonio, Lagona (2016) A time-dependent extension of the projected normal regression model for longitudinal circular data based on a hidden Markov heterogeneity structure. Stochastic Environmental Research and Risk Assessment, 1725–1740.

Mastrantonio (2018) The joint projected normal and skew-normal: a distribution for poly-cylindrical data. Journal of Multivariate Analysis, 165.

Mastrantonio (2020) Modelling animal movement with directional persistence and attractive points. arXiv, 2012.03248.

Mastrantonio, Jona Lasinio, Pollice, Teodonio, Capotorti (2021) A Dirichlet process model for change-point detection with multivariate bioclimatic data. Environmetrics, e2699.

McClintock B.T., King, Thomas, Matthiopoulos, McConnell B.J., Morales J.M. (2012) A general discrete-time modeling framework for animal movement using multistate random walks. Ecological Monographs, 335–349.

McClintock B.T., Russell D.J.F., Matthiopoulos, King (2013) Combining individual animal movement and ancillary biotelemetry data to investigate population-level activity budgets. Ecology, 838–849.

McGrew J.C., Blakesley C.S. (1982) How Komondor dogs reduce sheep losses to coyotes. Journal of Range Management, 693–696.

Merrill S.B., David Mech (2000) Details of extensive movements by Minnesota wolves (Canis lupus). The American Midland Naturalist, 144, 428–433.

Michelot, Langrock, Patterson T.A. (2016) moveHMM: an R package for the statistical modelling of animal movement data using hidden Markov models. Methods in Ecology and Evolution, 1308–1315.

Michelot, Langrock, Bestley, Jonsen I.D., Photopoulou, Patterson T.A. (2017) Estimation and simulation of foraging trips in land-based marine predators. Ecology, 1932–1944.

Milner J.E., Blackwell P.G., Niu (2021) Modelling and inference for the movement of interacting animals. Methods in Ecology and Evolution.

Niu, Frost, Milner J.E., Skarin, Blackwell P.G. (2020) Modelling group movement with behaviour switching in continuous time. Biometrics. Available from: https://doi.org/10.1111/biom.13412

Patterson, Thomas, Wilcox, Ovaskainen, Matthiopoulos (2008) State-space models of individual animal movement. Trends in Ecology & Evolution.

Pohle, Langrock, van Beest F.M., Schmidt N.M. (2017) Selecting the number of states in hidden Markov models: pragmatic solutions illustrated using animal movement. Journal of Agricultural, Biological and Environmental Statistics, 270–293.

Scharf H.R., Buderman F.E. (2020) Animal movement models for multiple individuals. WIREs Computational Statistics, e1506.

Teh Y.W., Jordan M.I., Beal M.J., Blei D.M. (2006) Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476), 1566–1581.

Wade, Ghahramani (2018) Bayesian cluster analysis: point estimation and credible balls (with discussion). Bayesian Analysis, 559–626.

Walton, Samelius, Odden, Willebrand (2017) Variation in home range size of red foxes Vulpes vulpes along a gradient of productivity and human landscape alteration. PLoS ONE.

Westley P.A.H., Berdahl A.M., Torney C.J., Biro (2018) Collective movement in ecology: from emerging technologies to conservation and management. Philosophical Transactions of the Royal Society B: Biological Sciences, 373(1746), 20170004.

This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
