Extension of the glmm.hp package to zero-inflated generalized linear mixed models and multiple regression

Abstract

glmm.hp is an R package designed to evaluate the relative importance of collinear predictors within generalized linear mixed models (GLMMs). Since its initial release in January 2022, it has rapidly gained recognition and popularity among ecologists. However, the previous version of glmm.hp was limited to GLMMs fitted with the lme4 and nlme packages. The latest version extends these functions. It now accepts models fitted with the glmmTMB package, which enables it to handle zero-inflated generalized linear mixed models (ZIGLMMs). Furthermore, it introduces commonality analysis and hierarchical partitioning for multiple linear regression models, based on either unadjusted R² or adjusted R². This paper demonstrates the application of these new functionalities to make them more accessible to users.

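Abstract (translated from the Chinese)

Extension of the glmm.hp package to zero-inflated generalized linear mixed models and multiple regression

glmm.hp is an R package developed to evaluate the relative importance of collinear predictors in generalized linear mixed models (GLMMs). Since its release in January 2022, it has quickly gained recognition and popularity among ecologists. However, earlier versions of glmm.hp could handle only GLMMs fitted with the lme4 and nlme packages. The latest version adds new functionality. First, it integrates results from the glmmTMB package, so that it can handle zero-inflated generalized linear mixed models. In addition, it adds commonality analysis and hierarchical partitioning for ordinary multiple regression, based on both unadjusted R² and adjusted R². This paper demonstrates these new functions to make them easier for researchers to use.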

Keywords: commonality analysis, GLMMs, hierarchical partitioning, marginal R², multiple regression, relative importance, variance partitioning, zero-inflated model

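Keywords (translated from the Chinese): commonality analysis, generalized mixed-effects models, hierarchical partitioning, marginal R², multiple regression, relative importance, variance decomposition, zero-inflated model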

INTRODUCTION

Generalized linear mixed models (GLMMs) are widely used in modern ecological research owing to their flexibility in handling non-normal distributions and hierarchically structured data (Bolker et al. 2009). However, a challenge in using GLMMs is assessing the relative importance of correlated predictors (the fixed effects) for the response variable (Stoffel et al. 2017, 2021). To address this challenge, Lai et al. (2022a) introduced a specialized R package named ‘glmm.hp’, which enables researchers to quantify the individual contributions of predictors in GLMMs by decomposing the commonly used Nakagawa marginal R² (Nakagawa and Schielzeth 2013; Nakagawa et al. 2017).

The glmm.hp package extends the ‘average shared variance’ methodology developed by Lai et al. (2022b) for canonical analyses to GLMMs. The core idea of this method is to allocate the shared variance caused by collinear explanatory variables equally among the variables involved. The individual R² of each variable is then composed of its unique R² plus its allocated portion of the shared R². In this way, the individual R² values of all variables sum to the total R². Notably, this method yields results similar to those of other established techniques documented in the literature, such as ‘averaging over orderings’ (Kruskal and Majors 1989; Lindeman et al. 1980), ‘hierarchical partitioning’ (Chevan and Sutherland 1991) and ‘dominance analysis’ (Budescu 1993), which are frequently used in multiple linear regression analysis (Bi 2012). However, compared with the complex derivations behind these techniques, the ‘average shared variance’ method is more intuitive and easier to understand (Lai et al. 2022b).
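As a minimal sketch of the ‘average shared variance’ idea in the simplest case of two predictors, the base R code below splits the unadjusted R² of an ordinary linear model into unique and shared parts and then allocates the shared part equally. The use of the built-in mtcars data, the predictors wt and cyl, and the helper name r2() are assumptions made for illustration only; this is not the internal code of glmm.hp.

```r
# Helper (hypothetical name): unadjusted R2 of a linear model formula
r2 <- function(formula, data) summary(lm(formula, data = data))$r.squared

r2_full <- r2(mpg ~ wt + cyl, mtcars)  # both predictors
r2_wt   <- r2(mpg ~ wt, mtcars)        # wt alone
r2_cyl  <- r2(mpg ~ cyl, mtcars)       # cyl alone

# Unique parts: what each predictor adds beyond the other one
unique_wt  <- r2_full - r2_cyl
unique_cyl <- r2_full - r2_wt

# Shared part: variance explained jointly because wt and cyl are correlated
shared <- r2_wt + r2_cyl - r2_full

# Individual R2 = unique part + an equal portion of the shared part
individual <- c(wt  = unique_wt  + shared / 2,
                cyl = unique_cyl + shared / 2)
print(individual)

# The individual contributions add up to the total R2
stopifnot(isTRUE(all.equal(sum(individual), r2_full)))
```

With more than two predictors, each shared component would be divided by the number of predictors that take part in it, so the additivity shown in the last line is preserved.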

The glmm.hp package was first released on the Comprehensive R Archive Network (CRAN; https://cran.r-project.org/web/packages/glmm.hp/index.html) in January 2022. At the time of writing, the package has accumulated more than 12 000 downloads, as reported on the R package monitoring website (www.datasciencemeta.com/rpackages). A search on Google Scholar shows that the package has been used in more than 30 research papers. These figures highlight the increasing recognition and adoption of the glmm.hp package within the community of ecologists.

An article introducing the principles and operational procedures of the glmm.hp package was published in the sixth issue of this journal in 2022 (Lai et al. 2022a). In the version available at that time (0.0-3), the glmm.hp package was designed only for GLMMs fitted with the lme4 package (Bates et al. 2015) and the nlme package (Pinheiro et al. 2020). Since the publication of that paper, we have continuously enhanced the package’s capabilities. Specifically, glmm.hp now accepts GLMMs fitted with the glmmTMB package (Brooks et al. 2017). Additionally, it accepts ordinary multiple linear regression models fitted with the lm() function in base R. In this article, we illustrate these new functionalities of the glmm.hp package (version 0.1-0) through case studies.

WORKING EXAMPLE

glmm.hp() working example for glmmTMB()

Zero-inflated generalized linear mixed models (ZIGLMMs) are an extension of GLMMs that address the issue of excessive zero values in count data (Zeileis et al. 2008). In many ecological cases, count data contain more zeros than expected under a standard Poisson or negative binomial distribution (Harrison 2014). ZIGLMMs account for this excess of zeros by considering two processes: one process for the excess zeros (zero-inflation) and another process for the remaining counts (Brooks et al. 2017).
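The two-process idea can be illustrated with a small simulation in base R. The sketch below generates zero-inflated Poisson counts and compares the observed share of zeros with the share expected under a plain Poisson distribution with the same mean. The sample size, the Poisson mean, the zero-inflation probability and the seed are arbitrary values assumed for illustration.

```r
set.seed(1)      # assumed seed, for reproducibility only
n      <- 1000   # number of simulated observations (assumption)
lambda <- 2      # mean of the count process (assumption)
p_zero <- 0.3    # probability of a structural zero (assumption)

# Process 1: zero-inflation decides whether an observation is a structural zero
structural_zero <- rbinom(n, size = 1, prob = p_zero)

# Process 2: the remaining observations follow a Poisson count process
y <- ifelse(structural_zero == 1, 0L, rpois(n, lambda))

# Observed share of zeros in the simulated data
obs_zero <- mean(y == 0)
# Share of zeros expected under a plain Poisson with the same mean
pois_zero <- dpois(0, lambda = mean(y))
# Share of zeros implied by the two-process model
zip_zero <- p_zero + (1 - p_zero) * exp(-lambda)

print(c(observed = obs_zero, plain_poisson = pois_zero, two_process = zip_zero))
```

The observed share of zeros should lie close to the two-process value and clearly above the plain Poisson value, which is the pattern a zero-inflated model is meant to capture.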

When fitting ZIGLMMs with the glmmTMB package, researchers can use its flexible framework to model both the count portion and the zero-inflation portion of the data. The glmmTMB package allows different distributions (e.g. Poisson, negative binomial) and link functions to be specified for each part of the model. To learn more about the features of glmmTMB, readers can consult the help documentation provided with the package (https://cran.r-project.org/web/packages/glmmTMB/index.html).

We demonstrate the capabilities of the glmm.hp() function when applied to the output generated by the glmmTMB() function. To do so, we utilize a dataset containing information on the abundance of salamanders, which is readily available within the glmmTMB package. This dataset comprises count data representing the abundance of salamanders, recorded on four separate occasions across 23 different stream sites. Some of these sites have been affected by coal mining activities, and the observations encompass various salamander species and life stages (Price et al. 2016).

Here, we fit a ZIGLMM to evaluate the response of salamander abundance (count data) to coal mining (‘mined’ variable) and species (‘spp’ variable), with sample site set as the random effect and a Poisson distribution chosen for the counts. In this case, the aim is to compare the relative importance of coal mining (mined) and species (spp) for the abundance of salamanders.
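A model of this kind could be fitted and decomposed roughly as follows. This is a sketch under stated assumptions rather than the exact code behind the figures: the zero-inflation formula (~ mined) is assumed for illustration, the object names m1 and hp are hypothetical, and the block is wrapped in a check so that it runs only when glmmTMB and glmm.hp are installed.

```r
# Runs only when both packages are installed
if (requireNamespace("glmmTMB", quietly = TRUE) &&
    requireNamespace("glmm.hp", quietly = TRUE)) {
  library(glmmTMB)
  library(glmm.hp)

  # Salamander abundance data shipped with glmmTMB
  data(Salamanders, package = "glmmTMB")

  # Count part: mined + spp as fixed effects, site as a random intercept.
  # Zero-inflation part: ~ mined is an assumption made for this sketch.
  m1 <- glmmTMB(count ~ mined + spp + (1 | site),
                ziformula = ~ mined,
                family = poisson,
                data = Salamanders)

  # Decompose the marginal R2 between mined and spp
  hp <- glmm.hp(m1)
  print(hp)

  # n = 2 selects the second row of R2 types ('lognormal'), as in Fig. 1
  plot(hp, n = 2)
}
```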

The glmm.hp() function relies on the r.squaredGLMM() function from the MuMIn package (Bartoń 2022) to compute the marginal R² of GLMMs. For Poisson-distributed GLMMs, r.squaredGLMM() provides three types of R²: ‘delta’, ‘lognormal’ and ‘trigamma’. The differences among them result mainly from the different denominators used in the calculation of R² (Nakagawa and Schielzeth 2013; Nakagawa et al. 2017). For more details, readers can refer to the help documentation of r.squaredGLMM() in the MuMIn package (Bartoń 2022). Typically, users tend to prefer the highest R². We therefore plot the decomposition for the highest R² here (the ‘lognormal’ type), which is located in the second row of the output, by setting the argument ‘n = 2’ in the plot() generic function (Fig. 1). Under all three types of R², coal mining (mined) has a greater impact on the abundance of salamanders than species (spp). Note that our objective here is only to illustrate how the glmm.hp() function works with the output of the glmmTMB package.
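How the denominator differs among the three R² types can be sketched numerically. The base R code below uses the observation-level variance expressions for a Poisson model with log link as we understand them from Nakagawa et al. (2017): 1/λ for ‘delta’, ln(1 + 1/λ) for ‘lognormal’ and the trigamma function of λ for ‘trigamma’. The variance components and λ are made-up values, and r.squaredGLMM() may differ in how it estimates these quantities.

```r
var_fixed  <- 0.8   # variance of the fixed-effect predictions (hypothetical)
var_random <- 0.3   # random-intercept variance (hypothetical)
lambda     <- 1.5   # expected count on the response scale (hypothetical)

# Observation-level variance on the latent (log) scale, one value per R2 type
var_obs <- c(delta     = 1 / lambda,
             lognormal = log1p(1 / lambda),
             trigamma  = trigamma(lambda))

# Marginal R2: fixed-effect variance over total variance
r2_marginal <- var_fixed / (var_fixed + var_random + var_obs)
print(round(r2_marginal, 3))

# Type with the largest marginal R2 in this sketch
print(names(which.max(r2_marginal)))
```

Because ln(1 + 1/λ) is smaller than 1/λ, which in turn is smaller than the trigamma function of λ, the ‘lognormal’ type has the smallest denominator in this sketch and therefore the largest marginal R².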

Figure 1:

The relative importance of individual predictors for the abundance of salamanders (count data), obtained by applying glmm.hp() to the output of glmmTMB().

glmm.hp() working example for lm()

For ordinary multiple linear regression, several R packages are commonly used for R² decomposition (including commonality analysis and hierarchical partitioning). For instance, the yhat package is dedicated to commonality analysis (Nimon et al. 2013), while the hier.part package (Walsh and Mac Nally 2013), the relaimpo package (Grömping 2006) and the dominanceanalysis package (Navarrete and Soares 2020) are used for hierarchical partitioning. However, all of these packages decompose only the unadjusted R². In ecological research, it is standard practice to use the adjusted R², since the unadjusted R² is biased (Peres-Neto et al. 2006). To address this limitation, the current glmm.hp package has been expanded to cover both commonality analysis and hierarchical partitioning (selected through the ‘commonality’ argument) for ordinary multiple linear regression. It also offers both unadjusted R² and adjusted R² through the ‘type’ argument of the glmm.hp() function. These options allow us to explore the source of the negative values that occasionally arise during the decomposition of adjusted R². Such negative components can be attributed either to suppressor variables or to the use of adjusted R² (Nimon and Oswald 2013; Peres-Neto et al. 2006; Ray-Mukherjee et al. 2014). If negative values appear in the decomposition of adjusted R² but disappear in the decomposition of unadjusted R², it can be deduced that they result from the adjustment, as exemplified in the current case.

To illustrate the application of the glmm.hp() function to ordinary multiple linear regression (i.e. lm() in R), we use the built-in dataset ‘mtcars’ in R (R Core Team 2022). The data were sourced from the 1974 ‘Motor Trend US’ magazine and comprise fuel consumption along with 10 aspects of automobile design and performance for 32 automobiles. In this case, we investigate the relative importance of car weight (wt), number of carburetors (carb) and number of cylinders (cyl) for gasoline efficiency (miles per gallon, mpg).
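The analysis could be run roughly as sketched below. The ‘type’ and ‘commonality’ arguments are those described in the text; the value commonality = TRUE, the object names mod and hp, and the installation check around the glmm.hp calls are assumptions made for this sketch.

```r
# Ordinary multiple regression on the built-in mtcars data
mod <- lm(mpg ~ wt + carb + cyl, data = mtcars)

# The decomposition runs only when glmm.hp is installed
if (requireNamespace("glmm.hp", quietly = TRUE)) {
  library(glmm.hp)

  # Commonality analysis based on adjusted R2 (the default type)
  print(glmm.hp(mod, commonality = TRUE))

  # Commonality analysis based on unadjusted R2
  print(glmm.hp(mod, type = "R2", commonality = TRUE))

  # Hierarchical partitioning based on adjusted R2
  hp <- glmm.hp(mod)
  print(hp)
  plot(hp)
}
```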

Results from commonality analysis (Figs 2 and 3) and hierarchical partitioning (Fig. 4) indicate that car weight (wt) has the greatest impact on gasoline efficiency, followed by number of cylinders (cyl) and lastly number of carburetors (carb). Note that the built-in ‘mtcars’ dataset was used for convenience of demonstration, and this model may lack practical significance.
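For readers who wish to see where such components come from, the base R sketch below computes the commonality coefficients by hand, using the standard inclusion-exclusion formula over the R² values of all sub-models, for both adjusted and unadjusted R². It then allocates each shared component equally among the predictors involved. The helper names subset_r2(), subsets() and commonality() are hypothetical, and the code is an illustration of the general technique, not the implementation used in glmm.hp.

```r
# Hypothetical helper: R2 (adjusted or not) of mpg on a subset of predictors
subset_r2 <- function(vars, type = c("adjR2", "R2")) {
  type <- match.arg(type)
  if (length(vars) == 0) return(0)  # intercept-only model explains nothing
  fit <- summary(lm(reformulate(vars, response = "mpg"), data = mtcars))
  if (type == "adjR2") fit$adj.r.squared else fit$r.squared
}

# Hypothetical helper: all non-empty subsets of a character vector
subsets <- function(x) {
  unlist(lapply(seq_along(x), function(m) combn(x, m, simplify = FALSE)),
         recursive = FALSE)
}

# Hypothetical helper: commonality coefficients by inclusion-exclusion
commonality <- function(preds, type = "adjR2") {
  sets <- subsets(preds)
  coefs <- sapply(sets, function(target) {
    rest  <- setdiff(preds, target)                   # predictors outside the component
    parts <- c(list(character(0)), subsets(target))   # all subsets of the target set
    sum(sapply(parts, function(a) {
      (-1)^(length(a) + 1) * subset_r2(c(rest, a), type)
    }))
  })
  names(coefs) <- sapply(sets, paste, collapse = " & ")
  coefs
}

preds  <- c("wt", "carb", "cyl")
sets   <- subsets(preds)
ca_adj <- commonality(preds, "adjR2")  # based on adjusted R2
ca_raw <- commonality(preds, "R2")     # based on unadjusted R2
print(round(cbind(adjusted = ca_adj, unadjusted = ca_raw), 3))

# Hierarchical partitioning: each component is split equally
# among the predictors that take part in it
individual <- sapply(preds, function(p) {
  keep <- sapply(sets, function(s) p %in% s)
  sum(ca_adj[keep] / lengths(sets)[keep])
})
print(round(individual, 3))
```

Comparing the two columns of the printed table shows which components change sign between the adjusted and unadjusted versions, which is the comparison used in the text to trace the source of negative values.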

Figure 2:

Commonality analysis of three variables on gasoline efficiency based on adjusted R² (the default) by glmm.hp(). The common variance between ‘wt’ and ‘carb’ is negative (−0.002).

Figure 3:

Commonality analysis of three variables on gasoline efficiency based on unadjusted R² by glmm.hp() (setting the argument type = ‘R2’). The common variance between ‘wt’ and ‘carb’ changes from a negative value (−0.002) in the adjusted R² scenario to a positive value (0.002) in the unadjusted R² scenario; hence it can be inferred that the negative value is caused by the adjusted R².

Figure 4:

The relative importance of individual variables for gasoline efficiency based on adjusted R², obtained through hierarchical partitioning with glmm.hp().

DISCUSSION

The glmmTMB package provides a versatile platform for fitting complex models such as ZIGLMMs, which are particularly useful for analysing count data with excessive zeros (Brooks et al. 2017). Its flexibility in handling various response distributions and random-effect structures makes it a valuable tool for researchers in fields such as ecology and biology (Douma and Weedon 2019). The glmm.hp package can now decompose the output of glmmTMB(), which greatly expands the functionality of glmm.hp() and provides valuable insights for interpreting glmmTMB() output.

The current glmm.hp package can perform both commonality analysis and hierarchical partitioning for ordinary multiple regression models. Furthermore, it can decompose both the unadjusted R² and the adjusted R². This enhancement addresses a limitation of commonly used packages such as ‘yhat’ for commonality analysis (Nimon et al. 2013) and ‘hier.part’ (Walsh and Mac Nally 2013), ‘relaimpo’ (Grömping 2006) and ‘dominanceanalysis’ (Navarrete and Soares 2020) for hierarchical partitioning, none of which supports the decomposition of adjusted R². The advantages of decomposing adjusted R² are 2-fold. First, adjusted R² provides an unbiased estimate of R² and is widely used in ecology (Peres-Neto et al. 2006). Second, by comparing the results for unadjusted and adjusted R², researchers can determine whether the negative values observed in commonality analysis or hierarchical partitioning result from the use of adjusted R². These new capabilities enhance the precision and reliability of regression model analysis, making the glmm.hp package a valuable tool for researchers in various domains.

According to our Google Scholar search, at the time of writing the glmm.hp package has been used in more than 30 peer-reviewed research papers to partition the marginal R² of GLMMs. These publications span a wide spectrum of scientific disciplines, illustrating the versatility and applicability of the package. The fields and associated references include plant ecology (e.g. Gu et al. 2022; Guo et al. 2022; Wan et al. 2023; Yan et al. 2023; Yang et al. 2023; Zhang et al. 2022), animal ecology (e.g. Ao et al. 2022; Liu et al. 2023; Wang et al. 2023), environmental science (e.g. Agusto et al. 2022; Wu et al. 2023), agriculture (e.g. Sha et al. 2023), microbiology (e.g. Chen et al. 2023; Fu et al. 2022) and conservation biology (e.g. Le et al. 2023; Tobisch et al. 2023). The adoption of the glmm.hp package across these diverse domains indicates that it has become a trusted tool among researchers in ecology and related fields.

In the future, we will continue to enhance the capabilities of the glmm.hp package and to optimize its analytical speed to meet the evolving needs of researchers. We encourage all researchers who use the glmm.hp package in their studies to provide proper attribution by citing this article. Citation information can be obtained by typing the following command in R: citation("glmm.hp"). This practice helps acknowledge and support the ongoing development and maintenance of this analytical resource for the scientific community.

Funding

The work was supported by the National Natural Science Foundation of China (32271551) and the Metasequoia funding of Nanjing Forestry University.

Conflict of interest statement. The authors declare that they have no conflict of interest.

REFERENCES

Agusto, Qin, Thibodeau, et al. (2022) Fiddling with the blue carbon: Fiddler crab burrows enhance CO₂ and CH₄ efflux in saltmarsh. Ecol Indic 144:109538.
