A Smoothing-Based Goodness-of-Fit Test of Covariance for Functional Data

Abstract

Functional data methods are often applied to longitudinal data as they provide a more flexible way to capture dependence across repeated observations. However, there is no formal testing procedure to determine if functional methods are actually necessary. We propose a goodness-of-fit test for comparing parametric covariance functions against general nonparametric alternatives for both irregularly observed longitudinal data and densely observed functional data. We consider a smoothing-based test statistic and approximate its null distribution using a bootstrap procedure. We focus on testing a quadratic polynomial covariance induced by a linear mixed effects model and the method can be used to test any smooth parametric covariance function. Performance and versatility of the proposed test is illustrated through a simulation study and three data applications.

functional data analysis, Functional principal components analysis, hypothesis testing, linear mixed effects models, longitudinal data analysis

Issue Section:

Biometric Methodology > Biometric Methodology > Discussion Papers

1 Introduction

Functional data have become increasingly common in fields such as medicine, agriculture, and economics. Functional data usually consist of high frequency observations collected at regular intervals, see Ramsay and Silverman (2002, 2005) for an overview of methods and applications. By comparison, longitudinal data typically consist of repeated observations collected at a few time points varying across subjects. In recent years, functional data methods have been successfully extended and applied to longitudinal data (James et al., 2000; Yao et al., 2005). While these methods are more flexible, their estimation and interpretation are more cumbersome than longitudinal methods and require more sampling units or observations for accurate and reliable estimates. Thus, it is natural to question if such flexibility is truly necessary. This article focuses on comparing longitudinal data methods with functional data methods. For example, we consider the case of testing if a simple linear mixed effects model is sufficient for longitudinal data or if a more complex functional data model is required.

This work is motivated by the CD4 cell count dataset from the Multicenter AIDS Cohort Study (Kaslow et al., 1987). CD4 count is a key indicator for AIDS disease progression, and understanding its behavior over time is critical for monitoring HIV+ patients. The dataset is highly sparse, with 5 to 11 irregularly-spaced observations per subject. CD4 counts have been extensively analyzed using longitudinal data methods, e.g., semiparametric and linear random effects models (Taylor et al., 1994; Zeger and Diggle, 1994; Fan and Zhang, 2000). Recently, functional data methods have also been applied to this data (Yao et al., 2005; Goldsmith et al., 2013; Xiao et al., 2018). While the nonparametric functional data methods are highly flexible and better adapt to subject-specific patterns, they are more difficult to implement and interpret compared to the parametric approaches. Therefore it is of interest to test whether the simpler longitudinal methods are sufficient for the data. To the authors’ best knowledge, no formal testing procedure exists for this application.

The inherent difference between functional and traditional longitudinal data methods is in the correlation model between repeated observations. For functional methods, the covariance within a subject is assumed to be smooth with an unknown nonparametric form. The covariance can be estimated by smoothing the sample covariance (Besse and Ramsay, 1986; Yao et al., 2005; Xiao et al., 2018) or constructing a reduced rank approximation by estimating basis functions from smoothed sample curves (James et al., 2000; Peng and Paul, 2009). In contrast, longitudinal data approaches typically assume a simple parametric covariance structure with a few parameters, such as autoregressive or exponential (see Diggle et al. (2002) for an overview), or induced by a random effects model (Laird and Ware, 1982).

Existing work on testing parametric versus nonparametric functions is limited to density and regression functions for the response variable, but has been extended to settings such as semiparametric and functional models; see González-Manteiga and Crujeiras (2013) for a recent review. Hardle and Mammen (1993) propose a smoothing-based goodness-of-fit statistic for regression functions, derive the asymptotic normal distribution, and develop a “wild” bootstrap algorithm for finite samples. Comparisons have also been applied to functional regression for model diagnostics and evaluating assumptions (Chiou and Muller, 2007; Bucher et al., 2011) and testing functional coefficients (Swihart et al., 2014; McLean et al., 2015; Kong et al., 2016). The proposed method is an extension of smoothing-based methods to test the form of the covariance function.

For high-dimensional multivariate data, where observation points are regular and balanced (same for all subjects), a number of methods exist to test an identity or spherical covariance matrix against an unstructured alternative (Ledoit and Wolf, 2002; Bai et al., 2009). Recently, Zhong et al. (2017) developed a general goodness-of-fit test that can be applied to many common parametric covariances. However, these methods are ill-suited for the comparison between functional and longitudinal data models because they (a) fail to account for the underlying smoothness of the process and (b) require data observed at fixed time points for all subjects, i.e., a (fixed) common design The CD4 dataset has an irregular design where time points differ for each subject, so cannot be tested with these approaches. Note that the random design, where observed time points are independent between and within the subjects, is a special case of the irregular design. Common or random designs are typically assumed in theoretical studies of functional data (Cai and Yuan, 2011).

The objective of this article is to develop a testing procedure for comparing parametric longitudinal versus nonparametric functional data covariance models applied to repeated measured data with irregular and/or highly frequent sampling design. Note that longitudinal data with only a few repeated measurements per subject with a regular sampling design is not within the scope of this article. Selecting an adequate covariance model is critical, because model misspecification can bias estimation and inference, while an unnecessarily complex model can slow computation and interfere with model interpretation. We propose a goodness-of-fit test based on the difference between the estimated parametric and nonparametric covariances, inspired by Hardle and Mammen (1993). Compared to Zhong et al. (2017) for high-dimensional multivariate data, our test statistic can be evaluated using a more flexible modeling approach that accounts for general designs and exploits the underlying smoothness of repeated observations. However, deriving the distribution of the test statistic is challenging and we use bootstrapping to approximate the null distribution. To demonstrate performance and versatility of the proposed test, we present a simulation study and three data applications.

The remainder of this article is organized as follows. Section 2 presents the statistical model and hypothesis test, Section 3 details the proposed test, and Section 4 describes our implementation. Section 5 outlines extensions to general smooth covariance functions. Section 6 presents a simulation study. Section 7 details three applications to diffusion tensor imaging, child growth, and CD4 cell count. Finally, Section 8 summarizes the article and discusses limitations of the proposed test.

2 Statistical Framework

Consider functional or longitudinal data where i denotes the subject index, j denotes the visit index, and is the measurement for the i-th subject at time ⁠. Here, n is the number of subjects and the number of observations for the i-th subject, which can vary across subjects. Assume that is a closed and compact domain. Data are often observed with noise, so we posit the model

(1)

Here is a smooth mean function, is a zero-mean Gaussian random function independent between subjects, and is Gaussian white noise independently and identically distributed with zero mean and variance ⁠, independent of ⁠.

Let be the covariance function of ⁠. Assume that is a smooth, positive semidefinite bivariate function defined on ⁠.

We are interested in the form of the covariance, and would like to test the hypothesis that has a known parametric form against a general alternative. Motivated by the CD4 dataset, which has previously been fit with a linear random intercept and slope model, we focus on the quadratic polynomial function

(2)

where (⁠⁠) are unknown parameters. Because this covariance is induced by the linear random effects model ⁠, where are random effects with zero mean and ⁠, ⁠, and ⁠, testing is equivalent to testing if a linear random (or mixed) effects model is sufficient for the data. Note that this is a specific case of the general linear random effects model for random effects with zero mean and variance and known functions ⁠, which has covariance function ⁠, where ⁠. While we focus on (2), the proposed test can be easily adapted for the more general random effects case or any smooth parametric covariance with finite parameters, as discussed in Section 5. Ideally, scientific or expert knowledge about the underlying process should guide the choice of ⁠. If such information is unavailable, a commonly used and interpretable structure would be preferred.

Formally, the hypothesis test can be written as

(3)

Under the null hypothesis, the covariance has a specific parametric form with finite parameters. Under the alternative hypothesis, the covariance function is assumed only to be smooth and positive semidefinite. This flexibility may better capture heterogeneity across subjects but is hard to estimate and interpret compared to a parametric model. Therefore, it is desirable to test goodness-of-fit for these two types of models. In the following section, we propose a distance-based goodness-of-fit test for (3) that can be applied to functional data with either a dense common or sparse irregular sampling design.

3 Smoothing-based Test

We propose a test statistic based on the distance between the covariance functions estimated under the null and alternative hypotheses, respectively. In the remainder of this section, we describe covariance estimation under the null and alternative hypotheses, and then introduce our test statistic. The smooth mean can be estimated non-parametrically with spline smoothing (Ruppert et al., 2003; Wood, 2003), allowing us to consider only the de-meaned data for modeling ⁠. See Section 4 for details.

3.1 Null Model

Under the null hypothesis, is a quadratic polynomial covariance, corresponding to

(4)

Here, is a linear random effects model with subject-specific random intercepts and slopes, and ⁠, respectively. Let be the -length vector of de-meaned observations for the i-th subject observed at times ⁠, and be the corresponding covariance matrix, where is a -length vector of ones and is a identity matrix. Then the unknown parameters in model (4) can be estimated by maximizing the log-likelihood ⁠, where is the determinant of the matrix ⁠, using an expectation-maximization (EM) or Newton–Raphson algorithm, as outlined in Lindstrom and Bates (1988).

3.2 Alternative Model

Under the alternative hypothesis, the covariance function has a smooth, nonparametric form. Approximate by smoothing the sample covariance using tensor product regression splines as ⁠, where are a sequence of cubic B-spline basis functions defined over and are coefficients estimated by minimizing the least squares expression

(5)

under the natural symmetry constraint that ⁠. Denote the estimated alternative covariance as ⁠.

The measurement error, ⁠, in equation (1) can be estimated following Yao et al. (2005) and Goldsmith et al. (2013) by averaging the distance between the diagonals of the raw sample covariance, i.e., for ⁠, and ⁠. To mitigate boundary effects, only the middle 50% of is considered (Staniswalis and Lee, 1998; Yao et al., 2005).

3.3 Test Statistic

Using the estimated null and alternative covariances, and ⁠, the proposed test statistic is the Hilbert–Schmidt norm distance

(6)

where for bivariate function f and is the smoothed null covariance estimate using tensor-product B-splines. That is, replace with in the least squares expression (5) to estimate so ⁠. Using the smoothed null eliminates the bias from nonparametric function estimation and is common practice for nonparametric regression tests; see, e.g., Hardle and Mammen (1993). A large indicates that the null parametric covariance approximates the true covariance poorly. The null distribution of is difficult to derive as estimation of the alternative is based on second moments of the observed responses. Moreover, even in settings where the null distribution of distance-based test statistic is available, Hardle and Mammen (1993) show that the test statistic converges slowly and recommends bootstrapping instead. In the next section, we propose a wild bootstrap algorithm (Wu, 1986) for the null distribution of following Hardle and Mammen (1993). Note that one may also consider an empirical version of the proposed test statistic evaluated at the paired time points (see Web Appendix A for an example); we focus on (6) throughout this article.

3.4 Approximate Null Distribution of T_n via a Wild Bootstrap

Denote the l-th bootstrap sample as ⁠, where for the original time points ⁠. Let be the estimated smooth mean function, be subject trajectories generated from the estimated null model in (4), and be simulated residuals using the estimated measurement error in Section 3.2. The test statistic, ⁠, can be calculated from the resulting bootstrap sample, and the process is repeated to obtain an approximation of the null distribution of ⁠. If the observed statistic is large compared to the null approximation, then reject ⁠. This “wild” bootstrap procedure (Wu, 1986) is outlined in Algorithm 1 and is valid in the regression function setting (Hardle and Mammen, 1993).

Open in new tab

Algorithm 1 Parametric bootstrap for null distribution of
1: fordo
2: Generate from for ⁠, where is the estimated parameter matrix under the null hypothesis in (4).
3: Sample for and ⁠, where is the measurement error estimated under the alternative model in Section 3.2.
4: Define the l-th bootstrap dataset as ⁠.
5: Estimate and subtract the mean function for the bootstrap data, ⁠.
6: Fit the l-th bootstrap dataset with model (4) and estimate ⁠.
7: Fit the l-th bootstrap dataset with model (5) and calculate ⁠.
8: Calculate the test statistic ⁠.
9: end for
10: Calculate p-value⁠, where is an indicator function with value 1 if the condition is true, and 0 otherwise.

Algorithm 1 Parametric bootstrap for null distribution of
1: fordo
2: Generate from for ⁠, where is the estimated parameter matrix under the null hypothesis in (4).
3: Sample for and ⁠, where is the measurement error estimated under the alternative model in Section 3.2.
4: Define the l-th bootstrap dataset as ⁠.
5: Estimate and subtract the mean function for the bootstrap data, ⁠.
6: Fit the l-th bootstrap dataset with model (4) and estimate ⁠.
7: Fit the l-th bootstrap dataset with model (5) and calculate ⁠.
8: Calculate the test statistic ⁠.
9: end for
10: Calculate p-value⁠, where is an indicator function with value 1 if the condition is true, and 0 otherwise.

Open in new tab

Algorithm 1 Parametric bootstrap for null distribution of
1: fordo
2: Generate from for ⁠, where is the estimated parameter matrix under the null hypothesis in (4).
3: Sample for and ⁠, where is the measurement error estimated under the alternative model in Section 3.2.
4: Define the l-th bootstrap dataset as ⁠.
5: Estimate and subtract the mean function for the bootstrap data, ⁠.
6: Fit the l-th bootstrap dataset with model (4) and estimate ⁠.
7: Fit the l-th bootstrap dataset with model (5) and calculate ⁠.
8: Calculate the test statistic ⁠.
9: end for
10: Calculate p-value⁠, where is an indicator function with value 1 if the condition is true, and 0 otherwise.

Algorithm 1 Parametric bootstrap for null distribution of
1: fordo
2: Generate from for ⁠, where is the estimated parameter matrix under the null hypothesis in (4).
3: Sample for and ⁠, where is the measurement error estimated under the alternative model in Section 3.2.
4: Define the l-th bootstrap dataset as ⁠.
5: Estimate and subtract the mean function for the bootstrap data, ⁠.
6: Fit the l-th bootstrap dataset with model (4) and estimate ⁠.
7: Fit the l-th bootstrap dataset with model (5) and calculate ⁠.
8: Calculate the test statistic ⁠.
9: end for
10: Calculate p-value⁠, where is an indicator function with value 1 if the condition is true, and 0 otherwise.

4 Implementation

First, estimate the smooth mean using thin plate regression splines (Wood, 2003) using the gam function in the R package mgcv (Wood, 2017), and subtract from the data. The null model in (4) is a standard random effects model that can be estimated using the lme function in the R package nlme (Pinheiro et al., 2017). For the least squares expression in (5) to smooth the alternative and null covariance estimates, we use cubic B-splines per axis with equally-spaced interior knots. The choice of 10 B-splines balances performance and computational speed, see Web Appendix B for a sensitivity study. While the number of splines needs only be sufficiently large, additional splines may be needed if the data is known or observed to be highly wiggly. Cross-validation or Aikaike information criterion (AIC) may be used for a formal selection (see Wood (2003) for discussion).

5 Extensions

5.1 Smooth Covariance

Any smooth parametric covariance function can be tested using the proposed procedure, with modification to the null model and bootstrap algorithm. For example, consider the stationary Gaussian or quadratic exponential covariance function ⁠, where ⁠, and (⁠⁠) are parameters to be estimated. The null model can be estimated using likelihood-based methods, and bootstrap data generated as ⁠, where is the estimated mean vector of length ⁠, is the estimated null covariance matrix defined by ⁠, is the square root matrix where ⁠, is an -length vector of independent samples from a standard normal distribution, and is an independent vector of residuals from ⁠.

6 Simulation Study

We conduct a simulation study to evaluate performance of the proposed bootstrap test and two competing methods, described in Section 6.1, for testing the hypothesis in (3) that the covariance has a quadratic polynomial form. Data are generated as

(7)

for subjects and observations per subject. The scalar, ⁠, controls the magnitude of deviation from the null model. The mean, ⁠, is set to 0 and the residuals are distributed ⁠, independent of ⁠. Random intercepts and slopes are sampled from a bivariate normal distribution with zero mean, and ⁠, independent of the non-linear function ⁠, defined below. The are observed on a grid of 80 equally spaced points in ⁠. If ⁠, the subject is observed at all points and if ⁠, observed time points are uniformly sampled for each subject from the 80 possible points. Tuning parameters are selected as described in Section 4. Consider a factorial combination of the following factors:

(1)
Observations per subject (⁠⁠): (a) ⁠, (b) ⁠, (c) ⁠, (d)
(2)
Deviation from the null model:
- (a)
  Quadratic:
- (b)
  Trigonometric:⁠, ⁠.

For each factor combination, we use bootstrap samples per dataset and consider and 500 subjects, and for the setting only. Performance is evaluated in terms of the empirical type I error rate (size) for nominal levels and based on 5000 simulated datasets, and power at the level with 1000 simulated datasets. Results are presented in terms of deviation from the null, defined as ⁠.

6.1 Competing Methods

As discussed in Section 1, we are unaware of any existing methods for testing covariance that can be applied to all functional or longitudinal data settings. In this subsection, we describe two testing methods that can be applied to specific scenarios of the hypothesis test in (3).

6.1.1 Direct Test

Consider the case where covariance under the alternative hypothesis has a known, parametric form so the null model for is nested within the alternative model. In essence, test if a more complex covariance better explains the data than the null covariance. For the quadratic polynomial covariance, an alternative may be ⁠. Then the alternative model can be written as

(8)

Note that this is the model for the quadratic deviation setting in the simulation study. Like the null model, (8) can be estimated using the lme function in the R package nlme (Pinheiro, et al., 2017). The hypothesis test is equivalent to testing if ⁠, or versus ⁠.

Testing zero-value variance components is a non-standard problem because the null hypothesis is on the boundary of the parameter space. Self and Liang (1987) derive the asymptotic null distribution of the likelihood ratio test (LRT) for this setting as a mixture of chi-squared distributions. Crainiceanu and Ruppert (2004) derive the exact finite sample null distribution for the (R)LRT of mixed models with one random effect, and Greven et al. (2008) extend this approach to models with multiple random effects using pseudolikelihood. Because of the limited sample size in our simulation study, we use the finite sample null distribution from Greven et al. (2008), which can be preformed efficiently using the exactRLRT function in the R package RLRsim (Schiepl and Bolker, 2016).

6.1.2 Multivariate Test

The Zhong et al. (2017) test for high-dimensional multivariate data can be applied to functional data with a common design. Consider a repeated measures model ⁠, where is a vector of responses, is a mean vector of length m, and residuals are distributed ⁠. Denote as the parameter vector defining the covariance matrix under the null hypothesis, ⁠. Let be the alternative unstructured covariance.

Based on the squared-Frobenius distance between the null and alternative covariances, ⁠, Zhong et al. (2017) propose the test statistic ⁠, where is an unbiased estimator for and adjusts for errors in the estimation of ⁠. The hypothesis test in (3) can be conducted by testing if is significantly larger than 0. With some assumptions on the covariance structure, the asymptotic normal and fixed sample weighted-chi square null distributions can be determined for any parametric covariance, and we provide derivations for the quadratic polynomial covariance in Web Appendix C. In our simulation study, the multivariate test can only be applied to the dense case, and we use 10,000 samples to approximate the fixed sample null distribution. In Web Appendix D, we also consider performance of the multivariate test in less-ideal settings with small m and unequally-spaced data.

6.1.3 Limitations of the Competing Methods

While both competitors utilize test statistics with known null distributions, these tests only apply to limited scenarios. The direct test applies when the alternative is known, parametric, and a superset of the null model. The multivariate test only applies to data with a common design and assumes an unstructured covariance that does not account for smoothness. Thus, the bootstrap test is expected to be more powerful than the multivariate test for testing functional data.

6.2 Simulation Results

Table 1 reports the empirical type I error rates for all three methods. As the multivariate test requires a common sampling design, it can only be applied to the setting. We report only the fixed-sample weighted chi-squared distribution; results for the asymptotic normal distribution were similar and are presented in Web Appendix D. All three methods have empirical levels close to the nominal levels, although both the bootstrap and direct tests can be slightly conservative for several settings.

Table 1

Open in new tab

Estimated type I error rates for the bootstrap, direct, and multivariate tests at the nominal formula and formula levels based on 5000 datasets, by number of subjects (n) and observations per subject (m). The standard error was 0.003 and 0.004 for formula and formula ⁠, respectively. The multivariate test is applicable for only the dense formula setting

Bootstrap

Direct

Multivariate

α=0.05

α=0.10

α=0.05

α=0.10

α=0.05

α=0.10

100

0.059

0.126

0.044

0.086

n/a

0.045

0.105

0.049

0.098

n/a

0.042

0.093

0.049

0.102

n/a

0.042

0.091

0.045

0.096

0.053

0.103

500

0.047

0.105

0.049

0.096

n/a

0.050

0.103

0.049

0.096

n/a

0.050

0.100

0.043

0.090

n/a

0.044

0.093

0.046

0.096

0.053

0.105

Figure 1 presents simulation results for the quadratic and trigonometric deviations from the null, by number of observations per subject, m. For all methods, power increases with sample size, particularly as data are more densely sampled and the covariance is better estimated. As expected, power depends on how closely the true model matches the specific alternative assumed by the test. The bootstrap test outperforms the multivariate test for all settings because of the more specific form of its alternative, and has higher power than the direct test when the direct test has misspecified the alternative (trigonometric deviation). Conversely, the direct test has higher power when the parametric alternative is correctly specified (quadratic deviation). Both the bootstrap and multivariate tests are better able to detect the trigonometric deviation because the covariance more obviously deviates from the null model. Overall, the bootstrap test performs well in most settings, except when the dataset is small and deviation from the null is small. For example, when and ⁠, the test is underpowered for the quadratic deviation when signal size is small.

Figure 1

Power under the quadratic (top) and trigonometric (bottom) deviations from the null, by number of observations per subject, m. Shown are: bootstrap test (solid), multivariate test (long and short-dash), and direct test (long-dash), for formula (light gray), formula (dark gray), and formula (black) subjects. The multivariate test is not applicable when formula ⁠.

Open in new tab Download slide

In terms of computational speed, the bootstrap test is, unsurprisingly, significantly slower than the competitor methods. A personal laptop with a 2.9 GHz processor took 1–7 min to run a single iteration, compared to 1 and 0.2 s for the direct and multivariate tests, respectively, for the null model with 100 subjects. Reducing the density of inputted data or L number of bootstrap samples can decrease computational time, with some loss of power.

7 Applications

7.1 Diffusion Tensor Imaging

We first consider a dataset of diffusion tensor imaging (DTI) of intracranial white matter microstructure with dense, common sampling design for a group of normal and multiple sclerosis patients. Images of the white matter are depicted with tract profiles shown in Figure 2 and available in the R package refund (Goldsmith et al., 2016); see Reich et al. (2010) for study details. Goldsmith et al. (2011) consider this dataset for modeling multiple sclerosis disease status, concluding that inclusion of the tract profile as a functional predictor improves model performance compared to a subject-specific average of the profile. Note that a subject-specific average is equivalent to the subject-specific intercept in the null model in (4). We evaluate this conclusion formally by testing if a quadratic polynomial covariance is sufficient for modeling the tract profiles, using the bootstrap, multivariate, and direct tests.

We focus on the baseline tract profiles of the corpus callosum (CCA), associated with cognitive function, for multiple sclerosis patients, observed on a dense, regular grid of 93 points. After removing subjects with missing observations, the dataset has profiles from 99 subjects, for a total of 9207 observations. Tuning parameters are selected as described in Section 4. The observed test statistic for the bootstrap test is corresponding to ⁠. The direct test yields an RLRT statistic of 1160.6 corresponding to ⁠. The multivariate test yields an observed test statistic of ⁠, corresponding to for both the weighted chi-squared and asymptotic normal distributions. All three tests support the conclusion that a quadratic polynomial covariance is inadequate for the data, and that a functional method should be used.

Figure 2

(left) Diffusion tensor imaging (DTI) of corpus callosum (CCA) baseline tract profiles from 99 multiple sclerosis patients. (middle) Height measurements (cm) for 215 children from 0 to 729 days after birth. (right) Log-transformed CD4 cell counts from 208 subjects for −18 to 52 months since seroconversion. On each plot, three example trajectories are highlighted in black.

Open in new tab Download slide

7.2 Child Growth Measurements

Next, consider the CONTENTS child growth dataset from Lima, Peru (Xiao et al., 2018). The dataset contains irregularly sampled height measurements for 215 children covering 0 to 729 days after birth, for a total of 8839 observations (20–50 observations per subject, observed at different time points), shown in Figure 2. Subject trajectories predicted using functional principal components analysis, shown in Xiao et al. (2018), exhibit curvature not captured by a linear parametric model, suggesting that a functional approach is necessary for the data. We consider this observation formally by testing the quadratic polynomial covariance for the growth data using the bootstrap and direct tests.

The observed test statistic for the bootstrap test is ⁠, corresponding to ⁠, while the RLRT statistic from the direct test is 2205.8, corresponding to ⁠. Both tests indicate that the parametric quadratic polynomial covariance is not sufficient for the data, and a functional approach should be used instead.

7.3 Cd4 Count Data

Last, we consider the motivating example of CD4 cell counts described in Section 1 by conducting a formal test of the quadratic polynomial covariance using the bootstrap and direct tests. The dataset is available in the R package refund (Goldsmith et al., 2016) and includes cell counts from −18 to 52 months since seroconversion; we log-transform the counts to stabilize variability. We consider only subjects with at least 5 observations and who have log-transformed cell counts greater than 4, for a total of 1402 observations from 208 subjects (5–11 observations per subject). The cleaned and log-transformed data are shown in Figure 2.

Because data are sparser than the settings considered in the full study, we conduct a small simulation study to check the size and power of the tests. Simulated data are generated as ⁠, where is defined below, are the time points in the original dataset, and ⁠, where is the estimated error variance under the alternative model. The random function is generated from a multivariate normal distribution with zero-mean and covariance where and are the estimated covariance matrices from the null and alternative model, respectively, controls the contribution of the null and alternative covariances, and is the matrix generated from the first three eigenfunctions and eigenvalues of (⁠⁠), with magnitude controlled by ⁠. Note that when ⁠, is the null covariance, and when and ⁠, is the alternative covariance. To show how power changes with deviation from the null model, let when ⁠. Since the bootstrap test is likely to be underpowered due to sparsity of the data, we also simulate data with double the number of subjects or double the observations per subject. Additional subjects were generated using the same set of observed time points, while additional observations were added by uniformly sampling from the non-observed time points for each subject.

Table 2 gives the empirical type I error rates based on 5000 simulated datasets, and Figure 3 shows the power from 1000 datasets, for the bootstrap and direct tests. The bootstrap test is underpowered for the true CD4 dataset due to small sample size, and doubling the number observations per subject resolves this problem.

Table 2

Open in new tab

Estimated type I error rates for the bootstrap and direct tests at the nominal formula and 0.10 levels based on 5000 datasets, for data based on the standard CD4 dataset, dataset with double the number of subjects, and dataset with double the observations per subject. The standard error was 0.003 and 0.004 for formula and formula ⁠, respectively

Bootstrap

Direct

α=0.05

α=0.10

α=0.05

α=0.10

Standard dataset

0.057

0.113

0.046

0.010

Double subjects

0.056

0.108

0.047

0.095

Double observations

0.046

0.101

0.050

0.100

Figure 3

Power for the bootstrap (black) and direct (gray) tests for data based on the standard CD4 dataset (solid), dataset with double the number of subjects (short dash), and dataset with double the observations per subject (long dash). The vertical dashed line indicates the effective power of the tests, where data is simulated directly from the estimated alternative covariance (⁠ formula ⁠). From left to right, the settings for formula are formula ⁠, and formula ⁠.

Open in new tab Download slide

The observed test statistic for the bootstrap test is corresponding to ⁠, while the direct test yields an RLRT statistic of 2.704 corresponding to ⁠. While only the direct test indicates that the quadratic polynomial covariance is not sufficient for the data, Figure 3 shows that the bootstrap test is underpowered, suggesting that a more complex covariance may still be necessary for the data.

8 Concluding Remarks

In this article, we propose a smoothing-based goodness-of-fit test of covariance for functional data. We focus on the specific case of testing a quadratic polynomial covariance induced by a linear random intercept and slope model, as motivated by a dataset of CD4 cell counts used to monitor HIV+ patients. Our proposed method can be used to formally test a linear random (or mixed) effects model against a typical functional data approach, and fills a gap in the testing of longitudinal and functional data methods. The proposed bootstrap test can be applied to functional data with either dense common or irregular sampling design, and performs well in simulation studies. Limitations of the method are (a) slow computational speed, and (b) low power for very small datasets with small deviation from the null.

Acknowledgements

The authors thank an associate editor and two reviewers for their comments that greatly improved this article, and Dr. Ping-Shou Zhong for providing code for the multivariate test. ST Chen and L Xiao's research were supported by Grant OPP1148351 and OPP1114097 from the Bill and Melinda Gates Foundation. AM Staicu's research was supported by NSF grant DMS 1454942 and NIH grants 5P01 CA142538-09 and 2R01MH086633. This work represents the opinions of the researchers and not necessarily that of the granting organizations.

References

Bai

Jiang

Yao

, and

Zheng

(

2009

Corrections to LRT on large-dimensional covariance matrix by RMT

Ann Stat

135

–

141

Google Scholar

OpenURL Placeholder Text

WorldCat

Besse

and

Ramsay

J. O.

(

1986

Principal components analysis of sampled functions

Psychometrika

285

–

311

Google Scholar

Crossref

WorldCat

Bücher

Dette

, and

Wieczorek

(

2011

Testing model assumptions in functional regression models

J Multivariate Anal

102

1472

–

1488

Google Scholar

Crossref

WorldCat

Cai

T. T.

and

Yuan

(

2011

Optimal estimation of the mean function based on discretely sampled functional data: Phase transition

Ann Stat

2330

–

2355

Google Scholar

OpenURL Placeholder Text

WorldCat

Chiou

J. M.

and

Müller

H. G.

(

2007

Diagnostics for functional regression via residual processes

Comput Stat Data Anal

4849

–

4863

Google Scholar

Crossref

WorldCat

Crainiceanu

C. M.

and

Ruppert

(

2004

Likelihood ratio tests in linear mixed models with one variance component

J R Stat Soc

165

–

185

Google Scholar

Crossref

WorldCat

Diggle

Haegerty

Liang

K. Y.

, and

Zeger

(

2002

Analysis of Longitudinal Data

Oxford, UK

Oxford University Press

Fan

and

Zhang

J. T.

(

2000

Two-step estimation of functional linear models with application to longitudinal data

J R Stat Soc Series B

303

–

322

Google Scholar

Crossref

WorldCat

Goldsmith

Crainiceanu

C. M.

Caffo

, and

Reich

(

2011

Penalized functional regression analysis of white-matter tract profiles in multiple sclerosis

Neuroimage

431

–

439

Goldsmith

Greven

, and

Crainiceanu

C. M.

(

2013

Corrected confidence bands for functional data using principal components

Biometrics

–

Goldsmith

Schiepl

Huang

Wrobel

Gellar

Harezlack

, et al. (

2016

refund: Regression with Functional Data

. R package version 0.1-16.

González-Manteiga

and

Crujeiras

R. M.

(

2013

An updated review of goodness-of-fit tests for regression models

Test

361

–

411

Google Scholar

Crossref

WorldCat

Greven

Crainiceanu

C. M.

Küchenhoff

, and

Peters

(

2008

Restricted likelihood ratio testing for zero variance components in linear mixed models

J Comput Graph Stat

870

–

891

Google Scholar

Crossref

WorldCat

Hardle

and

Mammen

(

1993

Comparing nonparametric versus parametric regression fits

Ann Stat

1926

–

1947

Google Scholar

OpenURL Placeholder Text

WorldCat

James

G. M.

Hastie

T. J.

, and

Sugar

C. A.

(

2000

Principal component models for sparse functional data

Biometrika

587

–

602

Google Scholar

Crossref

WorldCat

Kaslow

R. A.

Ostrow

D. G.

Detels

Phair

J. P.

Polk

B. F.

, and

Rinaldo

C. R.

(

1987

The multicenter AIDS cohort study: Rationale, organization, and selected characteristics of the participants

Am J Epidemiol

126

310

–

318

Kong

Staicu

A. M.

, and

Maity

(

2016

Classical testing in functional linear models

J Nonparametr Stat

813

–

838

Laird

N. M.

and

Ware

J. H.

(

1982

Random-effects models for longitudinal data

Biometrics

963

–

974

Ledoit

and

Wolf

(

2002

Some hypothesis tests for the covariance matrix when the dimension is large compared to the sample size

Ann Stat

1081

–

1102

Google Scholar

Crossref

WorldCat

Lindstrom

M. J.

and

Bates

D. M.

(

1988

Newton-Raphson and EM algorithms for linear mixed-effects models for repeated-measures data

J Am Stat Assoc

1014

–

1022

Google Scholar

OpenURL Placeholder Text

WorldCat

McLean

M. W.

Hooker

, and

Ruppert

(

2015

Restricted likelihood ratio tests for linearity in scalar-on-function regression

Stat Comput

997

–

1008

Google Scholar

Crossref

WorldCat

Peng

and

Paul

(

2009

A geometric approach to maximum likelihood estimation of functional principal components from sparse longitudinal data

J Comput Graph Stat

995

–

1015

Google Scholar

Crossref

WorldCat

Pinheiro

Bates

DebRoy

Sarkar

EISPACK

Heisterkamp

et al. (

2017

nlme: Linear and Nonlinear Mixed Effects Models

. R package version 3.1-131.

Ramsay

J. O.

and

Silverman

B. W.

(

2002

Applied Functional Data Analysis

New York

Springer

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Ramsay

J. O.

and

Silverman

B. W.

(

2005

Functional Data Analysis

New York

Springer

Reich

D. S.

Ozturk

Calabresi

P. A.

, and

Mori

(

2010

Automated vs

.conventional tractography in multiple sclerosis: Variability and correlation with disability.

Neuroimage

3047

–

3056

Ruppert

Wand

M. P.

, and

Carroll

R. J.

(

2003

Semiparametric Regression

Cambridge, UK

Cambridge University Press

Schiepl

and

Bolker

(

2016

RLRsim: Exact (Restricted) Likelihood Ratio Tests for Mixed and Additive Models

R package version 3.1-131

Self

S. G.

and

Liang

K. Y.

(

1987

Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions

J Am Stat Assoc

605

–

610

Google Scholar

Crossref

WorldCat

Swihart

B. J.

Goldsmith

, and

Crainiceanu

C. M.

(

2014

Restricted likelihood ratio tests for functional effects in the functional linear model

Technometrics

483

–

493

Google Scholar

Crossref

WorldCat

Taylor

J. M. G.

Cumberland

W. G.

, and

J. P.

(

1994

A stochastic model for analysis of longitudinal AIDS data

J Am Stat Assoc

727

–

736

Google Scholar

Crossref

WorldCat

Wood

(

2003

Thin plate regression splines

J R Stat Soc Series B

–

114

Google Scholar

Crossref

WorldCat

Wood

(

2017

mgcv: Mixed GAM Computation Vehicle with Automatic Smoothness Estimation

. R package version 1.8-23.

C. J. J.

(

1986

Jackknife, bootstrap and other resampling methods in regression analysis

Ann Stat

1261

–

1295

Google Scholar

OpenURL Placeholder Text

WorldCat

Xiao

Checkley

, and

Crainiceanu

C. M.

(

2018

Fast covariance estimation for sparse functional data

Stat Comput

511

–

522

Yao

Müller

H. G.

, and

Wang

J. L.

(

2005

Functional data analysis for sparse longitudinal data

J Am Stat Assoc

100

577

–

590

Google Scholar

Crossref

WorldCat

Zeger

S. L.

and

Diggle

P. J.

(

1994

Semiparametric models for longitudinal data with application to CD4 cell numbers in HIV seroconverters

Biometrics

689

–

699

Zhong

P. S.

Lan

Song

P. X. K.

, and

Tsai

C. L.

(

2017

Tests for covariance structures with high-dimensional repeated measurements

Ann Stat

1185

–

1213

Google Scholar

Crossref

WorldCat

This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

Download all slides

Month:	Total Views:
January 2024	17
February 2024	13
March 2024	7
April 2024	4
May 2024	5
June 2024	6
July 2024	6
August 2024	4
September 2024	6
October 2024	11
November 2024	12
December 2024	8
January 2025	4
February 2025	5
March 2025	26
April 2025	8

Article Contents

A Smoothing-Based Goodness-of-Fit Test of Covariance for Functional Data

Abstract

1 Introduction

2 Statistical Framework

3 Smoothing-based Test

3.1 Null Model

3.2 Alternative Model

3.3 Test Statistic

3.4 Approximate Null Distribution of T_n via a Wild Bootstrap

4 Implementation

5 Extensions

5.1 Smooth Covariance

6 Simulation Study

6.1 Competing Methods

6.1.1 Direct Test

6.1.2 Multivariate Test

6.1.3 Limitations of the Competing Methods

6.2 Simulation Results

7 Applications

7.1 Diffusion Tensor Imaging

7.2 Child Growth Measurements

7.3 Cd4 Count Data

8 Concluding Remarks

Acknowledgements

References

Supplementary data

Citations

Views

Altmetric

Email alerts

Related articles in

Citing articles via

Latest

Most Read

Most Cited

Article Contents

A Smoothing-Based Goodness-of-Fit Test of Covariance for Functional Data

Abstract

1 Introduction

2 Statistical Framework

3 Smoothing-based Test

3.1 Null Model

3.2 Alternative Model

3.3 Test Statistic

3.4 Approximate Null Distribution of Tn via a Wild Bootstrap

4 Implementation

5 Extensions

5.1 Smooth Covariance

6 Simulation Study

6.1 Competing Methods

6.1.1 Direct Test

6.1.2 Multivariate Test

6.1.3 Limitations of the Competing Methods

6.2 Simulation Results

7 Applications

7.1 Diffusion Tensor Imaging

7.2 Child Growth Measurements

7.3 Cd4 Count Data

8 Concluding Remarks

Acknowledgements

References

Supplementary data

Citations

Views

Altmetric

Email alerts

Related articles in

Citing articles via

Latest

Most Read

Most Cited

This Feature Is Available To Subscribers Only

3.4 Approximate Null Distribution of T_n via a Wild Bootstrap