Actigraphy provides a way to objectively measure activity in human subjects. This paper describes a novel family of statistical methods that can be used to analyze this data in a more comprehensive way.
A statistical method for testing differences in activity patterns measured by actigraphy across subgroups using functional data analysis is described. For illustration, this method is used to statistically assess the impact of the apnea-hypopnea index (apnea) and body mass index (BMI) on circadian activity patterns measured using actigraphy in 395 participants from 18 to 80 years old who were referred to the Washington University Sleep Medicine Center for general sleep medicine care. Mathematical descriptions of the methods and results from their application to real data are presented.
Activity patterns were recorded by an Actical device (Philips Respironics Inc.) every minute for at least seven days. Functional linear modeling was used to detect the association between circadian activity patterns and apnea and BMI. Results indicate that participants in the high apnea group have statistically significantly lower activity during the day, and that BMI in our study population does not significantly impact circadian patterns.
Compared with analysis using summary measures (e.g., average activity over 24 hours, total sleep time), Functional Data Analysis (FDA) is a novel statistical framework that more efficiently analyzes information from actigraphy data. FDA has the potential to reposition the focus of actigraphy data from general sleep assessment to rigorous analyses of circadian activity rhythms.
Activity measured by wrist actigraphy has been shown to be a valid marker of entrained
Polysomnography (PSG) sleep phase and is strongly correlated with entrained endogenous
circadian phase.
In this paper we propose a novel statistical framework, Functional Linear Modeling (FLM),
a subset of Functional Data Analysis (FDA), for analyzing actigraphy data to extract and
analyze circadian activity information through direct analysis of raw activity values.
Participants were recruited prospectively from the clinic at the Washington University in St. Louis Sleep Medicine Center. The sleep center is a multidisciplinary clinic at a tertiary medical facility. Clinic patients with a suspected diagnosis of obstructive sleep apnea (OSA), insomnia, or restless legs syndrome (RLS) were invited to participate. Pregnant women, individuals under the age of 18, and patients who reported working an evening or overnight shift were excluded from participation due to known biologically different circadian clocks. Clinical covariates such as BMI, comorbidities, concomitant medications, and presenting sleep complaints were collected. Participants underwent an overnight PSG when clinically indicated. These data were collected in accordance with the standards of the American Academy of Sleep Medicine (AASM) and were reviewed by a board-certified sleep physician. PSG data were scored according to the AASM Manual for the Scoring of Sleep and Associated Events. This ongoing study has been approved by the Washington University School of Medicine Institutional Review Board.
Activity was measured using Actical devices (Philips Respironics Inc.), which were positioned on the nondominant wrist of subjects at the initial sleep center visit and set to measure activity every minute for 7 days. Three hundred and ninety-five patients have been recruited, of whom 305 have apnea and/or BMI measured. This subgroup comes from a larger NIH-funded study currently recruiting a cross section of 750 patients referred to the Washington University Sleep Medicine Center for the purpose of developing and validating functional data analysis methods for actigraphy data (HL092347).
FDA is an emerging field in statistics that extends classical statistical methods for
analyzing sets of numbers (scalars for univariate analyses, and vectors for multivariate
analyses) to analyzing sets of functions.
Functional data analysis (FDA) begins by replacing discrete activity values measured
at each time unit (e.g., minute) by a function to model the data and reduce
variability. The function represents the expected activity value at each time point
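measured. Because actigraphy data are recorded at equidistant time points, a Fourier
expansion model is used to allow flexibility in representing the data as a function,
though any smoothing method could be used. Let $\mathrm{Activity}_k(t_j)$
represent the activity of subject $k$ at time $t_j$, where $k = 1, \ldots, N$ indexes subjects and $j = 1, \ldots, 1440$ indexes the minutes of the day.
We convert the raw actigraphy data to a functional form using a basis function
expansion for $\mathrm{Activity}_k(t)$:

$$\mathrm{Activity}_k(t) = \sum_{i=1}^{n} c_{ik}\,\phi_i(t) \qquad (1)$$

where $\phi_i(t)$ are the basis functions and $c_{ik}$ are the expansion coefficients for subject $k$.
Experimental results (unpublished) show that most basis functions work equally well, and we have found that a Fourier expansion with n = 9 basis functions captures the major trend of the activity pattern with reduced noise. Let

$$\phi_1(t) = 1, \qquad \phi_{2r}(t) = \sin\!\left(\frac{2\pi r t}{T}\right), \qquad \phi_{2r+1}(t) = \cos\!\left(\frac{2\pi r t}{T}\right), \qquad r = 1, \ldots, 4,$$

where T is the period, in our case T = 1440 (number of minutes in 24 hours). Equation 1 becomes

$$\mathrm{Activity}_k(t) = c_{1k} + \sum_{r=1}^{4}\left[c_{2r,k}\sin\!\left(\frac{2\pi r t}{T}\right) + c_{2r+1,k}\cos\!\left(\frac{2\pi r t}{T}\right)\right]$$

We will use this functional representation for all analyses in this paper.
Smooth coefficients of the expansion $c_{ik}$ are estimated by minimizing the least squares criterion

$$\mathrm{SSE}_k = \sum_{j=1}^{1440}\left[\mathrm{Activity}_k(t_j) - \sum_{i=1}^{n} c_{ik}\,\phi_i(t_j)\right]^2$$

where $\mathrm{Activity}_k(t_j)$ is the observed activity of subject $k$ at minute $t_j$.
In matrix terms, this criterion becomes:

$$\mathrm{SSE}_k = (\mathbf{y}_k - \Phi\mathbf{c}_k)^{\top}(\mathbf{y}_k - \Phi\mathbf{c}_k)$$

where $\mathbf{y}_k$ is the vector of the 1440 observed activity values for subject $k$, $\mathbf{c}_k$ is the vector of the 9 coefficients, and Φ is a 1440 × 9 matrix with columns for basis functions and rows for basis value at each minute.
Taking the derivative of the criterion with respect to $\mathbf{c}_k$ and setting it to zero yields

$$2\Phi^{\top}\Phi\,\mathbf{c}_k - 2\Phi^{\top}\mathbf{y}_k = 0$$

Then, the vector $\hat{\mathbf{c}}_k = (\Phi^{\top}\Phi)^{-1}\Phi^{\top}\mathbf{y}_k$ minimizes the criterion.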
The raw data do not need to be normalized since all analyses are done on the functional form of the data.
To avoid introducing variation between weekday and weekend activity patterns, only data from midnight Monday to midnight Friday were used in this paper, although this simplification is not required for analysis. The five weekdays of actigraphy data were averaged into a single 24-hour profile and a smooth Fourier expansion function was fitted using a 24-hour periodicity and 9 basis functions. This produced a single 24-hour circadian activity pattern for each subject that can be used to estimate the patient's activity level at any time point throughout the day. We are developing and preparing to publish functional linear mixed models which will analyze every day's activity data to incorporate day effects, weekday/weekend effects, and pre/post treatment effects, which will provide more insight into circadian rhythm patterns and within-subject variability.
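The smoothing step described above can be sketched in a few lines. This is a minimal illustration only, not the R code used in the study: it assumes one record of 1440 per-minute counts per weekday, uses simulated data, and the function names (`fourier_basis`, `fit_fourier`, `evaluate`, `average_days`) are hypothetical. It averages the weekday records into one 24-hour profile and fits a 9-term Fourier expansion by ordinary least squares.

```python
import math
import random

T = 1440       # minutes in 24 hours (the period used in the paper)
N_BASIS = 9    # constant term plus 4 sine/cosine pairs

def fourier_basis(t, n_basis=N_BASIS, period=T):
    """Values of the Fourier basis functions at time t (in minutes)."""
    row = [1.0]
    for r in range(1, (n_basis - 1) // 2 + 1):
        w = 2.0 * math.pi * r * t / period
        row.extend([math.sin(w), math.cos(w)])
    return row

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def fit_fourier(y):
    """Least-squares coefficients: solve (Phi' Phi) c = Phi' y."""
    phi = [fourier_basis(t) for t in range(len(y))]
    n = len(phi[0])
    gram = [[sum(p[i] * p[j] for p in phi) for j in range(n)] for i in range(n)]
    rhs = [sum(p[i] * yt for p, yt in zip(phi, y)) for i in range(n)]
    return solve(gram, rhs)

def evaluate(coef, t):
    """Fitted (smoothed) activity at time t."""
    return sum(c * b for c, b in zip(coef, fourier_basis(t)))

def average_days(days):
    """Average several 1440-minute records into a single 24-hour profile."""
    return [sum(d[t] for d in days) / len(days) for t in range(T)]

# Simulated example (assumed data): five noisy weekday records for one subject.
rng = random.Random(0)
days = [[200 + 150 * math.sin(2 * math.pi * t / T) + rng.gauss(0, 40)
         for t in range(T)] for _ in range(5)]
coef = fit_fourier(average_days(days))
for minute in (0, 360, 720, 1080):
    print(minute, round(evaluate(coef, minute), 1))
```

Because the fitted curve is a function of time, `evaluate` can be called at any time point, not only at the recorded minutes.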
This data smoothing method is illustrated in Figure
Reducing actigraphy data to a summary statistic can mask differences across groups. For example, if one group of patients has high activity in the morning and low activity in the afternoon, and another group has a reversed pattern with the same magnitude of activity, low activity in the morning and high activity in the afternoon, their average activity may be similar, and a significant difference in circadian activity patterns would be missed. FLM avoids masking by extending the linear regression model to the analysis of smooth functions (i.e. circadian activity patterns), and differences such as described in this example become apparent.
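The masking effect can be shown with a small numerical illustration. The two profiles below are hypothetical (they are not study data): one group is more active in the first half of the day, the other in the second half, with the same magnitude. Their 24-hour averages agree, while the pointwise difference between the curves is large.

```python
import math

T = 1440  # minutes in 24 hours

# Hypothetical group profiles with reversed patterns of equal magnitude.
profile_a = [200 + 150 * math.sin(2 * math.pi * t / T) for t in range(T)]
profile_b = [200 - 150 * math.sin(2 * math.pi * t / T) for t in range(T)]

# Summary statistic: average activity over 24 hours.
mean_a = sum(profile_a) / T
mean_b = sum(profile_b) / T

# Functional view: largest pointwise gap between the two curves.
max_gap = max(abs(a - b) for a, b in zip(profile_a, profile_b))

print("difference in 24 h averages:", round(abs(mean_a - mean_b), 6))
print("largest pointwise difference:", round(max_gap, 1))
```

A test on the two averages would see no difference here, whereas a comparison of the curves over time would.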
The conceptual change going from classical linear regression to FLM is that the model
regression coefficients, (e.g.
Using this subset of subjects, functional smoothing and linear modeling are illustrated in this section. In the following section the methods are applied to the full dataset.
To test whether high and low apnea patients have different activity levels, standard approaches would reduce each subject's data to an average activity level, and a classical statistical method such as linear regression would test if these values are the same or different. For example, a linear regression model to test if there are differences in average activity between the high apnea (average activity = 78, 76, 80 and 76) and low apnea (average activity = 370, 397, 482 and 421) groups is defined as
where k = 1,2,...,8 are the subjects in Figure
Figure
As in the linear regression model described above, we are interested in estimating
regression coefficients that will produce the group-specific mean circadian activity
patterns, and test if these mean circadian activity patterns are different across
groups. This model, for apnea, is defined as [
where the (t) notation indicates functions over the circadian period for activity
(fitted by the Fourier expansion to the actigraphy data for each subject k)
Activity_{k}(
Equation 8 can be formulated as a matrix analysis problem as described above using a
Nx2 design matrix
and the smoothed functional data represented in matrix form by
where each row represents a subject's fitted activity values. Finally, the functional
error matrix is defined as
The coefficients
where
After we estimate
where
Because of the nature of functional statistics, it is difficult to attempt to derive
a theoretical null distribution for any given test statistic. Instead, we applied a
nonparametric permutation test methodology. If there is no relationship between
activity pattern and apnea levels, it should make no difference if we randomly
rearrange the apnea group assignment. The advantage of this is that we no longer need
to rely on distributional assumptions while the disadvantage is that we cannot test
for the significance of an individual covariate among many. The
Plot (b) in Figure
The statistical and computational details for fitting FLM models are well described
elsewhere and are outside the scope of this paper. The reader interested in these
details is referred to Ramsay and Silverman.
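The permutation logic described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure implemented in the study: the test statistic is taken to be the classical two-group F statistic computed at each time point, with its maximum over time used as the global statistic, and the data are simulated. The function names `pointwise_f` and `permutation_test` are hypothetical.

```python
import math
import random

def pointwise_f(curves, labels):
    """Two-group F statistic computed separately at each time point."""
    g0 = [c for c, g in zip(curves, labels) if g == 0]
    g1 = [c for c, g in zip(curves, labels) if g == 1]
    n0, n1 = len(g0), len(g1)
    f_values = []
    for t in range(len(curves[0])):
        m0 = sum(c[t] for c in g0) / n0
        m1 = sum(c[t] for c in g1) / n1
        grand = (n0 * m0 + n1 * m1) / (n0 + n1)
        between = n0 * (m0 - grand) ** 2 + n1 * (m1 - grand) ** 2  # 1 df
        within = (sum((c[t] - m0) ** 2 for c in g0)
                  + sum((c[t] - m1) ** 2 for c in g1)) / (n0 + n1 - 2)
        f_values.append(between / within if within > 0 else math.inf)
    return f_values

def permutation_test(curves, labels, n_perm=200, seed=0):
    """Permutation p-value for the maximum pointwise F statistic."""
    rng = random.Random(seed)
    observed = max(pointwise_f(curves, labels))
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # randomly rearrange group assignment
        if max(pointwise_f(curves, shuffled)) >= observed:
            exceed += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)

# Simulated example (assumed data): 10 subjects per group, one value every
# 30 minutes; group 1 has a lower daytime amplitude than group 0.
rng = random.Random(1)
times = range(0, 1440, 30)
curves, labels = [], []
for group, amplitude in ((0, 150.0), (1, 90.0)):
    for _ in range(10):
        curves.append([200 + amplitude * math.sin(2 * math.pi * t / 1440)
                       + rng.gauss(0, 20) for t in times])
        labels.append(group)

f_max, p_value = permutation_test(curves, labels)
print("max F:", round(f_max, 1), "permutation p-value:", round(p_value, 4))
```

Plotting the values returned by `pointwise_f` against time, together with a permutation-based critical value, would show when during the day the groups differ.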
This illustration was meant as an introduction to the methodology only, and not as an indicator of a clinical conclusion. In the following section, these methods are applied to the entire 395-subject dataset to show how apnea and BMI clinically impact circadian activity patterns.
Table
Demographic information and sample characteristics
Variable                  N (%)          Mean ± std

                          196 (49.87%)

African American          134 (35.08%)
Caucasian                 237 (62.04%)

Snoring                   279 (70.63%)
Gasping                   93 (23.54%)
Morning headache          67 (16.96%)
RLS symptoms              26 (6.58%)
PLMS                      3 (0.76%)
Witnessed apneas          146 (36.96%)
Insomnia                  42 (10.63%)
Excessive day sleepiness  91 (23.04%)
Nonrestorative sleep      9 (2.28%)

Class 4                   145 (41.55%)
Class 3                   136 (38.97%)
Class 2                   53 (15.19%)
Class 1                   15 (4.30%)

OSA                       292 (73.92%)
RLS                       5 (1.27%)
Insomnia                  8 (2.03%)
Hypersomnia               20 (5.06%)

                          241 (60.86%)

                                         34.66 ± 8.88

                                         47.9 ± 14.8

                                         22.11 ± 28.11
Raw actigraphy data were read into the R statistical software for analysis using the
FDA package and software written by our group to apply FLM methods. Two hundred and
eighty nine patients have actigraphy data. Each patient's data from midnight Monday
through midnight Friday were averaged and fit by a 9 basis Fourier expansion and their
circadian activity patterns plotted in Figure
We apply FLM to measure the impact of apnea and BMI on subject circadian activity patterns and test the null hypothesis that circadian activity patterns are the same regardless of apnea and BMI values. The alternative hypothesis is that apnea, BMI, and/or their interaction affect activity behavior in a statistically significant way. In addition to the tests of hypotheses, FLM provides a graphical view of the subgroup circadian activity patterns that can aid interpretation of behavioral differences.
To fit these models, each subject is categorized according to their apnea and BMI values by:
Of the 295 subjects, 235 had data on apnea and actigraphy, 277 had BMI and actigraphy, and 232 had apnea, BMI, and actigraphy. The following analyses are based on these subsets.
We fit the following three functional linear models as defined in Table
Three Functional Linear Models
Model 1  apnea Main Effect Only  Activity 
Model 2  BMI Main Effect Only  Activity 
Model 3  apnea+BMI+interaction  Activity 
The impact of apnea as a main effect on circadian activity patterns was tested with
Model 1, Table
Figure
Next, the impact of BMI as a main effect on circadian activity patterns was measured
using Model 2, Table
We emphasize that the population of participants in this study had a higher overall BMI compared to the general population which may explain why the expected difference in circadian activity patterns across these groups was not observed.
Model 3, Table
Sample size for apnea, BMI model
            apnea Low  apnea High  Total
            61         94          155
            55         22          77
            116        116         232
This interaction model has four functional coefficients
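Four group circadian activity result
apnea  BMI   Group Mean
Low    Low   mean + apnea coefficient + BMI coefficient + interaction coefficient
Low    High  mean + apnea coefficient - BMI coefficient - interaction coefficient
High   Low   mean - apnea coefficient + BMI coefficient - interaction coefficient
High   High  mean - apnea coefficient - BMI coefficient + interaction coefficient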
When a subject's apnea or BMI is low, the functional coefficient for that factor is
added to the mean activity pattern. When a subject's apnea or BMI is high, the
functional coefficient for that factor is subtracted from the mean activity pattern.
The interaction coefficient is added when apnea and BMI are concordant (high/high or
low/low) and subtracted when apnea and BMI are discordant (low/high, high/low). Figure
It is an established statistical practice in a linear regression model to test the main effects of two covariates and the effect of the interaction of the two covariates. We extended this method to the functional linear model. The comparisons of all 4 groups in this section are actually the evaluation of the combination of the main and interaction effects, which should be consistent with a two-way ANOVA.
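The way the four group mean curves are assembled from the functional coefficients can be sketched as follows. The coding (+1 for low, -1 for high, with the interaction taken as the product of the two codes) follows the verbal description above; the coefficient values and the name `group_mean` are hypothetical and chosen only for illustration.

```python
def group_mean(mean, b_apnea, b_bmi, b_inter, apnea, bmi):
    """Group mean curve under +1 (low) / -1 (high) effect coding."""
    x_apnea = 1 if apnea == "low" else -1
    x_bmi = 1 if bmi == "low" else -1
    x_inter = x_apnea * x_bmi  # +1 when concordant, -1 when discordant
    return [m + x_apnea * a + x_bmi * b + x_inter * i
            for m, a, b, i in zip(mean, b_apnea, b_bmi, b_inter)]

# Toy coefficient curves evaluated at four time points (assumed values).
mean = [100.0, 250.0, 300.0, 120.0]
b_apnea = [5.0, 30.0, 40.0, 10.0]
b_bmi = [1.0, 4.0, 6.0, 2.0]
b_inter = [0.5, 2.0, 3.0, 1.0]

for apnea in ("low", "high"):
    for bmi in ("low", "high"):
        curve = group_mean(mean, b_apnea, b_bmi, b_inter, apnea, bmi)
        print(apnea, bmi, curve)
```

With this coding, the four group curves average back to the overall mean curve at every time point, which is the functional analogue of the balanced two-way ANOVA parameterization.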
As noted above, BMI showed little impact on circadian activity patterns which does
not correspond to general clinical belief. This is most likely explained by the fact
that our subject population has high BMI relative to the general population, so the
distinction between obese and non-obese was less pronounced. In this section, we fit a
functional linear model treating BMI as a continuous variable. BMI ranges from 17 to
67 in this dataset. Figure
Traditionally, actigraphy data are transformed into summary numbers, such as total sleep time, sleep efficiency, wake after sleep onset, and other measurements. These transformations allow data analysts to test hypotheses using simple classical statistical methods. However, a large amount of information can be lost and problems of masking circadian patterns may arise.
The merit of functional linear modeling lies in determining when along the 24-hour
scale groups differ. Results from parameter tests in a cosinor approach would provide
information as to differences in harmonic content between groups. Another advantage of the
functional linear modeling approach is exemplified in Figure
In this paper we have presented a novel approach for analyzing the full actigraphy data,
which we believe avoids significant information loss and masking effects. Representing
actigraphy data as smooth continuous functions, and applying Functional Linear Modeling
methods allowed us to directly compare and test differences of circadian activity patterns
across apnea and BMI subgroups. Other Functional Data Analysis methods using principal
components analysis ([
The authors declare that they have no competing interests.
JW and HX carried out statistical analysis, contributed to development of methodology and wrote sections of the manuscript. AL provided clinical input and oversight. ED developed the clinical database, contributed to statistical programming and reviewed the manuscript. JD developed the theoretical mathematical basis for the analysis and wrote sections of the manuscript. JM and CT acted as clinical coordinators, entered the data, wrote sections and critically reviewed the manuscript. TL provided programming and mathematical support and critically reviewed the manuscript. SD is co-PI on the project, oversaw all clinical aspects of the project, provided clinical theoretical perspectives and wrote sections of the manuscript. WS was the PI on the project, developed statistical methodology, oversaw the work of statisticians and programmers, wrote sections of the manuscript and critically reviewed all its contents. All authors have read and approved the final manuscript.
We are particularly grateful to the editor and reviewers who have greatly increased our knowledge of existing work in circadian rhythm data analysis. This work was supported by R01 HL092347 "New Data Analysis Methods for Actigraphy in Sleep Medicine" (Shannon, PI), the Washington University Dept. of Medicine's Biostatistics Center (Shannon, Director), and the Dept. of Neurology Sleep Center (Duntley, Director).