Fall Risk Classification in Community-Dwelling Older Adults Using a Smart Wrist-Worn Device and the Resident Assessment Instrument-Home Care: Prospective Observational Study

Background Little is known about whether off-the-shelf wearable sensor data can contribute to fall risk classification or complement clinical assessment tools such as the Resident Assessment Instrument-Home Care (RAI-HC). Objective This study aimed to (1) investigate the similarities and differences in physical activity (PA), heart rate, and night sleep in a sample of community-dwelling older adults with varying fall histories using a smart wrist-worn device and (2) create and evaluate fall risk classification models based on (i) wearable data, (ii) the RAI-HC, and (iii) the combination of wearable and RAI-HC data. Methods A prospective, observational study was conducted among 3 faller groups (G0, G1, G2+) based on the number of previous falls (0, 1, ≥2 falls) in a sample of older community-dwelling adults. Each participant was requested to wear a smart wristband for 7 consecutive days while carrying out day-to-day activities in their normal lives. The wearable and RAI-HC assessment data were analyzed and utilized to create fall risk classification models, with 3 supervised machine learning algorithms: logistic regression, decision tree, and random forest (RF). Results Of 40 participants aged 65 to 93 years, 16 (40%) had no previous falls, whereas 8 (20%) and 16 (40%) had experienced 1 and multiple (≥2) falls, respectively. Level of PA as measured by average daily steps was significantly different between groups (P=.04). In the 3 faller group classification, RF achieved the best accuracy of 83.8% using both wearable and RAI-HC data, which is 13.5% higher than that of using the RAI-HC data only and 18.9% higher than that of using wearable data exclusively. In discriminating between {G0+G1} and G2+, RF achieved the best area under the receiver operating characteristic curve of 0.894 (overall accuracy of 89.2%) based on wearable and RAI-HC data. 
Discrimination between G0 and {G1+G2+} did not result in better classification performance than that between {G0+G1} and G2+. Conclusions Both wearable data and the RAI-HC assessment can contribute to fall risk classification. All the classification models revealed that the RAI-HC outperforms wearable data, and the best performance was achieved with the combination of the 2 datasets. Future studies in fall risk assessment should consider using wearable technologies to supplement resident assessment instruments.


Introduction
Background
By definition, a fall refers to "an event which results in a person coming to rest inadvertently on the ground or floor or other lower level" [1]. The high prevalence and negative impact of falls in older people have become a serious public health issue that affects the independence of older adults, distress in caregivers, and health service utilization [2]. Due to the multifactorial nature of risk factors for falls, current fall prevention strategies are comprehensive and multifaceted [3,4]. An important goal for geriatrics and public health agencies is to accurately identify fall risks and mitigate physical and psychological harm caused by falls. In fact, falls have been used as indicators of the quality of care in home care settings [5,6].
Heart rate (HR) and heart rate variability (HRV) are hypothesized biomarkers of frailty, which implies a growing susceptibility to stressors and functional decline [7,8]. These two parameters mirror the adaptability of the heart to stressors. The study by Ogliari et al (2015) [7] examined whether HR and HRV are correlated with functional status in the aging population. Participants with the highest resting HR had an increased risk of decline in performing basic activities on the Activities of Daily Living (ADL) scale and Instrumental Activities of Daily Living (IADL) tasks, with a nearly 80% and a 35% increased risk, respectively [7]. Participants with the lowest HRV had approximately a 25% increased risk of decline in performing the ADL and IADL tasks [7]. The results showed that a higher resting HR and a lower HRV in the target population were associated with poorer functional performance in daily life, as well as a higher risk of functional decline [7].
Frail older people are exposed to a greater risk of serious health problems, including falls, disability, hospitalization, and mortality [9]. Functional decline and a higher level of frailty caused by muscular atrophy escalate the risk of falls in the older population [8,10,11]. The occurrence of falls increases with frailty level [4,11]. Frailty and HRV are not only indicators of declining health [7,8,10] but have also served as independent predictors of incident falls in several studies [10,11].
Various studies have shown that loss of sleep is implicated in a decline in the sense of balance and is associated with a number of cognitive impairments such as poor concentration, memory loss, slow reaction, and impaired problem solving and cognition [12-14]. It has been suggested that insufficient sleep may increase the risk of falls [12-16]. Short sleep duration, which accounts for habitual night sleep difficulties, is significantly associated with falls [12-16].
Evidence-based fall risk assessment can lead to proper interventions for people who are at risk for falls. The literature [17] identifies 3 main criteria for categorizing subjects into faller (high risk) and nonfaller (low risk) groups: (1) previous history of falls, (2) prediction of future falls, and (3) clinical assessments. Several studies have incorporated a variety of independent predictors into prediction models based on clinical tests. For example, the Berg Balance Test [18], clinical- and impairment-based tests [19], neuromuscular or cognitive tests [20], the blood pressure change on upright tilting [21], depressive symptoms [22], sleep problems or urinary incontinence [16], and frailty [10,11] have been utilized to predict falls in the aging population. These clinical assessments often use assessment scores to categorize older adults into a binary outcome, that is, fallers or nonfallers [23]. However, this type of assessment oversimplifies the risk of falling in older people, which is more accurately described by continuous, fuzzy boundaries between multiple risk categories than by a hard boundary between only two groups [23].
Recent technological advances have incorporated wearable sensor-based systems into the protocols of fall risk assessment [17,23]. A wearable sensor system can continuously monitor body movement during day-to-day activities, carried out naturally in real-life environments [17,23]. In a review of fall risk assessment in older adults with sensor-based systems, Howcroft et al (2013) [17] evaluated inertial sensors, sensor location, assessed activity, variables, and prediction models of fall risk assessment. The review revealed that variables measured by sensors have the potential to identify individuals who are at risk of falling and to forecast the time-to-incident [17]. Marschollek et al (2011) [23] compared the predictive performance of conventional fall risk assessment and sensor-based assessment in older adults. The results demonstrated that an accelerometer-based fall risk model had almost the same performance as a conventional assessment model [23]. Because the risk factors for falls are multifactorial, sensor-based prediction models may add important information to conventional assessments and can be performed within real-life environments at low cost [17,23].
The interRAI suite of assessment instruments [24,25] is designed to provide standardized clinical data to support care planning in a variety of clinical domains. For example, fall assessments are used to guide care and service planning in a wide range of settings, from independent residences through nursing homes and palliative care [24,26]. The Resident Assessment Instrument for Home Care (RAI-HC) is a baseline geriatric assessment to evaluate older adults who utilize home care services by assessing their needs and ability levels [5,27]. With a variety of assessment information, the RAI-HC system is composed of two key components: the Minimum Data Set-Home Care, which is the basal portion of the RAI-HC, and the Clinical Assessment Protocols [27]. In addition, various clinical scales and indices within each interRAI instrument can also be used to evaluate each client's current health conditions (Scales: status and outcome measures). For instance, the measurement of ADL, cognition, communication, pain, behavior, and mood utilizes standardized scoring schema to generate summary indicators [26]. The interRAI assessment system is not only a suite of comprehensive and standardized assessment tools used in different care settings but has also been utilized in several fall-related studies [18,27-44]. For example, Muir et al (2008) [18] conducted a prospective cohort study using the Berg Balance Scale to examine the predictive effectiveness for any fall (≥1 fall), recurrent falls (≥2 falls), and injury-related falls based on the interRAI Community Health Assessment (CHA) [18]. The CHA and RAI-HC assessments have been widely used in studies investigating the risk factors for falls [18,27,32,33], fear of falling [28-31], and the comparative analyses of nonfallers versus fallers and of nonfallers and one-time fallers versus recurrent fallers [18,27,32,33].

Objectives
To our knowledge, no prior research has combined off-the-shelf wearable sensor data with the RAI-HC assessment to examine the characteristics of different faller groups in older adults living in the community or to build classification models for fall risk assessment using these two data sources. This study aimed to (1) investigate the similarities and differences in physical activity (PA), HR, and night sleep patterns, which are risk factors associated with falls [7,12-14,45,46], among 3 independent older adult faller groups in community-based settings, with continuous measurements from a smart wrist-worn device and (2) create and evaluate fall risk classification models based on (i) wearable data (Wearable), (ii) the RAI-HC, and (iii) the combination of wearable and RAI-HC data (Wearable + RAI-HC). The number of previous falls was used as a proxy for fall risk throughout this study [18,27,32,33,47,48].

Study Design
Using a smart wearable device, a prospective, observational study was conducted to investigate the similarities and differences among 3 independent faller groups in a sample of older adults living in community-based settings, with continuous measurements of PA, HR, and night sleep. The groups were nonfallers (G0, people who had 0 falls in the last 90 days), single fallers (G1, people who had 1 fall in the last 90 days), and recurrent fallers (G2+, people who had ≥2 falls in the last 90 days). The nonfaller, single faller, or recurrent faller status was defined within 90 days to be consistent with the standard reassessment interval of the RAI-HC [49].
Each participant was requested to wear the Xiaomi Mi Band Pulse 1S (hereinafter referred to as the Mi Band) on their wrist for 7 consecutive days while carrying out day-to-day activities in their normal lives. The Mi Band is a wearable activity tracker that monitors movement, sleep quality, and HR. It is a low-cost band, weighing 5.5 g, and comes with a power-efficient accelerometer and a photoelectric HR sensor [50]. The manufacturer is Xiaomi Corporation, a Chinese electronics company headquartered in Beijing, China.
The battery capacity of the Mi Band is 45 mAh [50], with approximately 30 days of standby time. Before data collection, we tested the battery life under normal wearing conditions (ie, wearing the Mi Band while carrying out day-to-day activities naturally in real-life environments), and it lasted more than 15 days. Before collecting data from each participant, the battery was fully charged. To ensure that the battery did not run out during data collection, participants were given instructions and a demonstration on when and how to recharge the battery by themselves before data collection commenced. A printed copy of the instructions was given to each participant as part of the information kit for the data collection period.
A Moto E mobile phone was paired with each Mi Band wirelessly via Bluetooth to collect data, synchronize, and provide health metrics to each individual. A total of two companion apps, Mi Fit and Mi Band Tools, were installed on each mobile phone to facilitate data collection. The wearable and RAI-HC data were further analyzed to create fall risk classification models and evaluate their classification performance.

Participant Recruitment
A sample of community-dwelling older people, who were active clients of the Waterloo Wellington Community Care Access Centre (WW CCAC) and were assessed with the RAI-HC instrument within a 1-year time window, was recruited in the Kitchener, Waterloo, Cambridge, and Guelph areas in Ontario, Canada between August and December 2016.
The inclusion criteria were that participants must have been aged ≥65 years, living independently with or without family members at home or in community-based settings (eg, retirement home), and able to walk without any assistive device. Individuals who had been diagnosed with end-stage disease or had been taking benzodiazepines, antidepressants, cardiac medications, narcotics, or anticonvulsants were excluded from participating in this study.
Informed written consent was obtained from all participants. This study was granted research ethics clearance by a University of Waterloo Research Ethics Committee. The study was also approved by the institutional review board at WW CCAC.

Number of Previous Falls
To assess the fall frequency, participants responded to the following questions upon enrollment and at the end of the wearable data collection phase: (1) "Have you fallen in the last 90 days?" (2) "How many times have you fallen in the last 90 days?" As the standard reassessment interval of the RAI-HC is 90 days [49], we used this time window for the measurement of falls. Participants were categorized into G0, G1, or G2+ based on their self-reported number of falls at the end of the wearable data collection phase.
There was a time gap between the RAI-HC assessment and wearable data collection (mean gap 107.6 days, SD 18.1 days; range -67.5 to 431 days). Some participants had new falls since their last RAI-HC assessments, which resulted in discrepancies between the self-reported fall frequency at wearable data collection and the corresponding assessment on the RAI-HC system. To be consistent, the self-reported fall frequency at the end of the wearable data collection phase was used when analyzing wearable data only. The fall frequency on the RAI-HC assessment was used for model building based on the RAI-HC data only as well as on Wearable + RAI-HC data. Where participants self-reported fewer falls than had been reported by their primary caregivers, the higher number of falls was used in this study.

Wearable Data
Wearable sensor data collected from the Mi Band included continuous monitoring of PA, HR, and night sleep. PA and night sleep data were collected every minute, whereas HR was monitored every 2 min. By default, the Mi Band and Mi Fit app present no built-in function to extract data. A third-party script allowed data extraction via Android backup [51]. Initial wearable data were aggregated as daily averages for the analyses in this study. A list of individual variables derived from the Mi Band is presented in Table 1.
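The aggregation step described above can be illustrated with a minimal Python sketch. It is not the extraction script cited in [51] or the authors' analysis code (which used R); the record layout (a date string paired with a per-minute step count) and the function names are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean


def daily_totals(records):
    """Sum minute-level step counts into one total per calendar day.

    `records` is a hypothetical list of (date_string, steps_in_that_minute).
    """
    totals = defaultdict(int)
    for day, steps in records:
        totals[day] += steps
    return dict(totals)


def average_daily_steps(records):
    """Average the per-day totals over all days that were worn."""
    return mean(daily_totals(records).values())


# Toy data: two recorded minutes on day 1, one recorded minute on day 2.
records = [("2016-09-01", 30), ("2016-09-01", 50), ("2016-09-02", 100)]
totals = daily_totals(records)
avg_steps = average_daily_steps(records)
```

A rate-type signal such as HR would presumably be averaged within each day rather than summed; the same grouping logic applies with `mean` in place of the running total.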

Resident Assessment Instrument-Home Care Data
All participants who gave informed written consent contributed 1 assessment each, with the latest one being selected. In this study, we used 210 variables in the RAI-HC data for analyses, including demographic information and assessment information across all the screening domains (see Multimedia Appendix 1).

Statistical Data Analysis
Data analyses were performed using R (version 3.4.0), a free software environment for statistical computing from the R Foundation for Statistical Computing. Of the 38 variables and 40 cases in the wearable data, 55.3% and 65%, respectively, had at least 1 missing value at the 1-min (PA and sleep) or 2-min (HR) resolution, and 6.4% of the 1520 values were missing. Of the 210 variables and 40 cases in the RAI-HC dataset, 19.8% and 100% had at least 1 missing value, respectively. Of the total of 8400 values corresponding to all combinations of the 210 variables and 40 cases, 16.3% were missing. The missing values in the RAI-HC dataset were replaced by referring to previous assessments. The missing values in the wearable data were imputed using maximum likelihood estimates obtained with the expectation-maximization algorithm (eg, [52]).
Descriptive statistics and simple statistical analyses were conducted to examine the similarities and differences in the wearable data collected from the Mi Band for all participants. All wearable parameters (continuous variables) extracted from the Mi Band were tested for normality by using the Shapiro-Wilk test. The one-way analysis of variance (ANOVA) and the Kruskal-Wallis H test were conducted to compare the means and medians of the 3 independent groups (G0, G1, and G2+) for normally distributed and skewed data, respectively. A two-way repeated measures ANOVA test was performed to examine the differences between groups with repeated measurements of PA, HR, and night sleep and, hence, to evaluate whether there was an interaction between the 7 days of measurement and the groups. In all statistical analyses, P values ≤.05 were considered significant. Where a main effect was significant across groups, pairwise comparisons between groups were investigated with Bonferroni correction.
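The three levels of missingness reported above (by variable, by case, and by value) can be computed as in the following minimal Python sketch. It is illustrative only: the original analysis was done in R, and the representation of the data as a list of rows with `None` marking a missing value is an assumption.

```python
def missingness_summary(rows):
    """Return the share of variables, cases, and values with missing data.

    `rows` is a list of cases; each case is a list with one entry per
    variable, and None marks a missing value (hypothetical layout).
    """
    n_cases = len(rows)
    n_vars = len(rows[0])
    cases_with_missing = sum(any(v is None for v in row) for row in rows)
    vars_with_missing = sum(
        any(row[j] is None for row in rows) for j in range(n_vars)
    )
    missing_values = sum(v is None for row in rows for v in row)
    return {
        "variables": vars_with_missing / n_vars,
        "cases": cases_with_missing / n_cases,
        "values": missing_values / (n_cases * n_vars),
    }


# Toy data: 2 cases x 3 variables with a single missing value.
toy = [[1.0, None, 3.0], [4.0, 5.0, 6.0]]
summary = missingness_summary(toy)
```

The denominators follow the same arithmetic as the text: 38 variables by 40 cases gives 1520 wearable values, and 210 variables by 40 cases gives 8400 RAI-HC values.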

Fall Risk Classification
To build the classification models and evaluate the classification performance of several models in classifying fall risks, a 2-step approach was employed. First, the ordinal attribute of falls (0, 1, and ≥2) within the last 90 days was used as the outcome variable, representing 3 faller groups (G0, G1, and G2+, respectively), for building proportional odds models (POM). Second, the 3-class fall risk was dichotomized in two different ways: (1) grouping {G1+G2+} and comparing with G0 and (2) grouping {G0+G1} and comparing with G2+. A total of 3 supervised machine learning algorithms were utilized: logistic regression, decision tree (DT), and random forest (RF).
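The outcome coding in this 2-step approach can be written down compactly. The following Python sketch maps a 90-day fall count to the 3 faller groups and then to the two binary outcomes; the function names are hypothetical and chosen for illustration.

```python
def faller_group(n_falls):
    """Map the number of falls in the last 90 days to G0, G1, or G2+."""
    if n_falls < 0:
        raise ValueError("number of falls cannot be negative")
    if n_falls == 0:
        return "G0"
    if n_falls == 1:
        return "G1"
    return "G2+"


def is_any_faller(group):
    """Dichotomization (1): G0 versus {G1+G2+}."""
    return group in ("G1", "G2+")


def is_recurrent_faller(group):
    """Dichotomization (2): {G0+G1} versus G2+."""
    return group == "G2+"


groups = [faller_group(n) for n in (0, 1, 2, 5)]
```

A single faller (G1) is therefore the only group whose binary label changes between the two dichotomizations, which is why comparing them indicates whether G1 behaves more like G0 or more like G2+.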
Given the large number of features in both datasets, there was a good chance that many of them were collinear or redundant. A multicollinearity test was conducted, and collinear variables with a high variance inflation factor (≥5) were omitted from further analyses [53]. To identify discriminative independent variables contributing to fall frequency and to create accurate classification models, the recursive feature elimination algorithm available in the caret R package was employed to rank-order each predictor's importance to classification. As both the wearable and RAI-HC datasets had many variables and relatively few cases, the objective of this feature selection process was to limit the best feature subset for each final classification model to no more than 10% of the sample size.
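As an illustration of the collinearity screen, the variance inflation factor (VIF) of each predictor equals the corresponding diagonal element of the inverse of the predictors' correlation matrix. The Python sketch below uses that identity with the standard library only. It is not the authors' R code, the helper names are hypothetical, and a single-pass filter is shown for brevity, whereas in practice predictors are often removed one at a time with the VIFs recomputed after each removal.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """VIF per predictor: diagonal of the inverse correlation matrix."""
    k = len(columns)
    corr = [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]
    inverse = invert(corr)
    return [inverse[i][i] for i in range(k)]


def keep_low_vif(columns, threshold=5.0):
    """Indices of predictors whose VIF is below the threshold (single pass)."""
    vifs = variance_inflation_factors(columns)
    return [i for i, v in enumerate(vifs) if v < threshold]


# Toy predictors with a correlation of 0.8, so VIF = 1 / (1 - 0.8**2).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
vifs = variance_inflation_factors([x1, x2])
```

With a sample of 40 participants, the 10% rule stated above corresponds to at most 4 features per final model.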
Classification models were trained based on (1) Wearable, (2) RAI-HC, and (3) Wearable + RAI-HC. The growing method for DT models was the Classification and Regression Trees algorithm, with pruning to avoid overfitting. Key parameters included pruning, a minimum child size of 3, a minimum parent size of 5, and Gini as the impurity measure. Key parameters for RF models included the number of trees grown (100), the minimum size of terminal nodes (5), and the number of variables randomly sampled at each split (3). Due to the small size of the training data in this study, each final model was evaluated using leave-one-out cross-validation. For the 3-class outcome, the classification accuracy, recall, precision, and F1 score were calculated for each final model, and the area under the receiver operating characteristic curve (AUC), accuracy, recall, precision, and F1 score were calculated for the dichotomized fall risks. To minimize the impact on classification performance of fall frequencies being assessed at two different points in the study, individuals who had an additional fall between the RAI-HC assessment and the wearable sensor data collection within the last 90-day time window were excluded from model building.
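The evaluation procedure can be sketched as follows. The Python code below shows leave-one-out cross-validation and the accuracy, precision, recall, and F1 calculations for a binary outcome. To stay self-contained, it uses a 1-nearest-neighbour rule as a stand-in classifier, which is an assumption for illustration and not one of the algorithms (logistic regression, DT, RF) used in this study; all function names are hypothetical.

```python
def loocv_predictions(X, y, fit_predict):
    """Predict each case from a model fitted on all the other cases."""
    predictions = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        predictions.append(fit_predict(train_X, train_y, X[i]))
    return predictions


def nearest_neighbour(train_X, train_y, x):
    """Stand-in classifier (1-nearest neighbour), NOT the study's RF."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return train_y[dists.index(min(dists))]


def binary_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy, well-separated data with a single feature per case.
X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
y = ["low", "low", "low", "high", "high", "high"]
preds = loocv_predictions(X, y, nearest_neighbour)
metrics = binary_metrics(y, preds, positive="high")
```

Because every case is held out exactly once, leave-one-out cross-validation yields one prediction per participant, so the reported accuracy is the proportion of participants classified correctly when excluded from training.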

Subject Characteristics
Of the 40 participants aged 65 to 93 years in this study, 22 (55%) were males, and 18 (45%) were females. Table 2 shows the basic characteristics of all participants in this study based on their latest RAI-HC assessments.

Statistical Analysis
The results of the Shapiro-Wilk tests of normality showed that only the daily activity time (G0: P=.67, G1: P=.30, G2+: P=.09) was normally distributed in all 3 groups. Table 3 summarizes the PA, HR, and night sleep measurements collected by the Mi Band from different faller groups.

Physical Activity Measurements
The one-way ANOVA test results showed that there was a significant difference in daily activity time (P=.04). However, the follow-up comparisons with the Games-Howell test indicated that the actual pairwise differences were not significant.
The Kruskal-Wallis H test results revealed that there was a significant difference in daily steps among the 3 faller groups (P=.04), with a mean rank daily steps of 26.53 for G0, 18.00 for G1, and 15.67 for G2+. The posthoc Mann-Whitney test results showed that the daily steps were not significantly different between any two comparison groups, with a Bonferroni correction at a 0.05/3=0.0167 level of significance.
Similarly, a significant difference was found in daily distance among the 3 faller groups (P=.04), with a mean rank daily distance of 26.53, 17.92, and 15.75 for G0, G1, and G2+, respectively. The posthoc Mann-Whitney tests with Bonferroni correction indicated that daily distance was not significantly different between any two comparison groups.
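For readers unfamiliar with the rank-based comparison reported above, the following Python sketch computes the Kruskal-Wallis H statistic, the mean rank of each group, and the Bonferroni-corrected significance level. It is a simplified illustration under stated assumptions: no tie correction is applied to H, the P value uses the chi-square approximation with 2 degrees of freedom (valid only for 3 groups, where the survival function reduces to exp(-H/2)), and the function names are hypothetical. The study itself used R.

```python
from math import exp


def average_ranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return (H, mean rank per group), without tie correction."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start, mean_ranks = 0.0, 0, []
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        start += len(g)
        mean_ranks.append(rank_sum / len(g))
        total += rank_sum ** 2 / len(g)
    h = 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    return h, mean_ranks


def p_value_three_groups(h):
    """Chi-square approximation with df = 2 (3 groups only)."""
    return exp(-h / 2)


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


# Toy data: three fully separated groups of three observations each.
h, mean_ranks = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
p = p_value_three_groups(h)
alpha_pairwise = bonferroni_alpha(0.05, 3)
```

With 3 groups there are 3 pairwise comparisons, which gives the 0.05/3=0.0167 threshold used for the posthoc Mann-Whitney tests. An omnibus test can be significant while no single pairwise comparison clears this stricter threshold, as observed here.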
The two-way repeated measures ANOVA test results revealed that there was a significant main effect of steps by days between groups (P=.02). The posthoc tests with Bonferroni correction showed no significant pairwise differences among the 3 groups. The main effect of day of measurement was insignificant, indicating that there was no consistent difference in step counts across different days, if the groups being measured were ignored. No significant interaction effect between daily steps and the 3 faller groups was detected.

Heart Rate Measurements
The Kruskal-Wallis H test results indicated no significant difference in daily resting HR or daily walking HR between groups.
Furthermore, the mean, median, SD, and interquartile range (IQR) of each participant's daily average HR were examined for differences across the groups. The results of the normality test revealed that the SD of daily average HR was normally distributed across all 3 groups. The mean, median, and IQR of daily average HR were shown to be significantly non-normal (P<.001, P<.001, and P=.007, respectively).
The one-way ANOVA test results showed that there was no significant difference in the participants' SD of daily average HR. The Kruskal-Wallis H test results revealed no significant difference in the mean, median, or IQR of daily average HR between groups. The two-way repeated measures ANOVA test results revealed an insignificant main effect of HR by days between groups. The main effect of the days being measured was nonsignificant, indicating that there was no consistent difference in HR across different days, if the groups being measured were ignored. No significant interaction effect between daily average HR and the 3 faller groups was detected.

Night Sleep Measurements
The Kruskal-Wallis H test results revealed that there was no statistically significant difference in daily sleep duration, daily deep sleep time, daily light sleep time, or daily awake time among 3 faller groups.
The two-way repeated measures ANOVA test results showed an insignificant main effect of sleep duration by days between groups. The main effect of the days being measured was insignificant, indicating that there was no consistent difference in sleep duration across different days, if the groups being measured were ignored. No significant interaction effect between daily sleep duration and the 3 faller groups was detected.
Table 4 shows the 3-class classification results for POM, DT, and RF on Wearable, RAI-HC, and Wearable + RAI-HC. In the 3 faller group classification, RF achieved the best accuracy of 0.838 (±0.199), recall of 0.775 (±0.233), precision of 0.730 (±0.259), and F1 score of 0.748 (±0.248) using both wearable and RAI-HC data. The lowest accuracy occurred in POM using wearable data. Table 5 tabulates the feature analysis results for all classification models, listing the features that were selected in the 3 datasets with 3-class classification and dichotomization in two different ways. Table 6 and Table 7 report the binary classification results for G0 versus {G1+G2+} and for {G0+G1} versus G2+, respectively.

General Discussions
To the best of our knowledge, no prior study has combined off-the-shelf wearable sensor data with the interRAI assessment system to examine the characteristics of different faller groups in community-dwelling older people, or to build fall risk classification models with the combination of wearable and interRAI data. There was a gap in knowledge necessary to understand the associations between PA, HR, and night sleep and different fall frequencies in the target population. This pilot study aimed to fill this gap.
It was hypothesized that there were differences in PA, HR, and night sleep among the 3 faller groups in the target population. The statistical test results revealed a significant difference in PA, including daily steps, daily distance, and daily activity time, between groups. The findings are consistent with the literature regarding PA and falls, that is, a decline in PA is associated with increased occurrences of falls [45,54]. However, the small sample size could have made it difficult to detect significant associations. Although there were group differences, the subsequent pairwise comparisons were not significant.
The findings in this study are in line with previous research that examined risk factors for falls in community-dwelling older adults [27,28,32,33,47,54]. For example, Gaßmann et al (2009) [47] examined predictors for single and recurrent fallers in older people living in community, and the results indicated poor health status, lower physical functioning, and mobility were risk factors for falls [47]. In our study, the top features (Table 5) incorporated into model-building were associated with poor health status, such as number of emergency room (ER) visits and IADL from the RAI-HC data, which are major risk factors for falls. As a baseline geriatric assessment to evaluate older adults who utilize home care services, the RAI-HC data represent a comprehensive assessment framework, which may serve well as a fall risk screening method. Similarly, wearable data contain discriminatory power in classifying fall risks. For instance, daily resting HR derived from the wearable device was associated with frailty, which was considered a risk factor for falls [7].
In the 3 faller group classification, RF achieved the best accuracy of 83.8% using both wearable and RAI-HC data. This shows that, to classify an individual into 1 of the 3 faller groups (G0, G1, or G2+), applying the RF algorithm to both wearable and RAI-HC data outperforms all the other methods (Table 4). When the 3-class outcome was dichotomized, the combination of wearable and RAI-HC data led to the best classification results as well (Table 6 and Table 7). The 2 datasets represent distinct features associated with fall risk. For example, the wearable data provide objective information on motion, whereas the RAI-HC data represent a comprehensive geriatric assessment, measuring IADL, cognition, communication, pain, behavior, and mood utilizing standardized scoring schema to generate summary indicators [26]. Merging these 2 datasets seems to bring added value when conducting automatic feature selection with the recursive feature elimination algorithm.
When the 3-class outcome was dichotomized into binary classification models, the RF algorithm with both wearable and RAI-HC data led to strong discrimination, with an AUC of 0.894, when classifying an individual as a nonfaller or single faller {G0+G1} versus a recurrent faller G2+. It is recommended to use both datasets, as Table 7 suggests; the best features are the Method for Assigning Priority Levels (MAPLe), number of ER visits, daily walking HR, and short-term memory, as tabulated in Table 5. Similarly, compared with all the other methods and models that classify an individual as a nonfaller G0 or a faller {G1+G2+}, the RF algorithm with both wearable and RAI-HC data gave strong discrimination, with an AUC of 0.865 (Table 6). Again, it is recommended to use the combination of wearable and RAI-HC data; the best features are MAPLe, IADL-difficulty prep meal, overall change in care needs, and daily steps, as tabulated in Table 5.
Comparing the two different ways of dichotomization, that is, G0 versus {G1+G2+} and {G0+G1} versus G2+, the classification models distinguishing {G0+G1} and G2+ had better performance. However, the binary classification results of this study did not show any consistent trend as to whether G1 is more similar to G0 or G2+. There seems to be no clear and hard boundary between any two adjacent groups. Intuitively, because of the multifactorial nature of risk factors for falls, the boundaries on both sides of G1 are expected to be fuzzy.

Limitations
The main limitation of this study is the relatively small sample size, which is not sufficient for a robust analysis of binary and accidental fall data, especially in a machine learning context. The small number of participants compromises the accuracy and, therefore, the validity of the study findings. Although it may be difficult to generalize or draw conclusions from a small dataset, the leave-one-out cross-validation method helps address the limitation of small dataset size. The gap between the wearable and RAI-HC data collection, and the subsequent decision to use the fall frequency on the RAI-HC assessment for model building on Wearable + RAI-HC data, may have limited the ability to compare classifier performance between the groups. In particular, the wearable component may have been disadvantaged by being correlated with an outdated number of falls. Evidence suggested that a response bias, in particular social desirability bias, may have been introduced into this study, as some participants underreported their fall frequencies when compared with the responses from their primary caregivers. We used cross-sectional data instead of longitudinal outcomes, which is another major limitation that has to be addressed in future work. The findings from this study suffer from limited generalizability because of the homogeneous and small sample from community-based settings within a particular geographic area. The use of retrospective fall occurrence and the lack of follow-up observation account for another limitation. In addition, although the selected wearable device is capable of monitoring sleep patterns at night with auto sleep detection, it cannot reliably detect relatively short periods of sleep or fragmented sleep. As such, the Mi Band in this study did not properly identify daytime napping.

Conclusions
This study provides a knowledge base that future research in fall risk assessment can leverage. With a better and fuller understanding of fall risk and of the varying characteristics of older people with different fall histories, better-informed recommendations can be made for individuals in this population. Both wearable data and the RAI-HC assessment can contribute to fall risk classification. All the classification models revealed that the RAI-HC outperforms wearable data, and the best performance was achieved with the combination of the 2 datasets. Future studies in fall risk assessment should consider using wearable technologies to supplement resident assessment instruments. Future studies are also needed to address the limitations of this study. For instance, larger sample sizes, a reduced gap between the RAI-HC and wearable sensor data collection, longer study periods, and possibly fuller use of the collected longitudinal data may be helpful in better estimating fall risk classification performance. Studies on different older adult populations are warranted, including clinical inpatients, long-term care residents, and other institutional residents.