Improving Measurement in Longitudinal Studies of Aging
This is the newest thematic area of NIMLAS, first introduced in 2025. It focuses on methods for measuring and reducing measurement error in longitudinal studies of aging. Additional content is forthcoming, along with opportunities to join this new working group.
Current Critical Directions for Future Research on Improving Measurement in Longitudinal Studies of Aging:
- Interview Mode: What are the optimal modes of data collection for longitudinal studies of aging? How does mode affect measurement error as well as other errors in longitudinal survey data collection? How do mixed-mode designs improve data quality?
- Measurement Error: What is the prevalence of measurement error unique to longitudinal studies (e.g., learning effects/conditioning), and what are the best methods for measuring and reducing this measurement error? Is the measurement error systematic or variable? What are the sources of reduced reliability in self-administered items on the web or using IVR technology?
- Measuring Cognition: What are the best practices for obtaining the highest quality measures of cognition for different aging subpopulations?
- Mode Effects and Cognition: What data collection modes are optimal for measuring cognition, and which cognitive measures are optimal for which modes? How can researchers modify cognition measurement tools to best fit the mode? How can researchers adjust for mode effects when analyzing cognition?
- Differential Effectiveness of Measurement: Are the instruments that we use to collect survey measures (or passive measures, using new measurement technologies) equally accepted, effective, and suitable for measuring different subpopulations, resulting in broad representation? Do we need different measurement instruments for different subgroups / languages, based on cognitive interviewing, pre-testing, knowledge of technology, etc.?
- Rapid Assessment: How can data best be collected to address urgent public health needs? What are the options for population-based data collection under unprecedented circumstances faced by the population at large?
Bibliography
All bibliography entries below are tagged with colored shapes corresponding to the major thematic research areas of NIMLAS. Specific critical topics for future research that the particular product within each area is addressing are provided in text next to the colored shapes.
Data collection methods for improving representation
Addressing increasing attrition rates
New measurement technologies
Consent to additional data collection
Improving measurement in longitudinal studies of aging
Diemer, M. A., Frisby, M. B., Marchand, A. D., & Bardelli, E. (2024)
Illustrating and enacting a Critical Quantitative approach to measurement with MIMIC models
Journal of Research on Educational Effectiveness, 1–24. doi.org/10.1080/19345747.2024.2391774
Summary
This paper provides a how-to guide for planning, implementing, and evaluating the MIMIC (Multiple Indicators, Multiple Causes) method in diverse populations – in this case, large samples of Black and white respondents from the MADICS study. The MIMIC approach can also probe for differential item functioning (DIF) in longitudinal measures (i.e., temporal invariance). MIMIC models afford powerful claims about measurement – particularly in terms of biased items – and are relatively simple to specify and test. The authors argue that MIMIC models are sorely underutilized and serve important roles in ensuring sound and fair measurement. To increase their use, the tutorial carefully explains how to specify, interpret, and evaluate MIMIC models, and provides sample code in R (lavaan) and Mplus. MIMIC models are explained in accessible “plain English” and Greek (notation), with an OSF folder providing annotated code and output.
Measurement Error
Domingue, B. W., McCammon, R. J., West, B. T., Langa, K. M., Weir, D. R., & Faul, J. (2023)
The Mode Effect of Web-Based Surveying on the 2018 U.S. Health and Retirement Study Measure of Cognitive Functioning
The Journals of Gerontology: Series B, 78(9), 1466–1473. doi.org/10.1093/geronb/gbad068
Summary
This study examined the effect of mode on respondent performance on cognitive tests, where mode (web vs. telephone) was randomly assigned. Those assigned to the web mode scored higher than those assigned to the telephone mode, particularly on the Serial 7 task and numeracy items. The authors recommend a mode-dependent scoring system for classifying respondents as cognitively impaired but not demented.
Mode Effects and Cognition, Measuring Cognition
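For context, the Serial 7 task asks respondents to subtract 7 from 100 five times. A common scoring convention in HRS-style instruments (sketched below as an assumption, not this paper's exact algorithm) credits each response that equals the previous given response minus 7, so one slip does not cascade through later answers.

```python
def serial7_score(responses, start=100):
    """Score the Serial 7 subtraction task (0-5 points). Each response
    earns a point if it equals the previous *given* response minus 7,
    so an early mistake does not penalize later correct subtractions."""
    score, prev = 0, start
    for r in responses[:5]:       # only the first five responses count
        if r == prev - 7:
            score += 1
        prev = r                  # continue from what was actually said
    return score
```

For example, the sequence 93, 85, 78, 71, 64 scores 4: only the second response (85 instead of 86) is wrong.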
Gatz, M., Schneider, S., Meijer, E., Darling, J. E., Orriens, B., Liu, Y., & Kapteyn, A. (2022)
Identifying Cognitive Impairment Among Older Participants in a Nationally Representative Internet Panel
The Journals of Gerontology: Series B, 78(2), 201–209. doi.org/10.1093/geronb/gbac172
Summary
Cognitive impairment is a major health issue affecting many older adults, making it imperative to have good indicators of cognitive functioning. This study shows that cognitive measures administered by web and phone in a nationally representative internet panel are useful for measuring cognitive functioning, allowing for the development of a cognitive impairment score.
Measuring Cognition
Kumari, M., Andrayas, A., Al Baghal, T., Burton, J., Crossley, T. F., Kerry, S. J., Parkington, D. A., Koulman, A., & Benzeval, M. (2023)
A Randomised Study of Nurse Collected Venous Blood and Self-Collected Dried Blood Spots for the Assessment of Cardiovascular Risk Factors in the Understanding Society Innovation Panel
Scientific Reports, 13, 13008. doi.org/10.1038/s41598-023-39674-6
Summary
This study examined whether participants in the U.K. Understanding Society Innovation Panel were willing to provide a blood sample in different settings (randomly assigned to (1) nurse collection of dried blood spots (DBS), (2) nurse collection of a blood sample by venepuncture, or (3) self-collection of DBS), and how the resulting cardiovascular risk biomarkers compared. Although willingness was lowest for self-collected DBS, the demographic characteristics of participants in the self-collection mode did not differ from those in the nurse collection modes. Further, clinical biomarker information relevant to cardiovascular disease risk did not differ between the venepuncture blood samples and the DBS. This demonstrates that DBS collection offers acceptable measures of clinically relevant biomarkers, enabling the calculation of population levels of cardiovascular disease risk.
Differential Effectiveness of Measurement, Measurement Error
Nichols, E., Gross, A. L., Zhang, Y. S., Meijer, E., Hayat, S., Steptoe, A., Langa, K. M., & Lee, J. (2024)
Considerations for the use of the Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE) in cross-country comparisons of cognitive aging and dementia
Alzheimer’s & Dementia, 20, 4635–4648. https://doi.org/10.1002/alz.13895
Summary
This study compares the performance of the IQCODE in capturing objective cognition across the U.S., England, and India. It reports strong associations between the IQCODE and objective cognition at low levels of cognitive functioning in the U.S. and England. In India, however, the IQCODE was less sensitive to impairments at the lowest levels of cognitive functioning, particularly among those with no education. Informant characteristics may differentially affect informant reports across countries. Researchers should consider country-specific adjustments to IQCODE scoring based on informant characteristics to improve cross-national comparability.
Measuring Cognition, Differential Effectiveness of Measurement
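As background on the instrument: the IQCODE asks an informant to rate change in the respondent's everyday cognition on a 1-5 scale (3 = no change, above 3 = decline), and the score is conventionally the item mean. The sketch below shows that common short-form convention only; item counts, cutoffs, and the country-specific adjustments the paper discusses vary by version.

```python
from statistics import mean


def iqcode_score(ratings):
    """Conventional IQCODE score: the mean of informant ratings, each on
    a 1-5 scale where 3 means 'no change' and higher values mean decline.
    This is the common short-form convention, not a study-specific rule."""
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("each rating must be an integer from 1 to 5")
    return mean(ratings)
```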
Nichols, E., Jones, R. N., Gross, A. L., Hayat, S., Zaninotto, P., & Lee, J. (2024)
Development and assessment of analytic methods to improve the measurement of cognition in longitudinal studies of aging through the use of substudies with comprehensive neuropsychological testing
Alzheimer’s & Dementia, 20, 7024–7036. https://doi.org/10.1002/alz.14175
Summary
Regression modeling or confirmatory factor analysis (CFA) can be used to incorporate information from substudies with comprehensive neuropsychological testing into measures of cognition in broader aging surveys. Compared to a gold-standard measure based on comprehensive neuropsychological testing, both approaches had lower mean squared error than existing comparison approaches. Associations with example risk factors were similar across all approaches, though estimated standard errors were most accurate for the CFA models. The similarity across approaches may be due to the brevity of the available cognitive assessments in the example dataset.
Measuring Cognition, Differential Effectiveness of Measurement
Nichols, E. & Lee, J. (2024)
Considerations around the measurement of cognition in large-scale cross-national surveys: Lessons from the Health and Retirement International Network of Surveys (HRS INS) and the Harmonized Cognitive Assessment Protocol (HCAP)
CESR-Schaeffer Working Paper No. 2024-012, http://dx.doi.org/10.2139/ssrn.4986761
Summary
This paper presents key lessons on cognitive assessment from the international collaboration of aging studies. It discusses the challenges of administering cognitive tests across populations with different cultural and linguistic backgrounds, the importance of maintaining consistency across time and studies, the feasibility of test implementation in both high-income and low- and middle-income countries, and the value of comprehensive cognitive batteries for improved measurement precision.
Measuring Cognition, Differential Effectiveness of Measurement
Nichols, E., Markot, M., Gross, A. L., Jones, R. N., Meijer, E., Schneider, S., & Lee, J. (2025)
The Added Value of Metadata on Test Completion Time for the Quantification of Cognitive Functioning in Survey Research
Journal of the International Neuropsychological Society, 1-10. doi.org/10.1017/S1355617724000742.
Summary
This study examined the relationship between response times on cognitive tests in computerized in-person interviews and cognitive performance. Nonlinear associations between response time and cognitive functioning were reported after adjusting for traditional cognitive test scores. The results indicate that response times from cognitive testing may contain important information on cognition not captured by traditional scoring, and incorporating this information has the potential to improve existing estimates of cognitive functioning.
Measuring Cognition, Paradata
Sanders, S., Schofield, L. S., Schumm, L. P., & Waite, L. (2025)
Measuring Cognitive Function and Cognitive Decline With Response Time Data in the National Social Life, Health, and Aging Project
The Journals of Gerontology: Series B, Psychological sciences and social sciences, 80(Supplement_1), S66–S74. doi.org/10.1093/geronb/gbae037
Summary
Using data on the response times to standard cognition questions in the National Social Life, Health, and Aging Project, this study examined the relationship between response time and the Montreal Cognitive Assessment (MoCA). The results show that response time predicted current as well as future MoCA scores. This predictive power varied by race and age but not by gender.
Measuring Cognition, Paradata
Schneider, S., Junghaenel, D. U., Meijer, E., Stone, A. A., Orriens, B., Jin, H., Zelinski, E. M., Lee, P-J., Hernandez, R., & Kapteyn, A. (2023)
Using Item Response Times in Online Questionnaires to Detect Mild Cognitive Impairment
The Journals of Gerontology: Series B, 78(8), 1278–1283. doi.org/10.1093/geronb/gbad043
Summary
This study examined the utility of response times in online surveys for discriminating respondents’ cognitive health. Specifically, it analyzed response times from 1,053 items across 37 online surveys administered over 6.5 years, together with cognitive health measured at the end of this period, in a multilevel location-scale model. Both average response times and fluctuations in response times were associated with subsequent cognitive health, suggesting that response times on survey items may be a potential indicator of cognitive impairment.
Measuring Cognition, Paradata
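A location-scale model uses two person-level quantities: typical response speed (location) and variability in speed (scale). A minimal descriptive sketch of those two features, computed from per-survey response times, is below; it is an illustration of the concept, not the paper's multilevel estimation.

```python
from statistics import mean, stdev


def rt_features(times_by_survey):
    """Person-level response-time features in the spirit of a
    location-scale model: 'location' is the person's average response
    time across surveys, 'scale' is the between-survey fluctuation
    (SD of the per-survey mean times). times_by_survey is a list of
    lists, one inner list of item response times per survey."""
    survey_means = [mean(ts) for ts in times_by_survey]
    return mean(survey_means), stdev(survey_means)
```

In the full model, both quantities are estimated jointly with random effects rather than computed descriptively, which properly separates measurement noise from true fluctuation.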
Schneider, S., Junghaenel, D. U., Zelinski, E. M., Meijer, E., Stone, A. A., Langa, K. M., & Kapteyn, A. (2021)
Subtle mistakes in self-report surveys predict future transition to dementia
Alzheimer’s & Dementia (Amsterdam, Netherlands), 13(1), e12252. doi.org/10.1002/dad2.12252
Summary
This study examined the relationship between the errors respondents make in completing survey interviews (e.g., implausible responses, skipped questions) and subsequent incident dementia using Health and Retirement Study data. All response error variables showed an independent relationship with dementia, and the relationship was stronger for those who were younger and cognitively normal at baseline.
Measuring Cognition, Measurement Error, Paradata
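The response errors the study draws on can be operationalized as simple flags computed from the answer record. The sketch below counts two such flags, skipped questions and out-of-range values; the plausibility range is illustrative, not the study's actual rule set.

```python
def response_error_flags(answers, plausible_range):
    """Count two simple kinds of response error: skipped questions
    (recorded here as None) and implausible answers falling outside a
    given (low, high) range. The range is an illustrative assumption."""
    low, high = plausible_range
    skipped = sum(1 for a in answers if a is None)
    implausible = sum(
        1 for a in answers if a is not None and not (low <= a <= high)
    )
    return {"skipped": skipped, "implausible": implausible}
```

For example, with a plausible range of 0-120 for an age-like item, the answers [5, None, 300, 7] yield one skip and one implausible value.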
Stopczynski, A., Sekara, V., Sapiezynski, P., Cuttone, A., Madsen, M. M., Larsen, J. E., & Lehmann, S. (2014)
Measuring large scale social networks with high resolution
PLOS ONE, 9(4), e95978. doi.org/10.1371/journal.pone.0095978
Summary
Bluetooth and Wi-Fi networks can be very useful for collecting information about social networks within a specific location and could be used to map connections within aging populations residing in assisted living facilities. These social network data can also be linked to relevant health data.
Social Network Measurement