AgeTopicModels: Inferring Age-Dependent Disease Topic from Diagnosis Data

We propose age-dependent topic modelling (ATM), which provides a low-rank representation of longitudinal records of hundreds of distinct diseases in large electronic health record data sets. The model assigns each individual a set of topic weights over several disease topics; each disease topic reflects a set of diseases that tend to co-occur as a function of age, quantified by age-dependent topic loadings for each disease. The model assumes that, for each disease diagnosis, a topic is sampled based on the individual’s topic weights (which sum to 1 across topics for a given individual), and a disease is then sampled based on the individual’s age and the age-dependent loadings of the sampled topic (which sum to 1 across diseases for a given topic at a given age). The model generalises Latent Dirichlet Allocation (LDA) by allowing the topic loadings of each topic to vary with age. References: Jiang (2023) <doi:10.1038/s41588-023-01522-8>.
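
The two-step sampling described above can be illustrated with a minimal sketch. This is not the package's code or API: it is written in plain Python for illustration, the names `topic_loadings`, `sample_diagnosis`, and `TOPIC_COEFS` are hypothetical, and the linear-in-age softmax used to make the loadings vary with age is an assumption chosen for brevity, not necessarily the parameterisation used by the package.

```python
import math
import random


def softmax(scores):
    """Map arbitrary real scores to probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def topic_loadings(age, coefs):
    """Loadings over diseases for ONE topic at a given age.

    ASSUMPTION: each disease has a hypothetical (intercept, slope) pair and
    its score is linear in age; the softmax makes the loadings sum to 1
    across diseases for this topic at this age.
    """
    return softmax([intercept + slope * age for intercept, slope in coefs])


def sample_diagnosis(rng, topic_weights, age, topic_coefs):
    """Draw one (topic, disease) index pair for an individual of a given age."""
    # Step 1: sample a topic from the individual's topic weights.
    k = rng.choices(range(len(topic_weights)), weights=topic_weights)[0]
    # Step 2: sample a disease from that topic's loadings at this age.
    loadings = topic_loadings(age, topic_coefs[k])
    j = rng.choices(range(len(loadings)), weights=loadings)[0]
    return k, j


# Illustrative numbers only: 2 topics, 3 diseases, (intercept, slope) per disease.
TOPIC_COEFS = [
    [(0.0, 0.05), (1.0, -0.02), (0.5, 0.0)],
    [(2.0, -0.03), (0.0, 0.04), (-1.0, 0.01)],
]

rng = random.Random(1)
weights = [0.7, 0.3]  # one individual's topic weights, summing to 1
for age in (40, 70):
    print(age, sample_diagnosis(rng, weights, age, TOPIC_COEFS))
```

Setting every slope to zero makes the loadings constant in age, which recovers the sampling step of ordinary LDA; this is the sense in which the age-dependent model generalises it.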

Version: 0.1.0
Depends: R (≥ 3.5)
Imports: dplyr, ggplot2, ggrepel, grDevices, gtools, magrittr, pROC, reshape2, rlang, stats, stringr, tibble, tidyr, utils
Suggests: testthat (≥ 3.0.0)
Published: 2025-10-21
DOI: 10.32614/CRAN.package.AgeTopicModels (may not be active yet)
Author: Xilin Jiang [aut, cre]
Maintainer: Xilin Jiang <jiangxilin1 at gmail.com>
License: MIT + file LICENSE
NeedsCompilation: no
Materials: README, NEWS
CRAN checks: AgeTopicModels results

Documentation:

Reference manual: AgeTopicModels.html, AgeTopicModels.pdf

Downloads:

Package source: AgeTopicModels_0.1.0.tar.gz
Windows binaries: r-devel: not available, r-release: not available, r-oldrel: not available
macOS binaries: r-release (arm64): AgeTopicModels_0.1.0.tgz, r-oldrel (arm64): not available, r-release (x86_64): AgeTopicModels_0.1.0.tgz, r-oldrel (x86_64): AgeTopicModels_0.1.0.tgz

Linking:

Please use the canonical form https://CRAN.R-project.org/package=AgeTopicModels to link to this page.