| Type: | Package | 
| Title: | D-Score for Child Development | 
| Version: | 2.0.0 | 
| Description: | The D-score summarizes a child's performance on developmental milestones into a single number. Its key feature is its generic nature. The method does not depend on a specific measurement instrument. The statistical method underlying the D-score is described in van Buuren et al. (2025) <doi:10.1177/01650254241294033>. This package implements model keys to convert milestone scores to D-scores; maps instrument-specific item names to a generic 9-position naming convention; computes D-scores and their precision from a child's milestone scores; and converts D-scores to Development-for-Age Z-scores (DAZ) using age-conditional reference standards. | 
| Depends: | R (≥ 4.1.0) | 
| Imports: | dplyr (≥ 1.0.0), Rcpp, stats, stringi, tidyr (≥ 1.0.0) | 
| LinkingTo: | Rcpp, RcppArmadillo | 
| Suggests: | ggplot2, kableExtra, knitr, lme4, Matrix, patchwork, rmarkdown, testthat | 
| Encoding: | UTF-8 | 
| License: | Apache License (≥ 2) | 
| LazyData: | TRUE | 
| VignetteBuilder: | knitr | 
| NeedsCompilation: | yes | 
| URL: | https://github.com/d-score/dscore, https://d-score.org/dscore/, https://d-score.org/dbook1/ | 
| BugReports: | https://github.com/d-score/dscore/issues | 
| RoxygenNote: | 7.3.3 | 
| Packaged: | 2025-10-02 16:30:53 UTC; buurensv | 
| Author: | Stef van Buuren [cre, aut], Iris Eekhout [aut], Arjan Huizing [aut], Jonathan Seiden [aut] | 
| Maintainer: | Stef van Buuren <stef.vanbuuren@tno.nl> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-10-03 16:30:02 UTC | 
D-score for child development
Description
The dscore package implements the tools needed to calculate the D-score,
a single number that summarizes early development in children.
User functions
The available functions are:
| Function | Description | 
| get_itemnames() | Extract item names from an itemtable | 
| order_itemnames() | Order item names | 
| sort_itemnames() | Sort item names | 
| decompose_itemnames() | Get four components from itemname | 
| get_itemtable() | Get a subset from the itemtable | 
| get_labels() | Get labels for items | 
| rename_gcdg_gsed() | Rename gcdg into gsed lexicon | 
| dscore() | Estimate D-score and DAZ | 
| dscore_posterior() | Calculate full posterior of D-score | 
| get_tau() | Get difficulty parameters from item bank | 
| daz() | Transform to age-adjusted standardized D-score | 
| zad() | Inverse of daz() | 
| get_reference() | Get D-score reference tables | 
| get_age_equivalent() | Translate difficulty to age | 
Built-in data
The package contains the following built-in data:
| Data | Description | 
| builtin_keys | Available keys for calculating the D-score | 
| builtin_itembank | Collection of items fitting the Rasch model | 
| builtin_itemtable | Collection of items from instruments measuring early child development | 
| builtin_references | Collection of age-conditional reference distributions | 
| milestones | Dataset with PASS/FAIL responses for 27 preterms | 
| gsample | Sample of 10 children from the GSED Phase 1 study, gsed lexicon | 
| sample_sf | Sample of 10 children from GSED Short Form (GSED-SF) | 
| sample_lf | Sample of 10 children from GSED Long Form (GSED-LF) | 
| sample_hf | Sample of 10 children from GSED Household Form (GSED-HF) | 
Acknowledgements
The authors wish to recognize the principal investigators and their study team members for their generous contribution of the data that made this tool possible and the members of the Ki team who directly or indirectly contributed to the study: Amina Abubakar, Claudia R. Lindgren Alves, Orazio Attanasio, Maureen M. Black, Maria Caridad Araujo, Susan M. Chang-Lopez, Gary L. Darmstadt, Bernice M. Doove, Wafaie Fawzi, Lia C.H. Fernald, Günther Fink, Emanuela Galasso, Melissa Gladstone, Sally M. Grantham-McGregor, Cristina Gutierrez de Pineres, Pamela Jervis, Jena Derakhshani Hamadani, Charlotte Hanlon, Simone M. Karam, Gillian Lancaster, Betzy Lozoff, Gareth McCray, Jeffrey R Measelle, Girmay Medhin, Ana M. B. Menezes, Lauren Pisani, Helen Pitchik, Muneera Rasheed, Lisy Ratsifandrihamanana, Sarah Reynolds, Linda Richter, Marta Rubio-Codina, Norbert Schady, Limbika Sengani, Chris Sudfeld, Marcus Waldman, Susan P. Walker, Ann M. Weber and Aisha K. Yousafzai.
This study was supported by the Bill & Melinda Gates Foundation. The contents are the sole responsibility of the authors and may not necessarily represent the official views of the Bill & Melinda Gates Foundation or other agencies that may have supported the primary data studies used in the present study.
Author(s)
Maintainer: Stef van Buuren stef.vanbuuren@tno.nl
Authors:
- Iris Eekhout iris.eekhout@tno.nl 
- Arjan Huizing arjan.huizing@tno.nl 
- Jonathan Seiden jseiden@g.harvard.edu 
References
Jacobusse, G., S. van Buuren, and P.H. Verkerk. 2006. “An Interval Scale for Development of Children Aged 0-2 Years.” Statistics in Medicine 25 (13): 2272–83. https://stefvanbuuren.name/publication/jacobusse-2006/
Van Buuren S (2014). Growth charts of human development. Stat Methods Med Res, 23(4), 346-368. https://stefvanbuuren.name/publication/van-buuren-2014-gc/
Weber AM, Rubio-Codina M, Walker SP, van Buuren S, Eekhout I, Grantham-McGregor S, Caridad Araujo M, Chang SM, Fernald LCH, Hamadani JD, Hanlon A, Karam SM, Lozoff B, Ratsifandrihamanana L, Richter L, Black MM (2019). The D-score: a metric for interpreting the early development of infants and toddlers across global settings. BMJ Global Health, 4: e001724. https://gh.bmj.com/content/bmjgh/4/6/e001724.full.pdf
GSED team (Maureen Black, Kieran Bromley, Vanessa Cavallera (lead author), Jorge Cuartas, Tarun Dua (corresponding author), Iris Eekhout, Gunther Fink, Melissa Gladstone, Katelyn Hepworth, Magdalena Janus, Patricia Kariger, Gillian Lancaster, Dana McCoy, Gareth McCray, Abbie Raikes, Marta Rubio-Codina, Stef van Buuren, Marcus Waldman, Susan Walker and Ann Weber). 2019. “The Global Scale for Early Development (GSED).” Early Childhood Matters. https://earlychildhoodmatters.online/2019/the-global-scale-for-early-development-gsed/
See Also
Useful links:
- Report bugs at https://github.com/d-score/dscore/issues 
Collection of items fitting the Rasch model
Description
A data frame with administrative information per item with difficulty
estimates (tau) from the Rasch model. The item bank provides the basic
information to calculate D-scores. The items in the item bank
are a subset of all items as collected in builtin_itemtable.
Usage
builtin_itembank
Format
A data.frame with variables:
| Name | Label | 
| key | String indicating a specific Rasch model | 
| item | Item name, gsed lexicon | 
| tau | Difficulty estimate | 
| label | Label (English) | 
| instrument | Instrument code | 
| domain | Domain code | 
| mode | Administration mode | 
| number | Item number | 
Details
The difficulty estimates were estimated by a Rasch model. The key
indicates the specific Rasch model used to estimate the difficulty.
Strictly speaking, one can only compare D-scores calculated from the
same key.
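As a minimal illustration of what a difficulty estimate means, the sketch below evaluates the standard Rasch model on the logit scale: the probability of passing an item depends only on the difference between the child's ability and the item's difficulty. The function name rasch_prob and all numeric values are hypothetical. The package applies a key-specific scaling between logits and D-scores that is not reproduced here.

```r
# Standard Rasch model on the logit scale (illustrative, not package code).
# beta:  ability of the child (logit)
# delta: difficulty of the item (logit)
rasch_prob <- function(beta, delta) {
  plogis(beta - delta)
}

# hypothetical difficulties for three items, from easy to hard
delta <- c(-1, 0, 2)

# pass probabilities for a child with ability 0: the harder the item,
# the lower the probability of a PASS
p <- rasch_prob(0, delta)
round(p, 3)
```

An item whose difficulty equals the child's ability is passed with probability 0.5, which is what makes difficulties from one model comparable on a common scale.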
Note
Updates:
- Dec 01, 2022 - Overwrite labels of gto by correct item order. 
- Dec 05, 2022 - Adds key gsed2212, adding instruments gl1 and gs1, and defining the correct order for gto 
- Jan 05, 2023 - Adds instrument gh1 to key gsed2212 
See Also
dscore(), get_tau(), builtin_itemtable()
Examples
# count number of items per instrument in each key
table(builtin_itembank$instrument, builtin_itembank$key)
Collection of items from instruments measuring early child development
Description
The built-in variable builtin_itemtable contains the name and label
of items for measuring early child development.
Usage
builtin_itemtable
Format
A data.frame with variables:
| Name | Label | 
| item | Item name, gsed lexicon | 
| equate | Equate group | 
| label | Label (English) | 
Details
The builtin_itemtable is created by script
data-raw/R/save_builtin_itemtable.R.
Updates:
- May 30, 2022 - added gto (LF) and gpa (SF) items 
- June 1, 2022 - added seven gsd items 
- Nov 24, 2022 - Added instruments gs1, gs2 
- Dec 01, 2022 - Labels of gto replaced by correct order. Incorrect item order affects analyses done on LF between 20220530 - 20221201 !!! 
- Dec 05, 2022 - Redefines gs1 as the instrument for Phase 2 and removes gs2 (139). Adds gl1 (Long Form Phase 2, 155 items) 
- Jan 05, 2023 - Adds 55 items from GSED-HF 
- Jul 15, 2025 - Renames gpaclc088 -> gpaclc089 (Can your child say five or more separate words) and gpasec089 -> gpasec088 (Is your child able to pee and poo) 
Author(s)
Compiled by Stef van Buuren using different sources
Available keys for calculating the D-score
Description
A key contains the item difficulty estimates from a given Rasch model.
The difficulty estimates (tau) are used to calculate D-scores.
D-scores can only be compared when calculated with the same key.
Usage
builtin_keys
Format
builtin_keys is a data.frame with variables:
| Name | Label | 
| key | String. Name of the key indicating the Rasch model | 
| base_population | String. Name of the base population for the key | 
| n_items | Number of items in the key | 
| n_instruments | Number of instruments in the key | 
| intercept | Intercept to convert logit into D-score | 
| slope | Slope to convert logit into D-score | 
| from | Starting value of the quadrature points | 
| to | Stopping value of the quadrature points | 
| by | Increment of the quadrature points | 
| retired | Has the key been retired? | 
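The intercept, slope, from, to and by columns can be read as follows. This is a minimal sketch with hypothetical values (the real settings live in builtin_keys), and the helper names logit_to_d and d_to_logit are invented for illustration.

```r
# Hypothetical key settings; real values are stored in builtin_keys.
key <- list(intercept = 50, slope = 2, from = -10, to = 100, by = 1)

# linear transform from the logit scale to the D-score scale
logit_to_d <- function(logit, intercept, slope) intercept + slope * logit

# and the inverse transform
d_to_logit <- function(d, intercept, slope) (d - intercept) / slope

# equally spaced quadrature points defined by from, to and by
qp <- seq(from = key$from, to = key$to, by = key$by)

d <- logit_to_d(c(-5, 0, 5), key$intercept, key$slope)
d
length(qp)
```

Because each key carries its own intercept, slope and grid, the same logit value maps to different D-scores under different keys.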
Note
20240609 SvB: Added builtin_keys table by
data-raw\data\R\save_builtin_keys.R
Collection of age-conditional reference distributions
Description
A data frame containing the age-dependent distribution of the D-score for children aged 0-5 years. The distribution is modelled after the LMS distribution (Cole & Green, 1992) or BCT model (Stasinopoulos & Rigby, 2022) and is equal for both boys and girls. The LMS/BCT values can be used to graph reference charts and to calculate age-conditional Z-scores, also known as the Development-for-Age Z-score (DAZ).
Usage
builtin_references
Format
A data.frame with the following variables:
| Name | Label | 
| population | Name of the reference population | 
| key | D-score key, e.g., "dutch", "gcdg" or "gsed" | 
| distribution | Distribution family: "LMS" or "BCT" | 
| age | Decimal age in years | 
| mu | M-curve, median D-score, P50 | 
| sigma | S-curve, spread expressed as coefficient of variation | 
| nu | L-curve, the lambda coefficient of the LMS/BCT model for skewness | 
| tau | Kurtosis parameter in the BCT model | 
| P3 | P3 percentile | 
| P10 | P10 percentile | 
| P25 | P25 percentile | 
| P50 | P50 percentile | 
| P75 | P75 percentile | 
| P90 | P90 percentile | 
| P97 | P97 percentile | 
| SDM2 | -2SD centile | 
| SDM1 | -1SD centile | 
| SD0 | 0SD centile, median | 
| SDP1 | +1SD centile | 
| SDP2 | +2SD centile | 
Details
Here are more details on the reference population:
The "dutch" references were calculated from the SMOCC data, and cover
age range 0-2.5 years (van Buuren, 2014).
The "gcdg" references were calculated from the 15 cohorts of the
GCDG-study, and cover age range 0-5 years (Weber, 2019).
The "phase1" references were calculated from the GSED Phase 1 validation
data (GSED-BGD, GSED-PAK, GSED-TZA) and cover age range 2 weeks-3.5 years. The
references for age range 3.5-5 years are linearly extrapolated and are only indicative.
The "preliminary_standards" were calculated from the GSED Phase 1 validation
data (GSED-BGD, GSED-PAK, GSED-TZA) using a subset of children with
covariates indicating healthy development.
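To show how the mu, sigma and nu columns relate to the centile columns, the sketch below applies the LMS centile formula of Cole & Green (1992). It is illustrative only: the function name lms_centile and the parameter values are hypothetical, and references with distribution "BCT" additionally use tau, which is ignored here.

```r
# LMS centile: D-score at standard normal deviate z, given mu (M),
# sigma (S) and nu (L). Illustrative only, not package code.
lms_centile <- function(z, mu, sigma, nu) {
  if (abs(nu) < 1e-8) {
    mu * exp(sigma * z)
  } else {
    mu * (1 + nu * sigma * z)^(1 / nu)
  }
}

# hypothetical LMS values at one age
mu <- 50
sigma <- 0.08
nu <- 1.2

# SDM2, SD0 and SDP2 correspond to z = -2, 0 and 2
sd_lines <- sapply(c(SDM2 = -2, SD0 = 0, SDP2 = 2), lms_centile,
                   mu = mu, sigma = sigma, nu = nu)

# P3, P50 and P97 correspond to the normal quantiles of 3%, 50% and 97%
pct_lines <- sapply(qnorm(c(P3 = 0.03, P50 = 0.5, P97 = 0.97)), lms_centile,
                    mu = mu, sigma = sigma, nu = nu)
round(sd_lines, 2)
round(pct_lines, 2)
```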
References
Cole TJ, Green PJ (1992). Smoothing reference centile curves: The LMS method and penalized likelihood. Statistics in Medicine, 11(10), 1305-1319.
Van Buuren S (2014). Growth charts of human development. Stat Methods Med Res, 23(4), 346-368. https://stefvanbuuren.name/publication/van-buuren-2014-gc/
Weber AM, Rubio-Codina M, Walker SP, van Buuren S, Eekhout I, Grantham-McGregor S, Caridad Araujo M, Chang SM, Fernald LCH, Hamadani JD, Hanlon A, Karam SM, Lozoff B, Ratsifandrihamanana L, Richter L, Black MM (2019). The D-score: a metric for interpreting the early development of infants and toddlers across global settings. BMJ Global Health, 4: e001724. https://gh.bmj.com/content/bmjgh/4/6/e001724.full.pdf
Stasinopoulos M, Rigby R (2022). gamlss.dist: Distributions for Generalized Additive Models for Location Scale and Shape, R package version 6.0-3, https://CRAN.R-project.org/package=gamlss.dist
Examples
# get an overview of available references per key
table(builtin_references$population, builtin_references$key)
A table to translate between different lexicons (naming schema)
Description
The built-in variable builtin_translate contains a table for
translating item names between lexicons.
Usage
builtin_translate
Format
A data.frame with variables:
| Name | Label | 
| phase1 | Item names, Phase 1 data | 
| phase2 | Item names, Phase 2 data | 
| gsed | gsed lexicon | 
| gsed2 | gto/gpa lexicon for LF/SF | 
| gsed3 | gl1/gs1 lexicon for LF/SF | 
| short1 | Short item name, phase 1 order | 
| short2 | Short item name, phase 2 order | 
| instrument | Instrument code | 
| seq_phase1 | Phase 1 order | 
| seq_phase2 | Phase 2 order | 
| label | Item label (English) | 
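A table of this shape supports name translation by simple lookup. The sketch below uses a hypothetical three-row table with invented item names (they do not occur in builtin_translate) and a hypothetical helper translate_names.

```r
# Hypothetical miniature translation table; the item names are invented
# and do not occur in builtin_translate.
tab <- data.frame(
  phase1 = c("old_a", "old_b", "old_c"),
  gsed   = c("xxxgmd001", "xxxgmd002", "xxxgmd003"),
  stringsAsFactors = FALSE
)

# translate names from one lexicon to another; unmatched names become NA
translate_names <- function(x, from, to, table) {
  table[[to]][match(x, table[[from]])]
}

new_names <- translate_names(c("old_c", "old_a", "unknown"), "phase1", "gsed", tab)
new_names
```

Returning NA for unmatched names is a design choice of this sketch: it makes untranslatable items visible instead of silently passing them through.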
Details
The builtin_translate is created by script
data-raw/R/save_builtin_translate.R.
Updates:
- July 2025 - Transferred from gsedread package 
Author(s)
Compiled by Stef van Buuren
Calculate posterior of ability
Description
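The function updates the prior into a posterior, one item at a time. If the difficulty tau of item j lies outside the relevance interval around the dynamic EAP (from EAP + rello to EAP + relhi), the procedure ignores the score of item j.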
Usage
calculate_posterior(scores, tau, qp, scale, mu, sd, relhi, rello)
Arguments
| scores | A vector with PASS/FAIL observations.
Scores are coded numerically as  | 
| tau | A vector containing the item difficulties for the item
scores in  | 
| qp | Numeric vector of equally spaced quadrature points. | 
| scale | Scale expansion | 
| mu | Numeric scalar. The mean of the prior. | 
| sd | Numeric scalar. Standard deviation of the prior. | 
| relhi | Positive numeric scalar. Upper end of the relevance interval | 
| rello | Negative numeric scalar. Lower end of the relevance interval | 
Value
A list with three elements:
| Name | Label | 
| eap | Mean of the posterior | 
| qp | Vector of quadrature points | 
| posterior | Vector with posterior distribution. | 
Since dscore V40.1 the function does not return the "start" element.
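The update can be sketched in base R as below. This is an illustration under stated assumptions, not the package's internal code: the name eap_sketch is hypothetical, the likelihood is a plain Rasch model on a unit scale (the scale argument is left out), and the relevance interval is taken as EAP + rello to EAP + relhi.

```r
# Sketch of a sequential EAP update over quadrature points.
# All names are illustrative; this is not the package's internal code.
eap_sketch <- function(scores, tau, qp, mu, sd, rello = -Inf, relhi = Inf) {
  # normal prior evaluated at the quadrature points, normalised to sum to 1
  post <- dnorm(qp, mean = mu, sd = sd)
  post <- post / sum(post)
  eap <- sum(qp * post)
  for (j in seq_along(scores)) {
    # skip missing scores and items outside the relevance interval
    if (is.na(scores[j])) next
    if (tau[j] < eap + rello || tau[j] > eap + relhi) next
    # Rasch likelihood of the observed score at each quadrature point
    p <- plogis(qp - tau[j])
    lik <- if (scores[j] == 1) p else 1 - p
    # Bayes rule: posterior is proportional to prior times likelihood
    post <- post * lik
    post <- post / sum(post)
    # dynamic EAP, used for the relevance check of the next item
    eap <- sum(qp * post)
  }
  list(eap = eap, qp = qp, posterior = post)
}

qp <- seq(-10, 10, by = 0.1)
res <- eap_sketch(scores = c(1, 1, 0), tau = c(-1, 0, 1), qp = qp, mu = 0, sd = 2)
res$eap
sum(res$posterior)
```

With a narrow relevance interval, an item far from the current EAP leaves the posterior untouched, so the result then equals the prior.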
Author(s)
Stef van Buuren, Arjan Huizing, 2020
Median D-score from the default references for the given key
Description
Returns the age-interpolated median of the D-score of the default reference for a given key.
Usage
count_mu(t, key, prior_mean_NA = NA_real_)
Arguments
| t | Decimal age, numeric vector | 
| key | Character, key of the reference population | 
| prior_mean_NA | Numeric, prior mean when age is missing | 
Details
Do not use this function if you want the median D-score for a specific reference.
DEPRECATED in dscore 1.9.6
Value
A vector of length length(t) with the median of the default reference
population for the key.
Median of Dutch references
Description
Returns the age-interpolated median of the Dutch references (van Buuren 2014).
The working range is 0-3 years. This function is used
to set prior mean under key "dutch".
Usage
count_mu_dutch(t)
Arguments
| t | Decimal age, numeric vector | 
Value
A vector of length length(t) with the median of the Dutch references.
Note
Internal function. Called by dscore()
Examples
dscore:::count_mu_dutch(0:2)
Median of GCDG references
Description
Returns the age-interpolated median of the GCDG references (Weber
et al, 2019). The working range is 0-4 years. This function is used
to set prior mean under keys "gcdg" and "gsed1912".
Usage
count_mu_gcdg(t)
Arguments
| t | Decimal age, numeric vector | 
Value
A vector of length length(t) with the median of the GCDG references.
Note
Internal function. Called by dscore()
Examples
dscore:::count_mu_gcdg(0:2)
Median of phase1 references
Description
Returns the age-interpolated median of the phase1 references
based on LF & SF in GSED-BGD, GSED-PAK, GSED-TZA. This function is used
to set prior mean under keys "293_0" and "gsed2212".
Usage
count_mu_phase1(t)
Arguments
| t | Decimal age, numeric vector | 
Details
The interpolation is done in two rounds. First round: Calculate D-scores using .gcdg prior-mean, calculate reference, estimate round 1 parameters used in this function. Round 2: Calculate D-score using round 1 estimates as the prior mean (most differences are within 0.1 D-score points), recalculate references, estimate round 2 parameters used in this function.
In the formulas below, t is decimal age in years and log is the natural logarithm.
Round 1:
- Count model, age <= 9 months: 21.3449 + 26.4916 t + 7.0251 log(t + 0.2) 
- Count model, age > 9 months and <= 3.5 years: 14.69947 - 12.18636 t + 69.11675 log(t + 0.92) 
- Linear model, age > 3.5 years: 61.40956 + 3.80904 t 
Round 2:
- Count model, age < 9 months: 20.5883 + 27.3376 t + 6.4254 log(t + 0.2) 
- Count model, age > 9 months and < 3.5 years: 14.63748 - 12.11774 t + 69.05463 log(t + 0.92) 
- Linear model, age > 3.5 years: 61.37967 + 3.83513 t 
The working range is 0-3.5 years. After the age of 3.5 years, the function will increase at an arbitrary rate of 3.8 D-score points per year.
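The Round 2 model can be written out as a piecewise function. The sketch below is an assumption-laden reading of the coefficients above, not the package source: it takes the third term of each Count model to be a natural logarithm, log(t + 0.2) and log(t + 0.92), and places the break points at exactly 0.75 and 3.5 years. Under that reading the three pieces join closely, which is the main reason for preferring it. The name count_mu_sketch is hypothetical.

```r
# Sketch of the Round 2 Count model for the phase1 median D-score.
# Assumes natural logarithms and break points at 0.75 and 3.5 years.
count_mu_sketch <- function(t) {
  m <- rep(NA_real_, length(t))
  young  <- !is.na(t) & t <= 0.75
  middle <- !is.na(t) & t > 0.75 & t <= 3.5
  old    <- !is.na(t) & t > 3.5
  m[young]  <- 20.5883 + 27.3376 * t[young] + 6.4254 * log(t[young] + 0.2)
  m[middle] <- 14.63748 - 12.11774 * t[middle] + 69.05463 * log(t[middle] + 0.92)
  m[old]    <- 61.37967 + 3.83513 * t[old]
  m
}

# values just below and just above each break point
joins <- count_mu_sketch(c(0.75, 0.7501, 3.5, 3.5001))
round(joins, 2)

# beyond 3.5 years the curve rises by about 3.8 D-score points per year
diff(count_mu_sketch(c(4, 5)))
```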
Value
A vector of length length(t) with the median of the phase1 references.
Note
Internal function. Called by dscore()
Author(s)
Stef van Buuren, on behalf of GSED project
Examples
dscore:::count_mu_phase1(0:5)
Median of preliminary_standards
Description
Returns the age-interpolated median of the preliminary_standards
based on LF & SF in GSED-BGD, GSED-PAK, GSED-TZA. This function is used
to set prior mean under key "gsed2406".
Usage
count_mu_preliminary_standards(t)
Arguments
| t | Decimal age, numeric vector | 
Value
A vector of length length(t) with the median of the preliminary_standards.
Note
Internal function. Called by dscore()
Author(s)
Stef van Buuren, on behalf of GSED project
Examples
dscore:::count_mu_preliminary_standards(0:5)
Calculate Development-for-Age Z-score (DAZ)
Description
The daz() function calculates the Development-for-Age Z-score (DAZ).
The DAZ represents a child's D-score after adjusting for age by an
external age-conditional reference.
Usage
daz(d, x, reference_table = NULL, dec = 3, verbose = FALSE)
zad(z, x, reference_table = NULL, dec = 2, verbose = FALSE)
Arguments
| d | Vector of D-scores | 
| x | Vector of ages (decimal age) | 
| reference_table | A  | 
| dec | The number of decimals (default  | 
| verbose | Print out the used reference table (default  | 
| z | Vector of standard deviation scores (DAZ) | 
Details
The zad() is the inverse of daz(): Given age and
the Z-score, it finds the raw D-score.
Note 1: The Box-Cox Cole and Green (BCCG) and Box-Cox t (BCT)
distributions model only positive D-score values. To increase
robustness, the daz() and zad() functions will round up any
D-scores lower than 1.0 to 1.0.
Note 2: The daz() and zad() functions call modified versions of the
pBCT() and qBCT() functions from gamlss for better handling
of NA's and rounding.
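The inverse relation between daz() and zad() can be illustrated with the simpler LMS (BCCG-type) transformation. The sketch below is not the package implementation: lms_z and lms_d are hypothetical names, the reference values are invented, and the kurtosis parameter of the BCT distribution is ignored. The pmax() call mimics the rounding up of D-scores below 1.0 described in Note 1.

```r
# LMS-based Z-score and its inverse (illustrative; BCT references would
# additionally use the kurtosis parameter tau).
lms_z <- function(d, mu, sigma, nu) {
  d <- pmax(d, 1)  # mimic the rounding up of D-scores below 1.0
  ifelse(abs(nu) < 1e-8,
         log(d / mu) / sigma,
         ((d / mu)^nu - 1) / (nu * sigma))
}
lms_d <- function(z, mu, sigma, nu) {
  ifelse(abs(nu) < 1e-8,
         mu * exp(sigma * z),
         mu * (1 + nu * sigma * z)^(1 / nu))
}

# hypothetical reference values at two ages
mu <- c(35, 50)
sigma <- c(0.10, 0.08)
nu <- c(1.5, 1.2)

z <- lms_z(d = c(33, 52), mu, sigma, nu)
round(z, 3)

# the round trip recovers the original D-scores
lms_d(z, mu, sigma, nu)
```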
Value
daz(): Unnamed numeric vector with Z-scores of length length(d).
zad(): Unnamed numeric vector with D-scores of length length(z).
Author(s)
Stef van Buuren
References
Cole TJ, Green PJ (1992). Smoothing reference centile curves: The LMS method and penalized likelihood. Statistics in Medicine, 11(10), 1305-1319.
Examples
# using default reference and key
daz(d = c(35, 50), x = c(0.5, 1.0))
# print out names of the used reference table
daz(d = c(35, 50), x = c(0.5, 1.0), verbose = TRUE)
# using the default reference in key gcdg
reftab <- get_reference(key = "gcdg")
daz(d = c(35, 50), x = c(0.5, 1.0), reference_table = reftab)
# using Dutch reference in default key
reftab <- get_reference(population = "dutch", verbose = TRUE)
daz(d = c(35, 50), x = c(0.5, 1.0), reference_table = reftab)
# population median at ages 0.5, 1 and 2 years, default reference
zad(z = rep(0, 3), x = c(0.5, 1, 2))
# population median at ages 0.5, 1 and 2 years, gcdg key
reftab <- get_reference(key = "gcdg", verbose = TRUE)
zad(z = rep(0, 3), x = c(0.5, 1, 2), reference_table = reftab)
# population median at ages 0.5, 1 and 2 years, dutch key
reftab <- get_reference(key = "dutch", verbose = TRUE)
zad(z = rep(0, 3), x = c(0.5, 1, 2), reference_table = reftab)
Decomposes item names into their four components
Description
This utility function decomposes item names into components: instrument, domain, mode and number
Usage
decompose_itemnames(x)
Arguments
| x | A character vector containing item names (gsed lexicon) | 
Details
The gsed naming convention is as follows. Positions 1-3 code the instrument, positions 4-5 code the domain, position 6 codes the mode (direct/caregiver/message), and positions 7-9 hold the item sequence number.
Value
A data.frame with length(x) rows and
four columns, named: instrument, domain, mode,
and number.
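The positional convention can be mimicked with substr(). This is a minimal sketch and not the package function: the name decompose_sketch is hypothetical, and returning empty strings for names that do not have nine characters is an assumption made here for illustration.

```r
# Split 9-position item names into their four components (illustrative).
decompose_sketch <- function(x) {
  ok <- nchar(x) == 9
  # extract positions from..to, or "" when the name has the wrong length
  part <- function(from, to) ifelse(ok, substr(x, from, to), "")
  data.frame(
    instrument = part(1, 3),
    domain     = part(4, 5),
    mode       = part(6, 6),
    number     = part(7, 9),
    stringsAsFactors = FALSE
  )
}

parts <- decompose_sketch(c("aqigmc028", "by1mdd157", ""))
parts
```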
Author(s)
Stef van Buuren
References
https://docs.google.com/spreadsheets/d/1zLsSW9CzqshL8ubb7K5R9987jF4YGDVAW_NBw1hR2aQ/edit#gid=0
Examples
itemnames <- c("aqigmc028", "grihsd219", "", "by1mdd157", "mdsgmd006")
decompose_itemnames(itemnames)
D-score estimation
Description
The dscore() function estimates the following quantities:
- D-score, a numeric score that quantifies child development by one number 
- Development-for-Age Z-score (DAZ), which corrects the D-score for age 
- Standard error of measurement (SEM) of the D-score 
Usage
dscore(
  data,
  items = names(data),
  key = NULL,
  population = NULL,
  xname = "age",
  xunit = c("decimal", "days", "months"),
  prepend = NULL,
  itembank = NULL,
  metric = c("dscore", "logit"),
  prior_mean = NULL,
  prior_mean_NA = NULL,
  prior_sd = NULL,
  prior_sd_NA = NULL,
  transform = NULL,
  qp = NULL,
  dec = c(2L, 3L),
  relevance = c(-Inf, Inf),
  algorithm = c("current", "1.8.7"),
  verbose = FALSE
)
dscore_posterior(
  data,
  items = names(data),
  key = NULL,
  population = NULL,
  xname = "age",
  xunit = c("decimal", "days", "months"),
  prepend = NULL,
  itembank = NULL,
  metric = c("dscore", "logit"),
  prior_mean = NULL,
  prior_mean_NA = NULL,
  prior_sd = NULL,
  prior_sd_NA = NULL,
  transform = NULL,
  qp = NULL,
  dec = c(2L, 3L),
  relevance = c(-Inf, Inf),
  algorithm = c("current", "1.8.7"),
  verbose = FALSE
)
Arguments
| data | A  | 
| items | A character vector containing names of items to be
included into the D-score calculation. Milestone scores are coded
numerically as  | 
| key | String. The key identifies 1) the difficulty estimates
pertaining to a particular Rasch model, and 2) the prior mean and standard
deviation of the prior distribution for calculating the D-score.
The default key  | 
| population | String. The name of the reference population to calculate
DAZ.
Use  | 
| xname | A string with the name of the age variable in
 | 
| xunit | A string specifying the unit in which age is measured
(either  | 
| prepend | Character vector with column names in  | 
| itembank | A  | 
| metric | A string, either  | 
| prior_mean | 
 | 
| prior_mean_NA | 
 | 
| prior_sd | 
 | 
| prior_sd_NA | 
 | 
| transform | Numeric vector, length 2, containing the intercept
and slope of the linear transform from the logit scale into the
D-score scale. The default ( | 
| qp | Numeric vector of equally spaced quadrature points.
This vector should span the range of all D-score or logit values.
The default ( | 
| dec | A vector of two integers specifying the number of
decimals for rounding the D-score and DAZ, respectively.
The default is  | 
| relevance | A numeric vector of length 2 with the lower and
upper bounds of the relevance interval. The procedure calculates
a dynamic EAP for each item. If the difficulty level (tau) of the
next item is outside the relevance interval around EAP, the procedure
ignores the score on the item. The default is  | 
| algorithm | Computational method, for backward compatibility.
Either  | 
| verbose | Logical. Print settings. | 
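The three age units accepted by xunit differ only by a constant factor. A minimal sketch of the conversion to decimal years follows. The helper to_decimal_age is hypothetical and not part of the package, and it assumes 365.25 days and 12 months per year.

```r
# Convert age to decimal years (illustrative helper, not a package function).
to_decimal_age <- function(age, xunit = c("decimal", "days", "months")) {
  xunit <- match.arg(xunit)
  switch(xunit,
    decimal = age,
    days    = age / 365.25,
    months  = age / 12
  )
}

# a child of 21 days and a child of 18 months
round(to_decimal_age(21, "days"), 4)
to_decimal_age(18, "months")
```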
Details
The scoring algorithm is based on the method by Bock and Mislevy (1982). The method uses Bayes rule to update a prior ability into a posterior ability.
The item names should correspond to the "gsed" lexicon.
A key is defined by the set of estimated item difficulties.
| Key | Model | Quadrature | Instruments | Direct/Caregiver | Reference | 
| "dutch" | 75_0 | -10:80 | 1 | direct | Van Buuren, 2014/2020 | 
| "gcdg" | 565_18 | -10:100 | 13 | direct | Weber, 2019 | 
| "gsed1912" | 807_17 | -10:100 | 21 | mixed | GSED Team, 2019 | 
| "293_0" | 293_0 | -10:100 | 2 | mixed | GSED Team, 2022 | 
| "gsed2212" | 818_6 | -10:100 | 27 | mixed | GSED Team, 2022 | 
| "gsed2406" | 818_6 | -10:100 | 27 | mixed | GSED Team, 2024 | 
As a general rule, one should only compare D-scores
that are calculated using the same key and the same
set of quadrature points. For calculating D-scores on new data,
the advice is to use the default, which currently is "gsed2406".
The default starting prior is a mean calculated from a so-called
"Count model" that describes mean D-score as a function of age.
The Count models are implemented in the function get_mu().
By default, the spread of the starting prior
is 5 D-score points around the mean D-score, which corresponds to
approximately 1.5 to 2 times the normal spread of children of a given age. The
starting prior is informative for very short tests (say <5 items), but has
little impact on the posterior for larger tests.
Value
The dscore() function returns a data.frame with nrow(data) rows.
Optionally, the first block of columns can be copied to the
result by using prepend. The second block consists of the
following columns:
| Name | Label | 
| a | Decimal age (years) | 
| n | Number of items with valid (0/1) data | 
| p | Percentage of passed milestones | 
| d | D-score, mean of posterior distribution | 
| sem | Standard error of measurement, standard deviation of the posterior | 
| daz | D-score corrected for age, calculated in Z-scale (for metric "dscore") | 
The D-score in column d is a linear scale, with values usually ranging
from 0 to 100. The D-score is NA if age is missing or if age is lower
than -1/12. It is possible to calculate D-scores for cases with missing ages
by setting prior_mean_NA and prior_sd_NA to some reasonable value, e.g.,
prior_mean_NA = 50 and prior_sd_NA = 20, for the sample at hand.
The SEM is a positive number that quantifies the uncertainty of the D-score.
It is NA if the D-score is NA.
The DAZ in column daz is a Z-score that corrects the D-score for age. It
is NA when there are no reference values for the given age, or when
the D-score is extremely unlikely to be valid at the given age.
Advanced applications: The dscore_posterior() function returns a
data frame with nrow(data) rows and length(qp) plus prepended columns
with the full posterior density of the D-score at each quadrature point.
If no valid responses are found, dscore_posterior() returns the
prior density. Versions prior to 1.8.5 returned a matrix (instead of
a data.frame). Code that depends on the result being a matrix may break
and may need adaptation.
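The columns d and sem are summaries of the posterior: its mean and its standard deviation. The sketch below computes both from a discrete density over quadrature points. The posterior used here is a hypothetical discretised normal, not output of dscore_posterior().

```r
# Summaries of a discrete posterior over quadrature points (illustrative).
qp <- seq(-10, 100, by = 1)

# hypothetical posterior: discretised normal, normalised to sum to 1
post <- dnorm(qp, mean = 40, sd = 3)
post <- post / sum(post)

eap <- sum(qp * post)                  # posterior mean, reported as d
sem <- sqrt(sum(post * (qp - eap)^2))  # posterior standard deviation, reported as sem
round(c(d = eap, sem = sem), 2)
```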
Author(s)
Stef van Buuren, Iris Eekhout, Arjan Huizing (2022)
References
Bock DD, Mislevy RJ (1982). Adaptive EAP Estimation of Ability in a Microcomputer Environment. Applied Psychological Measurement, 6(4), 431-444.
Van Buuren S (2014). Growth charts of human development. Stat Methods Med Res, 23(4), 346-368. https://stefvanbuuren.name/publication/van-buuren-2014-gc/
Weber AM, Rubio-Codina M, Walker SP, van Buuren S, Eekhout I, Grantham-McGregor S, Caridad Araujo M, Chang SM, Fernald LCH, Hamadani JD, Hanlon A, Karam SM, Lozoff B, Ratsifandrihamanana L, Richter L, Black MM (2019). The D-score: a metric for interpreting the early development of infants and toddlers across global settings. BMJ Global Health, 4: e001724. https://gh.bmj.com/content/bmjgh/4/6/e001724.full.pdf
See Also
builtin_keys(), builtin_itembank(), builtin_itemtable(),
builtin_references(), get_tau(), posterior(), milestones()
Examples
# using all defaults and properly formatted data
ds <- dscore(milestones, key = "gsed2406")
head(ds)
# step-by-step example
data <- data.frame(
  id = c(
    "Jane", "Martin", "ID-3", "No. 4", "Five", "6",
    NA_character_, as.character(8:10)
  ),
  age = rep(round(21 / 365.25, 4), 10),
  ddifmd001 = c(NA, NA, 0, 0, 0, 1, 0, 1, 1, 1),
  ddicmm029 = c(NA, NA, NA, 0, 1, 0, 1, 0, 1, 1),
  ddigmd053 = c(NA, 0, 0, 1, 0, 0, 1, 1, 0, 1)
)
items <- names(data)[3:5]
# third item is not part of the default key
get_tau(items, verbose = TRUE)
# calculate D-score
dscore(data, key = "gsed2406")
# prepend id variable to output
dscore(data, prepend = "id", key = "gsed2406")
# or prepend all data
# dscore(data, prepend = colnames(data), key = "gsed2406")
# calculate full posterior
p <- dscore_posterior(data, key = "gsed2406")
# check that rows sum to 1
rowSums(p)
# plot full posterior for measurement 7
barplot(as.matrix(p[7, 12:36]),
  names = 1:25,
  xlab = "D-score", ylab = "Density", col = "grey",
  main = "Full D-score posterior for measurement in row 7",
  sub = "D-score (EAP) = 11.58, SEM = 3.99")
# plot P10, P50 and P90 of D-score references
g <- expand.grid(age = seq(0.1, 4, 0.1), p = c(0.1, 0.5, 0.9))
d <- zad(z = qnorm(g$p), x = g$age, verbose = TRUE)
matplot(
  x = matrix(g$age, ncol = 3), y = matrix(d, ncol = 3), type = "l",
  lty = 1, col = "blue", xlab = "Age (years)", ylab = "D-score",
  main = "D-score preliminary standards: P10, P50 and P90")
abline(h = seq(10, 80, 10), v = seq(0, 4, 0.5), col = "gray", lty = 2)
# add measurements made on very preterms, ga < 32 weeks
ds <- dscore(milestones, key = "gsed2406")
points(x = ds$a, y = ds$d, pch = 19, col = "red")
Get age equivalents of items that have a difficulty estimate
Description
This function calculates the ages at which a certain percent in the reference population passes the items.
Usage
get_age_equivalent(
  items,
  pct = c(10, 50, 90),
  key = NULL,
  population = NULL,
  transform = NULL,
  itembank = dscore::builtin_itembank,
  xunit = c("decimal", "days", "months"),
  verbose = FALSE
)
Arguments
| items | A character vector containing names of items to be
included into the D-score calculation. Milestone scores are coded
numerically as  | 
| pct | Numeric vector with requested percentiles (0-100). The
default is  | 
| key | String. The key identifies 1) the difficulty estimates
pertaining to a particular Rasch model, and 2) the prior mean and standard
deviation of the prior distribution for calculating the D-score.
The default key  | 
| population | String. The name of the reference population to calculate
DAZ.
Use  | 
| transform | Numeric vector, length 2, containing the intercept
and slope of the linear transform from the logit scale into the
D-score scale. The default ( | 
| itembank | A  | 
| xunit | A string specifying the unit in which age is measured
(either  | 
| verbose | Logical. Print settings. | 
Value
data.frame with four columns: item, d (D-score),
pct (percentile), and a (age-equivalent, in xunit units).
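One possible reading of the age equivalent is sketched below in two steps: first find the D-score at which a child passes the item with the requested probability under a Rasch model, then find the age at which the reference median reaches that D-score. Everything in the sketch is an assumption for illustration: the median curve median_d, the scale factor of 2, the item difficulty of 40, and the helper names. The package's own computation may differ.

```r
# Hypothetical, strictly increasing median D-score as a function of age (years).
median_d <- function(age) 20 + 25 * log(age + 1)

# D-score at which pct percent of children pass an item of difficulty tau,
# assuming a Rasch model with a hypothetical scale factor.
d_at_pct <- function(tau, pct, scale = 2) tau + scale * qlogis(pct / 100)

# Age at which the median curve reaches D-score d, found by root finding.
age_at_d <- function(d, lower = 0, upper = 5) {
  uniroot(function(a) median_d(a) - d, lower = lower, upper = upper,
          tol = 1e-10)$root
}

pct <- c(10, 50, 90)
d <- d_at_pct(tau = 40, pct = pct)
out <- data.frame(pct = pct, d = d, a = sapply(d, age_at_d))
out
```

At pct = 50 the D-score equals the item difficulty, so the 50% age equivalent is simply the age at which the median child reaches that difficulty.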
Note
The function internally defines a scale factor given the key.
Examples
get_age_equivalent(c("gpagmc018", "gtogmd026", "ddicmm050"),
  key = "gsed2406", population = "dutch", verbose = TRUE)
Extract item names
Description
The get_itemnames() function matches names against the 9-code
template. This is useful for quickly selecting names of items from a larger
set of names.
Usage
get_itemnames(
  x,
  instrument = NULL,
  domain = NULL,
  mode = NULL,
  number = NULL,
  strict = FALSE,
  itemtable = NULL,
  order = "idnm"
)
Arguments
| x | A character vector,  | 
| instrument | A character vector with 3-position codes of instruments
that should match. The default  | 
| domain | A character vector with 2-position codes of domains
that should match. The default  | 
| mode | A character vector with 1-position codes of the mode
of administration. The default  | 
| number | A numeric or character vector with item numbers.
The default  | 
| strict | A logical specifying whether the resulting item
names must conform to one of the built-in names. The default is
 | 
| itemtable | A  | 
| order | A four-letter string specifying the sorting order.
The four letters are:  | 
Details
The gsed naming convention is as follows. Positions 1-3 code the instrument, positions 4-5 code the domain, position 6 codes direct/caregiver/message, and positions 7-9 code an item sequence number.
Value
A vector with names of items
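The naming convention described under Details can be illustrated with a minimal base R sketch. The helper decompose_name() below is hypothetical and is not part of the dscore package; it only splits a 9-position name into its four fields by position.

```r
# Hypothetical helper, not part of dscore: split 9-position item names
# into the four fields of the gsed naming convention.
decompose_name <- function(x) {
  data.frame(
    item       = x,
    instrument = substr(x, 1, 3),  # positions 1-3
    domain     = substr(x, 4, 5),  # positions 4-5
    mode       = substr(x, 6, 6),  # position 6
    number     = substr(x, 7, 9),  # positions 7-9
    stringsAsFactors = FALSE
  )
}

parts <- decompose_name(c("aqigmc028", "grihsd219", "mdsgmd999"))
parts
```

Names that are shorter than nine characters, such as "age", would yield truncated or empty fields in this sketch, which is one reason a template match is useful before decomposing.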
Author(s)
Stef van Buuren
See Also
Examples
itemnames <- c("aqigmc028", "grihsd219", "", "age", "mdsgmd999")
# filter out impossible names
get_itemnames(itemnames)
get_itemnames(itemnames, strict = TRUE)
# only items from specific instruments
get_itemnames(itemnames, instrument = c("aqi", "mds"))
get_itemnames(itemnames, instrument = c("aqi", "mds"), strict = TRUE)
# get all items from the se domain of iyo instrument
get_itemnames(domain = "se", instrument = "iyo")
# get all items from the se domain with direct assessment mode
get_itemnames(domain = "se", mode = "d")
# get items numbered 70 and 73 from the gm domain
get_itemnames(number = c(70, 73), domain = "gm")
# get item names from GSED SF (2023 version) in published order
items_sf <- get_itemnames(instrument = "gs1", order = "indm")
# get item names from GSED LF (2023 version) in published order
items_lf <- get_itemnames(instrument = "gl1")
items_lf <- items_lf[c(55:155, 1:54)]
Get a subset of items from the itemtable
Description
The builtin_itemtable object in the dscore package
contains basic meta-information about items: a name, the equate group,
and the item label.
The get_itemtable() function returns a subset of items
in the itemtable.
Usage
get_itemtable(items = NULL, itemtable = NULL, decompose = FALSE)
Arguments
| items | A logical or character vector of item names to return. The
default ( | 
| itemtable | A  | 
| decompose | If  | 
Value
A data.frame with seven columns.
See Also
Examples
head(get_itemtable(), 3)
get_itemtable(LETTERS[1:3], "")
Get labels for items
Description
The get_labels() function obtains the item labels for a
specified set of items.
Usage
get_labels(items = NULL, trim = NULL, itemtable = NULL)
Arguments
| items | A character vector of item names to return. The
default ( | 
| trim | The maximum number of characters in the label. The
default  | 
| itemtable | A  | 
Value
A named character vector with length(items) elements with
item labels, in the same order as in items.
See Also
builtin_itemtable(), get_itemnames()
Examples
# get labels of first two Macarthur items
get_labels(get_itemnames(instrument = "mac", number = 1:2), trim = 40)
Median D-score from the base population for a given key
Description
Returns the age-interpolated median of the D-score of the default reference for a given key.
Usage
get_mu(t, key, prior_mean_NA = NA_real_)
Arguments
| t | Decimal age, numeric vector | 
| key | Character, key of the reference population | 
| prior_mean_NA | Numeric, prior mean when age is missing | 
Details
Use get_reference() for more options.
Value
A vector of length length(t) with the median of the default reference
population for the key.
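As an illustration of what "age-interpolated" can mean, the following sketch interpolates a median linearly between tabulated ages. It assumes linear interpolation, which may differ from the package's method. The table toy_ref and the helper interpolate_mu() are hypothetical, and the numbers are invented for illustration; they are not taken from any dscore reference.

```r
# Toy reference table; the ages and medians are invented for illustration
# and are NOT taken from any dscore reference.
toy_ref <- data.frame(
  age = c(0, 0.5, 1, 2),     # decimal age in years
  mu  = c(10, 30, 45, 60)    # hypothetical median D-score
)

# Hypothetical helper: linear interpolation of the median at ages t.
# Missing ages receive prior_mean_NA, mirroring the get_mu() argument.
interpolate_mu <- function(t, ref, prior_mean_NA = NA_real_) {
  out <- rep(prior_mean_NA, length(t))
  ok <- !is.na(t)
  out[ok] <- approx(x = ref$age, y = ref$mu, xout = t[ok])$y
  out
}

mu_hat <- interpolate_mu(c(0.25, 1.5, NA), toy_ref, prior_mean_NA = 50)
mu_hat
```

In this sketch, ages outside the tabulated range return NA because approx() does not extrapolate by default.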
Get D-score reference
Description
The get_reference() function selects the D-score reference
distribution.
Usage
get_reference(
  population = NULL,
  key = NULL,
  references = dscore::builtin_references,
  verbose = FALSE,
  ...
)
Arguments
| population | String. The name of the reference population to calculate
DAZ.
Use  | 
| key | String. The key identifies 1) the difficulty estimates
pertaining to a particular Rasch model, and 2) the prior mean and standard
deviation of the prior distribution for calculating the D-score.
The default key  | 
| references | A  | 
| verbose | Logical. Print settings. | 
| ... | Used to test whether the call contained the deprecated argument
 | 
Value
A data.frame with the LMS reference values.
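For readers unfamiliar with LMS reference values, the sketch below shows the standard LMS z-score formula. The helper lms_z() and the parameter values are hypothetical. The column names of the returned data.frame, and the exact transformation the package applies when computing DAZ, are not specified here and may differ.

```r
# Hypothetical helper: z-score from LMS parameters (standard LMS method).
# y = measurement (e.g. a D-score), L = Box-Cox power (skewness),
# M = median, S = coefficient of variation.
lms_z <- function(y, L, M, S) {
  ifelse(abs(L) < 1e-8,
         log(y / M) / S,              # limiting case for L = 0
         ((y / M)^L - 1) / (L * S))
}

# Illustrative parameter values, not taken from any dscore reference
z_median <- lms_z(y = 50, L = 1, M = 50, S = 0.1)
z_above  <- lms_z(y = 55, L = 1, M = 50, S = 0.1)
c(z_median, z_above)
```

A measurement equal to the median M always maps to a z-score of 0, whatever the values of L and S.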
Note
No references for population "gsed" exist.
The function will silently rewrite population = "gsed"
into population = "gsed".
The "dutch" reference was published in Van Buuren (2014).
The "gcdg" reference was calculated from 15 cohorts with direct
observations (Weber, 2019).
The "phase1" references were calculated from the GSED Phase 1 validation
data (GSED-BGD, GSED-PAK, GSED-TZA) and cover the age range 2 weeks to 3.5 years.
Values for the age range 3.5-5 years are linearly extrapolated and are only indicative.
The "preliminary_standards" references were calculated from the GSED
Phase 1 validation using a subset of children with healthy development.
References
Van Buuren S (2014). Growth charts of human development. Stat Methods Med Res, 23(4), 346-368.
Weber AM, Rubio-Codina M, Walker SP, van Buuren S, Eekhout I, Grantham-McGregor S, Caridad Araujo M, Chang SM, Fernald LCH, Hamadani JD, Hanlon A, Karam SM, Lozoff B, Ratsifandrihamanana L, Richter L, Black MM (2019). The D-score: a metric for interpreting the early development of infants and toddlers across global settings. BMJ Global Health, 4: e001724. https://gh.bmj.com/content/bmjgh/4/6/e001724.full.pdf.
See Also
Examples
# see key-population combinations of builtin_references
table(builtin_references$key, builtin_references$population)
# get the default reference
reftab <- get_reference()
head(reftab, 2)
# get the default reference for the key "gsed2212"
reftab <- get_reference(key = "gsed2212", verbose = TRUE)
# get dutch reference for default key
reftab <- get_reference(population = "dutch", verbose = TRUE)
# loading a non-existing reference yields zero rows
reftab <- get_reference(population = "france", verbose = TRUE)
nrow(reftab)
Obtain difficulty parameters from item bank
Description
Searches the item bank for matching items, and returns the difficulty estimates. Matching is done by item name. Comparisons are done in lower case.
Usage
get_tau(
  items,
  key = NULL,
  itembank = dscore::builtin_itembank,
  verbose = FALSE
)
Arguments
| items | A character vector containing names of items to be
included into the D-score calculation. Milestone scores are coded
numerically as  | 
| key | String. The key identifies 1) the difficulty estimates
pertaining to a particular Rasch model, and 2) the prior mean and standard
deviation of the prior distribution for calculating the D-score.
The default key  | 
| itembank | A  | 
| verbose | Logical. Print settings. | 
Value
A named vector with the difficulty estimate per item with
length(items) elements.
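The case-insensitive matching described above can be sketched in base R. The data frame toy_bank and the helper lookup_tau() are hypothetical; the item names come from the examples in this manual, but the tau values are invented. How the package names the returned vector is an assumption here.

```r
# Toy item bank; item names are taken from the examples in this manual,
# but the tau values are invented for illustration only.
toy_bank <- data.frame(
  item = c("ddifmd001", "ddigmd052"),
  tau  = c(1.5, 7.2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up difficulties by lower-cased item name.
# Items that are not in the bank yield NA.
lookup_tau <- function(items, bank) {
  idx <- match(tolower(items), tolower(bank$item))
  setNames(bank$tau[idx], items)
}

res <- lookup_tau(c("ddifmd001", "DDigmd052", "xyz"), toy_bank)
res
```

The result has one element per requested item, so unmatched names remain visible as NA rather than being dropped.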
Author(s)
Stef van Buuren, 2020
See Also
Examples
# difficulty levels in the GHAP lexicon
get_tau(items = c("ddifmd001", "DDigmd052", "xyz"))
Sample of 10 children from the GSED Phase 1 study
Description
A demo dataset with developmental scores at the item level for 10 random children from the GSED Phase 1 data.
Usage
gsample
Format
A data.frame with 10 rows and 295 variables:
| Name | Label | 
| id | Integer, child ID | 
| agedays | Integer, age in days | 
| gpalac001 | Integer, Cry when hungry...: 1 = yes, 0 = no, NA = not administered | 
| gpalac002 | Integer, Look at/focus...: 1 = yes, 0 = no, NA = not administered | 
| ... | and so on.. | 
There are 138 gpa items (item gpamoc008 (clench fists) removed) from GSED SF
and 155 gto items from GSED LF.
Details
On July 15, 2025, the item gpaclc088 was renamed to gpaclc089
(Can your child say five or more separate words) and gpasec089 was renamed
to gpasec088 (Is your child able to pee and poo).
See Also
Examples
head(gsample)
Outcomes on developmental milestones for preterm-born children
Description
A demo dataset with developmental scores at the item level for a set of 27 preterm children.
Usage
milestones
Format
A data.frame with 100 rows and 62 variables:
| Name | Label | 
| id | Integer, child ID | 
| agedays | Integer, age in days | 
| age | Numeric, decimal age in years | 
| sex | Character, "male", "female" | 
| gagebrth | Integer, gestational age in days | 
| ddifmd001 | Integer, Fixates eyes: 1 = yes, 0 = no | 
| ... | and so on.. | 
See Also
Examples
head(milestones)
Normalize distribution
Description
Normalizes the distribution so that the total mass equals 1.
Usage
normalize(d, qp)
Arguments
| d | A vector with  | 
| qp | Vector of equally spaced quadrature points. | 
Value
A vector of length(d) elements with
the prior density estimate at each quadrature point.
Note
Internal function
Examples
dscore:::normalize(c(5, 10, 5), qp = c(0, 1, 2))
sum(dscore:::normalize(rnorm(5), qp = 1:5))
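One way to read "total mass equals 1" on an equally spaced grid is that the density values times the grid spacing sum to 1. The sketch below implements that reading. normalize_sketch() is a hypothetical re-implementation for illustration, and the package's internal normalize() may differ in detail.

```r
# Hypothetical re-implementation for illustration; the package's internal
# normalize() may differ in detail.
normalize_sketch <- function(d, qp) {
  stopifnot(length(d) == length(qp), length(qp) >= 2)
  step <- qp[2] - qp[1]        # spacing of the equally spaced grid
  d / (sum(d) * step)          # now sum(result * step) equals 1
}

dens <- normalize_sketch(c(5, 10, 5), qp = c(0, 0.5, 1))
sum(dens * 0.5)                # total mass on a grid with spacing 0.5
```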
Calculate posterior for one item given score, difficulty and prior
Description
Calculate posterior for one item given score, difficulty and prior
Usage
posterior(score, tau, prior, qp, scale)
Arguments
| score | Integer, either 0 (fail) or 1 (pass) | 
| tau | Numeric, difficulty parameter | 
| prior | Vector of prior values on quadrature points  | 
| qp | vector of equally spaced quadrature points | 
| scale | expansion relative to the logit scale | 
Details
This function assumes that the difficulties have been estimated by
a binary Rasch model, e.g. by rasch.pairwise.itemcluster() of
the sirt package.
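Under the binary Rasch assumption stated above, a one-item update can be sketched as prior times likelihood, followed by renormalization. posterior_sketch() is a hypothetical illustration and not the package's internal code; the prior, difficulty, and scale values are invented.

```r
# Hypothetical sketch of a one-item Bayesian update under a binary Rasch
# model; not the package's internal code.
posterior_sketch <- function(score, tau, prior, qp, scale) {
  stopifnot(score %in% c(0, 1), length(prior) == length(qp))
  p_pass <- plogis((qp - tau) / scale)           # P(pass | ability = qp)
  lik <- if (score == 1) p_pass else 1 - p_pass  # likelihood of the score
  post <- prior * lik                            # unnormalized posterior
  post / (sum(post) * (qp[2] - qp[1]))           # renormalize to mass 1
}

# Illustrative values only
qp <- seq(-10, 100, by = 1)
prior <- dnorm(qp, mean = 40, sd = 5)
post_pass <- posterior_sketch(1, tau = 45, prior = prior, qp = qp, scale = 2)
post_fail <- posterior_sketch(0, tau = 45, prior = prior, qp = qp, scale = 2)
c(pass = sum(qp * post_pass), fail = sum(qp * post_fail))  # posterior means
```

A pass shifts the posterior mean above the prior mean and a fail shifts it below, which is the behaviour one would expect from any such update.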
Value
A vector of length length(prior)
Note
Internal function
Author(s)
Stef van Buuren, Arjan Huizing, 2020
See Also
Rename items from gcdg into gsed lexicon
Description
Function rename_gcdg_gsed() translates item names in the
gcdg lexicon to item names in the gsed lexicon.
Usage
rename_gcdg_gsed(x, copy = TRUE)
Arguments
| x | A character vector containing item names in the gcdg lexicon | 
| copy | A logical indicating whether any unmatched names should
be copied ( | 
Details
The gsed naming convention is as follows. Positions 1-3 code the instrument, positions 4-5 code the domain, position 6 codes direct/caregiver/message, and positions 7-9 code an item sequence number.
The function currently supports ASQ-I (aqi), Barrera-Moncade (bar), Batelle (bat), Bayley I (by1), Bayley II (by2), Bayley III (by3), Dutch Development Instrument (ddi), Denver (den), Griffith (gri), MacArthur (mac), WHO milestones (mds), Mullen (mul), pegboard (peg), South African Griffith (sgr), Stanford Binet (sbi), Tepsi (tep), Vineland (vin).
In cases where the domain of the items isn't clear (vin, bar), the domain is coded as 'xx'.
Value
A character vector of length length(x) with gcdg
item names replaced by gsed item names.
Author(s)
Iris Eekhout, Stef van Buuren
References
https://docs.google.com/spreadsheets/d/1zLsSW9CzqshL8ubb7K5R9987jF4YGDVAW_NBw1hR2aQ/edit#gid=0
Examples
from <- c(
  "ag28", "gh2_19", "a14ps4", "b1m157", "mil6",
  "bm19", "a16fm4", "n22", "ag9", "gh6_5"
)
to <- rename_gcdg_gsed(from, copy = FALSE)
to
Rename character vector
Description
Translates names between different lexicons (naming schema).
Usage
rename_vector(
  input,
  lexin = c("phase2", "phase1", "short1", "short2", "gsed", "gsed2", "gsed3"),
  lexout = c("gsed3", "gsed2", "gsed", "short2", "short1", "phase1", "phase2"),
  notfound = "copy",
  contains = c("", "Ma_SF_", "Ma_LF_", "bsid_"),
  underscore = TRUE,
  trim = "Ma_",
  lowercase = TRUE,
  force_subjid_agedays = FALSE
)
Arguments
| input | A character vector with names to be translated | 
| lexin | A string indicating the input lexicon. One of  | 
| lexout | A string indicating the output lexicon. One of  | 
| notfound | A string indicating what to do when an input value is not found | 
| contains | A string to filter the translation table prior to matching. Needed to prevent double matches. The default ("") does not filter. | 
| underscore | Replaces space (" ") and dash ("-") by underscore ("_") | 
| trim | A substring to be removed from  | 
| lowercase | Sets all variables in lower case.
in  | 
| force_subjid_agedays | If  | 
Details
The recommended approach for reading new data is to name the columns
according to the names defined by "short2" and then apply rename_vector()
to translate the names to the "gsed3" lexicon.
The lexicons "phase1", "short1", "gsed" and "gsed2" are included
for backward compatibility, and are not recommended for use with new
data.
Value
A character vector of the same length as input with processed
names.
Examples
# Using Ma_SF_Cxx as input names, 2023 SF/LF version
input <- c("file", "GSED_ID", "Ma_SF_Parent ID", "Ma_SF_C01", "Ma_SF_C02")
rename_vector(input)
rename_vector(input, lexout = "short2", lowercase = FALSE)
rename_vector(input, lexout = "gsed3", trim = "Ma_SF_")
# Convert short names to gsed names
input <- c("file", "GSED_ID", "Ma_SF_Parent ID", paste0("SF00", 1:3))
rename_vector(input, lexin = "short2", lowercase = TRUE)
Sample of 10 children from GSED HF
Description
A demo dataset with developmental scores at the item level for 10 random children from the GSED Phase 1 data.
Usage
sample_hf
Format
A data.frame with 10 rows and 57 variables:
| Name | Label | 
| subjid | Integer, child ID | 
| agedays | Integer, age in days | 
| hf001 | Integer, ...: 1 = yes, 0 = no, NA = not administered | 
| hf002 | Integer, ...: 1 = yes, 0 = no, NA = not administered | 
| ... | and so on.. | 
Sample data for 55 gpa items forming GSED HF V1
See Also
Examples
head(sample_hf)
Sample of 10 children from gto (LF)
Description
A demo dataset with developmental scores at the item level for 10 random children from the GSED Phase 1 data.
Usage
sample_lf
Format
A data.frame with 10 rows and 157 variables:
| Name | Label | 
| subjid | Integer, child ID | 
| agedays | Integer, age in days | 
| lf001 | Integer, ...: 1 = yes, 0 = no, NA = not administered | 
| lf002 | Integer, ...: 1 = yes, 0 = no, NA = not administered | 
| ... | and so on.. | 
Sample data for 155 gto items from GSED LF
See Also
Examples
head(sample_lf)
Sample of 10 children from gpa (SF)
Description
A demo dataset with developmental scores at the item level for 10 random children from the GSED Phase 1 data.
Usage
sample_sf
Format
A data.frame with 10 rows and 141 variables:
| Name | Label | 
| subjid | Integer, child ID | 
| agedays | Integer, age in days | 
| sf001 | Integer, Cry when hungry...: 1 = yes, 0 = no, NA = not administered | 
| sf002 | Integer, Look at/focus...: 1 = yes, 0 = no, NA = not administered | 
| ... | and so on.. | 
Sample data for 139 gpa items from GSED SF
Details
On July 15, 2025, the item gpaclc088 was renamed to gpaclc089
(Can your child say five or more separate words) and gpasec089 was renamed
to gpasec088 (Is your child able to pee and poo).
See Also
Examples
head(sample_sf)
Sorts item names according to user-specified priority
Description
This function sorts the item names according to instrument, domain, mode and number. The user can specify the sorting order.
Usage
sort_itemnames(x, order = "idnm")
order_itemnames(x, order = "idnm")
Arguments
| x | A character vector containing item names (gsed lexicon) | 
| order | A four-letter string specifying the sorting order.
The four letters are:  | 
Value
sort_itemnames() returns a character vector with
length(x) sorted elements. order_itemnames() returns
an integer vector of length length(x) with positions of
the sorted elements.
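The relation between the two return values can be sketched in base R: the sorted vector is the input indexed by the ordering positions. The helpers order_sketch() and sort_sketch() are hypothetical. The sketch assumes that the letters stand for i = instrument, d = domain, n = number and m = mode, and that every input is a valid 9-position name; the package functions may treat invalid or empty names differently.

```r
# Hypothetical sketch; assumes i = instrument, d = domain, n = number,
# m = mode, and that all names are valid 9-position names.
order_sketch <- function(x, order = "idnm") {
  fields <- list(
    i = substr(x, 1, 3),
    d = substr(x, 4, 5),
    m = substr(x, 6, 6),
    n = as.integer(substr(x, 7, 9))
  )
  keys <- fields[strsplit(order, "")[[1]]]  # keys in requested priority
  do.call(base::order, unname(keys))        # positions of sorted elements
}
sort_sketch <- function(x, order = "idnm") x[order_sketch(x, order)]

x <- c("mdsgmd006", "aqigmc028", "by1mdd157", "aqigmc003")
by_instrument <- sort_sketch(x)
by_number     <- sort_sketch(x, order = "nidm")
by_instrument
by_number
```

Putting "n" first in the order string makes the item number the primary sort key, so items from different instruments interleave.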
Author(s)
Stef van Buuren
See Also
Examples
itemnames <- c("aqigmc028", "grihsd219", "", "by1mdd157", "mdsgmd006")
sort_itemnames(itemnames)