Help for package rioplot

Type:

Package

Title:

Turn a Regression Model Inside Out

Version:

1.1.2

Maintainer:

David Melamed <dmmelamed@gmail.com>

Description:

Turns regression models inside out. Functions decompose variances and coefficients for various regression model types. Functions also visualize regression model objects using techniques developed in Schoon, Melamed, and Breiger (2024) <doi:10.1017/9781108887205>.

VignetteBuilder:

knitr

Depends:

R (≥ 3.5.0), ggplot2, methods

Suggests:

dplyr, knitr, rmarkdown, ggrepel, MASS

License:

GPL-2 | GPL-3

Encoding:

UTF-8

LazyData:

true

NeedsCompilation:

Packaged:

2026-01-16 16:24:30 UTC; melamed.9

Author:

David Melamed [aut, cre], Ronald L. Breiger [aut], Eric W. Schoon [aut]

Repository:

CRAN

Date/Publication:

2026-01-16 16:50:02 UTC

Replication data for Beckfield (2006) as re-analyzed by Schoon, Melamed, and Breiger (2024)

Description

Beckfield (2006) analyzed these data using fixed and random effects regression models. He showed that regional economic and political integration is associated with increased economic inequality. Schoon, Melamed, and Breiger (2024) turned these models inside out and decomposed the model coefficients.

Usage

data("Beckfield06")

Format

A data frame with 48 observations on the following 9 variables.

year: a numeric vector
polint: a numeric vector
ecoint: a numeric vector
ecoints: a numeric vector
gdp: a numeric vector
trans: a numeric vector
outflo: a numeric vector
gini: a numeric vector
countryid: a character vector

References

Beckfield, Jason. 2006. "European integration and income inequality." American Sociological Review 71(6): 964-985.

Schoon, Eric W., David Melamed, and Ronald L. Breiger. 2024. Regression Inside Out. NY: Cambridge University Press.

Examples

data(Beckfield06)
head(Beckfield06)

Subset of data from the General Social Survey from 2016. Data were analyzed in Schoon, Melamed, and Breiger (2024).

Description

Subset of data from the General Social Survey from 2016. Data were analyzed in Schoon, Melamed, and Breiger (2024). Full details on the variable selection and source information are available therein.

Usage

data("GSS.2016")

Format

A data frame with 2867 observations on the following 27 variables.

sclass: a numeric vector
fulltime: a numeric vector
retired: a numeric vector
hrsworked: a numeric vector
occprestige: a numeric vector
occprestige_partner: a numeric vector
occprestige_mother: a numeric vector
occprestige_father: a numeric vector
children: a numeric vector
age: a numeric vector
educ: a numeric vector
paeduc: a numeric vector
maeduc: a numeric vector
speduc: a numeric vector
babs: a numeric vector
female: a numeric vector
white: a numeric vector
black: a numeric vector
other: a numeric vector
income: a numeric vector
republican: a numeric vector
conservative: a numeric vector
environment: a numeric vector
helpblackpeople: a numeric vector
science: a numeric vector
govequalwealth: a numeric vector
pclass: a numeric vector

References

Schoon, Eric W., David Melamed, and Ronald L. Breiger. 2024. Regression Inside Out. NY: Cambridge University Press.

Examples

data(GSS.2016)
head(GSS.2016)

Subset of the General Social Survey analyzed by Schoon, Melamed, and Breiger (2024)

Description

Subset of the General Social Survey analyzed by Schoon, Melamed, and Breiger (2024). Full details on the variable selection and source information are available therein.

Usage

data("GSS2018")

Format

A data frame with 558 observations on the following 7 variables.

dog: a numeric vector
race: a numeric vector
sex: a numeric vector
children: a numeric vector
married: a numeric vector
age: a numeric vector
income: a numeric vector

References

Schoon, Eric W., David Melamed, and Ronald L. Breiger. 2024. Regression Inside Out. NY: Cambridge University Press.

Examples

data(GSS2018)
head(GSS2018)

Replication data for regression models with a count dependent variable.

Description

Data analyzed by Hilbe (2011), and used here to illustrate model visualization and coefficient decomposition for count models.

Usage

data("Hilbe")

Format

A data frame with 601 observations on the following 9 variables.

naffairs: a numeric vector
avgmarr: a numeric vector
hapavg: a numeric vector
vryhap: a numeric vector
smerel: a numeric vector
vryrel: a numeric vector
yrsmarr4: a numeric vector
yrsmarr5: a numeric vector
yrsmarr6: a numeric vector

Source

Hilbe, Joseph M., 2011. Negative binomial regression. NY: Cambridge University Press.

Examples

data(Hilbe)
head(Hilbe)

Data to replicate OLS regression models reported in Kenworthy (1999).

Description

Data to replicate OLS regression models reported in Kenworthy (1999). Data were analyzed in Schoon, Melamed, and Breiger (2024). Full details on the variable selection and source information are available therein.

Usage

data("Kenworthy99")

Format

A data frame with 15 observations on the following 6 variables.

dv: a numeric vector
gdp: a numeric vector
pov: a numeric vector
tran: a numeric vector
ISO3: a character vector
nation.long: a character vector

References

Kenworthy, Lane. 1999. "Do social-welfare policies reduce poverty? A cross-national assessment." Social Forces 77(3): 1119-1139.

Schoon, Eric W., David Melamed, and Ronald L. Breiger. 2024. Regression Inside Out. NY: Cambridge University Press.

Examples

data(Kenworthy99)
head(Kenworthy99)

Subset of replication data from Ragin and Fiss (2017).

Description

Subset of replication data from Ragin and Fiss (2017). Data were analyzed in Schoon, Melamed, and Breiger (2024). Full details on the variable selection and source information are available therein.

Usage

data("RaginData")

Format

A data frame with 4185 observations on the following 10 variables.

incrat: a numeric
pinc: a numeric
ped: a numeric
resp_ed: a numeric
afqt: a numeric
kids: a numeric
married: a numeric
black: a numeric
male: a numeric
povd: a numeric

References

Ragin, Charles C. and Peer C. Fiss. 2017. Intersectional inequality: Race, class, test scores, and poverty. Chicago, IL: University of Chicago Press.

Schoon, Eric W., David Melamed, and Ronald L. Breiger. 2024. Regression Inside Out. NY: Cambridge University Press.

Examples

data(RaginData)
head(RaginData)

Subset of replication data from Schneider and Makszin (2014).

Description

Subset of replication data from Schneider and Makszin (2014). Data were analyzed in Schoon, Melamed, and Breiger (2024). Full details on the variable selection and source information are available therein.

Usage

data("SchneiderAndMakszin06")

Format

A data frame with 30 observations on the following 36 variables.

id: a character vector
country: a character vector
year: a numeric vector
fde: a numeric vector
fde_cilb: a numeric vector
fde_ciub: a numeric vector
wcoord: a numeric vector
govint: a numeric vector
ud: a numeric vector
epl: a numeric vector
socexp: a numeric vector
eduexp: a numeric vector
vet_un: a numeric vector
lmexp: a numeric vector
wagecov: a numeric vector
vet_isced3: a numeric vector
eduexp_pri: a numeric vector
edu_terenr: a numeric vector
vt_reg: a numeric vector
vt_vap: a numeric vector
compvote: a numeric vector
fde2: a numeric vector
low_fde_l: a numeric vector
high_fde_l: a numeric vector
high_wc_l: a numeric vector
high_int_l: a numeric vector
high_ud_l: a numeric vector
high_epl_l: a numeric vector
high_socx_l: a numeric vector
high_edux_l: a numeric vector
high_lmx_l: a numeric vector
high_vet_l: a numeric vector
p1_y: a numeric vector
p2_y: a numeric vector
p3_y: a numeric vector
sol_y: a numeric vector

References

Schneider, Carsten Q., and Kristin Makszin. 2014. "Forms of welfare capitalism and education-based participatory inequality." Socio-Economic Review 12(2): 437-462.

Schoon, Eric W., David Melamed, and Ronald L. Breiger. 2024. Regression Inside Out. NY: Cambridge University Press.

Examples

data(SchneiderAndMakszin06)
head(SchneiderAndMakszin06)

Subset of replication data from Wimmer, Cederman, and Min (2009).

Description

Subset of replication data from Wimmer, Cederman, and Min (2009). Data were analyzed in Schoon, Melamed, and Breiger (2024). Full details on the variable selection and source information are available therein.

Usage

data("Wimmer_et_al_EPR")

Format

A data frame with 7908 observations on the following 80 variables.

yearc: a numeric
year: a numeric
cowcode: a numeric
country: a character
gdpcap: a numeric
gdpcapl: a numeric
oilpc: a numeric
oilpcl: a numeric
popavg: a numeric
lpopl: a numeric
ethfrac: a numeric
western: a numeric
eeurop: a numeric
lamerica: a numeric
ssafrica: a numeric
asia: a numeric
nafrme: a numeric
lmtnest: a numeric
polity2: a numeric
polity: a numeric
anoc: a numeric
anocl: a numeric
democ: a numeric
democl: a numeric
regchg3: a numeric
pimppast: a numeric
groups: a numeric
egipgrps: a numeric
exclgrps: a numeric
exclpop: a numeric
lrexclpop: a numeric
ttlpop: a numeric
discpop: a numeric
pwrlpop: a numeric
olppop: a numeric
olpspop: a numeric
jppop: a numeric
sppop: a numeric
dompop: a numeric
monpop: a numeric
maxexclpop: a numeric
maxegippop: a numeric
maxpop: a numeric
newonset: a numeric
newethonset: a numeric
newhionset: a numeric
newethhionset: a numeric
onsetstatus: a numeric
onsetstatus2: a numeric
actoraim: a numeric
actoraim2: a numeric
ongoingwarl: a numeric
ongoinghiwarl: a numeric
newonset2: a numeric
newhionset2: a numeric
newethonset2: a numeric
warlfl: a numeric
onsetfl: a numeric
ethonsetfl: a numeric
onsetfl2: a numeric
ethonsetfl2: a numeric
warstns2: a numeric
warstns1: a numeric
atwarnsl: a numeric
npeaceyears: a numeric
nspline1: a numeric
nspline2: a numeric
nspline3: a numeric
hpeaceyears: a numeric
hspline1: a numeric
hspline2: a numeric
hspline3: a numeric
fpeaceyears: a numeric
fspline1: a numeric
fspline2: a numeric
fspline3: a numeric
speaceyears: a numeric
sspline1: a numeric
sspline2: a numeric
sspline3: a numeric

References

Wimmer, Andreas, Lars-Erik Cederman, and Brian Min. 2009. "Ethnic politics and armed conflict: A configurational analysis of a new global data set." American Sociological Review 74(2): 316-337.

Examples

data(Wimmer_et_al_EPR)
head(Wimmer_et_al_EPR)

Compute the Cosine similarity between two points.

Description

Given two points, the function computes the cosine similarity between them.

Usage

cosine(x,y)

Arguments

x

Point 1

y

Point 2

Value

The cosine similarity, ranging between -1 and +1.

Author(s)

Ronald L. Breiger, David Melamed and Eric Schoon

References

Schoon, Eric, David Melamed, and Ronald L. Breiger. 2024. Regression Inside Out. NY: Cambridge University Press.

Examples

data(Kenworthy99)
m1 <- lm(scale(dv) ~ scale(gdp) + scale(pov) + scale(tran) -1,data=Kenworthy99)
rp1 <- rio.plot(m1,include.int="no",r1=1:15)
cosine(rp1$row.dimensions[15,],rp1$row.dimensions[8,]) 
# cosine similarity between USA and Ireland

cosine(rp1$row.dimensions[15,],rp1$row.dimensions[14,]) 
# cosine similarity between USA and United Kingdom
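
For readers who want to check a result by hand, the standard cosine similarity formula (the dot product divided by the product of the two vector lengths) can be sketched in base R. The helper name cosine_sketch is hypothetical and is not part of rioplot; whether cosine() is implemented exactly this way is an assumption.

```r
# Minimal base R sketch of cosine similarity between two points.
# cosine_sketch is a hypothetical helper, not a rioplot function.
cosine_sketch <- function(x, y) {
  x <- as.numeric(x)
  y <- as.numeric(y)
  # dot product divided by the product of the Euclidean norms
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

a <- c(1, 0)
b <- c(0, 2)
d <- c(-3, 0)

cosine_sketch(a, b)      # points at a right angle
cosine_sketch(a, d)      # points in opposite directions
cosine_sketch(a, 2 * a)  # points in the same direction
```

The three calls cover the middle and both ends of the range given under Value: 0 for orthogonal points, -1 for opposite directions, and +1 for the same direction.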

Decompose the Results of a Regression Model by Cases

Description

This function takes a regression model object and a vector assigning cases to groups (note that each case can be in its own group) and computes each case's or group's contribution to the overall regression coefficients.

Usage

decompose.model(m1,group.by=group.by,include.int="yes",model.type="OLS")

Arguments

m1

A regression model object. OLS, logistic, Poisson and negative binomial regression are supported.

group.by

A numeric vector denoting group membership. Should be the same length as the number of cases.

include.int

Whether the regression model included an intercept. Default is "yes."

model.type

Type of model to be decomposed. OLS via lm, logistic via glm ("logit"), Poisson via glm ("poisson"), and negative binomial via MASS ("nb") are supported.

Value

decomp.coef

Each case's or subset of cases' contribution to the estimated slope or regression coefficient.

decomp.var

Each case's or subset of cases' contribution to the variance of the estimated slope or regression coefficient.
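
The idea behind a case-wise coefficient decomposition can be sketched in base R for the OLS case. Because the OLS estimate is b = (X'X)^{-1} X'y, it can be written as a sum over cases of (X'X)^{-1} x_i y_i, and those terms can be added up within groups. The sketch below uses the built-in mtcars data and hypothetical object names; it illustrates the algebra only, and whether decompose.model computes decomp.coef in exactly this way is an assumption.

```r
# Illustrative sketch: case-wise decomposition of OLS coefficients.
# Uses built-in data; none of these object names come from rioplot.
X <- scale(as.matrix(mtcars[, c("wt", "hp")]))
y <- as.numeric(scale(mtcars$mpg))

fit <- lm(y ~ X - 1)          # standardized variables, no intercept
XtX_inv <- solve(crossprod(X))

# Row i of (X * y) is x_i * y_i; post-multiplying by the symmetric
# (X'X)^{-1} gives case i's contribution to each coefficient.
case_contrib <- (X * y) %*% XtX_inv

# Aggregate contributions by group (here: transmission type).
group <- ifelse(mtcars$am == 1, "manual", "automatic")
by_group <- rowsum(case_contrib, group)

by_group
colSums(by_group)  # sums over groups
coef(fit)          # the fitted coefficients, for comparison
```

Summing the contributions over all cases, or over all groups, recovers the fitted coefficients, which is what makes the decomposition additive.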

Author(s)

David Melamed, Ronald L. Breiger, and Eric Schoon

References

Schoon, Eric, David Melamed, and Ronald L. Breiger. 2024. Regression Inside Out. NY: Cambridge University Press.

Examples

data(Kenworthy99)
m1 <- lm(scale(dv) ~ scale(gdp) + scale(pov) + scale(tran) -1,data=Kenworthy99)
decompose.model(m1,group.by=c("Liberal","Corp","Liberal",
"SocDem","SocDem","Corp","Corp","Corp","Corp","Corp","SocDem",
"SocDem","Liberal","Liberal","Liberal"),include.int="no")

Project point 1 onto the line (at 90 degrees) running through point 2 and the origin (0,0).

Description

Given two points, p1 and p2, this function identifies the point at which p1 is projected onto the line connecting p2 and the origin (0,0). The projection occurs at a right angle.

Usage

project.point(p1,p2)

Arguments

p1

First point, the one that is to be projected onto the line through point 2 and the origin.

p2

Second point; together with the origin it defines the line onto which p1 is projected. This is the outcome or dependent variable in our book. See reference below.

Details

The output is just a single point. This is implemented as the point to which lines are drawn in many graphs.

Value

Two values which correspond to the x and y co-ordinates in the graph.

Author(s)

David Melamed, Ronald L. Breiger, and Eric Schoon

References

Schoon, Eric, David Melamed, and Ronald L. Breiger. 2024. Regression Inside Out. NY: Cambridge University Press.

Examples

data(Kenworthy99)
m1 <- lm(scale(dv) ~ scale(gdp) + scale(pov) + scale(tran) -1,data=Kenworthy99)
rp1 <- rio.plot(m1,include.int="no",r1=1:15)
project.point(as.numeric(rp1$col.dimensions[1,]),as.numeric(rp1$row.dimensions[1,]))
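
The geometry described above is ordinary orthogonal projection onto a line through the origin, which can be sketched in base R as follows. The helper name project_sketch is hypothetical and is not part of rioplot; that project.point uses this exact formula is an assumption.

```r
# Minimal base R sketch: project p1 onto the line through p2 and (0,0).
# project_sketch is a hypothetical helper, not a rioplot function.
project_sketch <- function(p1, p2) {
  p1 <- as.numeric(p1)
  p2 <- as.numeric(p2)
  # scalar projection coefficient times the direction vector p2
  (sum(p1 * p2) / sum(p2 * p2)) * p2
}

p1 <- c(2, 3)
p2 <- c(4, 0)

proj <- project_sketch(p1, p2)
proj                   # the projected point, here (2, 0)
sum((p1 - proj) * p2)  # 0: the connecting segment meets the line at a right angle
```

The second check confirms the right angle: the vector from the projected point to p1 has a zero dot product with p2.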

Regression Inside Out: Plotting Regression Models

Description

rio.plot is used to generate a reduced rank image of a regression model. The function computes row and column dimensions for both cases and variables, and generates an image of the model based on those scores.

Usage

rio.plot(m1,exclude.vars="no",r1="none",case.names="",col.names="no",
h.just=-.2,v.just=0,case.col="blue",var.name.col="black",
include.int="yes",group.cases=1,model.type="OLS")

Arguments

m1

a regression model object. Supported models include OLS, Logistic, Poisson, and Negative Binomial Regression.

exclude.vars

an optional numerical vector indicating variables from the model to exclude from the plot of the model.

r1

an optional numerical vector indicating cases to include in the plot. By default, all cases are excluded from the plot.

case.names

a character string of names to label the cases. Should be the same length as 'r1.'

col.names

whether to include the variable names in the plot. Default is "no"

h.just

horizontal justification in the plot. Default is -.2

v.just

vertical justification in the plot. Default is 0

case.col

if cases are added to the plot, this is their color. Default is "blue"

var.name.col

Color of the names of variables in the plot. Default is "black"

include.int

Whether the underlying model included a model intercept. Default is "yes"

group.cases

Whether to aggregate cases into clusters or subsets. If yes, provide a numeric vector of memberships. It will aggregate over them by summing.

model.type

The type of regression model. OLS is supported via the lm function. Logistic and Poisson regression are supported via the glm function. Negative Binomial regression is supported via the MASS package. Default is "OLS." For logistic regression, use "logit." For Poisson regression, use "poisson." For negative binomial regression, use "nb."

Details

The function takes a regression model object (OLS, logistic, Poisson, or negative binomial) and computes the corresponding row (case) and column (variable) scores. The scores are part of the output, as is a ggplot object of the model.

Value

rio.plot returns several objects.

gg.obj

a ggplot object of the model space, given the terms in the function

row.dimensions

the scores assigned to each case, or each subset of cases if they were aggregated using the 'group.cases' option. These are the co-ordinates in the plot.

col.dimensions

the scores assigned to each variable. These are the co-ordinates in the plot.

case.variances

each case's contribution (or each subset's contribution) to the variance of the estimated regression coefficient

U

The orthogonalized column space matrix from the Singular Value Decomposition of the predictor matrix and fitted values.

UUt

The orthogonalized column space matrix from the Singular Value Decomposition of the predictor matrix and fitted values, post-multiplied by its transpose.

Author(s)

David Melamed, Ronald L. Breiger, and Eric Schoon

References

Schoon, Eric, David Melamed, and Ronald L. Breiger. 2024. Regression Inside Out. NY: Cambridge University Press.

Examples

data(Kenworthy99)
m1 <- lm(scale(dv) ~ scale(gdp) + scale(pov) + scale(tran) -1,data=Kenworthy99)
rp1 <- rio.plot(m1,include.int="no")
names(rp1)
rp1$gg.obj 
# rp1$gg.obj + ggplot2::scale_x_continuous(limits=c(-.55,1)) # useful option

rp2 <- rio.plot(m1,r1=1:15,case.names=paste(1:15),include.int="no")
rp2$gg.obj

Kenworthy99 <- data.frame(Kenworthy99,type=c("Liberal","Corp","Liberal",
"SocDem","SocDem","Corp","Corp","Corp","Corp","Corp","SocDem","SocDem",
"Liberal","Liberal","Liberal"))

rp3 <- rio.plot(m1,r1=1:15,group.cases=Kenworthy99$type,include.int="no")
rp3$gg.obj 
# rp3$gg.obj + ggplot2::scale_x_continuous(limits=c(-1,20))
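
To make the U and UUt values more concrete, the following base R sketch takes the singular value decomposition of a predictor matrix with the fitted values appended, for an OLS model on built-in data. All object names are hypothetical, and the way the singular values are split between row and column scores is an assumption made for illustration; rio.plot may scale its scores differently.

```r
# Illustrative sketch: SVD of the predictors plus fitted values (OLS case).
# Uses built-in data; none of these object names come from rioplot.
X <- scale(as.matrix(mtcars[, c("wt", "hp")]))
y <- as.numeric(scale(mtcars$mpg))

fit <- lm(y ~ X - 1)
yhat <- as.numeric(fitted(fit))

M <- cbind(X, yhat = yhat)  # predictors and fitted values side by side
s <- svd(M)

# yhat lies in the column space of X, so M has rank ncol(X).
k <- ncol(X)
U <- s$u[, seq_len(k)]      # orthonormal basis of the column space
UUt <- U %*% t(U)           # projection onto that space

# Assumed scaling: rows get U, columns get V times the singular values.
row_scores <- U
col_scores <- s$v[, seq_len(k)] %*% diag(s$d[seq_len(k)])

max(abs(UUt %*% y - yhat))                   # near zero
max(abs(row_scores %*% t(col_scores) - M))   # near zero
```

In this OLS setting UUt acts as the projection that maps the outcome to the fitted values, and the two-dimensional row and column scores reproduce the predictor and fitted-value matrix up to rounding error, which is why a two-dimensional picture can represent this three-variable model.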