Generalized linear model (R code)

2 minute read

R packages

car
Mass
lme4 (liner mixed model)

1.Linear regression model

Linear relationship of coefficients
Normality of residuals (Shapilo wilk test, Q-Q plot, histogram)
Homoscedasticity (statistical test, variance does not depend on level of X)
Independent observation (i.e., no correlation between samples)

#plot
RR_plot <- ggplot(data, aes(x = X, y = Y, colour = males))
RR_plot <- RR_plot + geom_point()
RR_plot <- RR_plot + scale_colour_brewer(palette = "Set1")
RR_plot <- RR_plot + geom_smooth(method = "lm", formula = y ~ x, aes(col = “lift”))
RR_plot <- RR_plot + labs(title = "Linear regression", x = "Number of drivers (log10)", y = "deaths (log10)")
RR_plot <- RR_plot + theme_classic() + theme(
  axis.line = element_line(linewidth = 1.0, lineend = "square"),
  text = element_text(colour ="black", size = 14),
  legend.position = "none",)

#linear model
res_lm <-  lm(Y ~ X + males + X*males, data = data) 
summary(res_lm)

#Regression and confidence interval
X_new <- data.frame(X = 23:60)
Conf <- predict(res_lm, X_new, interval = ‘confidence’, level = 0.95)
Conf <- predict(res_lm, X_new, interval = ‘prediction’, level = 0.95)

Normality & Homoscedasticity
Outliners
Collinearity
Coversion of variables
Interaction
Confounding
Fitness or predictive performance

Conversion of varaibles, interaction, confunding could be identified with likelihood test.(But Baysian modeling is better)

Fit the model containing the variables we want to test and save the log-likelihood.
Fit the model removing the variables we want to test and save the log-likelihood.
Compare the two log-likelihoods to perform the likelihood ratio test (e.g., testing whether interaction is present)
Compute the odds ratio

Step-wise method (should not be used)

lm_model <- lm(data = fev, logfev ~ Age + Hgt + sex + smoke)
 
ols_step_both_p(lm_model, pent = 0.05, prem = 0.051) #The P-value of 0.05 is the threshold.
ols_step_best_subset(lm_model)

2.Logistic regression analysis

Output is binary variable
Linear relationship of coefficient
Independent observations

#pin : プロット領域, par : 余白の指定
par(pin = c(3,3))
par(mar = c(3, 3, 3, 3))
p1 <- ggplot(data, aes(x = Clinical, y = WBE)) #WBE: 0 or 1
p1 <- p1+ geom_point(alpha = 0.4, shape = 21, size = 2, colour = "black", fill = "grey")  + scale_x_log10()
p1 <- p1 + stat_smooth(method = glm, method.args = list(family = binomial), fullrange = TRUE)


lo_1 <- glm(disease ~ passive, data = ex_1[ex_1$personal ==0, ], family = binomial(link = “logit”))
summary(lo_1)

#compute the odds ratio
library(oddsratio)
or_glm(ex_1, lo_1, incr = list(passive = 1))

3. Multiple impuation

library(mice)
library(Amelia)
library (miceadds)
R code
詳しくは、”欠測データ処理 Rによる単一代入法と多重代入法, P97

library(mice)
library(miceadds)　#library(Amelia)を使用するのもよい

# 例となるデータの作成 (# 一部の値を欠損させる)
set.seed(123)
mydata <- data.frame(
  y = rbinom(100, 1, 0.5),            # 応答変数（0か1の二項変数）
  x1 = rnorm(100),                    # 説明変数1
  x2 = rnorm(100),                    # 説明変数2
  x3 = rnorm(100)                     # 説明変数3
)
mydata[sample(1:100, 20), "x1"] <- NA
mydata[sample(1:100, 20), "x2"] <- NA


# 欠損値を多重代入で補完（mice）
imp <- mice(mydata, m = 5, method = "pmm", seed = 500)

# 補完されたデータセットでロジスティック回帰モデルをフィッティング
logit_model <- with.mids(imp, glm(y ~ x1 + x2 + x3, family = binomial()))
pooled_results <- pool(logit_model)
summary(pooled_results)

Share on

X Facebook LinkedIn Bluesky

Hiroki Ando (安藤宏紀)

Generalized linear model (R code)

R packages

1.Linear regression model

Step-wise method (should not be used)

2.Logistic regression analysis

3. Multiple impuation

Share on

You May Also Enjoy

Bayesian modeling (R code)

Statistical test & epidemiological study designs (R code)

Data visualization

Data organization & Basic analysis

Hiroki Ando (安藤 宏紀)

R packages

1.Linear regression model

Step-wise method (should not be used)

2.Logistic regression analysis

3. Multiple impuation

Share on

You May Also Enjoy

Bayesian modeling (R code)

Statistical test & epidemiological study designs (R code)

Data visualization

Data organization & Basic analysis

Hiroki Ando (安藤宏紀)