[Data] Alligator Data: In a study by the Florida Game and Fresh Water Fish Commission on the foods that alligators in the wild choose to eat, 59 alligators in Lake George, Florida, were sampled and the primary food type found in the alligators stomach was recorded along with the alligator length.
Response outcome: primary food choice (three levels: Fish, Inverterbrates, and Others)
One explanatory variable: alligator length (continuous)
[Goal] Study the relationship between the alligator length and primary food choice.
[Code]
alligator = read.csv("../data/alligator.csv",header=TRUE)
plot(food ~ length, data=alligator)
library(nnet) # run install.packages("nnet") first
## Warning: package 'nnet' was built under R version 3.4.2
# set "Others" as the baseline; otherwise, the defualt is the first one "Fish".
alligator$food <- relevel(alligator$food, ref = "O")
fit = multinom(food ~ length, data=alligator)
## # weights: 9 (4 variable)
## initial value 64.818125
## iter 10 value 49.170785
## final value 49.170622
## converged
summary(fit)
## Call:
## multinom(formula = food ~ length, data = alligator)
##
## Coefficients:
## (Intercept) length
## F 1.617952 -0.1101836
## I 5.697543 -2.4654695
##
## Std. Errors:
## (Intercept) length
## F 1.307291 0.5170838
## I 1.793820 0.8996485
##
## Residual Deviance: 98.34124
## AIC: 106.3412
observed = matrix(c(371,250,64,25, 49,45,9,5, 74,71,15,13), ncol=3)
observed
## [,1] [,2] [,3]
## [1,] 371 49 74
## [2,] 250 45 71
## [3,] 64 9 15
## [4,] 25 5 13
hat_pi = matrix(c(0.76,0.68,0.71,0.62, 0.10,0.12,0.10,0.12, 0.14,0.20,0.19,0.26),ncol=3)
hat_pi
## [,1] [,2] [,3]
## [1,] 0.76 0.10 0.14
## [2,] 0.68 0.12 0.20
## [3,] 0.71 0.10 0.19
## [4,] 0.62 0.12 0.26
num.obs = rowSums(observed)
num.obs
## [1] 494 366 88 43
expected = diag(num.obs) %*% hat_pi
expected
## [,1] [,2] [,3]
## [1,] 375.44 49.40 69.16
## [2,] 248.88 43.92 73.20
## [3,] 62.48 8.80 16.72
## [4,] 26.66 5.16 11.18
G2 = 2*sum(observed * log(observed/expected))
G2
## [1] 1.101727
X2 = sum((observed - expected)^2/expected)
X2
## [1] 1.115243
[Data] Party Data: Contingency table on Political Ideology and Party Affiliation
Response variable: Political ideology has a five-point ordinal scale, ranging from very liberal to very conservative.
One explanatory variable: Party Affiliation (Indicator)
[Goal] Relates political ideology to political party affiliation.
ideology = rbind(c(1,1,80),
c(1,2,81),
c(1,3,171),
c(1,4,41),
c(1,5,55),
c(0,1,30),
c(0,2,46),
c(0,3,148),
c(0,4,84),
c(0,5,99))
colnames(ideology) = c("party","ideology","count")
ideology = as.data.frame(ideology)
ideology
## party ideology count
## 1 1 1 80
## 2 1 2 81
## 3 1 3 171
## 4 1 4 41
## 5 1 5 55
## 6 0 1 30
## 7 0 2 46
## 8 0 3 148
## 9 0 4 84
## 10 0 5 99
## mosaic plot
ideology.table = matrix(ideology$count, ncol=2,
dimnames=list(ideology=c("1","2","3","4","5"),
party=c("1","0")))
library(vcd) # run install.packages("vcd") first
## Warning: package 'vcd' was built under R version 3.4.2
## Loading required package: grid
mosaic(ideology.table, shade=TRUE)
library(MASS)
fit = polr(factor(ideology) ~ party, weights=count, data=ideology)
summary(fit)
##
## Re-fitting to get Hessian
## Call:
## polr(formula = factor(ideology) ~ party, data = ideology, weights = count)
##
## Coefficients:
## Value Std. Error t value
## party -0.9745 0.1292 -7.545
##
## Intercepts:
## Value Std. Error t value
## 1|2 -2.4690 0.1318 -18.7363
## 2|3 -1.4745 0.1090 -13.5314
## 3|4 0.2371 0.0942 2.5165
## 4|5 1.0695 0.1039 10.2923
##
## Residual Deviance: 2474.985
## AIC: 2484.985
NOTE: The coefficient of party is the opposite of that from the SAS output. This is because the function polr
is built upon the latent variable model. It models \(\alpha_j-\beta x\). The one from SAS is the one we need.