[Data] Alligator Data: In a study by the Florida Game and Fresh Water Fish Commission of the foods that alligators in the wild choose to eat, 59 alligators in Lake George, Florida, were sampled, and the primary food type found in each alligator's stomach was recorded along with the alligator's length.
The response outcome: primary food choice (three levels: Fish, Invertebrates, and Other)
An explanatory variable: alligator length.
[Goal] Study the relationship between the alligator length and primary food choice.
[Specification]
Specify link = glogit in the model statement.
[Code]
proc import datafile='..\data\alligator.csv' out=crab
dbms=csv replace;
run;
proc logistic;
class food (ref="O") / param=ref;
model food = length / link = glogit;
run;
The LOGISTIC Procedure
Model Information
Data Set WORK.CRAB
Response Variable food
Number of Response Levels 3
Model generalized logit
Optimization Technique Newton-Raphson
Number of Observations Read 59
Number of Observations Used 59
Response Profile
Ordered              Total
Value    food    Frequency
1        F              31
2        I              20
3        O               8
Logits modeled use food='O' as the reference category.
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
                          Intercept
             Intercept          and
Criterion         Only   Covariates
AIC            119.142      106.341
SC             123.297      114.651
-2 Log L       115.142       98.341
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 16.8006 2 0.0002
Score 12.5702 2 0.0019
Wald 8.9360 2 0.0115
Type 3 Analysis of Effects
                      Wald
Effect    DF    Chi-Square    Pr > ChiSq
length     2        8.9360        0.0115
Analysis of Maximum Likelihood Estimates
                                   Standard          Wald
Parameter  food    DF   Estimate      Error    Chi-Square    Pr > ChiSq
Intercept  F        1     1.6177     1.3073        1.5314        0.2159
Intercept  I        1     5.6974     1.7938       10.0881        0.0015
length     F        1    -0.1101     0.5171        0.0453        0.8314
length     I        1    -2.4654     0.8997        7.5101        0.0061
Odds Ratio Estimates
                   Point            95% Wald
Effect  food    Estimate   Confidence Limits
length  F          0.896     0.325     2.468
length  I          0.085     0.015     0.496
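To make the fitted baseline-category logits concrete, the following Python sketch plugs the estimates from the output above into the two equations log(P(F)/P(O)) = 1.6177 - 0.1101 length and log(P(I)/P(O)) = 5.6974 - 2.4654 length, then converts them to probabilities. It is only an illustration and not part of the SAS analysis; the function name `food_probabilities` and the three lengths are hypothetical choices made here, since the notes do not list the observed lengths or their units.

```python
import math

# Estimates copied from "Analysis of Maximum Likelihood Estimates"
# (reference category: O). Each entry is (intercept, slope on length).
COEF = {
    "F": (1.6177, -0.1101),
    "I": (5.6974, -2.4654),
}

def food_probabilities(length):
    """Return estimated P(F), P(I), P(O) at a given alligator length."""
    # Linear predictors are log odds relative to the reference category O.
    eta = {food: a + b * length for food, (a, b) in COEF.items()}
    denom = 1.0 + sum(math.exp(v) for v in eta.values())
    probs = {food: math.exp(v) / denom for food, v in eta.items()}
    probs["O"] = 1.0 / denom
    return probs

# Odds ratios for a one-unit increase in length (compare with the
# "Odds Ratio Estimates" table: 0.896 and 0.085).
odds_ratios = {food: math.exp(b) for food, (a, b) in COEF.items()}
print({food: round(v, 3) for food, v in odds_ratios.items()})

# Hypothetical lengths, chosen only for illustration.
for length in (1.5, 2.0, 2.5):
    probs = food_probabilities(length)
    print(length, {food: round(p, 3) for food, p in probs.items()})
```

The exponentiated slopes reproduce the Odds Ratio Estimates table: each one-unit increase in length multiplies the estimated odds of Invertebrates rather than Other by about 0.085, so the estimated probability of Invertebrates falls as length increases.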
[Data] Afterlife Data
[Code]
data belief;
input race $ gender $ belief $ count;
datalines;
white female yes 371
white female undecided 49
white female no 74
white male yes 250
white male undecided 45
white male no 71
black female yes 64
black female undecided 9
black female no 15
black male yes 25
black male undecided 5
black male no 13
;
proc logistic;
class race (ref="black") gender (ref="male") belief (ref="no") / param=ref;
weight count;
model belief = gender race / link = glogit aggregate scale=none;
run;
The LOGISTIC Procedure
Model Information
Data Set WORK.BELIEF
Response Variable belief
Number of Response Levels 3
Weight Variable count
Model generalized logit
Optimization Technique Newton-Raphson
Number of Observations Read 12
Number of Observations Used 12
Sum of Weights Read 991
Sum of Weights Used 991
Response Profile
Ordered                  Total        Total
Value    belief      Frequency       Weight
1        no                  4    173.00000
2        undecide            4    108.00000
3        yes                 4    710.00000
Logits modeled use belief='no' as the reference category.
Class Level Information
                     Design
Class     Value   Variables
race      black           0
          white           1
gender    female          1
          male            0
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Deviance and Pearson Goodness-of-Fit Statistics
Criterion Value DF Value/DF Pr > ChiSq
Deviance 0.8539 2 0.4269 0.6525
Pearson 0.8609 2 0.4304 0.6502
Number of unique profiles: 4
Model Fit Statistics
                          Intercept
             Intercept          and
Criterion         Only   Covariates
AIC           1560.197     1559.453
SC            1561.167     1562.362
-2 Log L      1556.197     1547.453
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 8.7437 4 0.0678
Score 8.8498 4 0.0650
Wald 8.7818 4 0.0668
Type 3 Analysis of Effects
                      Wald
Effect    DF    Chi-Square    Pr > ChiSq
gender     2        7.2074        0.0272
race       2        2.0824        0.3530
Analysis of Maximum Likelihood Estimates
                                               Standard          Wald
Parameter          belief      DF   Estimate      Error    Chi-Square    Pr > ChiSq
Intercept          undecide     1    -0.7582     0.3614        4.4031        0.0359
Intercept          yes          1     0.8828     0.2426       13.2390        0.0003
gender     female  undecide     1     0.1051     0.2465        0.1817        0.6699
gender     female  yes          1     0.4186     0.1713        5.9737        0.0145
race       white   undecide     1     0.2712     0.3541        0.5863        0.4438
race       white   yes          1     0.3420     0.2370        2.0814        0.1491
Odds Ratio Estimates
                                       Point            95% Wald
Effect                  belief      Estimate   Confidence Limits
gender female vs male   undecide       1.111     0.685     1.801
gender female vs male   yes            1.520     1.086     2.126
race white vs black     undecide       1.311     0.655     2.625
race white vs black     yes            1.408     0.885     2.240
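As a rough check on how to read these estimates, the Python sketch below converts the two fitted logits (undecided vs. no, yes vs. no) into probabilities for each race-by-gender group and prints them next to the observed proportions from the datalines. The function name `fitted_probabilities` is a hypothetical helper introduced here, and the response level is spelled "undecided" in the sketch although SAS truncates it to "undecide" in the output.

```python
import math

# Estimates copied from the output above (reference response: "no";
# reference levels: gender = male, race = black).
COEF = {
    "undecided": {"intercept": -0.7582, "female": 0.1051, "white": 0.2712},
    "yes": {"intercept": 0.8828, "female": 0.4186, "white": 0.3420},
}

# Observed counts from the datalines, keyed by (race, gender).
COUNTS = {
    ("white", "female"): {"yes": 371, "undecided": 49, "no": 74},
    ("white", "male"): {"yes": 250, "undecided": 45, "no": 71},
    ("black", "female"): {"yes": 64, "undecided": 9, "no": 15},
    ("black", "male"): {"yes": 25, "undecided": 5, "no": 13},
}

def fitted_probabilities(race, gender):
    """Estimated P(undecided), P(yes), P(no) for one race-by-gender group."""
    female = 1 if gender == "female" else 0
    white = 1 if race == "white" else 0
    # Each linear predictor is a log odds relative to the response "no".
    eta = {
        resp: c["intercept"] + c["female"] * female + c["white"] * white
        for resp, c in COEF.items()
    }
    denom = 1.0 + sum(math.exp(v) for v in eta.values())
    probs = {resp: math.exp(v) / denom for resp, v in eta.items()}
    probs["no"] = 1.0 / denom
    return probs

for group, counts in COUNTS.items():
    n = sum(counts.values())
    fit = fitted_probabilities(*group)
    observed = {resp: counts[resp] / n for resp in fit}
    print(group)
    print("  fitted:  ", {r: round(p, 3) for r, p in fit.items()})
    print("  observed:", {r: round(p, 3) for r, p in observed.items()})
```

The fitted and observed proportions are close in every group, which is consistent with the small deviance (0.8539 on 2 df) reported above. Exponentiating a coefficient gives the corresponding entry of the Odds Ratio Estimates table, for example exp(0.4186) = 1.520 for females vs. males on the yes vs. no logit.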
[Specification]
IDEOLOGY takes five ordered values. By default, PROC LOGISTIC fits a cumulative logit (proportional odds) model when the response has more than two categories.
The options scale=none aggregate=(party) request the deviance \(G^2\) and the Pearson \(X^2\) goodness-of-fit statistics.
[Code]
data ideology;
input party ideology count @@; /* trailing @@ is used when one record contains multiple observations. */
datalines;
1 1 80 1 2 81 1 3 171 1 4 41 1 5 55
0 1 30 0 2 46 0 3 148 0 4 84 0 5 99
;
proc logistic;
weight count;
model ideology = party / scale=none aggregate=(party);
run;
The LOGISTIC Procedure
Model Information
Data Set WORK.IDEOLOGY
Response Variable ideology
Number of Response Levels 5
Weight Variable count
Model cumulative logit
Optimization Technique Fisher's scoring
Number of Observations Read 10
Number of Observations Used 10
Sum of Weights Read 835
Sum of Weights Used 835
Response Profile
Ordered                  Total        Total
Value    ideology    Frequency       Weight
1        1                   2    110.00000
2        2                   2    127.00000
3        3                   2    319.00000
4        4                   2    125.00000
5        5                   2    154.00000
Probabilities modeled are cumulated over the lower Ordered Values.
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Score Test for the Proportional Odds Assumption
Chi-Square DF Pr > ChiSq
3.9106 3 0.2713
Deviance and Pearson Goodness-of-Fit Statistics
Criterion Value DF Value/DF Pr > ChiSq
Deviance 3.6877 3 1.2292 0.2972
Pearson 3.6629 3 1.2210 0.3002
Number of unique profiles: 2
Model Fit Statistics
                          Intercept
             Intercept          and
Criterion         Only   Covariates
AIC           2541.630     2484.985
SC            2542.840     2486.498
-2 Log L      2533.630     2474.985
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 58.6451 1 <.0001
Score 57.2448 1 <.0001
Wald 57.0182 1 <.0001
Analysis of Maximum Likelihood Estimates
                               Standard          Wald
Parameter      DF   Estimate      Error    Chi-Square    Pr > ChiSq
Intercept 1     1    -2.4690     0.1318      350.8122        <.0001
Intercept 2     1    -1.4745     0.1091      182.7151        <.0001
Intercept 3     1     0.2371     0.0948        6.2497        0.0124
Intercept 4     1     1.0695     0.1046      104.6082        <.0001
party           1     0.9745     0.1291       57.0182        <.0001
Odds Ratio Estimates
             Point            95% Wald
Effect    Estimate   Confidence Limits
party        2.650     2.058     3.412
Association of Predicted Probabilities and Observed Responses
Percent Concordant 25.0 Somers' D 0.000
Percent Discordant 25.0 Gamma 0.000
Percent Tied 50.0 Tau-a 0.000
Pairs 40 c 0.500
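The fitted cumulative logit model is logit P(ideology <= j) = alpha_j + 0.9745 party, with the four intercepts listed above. The Python sketch below (an illustration only; the helper names are hypothetical) converts these into fitted cumulative probabilities for each party and compares them with the observed cumulative proportions from the datalines.

```python
import math

ALPHA = [-2.4690, -1.4745, 0.2371, 1.0695]  # Intercept 1 to Intercept 4
BETA = 0.9745                                # common party effect
COUNTS = {1: [80, 81, 171, 41, 55], 0: [30, 46, 148, 84, 99]}

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def fitted_cumulative(party):
    """Estimated P(ideology <= j | party) for j = 1, ..., 4."""
    return [expit(a + BETA * party) for a in ALPHA]

def observed_cumulative(party):
    """Observed cumulative proportions for j = 1, ..., 4."""
    counts = COUNTS[party]
    total = sum(counts)
    running, out = 0, []
    for c in counts[:-1]:
        running += c
        out.append(running / total)
    return out

for party in (1, 0):
    print("party =", party)
    print("  fitted:  ", [round(p, 3) for p in fitted_cumulative(party)])
    print("  observed:", [round(p, 3) for p in observed_cumulative(party)])

# Under proportional odds the cumulative odds ratio is the same at every cut point.
cumulative_odds_ratio = math.exp(BETA)
print(round(cumulative_odds_ratio, 3))
```

The single odds ratio exp(0.9745) = 2.650 applies to every cut point j: the estimated odds that ideology falls at or below category j are about 2.65 times higher for PARTY=1 than for PARTY=0.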
Here we use the cumulative logit model to illustrate the adjustment for overdispersion.
The Pearson statistic is 3.6629 with df = 3, so the Pearson statistic divided by its df is 1.221.
The total number of free cells is \(2 \times 4 = 8\). There are four cumulative logit equations, each with its own intercept and all sharing a common covariate effect, for a total of five parameters to estimate. The difference gives df = 8 - 5 = 3.
\(\hat \sigma^2 = 1.221 \rightarrow \hat \sigma = \sqrt{1.221} = 1.105\). Therefore, we can add scale=1.105 in the model statement.
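The degrees-of-freedom count and the scale value can be reproduced with a few lines of arithmetic. The Python sketch below is only a check of the numbers quoted above; the variable names are chosen here for illustration.

```python
import math

n_profiles = 2         # party = 0 and party = 1
n_response_levels = 5  # ideology = 1, ..., 5

# Each profile has (levels - 1) free multinomial probabilities.
free_cells = n_profiles * (n_response_levels - 1)

# Four intercepts (one per cumulative logit) plus one common party effect.
n_params = (n_response_levels - 1) + 1

df = free_cells - n_params

pearson = 3.6629                  # Pearson X^2 from the scale=none fit
sigma2_hat = pearson / df         # estimated dispersion
sigma_hat = math.sqrt(sigma2_hat) # value passed to scale=

print(df, round(sigma2_hat, 3), round(sigma_hat, 3))  # 3 1.221 1.105
```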
[Code]
data ideology;
input party ideology count @@; /* trailing @@ is used when one record contains multiple observations. */
datalines;
1 1 80 1 2 81 1 3 171 1 4 41 1 5 55
0 1 30 0 2 46 0 3 148 0 4 84 0 5 99
;
proc logistic;
weight count;
model ideology = party / scale=1.105 aggregate=(party);
run;
The LOGISTIC Procedure
Model Information
Data Set WORK.IDEOLOGY
Response Variable ideology
Number of Response Levels 5
Weight Variable count
Model cumulative logit
Optimization Technique Fisher's scoring
Number of Observations Read 10
Number of Observations Used 10
Sum of Weights Read 835
Sum of Weights Used 835
Response Profile
Ordered                  Total        Total
Value    ideology    Frequency       Weight
1        1                   2    110.00000
2        2                   2    127.00000
3        3                   2    319.00000
4        4                   2    125.00000
5        5                   2    154.00000
Probabilities modeled are cumulated over the lower Ordered Values.
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Score Test for the Proportional Odds Assumption
Chi-Square DF Pr > ChiSq
3.9106 3 0.2713
Deviance and Pearson Goodness-of-Fit Statistics
Criterion Value DF Value/DF Pr > ChiSq
Deviance 3.6877 3 1.2292 0.2972
Pearson 3.6629 3 1.2210 0.3002
Number of unique profiles: 2
NOTE: The covariance matrix has been multiplied by the heterogeneity
factor (square of SCALE=1.105) 1.22103.
Model Fit Statistics
                          Intercept
             Intercept          and
Criterion         Only   Covariates
AIC           2083.003     2036.973
SC            2084.213     2038.486
-2 Log L      2075.003     2026.973
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 48.0294 1 <.0001
Score 46.8826 1 <.0001
Wald 46.6970 1 <.0001
Analysis of Maximum Likelihood Estimates
                               Standard          Wald
Parameter      DF   Estimate      Error    Chi-Square    Pr > ChiSq
Intercept 1     1    -2.4690     0.1457      287.3096        <.0001
Intercept 2     1    -1.4745     0.1205      149.6408        <.0001
Intercept 3     1     0.2371     0.1048        5.1184        0.0237
Intercept 4     1     1.0695     0.1156       85.6725        <.0001
party           1     0.9745     0.1426       46.6970        <.0001
Odds Ratio Estimates
             Point            95% Wald
Effect    Estimate   Confidence Limits
party        2.650     2.004     3.504
Association of Predicted Probabilities and Observed Responses
Percent Concordant 25.0 Somers' D 0.000
Percent Discordant 25.0 Gamma 0.000
Percent Tied 50.0 Tau-a 0.000
Pairs 40 c 0.500
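Comparing the two ideology fits shows what scale=1.105 changes: the point estimates are identical, the standard errors are multiplied by 1.105, and the chi-square statistics are divided by 1.105 squared. The Python sketch below checks this against the printed values; the inputs are copied from the scale=none output, so agreement is only up to rounding in the last printed digit.

```python
import math

SCALE = 1.105
HETEROGENEITY = SCALE ** 2  # 1.221025, printed as 1.22103 in the NOTE

# Values copied from the scale=none fit.
UNSCALED_SE = {
    "Intercept 1": 0.1318, "Intercept 2": 0.1091,
    "Intercept 3": 0.0948, "Intercept 4": 0.1046, "party": 0.1291,
}
UNSCALED_CHISQ = {"Likelihood Ratio": 58.6451, "Score": 57.2448, "Wald": 57.0182}
BETA_PARTY = 0.9745

# Standard errors scale by sigma-hat; test statistics scale by 1 / sigma-hat^2.
scaled_se = {name: se * SCALE for name, se in UNSCALED_SE.items()}
scaled_chisq = {name: x / HETEROGENEITY for name, x in UNSCALED_CHISQ.items()}

# 95% Wald interval for the party odds ratio using the scaled standard error.
z = 1.959964  # 97.5th percentile of the standard normal distribution
half_width = z * scaled_se["party"]
ci = (math.exp(BETA_PARTY - half_width), math.exp(BETA_PARTY + half_width))

print({k: round(v, 4) for k, v in scaled_se.items()})
print({k: round(v, 4) for k, v in scaled_chisq.items()})
print(tuple(round(v, 3) for v in ci))
```

The odds ratio estimate stays at 2.650, but its confidence interval widens from (2.058, 3.412) to roughly (2.004, 3.504), reflecting the extra variability allowed by the dispersion adjustment.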
[Specification]
response alogits specifies the response functions as adjacent-category logits.
_response_ in the model statement requests that a single effect parameter be used. Without this keyword, the adjacent-categories logit model is just a re-parameterization of the baseline logit model.
direct statement.
PROC CATMOD without _response_ and PROC LOGISTIC with effect coding scheme.
[Code]
data ideology;
input party ideology count @@;
datalines;
1 1 80 1 2 81 1 3 171 1 4 41 1 5 55
0 1 30 0 2 46 0 3 148 0 4 84 0 5 99
;
proc catmod;
weight count;
response alogits;
model ideology = _response_ party;
run;
The CATMOD Procedure
Data Summary
Response ideology Response Levels 5
Weight Variable count Populations 2
Data Set IDEOLOGY Total Frequency 835
Frequency Missing 0 Observations 10
Population Profiles
Sample party Sample Size
------------------------------
1 0 407
2 1 428
Response Profiles
Response ideology
--------------------
1 1
2 2
3 3
4 4
5 5
Analysis of Variance
Source DF Chi-Square Pr > ChiSq
--------------------------------------------
Intercept 1 8.84 0.0029
_RESPONSE_ 3 174.46 <.0001
party 1 52.63 <.0001
Residual 3 5.38 0.1459
Analysis of Weighted Least Squares Estimates
                              Standard      Chi-
Parameter         Estimate       Error    Square    Pr > ChiSq
--------------------------------------------------------------
Intercept           0.0954      0.0321      8.84        0.0029
_RESPONSE_  1       0.1256      0.1169      1.15        0.2827
            2       0.8597      0.1096     61.50        <.0001
            3      -1.0274      0.1110     85.69        <.0001
party       0       0.2159      0.0298     52.63        <.0001
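The party affiliation effect is \(\hat\beta = 0.2159\). Because no direct statement is used, PROC CATMOD applies effect coding to PARTY (+1 for PARTY=0, -1 for PARTY=1), so the log odds ratio between the two parties is \(2\hat\beta = 0.4318\). The estimated odds that a Republican's (PARTY=0) ideology classification is in category \(j+1\) instead of \(j\) are \(\exp(2\hat\beta) = 1.54\) times the estimated odds for a Democrat.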
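The coding behind the CATMOD estimates can be checked numerically. The Python sketch below rests on two assumptions that the output only implies: the row label "party 0" indicates effect coding (+1 for PARTY=0, -1 for PARTY=1, with the fourth _RESPONSE_ term equal to minus the sum of the other three), and the adjacent-category logits are log(pi_{j+1} / pi_j). The helper names are hypothetical. Under these assumptions the fitted logits track the sample logits from the datalines, and the between-party odds ratio is exp(2 x 0.2159).

```python
import math

INTERCEPT = 0.0954
RESPONSE = [0.1256, 0.8597, -1.0274]
RESPONSE.append(-sum(RESPONSE))  # assumed: effect-coded terms sum to zero
BETA = 0.2159                    # parameter printed for "party 0"
EFFECT_CODE = {0: 1, 1: -1}      # assumed CATMOD default coding for party

COUNTS = {0: [30, 46, 148, 84, 99], 1: [80, 81, 171, 41, 55]}

def fitted_logits(party):
    """Fitted log(pi_{j+1} / pi_j), j = 1, ..., 4, under the assumed coding."""
    return [INTERCEPT + r + BETA * EFFECT_CODE[party] for r in RESPONSE]

def observed_logits(party):
    """Sample adjacent-category logits log(n_{j+1} / n_j)."""
    c = COUNTS[party]
    return [math.log(c[j + 1] / c[j]) for j in range(4)]

for party in (0, 1):
    print("party =", party)
    print("  fitted:  ", [round(v, 3) for v in fitted_logits(party)])
    print("  observed:", [round(v, 3) for v in observed_logits(party)])

# The two parties differ by 2 * BETA on the logit scale under effect coding.
odds_ratio = math.exp(2 * BETA)
print(round(odds_ratio, 2))  # 1.54
```

Treating 0.2159 as a 0/1 dummy coefficient instead would give exp(0.2159) = 1.24, which understates the difference between the parties.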