Difference-in-means

A researcher is interested in whether attitudes about Black Lives Matter (BLM) differ between

white Democrat survey respondents and Latinx Democrat survey respondents. To measure

attitudes toward BLM, the researcher used a 101-point feeling thermometer such that a 0

denotes least favorable ratings and a 100 denotes most favorable ratings. To measure race, the

researcher created a dummy (binary) variable for race coded 1 if the survey respondent

identified as white and 0 if the survey respondent identified as Latinx.

The researcher conducts a two-group t-test and sees the results reported below:

t.test(dems$police_therm~dems$whitelat, var.equal=FALSE, type=”two.sided”)

t = -1.654, df = 454.44, p-value = 0.09882

95 percent confidence interval:

-5.4125030 0.4654062

sample estimates:

mean Latinx mean Whites

60.82073 63.29428

1. What are the null and alternative hypotheses the researcher is interested in, given the

description above?

2. Based on the evidence reported in the output given, is there sufficient evidence to reject the

null hypothesis? Why or why not? Assume =0.05.

3. Would there be sufficient evidence to conclude that white Democrat ratings of the police are

significantly higher than Latinx ratings of the police? Why or why not? Assume =0.05.

Regression 1

A researcher is interested in the relationship between evaluations of the Immigration and

Customs Enforcement agency (ICE). The researcher suspects that attitudes about ICE will be

related to race, party identification and income. To assess this, the researcher created a

dummy (binary) variable for race coded 1 if the survey respondent identified as white and 0 if

the survey respondent identified as Latinx.

To assess party affiliation, the researcher created a 7-point scale coded such that 1=strong

Democrat; 2=not strong Democrat; 3=Independent who leans Democrat; 4=true Independent;

5=Independent who leans Republicans; 6=not strong Republican; 7=strong Republican.

To assess income, the researcher created a dummy variable coded 1 if the respondent’s

reported income was above the median income level for 2020 and coded 0 if the reported

income was at or below the median.

The ICE ratings are based on a 101-point feeling thermometer such that a 0 denotes least

favorable attitudes and a 100 denotes most favorable attitudes.

Using a linear regression model, the researcher obtains the following regression estimates:

Variable Est. S.E. t p-value 95% Confidence Interval

Race 5.20 1.05

Party 7.54 0.14 7.26, 7.82

Income 0.31 0.65 0.48 -0.96 , 1.58

Intercept 15.00 1.13 13.31 0.000 12.77, 17.18

Computation questions:

Based on this output, make the following computations:

1. What is the t score for the “Race” estimate?

2. What is the t-score for the “Party” estimate?

3. What would the p-values be for “Race” and for “Party”? You can approximate this.

4. For the “Race” estimate, compute the 95% confidence interval; assume the critical t (i.e. t

*

) is

equal to 2.

5. Suppose you know the following:

= 301

=604.

That is, the sum of squares due to the regression (referred to as “explained variance” in lecture)

is 301 and the sum of squares due to error (referred to as “residual variance” in lecture) is 604.

Based on this information, what would be the R

2

for this regression model?

Interpretation questions

6. Would you conclude that “Race” is significantly related to ICE ratings? Why or why not?

7. Would you conclude that “Party” is significantly related to ICE ratings? Why or why not?

8. Would you conclude that “Income” is significantly related to ICE ratings? Why or why not?

9. Write out the regression equation using the information from the table.

10. What is the difference in ICE ratings between white and Latinx respondents?

11. How should the “Party” regression slope estimate be interpreted?

12. What is the predicted ICE rating for:

A white respondent who identifies as a strong Republican with an income below the

median?

A Latinx respondent who identifies as a strong Democrat with an income level above the

median?

13. How would you interpret the R

2?

Regression 2

A researcher is interested in the factors that predict people’s ratings of “the police.” Using a

feeling thermometer from the 2020 American National Election study, the researcher estimates

a regression model using the following independent variables: race, coded as a three-level

factor variable denoting White respondents, Black respondents, and Latinx Respondents;

gender, code as a two-level factor denoting Females and Males; party, denoted as a two-level

factor denoting Republicans and Democrats; group empathy, a 16-point scale measuring

empathy toward other racial groups (high scores=greater empathy). In addition, the researcher

is interested in seeing if any of these variables relationship with police ratings varies by party

affiliation. Because of this, the researcher includes an interaction term of each of the variables

by party.

The researcher estimates the following regression model:

𝑃𝑜𝑙𝑖𝑐𝑒 𝑟𝑎𝑡𝑖𝑛𝑔𝑠 = 𝛽0 + 𝛽1𝑅𝑒𝑝𝑢𝑏𝑙𝑖𝑐𝑎𝑛 + 𝛽2𝐵𝑙𝑎𝑐𝑘 + 𝛽3𝐿𝑎𝑡𝑖𝑛𝑥

+ 𝛽4𝐹𝑒𝑚𝑎𝑙𝑒 + 𝛽5𝐺𝑟𝑜𝑢𝑝 𝑒𝑚𝑝𝑎𝑡ℎ𝑦 + 𝛽6𝑅𝑒𝑝.∗ 𝐵𝑙𝑎𝑐𝑘

+ 𝛽7𝑅𝑒𝑝.∗ 𝐿𝑎𝑡𝑖𝑛𝑥 + 𝛽8𝑅𝑒𝑝.∗ 𝐺𝑒𝑛𝑑𝑒𝑟 + 𝛽9𝑅𝑒𝑝.∗ 𝐺𝑟𝑜𝑢𝑝 𝑒𝑚𝑎𝑝𝑡ℎ𝑦,

where Republican=1 if a Republican and 0 if Democrat; Black=1 if the respondent is black and 0

otherwise; Latinx=1 if the respondent is Latinx and 0 otherwise; Female=1 if the respondent is

female and 0 if male; Group empathy ranges from 0 to 16.

The results of the regression are reported below.

Ratings of “the police” by race, gender, and party

Variable Estimate s.e.

Republican 9.27*** (1.90)

Black -12.86*** (1.05)

Latinx -2.53** (1.22)

Female 4.94*** (0.80)

Group empathy -1.32*** (0.12)

Rep. x Black -10.62*** (3.41)

Rep. x Latinx -1.30 (2.06)

Rep x Female -2.17* (1.14)

Rep x Group empathy 1.15*** (0.18)

Constant 75.30*** (1.48)

Observations 5,818

Adjusted R2 0.255

Residual Std. Error 21.29 (df = 5808)

Note: *p<0.1; **p<0.05; ***p<0.01

1. What are the predicted values for the following:

a. A white Republican male who scores a 0 on the group empathy scale:

b. A black Democrat male who scores a 16 on the group empathy scale:

c. A white Republican female who scores a 4 on the group empathy scale:

2. Below is a plot from the regression model showing the ratings of police, by party, controlling

for race. In full and complete sentences, how would you interpret these relationships?