The data set fertility.csv contains information about n = 4286 women in Botswana during 1988. This information includes number of children, years of education, age, and religious and economic status variables. Additional information about each of the variables in this data set is available in the file fertilitydescr.pdf, which is posted along with this homework. Our policy question is to understand the effect of education (how many years of education to undertake) on women's fertility decisions.

a. We first estimate the following model by OLS:

$$\mathit{children}_i = \beta_0 + \beta_1 \mathit{educ}_i + \beta_2 \mathit{age}_i + \beta_3 \mathit{age}_i^2 + \beta_4 \mathit{urban}_i + \beta_5 \mathit{tv}_i + \beta_6 \mathit{catholic}_i + \beta_7 \mathit{knowmeth}_i + \beta_8 \mathit{usemeth}_i + u_i, \quad i = 1, \dots, n,$$

where our focus is on the parameter $\beta_1$. Suppose at first that assumptions MLR.1 through MLR.4 for OLS are satisfied and estimate the model accordingly. Because MLR.5 (homoskedasticity) may not hold, report heteroskedasticity-robust standard errors. Report the coefficient estimates, their standard errors, and their significance levels.

b. Do you think that $\mathrm{Cov}(\mathit{educ}_i, u_i) = 0$? Explain.

c. The public education requirements for women in Botswana, including those in our estimation sample, depend on birth date. Those born in the first six months of the year must remain in school until they are approximately six months older than those born in the second half of the year. frsthalf is a dummy variable equal to one if the woman was born during the first six months of the year, and zero otherwise. Explain what is required for frsthalf to be a valid instrumental variable for educ. Do these assumptions seem reasonable?

d. Estimate the first-stage regression of educ on frsthalf and all of the other regressors included in the OLS fertility model of part (a). Verify that the coefficient on frsthalf is statistically significant. Use heteroskedasticity-robust standard errors.
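Parts (a) and (d) both ask for OLS estimates with heteroskedasticity-robust standard errors. The following is a minimal sketch of that computation in Python with NumPy, not a prescribed solution. The helper name `ols_robust`, the choice of the HC1 variant of the robust covariance estimator, and the synthetic first-stage data (including its coefficient values) are assumptions made for illustration; the actual exercise uses the columns of fertility.csv.

```python
import numpy as np

def ols_robust(X, y):
    """OLS with HC1 heteroskedasticity-robust standard errors.

    X: (n, k) array of regressors WITHOUT a constant column.
    y: (n,) outcome.
    Returns (beta, se, t), each with the intercept first.
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich estimator: sum_i u_i^2 * x_i x_i'
    meat = (X * resid[:, None] ** 2).T @ X
    # HC1 applies the degrees-of-freedom scaling n / (n - k)
    cov = n / (n - k) * XtX_inv @ meat @ XtX_inv
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se

# Synthetic stand-in for the first stage in part (d). All numbers here are
# invented for illustration; they are NOT estimates from fertility.csv.
rng = np.random.default_rng(0)
n = 2000
frsthalf = rng.integers(0, 2, n)
age = rng.integers(15, 50, n)
# Noise variance grows with age, so the errors are heteroskedastic by design.
educ = 8.0 - 0.8 * frsthalf - 0.05 * age + rng.normal(0, 1 + age / 50, n)

beta, se, t = ols_robust(np.column_stack([frsthalf, age]), educ)
for name, b, s, tv in zip(["const", "frsthalf", "age"], beta, se, t):
    print(f"{name:>9}: coef={b: .3f}  robust se={s:.3f}  t={tv: .2f}")
```

With the real data, the same function would be applied to the regressors from part (a) plus frsthalf. If statsmodels is available, fitting an OLS model with `fit(cov_type="HC1")` is a more convenient route that also reports p-values.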

