To overcome the difficulty associated with E[XU]≠0 (endogeneity), we assume that there is an additional random vector Z taking values in R^{l+1}, with l+1≥k+1, such that E[ZU]=0.
Any exogenous component of X is contained in Z (the so-called included instruments). In particular, we assume the first component of Z is constant equal to one, i.e., Z=(Z0,Z1,…,Zl)′ with Z0=1.
We also assume that E[ZX′]<∞,E[ZZ′]<∞ and that there is no perfect collinearity in Z.
In summary, we assume:
E[ZU]=0: Instrument Exogeneity
E[ZX′]<∞
E[ZZ′]<∞
There is no perfect collinearity in Z
We further assume the rank of E[ZX′] is k+1. This is termed Instrument Relevance or Rank Condition.
A necessary condition for the rank condition to hold is l≥k. This is referred to as the Order Condition.
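As a rough numerical illustration of the order and rank conditions, the sketch below builds the sample analogue of E[ZX′] on simulated data and inspects its rank. The data-generating process, the sample size, and all variable names are assumptions made for illustration only; they are not part of the setup above.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, l = 5000, 1, 2   # one regressor besides the constant, two instruments

# Hypothetical data-generating process (assumption for illustration)
z = rng.normal(size=(n, l))
x1 = z @ np.array([1.0, 0.5]) + rng.normal(size=n)

Z = np.column_stack([np.ones(n), z])     # n x (l+1), first column is Z0 = 1
X = np.column_stack([np.ones(n), x1])    # n x (k+1), first column is X0 = 1

ZX = Z.T @ X / n                         # sample analogue of E[ZX'], (l+1) x (k+1)

order_ok = l >= k                                      # order condition
rank_ok = np.linalg.matrix_rank(ZX) == k + 1           # rank condition (sample version)
smallest_sv = np.linalg.svd(ZX, compute_uv=False).min()

print(order_ok, rank_ok, smallest_sv)
```

A sample matrix is almost always of full rank numerically, so the smallest singular value is usually the more informative quantity: a value close to zero suggests that the rank condition is close to failing.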
To further understand the IV estimator, we need to know the following:
For regression
Y=X′β+U
Here is an example of using IV in real-world empirical analysis:
To analyze the relationship between the education level (e.g., years of schooling) of former military personnel and their proximity to school (e.g., distance to the nearest school or a measure of school accessibility), we can use draft lottery return status (e.g., whether an individual was likely to be drafted for military service) as an instrumental variable. We now check the relevance and exogeneity of the draft lottery return status as an IV.
Relevance: The draft lottery status could affect an individual's proximity to school, as those with a higher likelihood of being drafted might choose to stay in school longer or closer to educational institutions.
Exogeneity: The draft lottery status is presumably random and thus not correlated with the unobserved factors affecting education levels directly.
Therefore, the draft lottery return status can serve as a good IV.
Solving For Beta
Proof:
Lemma
Solve for Beta
Equation 1: IV estimation
Equation 2: TSLS Version 1
As we can have that:
Equation 3: TSLS Version 2
Interpreting The Rank Condition (Instrument Relevance)
The rank condition for IV estimation is a technical requirement that ensures the IV or set of IVs provides enough information to identify the model.
Strong IV:
A strong IV is highly correlated with the endogenous explanatory variable.
This strong correlation ensures that the IV effectively captures the variation in the endogenous variable that is not related to the error term in the regression model.
Strong IVs lead to more reliable and precise estimates in IV regression.
Weak IV:
A weak IV has a weak correlation with the endogenous explanatory variable.
This weak correlation means that the IV does not effectively capture the variation in the endogenous variable, making it less effective in dealing with endogeneity.
Weak IVs can lead to biased estimates and poor inference because they do not provide a good substitute for the endogenous variable.
In our scenario,
Large Variance: When the instrument is weak, the variance of the IV estimator becomes very large. This leads to wide confidence intervals, making it difficult to draw precise inferences about the parameters.
Biased Estimation: In small samples, a weak instrument can lead to biased estimates, and these biases can be as bad or even worse than the OLS estimates that suffer from endogeneity.
Inconsistent Estimation: In theory, the IV estimator is consistent as the sample size approaches infinity. However, with a weak instrument, the convergence to the true parameter value can be very slow, leading to practical issues in estimation, even with large samples.
Weak Instrument Problem: If the IV is weakly correlated (or not correlated) with the endogenous variable, it results in a weak instrument problem. This can lead to biased and inconsistent estimates, similar to or even worse than the original OLS estimates that were affected by endogeneity.
Inefficiency: The estimates may also have large standard errors, leading to inefficiency and making it difficult to draw reliable inferences.
Identification Issue: In the extreme case where there is no correlation at all, the model becomes unidentified, meaning that you cannot reliably estimate the coefficients of the endogenous variables.
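The consequences listed above can be made concrete with a small Monte Carlo sketch. It compares the sampling spread of the just-identified IV slope under a strong and a weak first stage. The model, the first-stage coefficients (1.0 and 0.05), the sample size, and the helper names `iv_slope`, `simulate`, and `iqr` are all hypothetical choices for illustration. The interquartile range is used instead of the variance because the weak-instrument estimator has very heavy tails.

```python
import numpy as np

def iv_slope(z, x, y):
    """Just-identified IV slope with a constant: sum (z - zbar) y / sum (z - zbar) x."""
    zc = z - z.mean()
    return (zc @ y) / (zc @ x)

def simulate(pi, n=200, reps=2000, seed=0):
    """Draw `reps` samples of size n and return the IV slope estimates."""
    rng = np.random.default_rng(seed)
    est = np.empty(reps)
    for r in range(reps):
        z = rng.normal(size=n)
        u = rng.normal(size=n)
        x = pi * z + 0.8 * u + rng.normal(size=n)   # x is endogenous: depends on u
        y = 1.0 + 2.0 * x + u                       # true slope is 2
        est[r] = iv_slope(z, x, y)
    return est

def iqr(a):
    """Interquartile range, a spread measure that is robust to heavy tails."""
    q75, q25 = np.percentile(a, [75, 25])
    return q75 - q25

strong = simulate(pi=1.0)    # strong instrument
weak = simulate(pi=0.05)     # weak instrument

print("strong: median", np.median(strong), "IQR", iqr(strong))
print("weak:   median", np.median(weak), "IQR", iqr(weak))
```

Under these assumptions the weak-instrument estimates are far more dispersed, and their center drifts away from the true slope toward the OLS value.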
Interpreting The Exogeneity Condition
Consequence of Violation:
Biased and Inconsistent Estimates: If the IV is correlated with the error term, the estimates will be biased and inconsistent. This is because the instrument is not isolating only the exogenous variation in the endogenous explanatory variable—it's also capturing some of the effects that should be in the error term.
Invalid Inferences: Any inferences made about the effect of the endogenous explanatory variable on the dependent variable would be invalid because they are contaminated by the correlation with the error term.
Partition of Beta: Endogenous Components
Note that a variable can also serve as its own IV, as in the following regression:
We have that, in this model:
It follows that
IV Estimator
The IV estimator is used when the number of instruments equals the number of explanatory variables, i.e., under the following condition:
Matrix Notation
This estimator may be expressed more compactly using matrix notation. Define
In this notation, we have
X: dimension is (k+1)×1
Z: dimension is (l+1)×1, and l≥k
Z influences Y only through X, not through U. This means that Z is correlated with X but not with U.
Using that U=Y−X′β and E[ZU]=0, we see that β solves the system of equations
E[ZY]=E[ZX′]β
E[ZY]=E[Z(U+X′β)]=0+E[ZX′]β=E[ZX′]β
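The moment equation can be checked numerically. The sketch below simulates an endogenous regressor and confirms that the sample analogue of E[ZY] is close to the sample analogue of E[ZX′]β at the true β, while the sample analogue of E[XU] is clearly non-zero. The true coefficients and the data-generating process are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
beta = np.array([1.0, 2.0])     # hypothetical true (beta_0, beta_1)

z1 = rng.normal(size=n)
u = rng.normal(size=n)
x1 = 0.8 * z1 + 0.6 * u + rng.normal(size=n)   # depends on u, hence endogenous

Z = np.column_stack([np.ones(n), z1])
X = np.column_stack([np.ones(n), x1])
y = X @ beta + u

lhs = Z.T @ y / n               # sample analogue of E[ZY]
rhs = (Z.T @ X / n) @ beta      # sample analogue of E[ZX'] beta
xu = X.T @ u / n                # sample analogue of E[XU]; second entry is not near 0

print(lhs, rhs, xu)
```

The gap between `lhs` and `rhs` is exactly the sample mean of Z_i U_i, which shrinks as n grows because E[ZU]=0.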
Note that E[ZX′] need not be invertible. It is an (l+1)×(k+1) matrix, so when l+1>k+1 it is not even square and the system of equations is over-determined: there are more equations than unknowns, i.e., more information than we need. This can be seen as follows:
Therefore, in order to solve for β, we introduce the following lemma:
Suppose there is no perfect collinearity in Z and let Π be such that BLP(X∣Z)=Π′Z. Then E[ZX′] has rank k+1 if and only if Π has rank k+1. Moreover, in that case the matrix Π′E[ZX′] is invertible.
Note that if some Xj are exogenous, then we do not need IVs for them.
Since β solves E[ZY]=E[ZX′]β, and hence Π′E[ZY]=Π′E[ZX′]β, the previous lemma together with Π=E[ZZ′]^{-1}E[ZX′] yields three formulae for β.
Since X=(1,X1,…,Xk)′, Z=(1,Z1,…,Zl)′, and Π=[Π0,Π1,⋯,Πk], the shape of Π is (l+1)×(k+1).
As Π′ is a matrix of constants (it is not random), we can move it inside the expectation. So β can be rewritten as
β=(E[(Π′Z)(Π′Z)′])^{-1}E[(Π′Z)Y]
Let W=Π′Z denote the resulting linear combinations of the instruments. Substituting this into the formula for β, we have
β=(E[WW′])^{-1}E[WY]
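The equivalence of the different expressions for β can be verified on sample analogues. The sketch below uses an over-identified design (l=2, k=1) and computes three versions. Only the form based on W=Π′Z is written out explicitly in the text; the other two, (Π′E[ZX′])^{-1}Π′E[ZY] and (E[XZ′]E[ZZ′]^{-1}E[ZX′])^{-1}E[XZ′]E[ZZ′]^{-1}E[ZY], are the standard forms that follow from the lemma and Π=E[ZZ′]^{-1}E[ZX′], and are stated here as an assumption about what the three formulae are. The simulated data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
z = rng.normal(size=(n, 2))
u = rng.normal(size=n)
x1 = z @ np.array([1.0, -0.5]) + 0.7 * u + rng.normal(size=n)   # endogenous

Z = np.column_stack([np.ones(n), z])     # n x (l+1)
X = np.column_stack([np.ones(n), x1])    # n x (k+1)
y = X @ np.array([1.0, 2.0]) + u         # hypothetical true beta = (1, 2)

Szz, Szx, Szy = Z.T @ Z / n, Z.T @ X / n, Z.T @ y / n
Pi = np.linalg.solve(Szz, Szx)           # sample Pi = E[ZZ']^{-1} E[ZX'], (l+1) x (k+1)

# Version A: (Pi' E[ZX'])^{-1} Pi' E[ZY]
b_a = np.linalg.solve(Pi.T @ Szx, Pi.T @ Szy)

# Version B: (E[WW'])^{-1} E[WY] with W = Pi'Z
W = Z @ Pi                               # row i holds W_i' = (Pi' Z_i)'
b_b = np.linalg.solve(W.T @ W / n, W.T @ y / n)

# Version C: (E[XZ'] E[ZZ']^{-1} E[ZX'])^{-1} E[XZ'] E[ZZ']^{-1} E[ZY]
b_c = np.linalg.solve(Szx.T @ np.linalg.solve(Szz, Szx),
                      Szx.T @ np.linalg.solve(Szz, Szy))

print(b_a, b_b, b_c)
```

The three vectors agree up to floating-point error because they are algebraically identical once Π is replaced by its definition.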
Essentially, it requires that the matrix of instruments Z should have sufficient rank so that the projection matrix Π adequately captures the relevant information in the endogenous variables.
Interpretation: Consider the case where k=l and only Xk is endogenous. Let Zj=Xj for all 0≤j≤k−1. In this case,
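$$
\Pi' = \begin{pmatrix}
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0 \\
\pi_0 & \pi_1 & \cdots & \pi_{l-1} & \pi_l
\end{pmatrix}
$$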
The rank condition therefore requires πl≠0: the instrument Zl must be "correlated with Xk after controlling for X0,X1,…,Xk−1."
If πl is close to zero, Zl is a weak IV because it does not add much explanatory power beyond what is already captured by X0,X1,…,Xk−1.
If πl is significantly different from zero, Zl is considered a strong IV, as it is meaningfully correlated with the endogenous variable Xk, independent of the other explanatory variables.
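To see this structure in a sample, the sketch below estimates Π by regressing each component of X on Z in a case with k=l=2, where X1 is exogenous (so Z1=X1) and X2 is endogenous with one excluded instrument. The data-generating process and the first-stage coefficient of 1.0 are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
x1 = rng.normal(size=n)      # exogenous regressor, reused as the instrument Z1
z2 = rng.normal(size=n)      # excluded instrument Z2
u = rng.normal(size=n)
x2 = 0.5 * x1 + 1.0 * z2 + 0.8 * u + rng.normal(size=n)   # endogenous X2

Z = np.column_stack([np.ones(n), x1, z2])
X = np.column_stack([np.ones(n), x1, x2])

# Column j of Pi_hat holds the OLS coefficients from regressing X_j on Z
Pi_hat = np.linalg.lstsq(Z, X, rcond=None)[0]

print(np.round(Pi_hat.T, 3))   # identity rows for the exogenous X_j, then the pi row
pi_l = Pi_hat[-1, -1]          # coefficient on the excluded instrument
print(pi_l)
```

The first rows of the estimated Π′ reproduce the identity pattern because each exogenous Xj is perfectly predicted by itself, and the last entry of the last row is the sample counterpart of πl.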
If Π is nearly singular, the instruments are weak: they do not explain X well. This weak-IV problem leads to:
In summary, if Z does not satisfy the rank condition (relevance condition), the consequences are:
Y=X1′β1+X2′β2+U
If E[X1U]≠0: X1 is endogenous, and we need to find an IV Z1 for X1 such that E[Z1U]=0
If E[X2U]=0: X2 is exogenous and can be viewed as an IV for itself, i.e., Z2=X2
Here, we partition X into X1 and X2, where X2 is exogenous. Partition Z into Z1 and Z2 and β into β1 and β2 analogously.
Z2=X2 are included instruments
Z1 are excluded instruments
We can conveniently re-write this by projecting (BLP) on Z2=X2. Consider the case k=l
BLP(Y∣Z2)=BLP(X1∣Z2)′β1+X2′β2.
Define Y∗=Y−BLP(Y∣Z2) and X1∗=X1−BLP(X1∣Z2), so that Y∗=X1∗′β1+U and, multiplying by Z1 and taking expectations,
E[Z1Y∗]=E[Z1X1∗′]β1+E[Z1U]
β1=(E[Z1X1∗′])^{-1}E[Z1Y∗]
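The partialling-out formula has an exact sample counterpart in the just-identified case. The sketch below compares the coefficient on the endogenous regressor from the full IV system with the one obtained after residualizing Y and X1 on Z2=X2 (which includes the constant). The simulated model and the helper name `resid` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
x2 = rng.normal(size=n)      # exogenous regressor (included instrument)
z1 = rng.normal(size=n)      # excluded instrument
u = rng.normal(size=n)
x1 = 0.3 * x2 + 1.0 * z1 + 0.8 * u + rng.normal(size=n)   # endogenous regressor
y = 1.0 + 2.0 * x1 - 1.0 * x2 + u                          # true beta_1 is 2

Z2 = np.column_stack([np.ones(n), x2])    # included instruments, Z2 = X2
X = np.column_stack([x1, Z2])             # regressors ordered as (X1, X2)
Z = np.column_stack([z1, Z2])             # instruments ordered as (Z1, Z2)

# Full just-identified IV: solve (Z'X) b = Z'y
beta_full = np.linalg.solve(Z.T @ X, Z.T @ y)

def resid(v, A):
    """Residual from the least-squares projection of v on the columns of A."""
    return v - A @ np.linalg.lstsq(A, v, rcond=None)[0]

# Partial out Z2 from Y and X1, then apply the scalar IV formula with Z1
y_star, x1_star = resid(y, Z2), resid(x1, Z2)
beta1_partial = (z1 @ y_star) / (z1 @ x1_star)

print(beta_full[0], beta1_partial)
```

The two numbers coincide up to rounding, since the IV residuals are orthogonal to Z2 in the sample and therefore unaffected by the projection.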
Just-identified case: l=k
Denote by P the marginal distribution of (Y,X,Z). Let (Y1,X1,Z1),…,(Yn,Xn,Zn) be an i.i.d. sequence of random vectors with distribution P.
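By analogy with β=(E[ZX′])^{-1}E[ZY], the natural estimator of β is simply
$$
\hat\beta=\left(\frac{1}{n}\sum_{i=1}^{n}Z_iX_i'\right)^{-1}\left(\frac{1}{n}\sum_{i=1}^{n}Z_iY_i\right).
$$
This estimator is called the instrumental variables (IV) estimator of β. Note that $\hat\beta$ satisfies
$$
\frac{1}{n}\sum_{i=1}^{n}Z_i\left(Y_i-X_i'\hat\beta\right)=0.
$$
In particular, $\hat U_i=Y_i-X_i'\hat\beta$ satisfies
$$
\frac{1}{n}\sum_{i=1}^{n}Z_i\hat U_i=0.
$$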
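A minimal sketch of the just-identified IV estimator, together with a check that the fitted residuals are orthogonal to the instruments in the sample, might look as follows. The function name `iv_estimator` and the simulated data are assumptions for illustration; OLS is computed only for comparison.

```python
import numpy as np

def iv_estimator(Z, X, y):
    """Just-identified IV: solve (sum_i Z_i X_i') b = sum_i Z_i Y_i for b."""
    if Z.shape[1] != X.shape[1]:
        raise ValueError("just-identified IV needs as many instruments as regressors")
    return np.linalg.solve(Z.T @ X, Z.T @ y)

rng = np.random.default_rng(5)
n = 5000
z1 = rng.normal(size=n)
u = rng.normal(size=n)
x1 = z1 + 0.8 * u + rng.normal(size=n)   # endogenous regressor
y = 1.0 + 2.0 * x1 + u                   # hypothetical true beta = (1, 2)

Z = np.column_stack([np.ones(n), z1])
X = np.column_stack([np.ones(n), x1])

beta_hat = iv_estimator(Z, X, y)
u_hat = y - X @ beta_hat
moment = Z.T @ u_hat / n                 # (1/n) sum_i Z_i U_hat_i, zero up to rounding

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]   # for comparison; biased here

print(beta_hat, beta_ols, moment)
```

Solving the linear system directly avoids forming an explicit matrix inverse, which is usually the numerically safer choice.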
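Insight on the IV estimator: assume X0=1 and X1∈R. An interesting interpretation of the IV estimator $\hat\beta_1$ is obtained by multiplying and dividing by $\frac{1}{n}\sum_{i=1}^{n}(Z_{1,i}-\bar Z_{1,n})^2$, i.e.,
$$
\hat\beta_1=\frac{\frac{1}{n}\sum_{i=1}^{n}(Z_{1,i}-\bar Z_{1,n})Y_i \Big/ \frac{1}{n}\sum_{i=1}^{n}(Z_{1,i}-\bar Z_{1,n})^2}{\frac{1}{n}\sum_{i=1}^{n}(Z_{1,i}-\bar Z_{1,n})X_{1,i} \Big/ \frac{1}{n}\sum_{i=1}^{n}(Z_{1,i}-\bar Z_{1,n})^2}=\frac{\text{slope of } Y \text{ on } Z}{\text{slope of } X \text{ on } Z}.
$$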
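The ratio-of-slopes interpretation can be checked directly. The sketch below computes the OLS slope of Y on Z (the reduced form) and of X on Z (the first stage), and compares their ratio with the IV slope from the matrix formula. The simulated data and the first-stage coefficient of 1.5 are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 1.5 * z + 0.8 * u + rng.normal(size=n)   # endogenous regressor
y = 1.0 + 2.0 * x + u

zc = z - z.mean()
slope_y_on_z = (zc @ y) / (zc @ zc)    # reduced-form slope
slope_x_on_z = (zc @ x) / (zc @ zc)    # first-stage slope
beta1_ratio = slope_y_on_z / slope_x_on_z

# Same quantity from the just-identified IV matrix formula
Z = np.column_stack([np.ones(n), z])
X = np.column_stack([np.ones(n), x])
beta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)

print(beta1_ratio, beta_iv[1])
```

The ratio makes the role of instrument strength visible: when the first-stage slope is close to zero, small sampling noise in the denominator produces large swings in the estimate.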