Do remittances affect the consumption pattern of the Filipino households?

## Objectives

The objective of this paper is to formulate structural models that illustrate changes in the consumption patterns of Filipino households. Using an advanced econometric approach, we aim to determine whether households that receive remittances consume differently from those whose income comes only from domestic sources.

## Review of Related Literature

Several studies have examined household consumption patterns. Taylor and Mora (2006) studied the effect of migration on the expenditures of rural households in Mexico. They concluded that remittances have positive effects on total expenditures and investment, and that as the remittances of a rural household increase, the proportion of income spent on consumption decreases (Taylor & Mora, 2006). Parinduri and Thangavelu (2008) used the Indonesia Family Life Survey to observe the effect of remittances on the consumption patterns of Indonesian households, applying matching and difference-in-difference matching estimators. They found that remittances do not improve the living standards of households, nor do they have an effect on economic development, which they measured through education and medical expenditures. Their major finding is that most Indonesian households spent remittances on luxury goods such as houses and jewelry (Parinduri & Thangavelu, 2008). Building on that study, we intend to observe household consumption patterns based not only on remittances but also on other sources of income. In addition, instead of looking at economic development, we look at the goods that households normally consume and check whether the consumption patterns of the selected households indeed change.

## Theoretical Framework

## Engel's Law

## Methodology and Data

Our main concern in this section is to find a way to observe the consumption patterns of Filipino households. To do that, we looked for a dataset that captures the relationship between consumption and income. Among the datasets available in the country, the Family Income and Expenditure Survey (FIES) best suits our study. The dataset lists the goods consumed by households during a specific year, and it also records the sources of income of each household. By examining the relationship between consumption and income, we can observe how the consumption behavior of Filipino households depends on the income that they receive.

Because the latest data were not accessible, we settled for the 2003 edition. With this data, we can observe the impact of the different sources of income on the kinds of goods that Filipino families consume, using an advanced econometric approach called the simultaneous equation model (SEM).

After acquiring the dataset, we formulate the structural equations that illustrate the consumption patterns. In this paper, we have formulated four equations. The first is based on Engel's Law, which again states that when an individual's income increases, the proportion of income spent on consumption decreases (Engel's Law, n.d.). The other three equations describe the different sources of income, namely wages, domestic source, and foreign source; for these we have used other studies conducted by (SOURCE) to see which factors affect or determine the different sources of income.

After formulating the equations, we decided to use the log-log model for the estimation, because our study aims to observe the income elasticity of the different goods. In a log-log model, the elasticity of each consumption good can be read directly from its estimated coefficient. Another reason for choosing the log-log model is the limited information about domestic and foreign sources of income in the FIES data. Several households in the data either do not receive income from a domestic or foreign source, or the data gatherers failed to obtain this information from the respondents. Because the logarithm of a zero or missing value is undefined, those observations drop out of the estimation, so the results are not affected by households that do not receive income from either a domestic or a foreign source.
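The two properties of the log-log model mentioned above can be illustrated with a minimal sketch. The household figures below are hypothetical (constructed so that spending equals 3 times the square root of income) and are not FIES data; the function name `loglog_elasticity` is likewise an assumption made for this example.

```python
import math

def loglog_elasticity(incomes, spending):
    """Slope of ln(spending) on ln(income), using only strictly positive pairs.

    Observations with zero income cannot be logged, so they are dropped,
    which mirrors how a log-log regression excludes them.
    """
    pairs = [(math.log(x), math.log(y))
             for x, y in zip(incomes, spending) if x > 0 and y > 0]
    n = len(pairs)
    mean_x = sum(lx for lx, _ in pairs) / n
    mean_y = sum(ly for _, ly in pairs) / n
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in pairs)
    sxx = sum((lx - mean_x) ** 2 for lx, _ in pairs)
    return sxy / sxx, n

# Hypothetical households: spending = 3 * income**0.5; two receive no such income.
incomes = [0, 100, 400, 900, 0, 1600, 2500]
spending = [50, 30, 60, 90, 40, 120, 150]

elasticity, used = loglog_elasticity(incomes, spending)
print(round(elasticity, 3), used)
```

The slope is the elasticity itself: a one percent rise in income goes with a half percent rise in spending in this constructed example, and only the five households with positive income enter the fit.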

Having given the reasons for the construction of the model, we now turn to the three consumption goods that we observe: total food expenditures, total non-food expenditures, and tobacco-alcohol consumption.

## Model 1: Food Consumption

## Equation 1:

## Equation 2:

## Equation 3:

## Equation 4:

Where: food = total food expenditures

Condo = domestic source of income

Conab = foreign source of income

Wage = wages or salaries of the household

Wsag = wages or salaries from agricultural activities

Wsnag = wages or salaries from non-agricultural activities

S1021_age = household head age

S1041_hgc = household head highest grade completed

S1101_employed = total number of family employed with pay

Lc10_conwr = contractual worker indicator
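For reference, the four structural equations can be sketched in log-log form as follows. This is a reconstruction from the variable list above and from the variables that appear in the estimation tables; the coefficient symbols ($\alpha$, $\beta$, $\gamma$, $\delta$) and error terms ($u_1,\dots,u_4$) are notational assumptions, and the household characteristics are assumed to enter in levels because they are reported without the "ln" prefix.

$$
\begin{aligned}
\ln food &= \alpha_0 + \alpha_1 \ln wages + \alpha_2 \ln condo + \alpha_3 \ln conab + u_1 \\
\ln wages &= \beta_0 + \beta_1 \ln wsag + \beta_2 \ln wsnag + u_2 \\
\ln condo &= \gamma_0 + \gamma_1\, age + \gamma_2\, hgc + \gamma_3\, employed + \gamma_4\, conwr + u_3 \\
\ln conab &= \delta_0 + \delta_1\, age + \delta_2\, hgc + \delta_3\, employed + \delta_4\, conwr + u_4
\end{aligned}
$$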

To observe the consumption patterns of Filipino households across the different sources of income, we modify the first equation of the model by replacing one good with another while maintaining the same structural form. In the first model, we have chosen food expenditure as the consumption good. Later on, we replace food expenditure with non-food expenditure and then with tobacco-alcohol consumption. We can do this because all consumption goods are affected by income, and we have chosen the income sources based on their availability in the 2003 FIES data.

## A-priori expectation

Given the interrelationship of the equations, we have to solve them simultaneously to estimate the unknown parameters. Before we can use the simultaneous equation model (SEM) approach, we must work through several identification checks to know whether SEM is an appropriate method. According to Gujarati and Porter (2009), the process consists of the following tests: (a) the order and rank conditions, (b) the Hausman specification test, also known as the simultaneity test, and (c) the exogeneity test.

## Identification Problem

## Order and rank condition

Before we proceed with the order and rank condition, we must first define the different variables that we will be using in order to test whether the equations are under-identified, exactly identified or over-identified.

Legend:

M = number of endogenous variables in the model

m = number of endogenous variables in the equation

K = number of exogenous/predetermined variables in the model

k = number of exogenous/predetermined variables in the equation

## Order Condition

The order condition is a necessary but not sufficient condition for identification (Gujarati and Porter, 2009). It compares the number of exogenous/predetermined variables excluded from a given equation (K-k) with the number of endogenous variables in that equation less one (m-1). There are three possible cases. If K-k < m-1, the equation is under-identified. If K-k = m-1, the equation is exactly identified. If K-k > m-1, the equation is over-identified. According to Gujarati and Porter (2009), K-k must be greater than or equal to m-1 for the order condition to be satisfied.

In the first model, there are four endogenous variables, namely lnfood, lnwages, lncondo, and lnconab (M=4). There are also six exogenous variables in the model (K=6): lnwsag, lnwsnag, s1021_age, s1041_hgc, s1101_employed, and lc10_conwr. The order condition for the food consumption model is shown below:

| Equation | K-k | m-1 | Conclusion |
|---|---|---|---|
| lnfood | 6 | 3 | Over |
| lnwages | 4 | 0 | Over |
| lncondo | 2 | 0 | Over |
| lnconab | 2 | 0 | Over |

In the first model, all the equations are over-identified, because K-k > m-1 in each of them. Under the order condition, we therefore conclude that the model is identified. However, the order condition alone is not sufficient to establish identification, which is why the rank condition must also be satisfied before we can proceed to estimation.
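The counts in the order-condition table can be reproduced mechanically. The sketch below assumes the equation structure implied by the variable list and the estimation tables (the dictionary layout and function name are illustrative choices, and the constant is not counted among the exogenous variables).

```python
# Endogenous and exogenous variables appearing in each equation of Model 1.
equations = {
    "lnfood":  {"endog": ["lnfood", "lnwages", "lncondo", "lnconab"], "exog": []},
    "lnwages": {"endog": ["lnwages"], "exog": ["lnwsag", "lnwsnag"]},
    "lncondo": {"endog": ["lncondo"],
                "exog": ["s1021_age", "s1041_hgc", "s1101_employed", "lc10_conwr"]},
    "lnconab": {"endog": ["lnconab"],
                "exog": ["s1021_age", "s1041_hgc", "s1101_employed", "lc10_conwr"]},
}

def order_condition(equations):
    """Return {equation: (K-k, m-1, status)} for every equation in the model."""
    all_exog = {x for eq in equations.values() for x in eq["exog"]}
    K = len(all_exog)                      # exogenous variables in the model
    results = {}
    for name, eq in equations.items():
        excluded = K - len(eq["exog"])     # K - k
        needed = len(eq["endog"]) - 1      # m - 1
        if excluded < needed:
            status = "under"
        elif excluded == needed:
            status = "exact"
        else:
            status = "over"
        results[name] = (excluded, needed, status)
    return results

results = order_condition(equations)
for name, row in results.items():
    print(name, row)
```

Swapping `lnfood` for the non-food or tobacco-alcohol variable leaves the counts unchanged, since only the name of the first endogenous variable differs across the three models.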

## Rank Condition

The rank condition is a necessary and sufficient condition for identification. To satisfy it, at least one nonzero determinant of order (M-1) × (M-1) "can be constructed from the coefficients of the variables excluded from that particular equation but included in the other equations of the model" (Gujarati and Porter, 2009).

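| Eq. | food | wages | condo | conab | 1 | wssag | wsnag | hh_age | hh_hgc | employed | conwr |
|---|---|---|---|---|---|---|---|---|---|---|---|
| lnfood | 1 | x | x | x | x | 0 | 0 | 0 | 0 | 0 | 0 |
| lnwages | 0 | 1 | 0 | 0 | x | x | x | 0 | 0 | 0 | 0 |
| lncondo | 0 | 0 | 1 | 0 | x | 0 | 0 | x | x | x | x |
| lnconab | 0 | 0 | 0 | 1 | x | 0 | 0 | x | x | x | x |

Columns food to conab are the endogenous variables (Ys); columns 1 to conwr are the constant and the exogenous variables (Xs). An x marks a nonzero structural coefficient.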

We simplify the variable notation in the table; the names are the same as in the model, except that the "ln" prefix is dropped and some descriptions are shortened. We can observe that for each equation at least one (M-1) × (M-1) submatrix, here 3 × 3, has a nonzero determinant; therefore the rank condition is satisfied. We can now proceed to the other identification tests.
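The rank check can also be carried out numerically. The sketch below is an illustration under stated assumptions: the nonzero entries are the negatives of the rounded 3SLS food-model estimates reported later in the paper, used here only as stand-in values for the structural coefficients, and the helper names `matrix_rank` and `rank_condition` are hypothetical.

```python
def matrix_rank(rows, tol=1e-12):
    """Rank of a small matrix by Gaussian elimination with partial pivoting."""
    m = [list(r) for r in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        if rank == len(m):
            break
        pivot = max(range(rank, len(m)), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue                      # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Columns: food, wages, condo, conab, const, wsag, wsnag, age, hgc, employed, conwr
# Each row is one equation written as  y - (right-hand side) = u.
A = [
    [1, -0.222, -0.025, -0.210, -6.384, 0, 0, 0, 0, 0, 0],               # lnfood
    [0, 1, 0, 0, -2.143, -0.356, -0.520, 0, 0, 0, 0],                    # lnwages
    [0, 0, 1, 0, -7.666, 0, 0, -0.00005, -0.0345, 0.011, -0.173],        # lncondo
    [0, 0, 0, 1, -9.635, 0, 0, -0.0026, -0.0213, -0.153, 0.485],         # lnconab
]

def rank_condition(A):
    """For each equation, test whether the submatrix built from the columns it
    excludes (rows of the other equations) has rank M - 1."""
    M = len(A)
    satisfied = []
    for i, row in enumerate(A):
        excluded = [j for j, a in enumerate(row) if a == 0]
        sub = [[A[r][j] for j in excluded] for r in range(M) if r != i]
        satisfied.append(matrix_rank(sub) == M - 1)
    return satisfied

identified = rank_condition(A)
print(identified)
```

A submatrix of rank M-1 = 3 is equivalent to the existence of at least one nonzero 3 × 3 determinant, which is the form of the condition quoted from Gujarati and Porter (2009).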

## Hausman specification test

The Hausman specification test checks whether the equations exhibit a simultaneity problem. According to Gujarati and Porter (2009), if there is no simultaneity problem, then OLS is BLUE (best linear unbiased estimator). If there is a simultaneity problem, OLS is not BLUE, because the estimates will be biased and inconsistent. In that case, we have to use the other SEM estimation techniques to estimate the given equations.

The Hausman specification test involves the following process. First, we regress an endogenous variable on all of the exogenous/predetermined variables in the system and obtain the residual from that regression (uhat). Second, we regress the endogenous variable on the other endogenous variables plus uhat. If uhat is statistically significant, we have evidence to reject the null hypothesis that there is no simultaneity bias in the model. If it is insignificant, we have no evidence to reject the null hypothesis, and there is no simultaneity problem. A variable that exhibits no simultaneity bias should not be treated as an endogenous variable (Gujarati and Porter, 2009).

Dependent variable: lnwages

| Independent variables | P-values |
|---|---|
| lncondo | 0.370 |
| lnconab | 0.014 |
| uhat | 0.000 |

For the simultaneity test in the first model, we follow the steps of the Hausman specification test. The p-value of uhat in this regression is 0.000. This means that the null hypothesis is rejected and simultaneity bias exists in the first model; therefore we should use estimation techniques other than OLS to produce consistent estimates.
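The two-step procedure can be sketched on simulated data. Everything in the block below is an illustrative assumption: the data are generated rather than taken from FIES, the variable names are hypothetical, and a t-ratio above 1.96 in absolute value stands in for the p-value that Stata reports.

```python
import math
import random

def ols(y, X):
    """OLS with an intercept. X is a list of regressor columns.
    Returns (coefficients, standard errors, residuals); index 0 is the intercept."""
    n = len(y)
    cols = [[1.0] * n] + [list(c) for c in X]
    k = len(cols)
    # Build [X'X | I] and reduce it to [I | (X'X)^-1] by Gauss-Jordan elimination.
    aug = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(k)]
           + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(aug[r][i]))
        aug[i], aug[p] = aug[p], aug[i]
        piv = aug[i][i]
        aug[i] = [v / piv for v in aug[i]]
        for r in range(k):
            if r != i:
                f = aug[r][i]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[i])]
    inv = [row[k:] for row in aug]
    xty = [sum(a * b for a, b in zip(c, y)) for c in cols]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [y[t] - sum(beta[j] * cols[j][t] for j in range(k)) for t in range(n)]
    s2 = sum(e * e for e in resid) / (n - k)
    se = [math.sqrt(s2 * inv[i][i]) for i in range(k)]
    return beta, se, resid

# Simulated system in which x is endogenous: it shares the shock e with y.
random.seed(0)
n = 500
z = [random.gauss(0, 1) for _ in range(n)]      # exogenous variable
e = [random.gauss(0, 1) for _ in range(n)]      # common shock
x = [zi + ei for zi, ei in zip(z, e)]           # endogenous regressor
y = [1 + 2 * xi + 0.8 * ei + random.gauss(0, 0.5) for xi, ei in zip(x, e)]

# Step 1: regress the endogenous regressor on the exogenous variable, keep uhat.
_, _, uhat = ols(x, [z])

# Step 2: regress y on the endogenous regressor plus uhat; inspect uhat's t-ratio.
beta, se, _ = ols(y, [x, uhat])
t_uhat = beta[2] / se[2]
simultaneity = abs(t_uhat) > 1.96
print(round(t_uhat, 2), simultaneity)
```

Because the simulated regressor is correlated with the error by construction, uhat comes out strongly significant, which is the same pattern as the 0.000 p-value reported for uhat above.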

## Exogeneity test

After the simultaneity test, we must also test the exogenous/predetermined variables to check whether they are truly exogenous. The process is similar to the Hausman specification test, but instead of regressing the endogenous variables, we regress each exogenous/predetermined variable on uhat. If uhat is statistically significant, we reject the null hypothesis that the variable is truly exogenous. If uhat is insignificant, for example with a p-value of 1.000, we have no evidence to reject the null hypothesis, and we conclude that the corresponding variable is truly exogenous or predetermined.

| Equation | Exogenous variable | Resulting p-value for uhat |
|---|---|---|
| 2nd equation | lnwsag | 1.000 |
| 2nd equation | lnwsnag | 1.000 |
| 3rd equation | s1021_age | 1.000 |
| 3rd equation | s1041_hgc | 1.000 |
| 3rd equation | s1101_employed | 1.000 |
| 3rd equation | lc10_conwr | 1.000 |
| 4th equation | s1021_age | 1.000 |
| 4th equation | s1041_hgc | 1.000 |
| 4th equation | s1101_employed | 1.000 |
| 4th equation | lc10_conwr | 1.000 |

In the table above, each exogenous variable is regressed on the predicted uhat, and the resulting p-values are all 1.000. This means that we have no evidence to reject the hypothesis that these variables are truly exogenous in each of the equations.

## Model 2: Non Food Consumption

## Equation 1:

## Equation 2:

## Equation 3:

## Equation 4:

Where: nonfood = total non food expenditure

In model 2, we replace total food expenditure with total non-food expenditure. Before we can estimate it, this model must also go through the identification process to see whether it is identified. We also test whether the non-food expenditure model exhibits simultaneity bias and whether all of its exogenous variables are truly exogenous.

## Order and Rank Condition

## Order Condition

| Equation | K-k | m-1 | Conclusion |
|---|---|---|---|
| lnnonfood | 6 | 3 | Over |
| lnwages | 4 | 0 | Over |
| lncondo | 2 | 0 | Over |
| lnconab | 2 | 0 | Over |

As with the food consumption model, the non-food consumption model satisfies the order condition. All equations are over-identified; therefore we can say that the model is identified. Again, we must use the rank condition to confirm that the equations are truly identified.

## Rank Condition

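| Eq. | nonfood | wages | condo | conab | 1 | wssag | wsnag | hh_age | hh_hgc | employed | conwr |
|---|---|---|---|---|---|---|---|---|---|---|---|
| lnnonfood | 1 | x | x | x | x | 0 | 0 | 0 | 0 | 0 | 0 |
| lnwages | 0 | 1 | 0 | 0 | x | x | x | 0 | 0 | 0 | 0 |
| lncondo | 0 | 0 | 1 | 0 | x | 0 | 0 | x | x | x | x |
| lnconab | 0 | 0 | 0 | 1 | x | 0 | 0 | x | x | x | x |

Columns nonfood to conab are the endogenous variables (Ys); columns 1 to conwr are the constant and the exogenous variables (Xs). An x marks a nonzero structural coefficient.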

Based on the 3 × 3 submatrices, at least one nonzero determinant exists for each equation; therefore the rank condition is satisfied. This means that the equations are identified.

## Hausman specification test

Dependent variable: lnwages

| Independent variables | P-values |
|---|---|
| lncondo | 0.533 |
| lnconab | 0.011 |
| uhat2 | 0.001 |

For the simultaneity test in model 2, uhat2 is statistically significant, meaning that simultaneity bias exists in the model. Therefore, as in model 1, we must use the SEM estimation techniques to estimate the impact of income on consumption goods.

## Exogeneity test

| Equation | Exogenous variable | Resulting p-value for uhat2 |
|---|---|---|
| 2nd equation | lnwsag | 1.000 |
| 2nd equation | lnwsnag | 1.000 |
| 3rd equation | s1021_age | 1.000 |
| 3rd equation | s1041_hgc | 1.000 |
| 3rd equation | s1101_employed | 1.000 |
| 3rd equation | lc10_conwr | 1.000 |
| 4th equation | s1021_age | 1.000 |
| 4th equation | s1041_hgc | 1.000 |
| 4th equation | s1101_employed | 1.000 |
| 4th equation | lc10_conwr | 1.000 |

As in the food consumption model, the exogenous variables in the non-food model are truly exogenous, since the resulting p-values for uhat2 are all 1.000.

## Model 3: Tobacco-Alcohol Consumption

## Equation 1:

## Equation 2:

## Equation 3:

## Equation 4:

Where: at = tobacco-alcohol consumption

The process used in model 2 is repeated here in model 3. We now check the identification problems for tobacco-alcohol consumption.

## Order and Rank Condition

## Order Condition

| Equation | K-k | m-1 | Conclusion |
|---|---|---|---|
| lnat | 6 | 3 | Over |
| lnwages | 4 | 0 | Over |
| lncondo | 2 | 0 | Over |
| lnconab | 2 | 0 | Over |

The order condition is satisfied in model 3, since all of the equations are over-identified. We now proceed to the rank condition to check whether the equations are indeed identified.

## Rank Condition

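| Eq. | at | wages | condo | conab | 1 | wssag | wsnag | hh_age | hh_hgc | employed | conwr |
|---|---|---|---|---|---|---|---|---|---|---|---|
| lnat | 1 | x | x | x | x | 0 | 0 | 0 | 0 | 0 | 0 |
| lnwages | 0 | 1 | 0 | 0 | x | x | x | 0 | 0 | 0 | 0 |
| lncondo | 0 | 0 | 1 | 0 | x | 0 | 0 | x | x | x | x |
| lnconab | 0 | 0 | 0 | 1 | x | 0 | 0 | x | x | x | x |

Columns at to conab are the endogenous variables (Ys); columns 1 to conwr are the constant and the exogenous variables (Xs). An x marks a nonzero structural coefficient.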

The rank condition is satisfied because at least one of the 3 × 3 submatrices has a nonzero determinant.

## Hausman specification test

Dependent variable: lnwages

| Independent variables | P-values |
|---|---|
| lncondo | 0.911 |
| lnconab | 0.063 |
| uhat3 | 0.003 |

In model 3, there is a simultaneity problem because uhat3 is statistically significant. We therefore have evidence to reject the null hypothesis that there is no simultaneity bias in the equation. As with the food and non-food models, we will use the SEM estimation techniques to estimate the unknown parameters.

## Estimation Techniques and Results

## Estimation Techniques

Having addressed the identification problems of the simultaneous equation model, we proceed to the estimation techniques. Gujarati and Porter (2009) discuss three techniques for estimating an SEM: ordinary least squares (OLS), indirect least squares (ILS), and two-stage least squares (2SLS). OLS is appropriate for recursive, triangular, or causal models (Gujarati and Porter, 2009). ILS works with the reduced form of the simultaneous equations, in which each equation contains only one endogenous variable, expressed in terms of all the exogenous/predetermined variables in the model. The reduced form is estimated by OLS, and the method is best suited to exactly identified models (Gujarati and Porter, 2009). Lastly, 2SLS replaces each endogenous regressor with its fitted value from a first-stage regression on all the exogenous/predetermined variables, and then estimates the structural equation by OLS. Unlike ILS, 2SLS can be used to estimate both exactly identified and over-identified equations (Gujarati and Porter, 2009).

The three approaches discussed by Gujarati and Porter (2009) are all single-equation approaches. If there are CLRM violations such as autocorrelation and heteroscedasticity in the models, we must use the system approach, particularly three-stage least squares (3SLS), to correct these violations. The main drawback of the 3SLS method is that an error in one equation will affect the other equations.
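The two stages of 2SLS can be sketched with one endogenous regressor and one exogenous variable. This is a simplified illustration on simulated data, not the four-equation system estimated in this paper; the names and the true coefficient of 0.5 are assumptions chosen for the example.

```python
import random

def simple_ols(y, x):
    """Intercept and slope from regressing y on a single regressor x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope

# Simulated data: x is correlated with the error e, so OLS of y on x is biased.
random.seed(1)
n = 2000
z = [random.gauss(0, 1) for _ in range(n)]      # exogenous variable
e = [random.gauss(0, 1) for _ in range(n)]      # structural error
x = [zi + ei for zi, ei in zip(z, e)]           # endogenous regressor
y = [1 + 0.5 * xi + ei for xi, ei in zip(x, e)] # true coefficient on x is 0.5

# Plain OLS: picks up the correlation between x and e.
_, b_ols = simple_ols(y, x)

# Stage 1: regress the endogenous regressor on the exogenous variable.
a1, b1 = simple_ols(x, z)
x_hat = [a1 + b1 * zi for zi in z]

# Stage 2: regress y on the fitted values from stage 1.
# Note: standard errors from a naive second-stage OLS are not valid 2SLS
# standard errors; dedicated routines correct them.
_, b_2sls = simple_ols(y, x_hat)

print(round(b_ols, 2), round(b_2sls, 2))
```

In this design OLS converges to about 1.0 rather than 0.5, while the second-stage slope stays close to the true value, which illustrates why OLS is set aside once simultaneity is detected.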

## Ordinary Least Squares (OLS)

Since all three models suffer from simultaneity bias, we will not use OLS in this paper. If we used OLS to estimate an equation in which simultaneity bias exists, the results would be biased and inconsistent. Therefore, OLS is not a good estimator for the three models.

## Indirect Least Squares (ILS)

## Food consumption model reduced form:

## Nonfood model reduced form:

## Tobacco-Alcohol model reduced form:

We will not estimate the ILS coefficients, because our main goal is to observe the relationship between consumption goods and the different sources of income, not the other determinants of those income sources. ILS does not directly yield standard errors for the structural coefficients, which makes inference on them difficult. In addition, all of our equations are over-identified, so ILS is an inappropriate method for estimating the coefficients.

## Two-stage least squares (2SLS)

Entries are coefficients with p-values in parentheses.

| Consumption Goods | Food (948 obs) | Non Food (1078 obs) | Tobacco-Alcohol (634 obs) |
|---|---|---|---|
| **1st Equation** | | | |
| constant | 6.428484 (0.000) | 1.401963 (0.070) | 12.94298 (0.001) |
| lnwages | 0.2235283 (0.000) | 0.2880426 (0.000) | 0.7781965 (0.000) |
| lncondo | 0.0223739 (0.622) | 0.2036453 (0.013) | -1.47202 (0.000) |
| lnconab | 0.205797 (0.001) | 0.5110999 (0.000) | 0.6098058 (0.121) |
| **2nd Eq. lnwages** | | | |
| constant | 2.122649 (0.000) | 2.122649 (0.000) | 1.884011 (0.000) |
| lnwsag | 0.3611279 (0.000) | 0.3611279 (0.000) | 0.42199 (0.000) |
| lnwsnag | 0.5175117 (0.000) | 0.5175117 (0.000) | 0.483135 (0.000) |
| **3rd Eq. lncondo** | | | |
| constant | 7.75861 (0.000) | 7.75861 (0.000) | 7.887869 (0.000) |
| s1021_age | -0.0003422 (0.903) | -0.0003422 (0.903) | 0.0014345 (0.720) |
| s1041_hgc | 0.0346237 (0.000) | 0.0346237 (0.000) | 0.1302147 (0.000) |
| s1101_employed | -0.023387 (0.450) | -0.023387 (0.450) | -0.0601213 (0.111) |
| lc10conwr | 0.1583353 (0.345) | 0.1583353 (0.345) | 0.0871853 (0.710) |
| **4th Eq. lnconab** | | | |
| constant | 10.39914 (0.000) | 10.39914 (0.000) | 9.947326 (0.000) |
| s1021_age | 0.004519 (0.169) | 0.004519 (0.169) | 0.0145833 (0.002) |
| s1041_hgc | 0.0210221 (0.000) | 0.0210221 (0.000) | 0.150857 (0.000) |
| s1101_employed | 0.0420871 (0.245) | 0.0420871 (0.245) | 0.0273189 (0.541) |
| lc10conwr | -0.6848394 (0.000) | -0.6848394 (0.000) | -0.7780885 (0.005) |

Since FIES is cross-sectional data, the model may be exposed to multicollinearity and heteroscedasticity. As shown in Appendix 1, under the CLRM violations, there is no multicollinearity in the equations, but heteroscedasticity is present in three out of the four equations in the model. To correct for the heteroscedasticity problem, we estimate the simultaneous equations using the three-stage least squares method, which is considered a full-information approach.

## Three-stage least squares (3SLS)

Entries are coefficients with p-values in parentheses.

| Consumption Goods | Food (948 obs) | Non Food (1078 obs) | Tobacco-Alcohol (634 obs) |
|---|---|---|---|
| **1st Equation** | | | |
| constant | 6.383871 (0.000) | 0.7926094 (0.289) | 18.63624 (0.000) |
| lnwages | 0.2224267 (0.000) | 0.2831109 (0.000) | 0.7374008 (0.000) |
| lncondo | 0.0245077 (0.582) | 0.3151916 (0.000) | -2.405262 (0.000) |
| lnconab | 0.2101956 (0.001) | 0.4810778 (0.000) | 0.9024638 (0.020) |
| **2nd Eq. lnwages** | | | |
| constant | 2.142826 (0.000) | 2.126479 (0.000) | 1.895235 (0.000) |
| lnwsag | 0.3560053 (0.000) | 0.3594587 (0.000) | 0.419183 (0.000) |
| lnwsnag | 0.5203181 (0.000) | 0.5187091 (0.000) | 0.4846674 (0.000) |
| **3rd Eq. lncondo** | | | |
| constant | 7.66644 (0.000) | 7.420188 (0.000) | 8.252266 (0.000) |
| s1021_age | 0.0000462 (0.987) | -0.0005333 (0.840) | 0.0042572 (0.224) |
| s1041_hgc | 0.0344578 (0.000) | 0.0327889 (0.000) | 0.0972984 (0.002) |
| s1101_employed | -0.0109756 (0.720) | 0.030168 (0.302) | -0.0811008 (0.009) |
| lc10conwr | 0.173369 (0.296) | 0.234941 (0.151) | -0.0362562 (0.860) |
| **4th Eq. lnconab** | | | |
| constant | 9.635422 (0.000) | 9.760654 (0.000) | 9.899007 (0.000) |
| s1021_age | 0.0025551 (0.394) | 0.0034051 (0.195) | 0.0140427 (0.003) |
| s1041_hgc | 0.0212975 (0.000) | 0.0171248 (0.000) | 0.1589354 (0.000) |
| s1101_employed | 0.1534522 (0.000) | 0.1464836 (0.000) | 0.0291422 (0.510) |
| lc10conwr | -0.484862 (0.011) | -0.5302148 (0.004) | -0.761339 (0.006) |

With 3SLS, the models are corrected for the CLRM violations identified above. The table shown above is therefore the final estimated model, and we can now interpret the results equation by equation.

## Check for equality and unit elasticity

As indicated in the last part of the appendices, we also check whether the coefficients of lnwages and lnconab in the food consumption equation are equal. We used the test command in STATA and looked at its p-value. The resulting p-value is 0.8614, meaning that we have no evidence to reject the null hypothesis that the two coefficients are equal. We applied the same process to lnwages and lncondo in the non-food consumption equation; the resulting p-value is 0.6846, so we cannot reject that these two coefficients are equal either. Aside from the checks for equality, we also check whether the elasticity of tobacco-alcohol consumption with respect to lnconab is equal to 1. The resulting p-value is 0.8007, which means that we cannot reject the hypothesis that this elasticity is 1, that is, unit elastic.
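For readers unfamiliar with these checks, the statistics involved can be written in generic form. The notation below is an assumption made for exposition ($\hat\beta_1$, $\hat\beta_2$ for two estimated coefficients in the same equation); the paper's own test output is in the appendices.

$$
H_0:\ \beta_1 = \beta_2, \qquad
t = \frac{\hat\beta_1 - \hat\beta_2}{\sqrt{\widehat{\mathrm{Var}}(\hat\beta_1) + \widehat{\mathrm{Var}}(\hat\beta_2) - 2\,\widehat{\mathrm{Cov}}(\hat\beta_1,\hat\beta_2)}}
$$

$$
H_0:\ \beta = 1, \qquad
t = \frac{\hat\beta - 1}{\mathrm{se}(\hat\beta)}
$$

A Wald statistic for a single linear restriction is the square of the corresponding $t$ ratio, and a large p-value means that the restriction cannot be rejected, not that it has been proven.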

## Results

## Model 1 – Food Consumption

In the first model, the total food expenditure model, the domestic source of income in the 1st equation is statistically insignificant, so it would be meaningless to interpret the result for that variable. The coefficients of wages and foreign source of income are very similar: for every one percent increase in wages and in foreign source of income, food consumption increases by 0.22 and 0.21 percent, respectively. The results are consistent with Engel's Law of food consumption, which states that the proportion of food expenditure decreases as an individual's income increases.
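The step from an elasticity below one to a falling food share can be made explicit with a short derivation. It assumes, for illustration only, that a single income source $Y$ changes while the others are held fixed, and it writes $F$ for food expenditure, $s$ for the food share of that income, and $\varepsilon$ for the estimated elasticity; these symbols are not part of the estimated model.

$$
s = \frac{F}{Y}
\quad\Longrightarrow\quad
\ln s = \ln F - \ln Y
\quad\Longrightarrow\quad
\frac{d \ln s}{d \ln Y} = \frac{d \ln F}{d \ln Y} - 1 = \varepsilon - 1
$$

With $\varepsilon \approx 0.22$ for wages, $\varepsilon - 1 \approx -0.78$: under these assumptions, a one percent increase in wages goes with roughly a 0.78 percent fall in the food share, since any $\varepsilon < 1$ makes the share decline as income rises.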

For the 2nd equation, the wage equation, the result shows that the impact of non-agricultural activities is greater than that of agricultural activities. This is consistent with our a-priori expectation that one would have a larger impact than the other. In reality, non-agricultural activities result in higher income because of the high value-added products that they produce. The higher the value added of the work, the higher the chances that the wages or salaries received will also be higher.

For the 3rd and 4th equations, which are identical except for the source of income being explained, the results show that only the highest grade completed is statistically significant in the 3rd equation, while in the 4th equation the household head's age is the only variable that is statistically insignificant. For the domestic source of income, we can observe that people who earn a larger share of the wages or salaries in a company typically have higher educational attainment, and the result of the 3rd equation may be attributed to that factor. The same explanation applies to the highest grade completed by the household head in the 4th equation. The total number of family members employed with pay has a positive relationship with foreign income, simply because a family with more members working and receiving salaries will have a larger cumulative income than a family with fewer members working with pay. For the last variable in the 4th equation, the contract worker dummy, the result shows that a contract worker generally receives lower wages than a regular employee. This is because contractual workers are hired for a limited period and for short-term needs, and companies usually pay lower wages to such short-term workers.

## Model 2 – Non food consumption

For the 2nd model, the non-food consumption model, all the variables in the 1st equation are statistically significant. The coefficients of wages and domestic source of income are similar, but both differ from the coefficient of the foreign source of income, which is higher. The higher coefficient means that non-food consumption is more sensitive to the foreign source of income than to the other two variables, wages and domestic income. We can see in the result that a ho