Determinants of Poverty of Households: Semi parametric Analysis of Demographic and Health Survey Data from Rwanda

Abstract: The main objective of this research is to identify the key determinants of poverty of households in Rwanda based on an asset index and semi parametric modeling. An asset index is first established for each household, and the generalized additive mixed model is then used to ascertain the key determinants of poverty of households in Rwanda. The semi parametric generalized additive mixed model allowed us to study the effects of nonlinear predictors nonparametrically and of categorical predictors parametrically. Using the Rwanda Demographic and Health Survey (2010), the characteristics of households and household heads are considered. Our findings show that the level of education, gender and age of the household head, region (province), size of the household (number of household members) and place of residence (urban or rural) are significant predictors of poverty of households in Rwanda.


Introduction
The measurement of the socio-economic status of households is essential for health research, program targeting, and policy monitoring and evaluation. The measurement and analysis of poverty in developing countries has classically been built on income and consumption. However, collecting data on income and expenditure can be time consuming and expensive (Vyas & Kumaranayake, 2006). Furthermore, in developing countries, the measurement of consumption and expenditure is fraught with problems such as recall error and reluctance to reveal information. Moreover, prices are likely to vary considerably across time and areas, requiring complex adjustments of the expenditure figures to reflect these price differences (Deaton & Zaidi, 1999; Lokosang et al., 2014). Sahn and Stifel (2003) studied the theoretical framework underpinning household income or expenditure as a tool for classifying socio-economic status (SES) in developing countries. Several researchers (Filmer & Pritchett, 1998; Filmer & Pritchett, 2001; Montgomery et al., 2000; Lokosang et al., 2014) used Principal Component Analysis (PCA) to create an asset index from demographic and health survey variables such as durable goods, source of drinking water, toilet facility and housing quality, and used this index to describe household welfare instead of a household's income or expenditure.
Methods other than PCA are also used in the literature to compute the weights of an asset index. For instance, multiple correspondence analysis is similar to PCA but is used for discrete data (Galbraith et al., 2002; Booysen et al., 2008). Sahn and Stifel (2003) utilized factor analysis, which has a similar aim to PCA in that it expresses a set of variables as a smaller number of indices or factors. The difference between PCA and factor analysis is that, while there are no assumptions associated with PCA, the factors derived from factor analysis are assumed to represent the underlying processes that result in the correlation between the variables. The main problem of the factor analysis method is that not all the assets show a linear relationship with living standards. PCA is the most widely used method because it is computationally easier, it uses the type of data that can be easily collected in household surveys (Vyas & Kumaranayake, 2006), and it uses all of the variables in reducing the dimensionality of the data. PCA, like other statistical methods, has advantages and disadvantages. The main challenge of PCA-based indices is to ensure that the range of asset variables used is broad enough to avoid problems of clumping and truncation (Habyarimana et al., 2015). Once these specific problems are identified, one solution is to include additional variables that capture inequalities between households (McKenzie, 2005).
Previous studies of the determinants of poverty of households used consumption or expenditure and/or parametric regression as their primary analysis (Jalan & Ravallion, 2000; Mok, Gan, & Sanyal, 2007; Muller, 2002; Rodriguez & Smith, 1994; Achia et al., 2010). However, these parametric models may be too inflexible to model complicated relationships between the poverty index and its determinants when the functional form is not known. For this reason, it is crucial to assess the determinants of poverty of households with flexible models that let the data determine the most appropriate functional forms. The combination of parametric and nonparametric methods is more powerful than any single method in many applications. Therefore, the current study computes an asset index for each household in Rwanda using PCA (Habyarimana et al., 2015) and then uses a semi parametric regression model to identify the key determinants of poverty of households in Rwanda. There is no study in the literature that uses the asset index from the RDHS (2010) data and the generalized additive mixed model as primary tools of analysis. The findings of this study aim to help identify the key factors of poverty of households in Rwanda and, hence, to contribute to the effort of the Economic Development and Poverty Reduction Strategy of Rwanda.

Source of data:
Sampling for the Rwanda Demographic and Health Survey (2010) was carried out in two stages. In the first stage, 492 villages were considered, with 12540 households, of which 2009 were urban and 10531 were rural. In the second stage, systematic sampling was used to select households in the selected villages. All women aged 15-49 and all men aged 15-59 were eligible to be interviewed. The survey used several types of questionnaire, including household, men's and women's questionnaires. Only the household data were used to identify the factors determining poverty among households in Rwanda. The household questionnaire covered ownership of durable goods, school attendance, source of drinking water, sanitation facilities, washing places and housing characteristics such as building material.

Principal component and computation of poverty index:
Principal component analysis (PCA) is a multivariate statistical technique that linearly transforms an original set of variables into a considerably smaller set of uncorrelated variables that represents most of the information in the original set (Jolliffe, 2002; Manly, 2004). The coefficients of the principal components are chosen such that the first component accounts for as much of the variation in the original data as possible, subject to the condition that the sum of the squares of the scoring factors (or weights) is equal to 1. The second component is uncorrelated with the first and explains additional, but less, variation than the first component, subject to the same constraint on the scoring factors. Each subsequent component is uncorrelated with the previous components and captures an additional dimension in the data, while explaining a smaller and smaller proportion of the variation of the original variables. The cut-off point for the number of principal components is based on the magnitude of their variances. A graphical method, called a scree diagram, uses the point at which the steepness of the graph changes as the cut-off.
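As an illustration of the first principal component described above, the following minimal sketch standardizes a small set of asset indicators, extracts the first component by power iteration on their correlation matrix, and scores each household. The six households and three indicators are hypothetical, and power iteration is used here only for transparency; it is not claimed to be the routine used in the study.

```python
import math

def standardize(rows):
    """Centre each column and scale it to unit (population) standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def first_component(z, iterations=500):
    """Power iteration on the correlation matrix; returns unit-length weights."""
    n, p = len(z), len(z[0])
    corr = [[sum(r[a] * r[b] for r in z) / n for b in range(p)] for a in range(p)]
    w = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        v = [sum(corr[a][b] * w[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]
    return w

# Hypothetical 0/1 ownership indicators (radio, bicycle, electricity) for six households.
households = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
z = standardize(households)
weights = first_component(z)   # scoring factors; their squares sum to 1
scores = [sum(w * x for w, x in zip(weights, row)) for row in z]
```

In this toy example the household owning all three assets receives the highest score and the households owning none receive the lowest, which is the ordering an asset index is meant to capture.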
The first principal component is used as the household's wealth index (Filmer & Pritchett, 1998; Manly, 2004; Habyarimana et al., 2015). The scoring factors for each indicator from this first principal component are used to generate a household score. For the Rwanda household questionnaire data, which has 53 variables, the PCA, its internal coherence and its robustness were tested, and the corresponding percentages in the wealth quintiles were established (Habyarimana et al., 2015). Based on the asset index, each household is classified as poor or not poor, making the response variable binary. The data on key poverty determinants were compiled from the survey: for household heads, the age, level of education and gender; and for the household itself, the size of the household, the place of residence and the province.
Generalized additive mixed model: Parametric models offer a strong tool for modeling the relationship between the outcome variable and predictor variables when their assumptions are met. However, these models may be too inflexible to model complicated relationships between the outcome variable and the predictor variables in some applications, and the parametric mean assumption may not always be desirable, because suitable functional forms of the predictor variables may not be known in advance and the response variable may depend on the covariates in a complicated manner (Lin & Zhang, 1999). The generalized additive mixed model (GAMM) relaxes the assumptions of normality and linearity inherent in linear regression. The flexibility of nonparametric regression for continuous predictor variables, coupled with linear models for the other predictor variables, offers a way to reveal structure within the data that linear assumptions may miss. This flexibility of GAMM motivated the use of a semi parametric logistic mixed model to assess the determinants of poverty of households in Rwanda.
Model Overview: The generalized additive model (GAM) is a flexible model that allows non-normal error distributions. This enables modelling outcome variables with distributions such as the Poisson and binomial. The generalized additive model extends the generalized linear model (GLM) by permitting the predictor function to comprise a priori unspecified nonlinear functions of some, or all, of the covariates (Hastie & Tibshirani, 1990). Consider a random outcome variable $Y$ and a set of predictor variables $x_1, x_2, \ldots, x_m$. A regression procedure can be viewed as a method for estimating the expected value of $Y$ given the values of $x_1, x_2, \ldots, x_m$. The standard linear regression model assumes a linear form for the conditional expectation as follows:
$$E(Y \mid x_1, x_2, \ldots, x_m) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_m x_m \quad (2)$$
where $\beta_0, \beta_1, \beta_2, \ldots, \beta_m$ are, in general, obtained using least squares methods. GAM generalizes equation (2) by modelling the conditional expectation as
$$g(\mu_j) = X_j \beta + f_1(x_{1j}) + f_2(x_{2j}) + \cdots + f_m(x_{mj}) \quad (3)$$
where $\mu_j = E(Y_j)$; $X_j \beta$ is the linear parametric component of the model, with $X_j$ the $j$th row of the design matrix $X$ associated with the covariates that are modelled linearly, as in GLM; $\beta$ is the parameter vector; $Y_j$ follows an exponential family distribution; $g(\cdot)$ is a known, monotonic, twice differentiable link function; and the $f_i(\cdot)$ are smooth functions. If no linear component is included in model (3), the model is known as nonparametric, whereas a model whose predictor consists of both linear and unspecified nonlinear functions of the predictor variables is referred to as semi parametric. In order to be estimable, the smooth functions have to satisfy standardized conditions such as $E[f_i(x_i)] = 0$, since otherwise there will be free constants in each of the functions (Hastie & Tibshirani, 1990).
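The identifiability condition on the smooth functions can be made concrete with a small numerical sketch. The ages, the quadratic shape of the smooth and the intercept below are hypothetical; the example only shows that a constant removed from a smooth term is absorbed by the intercept, so that an uncentred and a centred smooth give the same fitted probabilities under a logit link.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical ages of household heads.
ages = [22, 30, 35, 48, 60, 71]

def raw_smooth(age):
    """An arbitrary, uncentred smooth effect of age (illustrative shape only)."""
    return 0.002 * (age - 45) ** 2 - 0.3

beta0 = -0.5  # hypothetical intercept
shift = sum(raw_smooth(a) for a in ages) / len(ages)

def centred_smooth(age):
    """Same shape as raw_smooth, but averaging to zero over the observed ages."""
    return raw_smooth(age) - shift

# The constant removed from the smooth is added to the intercept,
# so the fitted probabilities are unchanged.
p_raw = [inv_logit(beta0 + raw_smooth(a)) for a in ages]
p_centred = [inv_logit((beta0 + shift) + centred_smooth(a)) for a in ages]
```

Without the centring constraint, any constant could be moved between the intercept and the smooth term without changing the fit, which is why the constraint is needed for estimation.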
The generalized additive mixed model (GAMM) can be seen as an extension of GAM that incorporates random effects, or as an additive extension of the generalized linear mixed model of Breslow and Clayton (1993) that allows the parametric fixed effects to be modeled nonparametrically, using additive smooth functions in a similar spirit to Hastie and Tibshirani (1990). Lin and Zhang (1999) formulated GAMM as follows:
$$g(\mu_j) = \beta_0 + f_1(x_{1j}) + f_2(x_{2j}) + \cdots + f_m(x_{mj}) + z_j^T b \quad (4)$$
where $g(\cdot)$ is a monotonic differentiable link function, $x_j = (1, x_{1j}, x_{2j}, \ldots, x_{mj})^T$ contains the $m$ covariates associated with fixed effects, $z_j$ is a $q \times 1$ vector of covariates associated with random effects, $f_i(\cdot)$ is a centred, twice differentiable smooth function, $b$ is the vector of random effects, assumed to be distributed as $N(0, D(\theta))$, and $\theta$ is a $c \times 1$ vector of variance components.
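A fundamental feature of GAMM (4) over GAM is that additive nonparametric functions are used to model covariate effects, while random effects are used to model the correlation between observations (Lin & Zhang, 1999; Wang, 1998). If each $f_i(\cdot)$ is a linear function, then GAMM (4) reduces to the generalized linear mixed model (GLMM) of Breslow and Clayton (1993). For a given variance component $\theta$, the log-likelihood function of $(\beta_0, f_i, \theta)$ is given by (Lin & Zhang, 1999)
$$\exp\{\ell(y; \beta_0, f_1, \ldots, f_m, \theta)\} \propto |D|^{-1/2} \int \exp\Big\{ -\frac{1}{2\phi} \sum_{j=1}^{n} d_j(y; \mu_j) - \frac{1}{2} b^T D^{-1} b \Big\}\, db \quad (5)$$
where $d_j(y; \mu_j)$ is the conditional deviance of observation $j$ and $\phi$ is the scale parameter. The natural cubic smoothing spline estimators of the $f_i(\cdot)$ maximize the penalized log-likelihood
$$\ell(y; \beta_0, f_1, \ldots, f_m, \theta) - \frac{1}{2} \sum_{i=1}^{m} \lambda_i \int_{s_i}^{t_i} f_i''(x)^2\, dx = \ell(y; \beta_0, f_1, \ldots, f_m, \theta) - \frac{1}{2} \sum_{i=1}^{m} \lambda_i f_i^T K_i f_i \quad (6)$$
where $(s_i, t_i)$ is the range of the $i$th predictor variable and the $\lambda_i$ are smoothing parameters that regulate the tradeoff between goodness of fit and smoothness of the estimated function. In addition, $f_i$ is an $r_i \times 1$ unknown vector of the values of $f_i(\cdot)$, calculated at the $r_i$ ordered distinct values of the $x_{ij}$ $(j = 1, 2, \ldots, n)$, and $K_i$ is the smoothing matrix (Green & Silverman, 1993). GAMM, given in (4), can be formulated in matrix form as
$$g(\mu) = 1\beta_0 + N_1 f_1 + N_2 f_2 + \cdots + N_m f_m + Z b \quad (7)$$
where $g(\mu) = [g(\mu_1), g(\mu_2), \ldots, g(\mu_n)]^T$, $1$ is an $n \times 1$ vector of ones, $N_i$ is an $n \times r_i$ matrix such that the $j$th component of $N_i f_i$ is $f_i(x_{ij})$, and $Z = (z_1, z_2, \ldots, z_n)^T$. Numerical integration is needed to evaluate equation (6), so computing the natural cubic smoothing spline estimators of $f_i(\cdot)$ by explicit maximization of equation (6) is sometimes challenging. To solve this problem, Lin and Zhang (1999) proposed the double penalized quasi-likelihood as a viable alternative approach for approximate inference in the model. The smooth function $f_i$ is re-parameterized in terms of $\beta_i$ and $a_i$ in a one-to-one transformation as $f_i = X_i^* \beta_i + B_i a_i$, and the double penalized quasi-likelihood with respect to $(\beta_0, f_i)$ and $b$ is then given by
$$-\frac{1}{2\phi} \sum_{j=1}^{n} d_j(y; \mu_j) - \frac{1}{2} b^T D^{-1} b - \frac{1}{2} a^T \Lambda^{-1} a \quad (8)$$
where $a = (a_1^T, \ldots, a_m^T)^T$, $\Lambda = \mathrm{diag}(\tau_1 I, \ldots, \tau_m I)$ and $\tau_i = 1/\lambda_i$. Note that small values of $\tau = (\tau_1, \ldots, \tau_m)$ correspond to over smoothing (Breslow & Clayton, 1993; Lin & Zhang, 1999).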
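The tradeoff that the smoothing parameters regulate can be illustrated with a deliberately simplified sketch. Two assumptions are made here that are not part of the model above: a Gaussian residual sum of squares stands in for the log-likelihood, and a sum of squared second differences on an evenly spaced grid stands in for the roughness penalty. The data are hypothetical.

```python
def roughness(values):
    """Sum of squared second differences: a discrete stand-in for the integral of f''(x)^2."""
    return sum((values[i + 1] - 2 * values[i] + values[i - 1]) ** 2
               for i in range(1, len(values) - 1))

def penalised_fit(y, fitted, lam):
    """Residual sum of squares plus lam/2 times the roughness of the fitted curve."""
    rss = sum((a - b) ** 2 for a, b in zip(y, fitted))
    return rss + 0.5 * lam * roughness(fitted)

# Hypothetical noisy responses on an evenly spaced grid.
y = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
wiggly = list(y)                          # interpolates the data exactly
straight = [1.0 + i for i in range(6)]    # a straight line through the trend

# With no penalty the interpolating curve has the smaller criterion;
# with a larger smoothing parameter the straight line is preferred.
no_penalty = (penalised_fit(y, wiggly, 0.0), penalised_fit(y, straight, 0.0))
with_penalty = (penalised_fit(y, wiggly, 1.0), penalised_fit(y, straight, 1.0))
```

A straight line has zero roughness, so increasing the smoothing parameter pushes the estimate towards a linear fit, while a value near zero lets the curve follow the data closely.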

Model fitting and interpretation of the results
The main objective was to model the continuous variables nonparametrically and the other covariates parametrically, using the generalized additive mixed model. The estimation procedures discussed for fitting GAMM can be used when fitting the semi parametric logistic mixed model (9). The mgcv package in R was used to fit the model. R offers many options for controlling model smoothness, using smoothers such as cubic smoothing splines, locally weighted running-line smoothers and kernel smoothers (for more details, see Green & Silverman, 1993; Hardle, 1990; Hastie & Tibshirani, 1990; Ruppert et al., 2003). The present study used shrinkage smoothers (splines) to fit model (9). Shrinkage smoothers have several advantages: they help to circumvent the problem of knot placement, they are constructed to smooth any number of covariates, and they are built in such a way that a smooth term can be penalized away altogether (Wood, 2006). The main effects and the possible two-way interaction effects were considered; for each candidate model, the AIC, the inference on the smooth functions and the p-values of the individual smooth terms were examined. Finally, the model with the smallest AIC, a higher value of degrees of freedom and highly statistically significant terms was selected:
$$g(\mu_j) = \beta_0 + \beta_1 x_{1j} + \beta_2 x_{2j} + \beta_3 x_{3j} + \beta_4 x_{4j} + \beta_5 x_{5j} + \beta_6\,(\text{province}_j \times \text{residence}_j) + f_1(\text{age}_j) \times \text{female}_j + f_2(\text{age}_j) \times \text{male}_j + b_0 \quad (9)$$
where $g$ is the logit link function, $x_{1j}, \ldots, x_{5j}$ denote the categorical predictors of household $j$ (level of education and gender of the household head, household size, place of residence and province), the $\beta$'s are parametric regression coefficients, the $f$'s are centred smooth functions and $b_0$ is the random effect, distributed as $N(0, \sigma_b^2)$.
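The AIC comparison used in the model selection can be sketched as follows. The candidate labels, log-likelihoods and effective degrees of freedom are hypothetical placeholders and are not results from the study; the sketch only shows the mechanics of ranking candidate models by AIC.

```python
def aic(log_likelihood, edf):
    """Akaike information criterion: 2 * (effective parameters) - 2 * log-likelihood."""
    return 2.0 * edf - 2.0 * log_likelihood

# Hypothetical candidates: (label, maximised log-likelihood, effective degrees of freedom).
candidates = [
    ("main effects only", -5210.4, 12.0),
    ("main effects + province x residence", -5188.9, 16.0),
    ("interaction + smooth age by gender", -5150.2, 23.8),
]

# Smaller AIC is better, so the first model after sorting is the preferred one.
ranked = sorted(candidates, key=lambda m: aic(m[1], m[2]))
best_label = ranked[0][0]
```

Because the AIC charges two units per effective degree of freedom, a richer model is preferred only when its gain in log-likelihood exceeds the number of extra effective parameters.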
The results from model (9) are presented in Tables 1, 2 and 3 and in Figures 1 and 2. Table 1 shows that the level of education of the household head significantly affects the socio-economic status of the household: the poverty of the household increases as the level of education of the household head decreases. A household whose head has secondary education, primary education or no formal education is 4.1850 (coefficient 1.4315), 14.2008 (coefficient 2.6533) or 24.5154 (coefficient 3.1993) times, respectively, more likely to be poor than a household whose head has tertiary education. A household from an urban area is 0.7703 (coefficient −0.2061) times as likely to be poor as a household from a rural area (Table 1). The size of the household also significantly affects its socio-economic status: a family of four members or fewer is 0.6433 (coefficient −0.4411) times as likely to be poor as a family of five members or more (Table 1).
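Under the logit link, each ratio quoted above is the exponential of the corresponding coefficient (the value given in parentheses). The short sketch below reproduces this conversion for the education and household-size coefficients reported in the text; the dictionary labels are chosen here for illustration only.

```python
import math

# Log-odds coefficients as reported in the text
# (reference categories: tertiary education; five or more members).
coefficients = {
    "secondary education": 1.4315,
    "primary education": 2.6533,
    "no formal education": 3.1993,
    "four members or fewer": -0.4411,
}

# Exponentiating a logit coefficient gives the odds ratio relative to the reference category.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio:.4f}")
```

A positive coefficient gives a ratio above 1 (higher odds of being poor than the reference category), and a negative coefficient gives a ratio below 1.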

Interaction effects fixed parameters:
In this study, both the main parametric effects and the two-way interaction effects are considered. Of particular interest is the interaction between province (region) and place of residence (urban or rural). Figure 1 shows that, in all provinces, a rural household is more likely to be poor than an urban one. The same figure shows a big gap, in terms of poverty, between rural and urban households in Southern province and Western province. This gap is smaller in Kigali and Eastern province.
Approximate smooth function: Figure 2 shows the estimated smooth components for household socio-economic status. The Y-axis represents the contribution of the smooth function to the fitted values for household socio-economic status. In each panel, the smooth curve denotes the trend estimated by GAMM; s is a smooth term and the number in parentheses is the estimated degrees of freedom (edf). The effect of age for female household heads is presented in Figure 2B: the poverty of a household headed by a younger female increases with the age of the household head up to approximately 35 years and then decreases up to approximately 60 years. The test statistic is 2.110 with 3.7492 degrees of freedom and is highly significant (p-value = 0.000184***) against the assumption that the interaction of age and female gender is linearly associated with the socio-economic status of the household. In Figure 2C, the poverty of a household headed by a young male decreases with increasing age up to approximately 30 years, and it also decreases with the increasing age of the head from approximately 35 to 60 years. From 60 years of age, the poverty of a household increases with the age of the household head regardless of the head's gender. The test statistic is 1.484 with 4.0044 degrees of freedom (p-value = 0.004930**) against the assumption that the interaction of age and male gender is linearly associated with the socio-economic status of the household.
The main idea of nonparametric methods is to let the data determine the most appropriate functional forms. Wu and Zhang (2006) argue that nonparametric and parametric regression methods should not be regarded as competitors but as complements of each other. The combination of these two methods is much more powerful than any single model.
This study first constructed an asset index measuring the socio-economic status of each household. Thereafter, the generalized additive mixed model, which relaxes the assumptions of normality and linearity inherent in linear regression, was used. The flexibility of nonparametric regression for continuous covariates, combined with linear models for the other predictor variables, provides a means to uncover structure within the data that may be missed by linear assumptions. The combination of nonparametric and parametric models (semi parametric) was very useful to the study because it accommodates the non-linearity of the continuous covariates, the interaction effects between categorical predictors and continuous covariates, and the linearity of the categorical data. Asset-based measurement of poverty is increasingly being used, but it has some limitations. The asset index from the DHS data set is more reflective of long-run household wealth or living standards (Filmer & Pritchett, 2001). Therefore, in the case of Rwanda, if the interest is in the current resources available to households, an asset index may not be the most appropriate measure.

Conclusion and Recommendations
Based on the asset index from RDHS (2010) and the generalized additive mixed model, this paper identified the key determinants of poverty of households in Rwanda. The results showed that the education level, gender and age of the household head, the size of the household (number of household members), the place of residence and the province are the determinants of poverty of households in Rwanda. The poverty of households headed by a young female was found to increase with the age of the household head (up to approximately 35 years old), but it decreased for households headed by a young male of the same age. This is in line with other findings such as those of Sahn and Stifel (2000), Gounder (2012) and Habyarimana et al. (2015). In previous studies that considered the gender of the household head in parametric (logistic) regression models (Gounder, 2012; Sahn & Stifel, 2000), a household headed by a female was found to be more likely to be poor than a household headed by a male. However, the semi parametric logistic mixed model revealed that this is true only when the household heads are young (up to 35 years old); otherwise, a household headed by a female is slightly better off than a household headed by a male (Figures 2B and 2C). The methodological difference between our model and those of Gounder (2012), Sahn and Stifel (2000) and Habyarimana et al. (2015) is the combination of nonparametric and parametric models (semi parametric) to account for the non-linearity of continuous covariates, the interaction effects between categorical predictors and continuous covariates, and the linearity of categorical data, as previously stated.
A rural household is more likely to be poor than an urban household in all provinces, which is in line with other findings (Achia et al., 2010; Habyarimana et al., 2015). This supports the existing policy of grouped settlements, where people are advised to build their houses in villages known as Imidugudu. In addition, the big gap between rural and urban households in Southern province and Western province (Figure 1) suggests the need for a detailed study to investigate the causes of this gap, possibly leading to a special targeting policy for reducing these differences. It was also found that the poverty of a household decreases as the level of education of the household head increases; this result is consistent with other authors (Achia et al., 2010; Saidatulakmal & Riaz, 2012; Habyarimana et al., 2015). This highlights the necessity for universal education, which would be even more beneficial if it extended to tertiary education.