Nonclassical measurements errors in nonlinear models

Edith Madsen, Ismir Mulalic

    Research output: Contribution to conference > Paper > Research

    Abstract

    Discrete choice models, and in particular logit-type models, play an important role in understanding and quantifying individual or household behavior in relation to transport demand. An example is the choice of travel mode for a given trip under the budget and time restrictions that the individuals of a household face. In this case an important policy parameter is the effect of income (reflecting the household budget) on the choice of travel mode. This paper deals with the consequences of measurement error in income (an explanatory variable) in discrete choice models. Since such error is likely to give misleading estimates of the income effect, it is of interest to investigate the magnitude of the estimation bias and, if possible, to use estimation techniques that take the measurement error problem into account.
    We use data from the Danish National Travel Survey (NTS) and merge it with administrative register data that contain very detailed information about incomes. This gives a unique opportunity to learn about the magnitude and nature of the measurement error in the income reported by respondents in the Danish NTS compared to income from the administrative register (taken as the correct measure). We find that the classical measurement error model (for the logarithm of income) is valid except in the tails of the income distribution, where those with low (high) income tend to over- (under-) report. In addition, we find that the marginal distribution of the measurement errors is symmetric and leptokurtic, and hence has a higher peak around zero and thicker tails than a normal distribution.
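    The measurement error model described above can be written out as a short sketch. The notation is an assumption made here for illustration and is not taken from the paper: $y_i^{*}$ denotes register income, $y_i$ the income reported in the NTS, and $u_i$ the measurement error in logarithms.

```latex
\begin{align*}
\log y_i &= \log y_i^{*} + u_i, \\
\text{classical case:}\quad & \mathrm{E}[u_i \mid y_i^{*}] = 0,
  \quad u_i \text{ independent of } y_i^{*}, \\
\text{tails:}\quad & \mathrm{E}[u_i \mid y_i^{*}] > 0 \text{ for low } y_i^{*},
  \qquad \mathrm{E}[u_i \mid y_i^{*}] < 0 \text{ for high } y_i^{*}, \\
\text{symmetric, leptokurtic:}\quad & \mathrm{E}[u_i^{3}] = 0,
  \qquad \frac{\mathrm{E}[u_i^{4}]}{\left(\mathrm{E}[u_i^{2}]\right)^{2}} > 3 .
\end{align*}
```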
    In a linear regression model where the explanatory variable is measured with error, it is well known that this gives a downward bias in the absolute value of the corresponding regression parameter (attenuation); see Friedman (1957). In non-linear models it is more difficult to obtain an expression for the bias, as it depends on the distribution of the true underlying variable as well as on the error distribution. Chesher (1991) gives some approximations for very general non-linear models, and Stefanski & Carroll (1985) do so for the logistic regression model. Using these results, we find that the bias in logit models can be substantial given the magnitude of measurement error found in income from the NTS.
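    The attenuation effect can be illustrated with a small simulation. This is a minimal sketch under assumed values, not the authors' analysis: the true regressor is standard normal, the error is classical and normal with standard deviation `sigma_u`, and the slope `beta` is 1. None of these numbers come from the NTS. In the linear case the slope shrinks by the standard reliability ratio var(x*) / (var(x*) + var(u)); the logit slope, fitted here by a hand-written Newton-Raphson routine, also shrinks toward zero, although not by exactly that ratio.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a least-squares regression of y on x with an intercept."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def logit_slope(x, y, iterations=25):
    """Slope from a logit of y on x with an intercept (Newton-Raphson)."""
    X = np.column_stack([np.ones_like(x), x])
    coef = np.zeros(2)
    for _ in range(iterations):
        p = 1.0 / (1.0 + np.exp(-(X @ coef)))
        gradient = X.T @ (y - p)
        hessian = (X * (p * (1.0 - p))[:, None]).T @ X
        coef = coef + np.linalg.solve(hessian, gradient)
    return float(coef[1])


def simulate_attenuation(n=100_000, beta=1.0, sigma_u=0.7, seed=0):
    """Compare slopes using the true and the mismeasured regressor.

    All parameter values are hypothetical and chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    x_true = rng.normal(0.0, 1.0, n)              # true regressor, variance 1
    x_obs = x_true + rng.normal(0.0, sigma_u, n)  # classical measurement error
    y_lin = beta * x_true + rng.normal(0.0, 1.0, n)
    prob = 1.0 / (1.0 + np.exp(-beta * x_true))
    y_bin = (rng.uniform(size=n) < prob).astype(float)
    return {
        "reliability": 1.0 / (1.0 + sigma_u**2),
        "ols_true": ols_slope(x_true, y_lin),
        "ols_observed": ols_slope(x_obs, y_lin),
        "logit_true": logit_slope(x_true, y_bin),
        "logit_observed": logit_slope(x_obs, y_bin),
    }


if __name__ == "__main__":
    for name, value in simulate_attenuation().items():
        print(f"{name}: {value:.3f}")
```

    Raising `sigma_u` lowers the reliability ratio and pulls both observed slopes further toward zero.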
    One way to address the measurement error problem is to use instruments. These are additional variables (not already in the model) that are uncorrelated with the measurement error part of income and correlated with the underlying true income. If the data contain information about expenditure on consumption items such as housing or other consumption goods (that do not affect the discrete choice of interest directly), and we find it plausible that any measurement errors in these expenditures are uncorrelated with the measurement error in income, then these will be valid instruments. However, the Danish NTS contains only limited information about such expenditures (we know whether the household owns or rents its home), so finding a good instrument will be difficult. Another possibility is to use technical instruments (understood as functions of the variables already in the model) that exploit the properties of the measurement errors and the model for measured income. Lewbel (1997) shows that if the distribution of the measurement errors is symmetric and the distribution of the underlying true income is skewed, then valid technical instruments exist. We investigate how this IV estimation approach works in theory and illustrate it by simulation studies using the findings about the measurement error model for income from the NTS.
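    The idea of a technical instrument can be sketched in a linear regression, which is simpler than the logit setting studied in the paper. The sketch below is an illustration under assumptions, not the authors' simulation design: the true regressor is exponential (skewed), the measurement error `v` is Laplace (symmetric and leptokurtic), and the instrument `q` is the squared deviation of the mismeasured regressor `z` from its mean. With independent mean-zero errors, the covariance of `q` with the composite regression error is proportional to E[v^3], which is zero under symmetry, while the covariance of `q` with `z` equals the third central moment of the true regressor, which is non-zero under skewness. All distributions and parameter values are hypothetical.

```python
import numpy as np


def ols_slope(z, y):
    """Slope from a least-squares regression of y on z with an intercept."""
    zc = z - z.mean()
    return float(zc @ (y - y.mean()) / (zc @ zc))


def iv_slope(q, z, y):
    """Just-identified IV slope: sample Cov(q, y) / sample Cov(q, z)."""
    qc = q - q.mean()
    return float(qc @ (y - y.mean()) / (qc @ (z - z.mean())))


def simulate_lewbel(n=200_000, slope=1.0, seed=1):
    """Return (OLS slope, IV slope) using a moment-based technical instrument.

    Distributions and parameter values are hypothetical illustrations.
    """
    rng = np.random.default_rng(seed)
    x_true = rng.exponential(1.0, n)   # skewed true regressor
    v = rng.laplace(0.0, 0.5, n)       # symmetric, leptokurtic error
    z = x_true + v                     # mismeasured regressor
    y = 0.5 + slope * x_true + rng.normal(0.0, 1.0, n)
    q = (z - z.mean()) ** 2            # technical instrument built from z alone
    return ols_slope(z, y), iv_slope(q, z, y)


if __name__ == "__main__":
    ols, iv = simulate_lewbel()
    print(f"OLS slope (attenuated): {ols:.3f}")
    print(f"IV slope with technical instrument: {iv:.3f}")
```

    Because the instrument relies on third moments, it tends to need large samples to be precise, and it loses strength as the true regressor becomes closer to symmetric.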
    Original language: English
    Publication date: 2012
    Publication status: Published - 2012
    Event: Kuhmo Nectar Conference and Summer School on Transportation Economics 2012 - Berlin, Germany
    Duration: 18 Jun 2012 to 22 Jun 2012

    Conference

    Conference: Kuhmo Nectar Conference and Summer School on Transportation Economics 2012
    Country: Germany
    City: Berlin
    Period: 18/06/2012 to 22/06/2012

    Bibliographical note

    References:
    Chesher, A. 1991. The effect of measurement error. Biometrika, 78, 451–462.
    Friedman, M. 1957. A Theory of the Consumption Function. Princeton University Press.
    Lewbel, A. 1997. Constructing instruments for regressions with measurement error when no additional data is available, with an application to R&D. Econometrica, 65, 1201-1213.
    Stefanski, L.A. and Carroll, R.J. 1985. Covariate measurement error in logistic regression. Annals of Statistics, 13, 1335–1351.

    Keywords

    • National travel survey
    • Nonclassical measurement error
    • Nonlinear models

    Cite this

    Madsen, E., & Mulalic, I. (2012). Nonclassical measurements errors in nonlinear models. Paper presented at Kuhmo Nectar Conference and Summer School on Transportation Economics 2012, Berlin, Germany.