SURE-Autometrics algorithm for model selection in multiple equations

Bibliographic Details
Main Author: Norhayati, Yusof
Format: Thesis
Language: eng
Published: 2016
Subjects:
Online Access: https://etd.uum.edu.my/6060/1/s92279_01.pdf
https://etd.uum.edu.my/6060/2/s92279_02.pdf
id my-uum-etd.6060
record_format uketd_dc
institution Universiti Utara Malaysia
collection UUM ETD
language eng
advisor Ismail, Suzilah
topic QA Mathematics
description Expert modellers can explain the ambiguous process of model building because of the tacit knowledge they have acquired through research experience. Practitioners, who are usually non-experts and lack statistical knowledge, face difficulties during the modelling process. Hence, an algorithm with step-by-step guidance is beneficial in model building, testing and selection. However, most model selection algorithms, such as Autometrics, concentrate only on single-equation modelling, which limits their application. Thus, this study aims to develop an algorithm for model selection in multiple equations, focusing on the seemingly unrelated regression equations (SURE) model. The algorithm is developed by integrating the SURE model with the Autometrics search strategy; hence, it is named SURE-Autometrics. Its performance is assessed using Monte Carlo simulation experiments based on five specification models, three strengths of disturbance correlation and two sample sizes. Two sets of general unrestricted models (GUMS) are then formulated by adding a number of irrelevant variables to the specification models. Performance is measured by the percentage of cases in which the SURE-Autometrics algorithm is able to eliminate the irrelevant variables from the initial GUMS of two, four and six equations. SURE-Autometrics is also validated using two sets of real data by comparing its forecast error measures with those of five model selection algorithms and three non-algorithm procedures. The findings from the simulation experiments suggest that SURE-Autometrics performs well when the number of equations and the number of relevant variables in the true specification model are minimal. Its application to real data indicates that several models are able to forecast accurately if the data have no quality problems. This automatic model selection algorithm is better than non-algorithm procedures, which require knowledge and extra time. In conclusion, the performance of model selection in multiple equations using SURE-Autometrics depends on data quality and the complexity of the SURE model.
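The simulation design summarised in the description can be illustrated with a minimal Python sketch. It is not the thesis's SURE-Autometrics algorithm: the search step is replaced by a crude one-shot 5% t-test on feasible GLS estimates (Zellner's two-step estimator), and the coefficient values, sample size `n`, correlation `rho`, number of irrelevant variables, and the function names `fgls_sur`, `simulate_once` and `elimination_rate` are all assumptions made for illustration.

```python
import numpy as np

def fgls_sur(ys, Xs):
    """Zellner's two-step feasible GLS for a SUR system.

    ys: list of (n,) response vectors; Xs: list of (n, k_i) regressor matrices.
    Returns the stacked coefficient estimates and their estimated standard errors.
    """
    n = ys[0].shape[0]
    m = len(ys)
    # Step 1: equation-by-equation OLS residuals.
    resid = np.column_stack([
        y - X @ np.linalg.lstsq(X, y, rcond=None)[0] for y, X in zip(ys, Xs)
    ])
    sigma = resid.T @ resid / n  # estimated contemporaneous covariance
    # Step 2: GLS on the stacked system, with Omega^{-1} = Sigma^{-1} kron I_n.
    ks = [X.shape[1] for X in Xs]
    X_big = np.zeros((m * n, sum(ks)))
    col = 0
    for i, X in enumerate(Xs):
        X_big[i * n:(i + 1) * n, col:col + ks[i]] = X
        col += ks[i]
    y_big = np.concatenate(ys)
    omega_inv = np.kron(np.linalg.inv(sigma), np.eye(n))
    cov = np.linalg.inv(X_big.T @ omega_inv @ X_big)
    beta = cov @ X_big.T @ omega_inv @ y_big
    return beta, np.sqrt(np.diag(cov))

def simulate_once(rng, n=100, rho=0.5, n_irrelevant=3):
    """Draw one two-equation system and flag which GUM regressors look significant."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    errors = rng.multivariate_normal(np.zeros(2), cov, size=n)
    ys, Xs = [], []
    for i in range(2):
        x_rel = rng.normal(size=(n, 1))             # one relevant regressor (assumed)
        x_irr = rng.normal(size=(n, n_irrelevant))  # irrelevant regressors in the GUM
        ys.append(1.0 + 2.0 * x_rel[:, 0] + errors[:, i])
        Xs.append(np.column_stack([np.ones(n), x_rel, x_irr]))
    beta, se = fgls_sur(ys, Xs)
    keep = np.abs(beta / se) > 1.96  # stand-in for the search: one-shot 5% t-test
    return keep.reshape(2, 2 + n_irrelevant)

def elimination_rate(reps=200, seed=0, **kwargs):
    """Percentage of replications that keep the relevant and drop all irrelevant variables."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        keep = simulate_once(rng, **kwargs)
        if keep[:, 1].all() and not keep[:, 2:].any():
            hits += 1
    return 100.0 * hits / reps

if __name__ == "__main__":
    for rho in (0.1, 0.5, 0.9):
        print(rho, elimination_rate(reps=100, rho=rho))
```

Varying `rho` and `n` in `elimination_rate` mimics, in a very reduced form, the abstract's comparison across strengths of disturbance correlation and sample sizes; a faithful replication would need the thesis's actual specification models and the Autometrics search strategy.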
format Thesis
qualification_name Ph.D.
qualification_level Doctorate
author Norhayati, Yusof
title SURE-Autometrics algorithm for model selection in multiple equations
granting_institution Universiti Utara Malaysia
granting_department Awang Had Salleh Graduate School of Arts & Sciences
publishDate 2016
url https://etd.uum.edu.my/6060/1/s92279_01.pdf
https://etd.uum.edu.my/6060/2/s92279_02.pdf
_version_ 1747828016821567488
spelling my-uum-etd.6060 2021-04-19T03:02:57Z https://etd.uum.edu.my/6060/