American Journal of Energy Research
ISSN (Print): 2328-7349 ISSN (Online): 2328-7330 Website: https://www.sciepub.com/journal/ajer
American Journal of Energy Research. 2013, 1(1), 7-16
DOI: 10.12691/ajer-1-1-2
Open Access Article

The Combined Effect of Applying Feature Selection and Parameter Optimization on Machine Learning Techniques for Solar Power Prediction

Md Rahat Hossain1, Amanullah Maung Than Oo1 and A B M Shawkat Ali1

1Power Engineering Research Group (PERG), Central Queensland University, Rockhampton, Australia

Pub. Date: February 26, 2013

Cite this paper:
Md Rahat Hossain, Amanullah Maung Than Oo and A B M Shawkat Ali. The Combined Effect of Applying Feature Selection and Parameter Optimization on Machine Learning Techniques for Solar Power Prediction. American Journal of Energy Research. 2013; 1(1):7-16. doi: 10.12691/ajer-1-1-2

Abstract

This paper shows empirically that combining selected feature subsets with optimized parameters significantly improves the accuracy of machine learning techniques for solar power prediction. To provide evidence, experiments are carried out in two phases. All experiments use three machine learning techniques: Least Median of Squares (LMS), Multilayer Perceptron (MLP) and Support Vector Machine (SVM). In the first phase, five well-known wrapper feature selection methods are used to obtain the prediction accuracy of the machine learning techniques with selected feature subsets and default parameter settings. The first-phase experiments demonstrate that, with the default parameters held fixed, LMS, MLP and SVM provide better prediction accuracy (i.e. reduced mean absolute error, MAE, and mean absolute scaled error, MASE) with selected feature subsets than without them. Building on this improvement, the second phase optimizes the machine learning parameters and re-evaluates the prediction accuracy of the same techniques using both the optimized parameter settings and the selected feature subsets. Comparing the results of the two phases clearly shows that the latter phase (i.e. machine learning techniques with selected feature subsets and optimized parameters) provides a substantial improvement in solar power prediction accuracy over the earlier phase (i.e. machine learning techniques with selected feature subsets and default parameters). Experiments are carried out using reliable, real-life historical meteorological data. The accuracy of the machine learning techniques for solar radiation prediction is validated in terms of statistical error measures and validation metrics. The experimental results support the conclusion that devoting more attention and effort to feature subset selection and machine learning parameter optimization (e.g. the combined effect of selected feature subsets and optimized parameters on prediction accuracy, which is investigated in this paper) can contribute significantly to improving the accuracy of solar power prediction.
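
The two-phase procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper evaluates LMS, MLP and SVM on historical meteorological data, whereas this sketch swaps in a small k-nearest-neighbour regressor and synthetic data so that it stays self-contained. The function names, the default value of k, the candidate grid and the data are all hypothetical choices made for illustration. Phase 1 runs a greedy forward wrapper search with the default parameter; phase 2 tunes the parameter on the selected subset; both are scored by leave-one-out MAE.

```python
import random

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def knn_loo_mae(X, y, features, k):
    """Leave-one-out MAE of a k-NN regressor restricted to `features`."""
    preds = []
    for i, row in enumerate(X):
        # Squared Euclidean distance to every other row, on the chosen features only.
        dists = sorted(
            (sum((row[f] - other[f]) ** 2 for f in features), j)
            for j, other in enumerate(X) if j != i
        )
        preds.append(sum(y[j] for _, j in dists[:k]) / k)
    return mae(y, preds)

def forward_select(X, y, k):
    """Phase 1: greedy forward wrapper search with the learner's default k."""
    chosen, best = [], float("inf")
    remaining = list(range(len(X[0])))
    while remaining:
        # The wrapper scores each candidate subset by running the learner itself.
        score, f = min((knn_loo_mae(X, y, chosen + [f], k), f) for f in remaining)
        if score >= best:  # stop when no remaining feature lowers the error
            break
        chosen.append(f)
        remaining.remove(f)
        best = score
    return chosen, best

def tune_k(X, y, features, candidates):
    """Phase 2: grid search over k on the already selected feature subset."""
    return min((knn_loo_mae(X, y, features, k), k) for k in candidates)

# Synthetic data (an assumption for illustration): only feature 0 drives the target.
random.seed(0)
X = [[random.uniform(0, 10) for _ in range(4)] for _ in range(40)]
y = [3 * row[0] + random.gauss(0, 0.3) for row in X]

DEFAULT_K = 3  # hypothetical "default parameter setting"
baseline_err = knn_loo_mae(X, y, range(4), DEFAULT_K)
chosen, selected_err = forward_select(X, y, DEFAULT_K)
tuned_err, best_k = tune_k(X, y, chosen, [1, 2, 3, 5, 7])

print("all features, default k:     ", round(baseline_err, 3))
print("selected features, default k:", chosen, round(selected_err, 3))
print("selected features, tuned k:  ", best_k, round(tuned_err, 3))
```

Because the candidate grid contains the default value, the phase 2 error in this sketch can never exceed the phase 1 error. That guarantee holds only for the evaluation data used in the search, so a held-out test set would be needed to confirm an improvement of the kind the paper reports.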

Keywords:
feature selection, machine learning, regression algorithm, solar radiation, parameter optimization, DTREG

This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

