
Jang, Ikhoon

Major in Regional Information, Department of Agricultural Economics and Rural Development
The Graduate School, Seoul National University

The government has maintained the rice production system through a range of policies in order to stabilize the supply of and demand for rice, the country's staple grain. However, the recent decision to give up developing-country status at the WTO has increased anxiety among rice producers. Weather conditions are becoming more volatile with global climate change, and rice production is becoming steadily more uncertain. In addition, Korea is at an important turning point as it changes the rice planting schedule to adapt the agricultural sector to climate change.

To ensure stable rice supply and demand, production must be forecast in advance, which requires developing and operating reliable rice yield forecasting models. In Korea, the Agricultural Observation Center, run by the Korea Rural Economic Institute, develops a model predicting rice yields at the end

However, machine learning methods, although widely used in forecasting, have not been actively studied for rice yield prediction in Korea. This study therefore explores research topics that can help improve rice yield prediction models using machine learning, and proposes new research methods through the following empirical studies.

First, we examined whether predictive performance improves when rice yield predictors are selected using Bayesian model averaging, a machine learning method not used in previous studies. Most of the meteorological variables selected by Bayesian model averaging were significant in regression analysis, and in cross-validation the model built with Bayesian model averaging outperformed the previous model. Prediction models trained with machine learning methods such as support vector regression (SVR) also performed better than the linear regression model (OLS) used in conventional statistical models. However, when the model was trained on data from before 2012 and evaluated on the following 7 years, the Bayesian model averaging model showed no significant difference from previous studies on the error-based indicators, and on the explanatory power-based indicators it performed worse than the relatively simple earlier model. When predicting a highly uncertain future, complex prediction models with many explanatory variables can therefore perform poorly because of overfitting.
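As a rough illustration of how Bayesian model averaging can rank candidate predictors, the sketch below enumerates every subset of a small predictor set, fits OLS to each, and converts BIC values into approximate posterior model weights. The BIC approximation, the synthetic data, and the function names `bic_of_subset` and `inclusion_probabilities` are assumptions made for this example; the abstract does not state which priors, software, or variables the study used.

```python
import itertools
import numpy as np

def bic_of_subset(X, y, subset):
    """BIC of an OLS fit (with intercept) on the chosen predictor columns."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = design.shape[1]  # number of fitted coefficients
    return n * np.log(rss / n) + k * np.log(n)

def inclusion_probabilities(X, y):
    """Approximate posterior inclusion probability of each predictor.

    Model weights are proportional to exp(-BIC / 2), a common
    approximation to posterior model probabilities under equal priors.
    """
    p = X.shape[1]
    subsets = [s for r in range(p + 1)
               for s in itertools.combinations(range(p), r)]
    bics = np.array([bic_of_subset(X, y, s) for s in subsets])
    weights = np.exp(-0.5 * (bics - bics.min()))
    weights /= weights.sum()
    pip = np.zeros(p)
    for w, s in zip(weights, subsets):
        for j in s:
            pip[j] += w
    return pip

# Synthetic stand-in for weather predictors (not data from the study):
# only columns 0 and 2 actually drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = 5.0 + 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=60)

pip = inclusion_probabilities(X, y)
print(np.round(pip, 3))
```

Predictors with an inclusion probability near 1 appear in almost all of the well-supported models. Full enumeration is only practical for a handful of candidates, since the number of subsets doubles with each added predictor.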

Second, a semi-supervised regression method was applied to the rice yield prediction model. Such a method suits situations where labeled data (observations that include the dependent variable) are scarce but unlabeled observations are abundant, because it can improve prediction performance by using the unlabeled data in model training. The results showed that the predictive performance of the rice yield prediction model improved by 4.6% on the error-based indicator and 5.8% on the explanatory power-based indicator, compared with not using the semi-supervised regression method.
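The idea of training on unlabeled observations can be sketched with self-training, one simple form of semi-supervised regression. The abstract does not say which algorithm the study used, so the k-nearest-neighbor regressor, the confidence rule (agreement among neighbors), and the names `knn_predict` and `self_training_fit` are all assumptions for illustration, and the data are synthetic.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    """k-NN regression: mean and spread of the k nearest training labels."""
    dist = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    neighbor_labels = y_train[nearest]
    return neighbor_labels.mean(axis=1), neighbor_labels.std(axis=1)

def self_training_fit(X_lab, y_lab, X_unlab, rounds=5, batch=10, k=3):
    """Grow the training set with pseudo-labeled points, most confident first."""
    X_train, y_train = X_lab.copy(), y_lab.copy()
    pool = X_unlab.copy()
    for _ in range(rounds):
        if len(pool) == 0:
            break
        pred, spread = knn_predict(X_train, y_train, pool, k)
        # Low spread among neighbors is treated as high confidence.
        pick = np.argsort(spread)[:batch]
        X_train = np.vstack([X_train, pool[pick]])
        y_train = np.concatenate([y_train, pred[pick]])
        pool = np.delete(pool, pick, axis=0)
    return X_train, y_train

# Synthetic example: 20 labeled and 100 unlabeled observations.
rng = np.random.default_rng(1)
X_lab = rng.uniform(size=(20, 2))
y_lab = 3.0 * X_lab[:, 0] + X_lab[:, 1]
X_unlab = rng.uniform(size=(100, 2))

X_aug, y_aug = self_training_fit(X_lab, y_lab, X_unlab)
print(X_aug.shape, y_aug.shape)
```

The augmented set can then be passed to any regressor. Pseudo-labels can also reinforce early mistakes, so a gain from self-training should be checked on held-out labeled data rather than assumed.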

The model trained on 200 labeled province-level observations using the semi-supervised regression method performed 12% lower on the error-based indicator and 8.9% lower on the explanatory power-based indicator than the reference model trained on more than 2,000 labeled city-level observations. This is a meaningful result given the difference in the amount of training data. Thus, the semi-supervised regression method can be a good alternative for improving the performance of prediction models by using unlabeled data.
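Comparisons like the ones above rest on two kinds of indicator and a relative difference between models. The abstract does not name the indicators, so the sketch below assumes RMSE as the error-based indicator and R² as the explanatory power-based indicator. The function names and the numbers in the usage lines are made up for illustration.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error (assumed error-based indicator)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(y_true, y_pred):
    """Coefficient of determination (assumed explanatory power indicator)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def percent_change(reference, candidate):
    """Relative difference of a candidate score from a reference score, in %."""
    return (candidate - reference) / abs(reference) * 100.0

# Hypothetical yields and predictions, not values from the study.
observed = [500.0, 520.0, 510.0, 530.0]
model_a = [505.0, 515.0, 512.0, 524.0]
model_b = [510.0, 512.0, 500.0, 520.0]

print(rmse(observed, model_a), r_squared(observed, model_a))
print(percent_change(rmse(observed, model_b), rmse(observed, model_a)))
```

A negative change in RMSE and a positive change in R² both indicate that the candidate model does better than the reference.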

Third, we compared existing methods for early crop yield prediction with machine learning techniques not used in previous studies, and found that prediction performance improved with machine learning. In the comparison by lead time, the model using future weather variables outperformed the models without future weather variables on all four predictive performance indicators. However, the use of forecasts by weather

models with longer lead times (five months before harvest).

The results of the empirical research suggest that applying Bayesian model averaging to crop yield prediction models enables data-driven variable search and improves predictive performance. It is therefore expected to develop into a basic technology for artificial intelligence in agriculture, and it should continue to be verified in subsequent research. The semi-supervised regression method can provide yield estimates from historical weather information for a specific city even when yield data for that city have not been observed. This makes it possible to determine whether a crop is suitable for the region. The empirical analysis of the early prediction model suggests that prediction models using future weather variables, which are not yet measured at the time of prediction, are recommended for improving predictive performance. In this context, research on agricultural weather forecasts for early crop yield prediction needs to be expanded. As recent climate change makes the crop production system more uncertain, early crop prediction is expected to become steadily more important. Therefore, analysis and data accumulation on the effects of climate factors should come first, before climate change makes the production system even more volatile. In addition, given the range of crops and regions to be analyzed, the domestic agricultural sector lacks analysis experts. To address this problem, research on artificial intelligence-based technologies specialized for agricultural data analysis should continue.

Keywords: Rice Yield, Prediction Model, Machine Learning, Bayesian Model Averaging, Semi-supervised Regression

Student Number: 2012-30990
