Time Series Stock Prices Prediction Based On Fuzzy Model

(1)

접수일자 : 2009년 5월 8일 완료일자 : 2009년 9월 20일

퍼지 모델에 기초한 시계열 주가 예측

Time Series Stock Prices Prediction Based On Fuzzy Model

황희수․오진성

Heesoo Hwang and Jinsung Oh 한라대학교 전기전자과

요 약

본 논문은 일별 및 주별로 시계열 주가를 예측할 수 있는 퍼지 모델을 구성하는 방법을 제안한다. 전통적인 시계열 분석으 로 주가를 예측하는 것은 어렵지만 퍼지 모델은 비선형적인 주가 데이터의 특성을 잘 기술할 수 있는 장점을 갖고 있다.

주가 예측 모델에 사용될 입력 정보를 결정하는 데는 상당한 수고가 필요한데, 본 논문에서는 전통적인 캔들 스틱 차트의 정보를 입력변수로 고려한다. 주가 예측 퍼지 모델은 사다리꼴 멤버쉽함수를 갖는 전건부와 비선형식인 후건부로 된 퍼지 규칙으로 구성된다. 차분 진화를 통해 퍼지 모델은 최적화된다. 일별 및 주별로 코스피 지수의 시가, 고가, 저가 및 종가를 예측하는 모델을 만들고 그 성능을 평가한다.

키워드 : 퍼지 모델링, 시계열 예측, 차분 진화, 비선형 시스템, 주가 예측.

Abstract

In this paper an approach to building fuzzy models for predicting daily and weekly stock prices is presented.

Predicting stock prices with traditional time series analysis has proven to be difficult. Fuzzy logic based models have advantage of expressing the input-output relation linguistically, which facilitates the understanding of the system behavior. In building a stock prediction model we bear a burden of selecting most effective indicators for the stock prediction. In this paper information used in traditional candle stick-chart analysis is considered as input variables of our fuzzy models. The fuzzy rules have the premises and the consequents composed of trapezoidal membership functions and nonlinear equations, respectively. DE(Differential Evolution) identifies optimal fuzzy rules through an evolutionary process. The fuzzy models to predict daily and weekly open, high, low, and close prices of KOSPI(KOrea composite Stock Price Index) are built, and their performances are demonstrated.

Key Words : Fuzzy modeling, Time series prediction, Differential evolution, Nonlinear system, Stock prediction.

1. Introduction

The characteristic that all stock markets have in common is the uncertainty and complexity [1]. This feature is undesirable and unavoidable for the trader.

There have been a number of methods to reduce the uncertainty and resolve the complexity. These methods can be grouped into four major categories: i) technical analysis, ii) fundamental analysis, iii) traditional time series forecasting, and iv) machine learning method.

Technical analyst known as chart analyst, attempts to predict the market by tracing chart patterns from historic data of the market [5]. Fundamental analyst stud- ies the intrinsic value of a stock, and invests on it if its current value is estimated lower than its intrinsic value.

In traditional time series forecasting mathematical prediction models to trace patterns in historic data are used [3,4]. These three methods are shown to be non-effective due to stock markets' chaotic behavior,

and non-linearity. Since the late 1980s a number of machine learning methods have been developed. The methods use a set of samples and try to trace patterns in order to approximate the underlying function that generate the sample data. Some models use expert system [6,7], neural networks [8-10] and fuzzy logic [11,12]. Others use decision-tree [13], SVM(Support Vector Machine) [14] and data-mining [15,16]. Since neural networks and fuzzy logic are able to learn nonlinear mappings between inputs and outputs, they don’t require any assumption on input-output relations.

However, what is learned in neural networks is not easy for humans to understand. Complexity and inter- actions between the hidden nodes of a neural network make it unattainable to understand how a decision is made. Fuzzy logic based models have an additional advantage of expressing the input-output relation linguistically, which facilitates the understanding of the system behavior. Automatic identification of a fuzzy model doesn’t require any prior knowledge about the system, and only raw input and output data are enough to ex- tract new and useful knowledge. The fuzzy model with a combination of fuzzy and non-fuzzy predicates, have

(2)

effective potential to be a quantitative expressing of nonlinear system [17-20].

The aim of this paper is to develop an objective fuzzy model that can predict future prices in the stock markets by taking samples of past prices. The model is composed of ‘if-then’ fuzzy rules. The antecedent part of the rules consists of fuzzy predicates, while the consequent part is expressed as a nonlinear combination of antecedent variables. In building a stock prediction model we bear a burden of selecting most effective indicators for the stock prediction. In case of fuzzy models various indicators among the technical and fundamental indexes have been applied as inputs [11,12]. In this paper to mitigate the burden information used in traditional candle stick-chart analysis is selected as the effective indicators and also considered as input variables of the fuzzy model. Optimal fuzzy rules are identified through an evolutionary process of DE(Differential Evolution). The fuzzy models to predict daily and weekly open, high, low, and close prices of KOSPI(KOrea composite Stock Price Index) are built, and their performances are demonstrated and compared with those of neural networks.

2. Fuzzy Model

Theoretically, a system with MIMO(Multi Inputs and Multi Outputs) can be reduced to several MISO(Multi Inputs and Single Output) systems. Therefore, the fuzzy rule of a MIMO system can be presented as a set of rules of MISO systems. For a MISO system we consider fuzzy model formats as in (1). The antecedent parts consist of fuzzy predicates defined by trapezoidal membership functions, and the consequent parts are composed of nonlinear combinations of the antecedent variables.

    _ _^_ ⋯ and_ _^_

 ^ _^ _^∙ _^^



 ⋯ _^∙ _^^



 ≦  ≦ (1) Where  is the number of rules, _ ≤  ≤  is input variable, and ^ an output in the    rule. _^_ is the fuzzy variable defined as in (2). If all _^'s are equal to zero, the consequent parts become a linear combination of coefficients _^'s and _'s. _^'s and _^'s are the parameters to be identified in the evolutionary process of fuzzy model.

_^_ 







 

_^

_ _ _^  _

_ _ _ _^ 

 _ _ ≤ _≤ _ _

_^

 _ _ _^ _

_ _ ≤ _≤ _ _ _^

 

(2)

Where _^ is a trapezoidal form; in the case of

_ , a triangular form. _'s, _'s, _^'s and _^'s are the parameters to be identified in the evolutionary process of fuzzy model.

We consider the following reasoning procedures.

1) Given I/O(Input/Output) data  _ ⋯_

_ _ ⋯ _ _ ⋯ _, calculate the degree of the fulfillment ^ in the premise for the    rule as in (3).

^_^_ × ⋯ ×_^_ (3) where × means minimum operation.

2) Calculate the inferred value by taking the weighted average of with respect to as in (4).

_^  

  





^



  



^∙ ^

(4)

where  is the number of rules.

3. Evolutionary Identification of Fuzzy Model

The identification of fuzzy model is to elicit if-then fuzzy rules from raw input and output data. The identification is accomplished by DE. Before the evolution, the number of the fuzzy rules, r should be determined.

Since the consequent parts of fuzzy rules are composed of nonlinear equation of input variables, a few rules are likely to be enough to express nonlinear relationships of sample data.

In order to identify optimal fuzzy rules for a given task we adopt DE, which requires few control variables, is robust, easy to use and lends itself very well to par- allel computation [21]. DE utilizes population size() parameter vectors as a population, and population size doesn’t change during the evolution. A parameter vector contains parameter values to be identified for the antecedent and the consequent parts of the fuzzy rules. The evolution of DE is processed as follows:

[Step 1] The initial population is randomly chosen if nothing is known about the system. Generation number

 is set to 0.

[Step 2] New trial vector _{ } is generated by adding the weighted difference vector between two population members to a third member as in (5).

_{ }_

∙ _

_

 (5)

_

, _

 and _

 are randomly selected members among population. _, _ and _ are mutually different integers between 1 and . _ is the i-th population member in the t-th generation(   ⋯).  is a

(3)

real and constant factor which controls the amplification of the differential variation _

_

.

[Step 3] In order to increase the diversity of parameter vectors, _{ } new population vector is generated through uniform crossover of _{ }_and _. For each element of _{ }, if the randomly generated number is greater than a predetermined crossover rate, the corresponding element of trial vector _{ } is transferred to

_{ }, otherwise that of _ is transferred to _{ }. [Step 4] If the newly generated vector _{ } yields a lower objective function value than __, _{ } is set to

_{ }, otherwise the old parameter vector _ is retained.

[Step 5] Unless a termination criterion is reached, the generation number is increased by 1(    ), and re- turn to step2. The best parameter vector with minimum objective function value calculated by (6) is maintained during the evolution.

The target of the above-mentioned evolution is to minimize MAPE(Mean Absolute Percent Error) as in (6).

_{ }







  



_



^ _



(6) where _ is the actual value, _ is the fuzzy model output, and  is the total number of data used in the modeling.

4. Stock Price Prediction

To model stock market, first of all, the dominant input variables that affect the output of the system should be identified. Stock prices are affected by many complex factors from economic or political domains. It is impossible to take all the factors into consideration in building a model to predict the behavior of the market.

Instead of trying to estimate all major factors that run a market, we can focus on the movements of prices themselves. Like what the prices have done in the past, we can suppose that under similar circumstances they will move in the same way. Several methods have been developed to look at price and its effect on the market [4-7]. Most of them have been founded on the base of experience. One of them is candlestick chart, which is an old but still popular method to visualize stock price.

The candlestick patterns reflect the psychology of the market, and the investors can make their investment decision based on the identified candlestick patterns.

The study with stock price data during a period of five and half years from January 1992 to June 1997 shows the usefulness of candlestick analysis [6]. The study proves that stock markets do not follow random walks, there are certain patterns which occur frequently, and when a pattern is detected, the next market step can be

predicted. Therefore, we use the key information con- structing the candlestick chart in deriving dominant fuzzy rules for the stock price prediction. Candlestick chart has been founded based on the open, high, low and close price of a time period, where the time period can be a day, a week, a month or any other possible duration. The open, high, low, and close prices are selected as the input variables of our fuzzy model.

Another issue is to select time periods of actual inputs in order to build an effective model. In the candlestick analysis a sequence of individual candlesticks forms patterns, for example a single candle line, a dou- ble candle line and three or more candle lines. The patterns with longer size appear less frequently than those with shorter one. The experiments with historical stock data showed that patterns with pattern size one oc- curred 75% of the time, with size two less than 25%, and with size three about 0.002% [6]. It means that the prices in two most recent periods cover 99.8% of the patterns which occur in the stock market. In other words, when the duration of    and  form a certain pattern, a determinate movement can be predicted in the period of   . Therefore, two most recent periods are considered as the inputs of our fuzzy model. In the model,  represents the current period which is the most recent period of the market,    does the pre- vious, and    the upcoming period.

The proposed daily and weekly stock prediction models are composed of four MISO fuzzy models, where

, ,  ,  ,   ,

  ,     and     are the inputs, and each of   ,   ,     and

    is the output of the model. In daily and weekly stock prediction the current period  represents today and this week, respectively. Fig. 1 shows the input and output relationship of the fuzzy model for predicting the upcoming    price. The daily prediction models use KOSPI daily data from December 2006 to February 2008 for the modeling, and from March 2008 to August 2008 for the evaluation. The weekly prediction models use KOSPI weekly data from December 2005 to February 2008 for the modeling, and from March 2008 to August 2008 for the evaluation.

For the evolutionary identification of the fuzzy models the following DE control parameters are used: population size = 20, maximum generation number = 5000, differential amplification factor = 0.5, and crossover rate

= 0.5. The number of parameters to be identified through the evolution of DE is 98 for a fuzzy model with 8 inputs and an output: in the antecedent parts 8 input variables × 4 membership function parameters × 2 rules = 64, in the consequent parts (8 coefficients + 1 constant + 8 multipliers) × 2 rules = 34.

Fig. 2 shows the change of the objective function values in the evolutionary identification of the fuzzy model for predicting KOSPI weekly close prices.

(4)

그림 1. 시가 open(t+1) 예측을 위한 퍼지 모델.

Fig. 1. Fuzzy model for prediction of    price

그림 2. 코스피 주별 종가 퍼지 모델의 진화에 따른 목적함수 값

Fig. 2. Objective function value of KOSPI weekly close price fuzzy model in the evolution

Table 1 summarizes the performance measures, MAPEs of the daily and weekly stock prediction models for the modeling and the evaluation data, respectively.

For the performance comparisons, the results of neural network modeling are presented. Both the neural network and fuzzy models have the same inputs. The neural networks have 1 hidden layer and 5 nodes in hidden layer. Tangent sigmoid is used as transfer function in hidden layer and linear function for output layer.

Levenberg-Marquardt algorithm is used for training. To avoid the occurrence of any significant over-fitting, the data for the modeling are separated into 2 groups, one for training and the other for validation, and the training stops if the validation error increases. The number of hidden nodes is selected by trial and errors.

The daily KOSPI values of the fuzzy models are predicted with the accuracy of less than 1.1 MAPE. Since the weekly prices contain larger variations than the daily prices, the predictions are not as accurate as those of the daily prices. The weekly close prediction is worst in fuzzy models are smaller. It appears that neural networks are vulnerable to over-fitting, and fuzzy models errors of the fuzzy models compared to those of neural

표 1. 주가 예측 성능 평가 지수(MAPE).

Table 1. Performance measures(MAPEs) of stock prediction

Model OutputProposed fuzzy model Neural network Modeling Evaluation Modeling Evaluation

Daily

open 0.751 0.739 0.625 0.899

high 0.737 0.822 0.616 0.915

low 0.929 1.067 0.760 1.078

close 1.039 1.063 0.877 1.140

Weekly

open 0.823 0.789 0.577 0.832

high 1.137 1.346 0.868 1.798

low 1.561 1.867 1.087 2.289

close 1.931 2.321 1.313 3.230

표 2. 일별 종가 예측 퍼지 모델의 파라미터.

Table 2. Fuzzy model parameters for daily close price prediction

Antecedent Consequent

Input variables

Para- meters

Rule (  )

Rule (  )

Para- meters

Rule (  )

Rule (  )

_^ 2065.25 1313.25

  

_ 1946.55 1645.64

_^ 0.0 2.03449

_^ 480.80 669.29

_ 629.19 413.96

_^ 1.14122 -0.00005

_^ 735.12 602.53

  

_ 1681.14 1935.45

_^ -0.00001 1.24319

_^ 355.47 446.72

_ 689.38 698.42

_^ 1.33165 0.00003

_^ 741.83 654.12

  

_ 1539.64 1524.82

_^ 0.00007 2.70363

_^ 244.13 461.97

_ 387.59 586.83 _



 1.33138 -0.00004 575.14 735.16

  

_ 1640.54 1760.89

_^ 0.00007 2.63036

_^ 649.43 285.44

_ 818.15 732.32

_^ 1.58077 0.00007

_^ 218.15 807.07



_ 1544.26 1748.70

_^ 0.00004 2.63532

_^ 225.84 556.65

_ 585.85 704.38

_^ 1.47569 0.00001

_^ 469.76 513.089



_ 1695.16 1350.83

_^ 0.00011 2.24866

_^ 631.26 737.56

_ 741.72 636.83

_^ 1.18226 -0.00010

_^ 593.30 667.00



_ 1952.85 1297.55

_^ 0.00005 2.59150

_^ 675.55 63.75

_ 676.16 767.88

_^ 1.39457 0.00005

_^ 634.31 447.69



_ 2057.44 1404.87

_^ -0.00001 2.00876

_^ 400.58 712.548

_ 649.44 672.37

_^ 1.30276 -0.00005

_^ 418.26 504.58

(5)

both fuzzy model and neural network. The training networks are a little big, but the evaluation errors of the are more reliable. Table 2 shows the identified parameters of the fuzzy model for predicting daily close price.

Fig. 3 shows the actual KOSPI daily close price and its corresponding fuzzy model output. The dots are the predicted values by the fuzzy model. Some of the data used in the modeling are displayed in the left of the solid vertical line, and the data used in the evaluation are displayed in the right of the solid vertical line.

Likewise, Fig. 4 shows the case of KOSPI weekly close price.

그림 3. 코스피 일별 종가의 비교.

Fig. 3. Comparison of KOSPI daily close prices

그림 4. 코스피 주별 종가의 비교.

Fig. 4. Comparison of KOSPI weekly close prices Fig. 5 indicates that the daily predicted averages are in the range of the actual daily prices, where the upper and lower lines are for the actual daily high and low prices, and the dots are the averages of the daily predicted open, high, low and close prices. The weekly predicted averages are shown in Fig. 6. According to Fig. 5 and 6, we can determine entry points in stock trade near the predicted low or somewhere below the

average of the predicted open, high, low, and close prices. We can also determine exit points near the predicted high or somewhere above the average of the predicted open, high, low, and close prices.

그림 5. 코스피 일별 시가, 고가, 저가 및 종가 예측치의 평균.

Fig. 5. Predicted average of daily open, high, low and close prices

그림 6. 코스피 주별 시가, 고가, 저가 및 종가 예측치의 평균.

Fig. 6. Predicted average of weekly open, high, low and close prices

5. Conclusion

This paper presents an approach to identifying the optimal fuzzy model to predict the KOSPI daily and weekly open, high, low, and close prices. The identification is carried out through the evolution of a randomly initialized fuzzy model. To evaluate the effective- ness of the proposed models, the performance measures are compared with those of neural networks. The fuzzy models demonstrate reliable predictability in both the modeling and the evaluation data. The results show

(6)

that we can at least forecast the stock market trend based on the predicted open, high, low, and close values. Our method can be also applied to an individual stock price, which can aid the trader to determine buy or sell points of the stock.

참 고 문 헌

[1] R. G. Palmer, W. B. Arthur, J. H. Holland, and B.

Le Baron, ''An Artificia1 Stock Market", Artificial Life and Robotics, Vol. 3, 1998.

[2] B. M. Louis, Trend forecasting with technical anal- ysis, Marketplace BOOKS, 2000.

[3] S. M. Kendall and K. Ord, Time Series, Oxford, 1997.

[4] L. C. H. Leon, A. Liu and W. S. Chen, “Pattern discovery of fuzzy time series for financial pre- diction”, IEEE Trans. Knowledge and Data Engineering, Vol. 18, No. 5, pp. 613-625, 2006.

[5] D .R. Jobman, The handbook of technical analysis, Chicago, Illinois: Probus pubiishing, 1995.

[6] K. H. Lee and G. S. Jo, “Expert system for predicting stock market timing using a candlestick chart”, Expert System With Applications, Vol. 16, pp. 357-364, 1999.

[7] T. J. Beckman, "Stock Market Forecasting Using Technical Analysis", The World Congress on Expert System Proc., pp.2512-2519, 1991.

[8] K. Nygren, Stock Prediction: A neural network approach, Master Thesis, Royal Institute of Technology, KTH, March, 2004.

[9] Y. Tang, F. Xu, X. Wan and Y. Q. Zhang,

“Web-based fuzzy neural networks for stock prediction”, Computational Intelligence and Applications, pp. 169 – 174, 2002.

[10] G. Armano, M. Marchesi and A. Murru, “A hy- brid genetic-neural architecture for stock indexes forecasting”, Inforamtion Sciences, Vol. 170, No.

1, pp. 3-33, 2005.

[11] P. C. Chang and C. H. Liua, “A TSK type fuzzy based system for stock price prediction”, Expert Systems with Applications, Vol. 34, No. 1, pp.

135-144, 2008.

[12] M. H. Zarandi, E. Neshat, I. B. Turksen and B.

Rezaee, “A type-2 fuzzy model for stock market analysis”, Fuzzy System Conf., FUZZ-IEEE, pp.

1-6, July 2007.

[13] J. L. Wanga and S. H. Chanb, “Stock market trading rule discovery using two-layer bias deci- sion tree”, Expert Systems with Applications, Vol.

30, No. 4, pp. 605-611, May 2006.

[14] Fan and M. Palaniswami, “Stock selection using support vector machines”, Proceedings IJCNN 2001, Vol.3, pp. 1793-1798, 2001.

[15] M. Noor and R. H. Khokhar, “Fuzzy Decision Tree for Data Mining of Time Series Stock

Market Databases”, Critical Assessment of Mocroarray Data Analysis Conference, November 11-12, 2004.

[16] P. Giudici, Applied Data Mining, Statistical Methods for Business and Industry, Wiley, 2003.

[17] M. Sugeno and T. Yasukawa, “A fuzzy logic based approach to qualitative modeling”, IEEE Trans. Fuzzy Syst., Vol. 1, No. 1, pp. 7-31, 1993.

[18] T. Takagi and M. Sugeno, “Fuzzy identification of systems and its application to modeling and con- trol”, IEEE Trans. Syst. Man. Cybern., Vol. 15, pp. 116-132, 1985.

[19] H. S. Hwang, “Automatic design of fuzzy rule base for modeling and control using evolutionary programming”, IEEE Proc-Control Theory Appl., Vol. 146, No. 1, pp. 9-16, 1996.

[20] S. J. Kang, C. H. Woo, H. S. Hwang and K. B.

Woo, “Evolutionary design of fuzzy rule base for nonlinear system modeling and control”, IEEE Trans. Fuzzy Systems, Vol. 8, No. 1, pp. 37-44, 2000.

[21] R. Storn, “Differential evolution, a simple and ef- ficient heuristic strategy for global optimization over continuous spaces”, Journal of Global Optimization, Vol. 11, No. 4, pp. 341-359, 1997.

저 자 소 개

황희수(Heesoo Hwang) 1986년 : 연세대 전기과 졸업.

1988년 : 동 대학원 전기과 석사.

1993년 : 동 대학원 전기과 박사.

2001년～현재 : 한라대학교 전기전자과 부교수

관심분야 : 퍼지 모델링, 최적화, 예측 및 진단 Phone : 033-760-1249

Fax : 033-760-1251 E-mail : [email protected]

오진성(Jinsung Oh) 1987년 : 연세대 전기과 졸업.

1989년 : 동 대학원 전기과 석사.

1998년 : 미국 Pittsburgh대학 박사.

2001년～현재 : 한라대학교 전기전자과 조교수

관심분야 : 신호 및 영상처리 Phone : 033-760-1248 Fax : 033-760-1251 E-mail : [email protected]