Fusion of Background Subtraction and Clustering Techniques for Shadow Suppression in Video Sequences


Anuva Chowdhury^*, Jung-pil Shin^**, Ui-pil Chong^***

This paper introduces a combination of a background subtraction technique and the K-means clustering algorithm for removing shadows from video sequences. Changes in lighting conditions complicate segmentation. The proposed method removes from the segmentation the artifacts associated with lighting changes, such as highlights and reflections, as well as the cast shadows of moving objects. The K-means clustering algorithm is applied to the foreground, which is initially segmented by the background subtraction technique. The estimated shadow region is then merged with the background to eliminate the effects that cause redundancy in object detection. Simulation results show that the proposed approach removes shadows and reflections from moving objects with an accuracy of more than 95% in every case considered.

Keywords : Background subtraction, K-means clustering, shadow removal, video sequence.

It is necessary to eliminate the artifacts that mislead the detection of moving objects, since detection is the fundamental step in many security-based applications. Among segmentation methods, background subtraction is very popular. It can be performed in many different ways to cope with the challenges posed by changes in illumination and by object motion.

A mixture of Gaussians has been used to update the pixel intensity to deal with changes in the background [1]. Kernel density estimation with variable bandwidths is used to model the background in [2]. The use of the mean-shift vector was proposed in [3]. A recursive update of the Gaussian Mixture Model (GMM) can be found in [4]. A probabilistic GMM using the color information of the scene is described in [5], and is further updated automatically over time in [6]. A combination of a running Gaussian average filter with a frame averaging process can also be used efficiently for removing shadows [7].

* Chittagong University

** University of Aizu

*** University of Ulsan (corresponding author)

Manuscript received: April 17, 2013; revised: October 28, 2013; accepted: October 31, 2013

Journal of the Institute of Signal Processing and Systems, vol. 14, no. 4, October 2013, pp. 231-234

※ This work was supported by University of Ulsan, School of Excellence in Electrical Engineering.

The performance of clustering has been improved in [9-10] by modifying the objective function of fuzzy C-means (FCM) to make it more robust against noise. Since the shadow intensity is quite similar to the background, and pixels belonging to lighting artifacts can never be part of the foreground region, we consider a hard clustering technique for segmentation. Here, we propose to apply K-means clustering after the methodology given in [7] to group the region of interest and the artifacts in such a way that the unwanted region can easily be merged with the background.

If $F_t(x,y)$ represents the frame at time $t$, the foreground mask $M_t(x,y)$ is created by using the following equation:

$$M_t(x,y) = \begin{cases} 1, & \text{if } T_l \le R_t(x,y) \le T_u \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

where $R_t(x,y) = |F_t(x,y) - B_t(x,y)|$ is the region created by the moving object or by changes in lighting conditions, $B_t(x,y)$ represents the background, and $T_l$ and $T_u$ are the lower and upper thresholds, respectively. The threshold values are determined from a numeric value $S$, which is calculated experimentally from the performance of the methodology. This value is subtracted from and added to the pixel color value of the segmented region to create the threshold mask used to estimate the foreground. Finally, the background is updated over time as:

$$B_{t+1}(x,y) = \begin{cases} \alpha \times F_t(x,y) + (1-\alpha) \times B_t(x,y), & \text{if } (x,y) \text{ is classified as background} \\ B_t(x,y), & \text{otherwise} \end{cases} \tag{2}$$

where $\alpha$ is the learning rate, which determines the sensitivity of the background to variations and thereby defines the adaptation speed of the background.
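The thresholding of $|F_t - B_t|$ and the running background update described above can be illustrated with a minimal sketch. It works on small grayscale images stored as nested lists; the function names, the threshold values 20 and 255, and the rule that only background pixels are updated are assumptions made for illustration and are not taken from the original implementation.

```python
def foreground_mask(frame, background, t_low, t_high):
    """Return a 0/1 mask that is 1 where t_low <= |F - B| <= t_high."""
    return [
        [1 if t_low <= abs(f - b) <= t_high else 0 for f, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]


def update_background(frame, background, mask, alpha=0.02):
    """Running-average update, applied only where mask == 0 (assumed rule)."""
    return [
        [b if m else alpha * f + (1.0 - alpha) * b
         for f, b, m in zip(frow, brow, mrow)]
        for frow, brow, mrow in zip(frame, background, mask)
    ]


# Hypothetical 2x2 example: two pixels change strongly, two barely change.
background = [[100.0, 100.0], [100.0, 100.0]]
frame = [[102.0, 160.0], [100.0, 60.0]]

mask = foreground_mask(frame, background, t_low=20, t_high=255)
new_background = update_background(frame, background, mask)
print(mask)
print(new_background)
```

With $\alpha = 0.02$, a background pixel moves only 2% of the way toward the new frame value at each step, so slow illumination drift is absorbed while short-lived changes are not.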

K-means is used to classify the foreground pixels into several classes according to their RGB intensities, based on their distance from each other. Here, intensity values are taken as the cluster centers, so that the pixels of a class have similar intensities. Thus, the pixels belonging to the artifacts fall into the same cluster when the objective function $J_{obj}$ is minimized. It is calculated as follows:

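$$J_{obj} = \sum_{j=1}^{k} \sum_{i=1}^{n} \left\| x_i^{(j)} - c_j \right\|^2 \tag{3}$$

where $\| x_i^{(j)} - c_j \|$ is the similarity measure between an image pixel $x_i^{(j)}$ and the cluster center $c_j$, $n$ is the number of pixels and $k$ is the number of clusters.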
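As an illustration of how this objective is minimized, the following sketch runs the standard K-means (Lloyd) iteration on a handful of RGB pixels and reports the final value of the objective. The pixel values, the initial centers and the function names are hypothetical; the original implementation used C++ with OpenCV and may differ in its initialization and stopping rule.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(pixels, centers):
    """Label each pixel with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in pixels
    ]


def kmeans(pixels, centers, iterations=20):
    """Return (labels, centers, objective) after K-means iterations."""
    centers = [tuple(float(v) for v in c) for c in centers]
    for _ in range(iterations):
        labels = assign(pixels, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                # Move the center to the mean color of its members.
                new_centers.append(
                    tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                new_centers.append(old)  # leave an empty cluster unchanged
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    labels = assign(pixels, centers)
    objective = sum(
        squared_distance(p, centers[lab]) for p, lab in zip(pixels, labels))
    return labels, centers, objective


# Hypothetical pixels: two dark, two bright, two mid-gray.
pixels = [(10, 10, 10), (12, 11, 9), (200, 190, 180),
          (205, 195, 185), (90, 90, 95), (92, 88, 94)]
initial_centers = [(0, 0, 0), (128, 128, 128), (255, 255, 255)]

labels, centers, objective = kmeans(pixels, initial_centers)
print(labels, objective)
```

Each iteration can only lower or keep the objective, which is why the loop stops as soon as the centers no longer move.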

Proposed algorithms

The proposed methodology consists of the following steps:

Step 1: The original image is acquired in RGB format.

Step 2: The foreground is obtained by taking the deviation of each pixel from its previous reference frame. As the pixel intensity values change over time due to the motion of the object, the foreground mask can easily be obtained from consecutive frames. When the deviation falls within the given thresholds, the pixel is classified as foreground; otherwise it is considered background.

Step 3: The background is updated at every frame to deal with changes in the background. For this, the value of α in equation (2) was set to 0.02 in the experiments.

Step 4: The segmented foreground may contain dark cast shadows or other lighting artifacts. To solve this problem, the image is grouped into background, foreground and shadow regions by using the K-means clustering algorithm. Here, three clusters are used to hold the information of all pixels.

Step 5: The shadow group is merged with the background to remove the lighting artifacts fully from the foreground.
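The merging in the last step can be sketched as follows. The text does not state how the object cluster is told apart from the shadow cluster, so this sketch assumes, purely for illustration, that the object is the cluster whose center lies farthest from a reference background color, and that all other clusters are merged with the background. The function names and sample values are hypothetical.

```python
def object_cluster(centers, background_color):
    """Index of the center farthest from the background color.

    Assumption for illustration: shadows stay close to the background,
    so the farthest cluster is treated as the moving object.
    """
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(c, background_color))
    return max(range(len(centers)), key=lambda j: dist2(centers[j]))


def merge_shadow_with_background(labels, centers, background_color):
    """Binary mask: 1 for the object cluster, 0 for every other cluster."""
    keep = object_cluster(centers, background_color)
    return [1 if lab == keep else 0 for lab in labels]


# Hypothetical cluster centers: background-like, shadow-like, object-like.
centers = [(120.0, 120.0, 120.0), (70.0, 70.0, 75.0), (200.0, 40.0, 40.0)]
labels = [0, 0, 1, 1, 2, 2, 0]

object_mask = merge_shadow_with_background(
    labels, centers, background_color=(118.0, 121.0, 119.0))
print(object_mask)
```

This selection rule would fail when the object color is close to the background, which matches the limitation reported for objects that are hard to distinguish from the scene.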

Video sequences with a resolution of 480×360 pixels and a frame rate of 30 frames per second have been used in our simulation. The three sequences, “Fire”, “Sneak-Walk” and “IK-Pole”, were taken with different background scenes. The first sequence shows fire as the object, where the fire produces strong reflections as well as local changes in illumination. The other two sequences show moving objects along with their dark cast shadows.

The proposed method has been compared with the method described by Tang, Miao and Wan in [7]. Both algorithms have been implemented in C++ using the OpenCV library. For each of the sequences, ground truth has been prepared manually from the averages of three frames to calculate the quality metrics: Accuracy (AC), Precision (PR) and Sensitivity (Sen). They are calculated as follows:

$$AC = \frac{TP + TN}{TP + TN + FP + FN} \times 100 \tag{4}$$

$$PR = \frac{TP}{TP + FP} \times 100 \tag{5}$$

$$Sen = \frac{TP}{TP + FN} \times 100 \tag{6}$$

Here, TP (true positive) is the number of foreground pixels correctly detected, TN (true negative) is the number of background pixels correctly detected, FP (false positive) is the number of pixels that actually belong to the background but are detected as foreground, and FN (false negative) is the number of pixels that actually belong to the foreground but are detected as background.

Fig. 1. Visual comparison between the method in [7] and our method in the case of large, dark cast shadows: (a) raw video frames, (b) results of the methodology given in [7] and (c) results of our methodology.

The results are given in Table 1. The table illustrates the excellent performance of the method under heavy shadows and reflections, except for the Sneak-Walk sequence. In addition, Fig. 1 visualizes the superiority of the proposed algorithm over that in [7]. Here we see that the method can easily remove artifacts caused by lighting changes from the object. However, segmentation becomes difficult whenever the object color and the background become indistinguishable. This leads to misclassification, and part of the object region remains undetected.
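The three quality measures (AC, PR and Sen) can be computed from a predicted binary mask and a ground-truth mask as in the sketch below. It uses the standard definitions of accuracy, precision and sensitivity expressed in percent; the function names and the sample masks are hypothetical and do not reproduce any result from Table 1.

```python
def confusion_counts(predicted, truth):
    """Count TP, TN, FP, FN over two flat 0/1 masks of equal length."""
    tp = tn = fp = fn = 0
    for p, g in zip(predicted, truth):
        if p and g:
            tp += 1          # foreground detected as foreground
        elif not p and not g:
            tn += 1          # background detected as background
        elif p and not g:
            fp += 1          # background detected as foreground
        else:
            fn += 1          # foreground detected as background
    return tp, tn, fp, fn


def quality_measures(tp, tn, fp, fn):
    """Return (accuracy, precision, sensitivity) in percent."""
    total = tp + tn + fp + fn
    accuracy = 100.0 * (tp + tn) / total if total else 0.0
    precision = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, sensitivity


# Hypothetical masks of eight pixels.
predicted = [1, 1, 0, 0, 1, 0, 0, 0]
truth = [1, 1, 1, 0, 0, 0, 0, 0]

counts = confusion_counts(predicted, truth)
accuracy, precision, sensitivity = quality_measures(*counts)
print(counts, accuracy, precision, sensitivity)
```

Because background pixels usually dominate a frame, accuracy can stay high even when part of the object is missed, so precision and sensitivity are reported alongside it.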

Table 1. Comparison of the proposed methodology with the existing algorithm

This paper presents a fusion technique to remove shadows and other illumination effects from moving objects in video sequences. The method first segments the image with a background subtraction technique, where the foreground is detected from the deviation of each frame from its previous frame. To separate the large, dark cast shadow from the segmented foreground, over-segmentation is performed by the K-means clustering algorithm, which partitions the shadow portion from the object region. Extensive simulation results demonstrate the success and efficiency of the proposed combination in shadow removal as well as object detection from video sequences.

[1] Stauffer C., Grimson W., “Learning patterns of activity using real-time tracking,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, 2000, pp. 747-757.

[2] Anurag M., Nikos P., “Motion based background subtraction using adaptive kernel density estimation,” in Proc. Computer Vision and Pattern Recognition, 2004, pp. 302-309.

[3] Piccardi M., Jan T., “Mean-shift background image modeling,” in Proc. IEEE International Conf. on Image Processing, Singapore, 2004, pp. 3399-3402.

[4] Zoran Z., “Improved Adaptive Gaussian Mixture Model for Background Subtraction,” in Proc. ICPR, 2004.

[5] Jwu-Sheng H., Tzung-Min S., “Robust Background Subtraction with Shadow and Highlight Removal for Indoor Surveillance,” EURASIP Journal on Advances in Signal Processing, 2007, pp. 1-14.

[6] Parisa Darvish Zadeh V., Michael S.-L., Guillaume-Alexandre B., “An Efficient Region-Based Background Subtraction Technique,” in Proc. Canadian Conference on Computer and Robot Vision, CRV’08, May 2008, pp. 71-78.

[7] Tang Z., Miao Z., Wan Y., “Background Subtraction Using Running Gaussian Average and Frame Difference,” Journal of International Federation for Information Processing (IFIP), vol. 4740, 2007, pp. 411-414.

[8] Te-Feng S., Yi-Ling C., Shang-Hong L., “Over-Segmentation Based Background Modeling and Foreground Detection with Shadow Removal by Using Hierarchical MRFs,” in Proc. Computer Vision - ACCV, 2010, pp. 535-546.

[9] Ahmed M.N., Yamany S.M., Mohamed N., Farag A.A., Moriarty T., “A Modified Fuzzy C-Means Algorithm for Bias Field Estimation and Segmentation of MRI Data,” IEEE Trans. on Medical Imaging, vol. 21, 2002, pp. 193-199.

[10] Zhang D.Q., Chen S.C., Pan Z.S., Tan K.R., “Kernel-Based Fuzzy Clustering Incorporating Spatial Constraints for Image Segmentation,” in Proc. International Conference on Machine Learning and Cybernetics, 2003, pp. 2189-2192.

Ms. Anuva Chowdhury received her B.Sc. degree in Electrical and Electronic Engineering from Chittagong University of Engineering and Technology, Bangladesh, in 2008, and her M.Sc. degree in Electrical Engineering from University of Ulsan, Korea, in 2012. She joined the Department of Electrical and Electronic Engineering at Chittagong University of Engineering and Technology as a Lecturer on September 2, 2008. She is now working as an assistant professor at the same university.

Jung-pil Shin received a B.A. in Computer Science and Statistics and an M.S. in Computer Science from Pusan National University, Korea in 1990 and 1994, respectively. He received a Ph.D. in Communication Engineering from Kyushu University, Japan in 1999.

He became an Associate Professor and Senior Associate Professor in the Department of Computer Software, the University of Aizu, Japan, in 1999 and 2004, respectively. His research interests include pattern recognition, character recognition, image processing, and computer vision. He is currently researching the following advanced fields: pen-based interaction system, real-time system, oriental character processing, mobile computing, computer education, human recognition, and machine intelligence.

Dr. Ui-pil Chong (Member) received the B.S. degree in Electrical Engineering from University of Ulsan, Korea, in 1978, and the M.S. degree in Electrical Engineering from Korea University, Seoul, Korea, in 1980. He studied computer engineering at Oregon State University, where he received an M.S. degree in 1985, and received the Ph.D. degree from New York University (POLY), NY, USA, in 1997. In January 1997, Dr. Chong joined the School of Computer Engineering and Information Technology of the University of Ulsan in Ulsan City, Korea, where he has been a full professor since 2006. He has published more than 210 journal and conference papers in the areas of Digital Signal Processing, Fault Detection and Diagnosis in plants, Biomedical Engineering, Computer Music, and Multimedia Applications. He also holds 10 Korean patents. Currently, he is the Head of the Whale Research Institute at University of Ulsan and Vice President of the Korea Institute of Signal Processing and Systems.