Image Clustering using Geo-Location Awareness



Journal of the Semiconductor & Display Technology, Vol. 19, No. 4. December 2020.


Yong-Hwan Lee*†

*†Dept. of Digital Contents, Wonkwang University

ABSTRACT

This paper proposes an automatic clustering method for searching relevant digital photos using geo-coded information. The scheme labels each photo with its global positioning system (GPS) coordinates and the date/time at the moment of capture, and these labels serve as clustering metadata when the images are retrieved. Experimental results show that geo-location information improves the accuracy of image retrieval and that the information embedded within the images is effective and precise for image clustering.

Key Words : Image Clustering, Photo Image Search, Geo-location Information, Multimedia Metadata

1. Introduction

Image clustering is a critical data analysis process in the field of computer vision. It groups images into clusters such that images within the same cluster are similar to each other, while images in different clusters are dissimilar [1]. Many applications, such as content-based image annotation [2] and image retrieval [3], can be viewed as different instances of image clustering. In the academic field, many image clustering methods have been proposed, including K-means [12], agglomerative clustering [13], and deep adaptive clustering [14]. Traditionally, these methods depend on predefined distance metrics, which are difficult to identify for image datasets. Nowadays, these tasks are typically tackled by training convolutional neural networks [15, 16] on large-scale datasets [17] that contain annotated images and ground-truth semantic labels. While neural networks are excellent at learning feature representations in this supervised setup [18], they are restricted when ground-truth labels are not accessible at training time.

Location information, on the other hand, is extremely useful in the representation of an image. It is one of the strongest memory cues when a user recalls past records of digital images [4]. Therefore, our main concern is a search scheme that builds more efficient metadata from geographic location.

†E-mail: [email protected]

This paper proposes an image clustering method that helps in searching and retrieving relevant images using geo-locational information. The experimental results show a synergy from combining the images with their location information.

2. Proposed Method

The goal of this work is to evaluate how geo-locational information affects image clustering and how it can help improve the performance of image retrieval. The clustering method automatically indexes the images relevant to a user's query. The scheme is based on the geographic coordinates of where, and the time when, each image was taken. A set of positional locations is used as metadata for image clustering and searching.

The suggested system performs three major tasks, illustrated in Fig. 1. First, the mobile device equipped with a camera and a GPS module takes a picture and immediately requests the current location (i.e., latitude and longitude) from the GPS receiver. The digital image and coordinates are then compressed and saved in the image file header. Second, textual information such as the place name and/or the name of the building at that latitude and longitude is appended when the image is downloaded to the host machine. This information is taken from the National Geospatial-Intelligence Agency (NGA) [5]. Last, images relevant to a user's query are retrieved from the image files, which support query-by-example.

Fig. 1. Three major tasks of the system.

Each geo-coded image can be classified into subsets based on location information and date/time, as shown in Fig. 2.

Fig. 2. Threshold for grouping method.
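The grouping idea can be illustrated with a minimal Python sketch. It is not the paper's exact procedure: the `Photo` class, the `is_near` test, and the `max_deg` threshold are hypothetical names introduced here, and nearness is approximated by a simple bound on the coordinate differences in degrees. Photos are visited in capture-time order and each one joins the first cluster whose first photo is nearby, so non-sequential photos can end up in the same cluster.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Photo:
    name: str
    lat: float        # latitude in decimal degrees
    lon: float        # longitude in decimal degrees
    taken: datetime   # capture date/time


def is_near(a: Photo, b: Photo, max_deg: float) -> bool:
    """Crude nearness test (assumption): both coordinate gaps within max_deg."""
    return abs(a.lat - b.lat) <= max_deg and abs(a.lon - b.lon) <= max_deg


def cluster_photos(photos, max_deg=0.01):
    """Greedy grouping in capture-time order.

    A photo joins the first existing cluster whose first photo is nearby;
    otherwise it starts a new cluster.
    """
    clusters = []
    for photo in sorted(photos, key=lambda p: p.taken):
        for cluster in clusters:
            if is_near(cluster[0], photo, max_deg):
                cluster.append(photo)
                break
        else:
            clusters.append([photo])
    return clusters


if __name__ == "__main__":
    day = datetime(2020, 12, 1)
    photos = [
        Photo("P1", 35.970, 126.950, day.replace(hour=9)),
        Photo("P2", 37.570, 126.980, day.replace(hour=12)),
        Photo("P3", 37.571, 126.981, day.replace(hour=13)),
        Photo("P4", 37.572, 126.979, day.replace(hour=14)),
        Photo("P5", 35.971, 126.951, day.replace(hour=18)),
    ]
    for cluster in cluster_photos(photos):
        print([p.name for p in cluster])
```

In this example P1 and P5 were taken hours apart with other photos in between, yet they share a cluster because their coordinates fall within the threshold.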

Let $P = (p_1, p_2, \ldots, p_n)$ be the set of a user's photo images, ordered by capture time, and let $c_i = (p_i, p_{i+1}, \ldots, p_j)$ be a set of consecutive images taken at nearby locations. The images are then grouped into clusters according to the set C. Non-sequential photos can be included in the same cluster, such as P1 and P5 shown in Fig. 2. The proposed retrieval method first filters the collection based on geo-location and then ranks the images based on content-based similarity to the user's query image.

Relevant images filtered using geo-location belong to a subset as defined below.

$$\mathrm{filteredSubset} : \{\, I \in \mathrm{rank}(D_{qt}) \le \varepsilon \,\} \qquad (1)$$

where I is the subset of filtered images and ε is the user-defined rank order. In Eq. (1), $D_{qt}$ is the distance between the locations of the query image and the target image, which is defined as follows.

$$
\begin{aligned}
\mathit{slat} &= \sin(\mathit{lat}_q) \times \sin(\mathit{lat}_t) \\
\mathit{clon} &= \cos(\mathit{lat}_q) \times \cos(\mathit{lat}_t) \times \cos(\mathit{lon}_t - \mathit{lon}_q) \\
D_{qt} &= C \times \mathrm{acos}(\mathit{slat} + \mathit{clon})
\end{aligned}
\qquad (2)
$$

where lat and lon are the latitude and longitude in decimal degrees, and the indexes q and t denote the query and target image, respectively. C is the constant that converts the angle from radians to degrees.
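The distance and the rank-based filter can be sketched in Python as follows. This is a sketch under stated assumptions rather than the paper's implementation: the function names and the `(name, lat, lon)` tuple layout are hypothetical, the decimal-degree inputs are converted to radians before the trigonometric functions are applied, and C is taken to be 180/π.

```python
import math


def geo_distance_deg(lat_q, lon_q, lat_t, lon_t):
    """Angular distance D_qt in degrees between two points in decimal degrees."""
    lat_q, lon_q, lat_t, lon_t = map(math.radians, (lat_q, lon_q, lat_t, lon_t))
    slat = math.sin(lat_q) * math.sin(lat_t)
    clon = math.cos(lat_q) * math.cos(lat_t) * math.cos(lon_t - lon_q)
    # Clamp so floating-point rounding cannot push the value outside acos's domain.
    cos_angle = max(-1.0, min(1.0, slat + clon))
    # math.degrees applies the constant C = 180 / pi (assumption).
    return math.degrees(math.acos(cos_angle))


def filter_by_geo_rank(query, targets, epsilon):
    """Keep the targets whose distance rank to the query is at most epsilon.

    `query` and each target are (name, lat, lon) tuples (hypothetical layout).
    """
    _, lat_q, lon_q = query
    ranked = sorted(
        targets, key=lambda t: geo_distance_deg(lat_q, lon_q, t[1], t[2])
    )
    return ranked[:epsilon]


if __name__ == "__main__":
    query = ("q", 0.0, 0.0)
    targets = [("a", 0.0, 10.0), ("b", 0.0, 1.0), ("c", 5.0, 5.0)]
    print([name for name, _, _ in filter_by_geo_rank(query, targets, 2)])
```

The filtered subset would then be passed to the content-based ranking step, so that visual similarity is only computed for images taken near the query location.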

To produce the content-based rankings, we use two image descriptors: the 64-bin color histogram (CH) in the YCbCr color space, which is used in most picture compression standards such as JPEG, and the well-known MPEG-7 edge histogram descriptor (EHD) [6]. CH provides a compact summary of the distribution of color. It is obtained by quantizing the colors in an image and counting how frequently each quantized color occurs. EHD is designed to extract the spatial distribution of edges in an image. It is obtained by dividing the gray-scale image into 4 × 4 non-overlapping blocks and categorizing the edges in each block into 5 types (0, 45, 90, and 135 degrees, and non-directional). The resulting feature has a total of 80 histogram bins, derived from 5 × 4 × 4. Histogram intersection for CH and EHD is used to measure the similarity between the query image and the images in the database.
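A minimal Python sketch of the color histogram and histogram intersection is given below. Several details are assumptions not stated in the text: the RGB-to-YCbCr conversion uses the full-range JFIF coefficients, each channel is quantized uniformly into 4 levels (4 × 4 × 4 = 64 bins), and the histograms are normalized to sum to 1. All function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range (JFIF) RGB to YCbCr conversion for 8-bit channel values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def quantize(value, levels=4):
    """Map a channel value in [0, 255] to one of `levels` equal-width bins."""
    clamped = min(max(value, 0.0), 255.0)
    return min(int(clamped * levels / 256), levels - 1)


def color_histogram(pixels, levels=4):
    """Normalized histogram with levels**3 bins (64 for levels=4).

    `pixels` is an iterable of (r, g, b) tuples.
    """
    hist = [0.0] * (levels ** 3)
    for r, g, b in pixels:
        y, cb, cr = (quantize(c, levels) for c in rgb_to_ycbcr(r, g, b))
        hist[(y * levels + cb) * levels + cr] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for normalized histograms: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))


if __name__ == "__main__":
    red = [(255, 0, 0)] * 4
    mixed = [(255, 0, 0)] * 2 + [(0, 0, 255)] * 2
    print(histogram_intersection(color_histogram(red), color_histogram(mixed)))
```

The same intersection function applies unchanged to an 80-bin EHD vector, provided it is normalized in the same way.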

3. Experiments and Results

DSLR cameras can record latitude, longitude, and altitude information in a picture when they receive GPS data from a valid source [7]. They also accept GPS data input in NMEA 0183, an ASCII-based serial communications protocol [8]. The experimental system is designed to evaluate whether integrating geo-coded information with content analysis improves retrieval performance compared with using content analysis alone for the search. The system can store the set of geo-location coordinates and their place names in the GPS image file directory (IFD) defined by the EXIF specification of JEITA [9] and in IPTC metadata fields [10].
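The coordinate handling implied here can be sketched in Python. NMEA 0183 sentences carry latitude as `ddmm.mmmm` and longitude as `dddmm.mmmm` with a hemisphere letter, while the EXIF GPS IFD stores each coordinate as three rational numbers (degrees, minutes, seconds) plus a reference letter. The function names below are hypothetical, and the sketch covers only the numeric conversion, not reading or writing actual image files.

```python
from fractions import Fraction


def nmea_to_decimal(value: str, hemisphere: str) -> float:
    """Convert an NMEA 0183 field (ddmm.mmmm or dddmm.mmmm) to decimal degrees."""
    raw = float(value)
    degrees = int(raw // 100)
    minutes = raw - degrees * 100
    decimal = degrees + minutes / 60
    return -decimal if hemisphere in ("S", "W") else decimal


def decimal_to_exif_dms(decimal: float, is_latitude: bool):
    """Return (ref, (deg, min, sec)) with rational parts, as EXIF GPS tags store them."""
    if is_latitude:
        ref = "N" if decimal >= 0 else "S"
    else:
        ref = "E" if decimal >= 0 else "W"
    total = abs(decimal)
    deg = int(total)
    minutes_full = (total - deg) * 60
    minute = int(minutes_full)
    sec = Fraction((minutes_full - minute) * 60).limit_denominator(10000)
    return ref, (Fraction(deg), Fraction(minute), sec)


def exif_dms_to_decimal(ref: str, dms) -> float:
    """Convert an EXIF-style (deg, min, sec) triple back to signed decimal degrees."""
    deg, minute, sec = (float(x) for x in dms)
    decimal = deg + minute / 60 + sec / 3600
    return -decimal if ref in ("S", "W") else decimal


if __name__ == "__main__":
    lat = nmea_to_decimal("4807.038", "N")
    ref, dms = decimal_to_exif_dms(lat, is_latitude=True)
    print(lat, ref, [str(part) for part in dms])
```

The decimal-degree form is what the distance computation in Section 2 expects, so a retrieval system would typically convert the stored EXIF triple once when indexing each image.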

While most image retrieval research examines general image databases, and many systems use images from the COREL photo CDs, we use our own photo collection because it carries important geo-coded clues about image content. The image database consists of 1,040 photo images, which are downsized to a resolution of 640 × 480 in JPEG format. The images are divided into 26 groups of 40 images each. The semantic groups are used only in calculating the effectiveness: a retrieved image is considered similar if it belongs to the same group as the query image.

Since recall and precision are among the most common measures for performance evaluation in image retrieval [11], we use them to evaluate the performance of the proposed method. Because we focus on how geo-coded information improves the search results, we estimated the precision and recall curves against the number of retrieved images, as shown in Fig. 3.

Fig. 3. Precision curve for the test of image search.

To obtain more reliable estimates, every image in the database is used as a query. The results show that a significant improvement is obtained by integrating the geo-coded information with the images. The analysis shows improvements of up to 49% in average precision and 59% in average recall at 40 retrieved images, compared with using visual content only.
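The precision and recall measures used in this section can be sketched in Python as follows. The sketch assumes that relevance is decided purely by the semantic group label, as described above, and that each group contains `group_size` relevant images; the function names and the list-of-labels input format are hypothetical.

```python
def precision_recall_at_k(query_group, retrieved_groups, k, group_size):
    """Precision and recall after the top-k results.

    `retrieved_groups` lists the group label of each retrieved image in rank order.
    """
    top_k = retrieved_groups[:k]
    hits = sum(1 for group in top_k if group == query_group)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / group_size
    return precision, recall


def average_over_queries(results):
    """Average a list of (precision, recall) pairs over all queries."""
    if not results:
        return 0.0, 0.0
    n = len(results)
    return (sum(p for p, _ in results) / n, sum(r for _, r in results) / n)


if __name__ == "__main__":
    ranked = ["a", "a", "b", "a", "c"]
    print(precision_recall_at_k("a", ranked, k=4, group_size=40))
```

Evaluating these values for increasing k and averaging them over all query images would give curves of the kind reported in Fig. 3.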

4. Conclusion

This paper presented a novel image search method using geo-coded information, namely the GPS-derived position and the date/time of image capture. Experimental results show that geo-location not only plays a key role in managing and browsing images but also significantly enhances retrieval performance.


Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2018R1A2B6008255).

References

1. A. Tariq, H. Foroosh, "T-Clustering: Image clustering by tensor decomposition", International Conference on Image Processing, 2015.

2. G. Qi, X. Hua, Y. Rui, J. Tang, T. Mei, H. Zhang, "Correlative multi-label video annotation", ACM MM, pp. 17–26, 2007.

3. H. Jegou, O. Chum, "Negative evidences and co-occurences in image retrieval: The benefit of PCA and whitening", ECCV, pp. 774–787, 2012.

4. R. Jain, "Photo Retrieval: Multimedia's Chance to Solve a Real Problem for Real People", IEEE Multimedia, vol. 14, no. 3, pp. 111–112, July 2007.

5. National Geospatial-Intelligence Agency (NGA), website http://earth-info.nga.mil/gns/html/namefiles.htm.

6. B.S. Manjunath, J.R. Ohm, V. Vasudevan, A. Yamada, "Color and Texture Descriptors", IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, no. 6, pp. 703–715, 2001.

7. Nikon Technical Note, "GPS Connection to D1X and D1H", website http://www.nikonusa.com/pdf/GPS.pdf.

8. K. Betke, "The NMEA 0183 Protocol", Aug. 2001, website http://www.nmea.org.

9. "Exchangeable Image File Format for Digital Still Cameras: EXIF Version 2.2", Japan Electronics and Information Technology Industries Association (JEITA), 2002.

10. "Photo Metadata 2008 IPTC Core Specification version 1.1", International Press Telecommunications Council, 2008, website http://www.iptc.org.

11. V. Castelli, L.D. Bergman, Image Databases: Search and Retrieval of Digital Imagery, Wiley-Interscience, 2002.

12. J. Wang, J. Wang, J. Song, X. Xu, H. Shen, S. Li, "Optimized cartesian k-means", IEEE Trans. Knowl. Data Eng., vol. 27, no. 1, pp. 180–192, 2015.

13. K. Gowda, G. Krishna, "Agglomerative clustering using the concept of mutual nearest neighbourhood", Pattern Recognition, vol. 10, no. 2, pp. 105–112, 1978.

14. J. Chang, L. Wang, G. Meng, S. Xiang, C. Pan, "Deep Adaptive Image Clustering", International Conference on Computer Vision, 2017.

15. S. Xie, R. Girshick, P. Dollar, Z. Tu, K. He, "Aggregated residual transformations for deep neural networks", Computer Vision and Pattern Recognition, 2017.

16. Y.H. Lee, H.J. Kim, "Implementation of Fish Detection based on Convolutional Neural Networks", Journal of the Semiconductor & Display Technology, vol. 19, no. 3, pp. 124–129, 2020.

17. K. Simonyan, A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition", Computer Vision and Pattern Recognition, 2015.

18. Z. Wu, Y. Xiong, S.X. Yu, D. Lin, "Unsupervised feature learning via non-parametric instance discrimination", Computer Vision and Pattern Recognition, 2018.

Received: December 21, 2020; Reviewed: December 21, 2020; Accepted: December 23, 2020