ORIGINAL ARTICLE

Network intrusion detection method based on matrix factorization of their time and frequency representations

Spiros Chountasis¹ | Dimitrios Pappas² | Dimitris Sklavounos³

¹Department of Systems and Infrastructure, Independent Power Transmission Operator, Athens, Greece

²Department of Statistics, Athens University of Economics and Business, Athens, Greece

³Department of Computer Science, Metropolitan College, Athens, Greece

Correspondence

Spiros Chountasis, Department of Systems and Infrastructure, Independent Power Transmission Operator, Athens, Greece.

Email: schountasis@admie.gr

wileyonlinelibrary.com/journal/etrij ETRI Journal. 2021;43(1):152–162.

This is an Open Access article distributed under the terms of Korea Open Government License (KOGL) Type 4: Source Indication + Commercial Use Prohibition + Change Prohibition (http://www.kogl.or.kr/info/licenseTypeEn.do).

Abstract

In the last few years, detection has become a powerful methodology for network protection and security. This paper presents a new detection scheme for data recorded over a computer network. This approach is applicable to the broad scientific field of information security, including intrusion detection and prevention. The proposed method employs bidimensional (time-frequency) data representations in the form of the short-time Fourier transform and the Wigner distribution. Moreover, the method applies matrix factorization using singular value decomposition and principal component analysis of the two-dimensional data representation matrices to detect intrusions. The current scheme was evaluated using numerous tests on network activities, which were recorded and presented in the NSL-KDD and UNSW-NB15 datasets. The efficiency and robustness of the technique have been demonstrated experimentally.

KEYWORDS

network analysis, network security, principal component analysis, singular value decomposition

1 | INTRODUCTION

Researchers in the field of computer and network sciences have focused on the statistical characterization of data transfer. Additionally, they have formulated simple descriptive simulation mechanisms for security incident handling. They have attempted to develop effective intrusion detection systems (IDS) that monitor traffic and events in computer systems or networks. This data analysis could indicate possible security threats and attack vectors [1]. Matrix analysis tools can also be utilized for this purpose, by using a matrix representation of the data.

Singular value decomposition (SVD) and principal component analysis (PCA) are two major tools for exploratory data analysis, data processing, compression, and dimensionality reduction. SVD factorizes a data matrix and characterizes it through its singular values. PCA can be used for data compression, or, as in our case, for projecting the dataset on a new basis produced by its eigenvectors (eg, see [2–4]).

Computer-aided diagnosis systems may be useful in assisting the network administrator to detect network security threats, which can be of several types. Examples of intrusions are as follows: user-to-root (U2R), remote-to-local (R2L), denial-of-service (DoS), and its distributed type (DDoS). Valuable contributions to research on network security systems include the datasets of recorded network traffic in a certain period of time [5,6]. Reliable datasets of this type are the Defense Advanced Research Projects Agency (DARPA) 1998 dataset and its later versions: the Knowledge Discovery and Data Mining Competition (KDDCup) 1999 dataset as well as the NSL-KDD.

These datasets were created for the American Air Force, and they have been widely used to evaluate the performance of IDS. The method proposed in the present work initially uses the NSL-KDD dataset (which is the latest version), focusing on the source bytes attribute of TCP packets to detect R2L intrusions.

The proposed scheme considers that data can be fully expressed in a bidimensional time and frequency (TF) domain. This pictorial representation of data contains all the recent information required for network analysis and data handling. The joint TF representation of the data provides the necessary plane for the matrix decomposition technique that is adopted. The proposed scheme is experimentally evaluated. The results show that the joint TF representation of the recorded data along with SVD is a suitable technique for the detection of an R2L attack, while the PCA method is a comprehensive technique for anomaly detection in cloud networks. As defined here, anomalies in network data represent network attacks, that is, deviations from the normal behavior of network data. The attacks involved were R2L attacks from the NSL-KDD dataset and fuzzers from the UNSW-NB15 dataset.

The remainder of this paper is organized as follows. Section 2 introduces the basic theoretical concepts behind our proposed method, including detection systems, TF representations, SVD, and PCA. The detection method based on the TF representations and matrix factorization is evaluated in Sections 3 and 4: Section 3 considers R2L attacks, and Section 4 interprets the round-trip time under a fuzz test and evaluates the performance accordingly. Finally, Section 5 presents the conclusions and possible extensions of the current work.

2 | DETECTION METHODS AND SYSTEMS

A primary function of an intrusion detection method is to detect suspicious activity and report it to the network administrator in a clear, understandable, and detailed manner. The three main types of IDS are referred to as misuse, anomaly, and hybrid [7–9].

The misuse systems detect intrusions by matching suspected activities with known patterns. The anomaly systems detect attacks by identifying deviations of the observed behavior from the normal network behavior. Although many research works have been carried out on IDS, the complexity of current cyberattacks requires a hybrid detection approach. The hybrid scenario can detect anomalies [10,11] as well as provide a prognosis of the misuse systems.

The authors of [12–14] applied machine learning.

2.1 | Time-frequency representation

In the past decades, there has been an alternative development in the study of time-varying spectra. The basic idea was to devise a joint function of time and frequency, a representation that describes the energy density or signal intensity simultaneously in time and frequency. The motivation for devising such representations is to (a) find and illustrate the fraction of the energy in a certain frequency and time range, and (b) calculate the distribution of frequency at a particular time, together with the global and local moments of the distribution, such as the mean frequency and its local spread. The TF plane corresponds to two orthogonal axes for time and frequency. The short-time Fourier transform (STFT) is the result of applying the FT at different points in time on finite length (short-time) sections of a signal [15,16]. This description is fundamental to data analysis because it introduces time dependency, whereas the FT of the whole dataset is not time dependent. The STFT of a signal $s(t)$ is defined as

$$\mathrm{STFT}(t,\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} s(t')\, h(t'-t)\, e^{-i\omega t'}\, dt', \tag{1}$$

where $h(t)$ is a suitably chosen analysis window. The STFT can also be computed from the FTs $S(\omega)$ and $H(\omega)$ of the signal and the window:

$$\mathrm{STFT}(t,\omega) = \frac{1}{2\pi}\, e^{-i\omega t}\int_{-\infty}^{\infty} S(\omega')\, H(\omega'-\omega)\, e^{-i\omega' t}\, d\omega'. \tag{2}$$

The Wigner distribution (WD) is the most widely studied and applied bidimensional representation. It was first introduced in the academic field of quantum mechanics [17]. The WD is defined as

$$\mathrm{WD}(t,\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} s\!\left(t+\tfrac{1}{2}\tau\right) s^{*}\!\left(t-\tfrac{1}{2}\tau\right) e^{-i\omega\tau}\, d\tau \tag{3}$$

or

$$\mathrm{WD}(t,\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} S\!\left(\omega+\tfrac{1}{2}\upsilon\right) S^{*}\!\left(\omega-\tfrac{1}{2}\upsilon\right) e^{-it\upsilon}\, d\upsilon, \tag{4}$$

where $\tau$ and $\upsilon$ are the time and frequency lags, respectively. In this paper, the asterisk denotes complex conjugation.

The WD of a dataset can be interpreted as the pseudo-energy density of the signal, because it is real and covariant to time and frequency domain translations, but it is not always positive. The data energy in the TF region can be determined by integrating the distribution over that region.
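As an illustration of how Eqs. (1) and (3) can be evaluated on sampled data, the following Python/NumPy sketch computes a discrete STFT and a discrete WD. It is a minimal sketch under stated assumptions and not the implementation used in this work: the function names (gaussian_window, stft, wigner), the window width sigma, the zero-padding at the record edges, and the constant test record are hypothetical choices, and the normalization constants of the continuous definitions are omitted.

```python
import numpy as np

def gaussian_window(length, sigma):
    """Gaussian analysis window h, centred on the middle sample (hypothetical helper)."""
    n = np.arange(length) - (length - 1) / 2.0
    return np.exp(-0.5 * (n / sigma) ** 2)

def stft(signal, window, n_freq=256):
    """Discrete analogue of Eq. (1): FFT of the record seen through a sliding window."""
    signal = np.asarray(signal, dtype=float)
    half = len(window) // 2
    padded = np.pad(signal, half)  # zero-pad so that edge samples get a full window
    out = np.empty((n_freq, len(signal)), dtype=complex)
    for t in range(len(signal)):
        out[:, t] = np.fft.fft(padded[t:t + len(window)] * window, n=n_freq)
    return out

def wigner(signal):
    """Discrete analogue of Eq. (3): FFT over the lag m of s(t + m) s*(t - m).

    The integer lag m plays the role of tau/2, so the frequency axis is scaled
    by a factor of two relative to a plain FFT of the record.
    """
    s = np.asarray(signal, dtype=complex)
    n = len(s)
    wd = np.empty((n, n))
    for t in range(n):
        max_lag = min(t, n - 1 - t)             # lags that stay inside the record
        lags = np.arange(-max_lag, max_lag + 1)
        kernel = np.zeros(n, dtype=complex)
        kernel[lags % n] = s[t + lags] * np.conj(s[t - lags])
        wd[:, t] = np.fft.fft(kernel).real      # Hermitian kernel, so the FFT is real
    return wd

# Synthetic stand-in for a byte-count record (not data from the paper).
bytes_series = np.full(64, 100.0)
tf_stft = stft(bytes_series, gaussian_window(5, sigma=1.0))  # rows: frequency, columns: time
tf_wd = wigner(bytes_series)
```

For a constant record such as this one, both matrices peak in the zero-frequency row, and summing the WD over frequency at a fixed time returns the instantaneous energy of the record up to the factor N introduced by the unnormalized FFT.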

2.2 | SVD and PCA

SVD is a powerful matrix computation tool with various applications [2,18]. The SVD factorization of an $m \times n$ matrix $A$ has the form

$$A = U \Sigma V^{T}, \tag{5}$$

where $U$ and $V$ are orthogonal matrices that satisfy $U^{T}U = I_m$ and $V^{T}V = I_n$, where $I_n$ is the $n \times n$ identity matrix. Matrix $\Sigma$ is a diagonal matrix whose entries are called the singular values of $A$: $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_N \ge 0$. The singular values are always real numbers; they are the square roots of the eigenvalues $\lambda_i$ of matrix $AA^{T}$: $\sigma_i = \sqrt{\lambda_i}$. The singular values are arranged in descending order. The first singular values contain most of the information of the original matrix, while the last are very close or equal to zero. Therefore, the last columns of $V$ have a very small effect on the reconstruction of matrix $A$. This means that, practically, one can skip the last columns of $V$ and the corresponding zero singular values; therefore, fewer dimensions are needed to transmit the information.

SVD is a numerical method that can be applied to all matrices, square or not. SVD has the advantage of being robust to numerical errors. Moreover, it shows the geometric structure of a matrix, a property that is used in this work to determine possible differences between various examined datasets.

The main advantage of our method, which is based on SVD, is that when a small perturbation occurs on the TF plane, the singular values of the corresponding matrices differ considerably. Moreover, the singular values represent the intrinsic algebraic data properties [19].

PCA is a useful statistical technique that has been applied in fields such as face recognition and image compression. It is used to identify patterns in data and express the data by highlighting their similarities and differences. Moreover, PCA is an efficient tool to reduce the dimension of a dataset that includes a large number of interrelated variables, while retaining most of the variation. Thus, it reduces the number of dimensions without a substantial loss of information. PCA is based on the calculation of the eigenvalues and eigenvectors of the covariance matrix of the dataset, which are used as a new basis for the space. Therefore, the data expressed under the new basis will be transformed so that they are expressed in terms of the patterns between them. As a descriptive tool, PCA needs no distributional assumptions; it therefore works as an exploratory method that is suitable for various types of numerical data.

Overall, this method works in three steps. First, the components of the input vectors are orthogonalized so that they are uncorrelated with each other. Second, the resulting orthogonal components are ordered so that those with the largest variations come first. Finally, the components contributing the least to the variation in the dataset are eliminated.

The principal components of an array $x$ (which represents input data) are obtained by applying an orthonormal linear transformation, so that the elements of the principal component vector $w$ become mutually uncorrelated. The goal of the PCA is to find an orthogonal matrix $P$ that determines a change of variable, $w = Px$, so that the new variables $w_i$ are uncorrelated and arranged in the order of decreasing variance. This matrix comprises the unit eigenvectors of the covariance matrix of the dataset, called the principal components of the data. The first principal component is the eigenvector corresponding to the largest eigenvalue of the covariance matrix $\Sigma$.

From a computational point of view, SVD is a faster and more accurate method for finding the eigenvalues of $\Sigma$. Moreover, the entries of the diagonal matrix in the SVD show us how much each principal component contributes in terms of variability. The proposed technique may be considered as the detection mechanism of a network IDS (NIDS). A NIDS analyzes network traffic and uses sophisticated algorithms for detecting possible intrusions of several types. Such a system may be connected to a network in one of the most commonly used ways: a SPAN port connection, a network tap, or an inline connection [20]. Simulations of matrix factorization and time-frequency representations, as previously described, have been implemented using a computer with an Intel Core i7 processor with eight virtual cores and 32 GB of DRAM.

3 | EVALUATION OF THE PROPOSED METHOD UNDER R2L ATTACK

Anomaly-based detection is the process of comparing normal activity with the observed events that represent significant deviations from the normal behavior. An IDS using anomaly-based detection has profiles that represent the normal behavior of users, hosts, network connections, or applications. These profiles are developed by monitoring the characteristics of typical activity over a period of time. Then, the system uses statistical methods to compare the characteristics of current activity to the thresholds related to these profiles. If a deviation from the expected values is detected, the network administrator is alerted of the anomaly.

The first stage of the method is the recording of the source and destination bytes. In the beginning, it is considered that the network works normally with no attacks. Thereafter, the TCP source bytes size of the R2L intrusions was considered to evaluate the possibility of detection by using the proposed techniques. The recorded dataset for the source and destination bytes under normal operation and under R2L attacks is shown in Figure 1A and 1B, respectively. Clearly, it is impossible to detect the network intrusion from these two plots.

In Figure 2A and 2B, the real part of the FT of the recorded data is shown. In these frequency domains, a function is represented by coefficients that express it as a superposition of the exponential functions of time in the corresponding time domains. The FT in these cases is a delta function at zero frequency, with a coefficient equal to the value of the bytes. Data that are greatly extended in the time domain are closely bunched in the frequency domain.

The TF representation by means of the STFT of the specific dataset is illustrated in Figure 3. The number of samples used to compute the frequency is equal to 256. The window chosen is of the Gaussian form with a duration equal to 5. The data are confined to a line at a frequency equal to zero.

The case of energy distribution, as defined by WD, is illustrated in Figure 4. Clearly, the spectral resolution provided is better than that of the STFT. Nevertheless, both representations provide a clear depiction of the recorded data. However, the R2L intrusions can hardly be recognized from the graphs. As we have seen so far, no difference between the original data and the attacked data can be detected. Therefore, we will use matrix analysis methods to evaluate the difference between them.

Our first approach was to test the difference in the SVD of the TF representation matrices produced before and after the attack. The geometric structure of the matrices is important, and SVD clearly shows the difference between them. The magnitude of the singular values (especially the large ones) can highlight which dimensions of any vector multiplied by the matrix are affected. Hence, the difference between the SVD of the two matrices exactly shows the change in the magnitudes before and after the attack.
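The role that the singular values play here can be checked numerically. The following NumPy sketch uses an arbitrary random matrix (an assumption made only for illustration, not data from this study) to show that each singular value is the factor by which the matrix stretches the corresponding singular direction, that the singular values are the square roots of the eigenvalues of AA^T, and that discarding the smallest singular value changes the matrix by exactly that value in the spectral norm.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(6, 4))                       # arbitrary illustrative matrix

u, s, vt = np.linalg.svd(a, full_matrices=False)  # a = u @ diag(s) @ vt

# Each right singular vector v_i is mapped onto u_i, stretched by sigma_i.
stretched = a @ vt.T                              # columns are a @ v_i
expected = u * s                                  # columns are sigma_i * u_i

# The singular values are the square roots of the leading eigenvalues of a a^T.
eigvals = np.sort(np.linalg.eigvalsh(a @ a.T))[::-1][:len(s)]

# Dropping the smallest singular value gives a rank-3 approximation of a.
rank3 = (u[:, :3] * s[:3]) @ vt[:3, :]
error = np.linalg.norm(a - rank3, 2)              # spectral-norm error equals s[3]
```

The same calls apply unchanged to a TF matrix; only the input changes.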

FIGURE 1 (A) Normal bytes recording dataset and (B) bytes recording including R2L attacks. Axes: Time (s), Byte.

FIGURE 2 (A) FT for Figure 1A and (B) FT for Figure 1B. Axes: ω (rad/s), S(ω).

Figure 5 shows a 3D plot of the difference between the SVDs of the matrices. For numerical simulations, we selected the WD, as it is a TF representation that provides a high resolution. Our approach was to examine the difference between the source and destination bytes when the network operates under normal conditions and under R2L attacks. As input, we considered the diagonal matrices of the SVD produced by the WDs. In the plot, the x and y axes represent the position of each entry of the 512 × 512 matrix, while the z-value is the magnitude of each matrix element.

The diagonal structure of the matrix can be clearly seen in Figure 5. The highest values are concentrated at the origin, and they rapidly decrease.
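One way to form the plotted quantity is sketched below, assuming that the two TF matrices have the same size and that the difference is taken as attack minus normal; the sign convention is not stated in the text. The names sigma_matrix and sigma_difference, as well as the random input matrices, are hypothetical placeholders for the WD matrices.

```python
import numpy as np

def sigma_matrix(tf_matrix):
    """Diagonal matrix holding the singular values of a TF matrix in descending order."""
    return np.diag(np.linalg.svd(tf_matrix, compute_uv=False))

def sigma_difference(tf_normal, tf_attack):
    """Entry-wise difference of the two diagonal matrices (attack minus normal, by assumption)."""
    return sigma_matrix(tf_attack) - sigma_matrix(tf_normal)

rng = np.random.default_rng(1)
wd_normal = rng.normal(size=(32, 32))                    # synthetic placeholder, not real WD data
wd_attack = wd_normal + 0.1 * rng.normal(size=(32, 32))  # small synthetic perturbation

diff = sigma_difference(wd_normal, wd_attack)
```

Because both inputs to the subtraction are diagonal, every off-diagonal entry of diff is zero, which matches the diagonal structure described for the 3D plot.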

FIGURE 3 (A) STFT representation for Figure 1A and (B) STFT representation for Figure 1B. Axes: Time, Frequency.

FIGURE 4 (A) WD representation for Figure 1A and (B) WD representation for Figure 1B. Axes: Time, Frequency.

Our next approach was to use PCA to obtain a clearer view of the difference between the datasets, because PCA reduces the dimension of the data and provides a set of values that are linearly uncorrelated. Our method consists of the following steps. By taking the original data $s_1$ (before the attack) and $s_2$ (after the attack), we construct the variance-covariance matrices

$$S_1 = \frac{1}{N-1}\, s_1 s_1^{T}, \qquad S_2 = \frac{1}{N-1}\, s_2 s_2^{T},$$

where $N$ is the data length. The next step is to perform a PCA on both matrices $S_1$ and $S_2$ and make a 3D plot of the difference in the datasets. In this way, the only difference will be in the principal component of each dataset and, therefore, in the principal component of their difference. Data reduction for large datasets can be addressed using the PCA. This technique rearranges the dataset. In our case, the matrices $S_1$ and $S_2$ are created. Thus, the representation of the data is easier to analyze owing to the dimensionality reduction.
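The covariance construction and the PCA step can be sketched as follows. This is a hedged illustration rather than the code used for the experiments: the helper names (outer_covariance, pca), the synthetic records, and the decision to use the records without mean-centering are assumptions. Since each matrix is an outer product of a single record with itself, it has rank one, so only one eigenvalue is nonzero.

```python
import numpy as np

def outer_covariance(series):
    """S = s s^T / (N - 1); the record is used as given (no centring, by assumption)."""
    s = np.asarray(series, dtype=float)
    return np.outer(s, s) / (len(s) - 1)

def pca(cov):
    """Eigenvalues in descending order and the matching unit eigenvectors (columns)."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(0)
s1 = 100.0 + rng.normal(scale=1.0, size=64)   # synthetic "before" record
s2 = s1.copy()
s2[40:44] += 300.0                            # synthetic burst standing in for an attack

vals1, vecs1 = pca(outer_covariance(s1))
vals2, vecs2 = pca(outer_covariance(s2))

# Matrix whose entries would be shown in a 3D plot of the difference.
difference = outer_covariance(s2) - outer_covariance(s1)
```

In this sketch the leading eigenvalue equals the squared norm of the record divided by N - 1 and the leading eigenvector is the normalized record itself, so any change in the record shows up in the first component while the remaining eigenvalues stay at zero up to rounding.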

In Figure 6, the x and y axes represent the position of each entry of a 512 × 512 matrix, while the z-value is the magnitude of each matrix element.

The principal component is the first column of the new matrix. As we can observe, the rest of the principal components are very close to zero. The principal component clearly shows the difference between the data (derived from the matrices produced by PCA). The major advantage of the anomaly-based detection method adopted here is that it can be very effective at detecting previously unknown attacks. A threshold value sets the limit between normal and abnormal behavior by specifying a maximum acceptable level.
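The text does not specify how the threshold is chosen, so the following sketch only illustrates one common possibility: score each observed matrix by the distance between its singular values and those of a normal baseline, and set the threshold at the mean plus k standard deviations of the scores seen under normal operation. The scoring rule, the value k = 3, the function names, and all matrices are hypothetical.

```python
import numpy as np

def anomaly_score(baseline_sv, window):
    """Euclidean distance between the singular values of a window and of the baseline."""
    sv = np.linalg.svd(window, compute_uv=False)
    return float(np.linalg.norm(sv - baseline_sv))

def calibrate_threshold(normal_scores, k=3.0):
    """Maximum acceptable score: mean + k standard deviations of the normal scores."""
    scores = np.asarray(normal_scores, dtype=float)
    return float(scores.mean() + k * scores.std())

rng = np.random.default_rng(0)
baseline = np.ones((16, 16))                              # synthetic normal profile
baseline_sv = np.linalg.svd(baseline, compute_uv=False)

# Scores of 20 synthetic normal windows: the profile plus small noise.
normal_scores = [
    anomaly_score(baseline_sv, baseline + 0.01 * rng.normal(size=(16, 16)))
    for _ in range(20)
]
threshold = calibrate_threshold(normal_scores)

attacked = baseline.copy()
attacked[4:8, 4:8] += 5.0                                 # synthetic localized deviation
attack_score = anomaly_score(baseline_sv, attacked)
alert = attack_score > threshold                          # True means "raise an alert"
```

A larger k lowers the false-alarm rate at the cost of missing weaker deviations, so in practice the value would have to be tuned on recorded normal traffic.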

FIGURE 5 A 3D plot of the SVD difference between destination and source bytes when the network is under R2L attack.

FIGURE 6 A 3D plot of the difference between the PCA-analyzed data (before and after the attack).

4 | MEASURING THE ROUND-TRIP TIME IN A NETWORK FUZZ TEST

Evaluating our network intrusion technique using the existing benchmark dataset NSL-KDD yields satisfactory results. To further investigate our method, a more recently generated dataset was used: UNSW-NB15.

The NSL-KDD dataset was created by the American Air Force as a newer version of the DARPA 1998 dataset and its later version, KDDCup 1999. It includes 42 attributes, and it is cleaned and filtered. All data duplication of the previous version (KDD'99) has been removed. It includes four types of intrusions (DoS, Probe, U2R, and R2L), while the 42nd attribute is categorized as "class," which specifies whether the given instance is a normal connection or an attack of the above types [21]. The UNSW-NB15 dataset was created in the Cyber Range Lab of the Australian Centre for Cyber Security (ACCS) by the IXIA traffic generator with three virtual servers. Two of them were configured for normal traffic (no attacks), while the third one generated malicious activity in the network. All servers were connected to internet hosts via two routers, and the routers were connected to a firewall device configured to allow all the network traffic (normal and malicious) to pass through. The tcpdump tool, which was installed on one of the routers, captured the packet capture (Pcap) files of the simulation. The above configuration was used to capture the traffic generated from the IXIA tool along with the CVE database to represent real-world conditions. The dataset includes nine types of modern attacks as well as normal traffic, and it contains 49 attributes classified into a binary class (0 as "normal" and 1 as "attack") [22].

FIGURE 7 (A) SYN/ACK recording dataset and (B) ACK recording dataset. Horizontal axis: Time.

FIGURE 8 Total recording for an SYN/ACK and ACK response. Horizontal axis: Time.

FIGURE 9 (A) WD representation for Figure 7A and (B) WD representation for Figure 7B. Axes: Time, Frequency.

FIGURE 10 WD representation for the complete SYN/ACK and ACK cycle as presented in Figure 8. Axes: Time, Frequency.

Fuzz testing is a black-box procedure of discovering faults by providing randomized inputs to the system to find cases that might cause a crash. IT professionals often use this specific technique, as it can provide results with little effort, giving an overall picture of the robustness of the target system. A three-way handshake is a method used in a TCP/IP network to create a connection between a local host/client and a server. It is a standard communication technique that requires both the client and the server to exchange SYN and ACK packets before the actual data communication begins.

A three-way handshake (also known as a TCP handshake) is performed as follows:

• A client sends a SYN data packet over an IP network to a server. The objective of this packet is to determine whether the server is open for new connections.

• The server receives the SYN packet from the client node. It responds and returns a confirmation receipt in the form of a SYN/ACK packet.

• After receiving the SYN/ACK from the server, the client responds with an ACK packet.

Upon completion of this process, the connection is established, and the host and server can communicate. Here, we investigated the round-trip time (RTT): the time taken for an outgoing TCP client packet to be answered by the server. Measuring and monitoring the network's RTT allows network operators and end-users to understand their network performance. By comparing the RTT for the normal operation of the network and under a fuzz test, critical information regarding a possible network performance degradation can be extracted.
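A minimal sketch of how such delays could be derived from handshake timestamps is given below. The HandshakeTimes record, the delays helper, and every timestamp are hypothetical and chosen only so that the two groups differ; they are not values from the UNSW-NB15 dataset.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class HandshakeTimes:
    syn: float       # time at which the client SYN is observed (seconds)
    syn_ack: float   # time at which the server SYN/ACK is observed
    ack: float       # time at which the client ACK is observed

def delays(h):
    """Return (SYN to SYN/ACK delay, SYN/ACK to ACK delay, their sum)."""
    server_delay = h.syn_ack - h.syn
    client_delay = h.ack - h.syn_ack
    return server_delay, client_delay, server_delay + client_delay

# Synthetic handshakes: two under normal operation, two under a fuzz test.
normal = [HandshakeTimes(0.00, 0.05, 0.08), HandshakeTimes(1.00, 1.06, 1.09)]
fuzzed = [HandshakeTimes(2.00, 2.15, 2.27), HandshakeTimes(3.00, 3.12, 3.25)]

mean_normal = mean(delays(h)[2] for h in normal)
mean_fuzzed = mean(delays(h)[2] for h in fuzzed)
```

Keeping the two partial delays separate as well as their sum makes it possible to analyze the server response, the client response, and the complete cycle individually.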

In the following experiment, the dataset was divided into two parts. The first 244 samples were recorded under normal operation, and the following 256 samples were recorded under the fuzz test. Figure 7 illustrates the SYN/ACK and ACK packets separately. Moreover, Figure 8 shows the sum of the SYN/ACK and ACK packets.

Accordingly, the WDs of the server (SYN/ACK) packets, the client (ACK) packets, and their sum are illustrated in Figures 9A, 9B, and 10, respectively.

Our approach was to test the difference between the fuzzer and normal operations using the diagonal matrices of the SVD produced by the WDs. In the 3D plots of a matrix, the x and y axes represent the position of each entry of the matrix, and the z-value represents the magnitude of each matrix element.

FIGURE 11 (A) Difference of the diagonal components of the SVD between the fuzzer and normal operations for the sum of SYN/ACK and ACK packets. (B) 3D plot of the diagonal components of the SVD between the fuzzer and normal operations for the sum of SYN/ACK and ACK packets. Axes in (B): Time, Frequency.

FIGURE 12 (A) 3D plot of the difference for the diagonal components of the SVD between the fuzzer and normal operations for the ACK packets. (B) 3D plot of the difference for the diagonal components of the SVD between the fuzzer and normal operations for the SYN/ACK packets. Axes: Time, Frequency.

Figure 11A shows the plot of the difference between the SVD of the matrices that include the summation of the server (SYN/ACK) and client (ACK) packets. Similarly, a 3D version of the figure is shown in Figure 11B. The diagonal structure of the matrix can be clearly seen in this figure. The highest values are concentrated at the origin, and they rapidly decrease.

For completeness, we also present the plots of the difference between the diagonal components of the SVD for the fuzzer and normal operations for the ACK and SYN/ACK packets. The 3D plots are shown in Figure 12A and 12B, respectively.

Finally, Figures 13–15 show the differences between the PCA of the presented datasets. Figure 13 shows the 3D plot of the difference between the PCA of the SYN/ACK data under normal operation and the PCA of the SYN/ACK data under the fuzz test. In Figure 14, the 3D plot illustrates the same difference for the ACK data. In Figure 15, the 3D plot shows the difference for the total sum (SYN/ACK added to the ACK data).

According to our results, the principal component of the RTT data substantially differs between the normal dataset and the fuzz-test dataset. The minor components that follow do not show any differences between them.

FIGURE 13 3D plot of the difference between the PCA of the SYN/ACK data (normal and fuzz test).

FIGURE 14 3D plot of the difference between the PCA of the ACK data (normal and fuzz test).

FIGURE 15 3D plot of the difference between the PCA of the total SYN/ACK added with the ACK data (normal and fuzz test).

5 | CONCLUSION

Network intrusion has become the most dangerous threat to many organizations in safeguarding their vital data and system resources. This study aimed to derive new insights in the detection process to monitor events occurring in a network and analyze them for indications of possible incidents, system violations, or imminent network threats. In general, this was done through the development of a framework for a TF representation. This work introduces a new scheme for detection in the matrix decomposed domain that is closely related to the quadratic distributions (such as the STFT and WD).

The proposed scheme has the advantages of TF domain data representations.

SVD and PCA were found to be powerful tools for analyzing data. Applying them on the TF plane could accentuate small differences with a high accuracy and clearly indicate them. The experimental evaluation using the NSL-KDD and UNSW-NB15 benchmark datasets showed that the detection ability of the proposed method is very high for network attacks. Thus, the proposed method can compete with the current well-established network detection techniques. Future perspectives of this work include the formulation of a moment-based detection scheme, which is adaptive to intensity distribution.

ORCID

Spiros Chountasis https://orcid.org/0000-0002-6638-1024

REFERENCES

1. A. Kundu, S. Sural, and A. K. Majumdar, Database intrusion detection using sequence alignment, Int. J. Inf. Security 9 (2010), 179–191.

2. D. Meyer, Matrix Analysis And Applied Linear Algebra, SIAM, Philadelphia, 2000.

3. H. Demirel, C. Ozcinar, and G. Anbarjafari, Satellite image contrast enhancement using discrete wavelet transform and singular value decomposition, IEEE Geosci. Remote Sens. Lett. 7 (2010), 333–337.

4. N. Halko, P. G. Martinsson, and J. A. Tropp, Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions, SIAM Rev. 53 (2011), 217–288.

5. H. Anat and J. Darcy, The impact of denial of service attack announcements on the market value of firms, Risk Manage. Insurance Rev. 6 (2003), 97–121.

6. S. Paliwal and G. Ravindra, Denial-of-service, probing and remote to user (R2L) attack detection using genetic algorithm, Int. J. Comput. Applicat. 60 (2012), 57–62.

7. S. Antonatos, K. Anagnostakis, and E. Markatos, Generating realistic workloads for network intrusion detection systems, in Proc. ACM Workshop Softw. Performance (Redwood City, CA, USA), Jan. 2004, pp. 1–9.

8. E. Ireland, Intrusion detection with genetic algorithms and fuzzy logic, in Proc. UMMC SciSenior Seminar Conf. (Morris, MN, USA), 2013, pp. 1–30.

9. K. Scarfone and P. Mell, Special Publication 800-94: Guide to intrusion detection and prevention systems (IDPS), National Institute of Standards and Technology (NIST), 2007.

10. P. Garcia-Teodoro et al., Anomaly-based network intrusion detection: Techniques, systems and challenges, Comput. Security 28 (2009), 18–28.

11. K. Wang, J. Salvatore, and S. J. Stolfo, Recent Advances in Intrusion Detection, In Anomalous payload-based network intrusion detection, Springer: Berlin Heidelberg, 2007, pp. 203–222.

12. L. Tan, B. Brotherton, and T. Sherwood, Bit-split string-matching engines for intrusion detection and prevention, ACM Trans. Architecture Code Optimization 3 (2006), 3–34.

13. Y. Qu and Q. Lu, Effectively mining network traffic intelligence to detect malicious stealthy port scanning to cloud servers, J. Internet Technol. 15 (2014), 841–852.

14. K. Watanabe, N. Tsuruoka, and R. Himeno, Performance of network intrusion detection cluster system, in Proc. Int. Symp. High Performance Comput. (Tokyo, Japan), Oct. 2003, pp. 278–287.

15. M. J. Bastiaans, T. Alieva, and J. Stankovic, On rotated time-frequency kernels, IEEE Signal Process. Lett. 9 (2002), 378–381.

16. F. Hlawatsch and G. F. Boudreaux-Bartels, Linear and quadratic time-frequency signal representations, IEEE Signal Process Mag. 9 (1992), 21–67.

17. L. Cohen, Time-frequency distributions - A review, Proc. IEEE 77 (1989), 941–981.

18. S. Chountasis, D. Pappas, and V. N. Katsikis, Signal watermarking in bi-dimensional representations using matrix factorizations, Comput. Appl. Math. 36 (2017), 341–357.

19. D. Lay, Linear Algebra and its Applications, 4th ed, Addison-Wesley, Boston, MA, USA, 2012.

20. H. Liu, C. Xiangdong, and L. Shalini, Understanding modern intrusion detection systems: A survey, arXiv preprint, 2017, arXiv:1708.07174v2 [cs.CR].

21. P. Aggarwala and S. K. Sharma, Analysis of KDD dataset attributes-class wise for intrusion detection, Procedia Comput. Sci. 57 (2015), 842–851.

22. N. Moustafa and J. Slay, The evaluation of network anomaly detection systems: Statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set, Inf. Secur. J. 25 (2016), 18–31.

AUTHOR BIOGRAPHIES

Spiros Chountasis received his BEng (Hons) and PhD in electrical engineering and electronics from the University of Liverpool, Liverpool, UK, in 1996 and 1999, respectively.

During his postdoctoral position under the European Union's Training and Mobility of Researchers program (from 1999 to 2001) at the Institute for Scientific Interchange, Turin, Italy, he studied quantum information and quantum computing. Since 2005, he has been a senior engineer in the Department of Systems and Infrastructure, Independent Power Transmission Operator, Athens, Greece. His research expertise is in the areas of signal/image processing and computational methods.

Dimitrios Pappas received his BSc degree in mathematics from the University of Ioannina, Ioannina, Greece, in 1990, and his MSc and PhD in applied mathematics from the National Technical University of Athens, Athens, Greece, in 2000 and 2006, respectively. Since 2007, he has been teaching at the Statistics Department of the Athens University of Economics and Business, Athens, Greece, as a visiting lecturer. His main research interests are matrix analysis, numerical linear algebra, and applications of linear algebra in signal and image processing.

Dimitris Sklavounos received his BEng degree in electronics and communication engineering from the School of Communications Technology and Mathematical Sciences, University of North London, London, UK, in 1992. He received his MSc and PhD in data communications from the School of Electronics and Computer Engineering, Brunel University, London, UK, in 2004 and 2016, respectively. From 1992 to 2015, he worked in Icom Computer Systems Ltd., Athens, Greece, as a technical director. Along with this, he was a lecturer at the University of West Attica, Athens, Greece (from 2007 to 2012), and IST College, Athens, Greece (from 2014 to 2016). Since 2016, he has been with the Department of Computer Science, Metropolitan College, Athens, Greece, where he is currently a lecturer and program leader. His research interests include network security, machine learning, and energy efficiency systems.