A Novel Technique to Classify the Network Data by Using OCC with SVM

Raghavendra Sai N; Satya Rajesh K

doi:10.4172/0976-4860.1000201

Review Article - (2018) Volume 9, Issue 1

View PDF Download PDF

A Novel Technique to Classify the Network Data by Using OCC with SVM

Raghavendra Sai N¹^* and Satya Rajesh K²: ¹Department of Computer Science and Engineering, SRR and CVR Degree College, Vijayawada, Andhra Pradesh, India; ²Department of Computer Science, Bharathiar University, Tamilnadu, India

^*Corresponding Author: Raghavendra Sai N, Research Scholar, Department of Computer Science and Engineering, SRR and CVR Degree College, Vijayawada, Andhra Pradesh, India Email:

Abstract

One class grouping perceives the target class from each and every unique class using simply getting ready data from the goal class. One class characterization is fitting for those conditions where oddities are not speaking to well in the preparation set. One-class learning, or unsupervised SVM, goes for confining data from the beginning stage in the high-dimensional, pointer space (not the main marker space), and is an estimation used for special case area. Bolster vector machine is a machine learning method that is for the most part used for data examining and design perceiving. Bolster vector machines are overseen learning models with related learning counts that separate data and perceive plans, used for grouping and relapse examination. In the present paper, we are going to introduce a mixture characterization strategy by coordinating the "neighbourhood Support Vector Machine classifiers" with calculated relapse strategies; i.e., using a separation and vanquish technique. The estimation container starting of crossover technique presented now is still in Support Vector Machine.

Keywords: Logistic regression; Support vector machine; One class classifier

Introduction

Ordinary multi-class arrangement calculations intend to order an obscure question into one of a few pre-characterized classifications. An issue emerges when the obscure protest does not have a place among that classification. In this class grouping, among the classes (alluded to as the positive class or host class) is very much described by occasions in the preparation information [1,2]. Alternate class (non-target), it will either have no occurrences by any means, not very many of them, or they don't frame a measurably illustrative specimen of the contrary idea. To propel the significance of one-class characterization, just think the situations. This order will be significant in distinguishing automation issues, take an example. A classifier ought to identify where the engine is demonstrating strange/defective conduct. Illustrations in the complex functionality of the engine (through class preparing information) are anything but difficult to acquire. Then again, most blames won't have happened so one will have next to zero preparing information for the adverse class. As in other case, a customary paired classification for content or site content necessary laborious auto-handling to gather gloomy preparing illustrations. For instance, keeping in mind the end goal to build a landing page classifier [3], gathering test of landing pages (positive preparing cases) is moderately simple, however gathering tests of non-homepages (negative preparing cases) is exceptionally testing since it may not speak to the negative idea consistently and may include human inclination.

One Class Classifier (OCC) Versus Multi-class Classification

In this routine multiple-class gathering issue, information from (minimum two) classes are accessible and as far as possible is reinforced by the proximity of outline tests for all class. Moya et al. [4] start the term OCC in the examination detail. Assorted pros used distinctive terminlogy to demonstrate near thoughts, for instance, Outlier Detection [5], Novelty Detection [6] or Concept Learning [7]. These keywords begin as an eventual outcome of multiple operations to which One-Class Classifier has been associated. The disadvantages which are knowledgeable about the typical request issues, for instance, the figuring of oversight rates, calculating the multifaceted idea of an answer, the scourge of individuality, the hypothesis of the system, and what not, moreover displayed in One-Class-Classifier, and once in a while end up being substantially more unmistakable. As communicated some time recently, in OCC endeavors, the gloomy information is obliged in the scattering, so only a solitary issue of as far as possible can be settled completely by using the information. This results issue of OC gathering difficult when compared to conventional parallel course of action. The endeavour in One class classifier is to describe a gathering limit round the target class, with the ultimate objective that it recognizes however many inquiries as could be normal the situation being what it is from the affirmative class, meanwhile the class restrains the shot of enduring gloomy (or special case) instances. So, there is only a solitary part of the point of confinement can be settled, in one-classclassifier, it is difficult to pick, on begin of just a single class how solidly the cut-off must apt in all of the headings round the information. It is furthermore difficult to pick which credits are used to identify the best division of the affirmative and negative class instances. In specific, where the breaking point of the information is prolong and non-raised, the specified number of planning things may be huge. Therefore it isn't strange that OC arrange figuring’s will need a greater number getting ready cases in regard to conventional multi-class gathering counts [2].

In my proposed Research As system based PC frameworks contribute imperative parts in show day society, they have turned into the objective of interruptions by adversaries and culprits. With the advancement of web, arrange security turns into an imperative factor of PC innovation. In this manner, the part of Intrusion Detection Systems (IDS), as exceptional reason gadgets to recognize abnormality lies and assaults in the system is winding up more essential as it accumulates and investigates data from different zones inside a PC or a system to distinguish conceivable security breaks. The examination in the interruption location field has been for the most part centered on irregularity based and abuse based recognition procedures. Information mining systems are utilized to investigate and break down expansive dataset and find valuable examples. Arrangement is the classification that comprises of recognizable proof of class marks of records that are regularly depicted by setoff includes in dataset. The term Knowledge Discovery from information (KDD) alludes to the computerized procedure of learning disclosure (information mining) from databases, it involves many advances in particular information cleaning, information reconciliation, information determination, information change, information mining, design assessment and learning portrayal. The point of this paper is to build up a framework which utilizes different pre-processing techniques, for example, Feature Selection and Discretization. With the assistance of Feature determination calculation required highlights are chosen and because of Discretization the information is ruined which can be connected to different classifier calculations, for example, Logistic Regression and SVM.

Literature Survey

Datta H Deshmukh et al. built up a framework which utilizes prepreparing techniques like element determination and discretization. With the assistance of highlight choice calculation required highlights are chosen and because of discretization the informational indexes are disparaged which is then connected to classifier calculations like Naive Bayes, Hidden Naive Bayes, NB Tree. The proposed framework discloses the need to relate information mining procedures to sort out events to aggregate mastermind attacks and with a particular true objective to grow the precision of the classifier they realized pre-taking care of methodologies like Feature decision and discretization on NSLKDD dataset [3]. By using Fast Correlation based Filter Algorithm there was an undertaking to overthrown the issue of high visual of input data. Gullible Bayes has disadvantage of unexpected opportunity feel to crush the particular problem they executed HNBC and NB Tree classifier. Thusly this upgraded the precision of the classifier and reduced the misstep rate. The yield of the implementing system are verified for bona fide positive, honest to goodness negative, false positive and false negative. In light of these qualities exactness and mix-up rate of each classifier was handled.

Adetunmbi A Olusola et al. [6] considered the significance of every part in KDD 99 interference disclosure dataset for distinguishing proof of individual class. Repulsive set level of reliance and dependence degrees of every class were used to picking the frequent sharpened features for each class. Picking the perfect attribute is trying, yet it should be evaluated to diminish the measure of symptoms for powerful taking care of fast and to clear the unessential, dreary and improper data for judicious precision.

Mahbod Tavallaee et al. [8] directed a measurable investigation on KDD CUP 99 informational collection, they discovered some essential issues which to a great degree effects the introduction of framed frameworks, and results in an very dull assessment of inconsistency discovery technique. To address the problems, they have foreseen another informational index, NSL-KDD which has preferences over the first KDD informational index.

Sunitha Beniwal et al. [9] examined an assortment of CFS method in information mining. Arrangement is a technique utilized for finding classes of secretive information. Highlight choice is utilized to maintain a strategic distance from excess fitting and to improve display execution to give speedier and additionally reliable models. Different techniques for order exist like Bayesian, choice trees, and so on.

Lei Yu et al. [10] presented a novel idea, Fast relationship based channel answer for the element choice of high dimensional information which can distinguish pertinent highlights and additionally repetition among important highlights without match insightful connection examination. Highlight determination has two classes they are channel model and wrapper demonstrate.

Bolon-Canedo et al. proposed another approach that comprises of mix of discretization and channel techniques intended for enhancing classification execution in KDD CUP99 dataset. Investigation of KDD Cup 99 dataset recommends that there are a few highlights which are superfluous and connected. In this work amid highlight choice process channel strategy is utilized which permits to reduce dimensionality of the dataset. This strategy was picked in light of huge size of the KDD Cup'99 dataset.

Amudha et al. [11] Have done assessment consider on execution of information mining arrangement calculations in particular J48, Naive Bayes, NB Tree and Random Forest utilizing KDDCup'99 dataset. Information mining systems are the new approach for interruption discovery. Since the nature of the component choice techniques is an imperative factor that influences the adequacy of IDS it is important to assess the execution of information mining characterization calculations to be specific Naive Bayes, NB Tree, and Random Forest utilizing KDD Cup'99 dataset. Here the primary concentrate is on Correlation Highlight Selection so as to get ideal arrangement of highlights for characterization however the recognition execution of these strategies nearly depends on the colossal sum and high caliber of preparing tests. At the point when information mining is brought into the interruption discovery it principally concentrate on two issues that are to set up the versatile component dataset and to enhance the recognition rate. In this paper they created arrangement of analyses on KDDCup'99 dataset for characterizing the assault and to inspect the viability of relationship include choice measure the outcomes demonstrate that Random Forest gives better precision, identification rate, false alert rate and NB Tree gives better exactness.

Mrutyunjaya Panda et al. [12] did a near investigation of information digging calculations for arrange interruption discovery. In this paper, the introduction of three all around perceived information mining classifier calculations to be specific, ID3, J48 and Nave Bayes were assessed in view of the 10-overlap cross approval test. They looked at the viability of the order calculation Nave Bayes with the choice tree calculations in particular, ID3 and J48.

One Class Classification

Graph one-class portrayal has furthermore been known as interest or special case disclosure. Not the same as ought not out of the ordinary course of action, it tests data from only a solitary class, called the goal class, are all around depicted, while there are no or couple of examples from interchange class (moreover called the inconsistency class). The one-class gathering issue changes in a single essential point of view from the standard request issue. In OC game plan it is normal that solitary information among one of the classes, the goal class, is open. This infers simply case objects of the destination class can be used and that no data about exchange class of special case instance is accessible. The point of confinement midst the two classes must be surveyed from information of simply the average, real class. The errand is to portray a breaking point round the goal class, to such a degree, to the point that it recognizes however a significant part of the target dissents as could be normal, while it restrains the shot of enduring oddity questions. A classifier, i.e., a limit which yields a class name for every data challenge, can't be worked from known principles. Thusly, in outline affirmation or ML, one tries to conclude a classifier from a (compelled) game plan of planning cases. The use of representations thusly lifts the need to explicitly express the benchmarks for the gathering by the customer. The plan is to secure designs and taking in standards to pick up from the delineations and anticipate the characteristics of future articles.

The goal of the One-Class Classification is to perceive a game plan of target articles and all other possible things. It is generally used to perceive new inquiries that take after a known course of action of things. Right when another inquiry does not appear like the data, it is likely going to be an abnormality or an anomaly. When it is recognized by the information depiction, it can be used with higher condense in a following gathering Different methods have been delivered to make a data portrayal. Overall the probability thickness of the target set is illustrated. This requires a far reaching number of tests to crush the scourge of dimensionality. An unexpected methodology in comparison to surveying a probability thickness gage exists. It is possible to use the partition to model or just to assess the point of confinement around the class without assessing a probability thickness.

One Class Classification Methods there are four fundamental models, the assistance vector data depiction, k-infers gathering, k-center procedure and an auto-encoder neural framework. Here an illustrative design is fitted to the information and the similarities to this design are used. In the SVDD a hyper circle is put round the information. By implementing the bit trap the model ends up being more versatile to take after the traits in the information. Alternatively to the host thickness the partition to the point of convergence of the hyper circle is used. In the k-means and k-center system the information is assembled, and the partition to the nearest show is used. Ultimately in the auto-encoder mastermind the model is set up to address the data outline at the yield layer. The framework contains one bottleneck layer to oblige it to take in a (nonlinear) subspace through the data. The entertainment screw up of the inquiry in the yield layer is used as detachment to the model. The usage of a data space portrayal procedure is propelled by the assistance vector machine by, called the SVDD (Support Vector Domain Description) [13]. This method can be used for interest or oddity acknowledgment. A circularly shaped decision constrain around a course of action of articles is produced by a plan of assistance vectors depicting as far as possible. It has the probability of changing the data to new component spaces with less calculation cost. By using the changed data, this Support Vector Domain Description can get more versatile and more correct data delineations. The misstep of the essential kind, the piece of the planning instances which will be abandoned, can be assessed in a flash from the delineation without the usage of a free test set, which makes this methodology data capable. The SVDD is differentiated and other exemption recognizable proof procedures on honest to goodness data.

SVM Classification

A help vector machine prospers a hyperplane or set of hyperplanes in a limitless space, which can be used for arranging, or various assignments. Given an adjustment of adapting explanations, each set apart as consisting a place with one of two classifications, a Support Vector Machine preparing calculation constructs a design that allots out new cases into one classification or the other, making it a nonprobabilistic double straight classifier. A Support Vector Machine display is an imitation of the cases as concentrates in space, mapped with the goal that the cases of the distinct arrangements are segregated by an unmistakable hole that is as wide as could be expected under the precedence. New cases are then allotted into that same space and predicted to have a place with a classification in light of which side of the hole they fall on. Naturally, a great segregation is completed by the hyperplane that has the largest dissolution to the closest preparing knowledge purpose of any class (so called utilitarian edge), since all in all the larger the edge the lower the conjecture mistake of the classifier [14].

Logistic Regression

Strategic Regression Logistic relapse is utilized to foresee the likelihood of event of an occasion by fitting information to a calculated bend. A strategic relapse demonstrates is worked as the accompanying condition:

logit (y) = c0 + c1x1 + c2x2 + • + ckxk,

Where x1, x2, •, xk are preditors, y is a reaction to foresee, and

logit (y) = ln (y1−y).

The above condition can likewise be composed as

y = 1 + e−(c0+c1x1+c2x2+•••+ckxk).

Strategic relapse can be worked with funcation glm by setting family to binomial (link="logit").

The summed up direct model (GLM) sums up straight relapse by enabling the straight model to be identified with the reaction variable by means of a connection work and by permitting the size of the fluctuation of every estimation to be an element of its anticipated esteem. It brings together different other factual models, including direct relapse, calculated relapse and Poisson relapse. Capacity glm is utilized to fit summed up direct models, indicated by giving a representative portrayal of the straight indicator and a depiction of the mistake dispersion.

Discussion

Problem statements

A number of applicable data recovery characterization issues are one-class arrangement issues on a fundamental level. i.e., named information is accessible for one class, the supposed target class, and normal separation based grouping approaches, is then parallel or multiclass, and is not appropriate. Accomplishing a high adequacy when taking care of one-class issues is troublesome in any case and it turns out to be significantly all the more difficult when the objective class information is multimodal, which is frequently the case [15]. To address these worries, we propose a group based one-class gathering that comprises of four stages:

1. Applying a bunching calculation to the objective class information,

2. Preparing an individual one-class classifier for each of the distinguished groups,

3. Collecting the choices of the individual classifiers, and

4. Choosing the best fitting bunching model.

Drawbacks in K-means:

1. K-implies expect the difference of the appropriation of each quality (variable) is round;

2. All factors have a similar difference;

3. The earlier likelihood for all k bunches is the same i.e., each group has generally break even with number of perceptions;

In the event that any of these 3 presumptions are disregarded, at that point k-means will fizzle.

1. Difficult to anticipate K-value.

2. With worldwide group, it didn't function admirably.

3. Different beginning parcels can bring about various last bunches.

4. It doesn't function admirably with groups (in the first information) of Different size and Different thickness

One class classifier with logistic regression:

Relapse models for clustered data broadly; there are three general methodologies for dealing with bunching in relapse models:

1. Acquaint arbitrary impacts with represent bunching.

2. Acquaint settled impacts with represent grouping.

3. Disregard clustering...but be a "sharp ostrich".

We swing now to a computation for adjustment called multinomial strategic relapse, here and there alluded to inside dialect fitting as most extreme Max Ent entropy displaying, Max Ent for short. Computed relapse has a place with the group of classifiers known as the aumenred or log-direct classifiers. Like gullible Bayes, it log-direct classifier works by splitting some arrangement of weighted highlights from the data, taking logs, and consolidating them straight (implying that each component is replicated by a weight and afterward included). In fact, strategic relapse alludes to a classifier that arranges a perception into one of two classes, and multinomial calculated relapse is utilized while ordering into more than two classes (Table 1).

One Class Classification	SVM Classification
One Class Contains data from only one class, target class	SVM contains data of two or more classes
Goal is to create a description of one class of objects and distinguish from outliers	Goal is to create hyper plane with maximum margin between two classes
Decision boundary is estimated in all directions in the feature space around the target class.	Hyper plane is created in between datasets to indicate which class it belongs to
Software finds an appropriate bias term such that outlier fraction of the observations in the training set	Software attempts to remove 100* outlier fraction % of observations when the optimizations algorithms converges
One Class Classifier is a traditional Classifier	SVM Classifier is a linear Classifier
Used for Outlier detection Novelty Detection	Used for Classification and Regression
Less parameters, Less Training Data	More parameters, More Training Data
Term Coined by Moya and Hush	Term Coined by Vladmir, Vapnik

Table 1: Difference between One class and SVM classification.

Implementation

Anomaly detection using SVM dataset: For carrying out intrusion detection for Anomaly based attacks and Misuse based attacks we had two data sets Dataset_Anomaly.csv and Dataset_Misuse.csv in the previous phase. In the anomaly detection data set, the class or prediction variable is either Normal which represents a normal case or an Attack [16]. Contrary to the anomaly detection data set, the misuse detection data set has a class variable Normal or Name of the attack which represents a specific type of attack such as Smurf, NMap, Rootkit, etc. We carry out data cleaning on Dataset_Anomaly.csv and Dataset_ Misuse.csv using Weka's to obtain Dataset_Anomaly_Attribute Selection.csv and Dataset_Misuse_AttributeSelection.csv which has less attributes that help speed our NN (Table 1) [17].

Anomaly and misuse IDS operation using NN

The 'neuralnet' package is available in R and is open source (Figures 1 and 2). It was used for our neural network based IDS and Analysis. The package provides functions to both generate the neural network and carry out classification (Figure 3). Our attribute values were used to create formula that is supplied to 'neuralnet' function [18]. The neuralnet function returned an object that has all relevant information about the neural network and can be further used to derive our results (Figure 4).

international-journal-of-advancements-in-technology-data-1

Figure 1: Statistical data-1.

international-journal-of-advancements-in-technology-data-2

Figure 2: Statistical data-2.

international-journal-of-advancements-in-technology-data-3

Figure 3: Statistical data-3.

international-journal-of-advancements-in-technology-data-4

Figure 4: Statistical data-4.

Conclusion

In this paper we produced the effective results to classify Network Data by using SVM and also propose a hybrid classification method by integrating the “local Support Vector Machine classifiers” with logistic regression techniques; i.e., by a divide-and-conquer method. The calculation bottle neck of the hybrid method proposed here is still in the Support Vector Machine.

References

Tax D, Duin R (2001) Uniform object generation for optimizing one-class classifiers. J Mach Learn Res 2:155-173.
Ritter G, Gallegos M (1997) Outliers in statistical pattern recognition and an application to automatic chromosome classification. Pattern Recognition Letters 18: 525-539.
Bishop C (1994) Novelty detection and neural network validation. IEEE Proceedingson vision image and signal processing, special issue on applications of neural networks 141: 217-222.
Japkowicz N (1999) Concept-learning in the absence of counterexamples: Anautoassociation-based approach to classification. PhD thesis, New Brunswick Rutgers, The State University of New Jersey, USA.
Dokas P, Ertoz L, Kumar V, Lazarevic A, Srivastava J, et al. (2002) Datamining for network intrusion detection. In: proceedings of NSF workshop on Next Generation Data mining 2: 21-30.
Olusola AA, Adeola SO,Daramola OA (2010) Analysis of NSL-KDD’99 intrusion detection dataset for selection of relevance features. Proceedings of the world congress on Engineering and Computer Science 1: 2-10.
Kumar M, Hanumanthappa M, Suresh KTV (2012) Intrusion detection system using decision tree algorithm. Communication technology (ICCT) IEEE 14thInternational Conference on IEEE, Munich.
Tavallaee M, Ebrahim B, Wei L, Ali AG (2009)A detailed analysis of the KDD CUP 99 data set. Proceedings of the second IEEE symposium on computational intelligence for security and defence applications.
Beniwal S, Jitender A (2012) Classification and feature selec-tion techniques in data mining. IntJEngg Res Technol 1: 2012.
Yu L, Huan L (2003) Feature selection for high-dimensional data: A fast correlation based filter solution. In ICML 3: 856-863.
Amudha P, Abdul RH (2011) Performance analysis of data mining approaches in intrusion detection. Process automation, control and computing (PACC),International Conference on IEEE, Seattle, USA.
Panda M, Manas RP (2008)A comparative study of data mining algorithms for network intrusion detection. Emerging Trends in Engineering and TechnologyICETET-08. First International Conference on IEEE, Nagpur, India.
Wenke L,StolfoSJ,Mok KW (1999) A data mining framework for building intrusion detection models. Security and Privacy. Proceedings of the 1999 IEEE Symposium on IEEE, India.
Nguyen HA, Deokjai C (2008) Application of data mining to network intrusion detection:classifier selection model. Challenges for next generation network operations and service management. Springer berlin Heidelberg: 399-408.
Depren O, Topallar M, Anarim E, Ciliz MK (2005) An intelligent intrusion detection system (IDS) for anomaly and misuse detection in computer networks. Expert systems with Applications 29: 713-722.
Kohavi R (1996) Scalingup the accuracy of naive-bayes classifiers: A decision-tree hybrid. In KDD: 202-207.
Zhang H, Liangxiao J, Jiang S (2005) Hidden naive bayes. Proceedings of the 20th national conference on artificial intelligence 2: 919-924.
Shehroz SK, Michael GM (2014) One-class classification: Taxonomy of study and review of techniques. The Knowledge Engineering Review 29: 345-374.

Citation: Sai RN, Rajesh SK (2018) A Novel Technique to Classify the Network Data by Using OCC with SVM. Int J Adv Technol 9: 201.

Copyright: ©2018 Sai RN, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

International Journal of Advancements in TechnologyOpen Access

A Novel Technique to Classify the Network Data by Using OCC with SVM

Abstract

Introduction

One Class Classifier (OCC) Versus Multi-class Classification

Literature Survey

One Class Classification

SVM Classification

Logistic Regression

Discussion

Conclusion

References

International Journal of Advancements in Technology
Open Access