CNN Analysis on Security Data: Deriving Natural Order in a Nonnatural World
Abstract
Neural networks trained with backpropagation are a powerful deep learning tool, and Convolutional Neural Networks (CNNs) extended them further. CNNs gained prominence in detecting and classifying objects within images. Images are grids, or matrices, of data aligned in a specific pattern, and that structure is naturally defined by the photons that produced the image. CNNs have since proven valuable in non-image tasks, detecting and classifying variations in other kinds of data. When CNNs are applied to other sources, those data often have a naturally defined order, but what of cases where no natural order exists? CNNs analyze the underlying structure of images, extracting features from the mathematical patterns by which objects are identified. When data is not naturally ordered, reproducing similar mathematical structures should improve CNN performance. One example is digital data on security breaches. Such data rarely has a naturally derived order; its layout is usually defined by a specification or by arbitrary log entries. Security is also an area where high performance matters, so research that yields improvement has merit. This work examines the topic by exploring the mathematical structure of images and then analyzing how a CNN identifies those patterns. It then presents an algorithm for mimicking a similar structure in nonnatural security data, followed by testing that compares the original specification structure, statistically derived image-related schemes, and a set of random orders. Also included is an examination of current visualization tools to gain an understanding of parameter behavior.
Testing this hypothesis with different data sets and multiple CNN models shows that using mathematical relationships to define the matrix structure, in an attempt to match those found in images, has strong merit for higher performance. However, understanding the strengths and weaknesses of a particular CNN model variant is imperative to maximize the benefit.
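To make the idea of imposing an image-like order concrete, here is a minimal sketch. It is not the algorithm from this work, which the abstract does not specify. It assumes absolute Pearson correlation as the "mathematical relationship" and uses a simple greedy chain so that strongly related features end up next to each other, loosely as neighboring pixels tend to be related. The function names `correlation_order` and `to_grid` are hypothetical.

```python
import numpy as np

def correlation_order(X):
    """Return a feature ordering in which each feature is followed by the
    not-yet-placed feature it is most correlated with (greedy chain).
    Assumption: absolute Pearson correlation stands in for the
    'mathematical relationship' mentioned in the abstract."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    corr = np.nan_to_num(corr)          # constant columns yield NaN
    np.fill_diagonal(corr, 0.0)
    n = corr.shape[0]
    placed = np.zeros(n, dtype=bool)
    # Start from the feature with the largest total correlation.
    order = [int(np.argmax(corr.sum(axis=0)))]
    placed[order[0]] = True
    while len(order) < n:
        # Mask already-placed features, then pick the best neighbor.
        scores = np.where(placed, -1.0, corr[order[-1]])
        nxt = int(np.argmax(scores))
        order.append(nxt)
        placed[nxt] = True
    return order

def to_grid(row, order, side):
    """Lay one reordered record out as a side x side matrix, zero-padded."""
    if side * side < len(order):
        raise ValueError("grid too small for the number of features")
    padded = np.zeros(side * side, dtype=float)
    padded[:len(order)] = np.asarray(row, dtype=float)[order]
    return padded.reshape(side, side)

# Toy data: features 0 and 2 are near-duplicates, as are 1 and 3.
rng = np.random.default_rng(0)
a, b = rng.normal(size=200), rng.normal(size=200)
X = np.column_stack([a, b,
                     a + 0.01 * rng.normal(size=200),
                     b + 0.01 * rng.normal(size=200)])
order = correlation_order(X)
grid = to_grid(X[0], order, side=2)
```

A comparison like the one described above would feed the same records to a CNN under the specification order, an order such as this one, and several random permutations. The greedy chain is only one of many possible schemes, and a two-dimensional layout that also respects vertical neighbors would need a different construction.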