Multi-Scale Object Detection in Aerial Images with Feature Pyramid Networks
The field of computer vision has developed rapidly with the rise of artificial neural networks. Since the introduction of AlexNet in the 2012 ImageNet competition, there has been a plethora of research and publications on computer vision algorithms using deep learning. New algorithms and architectures beat the existing benchmark scores year after year, and some even surpass human accuracy in tasks such as image classification, all thanks to neural networks. Although these algorithms are praised for their capabilities, most of the benchmarks are set on standard competition datasets like ImageNet. The images in a typical benchmarking dataset might not always be representative of the real world, since a lot of post-processing and quality control is applied to them. We therefore wanted to explore the applications and performance of existing deep learning architectures on a dataset that is much more representative of the real world. We focused our work on the xView dataset, an object detection dataset whose images are representative of the real world for a number of regions. Why aerial images in particular? Because little literature is available on the niche of machine learning and deep learning on aerial images, and with the increasing number of commercial satellites, the potential of automatic information retrieval is huge. We used two popular object detection algorithms: Faster R-CNN and SSD. We also made use of a Feature Pyramid Network (FPN) based classifier, which we found to be well suited to object detection in a dataset with a large variation in object sizes. Our experimental results show that an FPN-backboned network performs better than a typical vanilla-backboned network.