Automatic Extraction of the Non-Urban Roads from Satellite Images using Artificial Neural Network

Road detection from satellite images can be considered a classification process in which pixels are divided into road and non-road classes. In this research, an automatic road extraction method is proposed that uses an artificial neural network (ANN) based on automatic information extraction from satellite images and self-adjustment of the hidden layer. Parameters of non-urban road networks, obtained from satellite images using a histogram-based binary image segmentation technique, are also presented. The segmentation method is implemented by determining a global threshold, which is obtained from a statistical analysis of a number of sample satellite images and their ground truths. The thresholding method is based on two major observations. First, the points corresponding to non-asphalt roads are brighter than other areas in non-urban images. Second, in an aerial image, the area covered by roads is only a small fraction of the total pixels, and the pixels corresponding to roads are generally concentrated at the very bright end of the image greyscale histogram. In this method, the possible road pixels are first selected by the proposed segmentation method. Then different parameters, including color, gradient, and entropy, are computed for each pixel from the source image. Finally, these features are used as the input to the artificial neural network. The results show that the accuracy of the proposed road extraction method is around 80%.


Introduction
Aerial and satellite images provide a rich source of information about the form of the ground and the location and characteristics of plants such as trees and man-made objects such as buildings, roads, and bridges. Without these kinds of images, extracting such information would be a costly and time-consuming process. The extraction of roads and other routes from digital images is a well-studied subject, and a variety of techniques can be found in the literature. For road extraction, two kinds of algorithms are available: semi-automatic and fully automatic. Fully automatic methods do not need human intervention. In contrast, semi-automatic methods require help from a human operator and need a number of seed points that are usually chosen by the operator. During the last two decades, many approaches for automatic road extraction have been reported in the literature. The goal of automation is to increase the speed and to lower the costs of extraction [1]. In [2], different methods for automatic road extraction from aerial and satellite imagery are discussed, e.g., fuzzy logic [3], artificial neural networks [4,5], genetic algorithms [6], the Radon transform [7,8], and the application of local or global strategies [9,10]. For edge detection, algorithms such as Canny [11], Prewitt [12], and Sobel [13] are presented in [14,15]. Methods for line detection are presented in [16,17]. The spectral characteristics and linear spatial structure of roads are generally used in road detection from satellite imagery. Roads cannot be found reliably by using only one of these characteristics. For example, roads share spectral characteristics with other types of terrain, such as clear-cut forest areas, and other linear features in satellite images, such as furrows, can mistakenly be chosen as roads [18,19]. In this work, automatic road extraction is performed on satellite images obtained from the Google Maps application, using artificial neural networks.
In designing the ANN, the size of the input layer equals the number of input features, and the output layer is a single neuron that indicates whether the input features represent a road pixel or not. The important factor in designing an ANN for road detection is selecting what type of information should be extracted from the image to use as input parameters. The differentiation ability of the network is profoundly affected by the chosen input features. Spectral characteristics and linear spatial structure information are used as the input features.
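The network layout described above can be illustrated with a minimal sketch. It assumes five normalized features per pixel, sigmoid activations, a squared-error loss, and a hidden layer of four neurons; the class name `TinyBPNet`, the learning rate, and the toy samples are hypothetical and are not taken from the experiments in this paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyBPNet:
    """One hidden layer and a single sigmoid output neuron (road / non-road)."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # One weight row per neuron; the last entry of each row is the bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        # zip stops at the shorter sequence, so the bias is added separately.
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
                  for row in self.w_hidden]
        out = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden))
                      + self.w_out[-1])
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        """One back-propagation update; returns the squared error before it."""
        hidden, out = self.forward(x)
        delta_out = (out - target) * out * (1.0 - out)
        # Hidden deltas are computed with the output weights before the update.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden):
            self.w_out[j] -= lr * delta_out * h
        self.w_out[-1] -= lr * delta_out
        for j, row in enumerate(self.w_hidden):
            for i, v in enumerate(x):
                row[i] -= lr * delta_hidden[j] * v
            row[-1] -= lr * delta_hidden[j]
        return 0.5 * (out - target) ** 2


# Hypothetical toy samples: [R, G, B, entropy, gradient] scaled to [0, 1].
samples = [([0.9, 0.9, 0.8, 0.3, 0.6], 1), ([0.8, 0.85, 0.9, 0.2, 0.7], 1),
           ([0.2, 0.3, 0.1, 0.6, 0.1], 0), ([0.1, 0.2, 0.2, 0.7, 0.2], 0)]
net = TinyBPNet(n_inputs=5, n_hidden=4)
for _ in range(2000):
    for features, label in samples:
        net.train_step(features, label)


def predict(features):
    """True when the output neuron votes for the road class."""
    return net.forward(features)[1] > 0.5
```

The single output neuron is read as a road vote when its activation exceeds 0.5; that cut-off is also an assumption of the sketch.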

Materials and Method
Road networks in non-urban areas in satellite images have distinct brightness differences in comparison to non-road areas and appear as elongated homogeneous areas (Figure 1). The input features fed to the ANN should be discriminative enough to let the ANN learn the distinction between road and non-road regions. This research consists of two steps: first, detection of the road pixels, followed by extraction of the road by using the road and non-road pixels. The designed methodology was performed on different satellite images (Figure 2). The second step of the methodology concentrates on the differentiation between road and non-road pixels. In this step, the pixel color values are used for distinguishing between road and non-road pixels. The road feature extraction step aims to find all possible road pixels in an image.

Road detection
Assuming that the road pixels in a satellite image are brighter than non-road areas in unstructured terrains makes it possible to obtain the roads and paths by binarizing the satellite image. In order to distinguish between road and non-road pixels in the segmentation process, the value of the threshold should be computed precisely [20,21]. Determining the value of the binarization threshold using a greyscale histogram includes the following steps.
1. A hundred sample satellite images are captured at five different zoom levels from Google Maps. Ground truths for the sample images are obtained. Table 1 shows three different sample images and their ground truths.
2. The number of road (white) pixels is counted for each ground truth image, and the Road Pixel Ratio (RPR) is computed as follows.
$$\mathrm{RPR}\% = \left( \frac{\text{number of road pixels}}{\text{total number of pixels}} \right) \times 100$$
3. All sample images in the five zoom level classes are investigated, and the average value of RPR for each class is computed (Figure 3). The numerical values indicate that 2 to 6% of the total pixels in each satellite image correspond to road areas. An average RPR of 4% is chosen as a reasonable approximation of the share of road pixels in all images.
4. In order to implement the algorithm on a satellite image, its greyscale histogram is obtained first. Then, according to step 3 and the assumption that roads have higher intensities, the value of the threshold is calculated such that 4% of the total pixels have intensities higher than the threshold and turn white after binarization. The remaining pixels (96% of the total) correspond to non-road pixels and turn black in the resultant binary image (Figure 4).
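The thresholding steps above can be sketched in a few lines. The sketch assumes an 8-bit greyscale image stored as a list of rows of integers; the function names `road_pixel_ratio`, `histogram_threshold`, and `binarize` are illustrative and do not come from the original implementation.

```python
def road_pixel_ratio(ground_truth):
    """RPR in percent: share of road (nonzero) pixels in a binary mask."""
    total = sum(len(row) for row in ground_truth)
    road = sum(1 for row in ground_truth for p in row if p)
    return 100.0 * road / total


def histogram_threshold(gray, road_percent=4.0, levels=256):
    """Lowest level t such that at most road_percent of pixels exceed t."""
    hist = [0] * levels
    for row in gray:
        for p in row:
            hist[p] += 1
    budget = sum(hist) * road_percent / 100.0
    above = 0  # pixels strictly brighter than the level being examined
    for level in range(levels - 1, -1, -1):
        if above + hist[level] > budget:
            return level
        above += hist[level]
    return -1  # only reached when every pixel fits in the budget


def binarize(gray, threshold):
    """Pixels brighter than the threshold turn white (255), the rest black."""
    return [[255 if p > threshold else 0 for p in row] for row in gray]
```

Because the threshold is an integer grey level, images with many pixels of identical brightness may yield slightly fewer than 4% white pixels; the sketch never selects more than the requested share.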

Road Extraction
In this research, a back-propagation (BP) neural network with one hidden layer is used for the road extraction. The Red, Green, and Blue (RGB) values, entropy, and gradient of the pixels are fed to the ANN as the input features. The input layer is divided into two parts: road and non-road. Thus, ten neurons are designed in the input layer, in charge of receiving spectral and textural values for each pixel in the entire image: five neurons for road pixels and five neurons for non-road pixels (Figure 5). The input pixel information is obtained in the following steps.
1. The local entropy of the grayscale image is obtained.
2. The gradient magnitude and direction of the grayscale image are obtained using the Sobel gradient operator (Figure 6).
3. In the literature, different methods are used for finding edges in images, such as Canny, Prewitt, Sobel, and Roberts [22]. Here, Canny edge detection is applied together with the proposed thresholding method to obtain the edges of the image (Figure 7 (Left)).
4. The longest connected component, which is the most likely to represent the roads in the image, is obtained (Figure 7 (Middle)).
5. For the non-road pixels, the local minima intensity of the image is used (Figure 7 (Right)). In this research, the lowest 20% of the pixels are selected. The number of these pixels is larger than the number of possible road pixels obtained in step 4. For the input layer of the ANN, the numbers of road and non-road pixels must be equal; therefore, the possible non-road pixels are selected randomly from the lowest 20% until both sets have the same size.
By using the coordinates of these pixels, spectral and textural information is obtained from the original image. Selecting the number of hidden neurons for a specific task is an essential factor that affects network ability. Usually, one hidden layer is sufficient, although the number of neurons in the hidden layer is often not readily determined [23].
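The per-pixel features and the balanced choice of non-road pixels can be sketched as follows. The sketch assumes a 3x3 neighbourhood for the local entropy, interior pixels for the Sobel operator, and interprets "the lowest 20%" as the darkest 20% of grey levels; the window size, the function names, and the random seed are assumptions, and the Canny and connected-component steps are not reproduced.

```python
import math
import random
from collections import Counter

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def local_entropy(gray, r, c, radius=1):
    """Shannon entropy (bits) of the window around (r, c), clipped at borders."""
    rows, cols = len(gray), len(gray[0])
    window = [gray[i][j]
              for i in range(max(0, r - radius), min(rows, r + radius + 1))
              for j in range(max(0, c - radius), min(cols, c + radius + 1))]
    n = len(window)
    return -sum((k / n) * math.log2(k / n) for k in Counter(window).values())


def sobel_gradient(gray, r, c):
    """Gradient magnitude and direction (radians) at an interior pixel."""
    gx = sum(SOBEL_X[i][j] * gray[r - 1 + i][c - 1 + j]
             for i in range(3) for j in range(3))
    gy = sum(SOBEL_Y[i][j] * gray[r - 1 + i][c - 1 + j]
             for i in range(3) for j in range(3))
    return math.hypot(gx, gy), math.atan2(gy, gx)


def pixel_features(rgb, gray, r, c):
    """Five inputs per pixel: R, G, B, local entropy, gradient magnitude."""
    red, green, blue = rgb[r][c]
    magnitude, _ = sobel_gradient(gray, r, c)
    return [red, green, blue, local_entropy(gray, r, c), magnitude]


def balanced_non_road(gray, n_road, dark_fraction=0.2, seed=0):
    """Randomly pick n_road coordinates among the darkest pixels.

    Requires at least n_road pixels in the dark pool, as stated in step 5.
    """
    ranked = sorted((gray[r][c], r, c)
                    for r in range(len(gray)) for c in range(len(gray[0])))
    pool = [(r, c) for _, r, c in ranked[:int(len(ranked) * dark_fraction)]]
    return random.Random(seed).sample(pool, n_road)
```

Sampling without replacement keeps the non-road set free of duplicates, so both classes contribute the same number of distinct pixels to training.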
A network with too few hidden neurons is very unstable and might have a large training error due to under-fitting. Increasing the size of the hidden layer enables the network to fit more complicated problems, but too many hidden neurons might lead to a large generalization error due to over-fitting [5,24]. Although finding the optimal number of neurons in the hidden layer is difficult, it is an important task. According to the proposed method, each sample has its own number of input neurons; therefore, the number of hidden neurons must change between samples and must be related to the number of input neurons. According to [25], the number of neurons in the hidden layer is related to the number of neurons in the input layer (Eq. (2)), where n is the number of input parameters and h is the number of neurons in the hidden layer.
Another important factor that affects network ability is the number of iterations performed in the training stage. When the number of iterations is much larger than needed, the network memorizes the training data, which reduces its differentiation ability. Therefore, the network needs a termination condition to avoid over-fitting. In this research, k-fold cross-validation is used to avoid the over-training problem. In this procedure, a data set is divided into k equal-size disjoint folds, and each of these folds is used to test the model induced from the other k-1 folds by a classification algorithm. In [26], Wong shows that the sampling distribution of the mean accuracy obtained by k-fold cross-validation is independent of the number of folds k. In this research, 100-fold cross-validation is chosen.
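The fold construction used in k-fold cross-validation can be sketched as follows. The function name `k_fold_indices`, the shuffling seed, and the round-robin assignment of samples to folds are assumptions made for illustration; when the number of samples is not divisible by k, fold sizes differ by at most one.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Round-robin slicing gives k disjoint folds that cover every index.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Each sample appears in exactly one test fold, so averaging the k test accuracies uses every labelled pixel once for evaluation.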

Results
The results of implementing the discussed methods are presented in this part. The R, G, B, entropy, and gradient information of the pixels is used in the input layer of the ANN. This information is obtained from the pixels that are captured in the road detection part. The back-propagation artificial neural network algorithm is used, with the size of the hidden layer obtained from Eq. (2). 10-fold cross-validation is used to overcome the over-training problem. Figures 8, 9, and 10 show the results of the proposed method. Confusion matrix analysis, which is known to be a reliable statistical evaluation method, has also been used to evaluate the methodology numerically. Table 2 shows the values of the Kappa [27] and G-mean [28] coefficients of the proposed method applied to the different test images.
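Both coefficients can be derived from the binary confusion matrix, as the following sketch shows. It assumes Cohen's kappa for the Kappa coefficient and the geometric mean of sensitivity and specificity for G-mean, which are the common definitions; the exact formulations in [27] and [28] may differ in detail, and the function names are illustrative.

```python
import math


def confusion_counts(predicted, actual):
    """Return (tp, fp, tn, fn) for binary road (truthy) / non-road labels."""
    tp = fp = tn = fn = 0
    for p, a in zip(predicted, actual):
        if p and a:
            tp += 1
        elif p and not a:
            fp += 1
        elif a:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def cohen_kappa(tp, fp, tn, fn):
    """Agreement between prediction and ground truth, corrected for chance."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1.0 - expected)


def g_mean(tp, fp, tn, fn):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)
```

Both measures are less sensitive to class imbalance than plain accuracy, which matters here because road pixels make up only a few percent of each image. The sketch does not guard against degenerate cases such as an image without any road pixels.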

Conclusions
In this study, a methodology is developed for extracting roads from satellite images of unstructured terrains. The first step, called road detection, was performed based on histogram segmentation. It relies on the assumption that roads have higher intensities, so pixels with higher intensities are more likely to be road pixels. This information is used for both road and background detection in the input layer of the neural network. In the second step, road extraction, the back-propagation artificial neural network algorithm is used for finding all possible road pixels in the image. The results show that the accuracy of this method is around 80%.