Handwritten digit recognition based on ghost imaging with deep learning

2021-05-24 Xing He (何行), Sheng-Mei Zhao (赵生妹), and Le Wang (王乐)

Chinese Physics B, 2021, Issue 5

Xing He (何行), Sheng-Mei Zhao (赵生妹)†, and Le Wang (王乐)

1 Institute of Signal Processing and Transmission, Nanjing University of Posts and Telecommunications, Nanjing 210003, China

2 Key Laboratory of Broadband Wireless Communication and Sensor Network Technology (Ministry of Education), Nanjing 210003, China

Keywords: ghost imaging, handwritten digit recognition, ghost handwritten recognition, deep learning

1. Introduction

In recent years, handwritten digit recognition has become an active research topic because of its many practical applications. However, it remains challenging owing to variations in handwriting quality and style.[1] Several methods have been proposed to solve this problem, such as deep learning-based classification algorithms,[2] artificial neural networks,[3] and support vector machine classifiers.[4]

With these methods, images of the target handwritten digits must first be obtained, and classification is then achieved from their characteristic information.[5] Although the existing recognition methods achieve satisfactory results, some misrecognition of digits is still inevitable because of the large variation in individual writing styles. Furthermore, in some cases the characteristic information of the handwritten digits cannot be obtained before recognition.

In another context, ghost imaging (GI), also called correlated imaging, is an intriguing optical imaging technique in which the image of an object is obtained by correlating the fluctuations of two separated optical fields, without the need to record the image itself. GI offers great promise because of its robustness against harsh environments, higher spatial resolution, and higher detection sensitivity, and it has found applications in remote sensing,[6] underwater imaging,[7] object edge extraction,[8,9] image hiding,[10,11] and optical encryption.[12,13]

Initially, GI was experimentally demonstrated in 1995 using entangled photon pairs generated by spontaneous parametric down-conversion (SPDC).[14] GI was later realized with a classical pseudothermal light source, a thermal source,[15] and daylight.[16] With the development of GI, computational ghost imaging (CGI)[17] was introduced to compute the intensity offline, which greatly simplified the GI configuration and broadened its applications. It was also demonstrated that single-pixel imaging (SPI) is essentially the same technique as GI. Recently, researchers have studied image classification using single-pixel detectors. For example, a reconstruction-free multi-class image classification framework was proposed, showing that classification based only on the sequence of photodetector measurements is possible.[18] Moreover, an optical diffractive neural network was presented that performs machine learning tasks, such as digit recognition, in an all-optical manner.[19] Single-pixel non-imaging object recognition was experimentally demonstrated by acquiring the Fourier spectrum.[20] Additionally, a single neural network with a multi-rate property for compressed-domain classification[21] and a neural network that simultaneously learns the linear binary sensing matrix and the non-linear classification parameters[22] were discussed for the single-pixel camera (SPC). Zhang et al.[23] used spatial light modulation to acquire the feature information and a convolutional neural network to learn the spatial features.

Most natural images, including handwritten digit images, are sparse in the frequency domain, with their energy concentrated in the low-frequency components; that is, most of their coefficients in the discrete cosine transform (DCT) basis are close or equal to zero. At the same time, artificial neural networks are good at object recognition.[2–4,18,19] Hence, we propose a novel handwritten digit recognition scheme using GI and a deep neural network (DNN), named ghost handwritten digit recognition (GHDR), in which a small number of detection signals, obtained under illumination by DCT speckle patterns, serve as the feature information of the unknown handwritten digit image and as the input to the designed DNN. The recognition result is output by the DNN, which has one input layer, three hidden layers, and one output layer.

The advantages of the proposed scheme are as follows. First, owing to the non-locality of GI, the proposed scheme can perform recognition without first obtaining a clear image of the handwritten digit. Second, the two-dimensional handwritten digit image is projected onto, and characterized by, a smaller one-dimensional set of detection signals, which simplifies the configuration of the designed DNN and greatly reduces the complexity of recognition.

2. Ghost handwritten digit recognition using deep neural network

2.1. The ghost handwritten digit recognition scheme

The schematic diagram of the proposed scheme is shown in Fig. 1. It consists of a speckle pattern generation part, a detection signal acquisition part, and a handwritten digit recognition part. In the speckle pattern generation part, selected rows of the standard DCT matrix are reshaped into two dimensions to construct the speckle patterns. In the detection signal acquisition part, these patterns are first modulated onto the beam by a digital light projector (DLP) to generate the speckles I1(x,y),...,IM(x,y). The speckles are then expanded by a projector lens and projected successively onto the unknown handwritten digit image. The resultant beam is detected by a bucket detector to obtain the detection signals Bi, i=1,...,M. In the handwritten digit recognition part, the detection signals are used as the input to the designed DNN, whose output is the classification result for the unknown handwritten digit image.

Fig.1. The schematic diagram of the ghost handwritten digit recognition.
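The measurement chain described above can be sketched numerically. This is only an illustration under stated assumptions: the DCT matrix is taken to be the standard orthonormal DCT-II matrix, each pattern is one of its rows reshaped row by row, and the bucket signal is modeled as the sum of pattern times image over all pixels. The helper names and the 4×4 toy image are hypothetical, and the negative pattern values that a real projector cannot display directly are ignored here.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row u is the cosine basis vector of frequency u."""
    rows = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                     for x in range(n)])
    return rows

def row_to_pattern(row, nx):
    """Reshape one DCT row (length nx*ny) into a 2-D pattern, row by row."""
    return [row[k:k + nx] for k in range(0, len(row), nx)]

def bucket_signal(pattern, image):
    """Model of the bucket detector: sum of pattern * image over all pixels."""
    return sum(p * t
               for pattern_row, image_row in zip(pattern, image)
               for p, t in zip(pattern_row, image_row))

# Toy 4x4 "digit" (hypothetical data), so the DCT matrix is 16x16.
image = [[0, 1, 1, 0],
         [0, 0, 1, 0],
         [0, 0, 1, 0],
         [0, 1, 1, 1]]
nx = ny = 4
C = dct_matrix(nx * ny)
patterns = [row_to_pattern(C[i], nx) for i in range(nx * ny)]

# One bucket value per pattern: the 1-D DCT coefficients of the flattened image.
signals = [bucket_signal(p, image) for p in patterns]
```

In this model each bucket value equals one DCT coefficient of the flattened image, which is why a subset of the values can act as a compact feature vector.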

Since the object to be classified is a handwritten digit image, it is inherently sparse in the frequency domain, with its energy concentrated in the low-frequency components. Hence, it is reasonable to use the low-frequency components for feature extraction. To obtain the low-frequency components of the unknown handwritten digit image with GI, DCT speckles are preferable.[24] One way to obtain cosine speckles is the discrete cosine transform (DCT), which is closely related to the discrete Fourier transform (DFT) but has real transform coefficients. For the one-dimensional DCT, the frequency spectrum can be expressed as
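A minimal sketch of that expression, assuming the standard orthonormal DCT-II; the symbols f(x), F(u), c(u), and N are introduced here for illustration and may differ from the authors' notation:

```latex
F(u) = c(u) \sum_{x=0}^{N-1} f(x) \cos\!\left[\frac{(2x+1)\,u\,\pi}{2N}\right],
\quad u = 0, 1, \ldots, N-1,
\qquad
c(u) =
\begin{cases}
\sqrt{1/N}, & u = 0, \\
\sqrt{2/N}, & u \ge 1.
\end{cases}
```

Small values of u correspond to slowly varying cosines, so the low-index coefficients carry the low-frequency content of the signal.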

It is known that the low-frequency components are the most important frequency spectrum values of handwritten digit images. We find that, when T is transformed by the one-dimensional DCT and then reshaped row by row to the same size as the image, the upper left corner of the resulting frequency spectrum contains most of the low-frequency information. Hence, it is reasonable to take the elements in the first b rows and first a columns of the frequency spectrum (a×b elements in total) as the important frequency spectrum values. Consequently, the location of the corresponding speckle Ii(x,y) in the DCT matrix can be computed as

where (fa,fb), 1 ≤ fa ≤ a, 1 ≤ fb ≤ b, is the coordinate of the element in the two-dimensional image frequency spectrum. To quantify the number of important frequency spectrum values used in the recognition, we define a sampling ratio SR as

For simplicity, a is commonly chosen equal to b, and Nx equal to Ny.
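The row selection and the sampling ratio can be illustrated with a short sketch. Two assumptions are made here and are not taken verbatim from the text: the element (fa, fb) of the row-by-row reshaped spectrum is mapped to the 1-based DCT-matrix row (fb − 1)·Nx + fa, and SR is taken as a·b/(Nx·Ny). The second assumption reproduces the sampling ratios quoted later for 28×28 images. The function names are hypothetical.

```python
def selected_rows(a, b, nx):
    """1-based DCT-matrix row indices for the upper-left a x b spectrum block.

    Assumes the length nx*ny spectrum is reshaped row by row into rows of nx
    columns, so element (fa, fb) = (column, row) sits at (fb - 1) * nx + fa.
    """
    return [(fb - 1) * nx + fa
            for fb in range(1, b + 1)
            for fa in range(1, a + 1)]

def sampling_ratio(a, b, nx, ny):
    """Assumed definition: fraction of spectrum values kept, a*b out of nx*ny."""
    return a * b / (nx * ny)

nx = ny = 28  # MNIST image size used in the paper
for side in (8, 9, 10):
    rows = selected_rows(side, side, nx)
    sr = sampling_ratio(side, side, nx, ny)
    print(side, len(rows), f"{100 * sr:.2f}%")
# prints:
# 8 64 8.16%
# 9 81 10.33%
# 10 100 12.76%
```

With a = b = 10 this gives M = 100 detection signals and SR = 12.76%; a = b = 9 gives 10.33%, and a = b = 8 gives 8.16%, which the text quotes as 8.1%.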

2.2. Deep neural network for the proposed scheme

After the most important frequency spectrum values are obtained using GI, these detection signals form a vector that serves as the input to the designed and trained deep neural network (DNN) for handwritten digit image recognition. Figure 2 shows the structure of the DNN designed for handwritten digit image recognition in simulation. In the experiment, the number of neurons and the learning rate of the DNN can be adjusted to obtain a better recognition accuracy. The DNN is a feed-forward artificial neural network with three hidden layers, and all layers are fully connected. The first layer is the input layer, which has M neurons for the M detection signals B1,B2,B3,...,BM. There are 2000 neurons in each of the first two hidden layers and 1000 neurons in the third hidden layer. The output layer has 10 neurons, corresponding to the ten digits.
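The layer structure just described, together with a plain gradient-descent update, can be sketched in generic form. The notation is assumed here for illustration: h^(l) is the output of layer l (l = 1, 2, 3 for the hidden layers and l = 4 for the output layer), W^(l) and θ^(l) are its weights and biases, σ_l is its activation function (not specified in this text), and L is the loss function. The comparison later in the paper uses the Adam optimizer, which rescales this basic step.

```latex
\begin{aligned}
h^{(0)} &= (B_1, B_2, \ldots, B_M)^{\mathsf{T}}, \\
h^{(l)} &= \sigma_l\!\left(W^{(l)} h^{(l-1)} + \theta^{(l)}\right), \quad l = 1, \ldots, 4, \\
W^{(l)} &\leftarrow W^{(l)} - \xi \, \frac{\partial L}{\partial W^{(l)}}, \qquad
\theta^{(l)} \leftarrow \theta^{(l)} - \xi \, \frac{\partial L}{\partial \theta^{(l)}},
\end{aligned}
```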

where ξ is the learning rate used to update the weights and biases. Dropout is also used to prevent overfitting.[25]
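To make the size of this network concrete, the following sketch counts its trainable weights and biases. It assumes every layer is fully connected with one bias per neuron, takes M = 100 as an example input width, and compares it with a 784-value pixel input for a 28×28 image. `dense_parameter_count` is a hypothetical helper name.

```python
def dense_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected network with these layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

hidden_and_output = [2000, 2000, 1000, 10]

# Input of M = 100 bucket detection signals.
ghost_params = dense_parameter_count([100] + hidden_and_output)

# Input of all 28 x 28 = 784 pixel values.
pixel_params = dense_parameter_count([784] + hidden_and_output)

print(ghost_params, pixel_params)  # prints: 6215010 7583010
```

The saving comes entirely from the first layer, whose weight matrix shrinks from 784×2000 to 100×2000; the remaining layers are identical.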

Fig.2. The designed deep neural network for handwritten digit image recognition.

3. Experiment and simulation

In this section, we present experiments and simulations to verify the proposed handwritten digit recognition scheme. The numerical simulations were implemented in the Keras framework on top of TensorFlow, running on an Intel Core i7-4790 CPU (Dell Optiplex 920, 3.6 GHz, 24 GB memory). The language was Python 3.5 (64-bit). All the handwritten digit images were taken from the MNIST handwritten digit database, which includes 60000 training images and 10000 test images, all of which are 28 pixel × 28 pixel grayscale images.

3.1. The schematic diagram of the experiment

The experimental setup is shown schematically in Fig. 3. Using the method proposed above, the DCT patterns at the locations of the most important frequency spectrum values were first produced from the DCT. They were then used as the speckle patterns and displayed by a digital light projector (DLP) (TI Digital Light Crafter 4500) to modulate the beam. The beam was expanded by a projecting lens (focal length 200 mm), illuminated the unknown handwritten digit image, and was then focused by a collecting lens (focal length 200 mm). Finally, the detection result Bi was obtained from the bucket detector (Thorlabs power meter S142C). Here, the bucket detector recorded the most important frequency spectrum values of the unknown handwritten digit image. The above process was repeated for all the handwritten digit images in the training set, and all the bucket detector outputs were used as the training data for the designed DNN. After the DNN was trained, the same process was carried out for the test handwritten digit images.

Fig.3. The experimental setup of the proposed scheme.

3.2. Experimental and simulation results

Figure 4 shows the experimental and simulation results of the proposed scheme for different sampling ratios (SR): (a) simulations, in which 60000 handwritten digit images are used to train the designed DNN and 10000 are used for testing; (b) experiments, in which 9000 handwritten digit images are used for training and 1000 for testing. Here, the 9000 handwritten digit images are randomly selected from the MNIST database. Suitable neural network parameters are set for each sampling ratio, and the number of iterations is 120, although the accuracy reported for each digit may not come from the same iteration. The results show that the proposed scheme achieves a high recognition accuracy for the handwritten digit images. When SR=12.76%, the average recognition accuracy is 98% in simulation and 91% in experiment. Some digits are easy to recognize, such as 0 and 1, and some are hard to recognize, such as 3, 5, 8, and 9, owing to the varied styles of these handwritten digits in the database. The recognition accuracies in simulation are higher than those in experiment because of the larger training set used in simulation.

Fig. 4. The experimental and simulation results of the proposed handwritten digit recognition scheme for different sampling ratios (SR): (a) the accuracy for the ten digits in simulation, where 60000 handwritten digit images from the MNIST handwritten digit database are used to train the designed DNN and 10000 are used for testing; (b) the accuracy for the ten digits in experiment, where 9000 handwritten digit images from the MNIST handwritten digit database are used for training and 1000 are used for testing.

To demonstrate the performance of the proposed scheme more clearly, Fig. 5 plots the accuracy for the hard-to-recognize handwritten digits (3, 5, 8, and 9) versus the sampling ratio. The experimental results show that the accuracies for these digits increase greatly with the sampling ratio, and they all approach 90% when the sampling ratio exceeds 12.76%. This indicates that the proposed scheme performs well even for the hard-to-recognize digits using only bucket detection signals. For handwritten digit 3, when SR is less than or equal to 8.1%, the DCT spectrum values are confused with those of digits 2, 4, 5, 6, 7, and 8, so the accuracy is relatively low, less than 70%. However, false recognition is greatly reduced when SR is increased to 10.33%, because the added DCT spectrum values contain the key information for distinguishing digit 3. At this sampling ratio, digit 3 is misidentified only as digits 2, 5, and 8.
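Per-digit accuracy and the list of digits that a given digit is mistaken for can be computed from the true and predicted labels as in the sketch below. The function names are hypothetical, and the labels are made-up illustration data, not results from the paper.

```python
from collections import Counter

def per_digit_accuracy(true_labels, predicted_labels):
    """Fraction of correctly classified samples for each true digit."""
    total = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {digit: correct[digit] / total[digit] for digit in sorted(total)}

def confused_with(true_labels, predicted_labels, digit):
    """Count the wrong labels assigned to samples whose true label is `digit`."""
    return Counter(p for t, p in zip(true_labels, predicted_labels)
                   if t == digit and p != digit)

# Hypothetical labels for illustration only; not data from the paper.
y_true = [3, 3, 3, 3, 0, 0, 1, 8]
y_pred = [3, 5, 8, 3, 0, 0, 1, 8]

acc = per_digit_accuracy(y_true, y_pred)
mistakes = confused_with(y_true, y_pred, 3)
print(acc)             # prints: {0: 1.0, 1: 1.0, 3: 0.5, 8: 1.0}
print(dict(mistakes))  # prints: {5: 1, 8: 1}
```

Running such a tally at each sampling ratio shows both how the accuracy for a digit changes and which digits it is confused with.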

Next, Fig. 6 compares the accuracy of the proposed recognition scheme with different classification algorithms: deep neural network, k-nearest neighbor, multivariate Bernoulli naive Bayes, Gaussian naive Bayes, decision tree classifier, gradient boosting classifier, and random forest classifier.[26–29] In the experiment, the DCT spectrum values of 9000 handwritten digits and their corresponding digit categories are used to train these machine learning methods, and 1000 sets of test data are used to test the recognition accuracy. The experimental results show that the recognition accuracy differs between the classification algorithms even though the input data and the sampling ratios are the same. The deep neural network (our scheme) and the gradient boosting classifier perform best among all the classification algorithms. The gradient boosting classifier is a machine learning method that is suitable for small-scale data, handles mixed data naturally, and has strong predictive ability (good generalization performance). Compared with the other methods, it has the advantage of a stable recognition rate.

Fig. 5. The accuracy for the hard-to-recognize handwritten digits (3, 5, 8, 9) versus the sampling ratio.

Fig. 6. The average accuracy of handwritten digit image recognition with different machine learning classification algorithms, such as k-nearest neighbor, multivariate Bernoulli naive Bayes, Gaussian naive Bayes, decision tree classifier, gradient boosting classifier, and random forest classifier.

Finally, we compare the performance of our scheme with the traditional recognition method using a DNN, for which two variants are considered: traditional DNN recognition scheme 1 and traditional DNN recognition scheme 2. All schemes in Fig. 7 use 60000 training samples and 10000 test samples, and all use three hidden layers with 2000, 2000, and 1000 neurons and a network output of 10 digit categories. For both traditional schemes, the input data are the pixel values of the whole handwritten digit image (flattened row by row into a column), numbering 28×28=784. Traditional DNN recognition scheme 1 uses the cross-entropy loss function, optimized with the Adam algorithm. Traditional DNN recognition scheme 2 uses the mean square error between the output and the corresponding digit category as the loss function, optimized with the gradient descent algorithm. The input of our proposed recognition scheme is the set of M bucket detector values obtained under DCT speckle illumination, that is, M DCT spectrum values. Here M is far less than 784 (M=100 in Fig. 7), and cross entropy is used as the loss function with the Adam algorithm as the optimizer. The numerical simulation results show that our proposed scheme performs almost as well as the traditional DNN recognition schemes when the sampling ratio is 12.76%. As the number of iterations increases, the accuracy of all methods tends to stabilize. Because the input data of traditional DNN recognition scheme 1 are all the pixel values of the handwritten digit, the information contained is complete. In contrast, the input data of our scheme are 100 DCT spectrum values, which contain only part of the low-frequency information of the image, and the amount of data is far less than that of the traditional methods, so the recognition accuracy is lower than that of traditional DNN recognition scheme 1. However, the average recognition time of our proposed scheme is greatly reduced, since only a few bucket detection signals are input to the DNN. The time cost is approximately 115 s for traditional DNN recognition scheme 1 and approximately 88 s for traditional DNN recognition scheme 2, whereas it is 58 s for our proposed scheme.

Fig.7. The performance comparison between our proposed recognition scheme,the traditional DNN recognition scheme 1 and traditional DNN recognition scheme 2.

4. Conclusion

In summary, we have proposed a ghost handwritten digit recognition scheme based on a DNN and GI, in which the bucket detection signals in GI, generated by cosine transform speckles, are used as the feature information of the handwritten digit image and as the input vector to the DNN. The experimental and simulation results show that the proposed scheme achieves a high recognition accuracy for unknown handwritten digit images at a small sampling ratio, and that the recognition accuracy increases with the sampling ratio. Compared with the traditional recognition schemes using the same DNN structure, our proposed scheme has almost the same or better performance and takes less time. Importantly, our proposed scheme has lower complexity and is feasible for unknown handwritten digit images. We believe that the proposed scheme can be used for other image recognition tasks in which the pictures of unknown objects come from other datasets, such as the Fashion-MNIST and CIFAR-10 datasets, provided that the neural network is trained first.
