Cameras in cities can help security departments track the whereabouts of targets. Manual review and traditional image-processing methods are inaccurate and slow, while supervised learning requires a large amount of labeled data and is less robust. Unlabeled data, however, can be obtained from surveillance video in a large number of application scenarios.
In a study published in Applied Intelligence, a research group led by WANG Yanjie from the Changchun Institute of Optics, Fine Mechanics and Physics (CIOMP) of the Chinese Academy of Sciences (CAS) proposed a pedestrian re-identification algorithm that integrates semi-supervised learning with similarity-preserving generative adversarial networks (SPGAN). The method has three stages. First, SPGAN converts the style of the labeled source-domain dataset to the style of the target domain. Second, the model is pre-trained so that it can extract some features of target-domain images; Transformer and IBN-Net are combined with ResNet50 to improve the model's generalization ability. Finally, the model is trained to convergence with a teacher-student scheme in which two cooperating networks supervise each other. This keeps a network from directly using the generated pseudo-labels for self-supervised learning, and the use of soft pseudo-labels mitigates the interference of pseudo-label noise during training.
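The article does not give the loss functions used in the teacher-student stage. As a rough illustration only, the sketch below assumes a mean-teacher style setup: each student is trained against the soft class probabilities predicted by the other network's teacher, and each teacher is an exponential moving average (EMA) of its student. The function names, the `momentum` value, and the use of plain NumPy arrays in place of real network outputs are all assumptions made for this example, not details from the paper.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def soft_cross_entropy(student_logits, teacher_logits):
    """Cross-entropy of the student's predictions against soft pseudo-labels.

    The soft pseudo-labels are the teacher's full class probabilities rather
    than a single hard (argmax) label, so an uncertain teacher produces a
    weaker training signal. (Hypothetical helper, not from the paper.)
    """
    soft_targets = softmax(teacher_logits)
    log_probs = np.log(softmax(student_logits) + 1e-12)
    return float(-(soft_targets * log_probs).sum(axis=1).mean())


def ema_update(teacher_params, student_params, momentum=0.999):
    """Return teacher parameters moved slightly toward the student's.

    Assumes both arguments are dicts mapping parameter names to arrays.
    """
    return {
        name: momentum * teacher_params[name]
        + (1.0 - momentum) * student_params[name]
        for name in teacher_params
    }


def mutual_soft_losses(student1_logits, student2_logits,
                       teacher1_logits, teacher2_logits):
    """Each student is supervised by the *other* network's teacher."""
    loss_1 = soft_cross_entropy(student1_logits, teacher2_logits)
    loss_2 = soft_cross_entropy(student2_logits, teacher1_logits)
    return loss_1, loss_2
```

In a real training loop these losses would be backpropagated through the two student networks, and `ema_update` would be called after each optimizer step; the sketch only shows how soft labels and cross supervision fit together.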
Experimental results showed that the method achieved a mean average precision (mAP) of 70.2, 79.3, 30.2, and 33.4 on the Market-to-Duke, Duke-to-Market, Market-to-MSMT, and Duke-to-MSMT domain-adaptation tasks, respectively. The corresponding Top-1 accuracies were 83.4, 93.2, 58.2, and 62.1.
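For readers unfamiliar with these metrics, the following is a simplified sketch of how mean average precision (mAP) and Top-1 accuracy are commonly computed in re-identification: each query image ranks the gallery by feature distance, Top-1 checks whether the closest gallery image shares the query's identity, and mAP averages the precision at every correct match. The function name and inputs are hypothetical, and the sketch omits the same-camera filtering that standard re-identification benchmarks usually apply, so it is not the exact evaluation protocol behind the reported numbers.

```python
import numpy as np


def evaluate_rank1_map(dist, query_ids, gallery_ids):
    """Compute Top-1 accuracy and mAP from a query-by-gallery distance matrix.

    dist[q, g] is the distance between query q and gallery image g
    (smaller means more similar). Queries with no matching identity in
    the gallery are skipped.
    """
    dist = np.asarray(dist, dtype=float)
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)

    rank1_hits = []
    average_precisions = []
    for q in range(dist.shape[0]):
        order = np.argsort(dist[q])                  # nearest gallery image first
        matches = gallery_ids[order] == query_ids[q]
        if not matches.any():
            continue
        rank1_hits.append(float(matches[0]))         # is the top result correct?
        hit_positions = np.flatnonzero(matches)      # 0-based ranks of correct matches
        # Precision at each correct match: (matches so far) / (rank of this match)
        precisions = np.arange(1, len(hit_positions) + 1) / (hit_positions + 1)
        average_precisions.append(precisions.mean())

    return float(np.mean(rank1_hits)), float(np.mean(average_precisions))


# Toy example: two queries, three gallery images.
toy_dist = [[0.1, 0.9, 0.5],
            [0.2, 0.8, 0.3]]
top1, mean_ap = evaluate_rank1_map(toy_dist, query_ids=[1, 2], gallery_ids=[1, 2, 1])
```

In the toy example the first query retrieves both of its matches first, while the second query finds its only match at rank 3, so the two metrics diverge: Top-1 only rewards the first position, whereas mAP also reflects how far down the ranking the correct matches fall.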
With this method, a neural network can be trained on unlabeled data from the application scenario and still meet the accuracy requirements of the target domain.