A two-level hierarchical scheme for video-based person re-identification (re-id) is presented, with the aim of learning a pedestrian appearance model through more complete walking-cycle extraction. Specifically, given a video with consecutive frames, the objective of the first level is to detect the key frame with a lightweight convolutional neural network (CNN), PCANet, to reflect a summary of the video content. At the second level, on the basis of the detected key frame, the pedestrian walking cycle is extracted from the long video sequence. Moreover, local maximal occurrence (LOMO) features of the walking cycle are extracted to represent the pedestrian's appearance. In contrast to existing walking-cycle-based person re-id approaches, the proposed scheme relaxes the limit on the number of steps in a walking cycle, thus making it flexible and less affected by noisy frames. Experiments are conducted on two benchmark datasets: PRID 2011 and iLIDS-VID. The experimental results demonstrate that the proposed scheme outperforms six state-of-the-art video-based re-id methods, and is more robust to severe video noise and to variations in pose, lighting, and camera viewpoint.

The goal of person re-identification (re-id) is to identify and associate a target person when he/she appears across non-overlapping camera views. Person re-id has been widely applied in various intelligent surveillance and public security systems, and has attracted the attention of researchers in both academia and industry. However, person re-id is a challenging task due to cluttered backgrounds and large variations in illumination, viewpoint, and pose. Fundamentally, the main challenge in person re-id lies in the fact that intra-class variability is often significantly larger than inter-class variability.

In the past few years, many efforts have been made in person re-id, most of them based on still images. However, in image-based person re-id, only a few still images are available for each person, which is inadequate to overcome the problems of occlusion, viewpoint, and pose change, and to generate a robust pedestrian representation. Very recently, some researchers have considered performing person re-id using video sequences. The use of videos for person re-id has several advantages over still images. First, video inherently contains rich spatial-temporal information, which facilitates a more robust appearance representation of the pedestrian. In addition, sequential frames in a video provide a large number of samples, which makes it convenient to train machine-learning algorithms and leads to better re-id performance.

Although more information can be utilized through videos, more challenges come along. For example, redundant information in videos may considerably influence the recognition result, not to mention distracting information such as video noise. Some sample frames captured from video sequences are illustrated in Fig. So far, only a few video-based methods have been proposed, and the problems mentioned above remain unsolved.

Therefore, in this paper, a two-level hierarchical scheme is proposed by jointly exploiting key-frame detection and walking-cycle extraction. First of all, the lightweight CNN PCANet is adopted to obtain a robust representation of the pedestrian, taking video sequences of different views as input. At the first level, a novel key-frame-detection method based on the PCANet features is proposed. It aims to select the most discriminative frame from the entire video sequence as the key frame, to reflect the fundamental content of the video. Second, at the second level, a pedestrian walking cycle is extracted based on the detected key frame. Then, local maximal occurrence (LOMO) features are extracted from the walking cycle. The walking-cycle feature conveys detailed information about the person's appearance. Finally, the walking-cycle feature is fused with the average pooling of the LOMO features of the video to enhance the robustness of the pedestrian representation.

To summarize, the main contributions of this paper are as follows. First, a key-frame-detection method based on PCANet is proposed. Second, a walking-cycle-extraction algorithm is designed for video-based person re-id. In contrast to walking-cycle-extraction methods based on a single step, the proposed method relaxes the limit on the number of steps in a walking cycle, making it robust and flexible.
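The two-level pipeline above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's exact algorithm: it assumes a centrality criterion for key-frame selection, a one-dimensional periodic motion signal (e.g. per-frame flow energy) for locating walking-cycle boundaries, and simple concatenation for feature fusion; the function names are hypothetical.

```python
import numpy as np

def select_key_frame(features):
    """Pick the frame whose descriptor is closest to the sequence mean.

    `features` is an (n_frames, d) array of per-frame descriptors
    (e.g. PCANet outputs). Centrality is an assumed stand-in for the
    paper's discriminativeness measure.
    """
    mean = features.mean(axis=0)
    dists = np.linalg.norm(features - mean, axis=1)
    return int(np.argmin(dists))

def extract_walking_cycle(signal, key_idx):
    """Return the cycle boundaries around the key frame as the local
    minima of a periodic motion signal that bracket it. Unlike
    single-step methods, the resulting window may span several steps.
    """
    minima = [i for i in range(1, len(signal) - 1)
              if signal[i] <= signal[i - 1] and signal[i] <= signal[i + 1]]
    lo = max((i for i in minima if i <= key_idx), default=0)
    hi = min((i for i in minima if i > key_idx), default=len(signal) - 1)
    return lo, hi

def fuse(cycle_feats, video_feats, alpha=0.5):
    """Fuse the average-pooled cycle descriptor with the average-pooled
    whole-video descriptor (e.g. LOMO). Weighted concatenation is an
    assumed fusion rule.
    """
    cycle = cycle_feats.mean(axis=0)
    pooled = video_feats.mean(axis=0)
    return np.concatenate([alpha * cycle, (1 - alpha) * pooled])
```

For instance, with a motion signal `[3, 1, 3, 5, 3, 1, 3]` and key frame index 3, `extract_walking_cycle` returns the bracketing minima `(1, 5)` as the cycle span.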