
Pedestrian attribute classification with multi-scale and multi-label convolutional neural networks


0 Introduction

At present, research on pedestrian attribute classification has attracted considerable attention. Pedestrian attributes, such as gender, dark hair and skirt, can be used as soft biometric traits in the surveillance field for public security. For example, pedestrian attributes are useful clues for person retrieval[1,2], subject identification[3], human identification[4,5] and person re-identification[6,7]. In practical surveillance scenarios, pedestrian attribute classification is a challenging computer vision task, since pedestrian images are usually of low resolution, blurred and partially occluded, and contain variations of illumination and viewpoint. Therefore, developing an effective pedestrian attribute classification method is an important and challenging topic.

The most popular approach to pedestrian attribute classification uses hand-crafted features (e.g., MBLBP[8]; RGB, HSV and YCbCr color histograms; Gabor and Schmid features[6]) and support vector machine (SVM) based attribute-independent classifiers[3,6,9-11]. This approach cannot fully solve the pedestrian attribute classification problem, because hand-crafted features have a limited ability to represent large appearance variations, and attribute-independent SVM classifiers cannot exploit the interactions among different attributes. Moreover, as the number of pedestrian attributes increases, training SVM-based attribute classifiers one by one becomes very tedious.

In this study, a multi-scale and multi-label convolutional neural network (MSMLCNN) is proposed to solve the pedestrian attribute classification problem. Following the VGGNet architecture[12], which applies small filters in each convolutional layer and stacks multiple convolutional layers before each pooling layer, a very deep network with a strong feature learning ability is obtained. However, such a very deep network is difficult to train, because the gradient vanishing problem[13] may appear in the back propagation process. Moreover, for attributes with complex localization characteristics and different scales, using only the features learned in the last layer is not completely suitable, because last-layer features are too global for local attributes such as hassunglasses, upperbodyvneck and footwearsandals. Therefore, the proposed MSMLCNN fully connects each attribute with multiple pooling layers at different scales, which not only adds supervisory signals to multiple intermediate layers but also combines local and global features for attribute classification.


The rest of this paper is organized as follows. Section 1 summarizes the related work. Section 2 introduces the proposed multi-scale and multi-label convolutional neural network. Section 3 presents experimental results to validate the superiority of the proposed method. Section 4 concludes this work.

1 Related work

1.1 Attribute pedestrian database

Recently, several public attribute pedestrian databases have been released, such as VIPeR[14], PRID[15], GRID[16], APiS[8] and PETA[11]. In terms of the number of annotated attributes, VIPeR was first annotated with 15 attributes by Layne, et al.[6], who annotated VIPeR, PRID and GRID with 21 attributes in their later work[9]. APiS was annotated with 15 attributes by Zhu, et al.[8]. PETA is a large database that includes 65 attribute annotations. Fig.1 shows some annotated sample images selected from the PETA[11] database. In terms of the number of images, VIPeR, PRID and GRID are small databases, each containing fewer than 1500 images. APiS includes 3661 images and PETA consists of 19000 images. It can be seen that more and more databases have been released, with both the number of attributes and the number of images increasing, which illustrates that research on pedestrian attribute classification is attracting more and more attention.

Fig.1 Annotated sample images selected from the PETA[11] database

1.2 Pedestrian attribute classification

The most popular approach for pedestrian attribute classification is to train each attribute classifier independently on hand-crafted features. In terms of hand-crafted features, many local features can be applied to describe pedestrian images for different attribute classifications, such as MBLBP[8]; RGB, HSV and YCbCr color histograms; and Gabor and Schmid features[6]. Moreover, sparse feature representation methods[17,18] can also be used to represent pedestrian images.

In terms of attribute classifiers, support vector machines (SVMs) are the most commonly used; for example, SVMs were applied to train attribute-independent classifiers in Refs[3,6,9,10]. Moreover, the gentle AdaBoost[19] algorithm was utilized to train each attribute's classifier independently in Ref.[8]. If the number of pedestrian attributes is small, these straightforward methods can train attribute-independent classifiers conveniently. However, when the number of pedestrian attributes is large, training attribute classifiers independently one by one is too tedious. In addition, these methods leave room for improving the accuracy of pedestrian attribute classification, because they do not take the interactions of different attributes into account.

Considering that pedestrian attribute classification is a multi-label classification problem rather than a multi-class classification problem[20], some methods learn interaction models of different attributes to improve the performance of pedestrian attribute classification. For example, Chen, et al.[21] applied a conditional random field (CRF) to learn an attribute interaction model. Deng, et al.[11] built an undirected graph with a Markov random field (MRF) to model the relationships of different attributes. In previous work[22], pedestrian attribute classification was improved by weighting interactions from other attributes.


1.3 Convolutional neural network

Convolutional neural networks (CNNs)[23,24] have been used in many image-related applications and have exhibited good performance. Krizhevsky, et al.[25] applied AlexNet to image classification and outperformed many state-of-the-art methods on the ImageNet database. Donahue, et al.[26] and Razavian, et al.[27] demonstrated that off-the-shelf features learned by a CNN pre-trained on the ImageNet database could be effectively adopted for attribute classification. Sun, et al.[28] proposed a CNN named DeepID to learn a set of high-level feature representations for face verification. DeepID achieves a 97.45% face verification accuracy on the LFW database, almost as good as the human performance of 97.53%. Based on DeepID, DeepID2[29] and DeepID3[30] further improved the face verification accuracy on the LFW database. Gong, et al.[31] proposed a multi-label deep convolutional ranking network to address the multi-label image annotation problem; they adopted the architecture proposed in Ref.[25] as a basic architecture and redesigned a multi-label ranking cost layer for multi-label prediction tasks. Zhu, et al.[32] proposed a multi-label convolutional neural network for pedestrian attribute classification, which learns single-scale features for all attributes.


The most popular components in recent CNN research include the rectified linear unit (ReLU) neuron[33], dropout[34], batch normalization[35], adding supervisory signals to intermediate layers[36], multi-scale fully connected layers[37], the joint identification-verification cost function[29], small filters with very deep architectures[12], ResNet[38,39] and DenseNet[40].

2 Multi-scale and multi-label convolutional neural network

2.1 Network architecture

As shown in Fig.2, the proposed multi-scale and multi-label convolutional neural network (MSMLCNN) adopts a VGGNet[12]-style architecture that places two consecutive convolutional layers before each of the first two max pooling layers. Moreover, different from VGGNet[12], the proposed MSMLCNN adds supervisory signals to the last three pooling layers (i.e., layers 8, 10 and 12). The VGGNet architecture has shown that consecutive convolutions are able to learn features with larger receptive fields and obtain more complex nonlinearity while restricting the number of parameters.

Fig.2 The architecture of the deep multi-scale and multi-label convolutional neural network (MSMLCNN) used in the proposed method

Refs[30,36] have shown the benefits of introducing supervisory signals to multiple layers in two aspects. First, it is useful for learning more discriminative mid-level features. Second, it alleviates gradient vanishing and makes the optimization of a very deep neural network easier. However, in Refs[30,36], when training a deep CNN, supervisory signals were added by connecting some intermediate layers with multiple cost functions configured with different weights. In the testing process, only the prediction model learned on the last layer is used for prediction, while the prediction models learned on the intermediate layers are discarded. This way of adding supervisory signals is not completely suitable for pedestrian attribute classification, for the following reasons.

First, it is very difficult to assign suitable weights to the different attribute cost functions connected with different layers during training. Second, for attributes with complex localization characteristics and different scales, using only the features learned in the last layer is inappropriate, because those features are too global for local attributes (e.g., hassunglasses, upperBodyVNeck and footwearSandals). Therefore, as shown in Fig.2, in the proposed MSMLCNN, supervisory signals are added by fully connecting each attribute with multiple pooling layers at different scales, which enables the MSMLCNN to learn multi-scale features for different attributes and to apply multi-scale features in the testing phase. Parameter details of the proposed MSMLCNN are listed in Table 1, where C, MP and AP represent convolutional, max pooling and average pooling layers, respectively.

Table 1 Parameter details of the proposed MSMLCNN

Layer  Type  Size       Neuron  Filter/stride
1      C     128×48×64  ReLU    3×3/1
2      C     128×48×64  ReLU    3×3/1
3      MP    64×24×64   -       3×3/2
4      C     64×24×96   ReLU    3×3/1
5      C     64×24×96   ReLU    3×3/1
6      MP    32×12×96   -       3×3/2
7      C     32×12×128  ReLU    3×3/1
8      MP    16×6×128   -       3×3/2
9      C     16×6×160   ReLU    3×3/1
10     MP    8×3×160    -       3×3/2
11     C     8×3×192    -       3×3/1
12     AP    6×1×192    -       3×3/1
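For concreteness, the following is a minimal PyTorch sketch of the trunk in Table 1 and the multi-scale head of Fig.2; it is an assumption of this rewrite, not the authors' cuda-convnet implementation. The flattened outputs of pooling layers 8, 10 and 12 are concatenated and fully connected to one sigmoid unit per binary attribute.

```python
import torch
import torch.nn as nn

class MSMLCNN(nn.Module):
    """Sketch of Table 1: all convolutions are 3x3/1, all max poolings 3x3/2."""
    def __init__(self, num_attributes=82):
        super().__init__()
        def convs(cin, cout, n):
            layers = []
            for i in range(n):
                layers += [nn.Conv2d(cin if i == 0 else cout, cout, 3, padding=1),
                           nn.ReLU(inplace=True)]
            return nn.Sequential(*layers)
        self.l1_2 = convs(3, 64, 2)                        # layers 1-2: 128x48x64
        self.l3   = nn.MaxPool2d(3, stride=2, padding=1)   # layer 3:   64x24x64
        self.l4_5 = convs(64, 96, 2)                       # layers 4-5: 64x24x96
        self.l6   = nn.MaxPool2d(3, stride=2, padding=1)   # layer 6:   32x12x96
        self.l7   = convs(96, 128, 1)                      # layer 7:   32x12x128
        self.l8   = nn.MaxPool2d(3, stride=2, padding=1)   # layer 8:   16x6x128 (scale 1)
        self.l9   = convs(128, 160, 1)                     # layer 9:   16x6x160
        self.l10  = nn.MaxPool2d(3, stride=2, padding=1)   # layer 10:  8x3x160 (scale 2)
        self.l11  = nn.Conv2d(160, 192, 3, padding=1)      # layer 11:  8x3x192 (no ReLU in Table 1)
        self.l12  = nn.AvgPool2d(3, stride=1)              # layer 12:  6x1x192 (scale 3)
        feat_dim = 16*6*128 + 8*3*160 + 6*1*192            # concatenated multi-scale feature
        self.head = nn.Linear(feat_dim, num_attributes)    # one logit per binary attribute

    def forward(self, x):                                  # x: (N, 3, 128, 48)
        s1 = self.l8(self.l7(self.l6(self.l4_5(self.l3(self.l1_2(x))))))
        s2 = self.l10(self.l9(s1))
        s3 = self.l12(self.l11(s2))
        feats = torch.cat([s1.flatten(1), s2.flatten(1), s3.flatten(1)], dim=1)
        return torch.sigmoid(self.head(feats))             # per-attribute probabilities
```

Here a single linear layer over the concatenated features is equivalent to fully connecting each attribute with the three pooling layers, since each output unit has its own weight vector over all scales.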

2.2 Cost function design

In order to enable the proposed MSMLCNN to predict all attributes simultaneously, all attributes are fully connected with the last three pooling layers, as shown in Fig.2. In practice, there are not only binary-class attributes but also multi-class attributes, such as clothing color. To make the design of each attribute's cost function consistent, the pedestrian attribute classification problem is transformed into a multi-label classification problem consisting of multiple binary attribute classification problems, and each binary attribute classification problem is then solved with a logistic regression model[41]. The cost function of the proposed MSMLCNN is formulated as follows:

$G = \sum_{k=1}^{K} \lambda_k G_k$    (1)

where $G_k$ is the cost function of the k-th attribute, $K$ is the total number of attributes, and $\lambda_k \ge 0$ is a parameter used to control the contribution of the k-th attribute classification. In this work, $\lambda_k$ is set as $\lambda_k = 1/K$ and $G_k$ is formulated as follows:

$G_k = -\frac{1}{N}\sum_{n=1}^{N}\left[\, y_n^k \log p_k(x_n) + \left(1-y_n^k\right)\log\left(1-p_k(x_n)\right) \right]$    (2)

where $\{x_n, y_n^k\}$ denotes a training sample, with $y_n^k \in \{0,1\}$ the k-th attribute label of the n-th sample $x_n$; $N$ is the number of training samples; $p_k(x_n) = \sigma(w_k^{\mathrm{T}} f(x_n))$ is the logistic prediction of the k-th attribute, where $\sigma(\cdot)$ is the sigmoid function, $f(x_n)$ denotes the multi-scale features of $x_n$, and $w_k$ represents the fully connected parameters between the k-th attribute and the last pooling layers. To avoid imbalanced classification, Eq.(2) is further extended as:

$G_k = -\sum_{n=1}^{N}\left[\, \frac{y_n^k}{2N_k^{+}} \log p_k(x_n) + \frac{1-y_n^k}{2N_k^{-}} \log\left(1-p_k(x_n)\right) \right]$    (3)

where $N_k^{-}$ and $N_k^{+}$ are the numbers of negative and positive samples of the k-th attribute, respectively. The learning toolbox used for the MSMLCNN is cuda-convnet, which can be found at https://code.google.com/p/cuda-convnet/.
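The class-balanced multi-label cost of Eqs.(1)-(3) can be sketched as follows; this is a hedged reading of the formulas above, not the authors' code, and the factor-of-two normalization follows the reconstruction of Eq.(3). Each attribute's positive and negative log-likelihood terms are normalized by the class counts $N_k^+$ and $N_k^-$, and the per-attribute costs are averaged with $\lambda_k = 1/K$.

```python
import torch

def msml_cost(probs, labels, eps=1e-7):
    """probs, labels: (N, K) tensors with labels in {0, 1}; returns the scalar cost G."""
    labels = labels.float()
    n_pos = labels.sum(dim=0).clamp(min=1.0)               # N_k^+ per attribute
    n_neg = (1.0 - labels).sum(dim=0).clamp(min=1.0)       # N_k^- per attribute
    pos = (labels * torch.log(probs + eps)).sum(dim=0) / (2.0 * n_pos)
    neg = ((1.0 - labels) * torch.log(1.0 - probs + eps)).sum(dim=0) / (2.0 * n_neg)
    g_k = -(pos + neg)                                     # Eq.(3), one cost per attribute
    return g_k.mean()                                      # Eq.(1) with lambda_k = 1/K
```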

3 Experiment and analysis


3.1 Setup

The challenging PETA[11] database is used to validate the superiority of the proposed MSMLCNN based pedestrian attribute classification method. The PETA database consists of 10 subsets, such as VIPeR, PRID, GRID and CAVIAR4REID. Therefore, the PETA database is very complex and contains variations of camera view, illumination, resolution and scenario. PETA includes 19000 images and each image is annotated with 65 attributes, such as gender, age, hairlength and clothingcolor.

All images of the PETA database are scaled to 128×48 pixels. Following the evaluation protocol in Ref.[11], PETA is divided into non-overlapping train, validation and test subsets, which include 9500, 1900 and 7600 images, respectively. Both the train and validation subsets are augmented by translation and mirror operations. All multi-class attributes are transformed into multiple binary-class attributes. A binary attribute is considered imbalanced and discarded if the number of its positive samples is less than 50. After discarding imbalanced attributes, 82 binary attributes remain, as shown in Table 2. For these 82 attributes, each attribute's classification accuracy and recall rate at a false positive rate (FPR) of 10% are reported, along with the area under the ROC curve (AUC), while the two baseline methods in Ref.[42] only provide classification accuracies for 35 attributes.

For the proposed method, both multi-scale and single-scale configurations are evaluated. As shown in Fig.2, the multi-scale configuration of the multi-label CNN (MSMLCNN) uses the features learned in the last three pooling layers (i.e., layers 8, 10 and 12), while the single-scale configuration (SSMLCNN) only uses the features learned in the last pooling layer (i.e., layer 12). For both SSMLCNN and MSMLCNN, the average pooling layer (i.e., layer 12) is followed by a dropout layer with a 0.5 ratio, and default thresholds (i.e., 0.5) are used to obtain the classification accuracy of each attribute.

A comparison is made between the two baseline methods proposed in Ref.[42] and the proposed method. The first baseline, ikSVM[42], is an SVM-based method. The second, MRFr2[42], exploits the context of neighboring images with a Markov random field (MRF) to improve performance. The MRF is an undirected graph, where each node represents a random variable and each edge represents the relation between two connected nodes. The unary energy term is the probability predicted by ikSVM, while the pairwise energy term is the similarity between two neighboring images learned by a random forest (RF). Both baseline methods use foreground masks to improve feature extraction.
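As an illustration of the attribute-filtering rule described above, the following short sketch (a hypothetical helper, not from the paper) drops binary attributes with fewer than 50 positive samples:

```python
import numpy as np

def keep_balanced_attributes(labels, min_positives=50):
    """labels: (N, K) 0/1 array; returns filtered labels and kept column indices."""
    keep = labels.sum(axis=0) >= min_positives   # positives per attribute vs. threshold
    return labels[:, keep], np.flatnonzero(keep)
```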

3.2 Convergence comparison between SSMLCNN and MSMLCNN

As shown in Fig.3, MSMLCNN converges faster than SSMLCNN on the train subset. Moreover, on the validation subset, MSMLCNN also obtains better performance than SSMLCNN. This result illustrates that adding supervisory signals by fully connecting each attribute with multiple pooling layers at different scales is helpful for training a deep CNN.

Fig.3 The convergence comparison between SSMLCNN and MSMLCNN (TR and VAL denote the training and validation subsets)

3.3 Performance comparison

In Table 2, the results of the two baseline methods are directly cited from Ref.[42], where only 35 attributes' accuracies are reported; results that were not reported in Ref.[42] are recorded as N/A. For the first 35 attributes, Table 2 shows that both SSMLCNN and MSMLCNN outperform the two baseline methods for most attributes, obtaining 9.5% and 12.0% average accuracy improvements over the better baseline method (i.e., MRFr2), respectively.

Table 2 The performance comparison of ikSVM[42], MRFr2[42], SSMLCNN and MSMLCNN

No.  Attribute           Accuracy rate(%)                    Recall rate(%)@FPR=10%   AUC(%)
                         ikSVM   MRFr2   SSMLCNN  MSMLCNN    SSMLCNN  MSMLCNN         SSMLCNN  MSMLCNN
1    age16-30            83.1    86.8    75.2     79.1       48.7     60.0            82.01    86.32
2    age31-45            77.6    83.1    75.1     79.1       47.5     56.6            78.86    82.76
3    age46-60            79.1    80.1    90.3     91.7       58.6     64.9            79.55    85.00
4    ageabove60          93.5    93.8    95.7     97.3       82.6     88.7            92.46    94.78
5    backpack            70.7    70.5    78.3     82.3       44.5     49.6            80.23    82.83
6    carryingother       66.9    73.0    76.1     77.5       39.1     45.3            74.26    77.79
7    lowercasual         76.5    78.2    87.0     90.2       30.3     42.1            80.16    84.64
8    uppercasual         76.0    78.1    85.4     89.1       30.9     40.6            79.71    83.83
9    lowerformal         76.6    79.0    87.5     90.2       57.7     67.5            82.33    86.48
10   upperformal         76.8    78.7    87.6     90.7       57.6     68.1            81.64    86.30
11   hat                 89.4    90.4    92.8     95.0       77.2     82.6            89.15    91.53
12   upperjacket         69.6    72.2    89.7     93.4       49.8     54.8            80.74    81.95
13   lowerjeans          79.8    81.0    77.5     82.1       52.2     65.7            80.83    86.64
14   leathershoes        84.0    87.2    80.2     83.5       60.2     68.5            83.37    88.43
15   upperlogo           53.4    52.7    89.2     91.9       33.8     45.0            74.50    79.76
16   longhair            79.4    80.1    81.1     84.7       54.9     66.5            81.86    87.63
17   male                84.6    86.5    76.5     81.1       52.6     66.0            82.91    88.64
18   messengerbag        74.8    78.3    75.5     76.3       49.4     53.9            76.83    80.28
19   muffler             92.2    93.7    95.3     96.1       80.2     89.4            91.50    95.32
20   nocarrying          72.5    76.5    75.9     78.6       45.9     49.7            76.69    80.32
21   noaccessory         79.2    82.7    82.2     82.9       31.8     46.0            80.27    83.93
22   upperplaid          65.1    65.2    94.5     95.2       39.6     51.8            79.09    84.15
23   plasticbags         79.0    81.3    90.0     93.7       60.5     70.6            82.56    86.99
24   sandals             51.9    52.2    95.1     96.9       44.1     60.5            79.82    85.56
25   shoes               72.0    78.4    70.8     74.6       44.1     49.0            75.61    78.93
26   lowershorts         65.2    65.2    94.3     95.5       58.6     71.1            84.61    89.86
27   uppershortsleeve    75.1    75.8    86.7     87.8       53.6     63.0            82.93    88.36
28   lowershortskirt     69.6    69.6    93.6     94.3       60.2     66.9            84.25    87.65
29   upperthinstripes    51.9    51.9    93.0     96.8       28.2     23.7            71.33    70.92
30   uppersweater        71.5    75.0    94.8     96.6       53.4     53.0            80.36    78.74
31   sunglasses          53.3    53.5    95.0     96.0       46.8     60.7            82.07    87.01
32   lowertrousers       77.9    82.2    70.6     74.7       39.9     52.2            77.49    82.64
33   upperT-shirt        71.1    71.4    90.5     91.3       49.7     61.5            81.25    87.36
34   upperv-neck         53.3    53.3    96.5     97.7       40.9     52.3            75.59    81.53
35   upperother          83.2    87.3    75.0     78.9       62.6     67.5            82.79    86.65
     average(1-35)       73.6    75.6    85.1     87.6       50.1     59.2            80.80    85.02
36   ageless15           N/A     N/A     98.5     99.2       64.8     69.0            88.41    88.84
37   hairband            N/A     N/A     94.3     94.8       44.2     46.8            76.91    79.03
38   kerchief            N/A     N/A     99.5     99.6       85.7     87.0            92.60    93.22
39   babybuggy           N/A     N/A     97.9     99.1       80.0     90.9            91.90    96.36
40   folder              N/A     N/A     94.1     98.0       40.3     42.5            73.12    74.60
41   luggagecase         N/A     N/A     96.0     98.3       66.9     79.9            88.25    93.14
42   suitcase            N/A     N/A     95.9     97.9       59.8     63.0            86.89    89.40
43   lowerhotpants       N/A     N/A     98.8     99.5       89.6     90.3            95.71    95.48
44   lowercapri          N/A     N/A     95.1     97.2       38.6     51.8            77.22    82.84
45   lowersuits          N/A     N/A     94.4     95.6       59.7     73.7            84.51    89.53
46   lowerlongskirt      N/A     N/A     96.7     98.3       70.1     73.0            87.19    87.99
47   uppersuit           N/A     N/A     95.2     96.3       59.6     73.5            84.33    89.01
48   upperlongsleeve     N/A     N/A     86.3     88.0       33.8     63.2            81.23    87.33
49   uppernosleeve       N/A     N/A     96.6     97.9       69.9     74.7            87.68    92.44
50   upperthickstripes   N/A     N/A     96.5     98.5       28.3     45.0            70.52    78.81
51   shorthair           N/A     N/A     78.4     83.3       38.9     59.7            80.30    86.14
52   hairbald            N/A     N/A     96.8     98.1       53.0     66.9            77.73    86.55
53   hairblack           N/A     N/A     81.0     86.0       56.0     73.4            86.15    91.19
54   hairbrown           N/A     N/A     86.5     88.8       66.3     75.3            85.92    89.85
55   hairgrey            N/A     N/A     92.6     95.1       59.9     69.9            81.33    87.03
56   hairwhite           N/A     N/A     97.3     98.5       85.1     91.7            93.61    96.97
57   hairyellow          N/A     N/A     93.5     96.2       61.4     77.2            85.72    91.35
58   upperblack          N/A     N/A     82.5     84.8       72.1     77.2            89.73    92.29
59   upperblue           N/A     N/A     91.9     94.9       71.4     80.7            88.31    92.80
60   upperbrown          N/A     N/A     91.5     93.7       63.9     71.0            85.84    89.22
61   uppergreen          N/A     N/A     92.5     97.3       49.8     70.2            80.37    89.02
62   uppergrey           N/A     N/A     81.4     85.1       47.2     53.0            80.27    83.07
63   upperpurple         N/A     N/A     93.5     97.8       50.0     71.2            77.35    89.77
64   upperred            N/A     N/A     93.8     96.6       75.0     87.6            90.94    94.94
65   upperwhite          N/A     N/A     86.2     88.0       64.4     72.9            86.64    90.43
66   upperyellow         N/A     N/A     95.4     98.8       61.0     77.9            82.84    91.95
67   upperpink           N/A     N/A     96.7     98.3       47.8     70.4            82.15    88.39
68   upperorange         N/A     N/A     96.6     98.9       66.7     77.0            88.74    88.77
69   lowerblack          N/A     N/A     79.8     82.2       61.8     70.6            87.17    90.20
70   lowerblue           N/A     N/A     84.5     87.0       65.3     77.9            86.55    91.45
71   lowerbrown          N/A     N/A     94.5     96.4       61.4     73.6            84.67    90.15
72   lowergrey           N/A     N/A     76.8     82.9       43.6     55.7            76.56    83.23
73   lowerred            N/A     N/A     98.4     99.2       72.4     82.8            90.73    95.44
74   lowerwhite          N/A     N/A     93.6     94.6       68.8     78.2            87.93    91.71
75   footwearblack       N/A     N/A     70.6     74.6       44.7     56.4            77.65    83.08
76   footwearbrown       N/A     N/A     89.0     91.2       55.6     70.0            81.33    87.42
77   footweargrey        N/A     N/A     83.7     84.6       44.3     49.2            77.53    79.98
78   footwearred         N/A     N/A     93.9     98.4       59.3     72.8            85.94    92.20
79   footwearwhite       N/A     N/A     79.7     82.8       40.0     57.9            77.91    84.30
80   boots               N/A     N/A     94.3     96.2       63.6     70.8            85.18    87.47
81   sneakers            N/A     N/A     78.1     78.9       39.9     49.2            78.79    83.00
82   stocking            N/A     N/A     95.3     96.3       73.3     78.2            89.53    91.65
     average(1-82)       N/A     N/A     88.7     91.1       55.4     65.4            82.77    87.08

Comparing MSMLCNN with SSMLCNN, it can be observed that MSMLCNN obtains better performance for almost all attributes, and thus higher average recall rates and AUCs, as shown in Fig.4 and Fig.5. Specifically, for the first 35 attributes, the average recall rate and AUC of MSMLCNN are 9.1% and 4.22% higher than those of SSMLCNN, respectively. Moreover, over all 82 attributes, the average recall rate and AUC of MSMLCNN are 10.0% and 4.31% higher than those of SSMLCNN, respectively. The average ROC curves over the first 35 attributes and over all 82 attributes, resulting from SSMLCNN and MSMLCNN, are shown in Fig.4 and Fig.5, respectively; both figures show that MSMLCNN has better ROC performance than SSMLCNN. These results demonstrate that the multi-scale features learned by MSMLCNN are helpful for improving pedestrian attribute classification performance.

Fig.4    The average (1-35) attribute ROC curves by using SSMLCNN and MSMLCNN

Fig.5    The average (1-82) attribute ROC curves by using SSMLCNN and MSMLCNN
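For reference, the recall rate at FPR = 10% and the AUC reported in Table 2 can be computed from per-attribute scores roughly as follows. This is a sketch using standard ROC machinery from scikit-learn, which the paper itself does not mention:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

def recall_at_fpr(y_true, y_score, target_fpr=0.10):
    """TPR (recall) of one attribute read off its ROC curve at the target FPR."""
    fpr, tpr, _ = roc_curve(y_true, y_score)
    return float(np.interp(target_fpr, fpr, tpr))    # linear interpolation on the ROC

def attribute_auc(y_true, y_score):
    return roc_auc_score(y_true, y_score)            # area under the ROC curve
```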

4 Conclusion

In this paper, a multi-scale and multi-label convolutional neural network (MSMLCNN) is proposed to predict multiple pedestrian attributes simultaneously. The multi-attribute classification problem is transformed into a multi-label classification problem comprising multiple binary attribute classification problems, which are then solved simultaneously by fully connecting each attribute with the multi-scale features learned by the MSMLCNN. The multi-scale features are obtained by concatenating the feature maps produced by multiple pooling layers of the MSMLCNN at different scales. Using multi-scale features for pedestrian attribute classification has two benefits: it alleviates gradient vanishing and makes the optimization of a very deep neural network easier, and it improves attribute classification accuracy by applying both local and global features. Extensive experiments show that the proposed MSMLCNN outperforms state-of-the-art methods by a large margin.


References

[ 1] Vaquero D, Feris R, Tran D, et al. Attribute-based people search in surveillance environments[C]. In: Proceedings of IEEE Workshop on Applications of Computer Vision, Snowbird, Utah, 2009. 1-8

[ 2] Jaha E S, Nixon M S. Analysing soft clothing biometrics for retrieval[C]. In: Proceedings of the 1st International Workshop on Biometric Authentication, Sofia, Bulgaria, 2014. 234-245

[ 3] Jaha E S, Nixon M S. Soft biometrics for subject identification using clothing attributes[C]. In: Proceedings of IEEE International Joint Conference on Biometrics, Clearwater, Florida, USA, 2014. 1-6




[ 4] Reid D A, Nixon M S, Stevenage S V. Soft biometrics: human identification using comparative descriptions[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(6):1216-1228

[ 5] Martinson E, Lawson W, Trafton J G. Identifying people with soft-biometrics at fleet week[C]. In: Proceedings of ACM/IEEE International Conference on Human-Robot Interaction, Chicago, USA, 2013. 49-56

[ 6] Layne R, Hospedales T M, Gong S G, et al. Person re-identification by attributes[C]. In: Proceedings of British Machine Vision Conference, Guildford, UK, 2012. 1-8

[ 7] Zhu J Q, Liao S C, Yi D, et al. Multi-label CNN based pedestrian attribute learning for soft biometrics[C]. In: Proceedings of International Conference on Biometrics, Phuket, Thailand, 2015. 535-540

[ 8] Zhu J Q, Liao S C, Lei Z, et al. Pedestrian attribute classification in surveillance: Database and evaluation[C]. In: Proceedings of Workshop on IEEE International Conference on Computer Vision, Sydney, Australia, 2013. 331-338

[ 9] Layne R, Hospedales T M, Gong S G. Attributes-based Re-identification[M]. London: Springer, 2014. 93-117

[10] An L, Chen X J, Kafai M, Yang S F, et al. Improving person re-identification by soft biometrics based re-ranking [C]. In: Proceedings of International Conference on Distributed Smart Cameras, Palm Springs, USA, 2013. 1-6

[11] Deng Y B, Luo P, Loy C C, et al. Pedestrian attribute recognition at far distance[C]. In: Proceedings of ACM Multimedia, Orlando, USA, 2014. 789-792

[12] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[EB/OL]. https://arxiv.org/abs/1409.1556: arxiv, 2014


[13] Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks[C]. In: Proceedings of International Conference on Artificial Intelligence and Statistics,Sardinia, Italy, 2010. 249-256

[14] Gray D, Brennan S, Tao H. Evaluating appearance models for recognition, reacquisition, and tracking[C]. In: Proceedings of IEEE International Workshop on Performance Evaluation of Tracking and Surveillance, Rio de Janeiro, Brazil, 2007. 1-7

[15] Hirzer M, Beleznai C, Roth P M, et al. Person Re-identification by Descriptive and Discriminative Classification[M]. London: Springer, 2011. 91-102

[16] Liu C X, Gong S G, Loy C C, et al. Person re-identification: what features are important?[C]. In: Proceedings of Workshop on European Conference on Computer Vision, Florence, Italy, 2012. 391-401

[17] Zhang X Y. Simultaneous optimization for robust correlation estimation in partially observed social network[J]. Neurocomputing, 2016, 205: 455-462

[18] Zhu X B, Liu J, Wang J Q, et al. Sparse representation for robust abnormality detection in crowded scenes[J]. Pattern Recognition, 2014, 47(5):1791-1799

[19] Friedman J, Hastie T, Tibshirani R, et al. Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors)[J]. The Annals of Statistics, 2000, 28(2):337-407

[20] Zhang X Y, Wang S P, Zhu X B, et al. Update vs. upgrade: modeling with indeterminate multi-class active learning[J]. Neurocomputing, 2015, 162:163-170



[21] Chen H Z, Gallagher A, Girod B. Describing clothing by semantic attributes[C]. In: Proceedings of European Conference on Computer Vision, Florence, Italy, 2012. 609-623

[22] Zhu J Q, Liao S C, Lei Z, et al. Improve pedestrian attribute classification by weighted interactions from other attributes[C]. In: Proceedings of Workshop on Asian Conference on Computer Vision, Singapore, 2014. 545-557

[23] LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11):2278-2324

[24] Lee H, Grosse R, Ranganath R, et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations[C]. In: Proceedings of Annual International Conference on Machine Learning, Montreal, Canada, 2009. 609-616

[25] Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks[C]. In: Proceedings of Advances in Neural Information Processing Systems, Harrahs and Harveys, Lake Tahoe, USA, 2012. 1097-1105

[26] Donahue J, Jia Y Q, Vinyals O, et al. Decaf: a deep convolutional activation feature for generic visual recognition[EB/OL]. https://arxiv.org/abs/1310.1531: arxiv, 2013


[27] Razavian A S, Azizpour H, Sullivan J, et al. Cnn features off-the-shelf: an astounding baseline for recognition[C]. In: Proceedings of Workshop on IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, 2014. 512-519

[28] Sun Y, Wang X G, Tang X O. Deep learning face representation from predicting 10,000 classes[C]. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, 2014. 1891-1898

[29] Sun Y, Chen Y H, Wang X G, et al. Deep learning face representation by joint identification-verification[C]. In: Proceedings of Advances in Neural Information Processing Systems, Montréal, Canada, 2014. 1988-1996

[30] Sun Y, Liang D, Wang X G, et al. Deepid3: Face recognition with very deep neural networks[EB/OL]. https://arxiv.org/abs/1502.00873: arxiv, 2015

[31] Gong Y C, Jia Y Q, Leung T, et al. Deep convolutional ranking for multi-label image annotation[EB/OL]. https://arxiv.org/abs/1312.4894: arxiv, 2013

[32] Zhu J Q, Liao S C, Lei Z, et al. Multi-label convolutional neural network based pedestrian attribute classification[J]. Image and Vision Computing, 2016, 58(C):224-229

[33] Nair V, Hinton G E. Rectified linear units improve restricted Boltzmann machines[C]. In: Proceedings of International Conference on Machine Learning, Haifa, Israel, 2010. 807-814

[34] Hinton G E, Srivastava N, Krizhevsky A, et al. Improving neural networks by preventing co-adaptation of feature detectors[EB/OL]. https://arxiv.org/abs/1207.0580: arxiv, 2012

[35] Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift[EB/OL]. https://arxiv.org/abs/1502.03167: arxiv, 2015

[36] Szegedy C, Liu W, Jia Y Q, et al. Going deeper with convolutions[EB/OL]. https://arxiv.org/abs/1409.4842: arxiv, 2014

[37] Sermanet P, LeCun Y. Traffic sign recognition with multi-scale convolutional networks[C]. In: Proceedings of International Joint Conference on Neural Networks, Alaska, USA, 2011. 2809-2813


[38] He K M, Zhang Y Z, Ren S Q, et al. Deep residual learning for image recognition[EB/OL]. https://arxiv.org/abs/1512.03385: arxiv, 2015

[39] He K M, Zhang Y Z, Ren S Q, et al. Identity mappings in deep residual networks[C]. In: Proceedings of European Conference on Computer Vision, Amsterdam, Netherlands, 2016. 630-645

[40] Huang G, Liu Z, van der Maaten L, et al. Densely connected convolutional networks[C]. In: Proceedings of Computer Vision and Pattern Recognition, Honolulu, USA, 2017. 2261-2269

[41] Hosmer D W, Lemeshow S. Introduction to the Logistic Regression Model[M]. 2nd Edition. Hoboken, New Jersey, USA: John Wiley & Sons, 2000. 1-30

[42] Deng Y B, Luo P, Loy C C, et al. Learning to recognize pedestrian attribute[EB/OL]. https://arxiv.org/abs/1501.00901: arxiv, 2015


Zhu Jianqing, Zeng Huanqiang, Zhang Yuzhao, Zheng Lixin, Cai Canhui
High Technology Letters, 2018, No.1
