Metric Learning for Semantic‑Based Clothes Retrieval

doi:10.12142/ZTECOM.202201010

ZTE Communications, 2022, Vol. 20, Issue 1: 76–82. DOI: 10.12142/ZTECOM.202201010

• Research Paper •

YANG Bo¹, GUO Caili¹,², LI Zheng¹

¹ Beijing Laboratory of Advanced Information Networks, Beijing University of Posts and Telecommunications, Beijing 100876, China
² Beijing Key Laboratory of Network System Architecture and Convergence, Beijing University of Posts and Telecommunications, Beijing 100876, China

Received: 2021-11-15 Online: 2022-03-25 Published: 2022-04-06

About the authors:

YANG Bo received the B.S. degree in communication engineering from Beijing University of Posts and Telecommunications (BUPT), China in 2019. He is currently pursuing the M.S. degree in information and communication engineering at BUPT. His current research interests include computer vision and image retrieval.

GUO Caili (guocaili@bupt.edu.cn) received the Ph.D. degree in communication and information systems from Beijing University of Posts and Telecommunications (BUPT), China in 2008. She is currently a professor in the School of Information and Communication Engineering, BUPT. Her general research interests include machine learning and statistical signal processing, with current emphasis on semantic communications, deep learning, and intelligent visual computing. In these areas, she has published over 200 papers and holds over 30 granted patents. She won the Diamond Best Paper Award of IEEE ICME 2018 and the Best Paper Award of IEEE WCNC 2021.

LI Zheng received the B.S. degree in telecommunication engineering from Shandong University, China in 2016, and the M.S. degree in information and communication engineering from Beijing University of Posts and Telecommunications (BUPT), China in 2019. He is currently pursuing the Ph.D. degree in information and communication engineering at BUPT. His current research interests include computer vision and multimedia retrieval.

Abstract

Existing clothes retrieval methods mostly adopt binary supervision in metric learning. In each iteration, only clothes belonging to the same instance as the query are treated as positive samples, and all other clothes are treated as indistinguishable negative samples. This causes two problems. First, the relevance between the query and a candidate is reduced to relevant or irrelevant, which makes it difficult for the model to learn the continuous semantic similarities between clothes. Second, clothes that do not belong to the same instance are considered completely irrelevant and are pushed away from the query by a uniform margin in the embedding space, which is inconsistent with the ideal retrieval results. Motivated by this, we propose a novel method called semantic-based clothes retrieval (SCR). In SCR, we measure the semantic similarities between clothes and design a new adaptive loss based on these similarities. The margin in the proposed adaptive loss varies with the semantic similarity between the anchor and each negative sample. In this way, a more coherent embedding space can be learned, in which candidates with higher semantic similarity are mapped closer to the query than those with lower similarity. We conduct experiments on the DeepFashion dataset with Recall@K and normalized Discounted Cumulative Gain (nDCG) as evaluation metrics, and the proposed adaptive loss outperforms the triplet and quadruplet losses on both.
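
The idea of a margin that varies with semantic similarity can be illustrated with a minimal sketch. The abstract does not give the exact form of the adaptive loss, so the linear rule below (margin = base_margin × (1 − similarity)), the function names, and the default base margin of 0.3 are all assumptions made for illustration, not the formulation used in SCR.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def adaptive_triplet_loss(anchor, positive, negative, similarity, base_margin=0.3):
    """Triplet hinge loss whose margin shrinks for semantically similar negatives.

    similarity is assumed to be a normalized semantic similarity in [0, 1]
    between the anchor and the negative sample. The linear margin rule is an
    illustrative assumption, not the formula from the paper.
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must lie in [0, 1]")
    # A dissimilar negative (similarity near 0) gets the full margin;
    # a highly similar negative (similarity near 1) gets almost none.
    margin = base_margin * (1.0 - similarity)
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

With a fixed margin, every negative is pushed equally far from the query. Under the assumed rule, a negative that is semantically close to the anchor incurs a smaller penalty at the same distance, so it is allowed to stay nearer to the query in the embedding space.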

Key words: clothes retrieval, metric learning, semantic-based retrieval

YANG Bo, GUO Caili, LI Zheng. Metric Learning for Semantic‑Based Clothes Retrieval[J]. ZTE Communications, 2022, 20(1): 76-82.

Figures/Tables 7

Figure 1 Ideal relationship between clothes in different categories

Figure 2 Frequency statistical graph of semantic similarity

Figure 3 Normalized semantic similarity visualization

Table 1 Comparison with state-of-the-art methods on DeepFashion consumer-to-shop benchmark

Methods	R@1	R@20	R@50
FashionNet [11]	7.0	18.8	22.8
VAM+Nonshared [26]	11.3	38.8	51.5
VAM+Product [26]	13.4	43.6	56.7
VAM+ImageDrop(192, 48) [26]	13.7	43.9	56.9
DREML [27]	18.6	51.0	59.1
KPM [28]	21.3	54.1	65.2
GRNet [17]	25.7	64.4	75.0
SCR (Adaptive)	29.2	51.0	61.4

Table 2 Instance-based retrieval results on DeepFashion

Loss Functions	R@1	R@5	R@10	mAP	Mean
Triplet	26.9	35.9	41.0	33.9	1 275
Quadruplet	26.3	35.1	40.3	33.3	1 348
Adaptive	29.2	38.6	44.0	36.6	1 091
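
The R@K columns can be read with the following sketch of Recall@K. It assumes the definition commonly used for instance-level retrieval, where a query counts as a hit if at least one correct item appears among its top K results; the function names and the input layout are hypothetical, and the page does not state the exact evaluation protocol.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Return 1.0 if any of the top-k ranked items is relevant, else 0.0."""
    return 1.0 if any(item in relevant_ids for item in ranked_ids[:k]) else 0.0


def mean_recall_at_k(queries, k):
    """Average Recall@K over queries.

    queries is a list of (ranked_ids, relevant_ids) pairs, where ranked_ids
    is the retrieval result ordered from best to worst match and
    relevant_ids is the set of ground-truth items for that query.
    """
    if not queries:
        raise ValueError("queries must not be empty")
    return sum(recall_at_k(r, rel, k) for r, rel in queries) / len(queries)
```

Because the metric only checks whether a correct instance appears in the top K, it says nothing about how semantically close the remaining results are to the query.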

Table 3 Semantic-based retrieval results on DeepFashion

Loss Functions	nDCG@1	nDCG@10	nDCG@50
Triplet	22.2	21.3	16.4
Quadruplet	21.8	20.9	16.1
Adaptive	23.9	22.8	17.5
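
For the nDCG@K columns, a minimal sketch of the metric with graded relevance follows. It assumes a linear gain and a log2 rank discount, in the spirit of cumulated gain-based evaluation [24]. Whether the paper uses a linear or exponential gain is not stated here, and for simplicity the ideal ranking is taken over the supplied list only, whereas a full evaluation would rank all candidates.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k graded relevance scores.

    relevances[i] is the relevance of the item at rank i + 1, for example
    its semantic similarity to the query.
    """
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG@k divided by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

Unlike Recall@K, this score rewards placing candidates with higher semantic similarity nearer the top even when they are not the same instance as the query.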

Figure 4 Qualitative retrieval comparison between triplet loss and our adaptive loss on consumer-to-shop benchmark

References 28

1	WIECZOREK M, MICHALOWSKI A, WROBLEWSKA A, et al. A strong baseline for fashion retrieval with person re-identification models [EB/OL]. (2020-03-09) [2021-11-05].
2	KIM S, SEO M, LAPTEV I, et al. Deep metric learning beyond binary supervision [EB/OL]. (2019-04-21) [2021-11-05]. DOI: 10.1109/CVPR.2019.00239
3	ZHOU M, NIU Z X, WANG L, et al. Ladder loss for coherent visual-semantic embedding [J]. Proceedings of the AAAI conference on artificial intelligence, 2020, 34(7): 13050–13057. DOI: 10.1609/aaai.v34i07.7006
4	WRAY M, DOUGHTY H, DAMEN D. On semantic similarity in video retrieval [EB/OL]. [2021-11-05].
5	WANG X W, ZHANG T. Clothes search in consumer photos via color matching and attribute learning [C]//MM '11: Proceedings of the 19th ACM international conference on Multimedia. ACM, 2011: 1353–1356. DOI: 10.1145/2072298.2072013
6	DI W, WAH C, BHARDWAJ A, et al. Style finder: fine-grained clothing style detection and retrieval [C]//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops. IEEE, 2013: 8–13. DOI: 10.1109/CVPRW.2013.6
7	FU J L, WANG J Q, LI Z C, et al. Efficient clothing retrieval with semantic preserving visual phrases [C]//Asian Conference on Computer Vision. ACCV, 2012: 420–431
8	GARCIA N, VOGIATZIS G. Dress like a star: Retrieving fashion products from videos [C]//Proceedings of 2017 IEEE International Conference on Computer Vision Workshops. IEEE, 2017: 2293–2299. DOI: 10.1109/ICCVW.2017.270
9	HUANG J, FERIS R S, CHEN Q, et al. Cross-domain image retrieval with a dual attribute-aware ranking network [C]//Proceedings of the IEEE International Conference on Computer Vision. IEEE, 2015: 1062–1070. DOI: 10.1109/ICCV.2015.127
10	KIAPOUR M H, HAN X F, LAZEBNIK S, et al. Where to buy it: matching street clothing photos in online shops [C]//Proceedings of 2015 IEEE International Conference on Computer Vision. IEEE, 2015: 3343–3351. DOI: 10.1109/ICCV.2015.382
11	LIU Z W, LUO P, QIU S, et al. DeepFashion: powering robust clothes recognition and retrieval with rich annotations [J]. 2016 IEEE conference on computer vision and pattern recognition (CVPR), 2016: 1096–1104. DOI: 10.1109/CVPR.2016.124
12	JI X, WANG W, ZHANG M H, et al. Cross-domain image retrieval with attention modeling [C]//Proceedings of the 25th ACM international conference on Multimedia. ACM, 2017: 1654–1662. DOI: 10.1145/3123266.3123429
13	SONG Y, LI Y, WU B, et al. Learning unified embedding for apparel recognition [EB/OL]. (2017-07-19) [2021-11-05].
14	CORBIÈRE C, BEN-YOUNES H, RAMÉ A, et al. Leveraging weakly annotated data for fashion image retrieval and label prediction [C]//Proceedings of 2017 IEEE International Conference on Computer Vision Workshops. IEEE, 2017: 2268–2274. DOI: 10.1109/ICCVW.2017.266
15	CHENG Z Q, WU X, LIU Y, et al. Video2Shop: exact matching clothes in videos to online shopping images [C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2017: 4169–4177. DOI: 10.1109/CVPR.2017.444
16	ZHANG Y H, PAN P, ZHENG Y, et al. Visual search at Alibaba [C]//Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2018: 993–1001. DOI: 10.1145/3219819.3219820
17	KUANG Z H, GAO Y M, LI G B, et al. Fashion retrieval via graph reasoning networks on a similarity pyramid [C]//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2019: 3066–3075. DOI: 10.1109/ICCV.2019.00316
18	LUO H, JIANG W, GU Y Z, et al. A strong baseline and batch normalization neck for deep person re-identification [J]. IEEE transactions on multimedia, 2020, 22(10): 2597–2609. DOI: 10.1109/TMM.2019.2958756
19	HADSELL R, CHOPRA S, LECUN Y. Dimensionality reduction by learning an invariant mapping [C]//Proceedings of 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 2006: 1735–1742. DOI: 10.1109/CVPR.2006.100
20	WEINBERGER K Q, SAUL L K. Distance metric learning for large margin nearest neighbor classification [J]. Journal of machine learning research, 2009, 10(2): 207–244
21	GORDO A, LARLUS D. Beyond instance-level image retrieval: leveraging captions to learn a global visual representation for semantic retrieval [C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2017: 5272–5281. DOI: 10.1109/CVPR.2017.560
22	ARANDJELOVIC R, GRONAT P, TORII A, et al. NetVLAD: CNN architecture for weakly supervised place recognition [C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016: 5297–5307. DOI: 10.1109/CVPR.2016.572
23	BALNTAS V, LI S D, PRISACARIU V. RelocNet: continuous metric learning relocalisation using neural nets [C]//Computer Vision. ECCV, 2018: 751–767. DOI: 10.1007/978-3-030-01264-9_46
24	JÄRVELIN K, KEKÄLÄINEN J. Cumulated gain-based evaluation of IR techniques [J]. ACM transactions on information systems, 2002, 20(4): 422–446. DOI: 10.1145/582415.582418
25	LIU T Y. Learning to rank for information retrieval [M]. Berlin, Germany: Springer, 2011
26	WANG Z H, GU Y J, ZHANG Y, et al. Clothing retrieval with visual attention model [C]//Proceedings of 2017 IEEE Visual Communications and Image Processing. IEEE, 2017: 1–4. DOI: 10.1109/VCIP.2017.8305144
27	XUAN H, SOUVENIR R, PLESS R. Deep randomized ensembles for metric learning [C]//Computer vision. ECCV, 2018: 723–734. DOI: 10.1007/978-3-030-01270-0_44
28	SHEN Y, XIAO T, LI H, et al. End-to-end deep kronecker-product matching for person re-identification [C]//Proceedings of the IEEE Conference on Computer Vision And Pattern Recognition. IEEE, 2018: 6886–6895. DOI: 10.48550/arXiv.1807.11182
