上一条: Action-Aware Embedding Enhancement for Image-Text Retrieval
下一条: Memorize, Associate and Match: Embedding Enhancement via Fine-grained Alignment for Image-Text Retrieval