Previous: Memorize, Associate and Match: Embedding Enhancement via Fine-grained Alignment for Image-Text Retrieval
Next: Hallucinating uncertain motion and future for static image action recognition