上一条: Video Semantic Segmentation with Sparse Temporal Transformer
下一条: Zero-Shot Learning via Category-Specific Visual-Semantic Mapping and Label Refinement