上一条: Image Composition Assessment with Saliency-augmented Multi-pattern Pooling
下一条: Video Semantic Segmentation with Sparse Temporal Transformer