上一条: DTQAtten: Leveraging Dynamic Token-based Quantization for Efficient Attention Architecture
下一条: EBSP: Evolving Bit Sparsity Patterns for Hardware-Friendly Inference of Quantized Deep Neural Networks