Optimizing LLMs for efficient inference on the edge

(Haider et al., 2024)

Details coming soon…

References

2024

Optimized Transformer Models: ℓ′ BERT with CNN-like Pruning and Quantization

Muhammad Hamis Haider, Stephany Valarezo-Plaza, Sayed Muhsin, and 2 more authors

In 2024 IEEE International Symposium on Circuits and Systems (ISCAS), 2024
