Researcher, Huawei Noah's Ark Lab
Hong Kong SAR, China
Email: haolibai [at] gmail.com
I am currently a researcher at Huawei Noah's Ark Lab. I obtained my Ph.D. degree from The Chinese University of Hong Kong in 2021, supervised by Prof. Michael R. Lyu and Prof. Irwin King. Prior to that, I received the B.Eng. degree from the Yingcai Honors College of the University of Electronic Science and Technology of China in 2017.
[Hiring] I am looking for research/engineering interns with a strong background in machine learning and natural language processing. Please drop me an email if you are interested. Base: HK or Shenzhen.
Acceleration of Large Language Models, Multi-modal Pre-training
Yuxuan Sun*, Ruikang Liu*, Haoli Bai†, Han Bao, Kang Zhao, Yuening Li, Jiaxin Hu, Xianzhi Yu, Lu Hou, Chun Yuan, Xin Jiang, Wulong Liu, Jun Yao
FlatQuant: Flatness Matters for LLM Quantization.
Preprint arXiv:2410.09426, 2024
[Code]
Zhiming Mao, Haoli Bai†, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong
Visually Guided Generative Text-Layout Pre-training for Document Intelligence
NAACL'24: The 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2024
[Code]
Ruikang Liu, Haoli Bai, Haokun Lin, Yuening Li, Han Gao, Zhengzhuo Xu, Lu Hou, Jun Yao, Chun Yuan
IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact
ACL'24 Findings: The 62nd Annual Meeting of the Association for Computational Linguistics, 2024
[Code]
Haokun Lin, Haoli Bai, Zhili Liu, Lu Hou, Muyi Sun, Linqi Song, Ying Wei, Zhenan Sun
MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric
CVPR'24: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Yingtao Zhang, Haoli Bai, Haokun Lin, Jialin Zhao, Lu Hou, Carlo Vittorio Cannistraci
Plug-and-Play: An Efficient Post-training Pruning Method for Large Language Models
ICLR'24: The Twelfth International Conference on Learning Representations, 2024
[Code]
Haoli Bai*, Zhiguang Liu*, Xiaojun Meng*, Wentao Li, Shuang Liu, Nian Xie, Rongfu Zheng, Liangwei Wang, Lu Hou, Jiansheng Wei, Xin Jiang, Qun Liu
Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding
ACL'23: The 61st Annual Meeting of the Association for Computational Linguistics, 2023
Chaofan Tao, Lu Hou, Haoli Bai, Jiansheng Wei, Xin Jiang, Qun Liu, Ping Luo, Ngai Wong
Structured Pruning for Efficient Generative Pre-trained Language Models
ACL'23 Findings: The 61st Annual Meeting of the Association for Computational Linguistics, 2023
Haoli Bai, Lu Hou, Lifeng Shang, Xin Jiang, Irwin King, Michael Lyu
Towards Efficient Post-training Quantization of Pre-trained Language Models
NeurIPS'22: Proceedings of the 36th conference on Neural Information Processing Systems, 2022.
Haoli Bai, Hongda Mao, Dinesh Nair
Dynamically Pruning SegFormer for Efficient Semantic Segmentation
ICASSP'22: IEEE International Conference on Acoustics, Speech and Signal Processing, 2022.
Haoli Bai, Wei Zhang, Lu Hou, Lifeng Shang, Jing Jin, Xin Jiang, Qun Liu, Michael Lyu, Irwin King
BinaryBERT: Pushing the Limit of BERT Quantization
ACL'21: The 59th Annual Meeting of the Association for Computational Linguistics, 2021. Accepted with scores 5, 5, 4.
[Code]
Haoli Bai*, Jiaxing Wang*, Jiaxiang Wu, Xupeng Shi, Junzhou Huang, Irwin King, Michael Lyu, Jian Cheng
Revisiting Parameter Sharing for Automatic Neural Channel Number Search
NeurIPS'20: Proceedings of the 34th conference on Neural Information Processing Systems, 2020.
[Code]
Haoli Bai, Jiaxiang Wu, Irwin King, Michael Lyu
Few Shot Network Compression via Cross Distillation
AAAI'20: Proceedings of the 34th AAAI Conference on Artificial Intelligence, 2020.
[Code]
[Poster]
Haoli Bai, Zhuangbin Chen, Michael Lyu, Irwin King, Zenglin Xu
Neural Relational Topic Models for Scientific Article Analysis
CIKM'18: Proceedings of The 27th International Conference on Information and Knowledge Management, 2018.
[Code]
Haoli Bai, Zenglin Xu, Bin Liu, Yingming Li
Hierarchical Probabilistic Matrix Factorization with Network Topology for Multi-relational Social Network
ACML'16: Proceedings of The 8th Asian Conference on Machine Learning, 2016, Best Student Paper Runner-up.
PocketFlow automatically searches for optimal model compression strategies such as network pruning, quantization, and knowledge distillation with little human effort, and also supports TFLite deployment on Android devices. It has collected 2,600+ stars and 480+ forks.