- AtRec: Accelerating Recommendation Model Training on CPUs
Siqi Wang*, Tianyu Feng*, Hailong Yang, Xin You, Bangduo Chen, Tongxuan Liu, Zhongzhi Luan, Depei Qian
*Equal contribution.
[TPDS 2024] IEEE Transactions on Parallel and Distributed Systems
[source code] [html] [pdf]
- dgQuEST: Accelerating Large Scale Quantum Circuit Simulation through Hybrid CPU-GPU Memory Hierarchies
Tianyu Feng, Siyan Chen, Xin You, Shuzhang Zhong, Hailong Yang, Zhongzhi Luan, Depei Qian
[NPC 2021, Best Paper] Network and Parallel Computing: 18th IFIP WG 10.3 International Conference
[source code] [html] [pdf]
- Exploiting Input Tensor Dynamics in Activation Checkpointing for Efficient Training on GPU
Jianjin Liao, Mingzhen Li, Hailong Yang, Qingxiao Sun, Biao Sun, Jiwei Hao, Tianyu Feng, Fengwei Yu, Shengdong Chen, Ye Tao, Zicheng Zhang, Zhongzhi Luan, Depei Qian
[IPDPS 2023] IEEE International Parallel and Distributed Processing Symposium
[source code] [html] [pdf]
- Minions: Accelerating Large Language Model Inference with Adaptive and Collective Speculative Decoding
Siqi Wang, Hailong Yang, Xuezhu Wang, Tongxuan Liu, Pengbo Wang, Xuning Liang, Kejie Ma, Tianyu Feng, Xin You, Yongjun Bao, Yi Liu, Zhongzhi Luan, Depei Qian
arXiv 2024
[html] [pdf]