2026

[Preprint] FASER: Fine-Grained Phase Management for Speculative Decoding in Dynamic LLM Serving
Wenyan Chen, Chengzhi Lu, Yanying Lin, and Dmitrii Ustiugov
Preprint, 2026

[Preprint] CodecSight: Leveraging Video Codec Signals for Efficient Streaming VLM Inference
Yulin Zou, Yan Chen, Wenyan Chen, JooYoung Park, Shivaraman Nitin, Tao Luo, and 2 more authors
Preprint, 2026

[EuroSys] High Throughput and Low Latency LLM Serving via Adaptive KV Caching
Wenyan Chen, Chengzhi Lu, Huanle Xu, Kejiang Ye, and Chengzhong Xu
In Proceedings of the European Conference on Computer Systems, 2026

[INFOCOM] FedSUV: Validity and Utility-guided Client Selection for Federated Learning
Xiaosong Chen, Wenyan Chen, Yuanhang Chen, and Huanle Xu
In Proceedings of the IEEE International Conference on Computer Communications, 2026

2025

[SoCC] FedDance: Efficient Participant Selection for Federated Learning in Highly Dynamic Environments
Yuanhang Chen, Xiaosong Chen, Wenyan Chen, and Huanle Xu
In Proceedings of the Annual ACM Symposium on Cloud Computing, 2025

[EuroSys] Multiplexing Dynamic Deep Learning Workloads with SLO-awareness in GPU Clusters
Wenyan Chen, Chengzhi Lu, Huanle Xu, Kejiang Ye, and Chengzhong Xu
In Proceedings of the European Conference on Computer Systems, 2025

2024

[SC] SMIless: Serving DAG-based Inference with Dynamic Invocations under Serverless Computing
Chengzhi Lu, Huanle Xu, Yudan Li, Wenyan Chen, Kejiang Ye, and Chengzhong Xu
In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, 2024

[CSCWD] EINS: Edge-Cloud Deep Model Inference with Network-Efficiency Schedule in Serverless
Shijie Peng, Yanying Lin, Wenyan Chen, Yingfei Tang, Xu Duan, and Kejiang Ye
In Proceedings of the International Conference on Computer Supported Cooperative Work in Design, 2024

2023

[SC] Interference-aware Multiplexing for Deep Learning in GPU Clusters: A Middleware Approach
Wenyan Chen, Zizhao Mo, Huanle Xu, Kejiang Ye, and Chengzhong Xu
In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, 2023

2021

[CLUSTER] RPTCN: Resource Prediction for High-dynamic Workloads in Clouds Based on Deep Learning
Wenyan Chen, Chengzhi Lu, Kejiang Ye, Yang Wang, and Chengzhong Xu
In Proceedings of the IEEE International Conference on Cluster Computing, 2021

2020

[HPCC] HySync: Hybrid Federated Learning with Effective Synchronization
Guomei Shi, Li Li, Jun Wang, Wenyan Chen, Kejiang Ye, and Chengzhong Xu
In Proceedings of the IEEE International Conference on High Performance Computing & Communications, 2020

[HDIS] Understanding the Workload Characteristics in Alibaba: A View from Directed Acyclic Graph Analysis
Chengzhi Lu, Wenyan Chen, Kejiang Ye, and Chengzhong Xu
In Proceedings of the International Conference on High Performance Big Data and Intelligent Systems, 2020

[JCST] Interference Analysis of Co-located Container Workloads: A Perspective from Hardware Performance Counters
Wenyan Chen, Kejiang Ye, Chengzhi Lu, Dongdai Zhou, and Chengzhong Xu
Journal of Computer Science and Technology, 2020

2019

[HPCC] Co-locating Online Workload and Offline Workload in the Cloud: An Interference Analysis
Wenyan Chen, Kejiang Ye, and Chengzhong Xu
In Proceedings of the IEEE International Conference on High Performance Computing & Communications, 2019

[ICPADS] ADGS: Anomaly Detection and Localization Based on Graph Similarity in Container-based Clouds
Chengzhi Lu, Kejiang Ye, Wenyan Chen, and Chengzhong Xu
In Proceedings of the IEEE International Conference on Parallel and Distributed Systems, 2019

2018

[ICPADS] How Does the Workload Look Like in Production Cloud? Analysis and Clustering of Workloads on Alibaba Cluster Trace
Wenyan Chen, Kejiang Ye, Yang Wang, Guoyao Xu, and Chengzhong Xu
In Proceedings of the IEEE International Conference on Parallel and Distributed Systems, 2018