Malachy Xinyu Yang

He/They

Ph.D. Student

Electrical and Computer Engineering Department

Carnegie Mellon University

Welcome! I'm delighted to have you here on this page. Sending you a virtual but heartfelt greeting! 👋

I am currently a second-year Ph.D. student in the InfiniAI Lab and Catalyst at Carnegie Mellon University, where I am fortunate to be advised by Prof. Beidi Chen. I also collaborate closely with Prof. Tianqi Chen at CMU and Prof. Huaxiu Yao at UNC. Previously, I obtained my bachelor's degree from the ACM Honors Class, Zhiyuan College, Shanghai Jiao Tong University, where I conducted research with Prof. Junchi Yan at ThinkLab. I also had a wonderful time during internships with Prof. Song Han in the HAN Lab at MIT, Prof. Chelsea Finn in the IRIS Lab at Stanford, and Dr. Chen Luo at Amazon Search.

Additionally, I am a passionate community builder: I founded the Foundation Models in the Wild workshop series. Please follow us on Twitter for the latest news, or join our Slack for workshop questions and active discussions. I also co-organize the ASAP Seminar Series, which focuses on sequence modeling from an algorithmic perspective.

News

I dedicate 30 minutes every week to meetings where we can chat about life, career plans, or research ideas related to foundation models. I especially welcome and encourage students from underrepresented groups to reach out, and I will prioritize these meetings. Please email me to schedule a meeting if you are interested.
  • Jan 13, 2025: 📢 We welcome submissions to The 2nd Workshop on Foundation Models in the Wild at ICLR 2025. More details are available on our webpage.
  • Dec 20, 2024: 📣 Excited to share that our proposal for The 2nd Workshop on Foundation Models in the Wild has been accepted to ICLR 2025. Stay tuned for updates and details.
  • Dec 09, 2024: 📢 Our NeurIPS paper has been released: S2FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity. Check out the code and blog for more details.
  • July 26, 2024: I am lead-organizing the Workshop on Foundation Models in the Wild at ICML 2024. Thanks to all co-organizers and speakers for their support. See you in Vienna.

Research Highlights

My research centers on the intersection of machine learning systems and foundation models, with a specific focus on developing scalable and generalizable foundation model systems in the wild. Recently, I have been particularly interested in hardware-aware algorithm design with sub-linear complexity.

  1. Infinite-length Retrieval: For each query, how can we retrieve relevant information from an $O(n)$-length contextual cache in foundation models using only $O(\log(n))$ computations and positions? (See the sketch after this list.)
  2. Infinite-depth Reasoning: An $O(n)$-depth reasoning problem requires a Transformer with either $O(\log(n))$ layers or $O(n)$ width. Can we achieve equivalent capabilities by repeating layers?
  3. Infinite-volume Memory: How can we encode $O(n)$-volume knowledge into model parameters? The model size can grow linearly, but the computational cost should remain $O(\log(n))$ through sparse activation.
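
To make the first question concrete, below is a minimal sketch in Python (my own illustration under simplifying assumptions, not an implementation from any paper; build_tree and retrieve are hypothetical names). It organizes an $O(n)$-length key cache as a binary tree whose internal nodes store mean-pooled keys, so a query can be routed from root to leaf with $O(\log(n))$ dot products instead of scoring all $n$ positions. The greedy descent is approximate rather than exact top-1 retrieval.

import numpy as np

# Minimal sketch (my own illustration, not code from any paper): one way to get
# O(log n) retrieval over an O(n) key cache is to arrange the cached keys as a
# binary tree whose internal nodes store the mean key of their subtree, then
# greedily descend toward the child that scores higher against the query.

def build_tree(keys):
    """levels[0] holds the raw keys (zero-padded to a power of two); each
    higher level halves the node count by averaging sibling pairs."""
    n, d = keys.shape
    size = 1 << (n - 1).bit_length()  # next power of two
    leaves = np.zeros((size, d))
    leaves[:n] = keys
    levels = [leaves]
    while len(levels[-1]) > 1:
        levels.append(levels[-1].reshape(-1, 2, d).mean(axis=1))
    return levels

def retrieve(levels, query):
    """Greedy root-to-leaf descent: O(log n) dot products instead of O(n)."""
    idx = 0
    for level in reversed(levels[:-1]):  # from just below the root down to the leaves
        left, right = level[2 * idx], level[2 * idx + 1]
        idx = 2 * idx + int(query @ right > query @ left)
    return idx  # index of the retrieved cache position

rng = np.random.default_rng(0)
levels = build_tree(rng.standard_normal((1000, 64)))
print(retrieve(levels, rng.standard_normal(64)))

One could also read the "positions" half of the question through the same tree: a retrieved token might be addressed by its $O(\log(n))$-step root-to-leaf path rather than by its absolute position.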

Additionally, I am fascinated by contextual cache architectures whose structure goes beyond the traditional sequential pattern. This is important for multi-modal foundation models, which involve more diverse data structures.

  1. Uni-directional and Bi-directional Relations: Relations can typically be categorized as uni-directional or bi-directional. How can we separate the two with different attention masks and position embeddings? (A sketch follows this list.)
  2. One-to-many and Many-to-one Relations: Current cache architectures support only one-to-many relations. How can we further enable many-to-one relations to foster information aggregation?
  3. Static and Dynamic Relations: Current caches establish static relations, but operations involve dynamic relations, as they may neither depend on prior information nor affect subsequent information.
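
As a toy illustration of the first point, here is a short sketch (my own, with hypothetical names; not a description of any existing system). A single boolean attention mask can host both relation types at once: the union of a causal mask for uni-directional tokens and a block mask for bi-directional groups, such as the patches of one image.

import numpy as np

# Minimal sketch (my own illustration): a block-structured attention mask that
# mixes relation types in one cache. Tokens sharing a non-negative group id
# (e.g., the patches of one image) attend bi-directionally within the group;
# every other pair falls back to the usual causal, uni-directional mask.

def build_mask(group_ids):
    """group_ids[i] >= 0 tags token i as part of a bi-directional group, while
    -1 marks an ordinary causal token. Returns an (n, n) boolean matrix where
    mask[q, k] == True means position q may attend to position k."""
    n = len(group_ids)
    causal = np.tril(np.ones((n, n), dtype=bool))  # attend to k <= q
    same_group = (group_ids[:, None] == group_ids[None, :]) & (group_ids[:, None] >= 0)
    return causal | same_group  # union of the two relation types

# Example: two text tokens, a three-patch image (group 0), then two more text tokens.
ids = np.array([-1, -1, 0, 0, 0, -1, -1])
print(build_mask(ids).astype(int))

Position embeddings would need a matching treatment, for example giving the members of a bi-directional group positions that do not impose a spurious ordering.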

If you would like to chat more about these topics, please feel free to email me to schedule a meeting.

Publications

ICLR 2025
APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding
Xinyu Yang, Tianqi Chen, Beidi Chen

ICRA 2025
FlatFusion: Delving into Details of Sparse Transformer-based Camera-LiDAR Fusion for Autonomous Driving
Yutao Zhu, Xiaosong Jia, Xinyu Yang, Junchi Yan

Preprint
VcLLM: Video Codecs are Secretly Tensor Codecs
Ceyu Xu*, Yongji Wu*, Xinyu Yang*, Beidi Chen, Matthew Lentz, Danyang Zhuo, Lisa Wu Wills

Preprint
It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF
Taiming Lu*, Lingfeng Shen*, Xinyu Yang*, Weiting Tan, Beidi Chen, Huaxiu Yao

ICLR 2025
Zeroth-Order Fine-Tuning of LLMs with Extreme Sparsity
Wentao Guo, Jikai Long, Yimeng Zeng, Zirui Liu, Xinyu Yang, Yide Ran, Jacob R. Gardner, Osbert Bastani, Christopher De Sa, Xiaodong Yu, Beidi Chen, Zhaozhuo Xu

NeurIPS 2024
S2FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity
Xinyu Yang, Jixuan Leng, Geyang Guo, Jiawei Zhao, Ryumei Nakada, Linjun Zhang, Huaxiu Yao, Beidi Chen

COLM 2024
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
Hanshi Sun, Zhuoming Chen, Xinyu Yang, Yuandong Tian, Beidi Chen

ICML 2024
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
Harry Dong, Xinyu Yang, Zhenyu Zhang, Zhangyang Wang, Yuejie Chi, Beidi Chen

ICLR 2024 (Spotlight)
Improving Domain Generalization with Domain Relations
Huaxiu Yao*, Xinyu Yang*, Xinyi Pan, Shengchao Liu, Pang Wei Koh, Chelsea Finn

Preprint
Holistic Analysis of Hallucination in GPT-4V(ision): Bias and Interference Challenges
Chenhang Cui*, Yiyang Zhou*, Xinyu Yang, Shirley Wu, Linjun Zhang, James Zou, Huaxiu Yao

TMLR
Multi-Domain Long-Tailed Learning by Augmenting Disentangled Representations
Xinyu Yang*, Huaxiu Yao*, Allan Zhou, Chelsea Finn

ICRA 2023
BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
Zhijian Liu*, Haotian Tang*, Alexander Amini, Xinyu Yang, Huizi Mao, Daniela L. Rus, Song Han

CVPR 2023
FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer
Zhijian Liu*, Xinyu Yang*, Haotian Tang, Shang Yang, Song Han

KDD 2022
Variational Inference for Training Graph Neural Networks in Low-Data Regime through Joint Structure-Label Estimation
Danning Lao*, Xinyu Yang*, Qitian Wu, Junchi Yan

Experience

Education

Carnegie Mellon University
Ph.D. student in ECE, advised by Prof. Beidi Chen.
Aug 2023 - Present
Shanghai Jiao Tong University
B.Eng. in CS, advised by Prof. Yong Yu. (Ranking: 1/29, GPA: 4.0/4.3)
Sep 2019 - Jun 2023

Research

Stanford University
Research Intern, advised by Prof. Chelsea Finn.
Mar 2022 - Oct 2023
Massachusetts Institute of Technology
Research Intern, advised by Prof. Song Han.
Nov 2021 - Jun 2023

Industry

Amazon Search
Applied Scientist Intern, mentored by Dr. Chen Luo.
May 2024 - Aug 2024

Service

Seminar Organization

  • Co-organizer, ASAP Seminar Series

Workshop Organization

  • Lead Organizer and Program Chair, The 2nd Workshop on Foundation Models in the Wild, ICLR 2025
  • Lead Organizer and Program Chair, Workshop on Foundation Models in the Wild, ICML 2024
  • Organizer, The 3rd Workshop for Out-of-Distribution Generalization in Computer Vision Foundation Models, ECCV 2024

Conference Review

  • International Conference on Machine Learning (ICML), 2024-2025
  • International Conference on Learning Representations (ICLR), 2025
  • Conference on Neural Information Processing Systems (NeurIPS), 2024
  • Conference on Language Modeling (COLM), 2024
  • Empirical Methods in Natural Language Processing (EMNLP), 2024
  • IEEE International Conference on Robotics & Automation (ICRA), 2025

Journal Review

  • Transactions on Machine Learning Research (TMLR)
  • IEEE Robotics and Automation Letters (RA-L)

Misc

  • Before moving to the US, I studied and lived in Shanghai, China, for the first two decades of my life.
  • Aside from research, I am passionate about traveling, dining, gaming, and shopping, as well as anime and manga.