Lu Dai
Ph.D. Student, HKUST · Agentic Search · RAG · Long Context LLMs
Hong Kong University of Science and Technology
Division of Emerging Interdisciplinary Areas
Hong Kong
I am a Ph.D. student at the Hong Kong University of Science and Technology (HKUST), advised by Prof. Hui Xiong and co-advised by Prof. Hao Liu. My research lies at the intersection of agentic search, retrieval-augmented generation (RAG), and large language models (LLMs), with a focus on building more effective agentic search and retrieval-augmented systems, advancing long-context LLMs, and understanding how knowledge is stored and generalized in LLMs.
I have published at top-tier venues including ICLR (Spotlight), ICCV, EMNLP, KDD, CVPR, and NeurIPS. Prior to my Ph.D., I interned at Google Cloud AI and Baidu Research, and received my B.E. in Computer Science from the University of Science and Technology of China (USTC), where I was named an outstanding graduate (top 5%).
Beyond research, I have been playing electronic piano and violin for over 10 years 🎹🎻.
News
| Date | Update |
|---|---|
| Feb 20, 2026 | Our paper VL-Eraser on machine unlearning in VLMs has been accepted at CVPR 2026! |
| Jan 15, 2026 | Our paper on global temporal retrieval for time series forecasting has been accepted at ICLR 2026! |
| Sep 20, 2025 | Our paper on Foundation Models for Scientific Discovery has been accepted at NeurIPS 2025! |
| Sep 15, 2025 | Our paper MolErr2Fix has been accepted at EMNLP 2025! |
| May 01, 2025 | Our paper ScIRGen has been accepted at KDD 2025! |
Selected Publications
- [ICCV] Cloth2Body: Generating 3D Human Body Mesh from 2D Clothing. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023
- [EMNLP] Improve Dense Passage Retrieval with Entailment Tuning. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
- [ICLR] SePer: Measure Retrieval Utility Through The Lens Of Semantic Perplexity Reduction. In International Conference on Learning Representations (ICLR), 2025
- [KDD] ScIRGen: Synthesize Realistic and Large-Scale RAG Dataset for Scientific Research. In Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2025
- [ICLR] Enhancing Multivariate Time Series Forecasting with Global Temporal Retrieval. In International Conference on Learning Representations (ICLR), 2026
- [CVPR] VL-Eraser: Vacuum Distillation for Machine Unlearning in Vision-Language Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026
- [EMNLP] MolErr2Fix: Benchmarking LLM Trustworthiness in Chemistry. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025