Chunbin Lin, Ph.D.

Staff Software Engineer @ MongoDB
Ex-Amazon, Ex-Visa, UCSD PhD
Email: chunbinlin.cs@gmail.com
LinkedIn DBLP Google Scholar

Chunbin Lin is currently a staff software engineer at MongoDB.
Before that, he worked on Visa's GenAI Platform team and on the Redshift team at Amazon Web Services (AWS). He also interned at IBM, Informatica, and NUS.
He obtained his PhD in Computer Science from the University of California, San Diego (UCSD). His PhD advisor was Prof. Yannis Papakonstantinou.

  • GenAI Platform - Build a high-performance platform that integrates various LLMs, including OpenAI models, Anthropic models, and open-source models. It combines load balancing, rate limiting, and content inspection services to offer a low-latency, high-availability system.
  • Machine Learning Platform - Build a high-performance platform that supports feature creation, model training and scoring, and model deployment. Enable AutoML to provide an automatic end-to-end solution for the ML life cycle.
  • Machine learning based workload management - Apply machine learning techniques in the database to improve throughput by predicting queries' execution time, memory requirements, and CPU cycles, and then scheduling them optimally.

Publications

Conference Papers

C23 Auto-WLM: ML-enhanced workload management in Amazon Redshift
Gaurav Saxena, Mohammad Rahman, Naresh Chainani, Chunbin Lin, George Caragea, Fahim Chowdhury, Ryan Marcus, Tim Kraska, Ippokratis Pandis, Balakrishnan (Murali) Narayanaswamy
Proceedings of ACM Conference on Management of Data (SIGMOD), 2023
C22 Multivariate Time Series Data Imputation Using Attention-Based Mechanism
Jingqi Zhao, Chuitian Rong, Chunbin Lin, Xin Dang
Neurocomputing, 2023
C21 Highly Efficient String Similarity Search and Join over Compressed Indexes
Guorui Xiao, Jin Wang, Chunbin Lin, Carlo Zaniolo
IEEE International Conference on Data Engineering (ICDE), 2022
C20 Workload-Aware Performance Tuning for Autonomous DBMSs
Zhengtong Yan, Jiaheng Lu, Naresh Chainani, Chunbin Lin
IEEE International Conference on Data Engineering (ICDE), 2021
Link
C19 Evaluating List Intersection on SSDs for Parallel I/O Skipping
Jianguo Wang, Chunbin Lin, Yannis Papakonstantinou, Steven Swanson
IEEE International Conference on Data Engineering (ICDE), 2021
C18 Plato: Approximate Analytics over Compressed Time Series with Tight Deterministic Error Guarantees
Chunbin Lin, Etienne Boursier, Yannis Papakonstantinou
Proceedings of the VLDB Endowment (PVLDB), 2020
Link PDF
C17 Fast Error-tolerant Location-aware Query Autocompletion
Jin Wang, Chunbin Lin
IEEE International Conference on Data Engineering (ICDE), 2020
Link PDF
C16 Motif Discovery Using Similarity-Constraints Deep Neural Networks
Chuitian Rong, Ziliang Chen, Chunbin Lin, Jianming Wang
International Conference on Database Systems for Advanced Applications (DASFAA), 2020
Link PDF
C15 Synergy of Database Techniques and Machine Learning Models for String Similarity Search and Join
Jiaheng Lu, Chunbin Lin, Jin Wang, Chen Li
Proceedings of the 28th ACM International Conference on Information and Knowledge Management (CIKM), pages: 2975-2976, 2019
Link PDF Full Version (PDF) Website
C14 MF-Join: Efficient Fuzzy String Similarity Join with Multi-level Filtering
Jin Wang, Chunbin Lin, Carlo Zaniolo
IEEE International Conference on Data Engineering (ICDE), 2019
Link PDF
C13 Scalable Metric Similarity Join Using MapReduce
Jiacheng Wu, Yong Zhang, Jin Wang, Chunbin Lin, Yingjia Fu, Chunxiao Xing
IEEE International Conference on Data Engineering (ICDE), 2019
Link
C12 An Efficient Sliding Window Approach for Approximate Entity Extraction with Synonyms
Jin Wang, Chunbin Lin, Mingda Li, Carlo Zaniolo
International Conference on Extending Database Technology (EDBT), 2019
Link PDF
C11 An Experimental Study of Bitmap Compression vs. Inverted List Compression
Jianguo Wang, Chunbin Lin, Yannis Papakonstantinou, Steven Swanson
Proceedings of ACM Conference on Management of Data (SIGMOD), pages: 993-1008, 2017
Link PDF
C10 MILC: Inverted List Compression in Memory
Jianguo Wang, Chunbin Lin, Ruining He, Moojin Chae, Yannis Papakonstantinou, Steven Swanson
Proceedings of the VLDB Endowment (PVLDB), Volume 10, Issue 8, pages: 853-864, 2017
Link PDF
C9 Towards heterogeneous keyword search
Chunbin Lin, Jianguo Wang, Chuitian Rong
Proceedings of the ACM Turing 50th Celebration Conference-China (ACM TUR-C), pages: 1-6, 2017
Link PDF
C8 Fast and Scalable Distributed Set Similarity Joins for Big Data Analytics
Chuitian Rong, Chunbin Lin, Yasin Silva, Jianguo Wang, Wei Lu, Xiaoyong Du
Proceedings of the International Conference on Data Engineering (ICDE), pages: 1059-1070, 2017
Link PDF
C7 Answer yes/no queries in search engines
Chunbin Lin
The Conference on Innovative Data Systems Research (CIDR), 2017
PDF
C6 Fast In-Memory SQL Analytics on Typed Graphs
Chunbin Lin, Benjamin Mandel, Yannis Papakonstantinou, Matthias Springer
Proceedings of the VLDB Endowment (PVLDB), Volume 10, Issue 3, pages: 265-276, 2016
Link PDF Website
C5 HippogriffDB: Balancing I/O and GPU Bandwidth in Big Data Analytics
Jing Li, Hung-Wei Tseng, Chunbin Lin, Yannis Papakonstantinou, Steven Swanson
Proceedings of the VLDB Endowment (PVLDB), Volume 9, Issue 14, pages: 1647-1658, 2016
Link PDF
C4 Sherlock: Sparse Hierarchical Embeddings for Visually-aware One-class Collaborative Filtering
Ruining He, Chunbin Lin, Jianguo Wang, Julian McAuley
Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pages: 3740-3746, 2016
Link PDF
C3 String similarity measures and joins with synonyms
Jiaheng Lu, Chunbin Lin, Wei Wang, Chen Li, Haiyong Wang
Proceedings of ACM Conference on Management of Data (SIGMOD), pages: 373-384, 2013
Link PDF
C2 Processing XML Twig Pattern Query with Wildcards
Huayu Wu, Chunbin Lin, Tok Wang Ling, Jiaheng Lu
International Conference on Database and Expert Systems Applications (DEXA), pages: 326-341, 2012
Link
C1 Optimal top-k generation of attribute combinations based on ranked lists
Jiaheng Lu, Pierre Senellart, Chunbin Lin, Xiaoyong Du, Shan Wang, Xinxing Chen
Proceedings of ACM Conference on Management of Data (SIGMOD), pages: 409-420, 2012
Link PDF

Journal Articles

(* denotes corresponding author)
J3 Boosting Approximate Dictionary-based Entity Extraction with Synonyms
Jin Wang, Chunbin Lin*, Mingda Li, Carlo Zaniolo
Information Sciences, Volume 530, pages: 1-21, 2020. (Impact Factor: 5.524)
Link
J2 Optimal Algorithms for Selecting Top-k Combinations of Attributes: Theory and Applications
Chunbin Lin*, Jiaheng Lu, Zhewei Wei, Jianguo Wang, Xiaokui Xiao
The International Journal on Very Large Data Bases (VLDB Journal), Volume 27, Issue 1, pages: 27-52, 2018
Link PDF
J1 Boosting the Quality of Approximate String Matching by Synonyms
Jiaheng Lu, Chunbin Lin, Wei Wang, Chen Li, Xiaokui Xiao
ACM Transactions on Database Systems (TODS), Volume 40, Issue 3, pages: 1-42, 2015
Link PDF

Demo Papers

D5 GQFast: Fast Graph Exploration with Context-Aware Autocompletion
Chunbin Lin, Jianguo Wang, Yannis Papakonstantinou
Proceedings of the International Conference on Data Engineering (ICDE), pages: 1389-1390, 2017
Link Demo Video
D4 SpiderX: Fast XML Exploration System
Chunbin Lin, Jianguo Wang
Proceedings of International World Wide Web Conference (WWW), pages: 237-241, 2017
Link PDF
D3 Location-sensitive Query Auto-completion
Chunbin Lin, Jianguo Wang, Jiaheng Lu
Proceedings of International World Wide Web Conference (WWW), pages: 819-820, 2017
Link PDF
D2 Fashionista: A Fashion-aware Graphical System for Exploring Visually Similar Items
Ruining He, Chunbin Lin, Julian McAuley
Proceedings of the International Conference on World Wide Web (WWW), pages: 199-202, 2016
Link PDF
D1 LotusX: A Position-Aware XML Graphical Search System with Auto-Completion
Chunbin Lin, Jiaheng Lu, Tok Wang Ling, Bogdan Cautis
Proceedings of the International Conference on Data Engineering (ICDE), pages: 1265-1268, 2012
Link PDF

Academic Service

  • International Conference on Very Large Data Bases (VLDB'20)
  • ACM SIGMOD International Conference on Management of Data (SIGMOD'22)
  • ACM SIGMOD International Conference on Management of Data (SIGMOD'19, SIGMOD'21, SIGMOD'22)
  • International Conference on Very Large Data Bases (VLDB'21, VLDB'22)
  • IEEE International Conference on Data Engineering (ICDE'22, ICDE'23)
  • ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD'22)
  • International Conference on Extending Database Technology (EDBT'21)
  • AAAI Conference on Artificial Intelligence (AAAI'20, AAAI'21, AAAI'22, AAAI'23)
  • The Web Conference (WWW'19)
  • International Joint Conference on Artificial Intelligence (IJCAI'19, IJCAI'20, IJCAI'21, IJCAI'22, IJCAI'23)
  • The International Journal on Very Large Data Bases (VLDB Journal)
  • World Wide Web Journal
  • IEEE Transactions on Knowledge and Data Engineering (TKDE)
  • Information Sciences
  • Information Systems