About Mingjie
Mingjie Tang is currently working on LLM systems and algorithms. Previously, he was a tech lead at Ant Group and a member of the technical staff at Hortonworks/Cloudera. He has broad research interests in RDBMS, distributed machine learning systems, big data computation engines, and distributed deep learning systems.
He earned his PhD degree from the Computer Science Department at Purdue University, West Lafayette, Indiana. His advisor was Professor Walid G. Aref. During his doctoral studies, he focused on distributed systems for spatial computation, machine learning, and artificial intelligence.
Education
2010.9 - 2016.9 | PhD of Computer Science, Purdue University, IN, USA |
2010.9 - 2012.12 | Master of Computer Science, Purdue University, IN, USA |
2007.9 - 2010.7 | Master of Computer Science, University of Chinese Academy of Sciences, Beijing, China |
2003.8 - 2007.7 | Bachelor of Science, Department of Computer Science, Sichuan University, Chengdu, China |
Industry Experience
2018.10 - 2022.10 | AI Engineer, Ant Group, CA, USA |
2016.9 - 2018.10 | Member of Technical Staff, Hortonworks, CA, USA |
2015.5 - 2015.8 | Research Intern, IBM Research Almaden, CA, USA |
2012.5 - 2012.8 | Software Engineer Intern, Microsoft, Seattle, USA |
Publications (Selected)
-
mLoRA: Fine-Tuning LoRA Adapters via Highly-Efficient Pipeline Parallelism in Multiple GPUs
Zhengmao Ye, Dengchun Li, Zetao Hu, Tingfen Lan, Sha Jian, Sicong Zheng, Lei Duan, Jie Zuo, Hui Lu, Yuanchun Zhou, Mingjie Tang
under revision, Proceedings of Very Large Data Bases Conference (VLDB), 2025.
-
DLRover-RM: Resource Optimization for Deep Recommendation Models Training in Cloud
Qinlong Wang, Tingfeng Lan, Yinghao Tang, Bo Sang, Ziling Huang, Yihen Du, Haitao Zhang, Jian Sha, Ke Zhang, Hui Lu, Yuanchun Zhou, Mingjie Tang
in Proceedings of Very Large Data Bases Conference (VLDB), 2024.
-
BadPart: Unified Black-box Adversarial Patch Attacks against Pixel-wise Regression Tasks.
Zhiyuan Cheng, Zhaoyi Liu, Tengda Guo, Shiwei Feng, Dongfang Liu, Mingjie Tang, Xiangyu Zhang
in International Conference on Machine Learning (ICML), 2024.
-
A Demonstration of GPTuner: A GPT-Based Manual-Reading Database Tuning System.
Jiale Lao, Yibo Wang, Yufei Li, Jianping Wang, Yunjia Zhang, Zhiyuan Cheng, Wanghu Chen, Yuanchun Zhou, Mingjie Tang, Jianguo Wang
Demo in Proceedings of ACM Conference on Management of Data (SIGMOD), 2024.
-
GPTuner: A Manual-Reading Database Tuning System via GPT-Guided Bayesian Optimization.
Jiale Lao, Yibo Wang, Yufei Li, Jianping Wang, Yunjia Zhang, Zhiyuan Cheng, Wanghu Chen, Mingjie Tang, Jianguo Wang
in Proceedings of Very Large Data Bases Conference (VLDB), 2024.
-
Couler: Unified Machine Learning Workflow Optimization in Cloud
Xiaoda Wang, Yuan Tang, Tengda Guo, Bo Sang, Jingji Wu, Jian Sha, Ke Zhang, Jiang Qian, Mingjie Tang
in 40th IEEE International Conference on Data Engineering (ICDE) 2024
-
Cougar: A General Framework for Jobs Optimization In Cloud
Bo Sang, Shuwei Gu, Xiaojun Zhan, Mingjie Tang, Jian Liu, Xuan Chen, Jie Tan, Haoyuan Ge, Ke Zhang, Ruoyi Ruan, Wei Yan
in 39th IEEE International Conference on Data Engineering (ICDE) 2023
-
STULL: Unbiased Online Sampling for Visual Exploration of Large Spatiotemporal Data
Guizhen Wang, Jingjing Guo, Mingjie Tang, José Florencio de Queiroz Neto, Calvin Yau, Anas Daghistani, Morteza Karimzadeh, Walid G. Aref, and David S. Ebert.
in IEEE Conference on Visual Analytics Science and Technology (VAST 2020)
-
LocationSpark: In-memory Distributed Spatial Query Processing and Optimization
Mingjie Tang, Yongyang Yu, Walid G. Aref, Qutaibah Malluhi, and Mourad Ouzzani
in Frontiers in Big Data, section Data Mining and Management 2020
-
A Natural-language-based Visual Query Approach of Uncertain Human Trajectories
Zhaosong Huang, Ye Zhao, Wei Chen, Shengjie Gao, Kejie Yu, Weixia Xu, Mingjie Tang, Minfeng Zhu, Mingliang Xu
in IEEE Transactions on Visualization and Computer Graphics (TVCG 2019)
-
SAC: A System for Big Data Lineage Tracking
[code]
Mingjie Tang, Saisai Shao, Weiqing Yang, Yanbo Liang, Yongyang Yu, Bikas Saha, Dongjoon Hyun
in 35th IEEE International Conference on Data Engineering (ICDE) 2019
-
Efficient Parallel Skyline Query Processing for High-Dimensional Data
Mingjie Tang, Yongyang Yu, Walid G. Aref, Qutaibah Malluhi, and Mourad Ouzzani
in 35th IEEE International Conference on Data Engineering (ICDE) 2019
-
Adaptive Processing of Spatial-Keyword Data Over a Distributed Streaming Cluster
Ahmed R. Mahmood, Anas Daghistani, Ahmed M. Aly, Walid G. Aref, Mingjie Tang, Saleh M. Basalamah, Sunil Prabhakar
in 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL) 2018
-
Optimizing Generalized Linear Models with Billions of Variables
[code]
Yanbo Liang, Yongyang Yu, Mingjie Tang, Weiqing Yang, Weichen Xu, Chaozhuo Li and Ruifeng Zheng
in ACM International Conference on Information and Knowledge Management (CIKM) 2018
-
Efficient Parallel Skyline Query Processing for High-Dimensional Data
Mingjie Tang, Yongyang Yu, Walid G. Aref, Qutaibah Malluhi, and Mourad Ouzzani
in IEEE Transactions on Knowledge and Data Engineering (TKDE) 2018
-
SHC: Distributed Query Processing for Non-Relational Data Store
[code]
Weiqing Yang*, Mingjie Tang*, Yongyang Yu, Yanbo Liang, Bikas Saha
in 34th IEEE International Conference on Data Engineering (ICDE) 2018
* both are the leading authors
-
COACT: a query interface language for collaborative databases
[link]
K Mershad, QM Malluhi, M Ouzzani, Mingjie Tang, M Gribskov, WG Aref, Deo Prakash
in Distributed and Parallel Databases 2017
-
AUDIT: approving and tracking updates with dependencies in collaborative databases
[link]
K Mershad, QM Malluhi, M Ouzzani, Mingjie Tang, M Gribskov, WG Aref
in Distributed and Parallel Databases 2017
-
In-memory Distributed Matrix Computation Processing and Optimization
[code]
Yongyang Yu, Mingjie Tang, Walid Aref, Qutaibah Malluhi, Mostafa Abbas and Mourad Ouzzani
in 33rd IEEE International Conference on Data Engineering (ICDE) 2017
-
LocationSpark: A Distributed In-Memory Data Management System for Big Spatial Data
[code]
Mingjie Tang, Yongyang Yu, Walid G. Aref, Qutaibah Malluhi, and Mourad Ouzzani
in 42nd International Conference on Very Large Data Bases (VLDB) 2016
-
Atlas: On the Expression of Spatial-Keyword Group Queries Using Extended Relational Constructs (Systems Paper)
Walid Aref, Ahmed Mahmood, Ahmed Aly, Mingjie Tang
in 24th International Conference on Advances in Geographic Information Systems (SIGSPATIAL) 2016
-
Cruncher: Distributed In-Memory Processing for Location-Based Services
Ahmed S. Abdelhamid, Mingjie Tang, Ahmed M. Aly, Ahmed R. Mahmood, Walid G. Aref, Saleh Basalamah
in 32nd IEEE International Conference on Data Engineering (ICDE) 2016
-
Efficient Processing of Hamming-Distance-Based Similarity-Search Queries Over MapReduce
[pdf] [ppt] [code]
Mingjie Tang, Yongyang Yu, Walid G. Aref, Qutaibah Malluhi, and Mourad Ouzzani
in 18th International Conference on Extending Database Technology (EDBT) 2015
-
The Similarity-aware Relational Intersect Database Operator [pdf]
Wadha J. Al Marri, Qutaibah Malluhi, Mourad Ouzzani, Mingjie Tang and Walid G. Aref
in 7th International Conference on Similarity Search and Applications (SISAP) 2014, Best Paper Award
full journal version Information Systems 2015 [paper]
-
Similarity Group-by Operators for Multi-dimensional Relational Data
[pdf] [code]
Mingjie Tang, Ruby Y. Tahboub, Walid G. Aref, Mikhail Atallah, Qutaibah Malluhi, Mourad Ouzzani, Yasin N. Silva
in IEEE Transactions on Knowledge and Data Engineering (TKDE) 2015
-
Bird Flu Outbreak Prediction via Satellite Tracking
[pdf]
YuanChun Zhou, Mingjie Tang, Weike Pan, Jinyan Li, Weihang Wang, Jing Shao, Liang Wu, Jianhui Li, Qiang Yang, BaoPing Yan
in IEEE Intelligent Systems (IS) 2013
-
Exploring the Wild Bird's Role as Potential Vector for H5N1 Transmission by Clustering and Association Analysis
[pdf]
Mingjie Tang, YuanChun Zhou, Weihang Wang, Peng Cui, Jinyan Li, Yuan-Sheng Hou, BaoPing Yan
in Knowledge and Information Systems (KAIS) 2010
-
Birds Bring Flues? Mining Frequent and High Weighted Cliques from Birds Migration Networks
[pdf]
Mingjie Tang, Weihang Wang, Yexi Jiang, YuanChun Zhou, Jinyan Li, Peng Cui, Ying Liu, BaoPing Yan
in 15th International Conference on Database Systems for Advanced Applications (DASFAA) 2010
-
Discovery of Migration Habitats and Routes of Wild Bird Species by Clustering and Association Analysis
[pdf]
Mingjie Tang, Weihang Wang, YuanChun Zhou, Yexi Jiang, Jinyan Li, Peng Cui, Yuan-Sheng Hou, BaoPing Yan
in 5th International Conference on Advanced Data Mining and Applications (ADMA) 2009, Best Application Paper Award.