I'm interested in all things data storage (e.g., file systems, non-volatile memories, key-value stores, data-intensive computing, data-centric infrastructures). I lead the Data Storage Lab, where we play with the latest storage technologies and strive to advance reliability, security, scalability, usability, etc., for data, for people, or just for fun.

Publications

  • PROV-IO+: A Cross-Platform Provenance Framework for Scientific Data on HPC Systems.
    Runzhou Han, Mai Zheng, Suren Byna, Houjun Tang, Bin Dong, Dong Dai, Yong Chen, Dongkyun Kim, Joseph Hassoun, and David Thorsley
    IEEE Transactions on Parallel and Distributed Systems (TPDS), 2024
    [Paper]
  • λFS: Elastically Scaling Distributed File System Metadata Service using Serverless Functions.
    Benjamin Carver, Runzhou Han, Jingyuan Zhang, Mai Zheng, and Yue Cheng
    Proceedings of the ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023
    [Paper]
  • ARA PAWR: Wireless Living Lab for Smart and Connected Rural Communities
    Taimoor U. Islam, Joshua O. Boateng, Guoying Zu, Mukaram Shahid, Md Nadim, Wei Xu, Tianyi Zhang, Salil Reddy, Xun Li, Ataberk Atalar, Yung-Fu Chen, Sarath Babu, Hongwei Zhang, Daji Qiao, Mai Zheng, Yong Guan, Ozdal Boyraz, Anish Arora, Mohamed Selim, Myra B. Cohen
    Demos Session, 29th ACM Annual International Conference on Mobile Computing and Networking (MobiCom-Demo), 2023
    [Paper]
  • Understanding Persistent-Memory Related Issues in the Linux Kernel
    Om R. Gatla, Duo Zhang, Wei Xu, and Mai Zheng
    To appear in ACM Transactions on Storage (TOS), 2023
    [Paper | BugBenchk]
  • ConfD: Analyzing Configuration Dependencies of File Systems for Fun and Profit.
    Tabassum Mahmud, Om Rameshwar Gatla, Duo Zhang, Carson Love, Ryan Bumann, and Mai Zheng
    Proceedings of the 21st USENIX Conference on File and Storage Technologies (FAST), 2023
    [Paper | Open-Source Prototype]
  • Drill: Log-based Anomaly Detection for Large-scale Storage Systems Using Source Code Analysis.
    Di Zhang, Chris Egersdoerfer, Tabassum Mahmud, Mai Zheng, and Dong Dai
    Proceedings of the 37th IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2023
    [Paper]
  • FaultyRank: A Graph-based Parallel File System Checker.
    Saisha Kamat, Abdullah Al Raqibul Islam, Mai Zheng, and Dong Dai
    Proceedings of the 37th IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2023
    [Paper]
  • Analyzing Configuration Dependencies of DAX File Systems.
    Tabassum Mahmud, Om Rameshwar Gatla, Duo Zhang, Carson Love, Ryan Bumann, and Mai Zheng
    The 14th Annual Non-Volatile Memories Workshop (NVMW), 2023
  • On the Scalability of Testing the Crash Consistency of PM Systems.
    Duo Zhang, Om Rameshwar Gatla, Abdullah Al Raqibul Islam, Dong Dai, and Mai Zheng
    Work-in-Progress reports (WiPs) & Poster Sessions, the 21st USENIX Conference on File and Storage Technologies (FAST-WiP), 2023
  • Data Distribution for Heterogeneous Storage Systems.
    Jiang Zhou, Yong Chen, Mai Zheng, and Weiping Wang
    IEEE Transactions on Computers (TC), 2022
    [Paper]
  • Wireless Guard for Trustworthy Spectrum Management.
    Mukaram Shahid, Sarath Babu, Hongwei Zhang, Daji Qiao, Yong Guan, Joshua Ofori Boateng, Taimoor Ul Islam, Guoying Zu, Ahmed Kamal, and Mai Zheng
    Proceedings of the 16th ACM Workshop on Wireless Network Testbeds, Experimental evaluation & CHaracterization (WiNTECH) in conjunction with ACM MobiCom, 2022
    [Paper | ARA Platform]
  • On the Reproducibility of Bugs in File-System Aware Storage Applications
    Duo Zhang, Tabassum Mahmud, Om Rameshwar Gatla, Runzhou Han, Yong Chen, and Mai Zheng
    Proceedings of the 16th IEEE International Conference on Networking, Architecture, and Storage (NAS), 2022
    [Paper]
  • Understanding Configuration Dependencies of File Systems.
    Tabassum Mahmud, Duo Zhang, Om Rameshwar Gatla, and Mai Zheng
    Proceedings of the 14th ACM Workshop on Hot Topics in Storage and File Systems (HotStorage), 2022
    [Paper | Slides | Video] Best paper nominee!
  • PROV-IO: An I/O-Centric Provenance Framework for Scientific Data on HPC Systems.
    Runzhou Han, Suren Byna, Houjun Tang, Bin Dong, and Mai Zheng
    Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing (HPDC), 2022
    [Paper | Open-Source Prototype]
  • Towards Bug-Free DBMS Ecosystems.
    Mai Zheng, Jack Clark, and Miryung Kim
    Technical Report of Dagstuhl Seminar 21442: Ensuring the Reliability and Robustness of Database Management Systems (Dagstuhl), Volume 11, Issue 10, 2022. Invited!
  • A Study of Failure Recovery and Logging of High-Performance Parallel File Systems.
    Runzhou Han, Om R. Gatla, Mai Zheng, Jinrui Cao, Di Zhang, Dong Dai, Yong Chen, and Jonathan Cook
    ACM Transactions on Storage (TOS), Volume 18, Issue 2, 2022
    [Paper]
  • Towards Unified FAIR Metadata Services for Scientific Data.
    Mai Zheng, Runzhou Han, and Houjun Tang
    Department of Energy ASCR Workshop on the Management and Storage of Scientific Data (DOE-ASCR-Data), 2022. Invited!
  • Understanding Configuration Issues in Storage Systems
    Tabassum Mahmud and Mai Zheng
    Work-in-Progress reports (WiPs) & Poster Sessions, 20th USENIX Conference on File and Storage Technologies (FAST-WiP), 2022
  • Towards A Practical Provenance Framework for Scientific Data on HPC Systems
    Runzhou Han, Suren Byna, and Mai Zheng
    Work-in-Progress reports (WiPs) & Poster Sessions, 20th USENIX Conference on File and Storage Technologies (FAST-WiP), 2022
  • Benchmarking for Observability: The Case of Diagnosing Storage Failures.
    Duo Zhang, Mai Zheng
    BenchCouncil Transactions on Benchmarks, Standards and Evaluations (TBench), 2021
    [Paper | Video]
  • ARA: A Wireless Living Lab Vision for Smart and Connected Rural Communities.
    Hongwei Zhang, Yong Guan, Ahmed Kamal, Daji Qiao, Mai Zheng, et al.
    Proceedings of the 15th ACM Workshop on Wireless Network Testbeds, Experimental evaluation & CHaracterization (WiNTECH) in conjunction with ACM MobiCom, 2021
    [Paper | ARA Platform]
  • SentiLog: Anomaly Detection on Parallel File Systems via Log-based Sentiment Analysis.
    Di Zhang, Dong Dai, Runzhou Han, and Mai Zheng
    Proceedings of the 13th ACM Workshop on Hot Topics in Storage and File Systems (HotStorage), 2021
    [Paper | Slides] Best paper nominee!
  • A Study of Persistent Memory Bugs in the Linux Kernel
    Duo Zhang, Om R. Gatla, Wei Xu, and Mai Zheng
    Proceedings of the 14th ACM International Systems and Storage Conference (SYSTOR), 2021
    [Paper | Video]
  • Fingerprinting the Checker Policies of Parallel File Systems
    Runzhou Han, Duo Zhang, and Mai Zheng
    Proceedings of the 5th ACM/IEEE International Parallel Data Systems Workshop (PDSW) at ACM/IEEE Supercomputing (SC), 2020
    [Paper | Slides]
  • On Failure Diagnosis of the Storage Stack
    Duo Zhang, Om R. Gatla, Runzhou Han, Mai Zheng
    Position Paper & Poster Sessions, 12th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage-P), 2020
    [Slides | Video]
  • A Cross-Layer Approach for Diagnosing Storage System Failures
    Duo Zhang, Chander Gupta, Mai Zheng, Adam Manzanares, Filip Blagojevic, and Cyril Guyot
    Work-in-Progress reports (WiPs) & Poster Sessions, 18th USENIX Conference on File and Storage Technologies (FAST-WiP), 2020
  • Lessons and Actions: What We Learned from 10K SSD-Related Storage System Failures
    Erci Xu, Mai Zheng, Feng Qin, Yikang Xu, and Jiesheng Wu
    Proceedings of the 2019 USENIX Annual Technical Conference (USENIX ATC), 2019
    [Paper]
  • A Performance Study of Lustre File System Checker: Bottlenecks and Potentials
    Dong Dai, Om R. Gatla, and Mai Zheng
    Proceedings of the 35th International Conference on Massive Storage Systems and Technology (MSST), 2019
    [Paper]
  • Let the Device Talk
    Om R. Gatla, Yealim Sung, and Mai Zheng
    Work-in-Progress reports (WiPs) & Poster Sessions, 17th USENIX Conference on File and Storage Technologies (FAST-WiP), 2019
  • On the Recoverability of Persistent Memory Systems
    Ryan Chartier, Om R. Gatla, Mai Zheng, and Henry Duwe
    Work-in-Progress reports (WiPs) & Poster Sessions, 17th USENIX Conference on File and Storage Technologies (FAST-WiP), 2019
  • Data Storage Research Vision 2025
    George Amvrosiadis, Ali R. Butt, Vasily Tarasov, Erez Zadok, Ming Zhao, et al.
    Technical Report on an NSF-Sponsored Community Visioning Workshop, 2019. Invited!
  • Towards Robust File System Checkers
    Om R. Gatla, Mai Zheng, Muhammad Hameed, Viacheslav Dubeyko, Adam Manzanares, Filip Blagojevic, Cyril Guyot, and Robert Mateescu
    ACM Transactions on Storage (TOS), Volume 14 Issue 4, 2018
    [Paper] Fast-tracked!
  • Understanding SSD Reliability in Large-Scale Cloud Systems
    Erci Xu, Mai Zheng, Feng Qin, Yikang Xu, and Jiesheng Wu
    Proceedings of the 3rd ACM/IEEE Joint International Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems (PDSW) at ACM/IEEE Supercomputing (SC), 2018
    [Paper | Slides]
  • Fault Tolerance Performance Evaluation of Large Scale Distributed Storage Systems: HDFS and Ceph Case Study
    Yehia Arafa, Atanu Barai, Mai Zheng, and Abdel-Hameed Badawy
    Proceedings of the 22nd IEEE High Performance Extreme Computing Conference (HPEC), 2018
  • PFault: A General Framework for Analyzing the Reliability of High-Performance Parallel File Systems
    Jinrui Cao, Om R. Gatla, Mai Zheng, Dong Dai, Vidya Eswarappa, Yan Mu, and Yong Chen
    Proceedings of the 32nd ACM/SIGARCH International Conference on Supercomputing (ICS), 2018
    [Paper | Slides | Open-Source Prototype]
  • Analysis and Prediction of Storage Error Events for High Performance Computing Systems
    Panika Valecha, Huiping Cao, Qixu Gong, Mai Zheng, Feng Yan, Xing Lin, and Art Harkin
    Poster Session, Department of Energy (DOE) Conference on Data Analysis (CoDA), 2018
  • Towards Robust File System Checkers
    Om R. Gatla, Muhammad Hameed, Mai Zheng, Viacheslav Dubeyko, Adam Manzanares, Filip Blagojevic, Cyril Guyot, and Robert Mateescu
    Proceedings of the 16th USENIX Conference on File and Storage Technologies (FAST), 2018
    [Paper | Slides | Open-Source Prototype] Best paper nominee!
  • eDelta: Pinpointing Energy Deviations in Smartphone Apps via Comparative Trace Analysis
    Li Li, Bruce Beitman, Mai Zheng, Xiaorui Wang, and Feng Qin
    Proceedings of the 8th International Green and Sustainable Computing Conference (IGSC), 2017
  • Selective Checkpointing for Minimizing Recovery Energy and Efforts of Smartphone Apps
    Li Li, Yunhao Bai, Xiaorui Wang, Mai Zheng, and Feng Qin
    Proceedings of the 8th International Green and Sustainable Computing Conference (IGSC), 2017
  • Understanding the Fault Resilience of File System Checkers
    Om R. Gatla and Mai Zheng
    Proceedings of the 9th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage), 2017
    [Paper | Slides]
  • On Fault Resilience of File System Checkers
    Om R. Gatla and Mai Zheng
    Work-in-Progress reports (WiPs) & Poster Sessions, 15th USENIX Conference on File and Storage Technologies (FAST-WiP), 2017
    [WiP | Slides | Discussion@Linux FAST Summit'17 | Media]
  • Do Not Blame Devices for All Failures
    Simeng Wang, Jinrui Cao, Om R. Gatla, Muhammad Hameed, and Mai Zheng
    Poster Session, 8th Annual Non-Volatile Memories Workshop (NVMW), 2017
  • A Command-Level Study of Linux Kernel Bugs
    Yiliang Shi, Danny V. Murillo, Simeng Wang, Jinrui Cao, and Mai Zheng
    Proceedings of the 3rd National Workshop for REU Research in Networking and Systems (REUNS) at IEEE International Conference on Computing, Networking and Communications (ICNC), 2017
  • Reliability Analysis of SSDs under Power Fault
    Mai Zheng, Joseph Tucek, Feng Qin, Mark Lillibridge, Bill W Zhao, and Elizabeth S Yang
    ACM Transactions on Computer Systems (TOCS), Volume 34 Issue 4, 2016
    [Paper]
  • A Generic Framework for Testing Parallel File Systems
    Jinrui Cao, Simeng Wang, Dong Dai, Mai Zheng, and Yong Chen
    Proceedings of the 1st ACM/IEEE Joint International Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems (PDSW) at ACM/IEEE Supercomputing (SC), 2016
    [Paper]
  • Emulating Realistic Flash Device Errors with High Fidelity
    Simeng Wang, Jinrui Cao, Danny V. Murillo, Yiliang Shi, and Mai Zheng
    Proceedings of the 11th IEEE International Conference on Networking, Architecture, and Storage (NAS), 2016
  • Rethinking Networking in a Non-volatile, Heterogeneous World
    Satyajayant Misra and Mai Zheng
    Department of Energy (DOE) Workshop on Network Research Problems and Challenges (DOE-Network), 2016
  • An Adaptive Algorithm for Scheduling Parallel Jobs in Meteorological Cloud
    Yongsheng Hao, Lina Wang, and Mai Zheng
    Journal of Knowledge-based Systems (KBS), Volume 98, 2016
  • A Reliability Analysis Framework for Cloud Storage Systems
    Mai Zheng, Joseph Tucek, Feng Qin, and Mark Lillibridge
    National Science Foundation (NSF) Workshop on Experimental Support for Cloud Computing (NSF-Cloud), 2014
  • Torturing Databases for Fun and Profit
    Mai Zheng, Joseph Tucek, Dachuan Huang, Feng Qin, Mark Lillibridge, Elizabeth S Yang, Bill W Zhao, and Shashank Singh
    Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2014
    [Paper | Slides]
  • GMRace: Detecting Data Races in GPU Programs via A Low-Overhead Scheme
    Mai Zheng, Vignesh T. Ravi, Feng Qin, and Gagan Agrawal
    IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 25 Issue 1, 2014
    [Paper]
  • Understanding the Robustness of SSDs under Power Fault
    Mai Zheng, Joseph Tucek, Feng Qin, and Mark Lillibridge
    Proceedings of the 11th USENIX Conference on File and Storage Technologies (FAST), 2013
    [Paper | Slides]
  • LiU: Hiding Disk Access Latency for HPC Applications with a New SSD-Enabled Data Layout
    Dachuan Huang, Xuechen Zhang, Wei Shi, Mai Zheng, Song Jiang, and Feng Qin
    Proceedings of the 21st IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), 2013
    [Paper]
  • GMProf: A Low-Overhead, Fine-Grained Profiling Approach for GPU Programs
    Mai Zheng, Vignesh T. Ravi, Wenjing Ma, Feng Qin, and Gagan Agrawal
    Proceedings of the 19th IEEE Annual International Conference on High Performance Computing (HiPC), 2012
    [Paper]
  • Modeling Software Execution Environment
    Dawei Qi, William Sumner, Feng Qin, Mai Zheng, Xiangyu Zhang, and Abhik Roychoudhury
    Proceedings of the 19th Working Conference on Reverse Engineering (WCRE), 2012
    [Paper]
  • 2ndStrike: Towards Manifesting Hidden Concurrency Typestate Bugs
    Qi Gao, Wenbin Zhang, Zhezhe Chen, Mai Zheng, and Feng Qin
    Proceedings of the 16th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2011
    [Paper]
  • GRace: A Low-Overhead Mechanism for Detecting Data Races in GPU Programs
    Mai Zheng, Vignesh T. Ravi, Feng Qin, and Gagan Agrawal
    Proceedings of the 16th ACM/SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP), 2011
    [Paper | Slides]
  • A Phase Fitting Method for Sub-pixel Displacement Measurements Using Digital Speckle Images
    Mai Zheng, Jian Ji, Li Guo, and Junzhu Zhu
    Proceedings of the 9th IEEE International Conference on Signal Processing (ICSP), 2008
  • Stitching Video from Webcams
    Mai Zheng, Xiaolin Chen, and Li Guo
    Proceedings of the 4th International Symposium on Visual Computing (ISVC), 2008