
Mai Zheng
(Mike)
Associate Professor
Dept. of Electrical and Computer Engineering
Data Storage Lab (DSL)
Center for Cybersecurity Innovation & Outreach (CyIO)
Center for Wireless, Communities and Innovation (WiCI)
Office: 349 Durham Hall
Phone: (515) 294-6285
Email: mai AT iastate DOT edu
www.ece.iastate.edu/~mai
I'm interested in all
things data storage (e.g., file systems,
non-volatile memories, key-value stores, data-intensive computing,
data-centric infrastructures). I'm leading the Data
Storage Lab where we play with the latest storage
technologies and strive to advance the integrity, security,
scalability, usability, etc. for data, for people, or just for fun.
Data Storage Lab
Our research is motivated by real problems of mission-critical systems that jeopardize data, e.g.:
- Crash, Corruption, & Bug across system layers ==> [SSD: TOCS'16/FAST'13 | FS: FAST'23, FAST'18 | OS: TOS'23 | DB: OSDI'14]
- Data Loss & Service Disruption @HPC/Data centers at scale ==> [HotStorage'24, TOS'22/ICS'18, IPDPS'23, ATC'19]
- Data Provenance, Observability, & Scalability challenges ==> [TPDS'24/HPDC'22, ASPLOS'23, ARA Platform]
We build
systems/tools to attack such problems and open source our
research prototypes/datasets.
- Current Members
- Faculty: Mai Zheng
- Students:





Wei Xu Chao Shi Roop Kiran Ahmed Dajani Varun Girimaji 


Vidhya Mannathu Parambil Joshua John Aaron Trelstad Zeren Yang
- Selected Projects
- Large-Scale
High-Performance Storage Systems
- Large-scale distributed storage systems are critical infrastructure typically deployed in data centers and high-performance computing (HPC) centers to empower data-intensive computations and data-driven discoveries (e.g., AI/ML applications). Unfortunately, due to their complexity and scale, even state-of-the-art systems may experience correctness and/or performance issues. We are building frameworks to improve data storage at scale for robust, high-performance data-driven discoveries.
- Relevant Papers: [IPDPS'25 | CN'25 | HotStorage'24 | ASPLOS'23 | IPDPS'23a | IPDPS'23b | TOS'22 | HPDC'22 | HotStorage'21 (video) | PDSW'20 (slides) | USENIX ATC'19 (slides) | MSST'19 | PDSW'18 (slides) | ICS'18 (slides) | PDSW'16 (slides)]
- Open Source Artifacts: [PFault | ARA Platform]
- Main Sponsors: [NSF | Samsung | NIFA | US Ignite]
- Single-Machine Storage
- Local storage systems running on a single machine serve as the cornerstone of many large-scale systems and applications. Unfortunately, despite decades of evolution and various protections, classic local storage systems may still run into issues in practice (e.g., data corruption, metadata inconsistencies). What's worse, the same issues that occur in local storage systems may also affect their maintenance/recovery utilities and/or other applications using them, leading to cascading problems. We are investigating the fundamental problems and developing mitigation solutions.
- Relevant Papers: [TOCS'25 | FAST'23 | HotStorage'22 (slides) | FAST'22-WiP (video) | TOS'18 | FAST'18 (slides) | HotStorage'17 (slides) | FAST'17-WiP]
- Open Source Artifacts: [ConfD | RFSCK]
- Main Sponsors: [NSF | Western Digital]
- Cross-Layer
Issues on Data Path
- The storage stack contains a deep data I/O path involving complicated layers (e.g., solid-state drives (SSD), persistent memories (PM), device drivers, local and parallel file systems, databases, blockchain storage with cloud backends). Besides latent issues in individual layers, many desired properties may be violated due to cross-layer dependencies, hurting end-to-end guarantees and data integrity. We take a holistic view to investigate such daunting challenges across system layers.
- Relevant Papers: [IPDPS'25 | TOS'23 | FAST'23 | Dagstuhl'22 | TOS'22 | NVMW'23 | TC'22 | SYSTOR'21 (video) | FAST'20-WiP | ATC'19 | TOS'18 | TOCS'16 | OSDI'14 (slides) | FAST'13 (slides)]
- Open Source Artifacts: [BugBenchk | ConfD | PFault]
- Main Sponsors: [NSF | Western Digital]
- Data Provenance &
Observability
- Understanding the origin and quality of data (e.g., the root cause of an anomaly, the reproducibility of the best result) is becoming more and more challenging due to ever-increasing data volume and system complexity. Data provenance, or data lineage, describes the life cycle of data and is essential to addressing this challenge. We are building frameworks to capture rich metadata and generate provenance to ensure the observability, reproducibility, explainability, auditability, trustworthiness, etc. of systems and data.
- Relevant Papers: [TPDS'24 | HPDC'22 | NAS'22 | FAST'22-WiP | DOE-ASRC-Data'22 | TBench'21 (video) | HotStorage'20 (slides)]
- Open Source Artifacts: [PROV-IO | BugBenchk]
- Main Sponsors: [NSF | LBNL | Samsung]
- Data Infrastructure for
Rural America
- ARA is a data infrastructure for advanced wireless research being deployed across the ISU campus, the City of Ames, and surrounding research and producer farms as well as rural communities in central Iowa, spanning a rural area with a diameter of over 60 km. It serves as a wireless living lab for smart and connected rural communities and provides rich experimental data, enabling the research and development of rural-focused wireless technologies that provide affordable, high-capacity connectivity to rural communities and industries such as agriculture.
- Relevant Papers: [CN'25 | MobiCom'23 | WiNTECH'22 | WiNTECH'21]
- Open Source Artifacts: [ARA Platform]
- Main Sponsors: [NSF | NIFA | US Ignite | PAWR Industry Consortium]
- Selected Outreach Activities
- Full
Publication List
- Alumni
- Acknowledgements