The MEDE Data Science Cloud: SciServer-Based Data Science for Materials Scientists and Engineers

At Hopkins we’ve developed the Materials in Extreme Dynamic Environments Data Science Cloud (MEDE-DSC) to address the need for robust, sustainable data science tools in the materials domain. The MEDE-DSC combines computing infrastructure with collaborative integration into the materials design loop. The project aligns with the strategic goals of the Materials Genome Initiative (MGI): to facilitate access to materials data, to build data science skills in the materials domain, and to create tools that help materials scientists link experiments, computation, and theory. This focus guides the project’s commitment to bring data science tools directly to materials researchers, whose domain knowledge and expertise guide meaningful materials research.

MEDE-DSC infrastructure is built on the SciServer platform. SciServer, an NSF Data Infrastructure Building Block (DIBB) center, combines core components for Big Data storage and computation to bring the computation to the data. In our implementation we focus on delivering materials science tools in a simple, robust package. The computing environment uses preloaded Docker containers built on SciServer’s Linux virtual machine architecture. Materials scientists and engineers access computing tools and data through a versatile, expandable Jupyter Notebook architecture. The combination of containers and notebooks brings power, consistency, and clarity while moving towards reproducible, narrated computation. Ultimately, our hope is that MEDE-DSC’s Big Data tools give materials scientists the opportunity to design a new class of research that makes full use of modern instrumentation and simulation capabilities.
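As a rough illustration of what "reproducible, narrated computation" can look like inside a notebook, the sketch below records the interpreter, operating system, and package versions that an analysis ran against. It uses only the Python standard library and is not part of SciServer or MEDE-DSC; the function name `environment_snapshot` and the package list are hypothetical choices made for this example.

```python
import json
import platform
import sys
from importlib import metadata


def environment_snapshot(packages):
    """Return a dict describing the interpreter, OS, and package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": versions,
    }


# Hypothetical package list; substitute whatever the analysis actually imports.
snapshot = environment_snapshot(["numpy", "pandas"])
print(json.dumps(snapshot, indent=2))
```

Placing a cell like this at the top of a notebook stores the environment description alongside the results, so a reader can tell which container image and library versions produced them.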

[Figure: SciServer Compute MEDE Notebook]

For more information, please contact Tamas Budavari (budavari@jhu.edu) or David Elbert (elbert@jhu.edu), or visit the JHU CMEDE website (https://hemi.jhu.edu/cmede/).