University of California

Pilot Project

Overview

The pilot is a collaborative effort of the UC chief information officers (CIOs) and vice chancellors for research (VCRs) to identify a limited set of projects that will benefit from shared compute resources, demonstrating how greater research capacity and capability can be provided in a shared environment. UC will be among the first public university systems to do this.

Why is this pilot named ShaRCS?

The pilot project's name, ShaRCS (pronounced "sharks"), is a play on the acronym for the Shared Research Computing Services pilot. The pronunciation inspired the names of the northern and southern clusters, Mako and Thresher, respectively; both are species of shark.

Goals and Objectives

This pilot project has been designed to define, demonstrate, and measure how shared research computing and storage clusters residing in regional data centers can provide computing services to principal investigators (PIs). The pilot must show that these research computing services provide better capabilities than campuses can develop individually, reduce the overall cost to UC, and retain low-barrier access to the service for PIs. In particular, the pilot will determine how best to deploy and sustain an economies-of-scale alternative to today's costly, performance-inefficient campus facilities.

Benefits

UC as a whole will benefit by learning how to create research cyberinfrastructure that works effectively at the UC-wide level and leverages existing resources. Researchers will benefit from privileged access to specialized services and systems designed for the kinds of research computing projects already in progress on the campuses, and from the best available computing technology for developing and enhancing scientific software tools, gaining recognition and exposure for their projects and for the university. Fostering collaboration among researchers at all UC campuses, and recruiting and retaining outstanding researchers, are further significant benefits of a shared cluster computing service.

High-Performance Computing Clusters

The ShaRCS pilot project is deploying two 272-node Linux clusters, each built from dual-socket, quad-core Nehalem processor nodes connected by a Quad Data Rate (QDR) InfiniBand interconnect. The clusters are managed by the San Diego Supercomputer Center (SDSC), a research unit of UC San Diego, and by Lawrence Berkeley National Laboratory (LBNL), a DOE-funded national laboratory managed by UC. Both the north (LBNL) and south (SDSC) centers have a long history of delivering state-of-the-art, high-productivity computing facilities, and this high standard continues in the deployment of the ShaRCS high-performance computing clusters.
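
To put the node counts in perspective, here is a minimal back-of-the-envelope sketch of per-cluster capacity. It assumes all 272 nodes are compute nodes; head or service nodes, which the figures above do not break out, would reduce the usable total slightly.

    # Rough aggregate core count for one ShaRCS cluster (Mako or Thresher).
    # Assumption: all 272 nodes are compute nodes; head/service nodes,
    # which are not broken out above, would lower the usable total.
    NODES = 272            # nodes per cluster
    SOCKETS_PER_NODE = 2   # dual-socket Nehalem boards
    CORES_PER_SOCKET = 4   # quad-core processors

    total_cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
    print(f"Cores per cluster: {total_cores}")  # Cores per cluster: 2176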

UC-Wide Involvement

Participation in this pilot project includes LBNL and nine of the ten UC campuses. Thirteen research projects will use the North Cluster and ten will use the South Cluster.

Campuses represented in the North Cluster:

  • Lawrence Berkeley National Laboratory
  • UC Berkeley
  • UC Davis
  • UC San Francisco
  • UC Santa Cruz

Campuses represented in the South Cluster:

  • UC Irvine
  • UC Los Angeles
  • UC Riverside
  • UC San Diego
  • UC Santa Barbara

Pilot Principal Investigators

The roughly two dozen projects proposed by teams from across UC were selected on the basis of their capacity to:

  1. advance research in priority areas, such as global health and environmental science;
  2. become more competitive for obtaining extramural support; and
  3. nucleate new communities of cyber-enabled research in areas such as the social sciences, arts, and humanities.

The following projects across the University of California were selected for the initial phase of ShaRCS.

Project | Campus | Principal Investigator(s)
Climate Modeling Capacity | Berkeley | John Chiang, Thomas Zach Powell, Inez Fung, Ron Cohen
Comparative Genomics Cyberinfrastructure Needs; Understanding Diversity in Microbial Community Sequencing | Berkeley | Steven Brenner
Phylogenomics Cyberinfrastructure for Biological Discovery | Berkeley | Kimmen Sjolander, Steven Brenner, Jasper Rine
Optimized Materials and Nanostructures from Predictive Computer Simulations | Davis | Giulia Galli, Francois Gygi
Hydrology Analysis Cyber-infrastructure Proposal | Irvine | Soroosh Sorooshian, Sue Bryant, Bisher Imam
Simulation and Modeling of Biological Molecules | Irvine | Doug Tobias
Speeding the Annotation and Analysis of Genomic Data for Biofuels and Biology Research | LBNL | Adam Arkin, Dylan Chivian, Paramvir Dehal, Paul Adams
CCSM to Study New Biofuels with Carbon Cycles | LBNL, Berkeley | Bill Collins
Research in the Physics of Real Materials at the Most Fundamental Level Using Atomistic First Principles (or ab initio) Quantum-Mechanical Calculations | LBNL, Berkeley | Steven Louie, Jeffrey Neaton
Universe-Scale Simulations for Dark Energy Experiments | LBNL, Berkeley | Martin White, David Schlegel
Nano-system Modeling and Design of Advanced Materials | Los Angeles | Nasr Ghoniem
Organic Reaction Mechanisms and Selectivities, Enzyme Design, and Material and Molecular Devices | Los Angeles | K.N. Houk
Particle-in-Cell Simulations of Plasmas | Los Angeles | W.B. Mori, V.K. Decyk, F.S. Tsung, P. Pritchett, J. Tonge
Space Plasma Simulations | Los Angeles | Maha Ashour-Abdalla
Oceanic Simulation of Surface Waves and Currents | Los Angeles, Santa Barbara | J.C. McWilliams, A.F. Shchepetkin, Yusuke Uchiyama
Dynamics and Allosteric Regulation of Enzyme Complex | Riverside | Chia-en Angelina Chang
Functional Theory for Multi-Scaling of Complex Molecular Systems and Processes | Riverside | Jianzhong Wu
Establishing CI Capable of Capture and Analysis of Next-Generation Sequencing Data | San Diego | Trey Ideker
Physics-Based Protein Structure Prediction | San Francisco | Ken Dill
Computational Chemistry and Chemical Engineering Projects | Santa Barbara | Joan Shea, Baron Peters
Development and Mathematical Analysis of Computational Methods | Santa Barbara | Paul Atzberger
California Current System | Santa Cruz | Christopher Edwards
Convection and Magnetic Field Generation | Santa Cruz | Gary Glatzmaier

These are the PI projects currently in production on the clusters; some changes have been made relative to the original 24 pilot projects.