Computing for Health
From the burgeoning science of genomics to the meticulously detailed pictures created by medical imaging, almost every discipline in healthcare is facing a "Data Deluge". Translating this influx of data into results that benefit patients requires massive amounts of computation. Not only does this environment need the fastest available computer hardware, it also needs to use those resources efficiently. Most importantly, as custodians of personal health information (PHI), hospitals must do all of this in an environment that strikes a balance between access to compute resources and the protection of patient confidentiality and privacy. HPC4Health is a consortium of health providers working together to build this next generation of compute engine for clinical research.
Why Does Healthcare Need HPC?
Big Compute, Big Data, and Big Privacy
"Big Data" and its close cousin, "Big Compute", are buzzwords we have all heard before, and in truth they can mean different things to different people. For the modern-day clinical research hospital, it is no exaggeration to say that both the volume of information and the concurrent demands on computers to do useful things with that data have increased exponentially. This is especially true in genomics and pathological image analysis, where datasets span many tens of terabytes of disk space and computations on them may take days to run. As technologies such as DNA sequencing move into clinical practice, access to HPC and storage must also be guaranteed to ensure timely processing and turnaround of critical patient information into clinicians' hands. On top of this growing demand for HPC, hospitals are also the custodians of their patients' PHI and, as such, have an obligation to ensure best practices are in place to maintain that trust and to keep patient data (including DNA sequences and other identifiable information) confidential. HPC4Health grew out of these shared challenges hospitals were facing.
The Technology Behind HPC4Health
In the world of computers, what we mean when we say "cloud" has existed in primitive form for quite some time, spanning back to thin client terminals attached to mainframes in the 1960s and 1970s. But it was really in the late 2000s, when software platforms like Eucalyptus and OpenStack built solutions on virtualization technologies, that we got true IaaS (Infrastructure as a Service): the ability for users to request compute resources without knowing the specifics of the underlying infrastructure in data centres. This is how we use the term cloud.

In HPC4Health, we have a base HPC infrastructure of 7,000 CPU cores located in our data centre, and each healthcare institution accesses its own, fully private, cloud. Each institution is guaranteed a minimum number of CPU cores whenever it needs them (80% of what it contributed to the project), which allows services to be offered on demand for tasks such as high-priority clinical applications. The remaining 20% of the infrastructure is shared: each institution can grow its computing capacity when free resources are available. A unique aspect of HPC4Health is that we have extended the cloud concept by automating access to HPC resources through a centralized "brain" that monitors the usage of each institution's private cloud and expands or shrinks its size based on demand. With this shared cloud model, institutions take advantage of each other's excess capacity without users having to request the resources themselves.
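The elastic sharing policy described above (a guaranteed floor of 80% of contributed cores per institution, plus a shared pool handed out by the central brain) can be sketched roughly as follows. The class, the institution names, and the allocation order are illustrative assumptions for exposition, not HPC4Health's actual scheduler:

```python
from dataclasses import dataclass

GUARANTEED_FRACTION = 0.80  # portion of contributed cores always reserved


@dataclass
class Institution:
    name: str
    contributed_cores: int  # cores the institution put into the consortium
    demand: int             # cores its queued jobs currently want
    allocated: int = 0      # cores the brain has assigned right now

    @property
    def guaranteed(self) -> int:
        return int(self.contributed_cores * GUARANTEED_FRACTION)


def rebalance(institutions: list[Institution], total_cores: int) -> None:
    """One pass of the central brain: satisfy each guaranteed floor,
    then hand the shared remainder to whoever still has unmet demand."""
    # Step 1: every institution gets up to its guaranteed minimum.
    for inst in institutions:
        inst.allocated = min(inst.demand, inst.guaranteed)

    # Step 2: distribute leftover (shared) cores to remaining demand,
    # largest unmet demand first (one of many possible fairness policies).
    free = total_cores - sum(i.allocated for i in institutions)
    for inst in sorted(institutions,
                       key=lambda i: i.demand - i.allocated,
                       reverse=True):
        extra = min(inst.demand - inst.allocated, free)
        inst.allocated += extra
        free -= extra


# Example: two hypothetical institutions sharing a 7,000-core pool.
a = Institution("hospital_a", contributed_cores=4000, demand=4500)
b = Institution("hospital_b", contributed_cores=3000, demand=1000)
rebalance([a, b], total_cores=7000)
# hospital_a's floor is 3200 cores, but since hospital_b only needs 1000,
# hospital_a borrows shared capacity and receives its full 4500.
```

The key design point is that the brain, not the end user, performs the borrowing: when demand drops back below an institution's floor, the same rebalancing pass shrinks its private cloud and returns the cores to the shared pool.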