A modern digital business must be able to collect, analyze, and derive value from data at an increasingly large scale. Successful firms are truly data-driven. This is especially true of life sciences organizations, where several different forms of data analytics are essential. Maverick worked with a major life sciences organization to address this demand and support key tasks, including:
The Maverick technology team worked with the customer as it substantially upgraded the technology that supported these activities.
The analytics and high-performance computing (HPC) tools the organization was using were not up to the demands being placed on them. In addition, its use of cloud was very limited, leading to higher costs and constrained HPC resources during peak times. The company needed a new, modern, and extensible HPC infrastructure that could support current and future analytics activities.
The Maverick team started the process by gathering a comprehensive set of requirements to drive the decisions around the HPC infrastructure. This infrastructure had to include not only on-premises resources but also the customer's preferred cloud service provider, AWS. After a complete review, the Maverick team built a design for new HPC infrastructure that includes:
A key part of this solution was the use of a colocation facility to get tens of gigabytes of connectivity to cloud resources with sub-millisecond latency. This proximity makes it possible for us to develop hybrid workflows that combine cloud and on-premises resources for the greatest effectiveness and efficiency. We also designed a high-speed parallel storage infrastructure with tiering. The most active data is kept on local, high-speed NVMe, while less-used data is moved into S3. This tiering allows us to preserve the NVMe storage for active, in-demand data, while pushing less-used, "colder" data into the elastic S3 tier, which lowers storage costs.
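The tiering rule described above can be illustrated with a minimal sketch. It is not the production policy: the 30-day threshold, the FileRecord type, and the function names are all assumptions made for illustration, and the sketch only decides where data should live rather than moving anything.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed cutoff for "colder" data; the actual threshold is not stated.
COLD_AFTER = timedelta(days=30)


@dataclass
class FileRecord:
    """Hypothetical record of a file and when it was last accessed."""
    path: str
    last_access: datetime


def choose_tier(record: FileRecord, now: datetime,
                cold_after: timedelta = COLD_AFTER) -> str:
    """Return 'nvme' for recently used data and 's3' for colder data."""
    age = now - record.last_access
    return "s3" if age > cold_after else "nvme"


def plan_moves(records: list[FileRecord], now: datetime) -> list[str]:
    """List the paths that would be pushed from NVMe to the S3 tier."""
    return [r.path for r in records if choose_tier(r, now) == "s3"]
```

Keeping the decision in a small pure function like this makes the threshold easy to tune as usage patterns become clearer.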
This solution gives researchers the performance they need, but it was equally important that we help scientific teams access and manage their data at scale. We implemented Starfish to help with data management. Starfish sits alongside our data storage platforms and scans the files periodically. The scan results are fed into a database, and several associated tools allow automated management of the data.
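The general pattern of scanning files into a database can be sketched as follows. This is a generic illustration of the idea only; it does not use or represent Starfish's own tools or interfaces, and the table layout and function names are hypothetical.

```python
import os
import sqlite3
import time


def scan_tree(root: str, conn: sqlite3.Connection) -> int:
    """Walk root and record each file's metadata; return files recorded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, size INTEGER, atime REAL, scanned_at REAL)"
    )
    scanned_at = time.time()
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                # The file vanished or is unreadable; skip it for this scan.
                continue
            # A rescan replaces the earlier row for the same path.
            conn.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                (path, st.st_size, st.st_atime, scanned_at),
            )
            count += 1
    conn.commit()
    return count


def total_bytes(conn: sqlite3.Connection) -> int:
    """Example of a management query that runs against the scan results."""
    row = conn.execute("SELECT COALESCE(SUM(size), 0) FROM files").fetchone()
    return row[0]
```

Once the metadata is in a database, questions such as how much data a project holds or which files have gone unused become queries rather than fresh walks of the file system.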
The solution has only recently been brought into production, so documentation of the benefits from a full rollout is yet to come. As Maverick has onboarded the first teams to the new infrastructure, it has seen eager uptake of the new capabilities. Teams are quickly recognizing what the platform can do and how it benefits their work. They are now able to use more data, generate results more quickly, and drive their processes forward faster.