URL: https://docs.aws.amazon.com/solutions/digital-connected-lab-on-aws/


Overview

This Guidance helps you connect life sciences data instruments and laboratory system files to the AWS Cloud, either through the internet or through a low-latency direct connection. You can reduce storage costs for data that is accessed less often, or make data accessible to high-performance computing for genomics, imaging, and other intensive workloads, all on AWS.

How it works

This architecture diagram helps you learn how to connect file-based life sciences instruments and laboratory systems to the cloud and provide scalable data access and computing using Amazon Web Services (AWS).

[Figure: Architecture diagram]
Step 1
A lab technician runs an experiment or test, and results are written to a folder on an on-premises file server. An AWS DataSync task is set up to sync the data from local storage to a bucket in Amazon Simple Storage Service (Amazon S3).
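As a rough illustration of the sync task in Step 1, the following Python sketch builds the request for the DataSync `CreateTask` API. The location ARNs, the task name, and the hourly schedule are hypothetical placeholders rather than values from this Guidance, and the source and destination locations are assumed to exist already.

```python
import json


def build_datasync_task_request(source_location_arn, destination_location_arn,
                                name="lab-instrument-sync",
                                schedule_expression="cron(0 * * * ? *)"):
    """Build keyword arguments for the DataSync CreateTask API.

    The task name and the hourly schedule are illustrative assumptions.
    """
    return {
        "SourceLocationArn": source_location_arn,
        "DestinationLocationArn": destination_location_arn,
        "Name": name,
        "Schedule": {"ScheduleExpression": schedule_expression},
        "Options": {
            # Verify only the files that were transferred in this run.
            "VerifyMode": "ONLY_FILES_TRANSFERRED",
            # Copy only data that differs between source and destination.
            "TransferMode": "CHANGED",
        },
    }


def create_task(request):
    """Submit the request. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above runs without it
    return boto3.client("datasync").create_task(**request)


# Hypothetical ARNs for an on-premises file share and an S3 bucket location.
example_request = build_datasync_task_request(
    "arn:aws:datasync:us-east-1:111122223333:location/loc-source",
    "arn:aws:datasync:us-east-1:111122223333:location/loc-destination",
)
print(json.dumps(example_request, indent=2))
```

Keeping the request builder separate from the API call makes it possible to review or unit test the task settings without touching an AWS account.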
Step 2
Data is transferred to the AWS Cloud either through the internet, or through a low-latency direct connection that avoids the internet, such as AWS Direct Connect.
Step 3
Electronic lab notebooks (ELN) and lab information management systems (LIMS) share experiment and test metadata bidirectionally with the AWS Cloud through events and APIs. Learn more about this integration in Guidance for a Laboratory Data Mesh on AWS.
Step 4
Partnering entities, like a contract research organization (CRO), can upload study results to Amazon S3 by using AWS Transfer Family for FTP, SFTP, or FTPS.
Step 5
You can optimize storage costs by writing instrument data to an S3 bucket configured for infrequent access. Identify your S3 storage access patterns so that you can configure the S3 bucket lifecycle policy appropriately and transition data to Amazon S3 Glacier.
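A lifecycle policy of the kind described in Step 5 could look roughly like the following Python sketch. The prefix, rule ID, and day thresholds are assumptions chosen for illustration; the right values depend on the access patterns you observe. The `GLACIER` storage class corresponds to S3 Glacier Flexible Retrieval.

```python
def build_lifecycle_configuration(prefix, ia_after_days=30, glacier_after_days=180):
    """Build an S3 lifecycle configuration that tiers objects under `prefix`.

    Objects move to S3 Standard-IA first and to S3 Glacier later.
    The default thresholds are illustrative, not recommendations.
    """
    # S3 does not transition objects to Standard-IA before they are 30 days old.
    if ia_after_days < 30:
        raise ValueError("ia_after_days must be at least 30")
    if glacier_after_days <= ia_after_days:
        raise ValueError("glacier_after_days must be greater than ia_after_days")
    return {
        "Rules": [
            {
                "ID": "tier-instrument-data",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


def apply_lifecycle(bucket, configuration):
    """Apply the configuration. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above runs without it
    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=configuration
    )


# "instrument-data/" is a hypothetical key prefix.
example_configuration = build_lifecycle_configuration("instrument-data/")
```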
Step 6
Amazon FSx for Lustre provides a shared file system with low-millisecond latency, making data accessible to high performance computing (HPC) in the cloud for genomics, imaging, and other intensive workloads.
Step 7
Bioinformatics pipelines are orchestrated with AWS Step Functions, AWS HealthOmics, and AWS Batch for flexible CPU and GPU computing.
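One way to picture the orchestration in Step 7 is a Step Functions state machine that runs AWS Batch jobs in sequence. The Python sketch below generates such a definition in Amazon States Language; the step names, ARNs, and retry settings are hypothetical, and a real pipeline in this Guidance may be structured differently (for example, by delegating to AWS HealthOmics).

```python
import json


def build_pipeline_definition(job_queue_arn, steps):
    """Build an Amazon States Language definition that runs Batch jobs in order.

    `steps` is a list of (state_name, job_definition_arn) pairs.
    """
    if not steps:
        raise ValueError("at least one step is required")
    states = {}
    for index, (state_name, job_definition_arn) in enumerate(steps):
        state = {
            "Type": "Task",
            # The .sync suffix makes Step Functions wait for the job to finish.
            "Resource": "arn:aws:states:::batch:submitJob.sync",
            "Parameters": {
                "JobName": state_name,
                "JobQueue": job_queue_arn,
                "JobDefinition": job_definition_arn,
            },
            # Illustrative retry policy for failed jobs.
            "Retry": [
                {
                    "ErrorEquals": ["States.TaskFailed"],
                    "IntervalSeconds": 60,
                    "MaxAttempts": 2,
                    "BackoffRate": 2.0,
                }
            ],
        }
        if index + 1 < len(steps):
            state["Next"] = steps[index + 1][0]
        else:
            state["End"] = True
        states[state_name] = state
    return {
        "Comment": "Sketch of a sequential bioinformatics pipeline",
        "StartAt": steps[0][0],
        "States": states,
    }


# Hypothetical queue and job definition ARNs.
example_definition = build_pipeline_definition(
    "arn:aws:batch:us-east-1:111122223333:job-queue/genomics",
    [
        ("AlignReads", "arn:aws:batch:us-east-1:111122223333:job-definition/align:1"),
        ("CallVariants", "arn:aws:batch:us-east-1:111122223333:job-definition/call:1"),
    ],
)
print(json.dumps(example_definition, indent=2))
```

The resulting JSON string can be passed as the `definition` argument when creating a state machine.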
Step 8
Machine learning is conducted with an artificial intelligence and machine learning (AI/ML) toolkit that uses Amazon SageMaker for feature engineering, data labeling, model training, deployment, and ML operations. Amazon Athena is used for flexible SQL queries.
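To make the Athena part of Step 8 concrete, the following Python sketch prepares a request for the Athena `StartQueryExecution` API. The database, table, and results location are hypothetical names introduced for illustration only.

```python
def build_athena_query_request(database, table, output_location, limit=100):
    """Build keyword arguments for the Athena StartQueryExecution API.

    The query previews rows from `table`; real analyses would use
    their own SQL.
    """
    # Accept only simple identifiers so the table name is safe to interpolate.
    if not table.isidentifier():
        raise ValueError("table must be a simple identifier")
    query = f'SELECT * FROM "{table}" LIMIT {int(limit)}'
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def start_query(request):
    """Submit the query. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above runs without it
    return boto3.client("athena").start_query_execution(**request)


# Hypothetical database, table, and results bucket.
example_query = build_athena_query_request(
    "lab_results", "assay_results", "s3://example-athena-results/queries/", limit=10
)
```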
Step 9
Researchers who use on-premises applications for data analysis and reporting can view and access data in Amazon S3 over Network File System (NFS) or Server Message Block (SMB) through Amazon S3 File Gateway.

Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

Building Digitally Connected Labs with AWS

This post discusses the tools, best practices, and partners that help life sciences labs take full advantage of the scale and performance of the AWS Cloud.

Guidance for a Laboratory Data Mesh on AWS

This Guidance demonstrates how to build a scientific data management system that integrates both laboratory instrument data and software with cloud data governance, data discovery, and bioinformatics pipelines, capturing key metadata events along the way.