
1.2.4.3 New Data Management Developments


So what new developments in data management will be prevalent in both the hypothesis‐ and the protocol‐driven labs of 2030? In the previous two sections we asserted that these labs will be populated by fewer people; there will be more robotics and automation, and the experiment throughput will be much higher, often on more miniaturized equipment. Building on these assertions then, perhaps the most impactful developments in the data space will be:

1 The all-pervasiveness of the internet of things (IoT) [25, 26]. In the LotF, this will lead to the growth of internet of laboratory things (IoLT) environments, driven in part by ubiquitous 5G communications capability.

2 The widespread adoption of the findable, accessible, interoperable, and reusable (FAIR) data principles. These state that all data should be FAIR [27].

3 The growing use of improved experimental data and automation representation standards, e.g. SiLA [28] and Allotrope [29].

4 Data security and data privacy. These two areas will continue to be critical considerations for the LotF.

5 The ubiquity of “Cloud.” The LotF will not be able to operate effectively without access to cloud computing.

6 Digital twin approaches. These will complement both the drive toward labs operating more as a service and the demand from remote service customers wanting to see into, and directly control from afar, what is happening in the lab. Technologies such as augmented reality (AR) will also help to enable this (see Sections 1.2.5 and 1.2.6).

7 Quantum computing [30–33]. This will move from research to production and so will impact just about everything we do in life, not just in the LotF. Arguably, quantum computing might have a bigger impact on the more computationally intensive parts of the hypothesis‐ and protocol‐driven LotF, e.g. Idea/Hypothesis design and Analyze/Insight, but it will still disrupt the LotF massively. We say more on this in Sections 1.2.5 and 1.2.6.

The first three of these developments are all related to the drive to improve the speed and quality of the data/digital life cycle and the overall data supply chain. That digital life cycle aligns closely with the HEAS and REAF processes outlined in Figures 1.3 and 1.4 and can be summarized as follows (see Figure 1.5):


Figure 1.5 Digital data life cycle.

IoT technology [34] will allow much better connectivity between the equipment in the LotF. This will enable better, quicker, and more precise control of the lab kit, as well as more effective capture of the raw data coming off the equipment. This in turn will allow the next stage in the life cycle – "Analyze Data" – to happen sooner and with more, and better quality, data. This improved interconnectedness in the lab will be made possible by the same 5G communication technology which will be making the devices and products in the home of 2025 more networked and more remotely controllable.
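
To make this concrete, the sketch below shows how an IoLT‐enabled instrument might publish a raw reading to a lab message broker. It assumes the open‐source Eclipse Paho MQTT client for Python; the broker address, topic name, and payload fields are illustrative assumptions, not part of any instrument vendor's actual interface.

```python
# Minimal sketch: an IoLT-enabled plate reader pushing a reading to an MQTT broker.
# The broker address, topic name, and payload fields are illustrative assumptions.
import json
import time

import paho.mqtt.publish as publish  # open-source Eclipse Paho MQTT client

reading = {
    "instrument_id": "plate-reader-07",  # which piece of lab kit produced the data
    "timestamp": time.time(),            # when the measurement was taken
    "well": "A1",                        # plate position
    "wavelength_nm": 450,                # measurement wavelength
    "absorbance": 0.734,                 # raw value captured off the instrument
}

# Publish the raw data point so the next "Analyze Data" step in the life cycle
# can pick it up as soon as it is generated.
publish.single(
    topic="lab/room-101/plate-reader-07/absorbance",
    payload=json.dumps(reading),
    hostname="lotf-broker.example.org",  # hypothetical IoLT broker
    port=1883,
    qos=1,
)
```

A publish/subscribe pattern of this kind decouples the instrument from whichever downstream systems consume its data, which suits a lab where equipment and analysis services change frequently.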

As improved instrument interconnectedness and the IoLT enable more data to be captured by more instruments more effectively, the issue of how to manage the inevitable data flood, and make the deluge useful, comes to the fore. The biggest initiative in 2020 to maximize the benefits of so‐called big data [35] revolves around the FAIR principles. These state that, "for those wishing to enhance the reusability of their data holdings," those data should be FAIR. In the LotF, the FAIR principles will need to be fully embedded in the lab culture and operating model. Implementing FAIR [36] is very much a change process rather than just the introduction of new technology. If fully implemented, though, FAIR will make it massively easier for the vast quantities of digital assets generated by organizations to be made much more useful. Data science as a discipline, and data scientists (a role which today broadly equates to that of "informatician"), will grow enormously in importance and in number. Organizations that are almost purely data driven will thrive, with any lab work they need to do being outsourced via LaaS [37] to flexible, cost‐effective LotFs that operate per the REAF process.
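
As a simple illustration of what capturing data "FAIR by design" could look like, the sketch below writes a machine‐readable metadata record alongside a result set at the point of generation. The field names, identifier scheme, and file layout are assumptions for illustration only; the FAIR principles do not mandate any particular schema.

```python
# Illustrative sketch only: a metadata record captured alongside an experimental
# result set so the data are findable, accessible, interoperable, and reusable.
# Field names and identifiers are assumptions, not a mandated FAIR schema.
import json
import uuid

dataset_metadata = {
    # Findable: a globally unique, persistent identifier plus descriptive metadata
    "identifier": f"doi:10.9999/lotf.{uuid.uuid4()}",  # placeholder DOI-style ID
    "title": "Kinase panel absorbance screen, plate 42",
    "keywords": ["kinase", "absorbance", "screening"],
    # Accessible: where and how the data can be retrieved
    "access_url": "https://data.example.org/datasets/plate-42",
    "access_protocol": "HTTPS",
    # Interoperable: open formats and explicit units/vocabularies
    "format": "text/csv",
    "units": {"absorbance": "dimensionless", "wavelength": "nm"},
    # Reusable: provenance and a clear licence
    "generated_by": "plate-reader-07",
    "protocol_ref": "SOP-1234 v2",
    "license": "CC-BY-4.0",
}

# Store the metadata next to the result file so downstream users (and machines)
# can discover and reuse the data without contacting the originating lab.
with open("plate-42.metadata.json", "w") as fh:
    json.dump(dataset_metadata, fh, indent=2)
```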

Supporting the growth of FAIR requires the data that is generated in these LaaS LotFs to be easily transferable back to the requester/customer in a format which the lab can generate easily, accurately, and reproducibly, and which the customer can import and interpret, again, easily, accurately, and reproducibly. This facile interchange of “interoperable” data will be enabled by the widespread adoption of data standards such as SiLA and Allotrope. We describe these new data standards in more detail in the following section.

Two additional, significant data considerations for the LotF are data security and data privacy, just as they are now. The more LotF services are operated outside the "firewall" of an organization, and the more that future labs are driven by data, the more risks potentially arise from accidental or malicious activities. Keeping those risks low, through continued diligence and data security, will ensure that the LotF is able to develop and operate to its full capability. Similarly, in labs that work with human‐derived samples (blood, tissues, etc.), the advent of regulations such as the General Data Protection Regulation (GDPR) [38, 39], along with the historical stringency surrounding informed consent [40] over what can happen to human samples and the data that arise from their processing, will put even more pressure on the organizations that generate and are accountable for human data to ensure these data are effectively secured. Improved adherence to the FAIR data principles, especially findability and accessibility, will ensure that LotFs working with human‐derived materials can be responsive to data privacy requests and are not compromised.

Going hand in hand with the data explosion of the past decade has been the evolution of the now ubiquitous, key operational technology of “Cloud Computing.” As explained by one of the originating organizations in this area, “cloud computing is the delivery of computing services – including servers, storage, databases, networking, software, analytics, and intelligence – over the Internet (the cloud) to offer faster innovation, flexible resources, and economies of scale.” [41] In the context of LotF, assuming that the equipment in the lab is fully networked, cloud computing means that all the data generated by the lab can be quickly, fully, and securely captured and stored on remote infrastructure (servers). This book is not the place to describe cloud computing in detail, but it should be sufficient to say that the LotF will not be reliant on IT hardware close to its location (i.e. on‐site) but will be highly reliant on speedy, reliable, available networks and efficient, cost‐effective cloud computing.
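
As a deliberately simplified illustration, the sketch below pushes a raw data file produced by an instrument straight to cloud object storage. It assumes AWS S3 accessed through the boto3 library, with a hypothetical bucket and key layout; any provider's equivalent object store would serve the same role.

```python
# Simplified sketch: capturing an instrument's raw output straight to cloud object storage.
# Assumes AWS S3 via boto3, a hypothetical bucket/key layout, and credentials supplied
# by the environment; the local file is expected to exist when this runs.
import boto3

s3 = boto3.client("s3")

# Upload the raw data file so it is immediately available to remote analysis pipelines,
# with no dependence on on-site IT hardware for storage.
s3.upload_file(
    Filename="plate-42_raw.csv",                         # file written by the instrument PC
    Bucket="lotf-raw-data",                              # hypothetical bucket name
    Key="plate-reader-07/2030-01-15/plate-42_raw.csv",   # instrument/date-based layout
)
```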

Finally, there is a data and modeling technology, long established in industries outside the life sciences, which could play a growing role in a LotF that is more automated and more remote: the "digital twin." [42, 43] We say more on this exciting technology in Section 1.2.5.1.
