Machine Learning Techniques and Analytics for Cloud Security, by a group of authors
3.5 Application in Cloud Domain
Cloud computing is a term widely used in information technology. It captures the basic idea of how an end user can obtain different types of IT resources, such as software services and hardware resources. There is no standard, universally accepted definition of the cloud. Still, it can be described as a set of interconnected, virtualized computers that are provisioned dynamically and made available as computing resources according to a service level agreement. Several categories of service are available in the cloud computing domain.
Infrastructural service: Here, computational resources such as processors and storage are provided to end users in raw form. In this model, users can install their own applications, and even the operating system, on the infrastructure provided to them. In effect, users rent space for computation. The cloud domain can therefore be used effectively as a research tool for genomic study.
Platform service: This type of service matters most when developing and launching new software applications, since doing so requires a proper platform for implementation.
Table 3.2 Resultant genes (gene symbols) identified by PC-LR method.
Significant true positive genes for colon cancer | | | |
---|---|---|---|
MSH2 | IGF1 | PKM | IL33 |
TP53 | CCND1 | MIF | ITGA5 |
VEGFA | VDR | TERT | CSK |
PTGS2 | IGF1R | TAC1 | SDC2 |
AKT1 | HIF1A | CDKN1A | EGFR |
Software services: Users rely on this type of service for specific applications, such as Dropbox when storage is the need, or Google Docs when the requirement is an application comparable to a word processor.
In the present article, our focus is on using infrastructural service in cloud computing or, more specifically, storage as a service. Our study processes biological data and generates the resulting biomarkers. We have developed a model that analyzes gene expression data from both normal and cancerous states and identifies cancer-mediating genes; these genes can be valuable to medical science and can help biologists from several perspectives.
The biggest challenge for the research community in genomic science is to build infrastructure with a large number of computers and efficient software tools for analyzing genomic datasets more exhaustively in biomedical research and, to some extent, in clinical practice. Researchers in this domain are therefore moving toward the cloud. Solving biomedical problems depends on analyzing data effectively, so integrating data from genomics, systems biology, and biomedical data mining is a promising direction [24]. In our proposed model, we worked on a dataset supplied as a file (.csv format); after processing it with the developed methodology, we produced a resultant dataset that is stored in a text file. Our concern is how all these data can be made available in a cloud environment so that other users in the research community can access them for further work. There are, however, some parameters of concern [25].
In cloud computing, maintaining the secrecy of data is a major concern that deserves the highest priority. Here we are concerned only with the confidentiality of the data, treated in a simplified manner without going into architectural detail. Cloud computing also brings other benefits, such as lower cost and greater efficiency. Alongside these benefits, data security and confidentiality remain points of concern. There are many commercial cloud offerings, but they are heterogeneous and address different needs depending on the customer. The primary contenders in this field are Microsoft Azure, Google App Engine, Amazon Web Services (AWS), IBM Cloud, and many more.
Amazon Simple Storage Service, also known as Amazon S3, is an object storage service that offers scalability considered industry-leading, along with security, performance, and data availability. Because our requirement is to store files and keep the dataset secure, AWS is a good choice. S3 provides software developers and IT teams with object storage that is highly secure, scalable, and durable. It offers an easy-to-use web interface for storing and retrieving data from anywhere on the web, regardless of the amount of data involved. Dropbox, in this view, acts as a layer built on top of S3 that simplifies its user interface for storing files in the AWS cloud. Data is spread across multiple devices and facilities. Although S3 can serve many purposes, in the present context it is used to store files in buckets/folders in a secure way.
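S3 has no real directories: a "folder" is simply a shared prefix in the object key. As a minimal sketch of how an input CSV file and a resulting text file could be laid out under one bucket, the following assumes a bucket name, a project prefix, and file names that are purely hypothetical and do not come from the study:

```python
from pathlib import PurePosixPath

# Hypothetical names, chosen only for illustration.
BUCKET = "example-genomics-bucket"
PROJECT_PREFIX = "colon-cancer-study"


def object_key(stage: str, filename: str) -> str:
    """Build an S3 object key whose prefix plays the role of a folder path."""
    if stage not in {"input", "results"}:
        raise ValueError(f"unknown stage: {stage!r}")
    # Keep only the final path component so local directories do not leak into the key.
    name = PurePosixPath(filename).name
    return f"{PROJECT_PREFIX}/{stage}/{name}"


def s3_uri(key: str) -> str:
    """Return the s3:// URI of a key in the example bucket."""
    return f"s3://{BUCKET}/{key}"


input_key = object_key("input", "data/expression.csv")
result_key = object_key("results", "selected_genes.txt")
print(s3_uri(input_key))
print(s3_uri(result_key))
```

Keeping inputs and results under separate prefixes makes it possible to grant collaborators read access to the results alone, should that be desired.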
Because security is a major issue, storing the data in Amazon S3 and keeping it secure from other users is a key parameter to consider. This has to be implemented by applying encryption features together with access management tools.
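One possible way to combine the two measures, encryption and access management, is sketched below. It is not the configuration used in the study. The helper names, the bucket name, and the object key are hypothetical. The sketch requests server-side encryption with S3-managed keys for the uploaded file and builds a bucket policy that denies any request not sent over TLS. Only the `upload` function needs the third-party `boto3` package and AWS credentials; the other helpers run with the standard library alone.

```python
import json


def build_put_object_args(bucket: str, key: str, body: bytes) -> dict:
    """Assemble keyword arguments for an upload that S3 encrypts at rest."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # Ask S3 to encrypt the object with S3-managed keys (SSE-S3).
        "ServerSideEncryption": "AES256",
    }


def build_tls_only_policy(bucket: str) -> str:
    """Return a bucket policy (JSON) that denies requests not sent over TLS."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


def upload(bucket: str, key: str, body: bytes) -> None:
    """Apply the policy and upload the object; needs boto3 and AWS credentials."""
    import boto3  # third-party; imported lazily so the helpers above run without it

    s3 = boto3.client("s3")
    s3.put_bucket_policy(Bucket=bucket, Policy=build_tls_only_policy(bucket))
    s3.put_object(**build_put_object_args(bucket, key, body))


# Hypothetical bucket, key, and file content, for illustration only.
args = build_put_object_args(
    "example-genomics-bucket", "results/selected_genes.txt", b"MSH2\nTP53\n"
)
print(args["ServerSideEncryption"])
print(build_tls_only_policy("example-genomics-bucket"))
```

Separating the construction of the request from the network call keeps the security-relevant settings easy to inspect and test before anything is sent to AWS.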
Figure 3.8 Storing and accessing the data values in Amazon S3.
According to AWS, S3 is the only object storage service that can block public access to all the objects stored in a bucket. The S3 Block Public Access mechanism can also apply this restriction at the account level. To ensure that objects never have public access, now or in the future, S3 Block Public Access provides controls at different levels: the entire AWS account or the individual bucket. Objects and buckets are granted public access through access control lists (ACLs), through bucket policies, or through both. To block public access to all S3 buckets and objects, "block all public access" should be switched on at the account level. These settings then apply account-wide to every bucket, current and future. Although AWS recommends turning on the block option, it must first be ensured that all applications run properly without public access. Where some degree of public access to buckets and objects is needed, the individual settings can be configured to fit the specific storage use case. S3 Block Public Access settings override S3 permissions that allow public access. This makes it easier for an administrator to set up centralized control that prevents changes to the security configuration, regardless of how an object is added or a bucket is created. If an object is written to a bucket or AWS account that has S3 Block Public Access enabled, and the object specifies any form of public permission through an ACL or a policy, those public permissions remain blocked. Figure 3.8 gives an idea of how data is stored in and accessed from AWS S3.
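The interplay of account-level and bucket-level settings can be illustrated with a small model. The sketch below is a deliberately simplified toy, not AWS's actual evaluation logic. It assumes the four Block Public Access switches as AWS names them, and it assumes that S3 applies the most restrictive combination of the account-level and bucket-level settings. The class and function names are hypothetical.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PublicAccessBlock:
    """The four S3 Block Public Access switches (False means 'off')."""
    BlockPublicAcls: bool = False        # reject new public ACLs
    IgnorePublicAcls: bool = False       # ignore public ACLs that already exist
    BlockPublicPolicy: bool = False      # reject new public bucket policies
    RestrictPublicBuckets: bool = False  # cut off access granted by a public policy

    def combine(self, other: "PublicAccessBlock") -> "PublicAccessBlock":
        """A switch is effective if it is on at either level (most restrictive wins)."""
        mine, theirs = asdict(self), asdict(other)
        return PublicAccessBlock(**{name: mine[name] or theirs[name] for name in mine})


BLOCK_ALL = PublicAccessBlock(True, True, True, True)


def anonymous_read_allowed(public_acl: bool, public_policy: bool,
                           effective: PublicAccessBlock) -> bool:
    """Toy check: does a public grant still take effect under these switches?"""
    via_acl = public_acl and not effective.IgnorePublicAcls
    via_policy = public_policy and not effective.RestrictPublicBuckets
    return via_acl or via_policy


account_level = BLOCK_ALL                   # "block all public access" on the account
bucket_level = PublicAccessBlock()          # nothing switched on for this bucket
effective = account_level.combine(bucket_level)

print(asdict(effective))
print(anonymous_read_allowed(public_acl=True, public_policy=True, effective=effective))
```

In this model, a public ACL or a public policy attached to an object or bucket has no effect once the account-level block is on, which mirrors the behaviour described above. The dictionary returned by `asdict` uses the same key names as the `PublicAccessBlockConfiguration` argument accepted by boto3's `put_public_access_block`, so it could be passed to that call in a real deployment.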