Semantic Web for Effective Healthcare Systems - Collective of authors - Page 17

1.2 Related Work

Ontology facilitates a shared understanding among people by formalizing the conceptualization of a specific domain. It describes the contextual representation of data semantics well [8]. An Ontology defines the concepts of a domain using a common vocabulary and describes their attributes, behavior, relationships and constraints. The UML diagrams along with [...] The interactions between proteins and genes are well explained by an Ontology representation, which supports biologists in classification [9]. Reviews of hotels and movies have been classified using rule-based systems and Ontology [14–16]. Document annotation and rules were used to create a knowledge base of web documents by extracting semantic data such as named entities [14, 17, 18]. Ontology learning and RDF repositories were used to build knowledge and information management, which in turn enabled the automatic annotation and retrieval of documents [19]. The WordNet Ontology was used to extract sentiments based on lexicon dictionaries [20, 21].

The information extraction (IE) process uses an Ontology to understand the domain and to extract the relevant information. Because the Ontology is domain specific, the complexity of extraction is reduced. IE techniques are then used to populate and enhance the Ontology, which can also be enriched from other useful sources of knowledge [7]. SVM classification together with SentiWordNet enabled the building of a sentiment dictionary for the positive and negative categorization of text documents [23]. Opinion extraction techniques along with entropy-based classification techniques were used to build a structured Ontology for the Digital Camera datasets [24]. Products and their attributes were classified according to their hierarchy using the hierarchical learning sentiment ontology tree (HL-SOT) algorithm, which in turn was used for opinion mining of products and their features [25].

A knowledge base serves as a dictionary of the vocabulary used to represent the concepts of a specific domain. Like a dictionary, the Ontology provides semantic knowledge about class instances. The meaning of a document may be extracted with a semantic-based approach that establishes a suitable context within the document, instead of using the terms present in the document. Related terms were extracted and categorized using semantic-based approaches such as the LSI [11] and LDA [13] techniques. An Ontology-based sentiment analysis model was developed for mining product features from customer reviews [1]. A hybrid model combining Ontology with a Genetic Algorithm was used to automatically group Chinese proposals into clusters, achieving an F-measure above 90% [26]. Sentiment lexicons of emotional categories were derived from Twitter posts about mobile products using Ontology learning and lexicon-based techniques [27]. Ontology and a vector analysis method were used in feature selection and sentiment analysis of movie reviews [22]. An Ontology-based sentiment analysis model along with rule-based classification was applied to the postal services of the United States and Canada [28]. A sentiment grabber model was developed using Ontology, probabilistic LDA and text annotations [13]. Hotel reviews were automatically classified using SVM and a fuzzy domain ontology [29].

Research on user-generated content has also focused on lexicon-based or linguistic approaches. Named entity recognition, feature extraction, the reliability of content and the language used are some of the challenges in text analysis. Information Retrieval (IR) techniques such as Vector Space Modelling, Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) were used to transform unstructured free text into a structured format. Words that do not represent entities were removed from consumer product reviews using the PMI measure, to improve the precision of the feature extraction method [30]. In the lexicon-based approach, positive and negative words were extracted from the opinions [31–34], and an overall sentiment was aggregated for each document. Words or phrases occurring with conjuncts and connectives were considered in order to build word dependencies. Sentiment analysis was then performed using the Naive Bayes classification algorithm [31, 34]. Opinion Observer was built using NLP techniques and an opinion aggregation function to detect the polarity of opinions [35]. Automatic extraction of sentiment-related adjectives from blogs and reviews was proposed, with association rule mining used to build the dictionary [32]; this resulted in an accuracy of more than 70% for positive adjectives and more than 60% for negative adjectives. Similarly, sentiment dictionaries were created using the Naive Bayes algorithm and NLP techniques to develop an opinion mining model for film reviews [36]. Poirier et al. concluded that machine learning algorithms were suitable for larger data sets, whereas linguistic methods were suitable for smaller data sets.
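The lexicon-based aggregation described above can be illustrated with a minimal Python sketch. It is not the method of any cited work: the word lists are hypothetical toy lexicons, the names `polarity_score` and `classify` are introduced here for illustration, and the one-word negation rule is an assumed simplification of the word-dependency handling mentioned in the text.

```python
import re

# Hypothetical toy lexicons (assumption); real systems use curated dictionaries.
POSITIVE = {"good", "great", "excellent", "sharp", "comfortable"}
NEGATIVE = {"bad", "poor", "terrible", "blurry", "noisy"}
NEGATORS = {"not", "never", "no"}


def polarity_score(text: str) -> int:
    """Sum +1 per positive and -1 per negative word, flipping after a negator."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        if token in POSITIVE:
            value = 1
        elif token in NEGATIVE:
            value = -1
        else:
            continue  # word carries no sentiment in the lexicon
        # Assumed simplification: only the immediately preceding word can negate.
        if i > 0 and tokens[i - 1] in NEGATORS:
            value = -value
        score += value
    return score


def classify(text: str) -> str:
    """Map the aggregated document score to a polarity label."""
    score = polarity_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify("Great zoom and excellent battery"))  # positive
```

Summing signed word scores is the simplest aggregation function; weighted lexicons or a trained classifier such as Naive Bayes would replace `polarity_score` in a fuller system.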

The double propagation method was proposed for retrieving new sentiments from sentences and assigning them a positive or negative polarity [33]. Product features were extracted from review documents using unsupervised learning techniques [11], and words belonging to the same concept were grouped using the Latent Semantic Association (LaSA) model. Text analysis and statistical techniques were used to rank product quality from their websites [31]. NLP techniques were used to identify the most frequently used positive and negative sentiment words for the classification of movie documents [37]. Non-negative matrix factorization and clustering techniques were used as a text summarization technique to retrieve suitable answers for a given query [38]. Lexicon-based NLP techniques were used to extract conjunctions, connectives, modals and conditionals for sentiment polarity detection of tweets [34]. Basiri et al. [39] used Dempster–Shafer theory, with its mass function, for sentiment aggregation at the document level. The probabilistic Latent Dirichlet Allocation (LDA) model was used for the annotation of semantics in text documents [13].
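As an illustration of how mass functions can be aggregated, the following Python sketch applies the standard Dempster rule of combination to two pieces of sentiment evidence. It is a generic sketch and not the specific procedure of Basiri et al. [39]; the function name `combine`, the hypothesis labels and the mass values are assumptions chosen for the example.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule: m1 and m2 map frozenset hypotheses to masses summing to 1."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two hypotheses.
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            # Disjoint hypotheses contribute to the conflict mass.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalise so the remaining masses sum to 1.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


# Hypothetical sentence-level evidence (assumed values, not from the source).
POS = frozenset({"pos"})
NEG = frozenset({"neg"})
ANY = POS | NEG  # mass assigned to "unknown polarity"

sentence_1 = {POS: 0.6, NEG: 0.1, ANY: 0.3}
sentence_2 = {POS: 0.5, NEG: 0.2, ANY: 0.3}
document = combine(sentence_1, sentence_2)
```

Combining further sentences means calling `combine` repeatedly on the running result, which yields one document-level mass function.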

User-generated content, which is in an unstructured or semi-structured format, can be converted into a structured format using NLP and machine learning techniques and made available for decision-making purposes. Multi-Criteria Decision Making (MCDM) techniques are used in different sectors, for example in fast food restaurants for measuring service quality [40], for ranking universities [41] and in various simple and complex industrial applications [42–45]. Customer lifetime value and loyalty were evaluated with a hybrid approach combining the Analytic Hierarchy Process (AHP) and association rule mining [46]. AHP for weight identification and fuzzy TOPSIS for ranking were used to evaluate the best alternative for oil project fields [44] and to assess service quality indicators for the tourism industry in Iran [47]. Different MCDM techniques along with statistical techniques were applied for performance measurement in sectors such as healthcare [48] and movie recommender systems [49], so as to improve the quality of their services. The MCDM technique VIKOR was used for measuring customer satisfaction and ranking mobile services [50] and for ranking suppliers [51].
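The ranking step of such an MCDM pipeline can be sketched in Python as follows. This is a minimal crisp TOPSIS, not the fuzzy TOPSIS variant used in the cited studies, and it takes the criterion weights as given instead of deriving them with AHP. The function name `topsis` and the example matrix are assumptions; the sketch also assumes that no criterion column is all zeros and that the alternatives are not all identical.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] per alternative (row); higher is better.

    matrix  -- rows are alternatives, columns are criteria
    weights -- one weight per criterion (assumed given, e.g. from AHP)
    benefit -- True for a benefit criterion, False for a cost criterion
    """
    n_criteria = len(weights)
    # Vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_criteria)] for row in matrix
    ]
    # Ideal and anti-ideal points depend on the direction of each criterion.
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_plus = math.dist(row, ideal)   # distance to the ideal solution
        d_minus = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_minus / (d_plus + d_minus))
    return scores


# Hypothetical example: three services scored on quality, support and cost.
services = [[8, 7, 300], [6, 9, 250], [9, 6, 400]]
scores = topsis(services, weights=[0.5, 0.3, 0.2], benefit=[True, True, False])
ranking = sorted(range(len(services)), key=lambda i: scores[i], reverse=True)
```

Sorting the alternatives by descending closeness score gives the final ranking.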

Ontology learning includes the extraction of domain terms from sources, the modelling of data through Ontology development and easy retrieval while querying. Building an Ontology manually takes great effort and is complex and challenging. This motivates researchers to automatically generate an Ontology for the domain-specific terms present in social media reviews written about a product or service. The Ontology-based Semantic Indexing (OnSI) method tags concepts and attributes into the Ontology using contextually related words. It makes query processing and further information retrieval easier in subsequent steps. This semantic-based approach to indexing improves accuracy in identifying concepts or attributes (or features) from the contents of text documents [26, 27]. An Ontology-based approach for mobile product review classification resulted in a precision of 75% and a recall of 40% [52], while [27] reported a recall of more than 82%.
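Precision and recall figures such as those above are often summarized by the F-measure, their weighted harmonic mean. The Python sketch below shows the standard formula. The function name `f_measure` is introduced here for illustration, and the example value is plain arithmetic on the precision and recall quoted for [52], not a result reported by that work.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta is 1)."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing was retrieved correctly
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustration only: precision 0.75 and recall 0.40 give an F1 of about 0.52.
example_f1 = f_measure(0.75, 0.40)
```

Because the harmonic mean is pulled toward the smaller value, a low recall limits the F-measure even when precision is high.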
