Semantic Web for Effective Healthcare Systems (group of authors)

1.5.3 OnSI Model Evaluation

The OnSI evaluation module includes query processing, tagging, and Ontology mapping for feature scoring. For a given set of query terms, it retrieves the relevant feature (or topic) and its score from the built domain Ontology.

(i) Query Processing

Searching data is easier and faster when the data are grouped by context and indexed; otherwise, as in a relational schema, searching is quite expensive. Each query document is pre-processed, and the nouns identified by PoS tagging are sent to the Ontology.
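The noun-extraction step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that tokenization and PoS tagging have already been done upstream with Penn Treebank tags, and the function name `extract_nouns` and the hand-tagged sentence are hypothetical.

```python
# Penn Treebank noun tags; the tagger itself is assumed to run upstream.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}


def extract_nouns(tagged_tokens):
    """Return the lower-cased nouns from (word, tag) pairs, in order."""
    return [word.lower() for word, tag in tagged_tokens if tag in NOUN_TAGS]


# Hand-tagged query document (tags written by hand for illustration).
tagged = [
    ("the", "DT"), ("treatment", "NN"), ("is", "VBZ"), ("good", "JJ"),
    ("but", "CC"), ("the", "DT"), ("rent", "NN"), ("is", "VBZ"),
    ("costly", "JJ"),
]

nouns = extract_nouns(tagged)
print(nouns)
```

Only the nouns survive the filter, and these are the terms that would be sent to the Ontology.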

(ii) Ontology Mapping

The features returned by the domain Ontology are mapped to the closest feature using their feature scores. This makes it possible to retrieve the relevant feature and its score for the query documents. SPARQL, the query language for RDF data, is used to retrieve data from the built Ontology. The mapping is described as follows:

Let Q be the set of query documents of product/service reviews written about F features using T terms. These documents are pre-processed and their nouns are extracted. Each query document is represented by q = {t1, t2, …, tn}, where each term t represents a feature f. The function M(t, f) maps each term t ϵ T to the feature f ϵ F and returns the fScore. This module handles four different types of queries, as shown in Figure 1.10.
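As a rough sketch of the mapping function, M(t, f) can be pictured as a lookup in a nested dictionary. The dictionary below is only a toy stand-in for the domain Ontology (the chapter queries a real Ontology with SPARQL). The room and rent scores are taken from the Type 3 example in this section; the treatment score is an invented placeholder.

```python
from typing import Optional

# Toy stand-in for the domain Ontology: term -> {feature: LDA score}.
# The room/rent values come from the Type 3 example in this section;
# the treatment value is an invented placeholder.
ONTOLOGY = {
    "treatment": {"Medicare": 0.30},  # placeholder value
    "room": {"Infrastructure": 0.17, "Cost": 0.38},
    "rent": {"Cost": 0.26},
}


def M(t: str, f: str) -> Optional[float]:
    """Return the fScore of term t under feature f, or None if unmapped."""
    return ONTOLOGY.get(t, {}).get(f)


print(M("rent", "Cost"))
print(M("rent", "Medicare"))
```

Returning None for an unmapped pair keeps "term not under this feature" distinct from a genuine score of zero.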


Figure 1.10 Ontology mapping using OnSI model.

(iii) Retrieval Process From Ontology

When the Ontology is queried using SPARQL, there can be four different types of queries. The procedure for retrieving data from the Ontology in each case is explained below.


Type 1: Terms under only one feature

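If all terms t1, t2, …, tn map to the same feature f, then f is returned as the result, together with the cumulative LDA score of its terms. That is, for all i, 1 ≤ i ≤ n, there exists a single feature f ϵ F such that M(ti, f) is defined, and cum_fScore(f) = LDAscore(t1, f) + LDAscore(t2, f) + … + LDAscore(tn, f).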


For example, “treatment is good” is considered a Type 1 query. Here, the word treatment is extracted. It comes under the feature (or topic) “Medicare,” which is returned as the feature.

Type 2: Terms under multiple features, each term under only one feature

If terms {ti} map to feature fa ϵ F and terms {tj} map to feature fb ϵ F, then the feature with the higher fScore is returned. In this case, the cumulative fScore is computed for each feature f, and the feature with the higher score is selected. The cumulative fScore of a feature f is the sum of the LDA scores of the terms corresponding to f.
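The selection rule can be written compactly as follows. This is a reconstruction from the prose definitions rather than the book's own equation: the notation t ↦ f (term t is mapped to feature f by M) and the symbol f* for the returned feature are assumptions introduced here.

```latex
\[
\mathrm{cum\_fScore}(f) \;=\; \sum_{\substack{t \in q \\ t \mapsto f}} \mathrm{LDAscore}(t, f),
\qquad
f^{*} \;=\; \arg\max_{f \in F} \, \mathrm{cum\_fScore}(f)
\]
```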


For example, “the treatment is good but the rent is costly” is considered a Type 2 query. Here, the terms treatment and rent are extracted. The term treatment comes under the topic “Medicare,” while the term rent comes under the topic “Cost.”
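A minimal sketch of the Type 2 selection, assuming each term belongs to exactly one feature. The source gives no scores for this example, so the numbers below are invented placeholders, and `strongest_feature` is a hypothetical helper name.

```python
from collections import defaultdict

# term -> (feature, LDA score). Each term sits under exactly one feature.
# Both scores are invented placeholders for illustration.
TERM_FEATURE = {
    "treatment": ("Medicare", 0.30),
    "rent": ("Cost", 0.26),
}


def strongest_feature(terms):
    """Sum scores per feature and return (feature, cumulative fScore)."""
    totals = defaultdict(float)
    for t in terms:
        feature, score = TERM_FEATURE[t]
        totals[feature] += score
    # The feature with the highest cumulative fScore wins.
    return max(totals.items(), key=lambda kv: kv[1])


print(strongest_feature(["treatment", "rent"]))
```

With these placeholder scores the query resolves to “Medicare”. A Type 1 query is the special case in which only one feature accumulates any score.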

Type 3: Terms under multiple features, one term under more than one feature

If a term ti is mapped to more than one feature, say fa, fb ϵ F, then the nearest terms, say ti-1 or ti+1, are considered to determine the feature category of ti in the document. In this case, the term is associated with a feature by comparing the fScore of each feature. The cumulative fScore is calculated for each feature f, and the resulting strongest feature(s) is returned as the result. The cumulative fScore of a feature f is the sum of the LDA scores of the terms corresponding to f.


For example, “the rooms are maintained neatly but the room rent is costly” is considered a Type 3 query. Here, the terms rooms, room, and rent are extracted. The term room comes under both the topics “Infrastructure” and “Cost,” and the term rent comes under the topic “Cost.” The term room needs to be fixed under a single topic (either Infrastructure or Cost). This is done by calculating the cumulative scores of the features (or topics) under which the term occurs. Suppose LDAscore(room, Infrastructure) = 0.17, LDAscore(room, Cost) = 0.38, and LDAscore(rent, Cost) = 0.26. Then cum_fScore(f = “Infrastructure”) is 0.17, and cum_fScore(f = “Cost”) is 0.38 + 0.26 = 0.64. Since 0.64 > 0.17, the term room is assigned the feature “Cost” in this context.
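The arithmetic of this example can be checked with a short sketch. It uses the three LDA scores quoted above and assumes that rooms and room have already been reduced to the single term room; the helper name `cum_fscores` is hypothetical.

```python
from collections import defaultdict

# LDA scores quoted in the example above: (term, feature) -> score.
LDA = {
    ("room", "Infrastructure"): 0.17,
    ("room", "Cost"): 0.38,
    ("rent", "Cost"): 0.26,
}


def cum_fscores(terms):
    """Sum the LDA scores of the query terms under each feature."""
    totals = defaultdict(float)
    for (term, feature), score in LDA.items():
        if term in terms:
            totals[feature] += score
    return dict(totals)


# Distinct query terms, with "rooms" assumed to be normalized to "room".
scores = cum_fscores({"room", "rent"})
best = max(scores, key=scores.get)

# Round for display only, to hide floating-point noise in the sum.
print({f: round(s, 2) for f, s in scores.items()}, best)
```

The rounded totals match the 0.17 and 0.64 given in the text, so the ambiguous term room resolves to “Cost”.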

Type 4: Term not present in the Ontology

If a given term is not present in the Ontology, a new LDA score is computed for it and update-Ontology() is called for the new term. That is, if ontoMap(t) is null, where t ϵ T, the Ontology is updated with the new term, and CFSLDA modeling is run again for the update. The query is then processed again as one of the other three types described earlier.
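The Type 4 fallback can be sketched as a lookup with an update on a miss. This is an illustration under stated assumptions: `onto_map` and `resolve` are hypothetical Python stand-ins for ontoMap(t) and the update step, and the `score_new_term` callback merely stands in for rerunning CFSLDA, with an invented score.

```python
def onto_map(ontology, term):
    """Return the {feature: score} entry for term, or None if absent."""
    return ontology.get(term)


def resolve(ontology, term, score_new_term):
    """Look up term; on a miss, score it, update the ontology, and retry."""
    entry = onto_map(ontology, term)
    if entry is None:
        # score_new_term is a hypothetical hook standing in for a CFSLDA
        # rerun; the assignment plays the role of update-Ontology().
        ontology[term] = score_new_term(term)
        entry = onto_map(ontology, term)
    return entry


ontology = {"rent": {"Cost": 0.26}}


def fake_scorer(term):
    """Placeholder scorer returning an invented score, for illustration."""
    return {"Cost": 0.10}


print(resolve(ontology, "fee", fake_scorer))
print("fee" in ontology)
```

After the update the term is present in the Ontology, so the query can be handled again as a Type 1, 2, or 3 query.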
