Semantic Web for Effective Healthcare Systems - Group of authors

1.7.3 Discussion 3

Generally, the term-document (TD) matrix is stored in .csv format, which takes megabytes of storage, whereas the Ontology file in .owl format takes only kilobytes. For example, when the review documents of the dataset DS1 were converted into a TD matrix, the resulting .csv file was approximately 3.5 MB. Each review document consumes approximately 1 kB of storage, and the exact figure depends on the number of terms present in the dataset. In .owl format, however, DS1 takes only about 360 kB.
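To make the storage comparison concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a dense TD matrix with one row per document and one column per vocabulary term, builds it for three invented toy reviews, and serializes it as CSV. The helper name `td_matrix_csv` and the toy reviews are hypothetical. The last lines compute the size ratio from the approximate DS1 figures quoted above, assuming 1 MB = 1000 kB.

```python
import csv
import io


def td_matrix_csv(docs):
    """Return (vocabulary, CSV text) for a dense term-document matrix.

    Hypothetical helper for illustration: whitespace tokenization,
    one row per document, one column per distinct term.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["doc_id"] + vocab)
    for doc_id, tokens in enumerate(tokenized):
        # A dense row stores a count for every term, including zeros.
        writer.writerow([doc_id] + [tokens.count(term) for term in vocab])
    return vocab, buf.getvalue()


# Invented toy reviews, not taken from DS1.
toy_docs = [
    "the doctor was helpful",
    "the waiting time was long",
    "helpful staff and clean rooms",
]

vocab, csv_text = td_matrix_csv(toy_docs)
cells = len(toy_docs) * len(vocab)          # cells grow as documents x terms
csv_bytes = len(csv_text.encode("utf-8"))   # size of the serialized matrix

# Approximate DS1 sizes reported in the text (assuming 1 MB = 1000 kB).
csv_kb = 3.5 * 1000
owl_kb = 360
ratio = csv_kb / owl_kb

print(f"{len(vocab)} terms, {cells} cells, {csv_bytes} bytes as CSV")
print(f"Reported .csv/.owl size ratio for DS1: about {ratio:.1f}x")
```

Because every document row carries a cell for every term, the CSV grows with the product of document count and vocabulary size, which is consistent with the roughly tenfold difference reported for DS1.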

This chapter focused on building an Ontology for the contextual representation of user-generated content, i.e., the review documents. The contextually aligned documents are represented in a domain Ontology, along with their semantics, using the Ontology-based Semantic Indexing (OnSI) model.

It has been identified that the way documents are modeled greatly affects the query processing time and the recall value. The OnSI model improves the recall value by 27% and reduces the time by 1.53 s when compared with the Naïve Bayes technique. Similarly, it improves the recall value by 20% and reduces the time by 1.8 s when compared with the k-means algorithm. The right choice of LDA parameter values, along with the correlation analysis involved in the CFSLDA feature selection process, improves the accuracy of the model. With the right set of features in hand, contextual feature-based sentiment analysis and predictive analytics become possible on the dataset using supervised machine learning techniques.
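For readers less familiar with the metric, recall is the fraction of relevant documents that a query actually retrieves. The sketch below uses the standard definition with invented document ids; the numbers are hypothetical and are not the chapter's results. Since the text does not state whether the reported improvements are absolute (percentage points) or relative, the sketch computes both under these made-up values.

```python
def recall(relevant, retrieved):
    """Recall = |relevant AND retrieved| / |relevant|."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("recall is undefined when there are no relevant documents")
    return len(relevant & set(retrieved)) / len(relevant)


# Hypothetical document ids, invented for illustration only.
relevant_docs = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
baseline_hits = [1, 2, 3, 4, 5, 11]         # retrieves 5 of the 10 relevant docs
improved_hits = [1, 2, 3, 4, 5, 6, 7, 12]   # retrieves 7 of the 10 relevant docs

baseline = recall(relevant_docs, baseline_hits)
improved = recall(relevant_docs, improved_hits)

absolute_gain = improved - baseline               # difference in recall
relative_gain = (improved - baseline) / baseline  # gain as a share of the baseline

print(f"baseline={baseline:.2f}, improved={improved:.2f}")
print(f"absolute gain={absolute_gain:.2f}, relative gain={relative_gain:.2f}")
```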