An Introduction to Text Mining

Genre: Sociology. Rights holder and/or publisher: Ingram. ISBN: 9781506337029. Age restriction: 0+.

Book description

Students in social science courses communicate, socialize, shop, learn, and work online. When they are asked to collect data for course projects, they are often drawn to social media platforms and other online sources of textual data. There are many software packages and programming languages available to help students collect data online, and there are many texts designed to help with different forms of online research, from surveys to ethnographic interviews. But there is no textbook available that teaches students how to construct a viable research project based on online sources of textual data such as newspaper archives, site user comment archives, digitized historical documents, or social media user comment archives. Gabe Ignatow and Rada F. Mihalcea's new text, An Introduction to Text Mining, will be a starting point for undergraduates and first-year graduate students interested in collecting and analyzing textual data from online sources. It will cover the most critical issues that students must take into consideration at all stages of their research projects, including ethical and philosophical issues; issues related to research design; web scraping and crawling; strategic data selection; data sampling; use of specific text analysis methods; and report writing.

Table of Contents

Gabe Ignatow. An Introduction to Text Mining

An Introduction to Text Mining

Brief Contents

Detailed Contents

Acknowledgments

Preface

Note to the Reader

About the Authors

1 Text Mining and Text Analysis. Learning Objectives

Introduction

Predicting the Stock Market With Twitter

Six Approaches to Text Analysis

Conversation Analysis

Analysis of Discourse Positions

Critical Discourse Analysis

Combining Critical Discourse Analysis and Corpus Linguistics

Content Analysis

Foucauldian Analysis

Analysis of Texts as Social Information

Challenges and Limitations of Using Online Data

Social Surveys

Ethnography

Historical Research Methods

Key Terms (see Glossary)

Review Questions

Discussion Questions

2 Acquiring Data. Learning Objectives

Introduction

Online Data Sources

Advantages and Limitations of Online Digital Resources for Social Science Research

Examples of Social Science Research Using Digital Data

Key Term

Discussion Questions

3 Research Ethics. Learning Objectives

Introduction

Respect for Persons, Beneficence, and Justice

Ethical Guidelines

Institutional Review Boards

Privacy

Informed Consent

Manipulation

Publishing Ethics

Scenario 1

Scenario 2

Scenario 3

Key Terms

Review Questions

Discussion Questions

4 The Philosophy and Logic of Text Mining. Learning Objectives

Introduction

Ontological and Epistemological Positions

Correspondence Theory

Coherence Theory

Pragmatism

Constructionism

Critical Realism

Metatheory

Grand Theory and Philosophical Positions

Meso Theory

Models

Substantive Theory

Making Inferences

Inductive Logic

An Inductive Approach to Media Framing

Deductive Logic

Abductive Logic

Key Terms

Discussion Questions

5 Designing Your Research Project. Learning Objectives

Introduction

Critical Decisions

Idiographic and Nomothetic Research

Levels of Analysis

The Textual Level

The Contextual Level

The Sociological Level

Texts as Social Information

Texts as Ideological Products

Qualitative, Quantitative, and Mixed Methods Research

Discourse Analysis

Content Analysis

Mixed Methods

Choosing Data

Data Selection

Data Sampling

Formatting Your Data

Key Terms

Review Questions

Discussion Questions

6 Web Scraping and Crawling. Learning Objectives

Introduction

Web Statistics

Web Crawling

Processing Steps in Web Crawling

Traversal Strategies

Crawler Politeness

Web Scraping

Software for Web Crawling and Scraping

Key Terms

Discussion Questions

7 Lexical Resources. Learning Objectives

Introduction

WordNet

WordNet Domains

WordNet-Affect

Roget’s Thesaurus

Linguistic Inquiry and Word Count

General Inquirer

Wikipedia

Wiktionary

BabelNet

Key Terms

8 Basic Text Processing. Learning Objectives

Introduction

Basic Text Processing. Tokenization

Stop Word Removal

Stemming and Lemmatization

Language Models and Text Statistics. Language Models

Text Statistics

More Advanced Text Processing

Part-of-Speech Tagging

Collocation Identification

Syntactic Parsing

Named Entity Recognition

Word Sense Disambiguation

Word Similarity

Key Terms

Discussion Topics

9 Supervised Learning. Learning Objectives

Introduction

Feature Representation and Weighting

Feature Weighting

Supervised Learning Algorithms

Regression

Decision Trees

Instance-Based Learning

Support Vector Machines

Deep Learning With Neural Networks

Evaluation of Supervised Learning

Key Terms

Discussion Topics

10 Analyzing Narratives. Learning Objectives

Introduction

Approaches to Narrative Analysis

Planning a Narrative Analysis Research Project

Analyzing Relationship Breakups

Qualitative Narrative Analysis

Mixed Methods and Quantitative Narrative Analysis Studies

Key Terms

Review Questions

11 Analyzing Themes. Learning Objectives

Introduction

How to Analyze Themes

Coulson’s Studies of Online Support Groups

Analyzing Climate Change Doubt

Examples of Thematic Analysis

Key Terms

Review Questions

12 Analyzing Metaphors. Learning Objectives

Introduction

Cognitive Metaphor Theory

Approaches to Metaphor Analysis

Qualitative, Quantitative, and Mixed Methods

Qualitative Methods Studies

Metaphors in Leadership Communication

Mixed Methods Studies

Quantitative Methods Studies

Key Terms

Review Questions

13 Text Classification. Learning Objectives

Introduction

What Is Text Classification?

A Brief History of Text Classification

Applications of Text Classification

Topic Classification

E-Mail Spam Detection

Sentiment Analysis/Opinion Mining

Gender Classification

Deception Detection

Other Applications

Approaches to Text Classification

Representing Texts for Supervised Text Classification

Feature Weighting and Selection

Text Classification Algorithms

Naive Bayes

Rocchio Classifier

Bootstrapping in Text Classification

Evaluation of Text Classification

Key Terms

Discussion Topics

14 Opinion Mining. Learning Objectives

Introduction

What Is Opinion Mining?

Studying Mood in the Humanities

Resources for Opinion Mining

Lexicons

Corpora

Eshbaugh-Soha’s Study of Presidential News Coverage

Approaches to Opinion Mining

Hand Coding Sentiment in Media

Key Terms

15 Information Extraction. Learning Objectives

Introduction

Entity Extraction

Relation Extraction

Web Information Extraction

Template Filling

Key Terms

16 Analyzing Topics. Learning Objectives

Introduction

What Are Topic Models?

Comparing the Language of Politicians and the Public

Studying Psychological Adaptation to Extreme Environments

How to Use Topic Models

Examples of Topic Modeling. Digital Humanities

Journalism Research

Political Science

Sociology

Key Terms

Review Questions

17 Writing and Reporting Your Research. Learning Objectives

Introduction: Academic Writing

Evidence and Theory

The Structure of Social Science Research Papers

Introduction

Literature Review

Methods

Results

Discussion

Conclusion

References

Appendices

Key Terms

General Undergraduate Research Journals

Anthropology Undergraduate Research Journals

Political Science Undergraduate Research Journals

Psychology Undergraduate Research Journals

Sociology Undergraduate Research Journals

Appendix A Data Sources for Text Mining. The American Presidency Project

arXiv Bulk Data Access

Category:Dataset

CMU Movie Summary Corpus

Congressional and Federal Government Web Harvests

Congressional Record

Consumer Complaint Database

Corpus of Contemporary American English

DocumentCloud

EBSCO Newspaper Source

GloWbE: Corpus of Global Web-Based English

HathiTrust

Internet Archive

JSTOR for Research

LexisNexis Academic

Observatory on Social Media

OpenLibrary

Public.Resource.Org

PubMed

Robots Reading Vogue

Text Creation Partnership

the @unitedstates project

University of Oxford Text Archive

Yahoo Webscope Program

Appendix B Text Preparation and Cleaning Software

Find and Replace

Regexes

Software

Adobe Acrobat

BBEdit

OpenRefine

TextCleanr

TextPipe

TextSoap

Trifacta Wrangler

UltraEdit

Appendix C General Text Analysis Software

Leximancer

Linguistic Inquiry and Word Count

RapidMiner

TextAnalyst

WordStat

Using TextAnalyst to Study Collective Identity

Appendix D Qualitative Data Analysis Software

Commercial Software. ATLAS.ti

Dedoose

f4analyse

HyperRESEARCH

Kwalitan

MAXQDA

NVivo

QDA Miner

Qualrus

Quirkos

Free and Open Source Qualitative Data Analysis Software

AQUAD

Cassandre

Coding Analysis Toolkit

CATMA

Compendium

FreeQDA

libreQDA

Open Code

QDA Miner Lite

RQDA

Saturate

Text Analysis Markup System

Text Analysis Markup System Analyzer

QDAS Tips

Internet Resources. CAQDAS Networking Project

Loughborough University’s CAQDAS Site

Appendix E Opinion Mining Software

Lexicoder

OpinionFinder

RapidMiner Sentiment Analysis

SAS Sentiment Analysis Studio

Appendix F Concordance and Keyword Frequency Software

Adelaide Text Analysis Tool

AntConc

Simple Concordance Program

TextSTAT

Wmatrix

WordSmith

Appendix G Visualization Software

Word Clouds

Word Trees and Phrase Nets

Matrices and Maps

Internet Resources. The Collaboration Site of Viégas and Wattenberg

“Visualizing the Future of Interaction Studies”

The Word Tree, an Interactive Visual Concordance

Wordle

TagCrowd

Appendix H List of Websites. General Text Mining Websites. The DiRT Directory

Loughborough University’s CAQDAS Site

The National Centre for Text Mining

The QDAS Networking Project

Text Analysis Portal for Research

Social Science Ethics Websites. Ethical Decision-Making and Internet Research: Recommendations From the AoIR Ethics Working Committee

The American Psychological Association Report Psychological Research Online: Opportunities and Challenges

The British Psychological Society’s Ethics Guidelines for Internet-Mediated Research

The Davis–Madsen Ethics Scenarios From the Academy of Management Blog Post “Ethics in Research Scenarios: What Would YOU Do?”

The Ethicist Blog From the Academy of Management

The Office of Research Integrity, U.S. Department of Health and Human Services

Social Science Writing Websites. The Social Science Writing Project

“What Is a Social Science Essay?”

“Becoming a ‘Stylish’ Writer: Attractive Prose Will Not Make You Appear Any Less Smart”

Open Access Journal Articles “Opening up to Big Data: Computer-Assisted Analysis of Textual Data in Social Sciences”

“Hypertextuality, Complexity, Creativity: Using Linguistic Software Tools to Uncover New Information about the Food and Drink of Historic Mayans”

“Text Mining Tools in the Humanities: An Analysis Framework”

“Mapping Texts: Visualizing American Newspapers”

Appendix I Statistical Tools

Reliability Coefficients

Analysis of Variance

Chi-Square Tests

Regression

Glossary

References

Index

Excerpt from the book

Research Design, Data Collection, and Analysis

Last but by no means least, we thank our spouses and children, Neva, Alex, and Sara, and Mihai, Zara, and Caius, for their patience with us and their encouragement over the many years of research, writing, and editing that went into this textbook.

.....

The philosopher and historian Foucault (1973) developed an influential conceptualization of intertextuality that differs significantly from Fairclough’s conceptualization in CDA. Rather than identifying the influence of external discourses within a text, Foucault held that the meaning of a text emerges in reference to the discourses with which it engages in dialogue. These engagements may be explicit or, more often, implicit. In Foucauldian intertextual analysis, the analyst must ask of each text what it presupposes and which discourses it is in dialogue with. The meaning of a text therefore derives from its similarities to and differences from other texts and discourses, and from implicit presuppositions within the text that can be recognized through historically informed close reading.

Foucauldian analysis of texts is performed in many theoretical and applied research fields. For instance, a number of studies have used Foucauldian intertextual analysis to analyze forestry policy (see Winkel, 2012, for an overview). Researchers working in Europe (e.g., Berglund, 2001; Franklin, 2002; Van Herzele, 2006), North America, and developing countries (e.g., Asher & Ojeda, 2009; Mathews, 2005) have used Foucauldian analysis to study policy discourses regarding forest management, forest fires, and corporate responsibility.

.....
