An Introduction to Text Mining - Gabe Ignatow

Inductive Logic

Inductive logic involves making inferences that take data as their starting point and work upward to theoretical generalizations and propositions. Researchers begin by analyzing empirical data with their preferred tools and then allow general conclusions to emerge from their analysis (see Figure 4.1). The first ethical scenario in Chapter 3 is an example of a researcher relying exclusively on induction.

When qualitatively oriented researchers use inductive logic, they often position their research as grounded theory (Glaser & Strauss, 1967), while more quantitatively oriented researchers refer to data mining. Both grounded theory and data mining are used extensively in text mining research.


Figure 4.1 ∎ Inductive Logic

Inductive inference is attractive to social scientists for several reasons. First, it allows them to work with data sets and specialized tools quickly, without having to invest time in mastering abstruse philosophical debates and theories or setting up complex research designs. Second, it allows for great flexibility and adaptability, as analysts can let their data speak to them and adjust their conclusions accordingly rather than imposing a priori categories and concepts onto data in an artificial manner. Third, inductive research designs allow quantitatively oriented researchers, in particular, to make immediate use of the very large data sets and powerful software and programming languages at their disposal.

In its purer forms, induction has some serious drawbacks. First, it encourages analysts to begin research projects without first formulating a research question. Researchers simply assume that the project’s purpose will become evident during its analysis phase. But there is a very real risk that this simply will not happen, and the researcher will have invested significant time and resources in a directionless and perhaps purposeless project.

Another drawback of purely inductive research is that it can encourage researcher passivity with regard to mastering the research literatures in their areas of interest. Rather than mastering the work that has been done by others so as to identify gaps in knowledge, unsolved puzzles, or critical disagreements and then designing a study to address one or several of these, induction encourages researchers to skip straight to data collection and analysis and then work backward from their findings to the pertinent gaps in the literature, puzzles, and disagreements. In practice, this is often a high-risk strategy.

Although relying exclusively on inductive inferential logic is a risky and sometimes dangerous strategy, induction does end up playing a role in most text mining research projects. The complexity of natural language data demands that researchers allow their data to alter their theoretical models and frameworks rather than forcing data to conform to their preferred theories.

An example of a text mining study with a research design based on inductive logic is Frith and Gleeson’s (2004) thematic analysis of male undergraduate students’ responses to open-ended survey items related to clothing and body image. The undergraduates in the study were recruited through snowball sampling (see Chapter 5). In order to better understand how men’s feelings about their bodies influence their clothing practices, Frith and Gleeson analyzed written answers to four questions about clothing practices and body image and identified four main themes relevant to their research question: men value practicality, men should not care how they look, clothes are used to conceal or reveal, and clothes are used to fit a cultural ideal.

A second example of an inductive text mining study is Jones, Coviello, and Tang’s (2011) study of academic research on international entrepreneurship. Jones, Coviello, and Tang constructed a corpus from 323 journal articles on international entrepreneurship published between 1989 and 2009 and then inductively synthesized and categorized themes and subthemes in their data.
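
To make the data-first direction of inductive work concrete, the following is a minimal Python sketch of one possible starting point: surfacing candidate themes from a handful of texts without defining any categories in advance. It is not the procedure used in either study above, both of which relied on human reading and interpretation. The responses, the stopword list, the function names, and the `min_docs` threshold are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical open-ended survey responses, invented for illustration only.
responses = [
    "I wear loose clothes to hide my stomach",
    "Comfortable clothes matter more than fashion",
    "I pick clothes that hide my thin arms",
    "Comfortable and practical, that is all I want",
]

# A small hand-picked stopword list (an assumption; real projects use fuller lists).
STOPWORDS = {"i", "to", "my", "that", "is", "all", "and", "more", "than", "the", "a"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def candidate_themes(docs, min_docs=2):
    """Return (term, document count) pairs for terms found in at least
    min_docs documents, most widespread first, ties broken alphabetically."""
    doc_freq = Counter()
    for doc in docs:
        # Count each term once per document so long answers do not dominate.
        doc_freq.update(set(tokenize(doc)))
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(term, n) for term, n in ranked if n >= min_docs]


themes = candidate_themes(responses)
print(themes)
```

No categories are specified ahead of time here; the recurring terms come from the texts themselves. Such a list is only raw material. Deciding whether "hide" and "comfortable" point to meaningful themes still requires the researcher to read the responses and interpret them.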
