ANALYSIS
As noted already, the purpose of separating the analysis into stages was to allow TG to pass the data in the case to JG in a controlled way. Specifically, in line with the protocol published in Grant (2012) and given the time-series nature of the data, TG began by providing JG with only the two sets of known writings for Debbie and Jamie Starbuck. TG had requested from the police contact that he, too, should not be informed of any particular suspected breakpoint in the data series. In spite of this, the emails were provided to TG in two files of known and disputed emails. To resolve this, TG removed the last few emails from Debbie’s known emails and added them to the disputed set to create a blind test set of emails for JG’s analysis. The advantage of having a second party manage the data access for the primary analyst is that it allows practical issues such as this to be taken out of the hands of the police, who may not fully understand requests to provide data in particular ways that support a robust analysis.
JG analyzed the known writings, primarily by hand, to identify a linguistic feature set that showed pairwise distinctiveness between the two possible authors; that is to say, features were identified that were consistently used by one author but not by the other (Table 2.1). Most notably, because the feature set was fixed before the disputed material was seen, this approach guarded against confirmation bias toward any hypothesis as to who had written the disputed material. This is especially important in the context of careful stylistic analysis of texts in forensic linguistics, which relies almost entirely on the judgment of the analyst, as opposed to quantitative stylometric approaches (e.g., see Grieve, 2007), which generally involve the use of preselected feature sets (e.g., function word frequencies).
Table 2.1 Linguistic Feature Examples
| Feature | Debbie | Jamie |
|---|---|---|
| Sentence length | Long sentences (24 words per sentence average): “I’m now back in Oz, after 5 weeks In NZ—had a good time, though it felt so much more remote than here (guess it is!) and I really felt that, being there.” | Short sentences (10 words per sentence average): “I knew I’d forget something. 2 things in fact.” |
| One-word sentences | No tokens | Occasional use: “Sorry. I thought I’d replied.” |
| Run-on sentences | Relatively common: “Are you enjoying your new car, what is it?” | No tokens |
| Awhile | No tokens | 3 tokens: “Shouldv’e done that awhile ago.” |
| Inserts | Relatively uncommon: “ha ha—you’re entirely responsible for how or where it goes” | Relatively common: “Umm….you haven’t actully apologised for anthing despite your insistence otherwise.” |
| Emoticon usage | No tokens | 9 tokens: “Its gorgeous:) hope you enjoyed your holiday.)” |
One basic distinction between a stylistic approach and a stylometric approach is that the stylistic approach generally involves a data-driven generation of a case-specific feature set, whereas stylometric analysis tends to rely on predesigned feature sets. The strength of one approach can be the weakness of the other in that feature sets arising from stylistic approaches resist generic validation studies but lead to explanation-rich outcomes that are easier to explain to non-specialists like police, lawyers, or juries. In contrast, stylometric features can be validated in independent testing—such that they can be applied consistently by researchers and minimize the need for analysts to rely on their own judgment—but the abstract nature of these analyses can resist informative explanation.
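To make the stylometric side of this distinction concrete, a minimal sketch of a predesigned feature set is given below. The specific function word list and the per-1,000-word normalization are assumptions made for illustration, not the feature set used in this case.

```python
import re
from collections import Counter

# A small, illustrative set of preselected function words (an assumption for
# this sketch, not the feature set used in the Starbuck analysis).
FUNCTION_WORDS = ["the", "a", "and", "of", "to", "in", "that", "it", "i", "you"]

def function_word_frequencies(text: str) -> dict[str, float]:
    """Return the relative frequency (per 1,000 words) of each preselected function word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1
    return {w: 1000 * counts[w] / total for w in FUNCTION_WORDS}

# Usage: compute the same fixed profile for each known writing sample and compare.
# profile_debbie = function_word_frequencies(known_debbie_text)
# profile_jamie = function_word_frequencies(known_jamie_text)
```

Because the feature set is fixed in advance, the same measurement can be applied identically to any pair of writing samples, which is what makes independent validation of this kind of approach possible.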
The feature set provided by this initial stage of JG’s analysis in the Starbuck case was then provided to TG and also sent to the police. The purpose was to provide an evidence trail that the feature set had been “locked” prior to JG receiving the disputed material. This step in itself did not strengthen the mitigation of confirmation bias, but it did strengthen the robustness of the analysis as an evidential product.
JG identified a wide range of linguistic forms that distinguished between the styles of these two authors: features that were predominantly used by one or the other in the known sets of emails. These features were identified through a combination of close manual stylistic analysis and computational analysis, the latter giving rise to some stylometric features.
The stylistic approach involved carefully reading the texts and identifying apparent differences in their style. Both sets of texts were read many times over to identify seemingly distinctive linguistic forms used relatively consistently across each author’s own writings. The texts were read both without constraint and with a focus on specific levels of linguistic analysis: word choice, punctuation, spelling, sentence grammar, and discourse structure. Lists of unusual and distinctive features were compiled for both authors and then compared, and a list of potentially distinctive features was produced.
This procedure, the standard procedure in forensic authorship analysis, is far from perfect. It depends on the expertise of the analyst, is based primarily on positive evidence, and is not replicable. Two analysts may honestly arrive at two different subsets of linguistic features and, without attention to the broader design of the analysis, this may introduce bias and unreliability. The main advantage of manual feature selection is that it seems to be better at identifying the most unusual and thus, potentially, the most distinctive features. In contrast to a computational approach to feature selection, a human reader is especially good at spotting entirely new feature types that may never have been considered before. Furthermore, once these features are identified, the writing samples can then be searched both by hand and computationally to ensure that all occurrences of these forms have been extracted. In stylistic authorship analysis, feature selection is manual, but feature counting need not be and, where possible, should not be.
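A minimal sketch of what such computational counting might look like is given below. The patterns are illustrative assumptions loosely drawn from Table 2.1, not the full set of forms identified in the case.

```python
import re

# Illustrative patterns for a few manually identified forms (assumptions for
# this sketch; the actual case analysis identified 51 feature types).
FEATURE_PATTERNS = {
    "awhile": re.compile(r"\bawhile\b", re.IGNORECASE),
    "emoticon": re.compile(r"[:;]-?[)(]"),
    "umm_insert": re.compile(r"\bumm+\b", re.IGNORECASE),
}

def count_features(emails: list[str]) -> dict[str, int]:
    """Count every occurrence of each manually identified form across a set of emails."""
    return {
        name: sum(len(pattern.findall(email)) for email in emails)
        for name, pattern in FEATURE_PATTERNS.items()
    }

# Usage: run the same counts over Debbie's known emails, Jamie's known emails,
# and the disputed emails, so that no occurrence is missed by hand.
```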
JG also took a second approach, conducting a stylometric, computational analysis for feature selection. The basic idea in such an analysis is to compare the frequencies of a range of well-established feature types or other textual measurements across the possible authors’ writing samples, such as the relative frequency of word, character, and part-of-speech n-grams.5 The main advantage of this approach is that it does not require the expertise or attention of the analyst, allowing many more features to be analyzed; it can identify features that might otherwise be missed, or whose relative use is real and consistently different between the authors but not sufficiently distinctive to be identified by hand. It is also replicable and should give us far more confidence that we are looking at an unbiased feature set. Such an approach is not generally taken in forensic authorship analysis for two reasons: the courts are generally interested in the expertise and explanation of the linguist and in the presence of categorical patterns; and the texts and comparison texts are generally too short to warrant quantitative analysis.
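A sketch of how such a frequency comparison might be set up, here for character n-grams only, is given below. The n-gram size, the normalization, and the restriction to the most frequent items are assumptions made for illustration rather than the settings used in the case.

```python
from collections import Counter

def char_ngram_profile(text: str, n: int = 3, top_k: int = 50) -> dict[str, float]:
    """Relative frequencies of the top_k most frequent character n-grams in a text."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values()) or 1
    return {g: c / total for g, c in grams.most_common(top_k)}

def profile_difference(p1: dict[str, float], p2: dict[str, float]) -> dict[str, float]:
    """Absolute difference in relative frequency for every n-gram in either profile."""
    return {g: abs(p1.get(g, 0.0) - p2.get(g, 0.0)) for g in set(p1) | set(p2)}

# Usage: the n-grams with the largest differences between the two known samples
# are candidate distinguishing features, to be inspected and explained by the analyst.
```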
In total, JG identified 51 different feature types that appeared to distinguish between the possible writings of Debbie and Jamie Starbuck, which were informally classified as belonging to nine levels of analysis:
Text level (average text length, common email openings and closings)
Paragraph level (average paragraph length, common paragraph initial words)
Sentence level (average sentence length, common sentence initial words)
Phrase level (common two-word n-grams, common three-word n-grams)
Word level (average word length, common function words)
Abbreviations, acronyms, and emoticons (common text messaging acronyms, common emoticons)
Contractions (common standard contracted forms, common nonstandard contracted forms)
Spelling and case (common spelling errors, repetition of letters for emphasis)
Punctuation (common use of exclamation marks, nonstandard semicolon usage)
Some of these feature types consist of a single measurement (e.g., average word length in characters or the spelling of “a lot” as one or two words), whereas others consist of a large number of individual features (e.g., the frequency of common function words or of words commonly used in sentence-initial position). Examples of these features are provided in Table 2.1. In addition, JG recorded general holistic impressions of the two authors; for example, he found Debbie’s style to be more narrative and informal than Jamie’s.
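For the single-measurement feature types, the calculations are simple enough to sketch. The tokenization and sentence-splitting choices below are assumptions for illustration and are not taken from the case analysis.

```python
import re
from collections import Counter

def simple_measurements(text: str) -> dict:
    """Compute a few of the single-measurement feature types listed above (illustrative only)."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    return {
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "avg_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
        "sentence_initial_words": Counter(s.split()[0].lower() for s in sentences if s.split()),
    }
```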
We believe that combining the stylistic and the stylometric is the best way to arrive at a meaningful, reliable, and robust feature set. Both approaches can miss important features: the computational approach is poor at identifying the highly idiosyncratic features that are often the most meaningful, whereas the manual approach is poor at identifying the subtle patterns that are often the most robust. Using a mixed approach allows us to look at a broad range of evidence and should give more confidence that we have not missed important evidence. That said, it is important to remember that almost every feature set is subject to selection bias. No stylometric analysis uses all possible stylometric features, and no two linguists will produce identical hand analyses. Furthermore, analyses of the same feature set will not necessarily align: there is generally more than one way to count a feature, as the following examples illustrate.
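As a simple illustration of that last point, even a single feature such as emoticon use can be counted under more than one convention. The two conventions sketched below, a rate per 1,000 words and the proportion of emails containing the feature, are hypothetical and are not claimed to be the conventions used in the case.

```python
import re

EMOTICON = re.compile(r"[:;]-?[)(]")  # illustrative pattern only

def emoticons_per_thousand_words(emails: list[str]) -> float:
    """Convention 1: total emoticon tokens, normalized per 1,000 words."""
    tokens = sum(len(re.findall(r"\S+", e)) for e in emails)
    hits = sum(len(EMOTICON.findall(e)) for e in emails)
    return 1000 * hits / tokens if tokens else 0.0

def share_of_emails_with_emoticon(emails: list[str]) -> float:
    """Convention 2: proportion of emails containing at least one emoticon."""
    return sum(bool(EMOTICON.search(e)) for e in emails) / len(emails) if emails else 0.0
```

The two conventions can rank the same pair of writing samples differently, which is why the counting convention has to be fixed and reported alongside the feature itself.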