The mathematical analysis of style: A correlation-based approach.

Computers and the Humanities (1988)

Abstract
Mathematical models of style have focused on features which are easily quantifiable and, for computer-aided analysis, easily identifiable by machines. Most such studies are based on frequency of occurrence of word counts, vocabulary items, or grammatical forms.

In this paper, we examine not the number of occurrences of a characteristic, but the pattern in which the characteristic occurs. For example, we hypothesize that the lengths of successive sentences are mathematically correlated and that the length of a sentence can be described, quantitatively, in terms of the lengths of previous sentences.

Autoregressive integrated moving average (ARIMA) models are traditionally used to describe correlated time series data. Under the assumption that the number of words in one sentence is correlated with the number of words per sentence in prior sentences, we develop ARIMA models for series in different works by the same author and comparable works by different authors (James Joyce: portions of Ulysses, and Dubliners; and Ernest Hemingway: portions of In Our Time). Problems of sampling from a literary text are discussed and results presented.

Although the performance of the models in predicting sentence length is only marginally better than using mean sentence length, the potential value of this technique in characterizing stylistic features, especially changes in style from the beginning of a piece to the end, is demonstrated.
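
The idea of treating sentence lengths as a correlated series can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a crude sentence splitter (breaking on ., ! and ?), and it fits only a first-order autoregressive model, AR(1), the simplest special case of the ARIMA family, using the lag-1 autocorrelation as the coefficient estimate. The function names `sentence_lengths`, `autocorrelation`, and `ar1_vs_mean` are hypothetical and chosen for illustration.

```python
import re


def sentence_lengths(text):
    """Return the word count of each sentence in `text`.

    Assumption: sentences end at '.', '!' or '?'. Real literary text
    (dialogue, abbreviations, ellipses) needs a more careful splitter.
    """
    sentences = re.split(r"[.!?]+", text)
    return [len(s.split()) for s in sentences if s.split()]


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive `lag`."""
    n = len(series)
    if n == 0 or lag >= n:
        return 0.0
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        return 0.0
    num = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return num / denom


def ar1_vs_mean(series):
    """Compare in-sample one-step-ahead errors of AR(1) and the mean.

    Returns (phi, mse_ar1, mse_mean), where phi is the Yule-Walker
    estimate of the AR(1) coefficient (the lag-1 autocorrelation).
    Requires at least two observations.
    """
    n = len(series)
    mean = sum(series) / n
    phi = autocorrelation(series, 1)
    ar_errors = [
        (series[t] - (mean + phi * (series[t - 1] - mean))) ** 2
        for t in range(1, n)
    ]
    mean_errors = [(series[t] - mean) ** 2 for t in range(1, n)]
    return phi, sum(ar_errors) / len(ar_errors), sum(mean_errors) / len(mean_errors)


if __name__ == "__main__":
    sample = "He came in. It was late and the rain had stopped. She waited. Nobody spoke for a long time."
    lengths = sentence_lengths(sample)
    print(lengths, ar1_vs_mean(lengths))
```

A value of phi near zero means the previous sentence's length says little about the next one, in which case the AR(1) prediction error will be close to that of the mean, which is the comparison the abstract reports. Fitting a full ARIMA model with differencing and moving-average terms would require a time series library rather than this hand-rolled estimate.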
Keywords
ARIMA models, autocorrelation, Box-Jenkins method, correlation, Ernest Hemingway, James Joyce, literature, sentence length, stylistic analysis