"Visualizing the story plot"
Sergey Potemkin
Philological faculty, Moscow State University
potemkin@philol.msu.ru
Visual representations of linguistic data, such as parse trees, semantic relationships and statistical diagrams, are widely used by linguists because they are useful and flexible [Chris Culy]. Text as a whole, however, poses a number of challenges for visualization because it differs significantly from other types of linguistic data. This paper presents a novel approach to the visual representation of fiction and non-fiction texts, based on detecting the events they contain. Formal detection of events in a narrative relies on the text itself, without considering how a reader perceives it. Our hypothesis is as follows: the description of a change of state is characterized by the number of antonym pairs in which one word occurs to the left of the sentence under consideration and the other occurs to the right of it. The article examines this hypothesis and the results obtained.

The results are presented in graphical form on the X-Y plane. Each point of the graph has coordinates (x, y), where x = Ns is the serial number of the sentence, counted from the beginning of the story, and y is the number of antonym pairs in the whole story that have one member to the left of sentence Ns and the other to the right of it. The resulting graph of antonyms is then smoothed and interpreted. On the graph, the user can mark a sentence at which the number of antonyms changes significantly, read the sentence itself, and move the marker along the curve.

The article presents the results of calculations for Charles Darwin’s AUTOBIOGRAPHY [Darwin] (22000 words). The first maximum in the number of antonyms corresponds to the event when the boy left school. The second corresponds to what Darwin mentioned as one of his best achievements. The third concerns his disagreement with some scholars. The fourth reflects his satisfaction with the appreciation of his work. The fifth, the highest one, is connected with his life credo.
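The antonym count y(Ns) defined above can be illustrated with a minimal sketch. It is not the implementation used in the paper. It assumes that the sentences are already tokenized, that an antonym dictionary is supplied by the caller as a list of word pairs, that each pair is counted once regardless of how often its members occur, and that the sentence Ns itself belongs to neither side. The function names `straddling_antonym_counts` and `moving_average` are hypothetical, and the moving average is only one possible smoothing, since the smoothing method is not specified here.

```python
from typing import List, Sequence, Tuple


def straddling_antonym_counts(
    sentences: Sequence[Sequence[str]],
    antonym_pairs: Sequence[Tuple[str, str]],
) -> List[int]:
    """For each sentence i, count antonym pairs with one member in the
    sentences before i and the other member in the sentences after i.

    Assumption: a pair is counted once per sentence position, no matter
    how many times its members occur on each side.
    """
    n = len(sentences)
    vocab = [set(s) for s in sentences]

    # left[i]: words in sentences 0..i-1; right[i]: words in sentences i+1..n-1
    left = [set() for _ in range(n)]
    right = [set() for _ in range(n)]
    for i in range(1, n):
        left[i] = left[i - 1] | vocab[i - 1]
    for i in range(n - 2, -1, -1):
        right[i] = right[i + 1] | vocab[i + 1]

    counts = []
    for i in range(n):
        c = 0
        for a, b in antonym_pairs:
            if (a in left[i] and b in right[i]) or (b in left[i] and a in right[i]):
                c += 1
        counts.append(c)
    return counts


def moving_average(values: Sequence[float], window: int = 3) -> List[float]:
    """Centered moving average; the window shrinks at both ends of the series."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


# Toy example (invented data, not taken from the Autobiography).
sentences = [
    ["the", "boy", "was", "poor"],
    ["he", "left", "school"],
    ["later", "he", "became", "rich"],
]
antonym_pairs = [("poor", "rich"), ("left", "stayed")]

counts = straddling_antonym_counts(sentences, antonym_pairs)
print(counts)  # [0, 1, 0]
print(moving_average(counts))
```

In the toy example only the middle sentence has the pair poor/rich on opposite sides, so the curve peaks there. Plotting the smoothed values against the sentence number would give a graph of the kind described above.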
Compared with pure fiction, the main differences are as follows. A fiction story should comprise Exposition, Rising action, Crisis, Climax, Falling action and Denouement. In contrast, the non-fiction story has a very short exposition (or the whole story is a kind of exposition), a number of Climaxes connected with the important events of the author's life, and no Denouement (the denouement would be the death of the character, which is impossible in an autobiography). However, a large work of fiction, e.g. a novel, can also show a number of maxima on the graph of antonyms.

References
[Chris Culy] Some Challenges and Directions for the Visualization of Language and Linguistic Data. http://avml-meeting.com/keynote-speakers/
[Darwin] The Autobiography of Charles Darwin, from The Life and Letters of Charles Darwin. http://manybooks.net/titles/darwinchetext99adrwn10.html