Corpus Analysis in Modern Fiction Study
Dive into the methods and tools for analyzing modern fiction through a corpus-based approach. Explore statistical analyses, linguistic tools like WordSmith and Wmatrix, and investigate semantic domains for a comprehensive understanding of modernity in literature.
Ways of searching for the Zeitgeist of Modernity: a corpus-based approach to modern fiction
Ilina Doykova, Shumen University, Shumen (Bulgaria), ilina.doykova@abv.bg
Statistical analysis
Simple things may characterise different styles: average sentence length, average word length, vocabulary richness, and vocabulary growth (homogeneity of text).
More complex analyses give a more interesting picture: specific syntactic structures; degree of modification in NPs; types of verbs (e.g. verbs of persuasion, speech verbs, action verbs, descriptive verbs); distribution of pronouns (1st/2nd/3rd person); themes, beliefs, etc.; authorship.
These measures are especially informative when used comparatively.
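The simple measures named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the function name `style_profile`, the naive sentence splitter, and the use of the type-token ratio as the measure of vocabulary richness are all assumptions, and the sample sentence is invented.

```python
import re

def style_profile(text):
    """Return simple style measures for a plain-text sample."""
    # Naive sentence split on ., ! or ? followed by whitespace (an assumption;
    # abbreviations such as "Mr." would be split incorrectly).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Words are runs of letters, optionally with internal apostrophes.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    if not sentences or not words:
        raise ValueError("text contains no words")
    return {
        "sentences": len(sentences),
        "tokens": len(words),
        "avg_sentence_length": len(words) / len(sentences),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Type-token ratio: distinct word forms divided by running words.
        "type_token_ratio": len(set(words)) / len(words),
    }

# Invented sample text, for illustration only.
sample = "She was alone. She was not afraid of the dark house."
profile = style_profile(sample)
for name, value in profile.items():
    print(f"{name}: {value:.2f}")
```

The type-token ratio falls as texts get longer, so it is only comparable across samples of similar size, which is one reason such figures are best read comparatively.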
Linguistic Tools: WordSmith and Wmatrix
Useful features:
+ Tagging = identifies and labels PoS
+ WordList = generates word-frequency lists
+ Concordance (KWIC, key word in context) = lists occurrences of a word in context and its immediate environment, gives access to collocates; used to identify the syntactic use of a word, its range of meanings, and the relative frequency of its different uses/meanings
+ KeyWords = identification of key words through a comparison with a reference corpus
+ Semantic tagging = semantic tagsets in 21 domains
+ Word Clouds
Listings can be customised to show what you want more clearly: sort according to next or previous word, show more or less context, highlight important information.
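To make the concordance idea concrete, here is a small Python sketch of a key-word-in-context listing that can be sorted by the word following the node. It does not reproduce WordSmith or Wmatrix; the function names `concordance` and `sort_by_next_word`, the tokenizer, and the sample text are hypothetical.

```python
import re

def concordance(text, node, width=3):
    """Return (left, node, right) token contexts for every hit of `node`."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    hits = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - width):i]     # up to `width` words before
            right = tokens[i + 1:i + 1 + width]    # up to `width` words after
            hits.append((left, tok, right))
    return hits

def sort_by_next_word(hits):
    # Sort on the first word to the right of the node; a hit at the very end
    # of the text has no right context and therefore sorts first.
    return sorted(hits, key=lambda hit: hit[2][:1])

# Invented sample text, for illustration only.
text = "She sat alone in the room. He walked alone at night. Alone again, she wept."
hits = sort_by_next_word(concordance(text, "alone"))
for left, node, right in hits:
    print(f"{' '.join(left):>20}  [{node}]  {' '.join(right)}")
```

Changing `width` shows more or less context, and sorting on `hit[0][-1:]` instead would order the lines by the previous word.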
Methodology: word frequency list (Wmatrix)
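A word frequency list of the kind shown on this slide can be approximated as follows. This is a sketch under simple assumptions (lower-casing, a letters-only tokenizer, no lemmatisation); `word_list` is a hypothetical helper and the sample text is invented.

```python
import re
from collections import Counter

def word_list(text, top=5):
    """Return (word, count, percent of all tokens) for the most frequent words."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    total = len(tokens)
    counts = Counter(tokens)
    # Rank by descending frequency, then alphabetically so ties are stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(word, n, 100 * n / total) for word, n in ranked[:top]]

# Invented sample text, for illustration only.
text = "The house was quiet. The garden was quiet and the street was empty."
rows = word_list(text, top=3)
for word, n, pct in rows:
    print(f"{word:<10}{n:>4}{pct:>8.2f}%")
```

Reporting the percentage alongside the raw count makes lists from corpora of different sizes easier to compare.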
WordSmith frequency list of predicative adjectives, Modern British Women Fiction Writers (MBWFW) Corpus
Key words list and dispersion plot (ALONE in the MBWFW corpus)
Consistency analysis indicates whether a word is found consistently across many different texts, only in a narrow set of texts, or in one specific text.
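The two ideas on this slide, consistency across texts and dispersion within a text, can be illustrated with a short Python sketch. The functions `consistency` and `dispersion` and the three toy "stories" are hypothetical and stand in for a real corpus; the sketch does not reproduce WordSmith's own plots.

```python
import re

def tokenize(text):
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def consistency(corpus, word):
    """Names of the texts in which `word` occurs at least once."""
    return [name for name, text in corpus.items() if word in tokenize(text)]

def dispersion(text, word):
    """Relative position of each hit in one text (0.0 = start, 1.0 = end)."""
    tokens = tokenize(text)
    last = max(len(tokens) - 1, 1)  # avoid dividing by zero for one-word texts
    return [i / last for i, tok in enumerate(tokens) if tok == word]

# Toy corpus with invented texts, for illustration only.
corpus = {
    "story_a": "Alone she waited. Nobody came and she stayed alone",
    "story_b": "The market was crowded and loud",
    "story_c": "He felt alone in the crowd",
}

found_in = consistency(corpus, "alone")
print(f"'alone' occurs in {len(found_in)} of {len(corpus)} texts: {found_in}")
for name in found_in:
    print(name, [round(p, 2) for p in dispersion(corpus[name], "alone")])
```

A word that appears in most texts and at scattered positions is consistent; one confined to a single text, or clustered in one stretch of it, is local to that text or episode.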
Lemmatized results for relational pairs (WordSmith and Wmatrix)
Investigation of semantic domains through semantic tagging (Wmatrix)
Key domain clouds (Wmatrix only)
The larger the word, the greater its keyness, or uniqueness, compared with the BNC Written Sampler of imaginative texts.
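Keyness of this kind is typically scored by comparing an item's frequency in the study corpus with its frequency in a reference corpus. The sketch below assumes the log-likelihood statistic, which is widely used for keyword comparison; the function name `log_likelihood` and all counts are invented for illustration and are not figures from the study.

```python
import math

def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score.

    a = frequency of the item in the study corpus, b = frequency in the
    reference corpus, c = size of the study corpus, d = size of the reference.
    """
    # Expected frequencies if the item were equally common in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:  # a zero count contributes nothing (0 * ln 0 is taken as 0)
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

# Invented counts: an item seen 50 times in 1,000 study tokens
# and 10 times in 1,000 reference tokens.
score = log_likelihood(50, 10, 1000, 1000)
overused = 50 / 1000 > 10 / 1000
print(f"LL = {score:.2f}, overused in study corpus: {overused}")
```

A score above 3.84 corresponds to p < 0.05 at one degree of freedom; the score itself does not say whether the item is over- or under-used, so the relative frequencies are compared separately.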
Research and language learning
+ Word frequency: knowledge in present-day language textbooks (grammatical, collocational, semantic) is frequency-based
+ Real usage: corpora represent actual, not prescribed usage
+ Translation: find the best equivalent
+ Grammar: investigate word classes, specific syntactic structures
+ Teaching collocations: "trouble and strife", "the elephant in the room", "blue murder"
+ Decoding specific content (sexist, racist or ideological, etc.)
+ Authorship: identification of true authorship
+ Analysis of texts written in any language and any alphabet