[PLing] Invitation to Book Launch "Statistics in Corpus Linguistics Research" by Sean Wallis

Thu Feb 24 10:12:30 CET 2022

**Apologies for double posting**


Dear colleagues,

You are cordially invited to a presentation by Sean Wallis (Survey of 
English Usage, UCL), in which he will talk about his book /Statistics in 
Corpus Linguistics Research: A New Approach/ (Routledge 2021).

The presentation will be on *17 March 2022, 15:00* and will take place 
online. A link will be sent around a few days before the event.

Below you will find the abstract of the talk, as well as some 
information about the book.

Best wishes,

Evelien Keizer (University of Vienna) & Gunther Kaltenböck (University 
of Graz)


*Book Launch: /Statistics in Corpus Linguistics Research/ (Routledge 
2021) - Sean Wallis, Survey of English Usage*

*Abstract:*

Why do people find 'statistics' difficult, and what can we do about 
this? What are the best methods to use in linguistics, and are there 
specific problems we must address when we apply statistical methods to 
corpora?

In his new book, Sean Wallis argues there are several reasons why we 
find statistical reasoning counter-intuitive. Probably the most 
fundamental is that we do not "see" sampling uncertainty: to observe it, 
we would have to count many events, which is often an impossible task. But with a 
computer we can calculate and visualise uncertainty on the same scale as 
an observed factor, which is what /confidence intervals/ do. Whereas 
traditional approaches to confidence intervals were inconsistent with 
statistical testing and sometimes produced improbable results, modern 
methods do not suffer from these defects and may be extended to a wide 
range of testing environments.
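
To make the contrast concrete, here is a minimal sketch in Python. The abstract does not name the methods it has in mind; this example assumes that the "traditional" approach is the normal-approximation (Wald) interval and that the "modern" one is the Wilson score interval. The function names and the observation of 1 hit in 10 cases are hypothetical, chosen only for illustration.

```python
import math

Z_95 = 1.959964  # two-tailed critical value of the normal distribution at 95%


def wald_interval(p, n, z=Z_95):
    """Normal-approximation (Wald) interval, centred on the observed proportion p."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def wilson_interval(p, n, z=Z_95):
    """Wilson score interval for an observed proportion p out of n cases."""
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical observation (not from the book): 1 hit in 10 cases.
p, n = 1 / 10, 10
print("Wald:  ", wald_interval(p, n))
print("Wilson:", wilson_interval(p, n))
```

With a small sample and a proportion close to zero, the Wald lower bound drops below 0, which cannot be a probability, whereas the Wilson interval stays within the range from 0 to 1.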

Applying these methods to corpus linguistics requires us to address a 
number of challenges and traditions. For example, conventionally, many 
statistical approaches accepted linguistic variables with per (million) 
word baselines. Yet these are clearly suboptimal, as most phenomena can 
only occur in specific locations in a text. This is fundamentally a 
linguistic analysis problem, which must be addressed through good 
research design, well-considered queries and a careful review of data.
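
The effect of the baseline can be sketched with a small calculation in Python. All counts and names below are invented for illustration and do not come from the book: two texts of equal length differ in how often the phenomenon could have occurred at all (the "opportunities").

```python
def per_million_words(hits, words):
    """Frequency of a form normalised per million words."""
    return hits / words * 1_000_000


def choice_rate(hits, opportunities):
    """Proportion of the places where the form could occur in which it does occur."""
    return hits / opportunities


# Hypothetical counts for two texts of equal length.
texts = {
    "text_a": {"words": 100_000, "opportunities": 200, "hits": 50},
    "text_b": {"words": 100_000, "opportunities": 400, "hits": 100},
}

for name, c in texts.items():
    pmw = per_million_words(c["hits"], c["words"])
    rate = choice_rate(c["hits"], c["opportunities"])
    print(f"{name}: {pmw:.0f} per million words, choice rate {rate:.2f}")
```

In this made-up case the per-million-word figure doubles from one text to the other, yet the form is chosen in the same proportion of opportunities in both. The apparent difference comes entirely from how often the opportunity arises.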

Other problems tackled in the book include questions of semasiological 
analysis, how to engage in statistical argument to reduce research 
workload, and how to compensate for the fact that corpora are random 
samples of texts rather than random samples of independent utterances, 
clauses or phrases.

*From the jacket:*

Traditional approaches to statistics focused on significance tests have 
often been difficult for linguistics researchers to visualise. 
/Statistics in Corpus Linguistics Research: A New Approach/ breaks these 
significance tests down for researchers in corpus linguistics and 
linguistic analysis, promoting a visual approach to understanding the 
performance of tests with real data, and demonstrating how to derive new 
intervals and tests.

Accessibly written for those with little to no mathematical or 
statistical background, this book explains the mathematical fundamentals 
of simple significance tests by relating them to confidence intervals. 
With sample datasets and easy-to-read visuals, this book focuses on 
practical issues, such as how to:
• pose research questions in terms of choice and constraint;
• employ confidence intervals correctly (including in graph plots);
• select optimal significance tests (and what results mean);
• measure the size of the effect of one variable on another;
• estimate the similarity of distribution patterns; and
• evaluate whether the results of two experiments significantly differ.

Appropriate for anyone from the student just beginning their career to 
the seasoned researcher, this book is both a practical overview and 
valuable resource.

-- 
Univ.-Prof. Dr. Evelien Keizer
Institut für Anglistik und Amerikanistik / Department of English
Universität Wien
Campus d. Universität Wien
Spitalgasse 2-4/Hof 8.3
1090 Wien
Austria

Homepage: https://anglistik.univie.ac.at/staff/staff/keizer/