Showing posts with label statistics. Show all posts

Tuesday, 10 August 2021

Perhaps I'm asking a question?

I've just finished reading a book called 'Statistics without tears' by Derek Rowntree. It's a basic tutorial on statistical concepts focussing on the ideas and principles, rather than walking through actual calculations in any detail. I found it useful and would recommend it. But I'm here to talk about language, not statistics!

The book is written in what I would describe as a very 'careful' style. You know it - the way older academic writers tend to write, with quite precise attention to punctuation. Even though the tone of this book was very informal, friendly and not at all stuffy, I felt that every colon and dash was considered.

So it was interesting to me that both times Rowntree used a sentence in the form Perhaps you recall..., he ended it with a question mark: Perhaps you recall the idea of a confidence interval? (p.183). I've had a quick look around the internet and can't find much on this topic other than a few sites peeving about the use of a question mark with perhaps, saying that it is not necessary and therefore wrong. There are people asking about it in English forums, indicating that it's something that might feel natural. 

It seems likely, then, that it's a 'declarative question' - the same as if he'd written You recall the idea of a confidence interval? These are common enough, though I would say they are definitely a feature of less formal writing, just as contractions like I'll or don't are, which Rowntree also uses throughout. But it is interesting that he doesn't use this form - he uses perhaps. The question mark itself is enough to let the reader see that it's a question, and therefore to know that they are not being told that they do recall the idea, but rather prompted to agree that yes, they do recall it. So perhaps adds a bit more prompting, a bit more questioning, a bit more possibility of you not in fact recalling the idea of a confidence interval but that's absolutely fine because it was a few chapters ago and it's complicated stuff so don't worry.

Thursday, 12 February 2015

Statistics

It's more and more common for linguists of all types to use quantitative methods in their research. This used to be something that only certain people did, because it was the nature of the method/subject matter etc. Now I increasingly get the feeling that those who don't are seen by some people as somehow not doing work that is as valid. I'm still pretty well in the theoretical linguistics camp (which doesn't mean we don't use data, interestingly, but it's not quantitative data). This means that my ability to wrangle statistical packages and interpret complex facts is close to nonexistent, but even I could spot some clangers in a recent episode of More Or Less (a BBC World Service programme).

First, there was an item about the apparent rapid increase in antisemitic attacks. The organisation Campaign Against Antisemitism had carried out a survey which revealed a worryingly high rate of British Jewish people being concerned about their long-term future in this country. It's not in question that there is antisemitism to some extent, but the presenter, Tim, noted that it's hard to sample the Jewish population in a fully representative way in this country. In response to Tim asking the reasonable question 'How do you know your respondents aren't disproportionately worried about antisemitism?', the spokesman for CAA said 'If you look at the results, they represent a range of views'. Well. Maybe so, but I think it's quite obvious that you can't judge how representative your sample is just from the responses of your own sample, if you don't have anything to compare it to.

Then there was an item in which someone (I think a Manchester police spokesperson but I could be wrong) talked about 60 men found in canals over the last few years and put this high number of deaths down to an as yet unidentified killer. The programme's researchers looked into how many deaths from accidental drownings one might expect over a similar period. When this chap was told that one would expect 61 accidental drownings, he said this: 'You can't ignore the statistics - well if you want to ignore the statistics...' and went on to speculate further about these deaths being linked. But it's him who is ignoring the statistics, in this case, and speculating on the basis of misleading numbers.

I find More Or Less and similar 'behind the numbers' things really interesting, because I'm fascinated by how easy it is to confuse ourselves and others with statistics. I remember one particular example from Bang Goes The Theory where Dr Yan demonstrated (with bacon sandwiches) how nearly everyone fails to spot that 'bacon increases your risk of bowel cancer by 20%' and 'bacon increases your risk of bowel cancer from 5% to 6%' are making exactly the same claim. We are apparently very bad at this kind of thing.
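For anyone who wants to check the bacon arithmetic, here is a minimal sketch in Python. The 5% and 6% figures are the ones quoted above; the function name is my own invention, not anything from the programme.

```python
def describe_risk_change(baseline, new):
    """Return (absolute, relative) change between two risks given as fractions."""
    absolute = new - baseline        # change in percentage-point terms (as a fraction)
    relative = absolute / baseline   # change as a proportion of the starting risk
    return absolute, relative

# Figures quoted in the post: risk goes from 5% to 6%.
absolute, relative = describe_risk_change(0.05, 0.06)

print(f"absolute increase: {absolute * 100:.0f} percentage point")
print(f"relative increase: {relative * 100:.0f}%")
```

The 'one percentage point' and the '20%' are the same change described two ways: the first is measured against everyone, the second against the risk you started with.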

Tuesday, 24 July 2012

SAT visitors: sorry

I'm getting a lot of extra page views and a lot of them are for the post on SATs, grammar and British grandmas. I presume that means that SAT time is near and people are googling for advice. If that's how you found me, then I apologise - that post was almost certainly no use whatsoever. I hope it mildly entertained you as compensation.

This post will now also come up in searches and it's even less useful, and not even entertaining. Sorry again.

Friday, 17 February 2012

Greenberg's diversity index

Well, this is interesting. It's an article from the Economist, showing language diversity in several countries. It gives the probability that two people selected at random from any one country have different mother tongues. So in Papua New Guinea, two people are almost certain to speak different languages, whereas in North Korea, they will definitely both speak the same language.



The UK isn't on this list, but the fuller list at Ethnologue gives us a probability value of 0.133, so around the same as Mexico or Australia. However, we have many fewer indigenous languages spoken than either of them (Ethnologue says 12; click here if you want to know what they are) so it looks like we must have more speakers of our minority languages than they do.
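To get a feel for how an index like this behaves, here is a rough sketch in Python. It assumes the usual form of Greenberg's index, one minus the sum of the squared population shares of each mother tongue, which is my understanding of the calculation rather than something stated in the article. The speaker counts are hypothetical, made up purely for illustration, and do not describe any real country.

```python
def diversity_index(speaker_counts):
    """Probability that two randomly chosen people have different mother tongues.

    Assumes the form 1 - sum(p_i ** 2), where p_i is the share of the
    population whose mother tongue is language i.
    """
    total = sum(speaker_counts)
    if total <= 0:
        raise ValueError("need at least one speaker")
    return 1 - sum((count / total) ** 2 for count in speaker_counts)

# Hypothetical populations, one count per language (not real data).
one_language = diversity_index([1000])                # everyone shares a language
even_split = diversity_index([250, 250, 250, 250])    # four equally sized languages
dominant = diversity_index([930, 40, 30])             # one big language, two small ones

print(one_language, even_split, round(dominant, 4))
```

The point of the third example is that a country can have several languages and still score low, as long as one of them accounts for most of the speakers.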