Friday, 20 July 2012

'Informant incompetence'

I can't now remember where I heard the phrase 'informant incompetence', but it's a slightly cruel way of describing a perennial problem in linguistics (and presumably other disciplines too): when the people giving you linguistic data simply fail to understand what you want from them.

What we want from informants is almost invariably very simple, just straightforward production of the language they use every day. The reason I say it's cruel to call it incompetence when they get it wrong is that it's unfair to expect non-linguists to understand what we want when we don't (can't) explain to them why we want it and what we're going to do with it. But we can't tell them because either they wouldn't understand, or it would affect the data they provide.

So for instance, I wanted data about question particles, for my PhD. Question particles are functional words and all they do is attach to a sentence to make it into a question. We don't have them in English, but they're pretty common in other languages. When I started out, I hadn't yet realised that you can't just email someone some sentences and say 'how would you ask this?'. They will email back with utterly useless data. I'm studying the syntax of question particles, how they work in the sentence, where they are, what other elements they can occur with, and so on. Informants would send me back sentences that didn't even have the question particle in them.

But that's not their fault - how should they know? It's a very difficult thing for non-linguists to understand that we don't actually care if a sentence makes sense, we care about its structure (not its meaning). The people I asked, instead of telling me about question particles, had carefully and painstakingly explained how best to ask about seeing elephants at the zoo. Why should I care about that? But to a non-linguist, I suppose it must seem more likely than that you're interested in some aspect of the grammar. You can't separate syntax from content, in a sentence that includes words, so people can't help but look at the meaning. They rate a sentence as bad because they don't think it's likely that they would go fishing (or whatever), not because it's ungrammatical. For these reasons, you have to be super careful about what sentences you present to people, and you have to be completely specific about what you want. Even then, it's usually not enough. 

By far the best way to avoid this is to collect a load of natural speech data and just pull your information from it. This is all very well if you're studying some sound, or the frequency of null subjects, but how likely am I to get an embedded question in an hour of speech? Not very, and not getting one is nowhere near a good enough reason to say they're not grammatical.

Just to add to the fun, there's a thing called the observer's paradox. It means that the simple fact of you being there observing something might cause it to be different from what it usually is. In linguistic data collection, that means that sticking a microphone in someone's face and telling them you're researching language is a sure-fire way to get them to speak unnaturally. For this reason, we tell them we're researching 'the way people tell stories' or some such vague line, and never ever ever say we're studying their grammar. We also use little clip-on mics, interviewers that are known to them, or even avoid having interviewers at all. It's not ideal, but until you can conduct covert recordings in pubs it'll have to do.

All of the above is absolutely not the fault of the informant - it's our fault for not using good enough elicitation techniques, or it's just a problem that we have to live with. Sometimes, though, you do get genuine informant incompetence. Maybe you give them a scale, and you tell them that one end is good and the other bad. You will invariably get someone who mixes them up, and says that all the good sentences are bad and all the bad sentences are good. You know that this has happened, because you've put some controls in there so you know that this person's answers are off. But can you just reverse all that person's ratings? No, you can't. How can you be sure that's what happened? Maybe he really does have very different intuitions from everyone else. And how far do you go? What about a person who got it wrong just on one sentence? You might suspect that this has happened, but you can't know for sure. There's not a lot of point doing an experiment if you just ignore or manipulate all the data that doesn't fit with what you expect to find. Not to mention the scientific fraud that involves.

This happened to me once. My informant was not the brightest, and we had a task that everyone else had grasped but he had totally failed to understand. It involved turning sentences into questions, I think, or the other way round. In the end he treated it like a puzzle and just went through, moving all the auxiliary verbs to after the subject. This works sometimes, but other times you get gobbledegook. I know that the result wasn't a true reflection of what he would say, because I watched him do it and he wasn't listening to the stimuli or even thinking about them as sentences. But I don't know if his data ever got used or not (it wasn't my experiment).
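
Going back to the reversed-scale problem for a moment: the control-item check is easy enough to sketch in code. What follows is only a rough illustration, not anyone's actual procedure. It assumes a 1 to 5 scale where 5 means 'good', the function name `flag_possible_reversal` is made up for the example, and the ratings are invented. All it does is flag a participant whose known-good controls average below the midpoint while their known-bad controls average above it.

```python
from statistics import mean

# Assumed scale for this sketch: 1 = unacceptable, 5 = fully acceptable.
SCALE_MIN, SCALE_MAX = 1, 5
MIDPOINT = (SCALE_MIN + SCALE_MAX) / 2


def flag_possible_reversal(good_controls, bad_controls):
    """Return True if the control ratings look flipped.

    good_controls: ratings given to control sentences known to be acceptable
    bad_controls:  ratings given to control sentences known to be unacceptable
    """
    return mean(good_controls) < MIDPOINT < mean(bad_controls)


# Invented ratings, for illustration only (not real data).
participants = {
    "P1": {"good": [5, 4, 5], "bad": [1, 2, 1]},
    "P2": {"good": [1, 2, 1], "bad": [5, 5, 4]},  # looks reversed
    "P3": {"good": [5, 2, 4], "bad": [1, 1, 5]},  # one odd answer on each side
}

# Flag participants for a closer look; nobody's ratings are changed.
flagged = [
    pid
    for pid, ratings in participants.items()
    if flag_possible_reversal(ratings["good"], ratings["bad"])
]
print(flagged)
```

Note that this only flags people, it doesn't flip anyone's ratings, for exactly the reasons above. And P3, who slipped on one sentence here and there, sails straight through, which is the 'how far do you go?' problem all over again.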

Look at this idle request on the internet: they wonder about the Wisconsin dialect and ask if anyone else has any thoughts. They mention five things specifically, two grammar (prepositions) and three vocabulary items:

  1. Using 'extra' final prepositions (Do you want to come with; Where is my dress at)
  2. Using by instead of to or at (I was by Claire's house today; Do you want to go by the library later?)
  3. TYME machine (=cashpoint)
  4. Bubblers (=drinking fountain)
  5. Soda (=any fizzy drink. See this page for a nice map of where it's used)

The respondents are helpful in their responses. Two of them report on their opinions on the topic of the main query. The other two give even more information, talking about the vocabulary items mentioned in passing. But I was struck by their lack of awareness of their own language: the first one says 'We say all that' and the second 'I use all of these', and they then report that they use two and three of the five items, respectively. That is not all of them, not by a long shot.

But then, we know people have no idea what they say. I once got a comment on a survey about null subjects from a person who said this:
Don't think I'd leave out the 'I'.
Genuinely. In the sentence saying that they would not leave out 'I', they left out 'I'.
