linguistlaura: Language Log


Monday, 11 December 2017

They's using singular verbs as well now

There's been a bit of a flurry of discussion about the use of pronouns for nonbinary people in linguist twitter lately, because of a blog post by the well-known Geoff Pullum on the well-known platform Language Log (so, a person using their position of power and privilege to complain about something that is far less onerous than constantly being misgendered). Kirby Conrod has written a good response and it was posted on Language Log, so no need to add my own comments here, especially as I'm cisgender and so don't really have anything to contribute to the debate.

What has been interesting to me is something that I hadn't seen before: people using they + 3rd person singular verb, as in They is joining us later. I would always have assumed that singular they took the unmarked verb form, same as all the other pronouns apart from the third person singular he, she and it, and crucially, the same as the plural they. On that view the verb form follows the pronoun form, and in this sense it's the same as you, which takes the same form in singular and plural. Similarly, in German the polite form of you is identical to they and takes the same verb forms. Using the -s form, is, is logical if we think that verb form is attached to the semantic (person and) number, rather than just the form of the pronoun, so it shouldn't be so surprising, but nevertheless I was surprised by it. I think we haven't settled down on this yet, so I will watch developments with interest.
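For anyone who likes to see the two options spelled out, here is a minimal sketch of the competing agreement rules for present-tense be. The function names and the tiny lexicon are my own illustrative assumptions, not a serious grammar of English.

```python
# Two hypothetical agreement rules for present-tense "be".
# The lexicon and rule names are illustrative assumptions only.
BE_BY_FORM = {
    "I": "am", "you": "are", "he": "is", "she": "is",
    "it": "is", "we": "are", "they": "are",
}


def agree_by_form(pronoun):
    """The verb follows the pronoun's form, whatever it refers to."""
    return BE_BY_FORM[pronoun]


def agree_by_meaning(pronoun, referent_count):
    """Third-person pronouns take 'is' for one referent, 'are' for more."""
    if pronoun in ("he", "she", "it", "they"):
        return "is" if referent_count == 1 else "are"
    return BE_BY_FORM[pronoun]


print("by form:    They", agree_by_form("they"), "joining us later")
print("by meaning: They", agree_by_meaning("they", 1), "joining us later")
```

The first rule gives They are and the second gives They is for a single referent. Note that the second rule, as written here, still leaves you alone, which is exactly the asymmetry that makes the -s form surprising.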

Sunday, 13 September 2015

Gezellig

The Morris group that I dance with did some workshops for some Dutch children this week. One of the girls said that the evening was gezellig - and she asked me did we have this word? Well, we don't, of course. I tried to kind of mutate it into English and got this far:

-ig is an adjective suffix, meaning -y or equivalent.
ge- is a kind of verbal past tense thing, I believe, so this is an adjective formed from a verb.

But then I got stuck (it also turns out I was wrong about the ge- bit anyway).

Just from the way it sounded, I suggested 'cosy', even though it didn't seem right in context. I looked it up when I got home and 'cosy' is one of the things it can mean, but the internet also tells me that this is an 'untranslatable' word.

Untranslatable words, it seems to me, come in two or three flavours. There's one kind where a language happens to have a word for a very specific concept. This is not untranslatable; it's just that language X encodes something in one single word that language Y does with a phrase. See, for example, German schadenfreude or Japanese origami (I have no idea how much Germans or Japanese people actually use these words). In this case, as with many others, the way we get round not having a word for this concept is to just borrow it. We also do this with foreign things like food (risotto, wasabi, pak choi, tea...).

There's also words where the translation isn't exact, although there's a bunch of words with similar meanings. See the Language Log entry on 'accountability', for instance. Prepositions are also a problem - they never seem to translate quite right from one language to another, partly because they don't have 'meaning', as such, but rather a grammatical function. These must be annoying for translators and make learning languages a little bit harder/more interesting, but we can learn what the nuances are.

Then there's the kind that seem somehow exotic because they refer to some concept that we hadn't thought about before. These have a great appeal on the internet. I suspect this is because they tend to refer to highly emotional states of mind. Nostalgia would be a good example of this in English, and saudade in Portuguese. They are often claimed to say something about the temperament of the nation that uses that word, so Portuguese or Brazilians are typically melancholic or nostalgic. We know the fallacy of attributing a characteristic to a whole nation, but nevertheless we like to do it because it helps us to label people.

Gezellig is said on Wikipedia to 'encompass the heart of Dutch culture', so it's a good example of one of these 'untranslatable words'. Wiktionary says it means 'companionable, having company with a pleasant, friendly atmosphere, cosy atmosphere or an upbeat feeling about the surroundings'.

It also says it comes from gezel, which means 'companion'. So much for my etymological analysis.

Wednesday, 12 September 2012

I doubt that you would suspect this

A recent Language Log post pointed out a humorously ambiguous headline:

The joke is of course that it sounds like the kebab van drove over and duffed up the hapless teen, with the by phrase expressing the agent of the jaw-breaking incident. The intended meaning is the one in which by is a preposition, and the by phrase locates the teenager at the time of the attack (ie he was by a kebab van). Much lolz ensues.

This wasn't the only humour to be wrung from the language in the headline: commenter Bobbie made the following quip, which was immediately either genuinely or wilfully misunderstood by Victor Mair:

Bobbie is taking advantage of the fact that in English, we can say that you have broken your X (if X is a body part) and it does not mean that you did the breaking. There's a term for that which I can't remember just now. Anyway, so Bobbie says what (s)he says, implying that it is unlikely to be the case that the teenager did it himself. Victor Mair's comment was initially baffling to me, because of course American English speakers wouldn't use suspect instead - that would mean the exact opposite! Other commenters said much the same lower down the comment thread.

Some of those commenters also noted that doubt used to mean roughly what suspect means. It's sense 6d in the OED, marked as archaic, and from the examples there, you can see how the semantic shift could have happened. Just add it to the long list of English verbs whose meaning has completely reversed over time.

Wednesday, 1 August 2012

Lost 'lost ''lost' sign' sign' sign

A wonderful example of centre embedding from a ridiculously silly blog, via my friend Valdemar:

The image shows a lost sign, and the lost thing that it's advertising is another lost sign. And the thing that sign was advertising (before it got lost) was a lost sign... and so on.
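The nesting on the poster can be sketched as a small recursive function. This is only an illustration: the function name lost_sign and its quotes option are made up for the purpose.

```python
def lost_sign(depth, quotes=True):
    """Return the text of a 'lost sign' poster nested `depth` levels deep.

    depth=0 is the plain phrase; each extra level places the previous
    phrase between 'lost' and 'sign', i.e. embeds it in the centre.
    """
    if depth == 0:
        return "lost sign"
    inner = lost_sign(depth - 1, quotes)
    if quotes:
        inner = f"'{inner}'"
    return f"lost {inner} sign"


for depth in range(3):
    print(lost_sign(depth).capitalize())
```

Each call wraps the previous phrase with one word on either side, so the embedded phrase always has material from the higher phrase both before and after it. Calling lost_sign(2, quotes=False) gives the bare string of words with no punctuation to help.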

It's called centre embedding because, unsurprisingly, it means embedding a phrase in the centre of another one. By 'in the centre' we don't mean that it's precisely central, but rather that words from the higher-up phrase are on both sides of the embedded one. Here's an example of the more common type of embedding we find in English:

I really hate people [who don't think of others]

The bracketed part is a relative clause, which means that it tells you more about people, and it's embedded in the main clause. It's at the end, which is nice and easy to understand. We can go on for a surprisingly long time like this:

This is the farmer sowing his corn
That kept the cock that crowed in the morn
That waked the priest all shaven and shorn
That married the man all tattered and torn
That kissed the maiden all forlorn
That milked the cow with the crumpled horn
That tossed the dog that worried the cat
That killed the rat that ate the malt
That lay in the house that Jack built!

Every line in that rhyme is a new embedded clause, but we can keep track of it all and it's not terribly remarkable. We actually do it quite a lot in normal speech. This example, which inspired Language Log's Trent Reznor Prize for Tricky Embedding, contains a whole stack of embedded clauses and other stuff but is completely understandable, and was produced in natural speech in an interview:

"When I look at people that I would like to feel have been a mentor or an inspiring kind of archetype of what I'd love to see my career eventually be mentioned as a footnote for in the same paragraph, it would be, like, Bowie."

The thing with centre embedding is that it is totally grammatical (it does not break any of the rules of English (by which I mean the rules that speakers intuitively know and that cannot be broken, rather than the prescriptive rules that we all break in our everyday speech)), but not acceptable (i.e. speakers don't say things like this and if asked, don't think they are good sentences at all). This is very different from most other grammatical puzzles that we (linguists) have, which are far more often of the type 'this is ungrammatical in most dialects but some speakers produce it - why?' or 'this theory predicts this to be ungrammatical but it's not, because it occurs in language X - why?'.

It's really striking how quickly examples of centre embedding get impossible to parse (work out the grammar of). In the poster, we can of course easily understand the phrase with no embedding at all:

Lost sign

But then even just one layer of embedding, equivalent to I hate people who don't think of others, is a bit hard to work out:

Lost lost sign sign

And then when you get just one more, it's too hard:

Lost lost lost sign sign sign

The quotation marks help a bit here, but not much, and that's obviously no good in spoken language. This example is obviously designed for humour, and some are more or less easy to work out. Wikipedia (yeah, I'm being lazy today - I've got a PhD to write) cites this example of double embedding, attributing it to De Roeck et al (1982):

Isn't it true [that example-sentences [that people [that you know] produce] are more likely to be accepted]?

The double-embedded part that might cause trouble is the that people that you know produce part, but here it's not too difficult, perhaps because we're used to hearing know+verb constructions. But the Wikipedia page also says (summarising Karlsson 2007) that three is the maximum degree of embedding in written language, and even two is vanishingly rare in spoken language. It gives this example of super-tricky centre embedding, where the first one (with one level of embedding, and not centre embedding) is fine, but adding just one centre-embedded clause makes it incredibly difficult to parse:

A man [that a woman loves]

A man [that a woman [that a child knows] loves]

It means a man who is loved by a woman, who in turn is known by a child. But you try working that out while you're in full conversational flow. It's supposed to be basically just that while we're super-good at keeping track of relations and actions, we're really really bad at keeping track of a whole load of subjects without linking them to their predicates (what they did).
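A rough way to see why the subjects pile up is to build these phrases mechanically. The sketch below is my own illustration (the function centre_embed is hypothetical): all the subjects come out first, and the verbs then have to be paired off with them in reverse order, like items coming off a stack.

```python
def centre_embed(nouns, verbs):
    """Build a noun phrase with centre-embedded relative clauses.

    nouns[0] is the head noun; every later noun is the subject of a
    relative clause, and verbs[i] is the verb whose subject is
    nouns[i + 1]. The subjects are emitted first, then the verbs in
    reverse order, so the last subject meets its verb first.
    """
    if len(verbs) != len(nouns) - 1:
        raise ValueError("need exactly one verb per embedded subject")
    if not verbs:
        return nouns[0]
    subjects = " that ".join(nouns)
    predicates = " ".join(reversed(verbs))
    return f"{subjects} {predicates}"


print(centre_embed(["a man", "a woman"], ["loves"]))
print(centre_embed(["a man", "a woman", "a child"], ["loves", "knows"]))
```

With one embedded clause the subject and its verb are adjacent. With two, a woman has to wait while a child knows goes by before loves turns up, and that waiting is the part we seem to be bad at.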

Finally, this completely incomprehensible paragraph from SpecGram:

An apparently new speech disorder a linguistics department our correspondent visited was affected by has appeared. Those affected our correspondent a local grad student called could hardly understand apparently still speak fluently. The cause experts the LSA sent investigate remains elusive. Frighteningly, linguists linguists linguists sent examined are highly contagious. Physicians neurologists psychologists other linguists called for help called for help called for help didn’t help either. The disorder experts reporters SpecGram sent consulted investigated apparently is a case of pathological center embedding.

Wednesday, 11 July 2012

Taking words for granite

I was going to write a post about an 'eggcorn' (what Language Log calls misheard and reinterpreted idioms, words and phrases - it is itself an eggcorn for 'acorn'). Apparently, some people believe that the expression

to take X for granted

is actually

to take X for granite.

Which is odd. Nothing like each other, are they?

Well, not in spelling, no. But then it occurred to me that in some dialects, they might be pretty similar, if you simplify the cluster [nt] to [n] (as is common) and devoice the final [d] (which I've heard some US speakers do, on telly). I personally could only do the former, which is why it seemed such a strange mistake to me. So like I say, I was going to write a post about it, but then I googled it to get some information on it, and found that Language Log beat me to it by a good seven years.

Saturday, 2 June 2012

Boobs

A Language Log post the other day included a use of the word boob by the male blogger, Victor Mair (whose always interesting posts have featured on this blog before):

sonar-like semi-circles emanating from the model's left boob

While boob is of course in very common use, and is perhaps the most common word that I hear for these body parts (subjective statement alert), it sounds funny to me to hear a man use the word (not completely weird, just enough to notice).

Perhaps men talk less about boobs generally than women do (sounds unlikely, I know), and when they do they refer to them in a more formal manner with the more neutral, formal, breasts? (Unless it's a discussion of someone's merits or otherwise in that department, in which case it's often tits.) Discounting lads'-mag discussions and passing mentions, that leaves few occasions for a man to use such a familiar word.

(Disclaimer: My thoughts apply only to UK usage, of course, and pretty much only my own experience.)

Saturday, 11 February 2012

Correlations in linguistic data

Geoff Pullum at Language Log recently reluctantly (because it's not yet published) commented on a paper by a Yale economist, Keith Chen. In this paper, Chen argues that if your language has a grammatical future tense marker, you are less likely to save money, live healthily etc because the future seems like some other time, not to be worried about now. If your language uses present tense to refer to the future, you treat it as an extension of the present and you'll be much more sensible about it. Pullum is guardedly sceptical about these claims, for reasons which you can read about yourself.

He is also sceptical about this kind of claim (based on correlations found in large amounts of data), writing:

I also worry that it is too easy to find correlations of this kind, and we don't have any idea just how easy until a concerted effort has been made to show that the spurious ones are not supportable. For example, if we took "has (vs. does not have) pharyngeal consonants", or "uses (vs. does not use) close front rounded vowels", would we find correlations there too? I have some colleagues here at the University of Edinburgh, within Simon Kirby's research group, who have run some informal experiments on the data Chen uses to see if dredging up spurious correlations of this kind is easy or hard, and so far they have found it jaw-droppingly easy.

He doesn't comment further on these experiments, but it reminded me of the talk Martin Haspelmath gave when our university's linguistics research centre opened a few years ago, and he told us about the World Atlas of Language Structures (WALS). After telling us what a wonderful, useful tool it is (and it is, I've found it invaluable), he ended on a note of caution. It's easy, he said, to find false correlations. For example, you can show a map of languages which have a different word for hand and arm or use the same word for both. That map shows that the languages that don't distinguish are, broadly speaking, around the warmer areas of the globe (yellow dots) and the ones that do distinguish are in colder areas (red dots):

(Map from WALS, feature 129A)

Now might one not hypothesise, asked Haspelmath, that this language fact is due to the climate? In colder countries the distinction is important, in that one wears items of clothing that cover only the hands (gloves), or sleeves that come down to the wrist. In warm countries, sleeves are not so long and gloves are not worn, so a separate word for hands never becomes necessary. A far-fetched example, but a lesson in not putting too much faith in correlations.
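To get a feel for how easy it is to dredge up correlations like this, here is a small simulation. It is only a sketch under loud assumptions: the 'languages', the 'features' and the 'outcome' are all random numbers invented on the spot, nothing here comes from WALS or from Chen's data, and the function names are mine. It also covers only the many-comparisons side of the problem, not the fact that real languages are related to and in contact with each other.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either list is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def count_chance_hits(n_languages=60, n_features=200, threshold=0.25, seed=0):
    """Count random yes/no features that 'correlate' with a random outcome."""
    rng = random.Random(seed)
    # A made-up outcome per language (think of it as a savings rate).
    outcome = [rng.random() for _ in range(n_languages)]
    hits = 0
    for _ in range(n_features):
        # A made-up binary feature, unrelated to the outcome by construction.
        feature = [rng.randint(0, 1) for _ in range(n_languages)]
        if abs(pearson(feature, outcome)) >= threshold:
            hits += 1
    return hits


print(count_chance_hits(), "of 200 random features clear the threshold")
```

For 60 data points a correlation of about 0.25 sits close to the conventional 5% significance cut-off, so roughly one random feature in twenty should be expected to pass even though none of them means anything. Test enough features and some will always look interesting.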

Saturday, 10 December 2011

Chinese 'yellow, gambling, poison' on Language Log

What on earth is the meaning of this sign?

I had no idea, so I asked just that question of Victor Mair over at Language Log, and he obliged by answering it speedily and fully.

Wednesday, 16 November 2011

Kate Bush, Eskimos and snow words

It's a well-known linguistic myth that 'Eskimos have [insert high number here] words for snow'. This has been conclusively shown to be stupid*, and I think a lot of people now know this. But it's still a nice little 'factoid' and Kate Bush has made good use of it in her new album, 50 words for snow. Via Language Log, which documents this sort of thing, I found out about the album and now a link to listen to the song online.

Ben Zimmer at LL has put the link, together with the lyrics, online in a nice blogpost on the topic. The title song features Stephen Fry speaking the titular 50 words (English words and phrases, not Eskimo - although there is a Klingon one) and the results are really quite beautiful. The words are a mix of nice-but-nonsense and witty, like blown from polar fur, spangladasha and icyskidski.

*For many reasons, argued persuasively by Laura Martin some years ago. For instance, what do you mean by Eskimo? It's kind of a blanket term for a group of languages. What do you mean by 'word'? That family of languages is polysynthetic, which means there's a heck of a lot of affixes and you can make many words from a single root. In fact, a single 'word' can actually be a whole sentence, making the number of 'words' presumably infinite.
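The arithmetic behind that last point is easy to sketch. The morphemes below are placeholders I have made up (ROOT, -A, -B, -C), not forms from any real language, and real languages restrict which affixes can combine and in what order, so treat this purely as an illustration of how fast the count grows.

```python
from itertools import product


def build_words(root, affixes, max_affixes):
    """List every form made of the root plus 0..max_affixes affixes.

    Affixes may repeat and come in any order, which is far looser
    than any real morphology; the point is only the growth rate.
    """
    words = []
    for k in range(max_affixes + 1):
        for combo in product(affixes, repeat=k):
            words.append(root + "".join(combo))
    return words


affixes = ["-A", "-B", "-C"]  # placeholder affixes, not real morphemes
for limit in range(4):
    print(limit, "affixes allowed:", len(build_words("ROOT", affixes, limit)), "words")
```

With just three affixes the count goes 1, 4, 13, 40 as the permitted length grows, and with no upper limit on length there is no upper limit on the number of 'words' either.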