Experiment replication – an example of failure

Much is made of the replication crisis in various fields. I report one experience.

In 2010 I was the lead author of a research paper which was presented at a conference in Melbourne. We were investigating the idea that if people said ‘fuck’ while talking, speech recognition would improve – the Holy Grail of those working in the area. It was easy to see intuitive reasons why that might happen.

We had available to us an online game that my co-author, Manny, had developed as advertising for the movie Despicable Me. It was the first (only?) speech-enabled internet game. The user would tell Minions to do things and they would do them. While developing it, Manny figured that it would be used mostly by boys of an age to find it vastly amusing to say ‘Fucking play table-tennis’ as an alternative to ‘Play table-tennis’, and so on. As he tested the system he began to suspect that he was being recognised more often when he used the word ‘fucking’.

So we devised a test in which people were given a set of randomly generated orders to give the Minions. Each set included orders without swearwords, orders with ‘fucking’, and orders with other swearwords in its place. Nobody knew why they were doing the exercise. For lack of resources I asked people I knew to do this for me. We had a small group but, as a consequence, one I thought reliable.

We established statistical significance for the hypothesis that ‘fuck’ as an intensifier did improve recognition. You can see What’s the Magic Word? here.

About a year later we decided to do the experiment again. I didn’t have any more friends left (I know, it’s sad) but we had Amazon Mechanical Turk (AMT, the Turk) available as a resource. This time we did not achieve statistical significance. There were various possible reasons for that. Of course it could have been that our first experiment did not produce accurate data. But we had to acknowledge two important differences in the nature of the data collected which might also have been at issue. Firstly, we were using lowly paid Turks (though we paid them much better than the going rates). Secondly, our Turks were US based. In our first experiment our users had been mostly Australians with a few English thrown in.

Ideally we would have used Australians again. However, at the time (I don’t know about now) crowd-sourced labour had not yet come to Australia. Furthermore, the Minions game was taken offline and we could no longer use it, which made a close, let alone exact, repetition of the experiment impossible.

We do think from time to time of exploring the ideas further, but the one thing we can state for sure is that neither we nor anybody else can repeat our original experiment.

Postscript:

For anybody looking at the paper, please note that it had a second agenda which is why it is oddly written by normal standards of academic work. Having proofread various papers in the same area which struck me as excruciatingly boring, I questioned the idea that one had to write in that way in order to be published. I wanted to write something that would be both interesting and intelligible to a person walking in off the street to listen to the paper being presented. In fact you have to pay a lot of money to go to academic conferences, so people off the street are excluded. I believe that is wrong and I believe that many, if not all, academic papers could and should be written in ways more accessible to the population at large.

Want to learn Australian?

Ever wondered what a drongo is? Daks? Barbie? Tinnie? Have Oz friends and would like to understand their lingo?

If you go here, you will find this:

The best fucking speech-enabled CALL course on the web

If you want to learn to speak the beautiful Australian language, here’s your chance. For your pleasure and edification, a crack team of software engineers and computational linguists, assisted by several attractive and highly qualified Australian native speakers, have slaved for months to create this piece of state-of-the-art web software. Follow the instructions below, which we’ve made so simple that even a Pom should be able to understand them, and you’ll be speaking Strine in no time.

Fair dinkum!

You may, ahem, find the voice sounds familiar. But I deny any further involvement other than finding it my duty to take part in the picture selection for ‘daks’.

Who is it for?

  • Australians who need a refresher.
  • Australians who need a laugh.
  • Aliens hoping to be let into Australia.
  • Aliens who can’t afford the fare but who want a taste.
  • Anybody wanting to become bilingual.
  • Anybody interested in linguistics.

Numinousness has its place.

I have to admit numinousness is a term I have come across only recently. It should be used in a religious context, but it has been taken over by sci-fi and fantasy. As far as I can see that community – sorry, maybe those communities? – doesn’t have any very clear idea of what it means. It means whatever it means at the time. It’s a fuzzy word. Still, read enough people using it, and even though you sort of feel they have no idea what they are talking about, you nonetheless start to get a sense of it. I’ve got a sense of it now. Don’t ask me to define it because I can’t.

One way or another, as I’ve been reading about it here and there, it did occur to me, based on a recent observation, that it has an urgently needed application: hard-core pornography. Whoever makes that stuff, please read this. Numinousness will make what you do better.

Good pattern instructions

We’ve all been there: what exactly DO those instructions mean? A little carelessness in language goes a long way to making life difficult for the knitter.

Take my latest problem. I’m knitting Liesl and it is, despite the absence of schematics, a well written pattern. BUT….

It is knitted from the top down, a cardigan in one piece. There comes a point, after finishing the yoke, where the sleeves have to be separated from the body to be picked up later.

The row reads like this for the size and version I’m doing:

sk1, k29, sl 36 into scrap yarn, co 12, k55, sl 36 onto scrap yarn, co 12, knit to the end.

Now, after you cast on 12 in the middle of the row (well, at least the way I cast on), you are at the start of those stitches, so I read the instruction k55 as meaning the 12 just cast on plus the next 43. But no, that is not what is meant. You are supposed to knit the 12 you have cast on and THEN the next 55.

It was only an hour of knitting later that I saw the error of my interpretation. There followed much gnashing of teeth and wailing.
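
For anyone who prefers the arithmetic spelled out, here is a minimal sketch in Python of the two readings of k55. The names are made up for illustration, and the numbers are simply the ones in the row quoted above; nothing here comes from the pattern itself beyond that row.

```python
# Numbers taken from the quoted row; the names are hypothetical.
CAST_ON = 12   # stitches cast on after each sleeve is slipped off
K55 = 55       # the "k55" instruction that follows the first cast-on


def existing_stitches_knitted(k55_includes_cast_on: bool) -> int:
    """Stitches from the previous row that get knitted between the two sleeves."""
    if k55_includes_cast_on:
        # My reading: the 12 new stitches count towards the 55.
        return K55 - CAST_ON
    # Intended reading: knit the 12 new stitches first, then 55 more.
    return K55


my_reading = existing_stitches_knitted(True)
intended = existing_stitches_knitted(False)

print("my reading:", my_reading)
print("intended:", intended)
print("stitches short:", intended - my_reading)
```

On my reading the second sleeve starts 12 stitches too early, which is exactly the sort of thing you only notice an hour later.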

Rugby player admits difficulty with sex

My attention was caught watching the news the other day when a rugby player said to an interviewer that ‘it’s hard to come from behind’. My first thought was ‘Why would he find that hard?’

Evidently the physical inadequacies of rugby players are not necessarily limited to their missing necks. I could start feeling sorry for them.

Recently a linguist said to me that ‘You are complicit in the factitious enshrinement of an ensemble of rules-for-their-own-sake’. It was because I wasn’t willing to write the word abientot without the circumflex it requires.

Yet the fact is that I find myself regularly confused by incorrect use of language. Proper usage – if I may use two words which will get me into no end of trouble – always (should that be in inverted commas?) prevents this from happening.

I’m reading Annie Proulx at the moment and came upon the following sentence (Bad Dirt p. 21):

He adjusted his Stetson, which like a Texas sheriff, he always wore in the office.

The picture which spontaneously came to mind for me was a person wearing a Texas sheriff in the office. Correct me if I’m wrong, but isn’t that what this sentence means? Replace ‘Texas sheriff’ with, for example, ‘scarf’.

I spend way too much time trying to understand badly constructed sentences which don’t actually say what the author intended.

It’s all very well to say that a squiggly bit on top of the word abientot is a rule for its own sake, but at which point is the line drawn?