Tuesday, 19 April 2016

Talking about tax avoidance: weasel words

With recent revelations about rich people legally avoiding tax, it seems a good time to tell this story.

In late 2013, I received an email telling me that I could potentially save huge amounts in tax on my pension. I thought this must be a particularly sophisticated piece of spam: poverty in the UK was rising fast, and I have high earnings, so why would pension rules be changed to benefit me? But no, I checked it out and it was all legal and above board.

All is made clear in this piece*. The phraseology has some gems: I particularly enjoyed 'the appropriate crystallisation event, such as the point of death', but behind all the talk of accrued benefits and recovery tax charges, the message seems to be that the Government had introduced a new measure that would increase the amount of tax paid by rich people, but HMRC had immediately provided a legal means for avoiding this, 'Fixed Protection'.

I'm fascinated by the use of language: the word 'protect' has entirely positive connotations: my online dictionary defines it as 'keep safe from harm or injury'. We hear of tax 'shelters', 'tax-advantaged savings vehicles' and suchlike. But what is achieved here is tax avoidance by rich people – all aided and abetted by HMRC.

If you have money and you talk to solicitors or financial advisers, you'll find that the default assumption is that you want to pay as little tax as possible. I first came across this when making a will: the solicitor concerned started describing how I could set up a complex trust that would mean less tax would be paid on my estate. I don't have dependants, so this struck me as particularly pointless. I'd be dead and wouldn't care. The solicitor looked at me aghast and clearly thought I was barking mad.

I'm not a saint and I'm not an idiot. I have some savings. But I live a comfortable life and don't need to squirrel away vast amounts of dosh. And I grew up in the 1960s when, although things weren't perfect, we had a reliable National Health Service; schools had adequate resources; there were grants to support people through university. The current generation may find it remarkable that until I was about 30, I never saw a beggar on the streets of any British city. Now we have food banks. An equitable society that looks after the most vulnerable costs money, and that money comes from taxes.

I'll leave the last word to J. K. Rowling, who is one of the few rich people who seems to get it:

I chose to remain a domiciled taxpayer for a couple of reasons. The main one was that I wanted my children to grow up where I grew up, to have proper roots in a culture as old and magnificent as Britain’s; to be citizens, with everything that implies, of a real country, not free-floating ex-pats, living in the limbo of some tax haven and associating only with the children of similarly greedy tax exiles. 

A second reason, however, was that I am indebted to the British welfare state; the very one that Mr Cameron would like to replace with charity handouts. When my life hit rock bottom, that safety net, threadbare though it had become under John Major’s Government, was there to break the fall. I cannot help feeling, therefore, that it would have been contemptible to scarper for the West Indies at the first sniff of a seven-figure royalty cheque. This, if you like, is my notion of patriotism. On the available evidence, I suspect that it is Lord Ashcroft’s idea of being a mug.

*I found this via a Google search: I don't have any dealings with this firm of solicitors, but as far as I can tell, what they say matches the advice I had earlier, and is pretty standard.

Tuesday, 12 April 2016

To: the World
From: Deevybee
Re: We have a problem

Everyone I know is exhausted by email. You can spend a day battling back the incoming tide of messages, but next morning when you wake up, there it is again. Much of it is spam and can be deleted without reading, but it still absorbs attention and energy. But there's plenty of other stuff that sits there in your inbox eyeing you balefully until you respond. People I know divide into two classes: those who have given in and just live with an oppressive burden of 500 unread messages, and those who destroy themselves trying to keep on top of it. The best advice I've read on how to manage the situation is this by Tim Harford, but even obeying his rules only reduces the pain; it does not eliminate it.

Is there a solution? I've thought of one. It probably is impossible but I'm going to put it out there and see what you all say.

The idea is that there should be a cost to sending email. This seems weird for two reasons. First, when email first came out we all loved it precisely because it was free: adding a cost seems perverse. Second, how on earth could it be made to work?

Well, suppose big organisations, such as universities, could set up their systems so as to filter incoming emails and check whether the sender was registered. I assume that since filtering already occurs at some level for spam, this should not be impossible. If the sender is not registered, they get a bounce inviting them to open an account. Once they have an account, then a very small charge is made for each email that is delivered. The charge should be adequate to cover the cost of running the filtering and billing operation, but no more.

This could be arranged so that communications within a domain would be free, so it would not save us from mass communications from admin – but given that in my department most of these have recently been about the serious matter of closure of toilets, and even gender reassignment of toilets, perhaps that is as well. I can also envisage institutions having reciprocal arrangements so that all university domains, for instance, would agree not to charge each other. We would also need to be able to set up 'whitelists' of addresses that would be exempt from a charge: in my line of work we use email to communicate with volunteers and organisations who help our research, so we'd have to find a way to indicate that if we initiate the email exchange, the recipient is not charged for replying.
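For what it's worth, the decision rules described above could be sketched in a few lines of code. This is a minimal, purely illustrative sketch, not a real mail system: all the addresses, domains and data structures are made up for the example.

```python
# Hypothetical sketch of the proposed inbound-mail policy.
# All names and registries here are illustrative, not a real mail system.

REGISTERED_SENDERS = {"colleague@example.com"}   # senders who have opened a paid account
RECIPROCAL_DOMAINS = {"ox.ac.uk", "cam.ac.uk"}   # e.g. universities agreeing not to charge each other
WHITELIST = {"volunteer@example.org"}            # correspondents we contacted first: replies are free

def inbound_policy(sender: str, recipient_domain: str) -> str:
    """Return 'free', 'charge', or 'bounce' for an incoming message."""
    sender_domain = sender.split("@")[1]
    if sender_domain == recipient_domain:
        return "free"      # communications within a domain are exempt
    if sender_domain in RECIPROCAL_DOMAINS:
        return "free"      # reciprocal arrangement between institutions
    if sender in WHITELIST:
        return "free"      # we initiated the exchange, so the reply is not charged
    if sender in REGISTERED_SENDERS:
        return "charge"    # deliver, billing a very small per-message fee
    return "bounce"        # invite the sender to open an account
```

An unregistered bulk sender would hit the final branch every time, which is the whole point: spam only pays because sending is free.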

I've been trying to decide whether such a set-up would be good on balance, or whether it would create more problems than it would solve. There is no doubt that it would lead to an initial period of havoc, but it would wipe out spam at a stroke. The fear is that it could also prevent genuine and important messages getting through. Would I miss the opportunity of a lifetime to collaborate with a colleague, to go to a marvellous conference, or to take on an outstanding overseas student? The answer is probably yes, though, of course, anyone who cannot or will not register can always fall back on snail mail.

Well, this is just an early morning thought, prompted by the daily routine of deleting the mass invitations to meet a sexy lady, deliver a keynote at a conference on sludge disposal in China or be the recipient of a huge donation from a distressed oligarch. There must be a better way, but what is it?

Tuesday, 22 March 2016

Better control of the publication time-line: A further benefit of Registered Reports

I’ve blogged previously about waste in science. There are numerous studies that are completed but never see the light of day. When I wrote about this previously, I focused on issues such as reluctance of journals to publish null results, and the problem of writing up a study while applying for the next new grant. But here I want to focus on another factor: the protracted and unpredictable process of peer review that can lead researchers to just give up on a paper.

Sample Gantt chart. Source: http://www.crp.kk.usm.my/pages/jepem.htm
The sample Gantt chart above nicely illustrates a typical scenario. Let's suppose we have a postdoc with 30 months’ funding. Amazingly, she is not held up by patient recruitment issues, or ethics approvals, and everything goes according to plan, so 24 months in, she writes up the study and submits it to a journal. At the same time, she may be applying for further funding or positions. She may plan to start a family at the end of her fellowship. Depending on her area of study it may take anything from two weeks to six months to hear back from the journal*. The decision is likely to be revise and resubmit. If she’s lucky, she’ll be able to do the revisions and get the paper accepted to coincide with the end of her fellowship. All too often, though, the revisions demanded are substantial. If she's very unlucky the reviewers may demand additional experiments, which she has no funding for. If they just want changes to the text, that's usually do-able, but often they will suggest further analyses that take time, and she may only get to the point of resubmitting the manuscript when her money runs out. Then the odds are that the paper will go back to the reviewers – or even to new reviewers – who now have further ideas of how the paper can be improved. But now our researcher might have started a new job, have just given birth, or be unemployed and desperately applying for further funds.

The thing about this scenario, which will be all too familiar to seasoned researchers (see a nice example here), is that it is totally unpredictable. Your paper may be accepted quickly, or it may get endlessly delayed. The demands of the reviewers may involve another six months’ work on the paper, at a point when the researcher just doesn’t have the time. I’ve seen dedicated, hardworking, enthusiastic young researchers completely ground down by this situation, faced by the choice of either abandoning a project that has consumed a huge amount of energy and money, or somehow creating time out of thin air. It’s particularly harsh on those who are naturally careful and obsessive, who will be unhappy at the idea of doing a quick and dirty fix to just get the paper out. That paper which started out as their pride and joy, representing their best efforts over a period of years, is now reduced to a millstone around the neck.

But there is an alternative. I’ve recently, with a graduate student, Hannah Hobson, put my toe in the waters of Registered Reports, with a paper submitted to Cortex looking at an electrophysiological phenomenon known as mu suppression. The key difference from the normal publication route is that the paper is reviewed before the study is conducted, on the basis of an introduction and protocol detailing the methods and analysis plan. This, of course, takes time – reviewing always does. But if and when the paper is approved by reviewers, it is provisionally accepted for publication, provided the researchers do what they said they would.

One advantage of this process is that, after you have provisional acceptance of the submission, the timing is largely under your own control. Before the study is done, the introduction and methods are already written up, and so once the study is done, you just add the results and discussion. You are not prohibited from doing additional analyses that weren’t pre-registered, but they are clearly identified as such. Once the study is written up, the paper goes back to reviewers. They may make further suggestions for improving the paper, but what they can’t do is require you to do a whole load of new analyses or experiments. Obviously, if a reviewer spots a fatal error in the paper, that is another matter. But reviewers can’t at this point start dictating that the authors do further analyses or experiments that may be interesting but not essential.

We found that the reviewer comments on our completed study were helpful: they advised on how to present the data and made suggestions about how to frame the discussion. One reviewer suggested additional analyses that would have been nice to include but were not critical; as Hannah was working to tight deadlines for thesis completion and starting a new job, we realised it would not be possible to do these, but because we have deposited the data for this paper (another requirement for a Registered Report), the door is left open for others to do further analysis.

I always liked the idea of Registered Reports, but this experience has made me even more enthusiastic for the approach. I can imagine how different the process would have been had we gone down the conventional publishing route. Hannah would have started her data collection much sooner, as we wouldn’t have had to wait for reviewer comments. So the paper might have been submitted many months earlier. But then we would have started along the long uncertain road to publication. No doubt reviewers would have asked why we didn’t include different control conditions, why we didn’t use current source density analysis, why we weren’t looking at a different frequency band, and whether our exclusionary criteria for participants were adequate. They may have argued that our null results arose because the study was underpowered. (In the pre-registered route, these were all issues that were raised in the reviews of our protocol, so had been incorporated in the study). We would have been at risk of an outright rejection at worst, or requirement for major revisions at best. We could then have spent many months responding to reviewer recommendations and then resubmitting, only to be asked for yet more analyses.  Instead, we had a pretty clear idea of the timeline for publication, and could be confident it would not be enormously protracted.

This is not a rant against peer reviewers. The role of the reviewer is to look at someone else’s work and see how it could be improved. My own papers have been massively helped by reviewer suggestions, and I am on record as defending the peer review system against attacks. It is more a rant against the way in which things are ordered in our current publication system. The uncertainty inherent in the peer review process generates an enormous amount of waste, as publications, and sometimes careers, are abandoned. There is another way, via Registered Reports, and I hope that more journals will start to offer this option.

*Less than two weeks suggests a problem! See here for an example.

Saturday, 5 March 2016

There is a reproducibility crisis in psychology and we need to act on it

The Müller-Lyer illusion: a highly reproducible effect. The central lines are the same length but the presence of the fins induces a perception that the left-hand line is longer.

The debate about whether psychological research is reproducible is getting heated. In 2015, Brian Nosek and his colleagues in the Open Science Collaboration showed that they could not replicate effects for over 50 per cent of studies published in top journals. Now we have a paper by Dan Gilbert and colleagues saying that this is misleading because Nosek’s study was flawed, and actually psychology is doing fine. More specifically: “Our analysis completely invalidates the pessimistic conclusions that many have drawn from this landmark study.” This has stimulated a set of rapid responses, mostly in the blogosphere. As Jon Sutton memorably tweeted: “I guess it's possible the paper that says the paper that says psychology is a bit shit is a bit shit is a bit shit.”
So now the folks in the media are confused and don’t know what to think.
The bulk of debate has been focused on what exactly we mean by reproducibility in statistical terms. That makes sense because many of the arguments hinge on statistics, but I think that ignores the more basic issue, which is whether psychology has a problem. My view is that we do have a problem, though psychology is no worse than many other disciplines that use inferential statistics.
In my undergraduate degree I learned about stuff that was on the one hand non-trivial and on the other hand solidly reproducible. Take, for instance, various phenomena in short-term memory. Effects like the serial position effect, the phonological confusability effect, the superiority of memory for words over nonwords, are solid and robust. In perception, we have striking visual effects such as the Müller-Lyer illusion, which demonstrate how our eyes can deceive us. In animal learning, the partial reinforcement effect is solid. In psycholinguistics, the difficulty adults have discriminating sound contrasts that are not distinctive in their native language is solid. In neuropsychology, the dichotic right ear advantage for verbal material is solid. In developmental psychology, it has been shown over and over again that poor readers have deficits in phonological awareness. These are just some of the numerous phenomena studied by psychologists that are reproducible in the sense that most people understand it, i.e. if I were to run an undergraduate practical class to demonstrate the effect, I’d be pretty confident that we’d get it. They are also non-trivial, in that a lay person would not just conclude that the result could have been predicted in advance.
The Reproducibility Project showed that many effects described in contemporary literature are not like that. But was it ever thus? I’d love to see the reproducibility project rerun with psychology studies reported in the literature from the 1970s – have we really got worse, or am I aware of the reproducible work just because that stuff has stood the test of time, while other work is forgotten?
My bet is that things have got worse, and I suspect there are a number of reasons for this:
1. Most of the phenomena I describe above were in areas of psychology where it was usual to report a series of experiments that demonstrated the effect and attempted to gain a better understanding of it by exploring the conditions under which it was obtained. Replication was built into the process. That is not common in many of the areas where reproducibility of effects is contested.
2. It’s possible that all the low-hanging fruit has been plucked, and we are now focused on much smaller effects – i.e., where the signal of the effect is low in relation to background noise. That’s where statistics assumes importance. Something like the phonological confusability effect in short-term memory or a Müller-Lyer illusion is so strong that it can be readily demonstrated in very small samples. Indeed, abnormal patterns of performance on short-term memory tests can be used diagnostically with individual patients. If you have a small effect, you need much bigger samples to be confident that what you are observing is signal rather than noise. Unfortunately, the field has been slow to appreciate the importance of sample size and many studies are just too underpowered to be convincing.

3. Gilbert et al raise the possibility that the effects that are observed are not just small but also more fragile, in that they can be very dependent on contextual factors. Get these wrong, and you lose the effect. Where this occurs, I think we should regard it as an opportunity, rather than a problem, because manipulating experimental conditions to discover how they influence an effect can be the key to understanding it. It can be difficult to distinguish a fragile effect from a false positive, and it is understandable that this can lead to ill-will between original researchers and those who fail to replicate their finding. But the rational response is not to dismiss the failure to replicate, but to first do adequately powered studies to demonstrate the effect and then conduct further studies to understand the boundary conditions for observing the phenomenon. To take one of the examples I used above, the link between phonological awareness and learning to read is particularly striking in English and less so in some other languages. Comparisons between languages thus provide a rich source of information for understanding how children become literate. Another of the effects, the right ear advantage in dichotic listening holds at the population level, but there are individuals for whom it is absent or reversed. Understanding this variability is part of the research process.
4. Psychology, unlike many other biomedical disciplines, involves training in statistics. In principle, this is a thoroughly good thing, but in practice it can be a disaster if the psychologist is simply fixated on finding p-values less than .05 – and assumes that any effect associated with such a p-value is true. I’ve blogged about this extensively, so won’t repeat myself here, other than to say that statistical training should involve exploring simulated datasets so that the student starts to appreciate the ease with which low p-values can occur by chance when one has a large number of variables and a flexible approach to data analysis. Virtually all psychologists misunderstand p-values associated with interaction terms in analysis of variance – as I myself did until working with simulated datasets. I think in the past this was not such an issue, simply because it was not so easy to conduct statistical analyses on large datasets – one of my early papers describes how to compare regression coefficients using a pocket calculator, which at the time was an advance on other methods available! If you have to put in hours of work calculating statistics by hand, then you think hard about the analysis you need to do. Currently, you can press a few buttons on a menu and generate a vast array of numbers – which can encourage the researcher to just scan the output and highlight those where p falls below the magic threshold of .05. Those who do this are generally unaware of how problematic this is, in terms of raising the likelihood of false positive findings.
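The kind of simulated-dataset exercise mentioned above takes only a few lines of code. The sketch below is my own illustration, not drawn from any of the studies discussed; it demonstrates two of the points in the list: how rarely a modest true effect is detected with a small sample (point 2), and how often a 'significant' correlation turns up in purely random data when twenty variables are screened against one outcome (point 4). The critical values used (|t| > 2, |r| > 0.361 for 30 observations) are rough approximations to the two-tailed .05 criterion.

```python
# Illustrative simulations (hypothetical, standard library only) of two
# statistical pitfalls: underpowered studies and multiple-variable screening.
import math
import random
import statistics

def pearson_r(x, y):
    """Pearson correlation, computed from scratch."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def detection_rate(n_per_group, d=0.3, n_sims=2000, seed=1):
    """Proportion of simulated two-group studies that detect a true effect
    of size d, using a rough two-tailed .05 criterion (|t| > 2)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(d, 1) for _ in range(n_per_group)]
        se = math.sqrt(statistics.variance(a) / n_per_group +
                       statistics.variance(b) / n_per_group)
        t = (statistics.mean(b) - statistics.mean(a)) / se
        if abs(t) > 2.0:          # approximate .05 critical value
            hits += 1
    return hits / n_sims

def false_positive_rate(n_vars=20, n_obs=30, n_sims=1000, seed=1):
    """Proportion of simulated all-null studies in which at least one of
    n_vars unrelated predictors correlates 'significantly' with the outcome."""
    rng = random.Random(seed)
    r_crit = 0.361                # |r| for two-tailed p < .05 with df = 28
    hits = 0
    for _ in range(n_sims):
        y = [rng.gauss(0, 1) for _ in range(n_obs)]
        for _ in range(n_vars):
            x = [rng.gauss(0, 1) for _ in range(n_obs)]
            if abs(pearson_r(x, y)) > r_crit:
                hits += 1
                break
    return hits / n_sims
```

With these defaults, the small design (20 per group) detects the effect only a minority of the time, whereas 200 per group detects it reliably; and roughly two-thirds of the all-null screening studies yield at least one 'significant' correlation, close to the expected rate of 1 − 0.95^20 ≈ 0.64.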
Nosek et al have demonstrated that much work in psychology is not reproducible in the everyday sense that if I try to repeat your experiment I can be confident of getting the same effect. Implicit in the critique by Gilbert et al is the notion that many studies are focused on effects that are both small and fragile, and so it is to be expected they will be hard to reproduce. They may well be right, but if so, the solution is not to deny we have a problem, but to recognise that under those circumstances there is an urgent need for our field to tackle the methodological issues of inadequate power and p-hacking, so we can distinguish genuine effects from false positives.