Big data is an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process using traditional data processing applications. The challenges include analysis, capture, curation, search, sharing, storage, transfer, visualization, and privacy violations.
In my world it seems like everyone is talking about big data but when I move out of my specialist world and into the more ‘normal’ NHS front-line, and rub shoulders with nurses and other health care professionals, it doesn’t seem to have entered their world at all. But in truth big data is everything about their world – in future it could have a profound effect on care and everyone will have a role to play.
What is it?
Big data is really just lots and lots of data, from different places, that is mashed together and then analysed. It has become increasingly possible to understand data as more sophisticated computing power has come along. Modern computing power allows us to analyse what would have seemed impossible in the past. Now we can also store volumes of data that would have seemed impossible not so many years ago. We can now analyse data that is less well structured and still make it meaningful, especially by spotting patterns and trends that can then lead us to more detailed analysis.
I liken it to those fancy scanners they use on ‘Time Team’. The scanners give you clues about what might be underground, but until you do the digging you may not be able to make real sense of it. Like the scanners, big data can help you to see interesting patterns, but often it needs much closer scrutiny – it takes a bit of digging to really understand. But if you couldn’t do the scanning you would never know there was anything interesting underground. Big data allows you to create new hypotheses and spot new relationships in care and treatments.
Of course big data isn’t just used in healthcare; it can be used in many areas of life. Commercial companies are keen to tap into it to gain an edge in understanding, for example, our purchasing behaviours; sportsmen and sportswomen can use it to improve performance; and we can use it in education to better understand how we develop and learn new skills. In all these areas it has the potential to transform and make a real difference.
Why does it matter?
In healthcare it matters because the data may have the clues to many disease processes that in the past have eluded our understanding. I have had type 1 Diabetes for nearly 35 years and in truth it feels like there has been very little progress in our understanding of the ‘why’ of Diabetes. Yes, treatments have improved but it often feels like a crude guessing game – and I apologise for that statement to all the wonderful scientists working in the field but I think big data might help them to get to the point more quickly.
The very precious nature of healthcare data
Of course any debate about access and storage of healthcare data is rightly heated and contested. Data about your health is one of the most personal aspects of your life and most people have a view about what it can be used for and who should have access to it. I agree that I should have some control but I really do want someone to find a cure for Diabetes. If I thought gifting my data, with some controls for privacy, would help to stop another young person at 16, as I was, finding out they had to face a lifetime of Diabetes I would do it gladly and willingly. Yet the debate about privacy and confidentiality continues to rage in the public domain. We need to get this right – no excuses and no easy options; protecting the rights of individuals goes without saying.
If you are interested in what people who have chronic conditions want to use their data for, then ‘Patients Like Me’ is a great case study to look at. I know that the data belongs to those individuals and they have the right to do with it what they will. I do not want this post to be hijacked by the issue of privacy or confidentiality, nor am I saying it doesn’t matter – I just believe there are also other considerations to think about.
Data quality and responsibility
For practitioners big data does have an impact. Not only has it got the potential to transform how we deliver care in the future but practitioners have a responsibility to ensure the data they collect is high quality. In the past many records were rarely reviewed and languished for decades in medical records libraries in the bowels of hospitals. Now, and in the future, information we record will have a different visibility and transparency and we would do well to remember this.
Skills we will need
So the brave new world demands that we also have new skills. Being data savvy will, I believe, become a basic skill expected of people who work in the system and will go beyond simple statistics and the ability to use spreadsheets. We need skilled specialists too, people who can really help us to get to the nub of what the data means.
Moving from knowledge to wisdom
But the most important addition we will all make to the big data debate is that of providing the context. Moving from knowing facts to possessing wisdom requires us to throw upon the debate the light of truth and add our tacit knowledge and experience. It is people who provide this context, the insights and the meaning, turning facts into knowledge and then applying this to achieve greater wisdom; an endeavour we should all be contributing to. Here I mean ‘everyone’ – I don’t mean just people who work in systems, I mean just that: ‘everyone’. Only if we have this whole context will we really be able to take the meaning from the data and take the steps we need towards real wisdom.
Watch this TED Talk by Susan Etlinger to understand why big data is a journey we should all be engaged with. The title of my blog relates directly to her brilliant talk.
Big Data – Orwell or Huxley | Big Data + ...
Great blog Anne.
How do we engage frontline colleagues in the benefits of big data when they receive little or no feedback from the data collection activities they undertake? There is a lack of development in this area: systems are deployed but their potential is not fully realised, and the opportunity to enhance health care is not yet embraced.
Thanks for the comment, Angela. You are right, but this is a leadership issue. When talking to front-line staff I always tell them: if you are asked to provide data, always ask what it is used for, and make it a condition that you will get the data back in reports etc. Ward to board also needs turning on its head! Two-way flow 🙂
But this isn’t a reason for big data not being important. It’s part of the leadership journey for the profession.
Sadly staff feel distanced from influencing at board level by the management hierarchies that exist. They have compelling stories to tell based on their direct interactions with service users. As we have said before, leadership at all levels is so important and not always embedded in organisational development.
An excellent piece of inspiration again. I agree, leadership and influence are the key factors here. Ensuring that “two way flow” happens can then have impact on the quality of data that ‘feeds’ the system.
Big Data is more than just data that is big. There are two important aspects of Big Data you’ve not mentioned.
1) Correlation not Causation. Yes, with Big Data it is that way round. The reason is not that there is a lot of it, but that there are a lot of independent variables. Dr Foster tried to create a league table of hospitals based on disparate data like HSMR, access to certain treatments, ratio of staff to beds, infection control, etc. They had to combine 16 values into a single ‘score’. How do you weight each value (is HSMR more ‘important’ than death rate from a broken hip)? Dr Foster only did this league table for one year and then, realising that the result was largely meaningless, stopped doing it (now they just produce a league table of HSMR, one variable). Think of the data that is collected via something like care.data: age, gender, medication, use of A&E, history of secondary care referrals, history of primary care usage, etc. Trying to convert all of these data items into a model (for example, to determine if there should be early intervention) is not easy. You *cannot* produce a “law” that gives the chance you’ll use A&E this year given your age, the deprivation level of where you live and your usage of A&E last year. You can, however, provide a correlation. And people have used, and are using, such Big Data to teach neural networks to come up with the correlation.
2) Fuzzy, incomplete data. Big Data is incomplete. It has to be, because it is so big that it is impossible to fill in all of the gaps. This means that you have to accept its incompleteness when you analyse Big Data; you have to accept that it will have empty values (NULL values) for some data, and in other cases downright wrong values. The analysis has to take this into account and tell the users of the results what statistical effect this fuzziness causes. Some advocates of Big Data tell us it is n=All (i.e. not a population sample, but the entire population). This is a naive understanding of Big Data because, as I have already said, you can never guarantee the dataset is complete.
Both of these aspects mean that you have to interpret Big Data in a different way to population sample data. I am not yet convinced that we are doing that.
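To make the two points above a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration only: the three hospitals, the indicator names, the values and the weights are made up and are not Dr Foster's data or method. It shows that a composite league table can reorder itself when the weights change, and that a gap in the data (a NULL value) should be reported alongside the score rather than hidden.

```python
# Hypothetical, made-up indicator values for three imaginary hospitals.
# Each indicator is assumed to be scaled so that higher is better.
# None marks a missing (NULL) value.
hospitals = {
    "A": {"mortality": 0.9, "staffing": 0.4, "infection": 0.7},
    "B": {"mortality": 0.6, "staffing": 0.9, "infection": None},
    "C": {"mortality": 0.7, "staffing": 0.7, "infection": 0.8},
}


def composite(values, weights):
    """Return (score, missing_share) for one hospital.

    score is the weighted mean over the indicators that are present;
    missing_share is the fraction of total weight that had no data.
    """
    present = {k: v for k, v in values.items() if v is not None}
    used = sum(weights[k] for k in present)
    if used == 0:
        raise ValueError("no usable indicators for this hospital")
    total = sum(weights.values())
    score = sum(weights[k] * v for k, v in present.items()) / used
    return score, 1 - used / total


def league_table(weights):
    """Rank hospitals by composite score, best first."""
    rows = [(name, *composite(vals, weights)) for name, vals in hospitals.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Two equally defensible (and equally arbitrary) weighting schemes.
mortality_heavy = {"mortality": 3, "staffing": 1, "infection": 1}
staffing_heavy = {"mortality": 1, "staffing": 3, "infection": 1}

for label, weights in [("mortality-heavy", mortality_heavy),
                       ("staffing-heavy", staffing_heavy)]:
    print(label)
    for name, score, missing in league_table(weights):
        print(f"  {name}: score={score:.3f}, weight with no data={missing:.0%}")
```

With these invented numbers the ranking is A, C, B under the first set of weights and B, C, A under the second, and hospital B's score always rests on only part of the evidence. Nothing here says which weighting is right; that judgement still needs the context and knowledge that people bring.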
I agree with all your comments, hence my points about it giving clues and needing context and other knowledge to create wisdom.
My blog was not created to give a theoretical exploration of all of the contexts of big data. More to start conversations in nursing about its potential relevance.
I’m chairing a panel tomorrow and Macmillan are presenting. You might like this work: http://www.macmillan.org.uk/Aboutus/Ouresearchandevaluation/Programmesofwork/Routesfromdiagnosis.aspx