Big Data is a term that is becoming more and more common. If you work with data, you’ve probably heard of it. If you haven’t heard it yet, now you have.
As the importance of data increases, so too does the importance of Big Data. You have probably heard about it on the news or seen it in the media over the last couple of years, especially with the recent Cambridge Analytica stories regarding Facebook data and the GDPR emails you will almost certainly have received.
The closest thing to a standard definition of Big Data comes from a 2011 McKinsey Global Institute report:
‘Big Data is data whose scale, distribution, diversity, and/or timeliness require the use of new technical architectures and analytics to enable insights that unlock new sources of business value.’
There are, however, many other definitions, as ‘Big Data’ is a catch-all term. This one covers most situations: it is general enough to apply across different fields, but specific enough to serve as a useful working description.
To get a better understanding of Big Data, it helps to look at its consistent characteristics, known as the 3Vs (velocity, volume and variety).
To break this down:
- Velocity (or timeliness, in the McKinsey definition above) refers to the speed at which the data is created and how quickly it grows.
- Volume (or scale) refers to the sheer amount of data, which can run to billions of data points (for example, on social media).
- Variety (or diversity) refers to the wide range of data types and structures involved, which emphasises the need for new technologies such as the software framework ‘Hadoop’.
Social media is a great example of a source of Big Data. Facebook, Twitter, LinkedIn and similar platforms all hold large amounts of data from the people who sign up, post and engage with content. Because there is so much of this data, it is classed as Big Data. Its sheer volume means it does not fit easily into traditional technologies, so new tools and technologies must be developed to create, manipulate, manage and analyse it.
Other examples include mobile sensors, video surveillance, smart grids, and medical imaging. Generally, these all fall under the term ‘big’, but the exact definition is still not entirely clear, because our understanding of Big Data continues to evolve.
If you found this blog interesting, why not read our last blog on Open Data or check out some of the Open Data sets on the Exeter Data Mill?
Or share what you have found using the hashtag #ExeterDataMill.