Big Data Analytics (BDA) comes into the picture when we are dealing with the enormous amounts of data that have been generated over the past ten years with the advancement of science and technology in different fields. Processing this large amount of data and extracting valuable meaning from it in a short span of time is a genuinely challenging task, especially given the four V's that define BDA: the Volume, Velocity, Variety, and Veracity of data.
Why and When to go for Big Data Analytics
Big data is a revolutionary term that describes the very large volumes (Volume) of unstructured (text, images, videos), structured (tabular data), and semi-structured (JSON, XML) data that have the potential to be mined for information.
Volume (data at scale)
Volume is about the large amount of data being generated daily from different types of sources: social media data (Facebook, Twitter, Google), satellite images, mining and sensor data, and various types of network logs generated by servers.
Integrating and processing these huge volumes of data, stored across a scalable and distributed environment, poses a huge business challenge to analysts. IT giants like Yahoo and Google generate petabytes of data in a short span of time, and across the IT industry the increase in data volume is exponential compared to the past.
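To make the scale-out idea concrete, here is a minimal sketch of the map-reduce pattern that distributed engines like Hadoop and Spark are built around, written with only the Python standard library; the log lines and chunk count are made-up examples, not a production setup.

```python
from collections import Counter
from multiprocessing import Pool

# Hypothetical server log lines; in practice these would be
# terabytes of files spread across a distributed file system.
LOG_LINES = [
    "GET /index.html 200",
    "GET /index.html 404",
    "POST /login 200",
    "GET /search 200",
] * 1000

def map_status(line):
    # Map step: extract the HTTP status code from one log line.
    return line.rsplit(" ", 1)[-1]

def count_chunk(chunk):
    # Each worker counts its own chunk independently.
    return Counter(map(map_status, chunk))

if __name__ == "__main__":
    # Split the data into chunks, process them in parallel,
    # then reduce the partial counts into one result.
    chunks = [LOG_LINES[i::4] for i in range(4)]
    with Pool(4) as pool:
        partials = pool.map(count_chunk, chunks)
    total = sum(partials, Counter())
    print(total)  # e.g. Counter({'200': 3000, '404': 1000})
```

Real platforms apply the same split-then-combine idea, but across many machines rather than four local processes.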
Velocity (speed at which data is transferred)
Velocity is about processing huge amounts of data in fractions of a second and deriving insights from it. For a better understanding, take the case of the telecom domain, which generates CDR (Call Detail Record) data at a rate of gigabytes per hour. The network bandwidth available to move this data to where it is processed is therefore a very important factor. Big Data Analytics shifts from analyzing data after it has landed in a warehouse or mart to analyzing data in motion as it is generated, in real time.
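As a rough illustration of analyzing data in motion, the sketch below aggregates simulated CDRs over a sliding time window as they arrive, instead of batch-loading them into a warehouse first. The record fields and window size are assumptions made for the example.

```python
import time
from collections import deque

# Hypothetical CDR stream: (timestamp, caller, duration_seconds).
def cdr_stream():
    for i in range(10):
        yield (time.time(), f"+91-99{i:02d}", 30 + i)
        time.sleep(0.1)

WINDOW_SECONDS = 0.5
window = deque()  # CDRs inside the current time window

for ts, caller, duration in cdr_stream():
    window.append((ts, caller, duration))
    # Evict records that have fallen out of the window.
    while window and ts - window[0][0] > WINDOW_SECONDS:
        window.popleft()
    # Real-time insight: total talk time in the last half second.
    total = sum(d for _, _, d in window)
    print(f"calls in window: {len(window)}, talk time: {total}s")
```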
Variety (different forms of data)
Variety is about managing many types of data and understanding and analyzing them in their native form. Almost 80% of the data created daily is unstructured: videos, social media, satellite images, machine and sensor data. Big Data Analytics shifts from cleansing data before analysis to analyzing the information as-is and cleansing only when needed.
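The short sketch below shows what "native form" means in practice: structured (CSV) and semi-structured (JSON) records are each analyzed with their own tools, without forcing them into one schema first. The sample data is invented for illustration.

```python
import csv
import io
import json

# Structured data: tabular rows with a fixed schema.
csv_text = "user,age\nalice,34\nbob,29\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured data: nested fields, schema may vary per record.
json_text = '{"user": "carol", "devices": ["phone", "tablet"]}'
record = json.loads(json_text)

# Analyze each in its native form, no common schema imposed.
avg_age = sum(int(r["age"]) for r in rows) / len(rows)
print(f"average age (structured): {avg_age}")
print(f"device count (semi-structured): {len(record['devices'])}")
```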
Veracity (uncertainty of data)
Veracity increases the complexity of big data, and it becomes very hard to establish trust in the data, which is essential for making confident decisions in real-time situations. At present, roughly three-quarters of all available data is uncertain; dealing with this data and gaining confidence in the insights derived from it is a key business requirement in the market. Building that confidence for industry clients in the face of uncertain data is therefore the key task.
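One common way to deal with uncertain data is to score each record's trustworthiness before acting on it. The sketch below shows a toy completeness check; the fields and scoring rule are illustrative assumptions, not a standard.

```python
# Hypothetical records with missing or suspect fields; the scoring
# rule (fraction of required fields present and valid) is made up.
REQUIRED = ("id", "amount", "country")

records = [
    {"id": 1, "amount": 120.0, "country": "IN"},
    {"id": 2, "amount": None, "country": "IN"},
    {"id": 3, "amount": -5.0},  # missing country, negative amount
]

def veracity_score(rec):
    # Count required fields that are present and plausible.
    ok = 0
    for field in REQUIRED:
        value = rec.get(field)
        if value is None:
            continue
        if field == "amount" and value < 0:
            continue
        ok += 1
    return ok / len(REQUIRED)

for rec in records:
    score = veracity_score(rec)
    label = "trusted" if score == 1.0 else "needs review"
    print(rec["id"], round(score, 2), label)
```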
Getting meaningful insights from large data sets and processing them in a sensible amount of time is a challenging task. Traditionally, data analysis has been dominated by trial and error, an approach that becomes impossible when data sets are large and heterogeneous. Machine learning enables cognitive systems to learn, reason, and engage with humans in a more natural and personalized way.
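As a toy example of learning from data rather than trial and error, the sketch below classifies new points by copying the label of the nearest training point (1-nearest-neighbour); the data is invented, and real systems would use a proper ML library at far larger scale.

```python
import math

# Toy labelled data: (feature_1, feature_2) -> label. Made-up values.
train = [((1.0, 1.0), "low"), ((1.2, 0.8), "low"),
         ((5.0, 5.2), "high"), ((4.8, 5.5), "high")]

def predict(point):
    # 1-nearest-neighbour: return the label of the closest
    # training point by Euclidean distance.
    nearest = min(train, key=lambda item: math.dist(item[0], point))
    return nearest[1]

print(predict((1.1, 0.9)))  # -> "low"
print(predict((5.1, 5.0)))  # -> "high"
```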