Big Data (Part – 2)

“Data is a precious thing and will last longer than the systems themselves.” – Tim Berners-Lee

In short, the term Big Data can be defined by the three V's: Volume, Velocity, and Variety.

Volume: Data is growing exponentially these days. It is now common for an organization to hold terabytes or even petabytes of data. This sheer volume is the first characteristic of Big Data.

Velocity: Data arrives and moves in several ways: real-time, near-real-time, periodic, and batch. The high speed at which data is generated and must be handled is the second characteristic of Big Data.

Variety: Data comes in many forms. As we saw in the previous article, there is structured, semi-structured, and unstructured data, as well as sensor data, log data, images, audio, video, and many more. Managing all of these together is a real-world challenge. This wide variety is the third characteristic of Big Data.

Why do we think Big Data is hard?

Store: What if your computer can store only 1TB of data, but you need to store 1PB (1000 TB)? You would need about a thousand such machines just to hold it.

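Move: Assume that copying 1TB over your network takes about 2 hours, which corresponds to a sustained throughput of roughly 1.1 Gb/s. Copying 1PB at the same rate takes about 83 days. Even a 10Gb link running at its full line rate needs about 13 minutes for 1TB and more than 9 days for 1PB. What if you need to move this data?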

Search: Assume that each record is 1KB and one machine can process 1000 records per second. Processing 1TB (a billion records) then takes about 278 CPU hours, or roughly 11.6 days, and processing 1PB takes about 31.7 CPU years!
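The three estimates above are simple arithmetic, so they are easy to check yourself. Here is a minimal Python sketch. It assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and a link that sustains its stated bit rate the whole time. The function names and the two sample link speeds (1 Gb/s and 10 Gb/s) are illustrative choices, not measurements.

```python
# Back-of-the-envelope estimates for storing, moving, and searching big data.
# Assumptions (illustrative only): decimal units, a perfectly sustained link
# rate, 1 KB records, and 1000 records per second on a single machine.

KB = 10**3
TB = 10**12
PB = 10**15

HOUR = 3600
DAY = 24 * HOUR
YEAR = 365 * DAY


def machines_needed(total_bytes, bytes_per_machine):
    """Machines required to hold total_bytes (ceiling division)."""
    return -(-total_bytes // bytes_per_machine)


def transfer_seconds(total_bytes, link_bits_per_second):
    """Seconds to copy total_bytes over a link at a sustained bit rate."""
    return total_bytes * 8 / link_bits_per_second


def scan_seconds(total_bytes, record_bytes=KB, records_per_second=1000):
    """Seconds for one machine to process every record exactly once."""
    return total_bytes / record_bytes / records_per_second


if __name__ == "__main__":
    print("Machines to store 1 PB at 1 TB each:", machines_needed(PB, TB))
    for gbps in (1, 10):
        rate = gbps * 10**9  # bits per second
        hours_tb = transfer_seconds(TB, rate) / HOUR
        days_pb = transfer_seconds(PB, rate) / DAY
        print(f"{gbps} Gb/s: 1 TB in {hours_tb:.2f} h, 1 PB in {days_pb:.1f} days")
    print(f"Scan 1 TB: {scan_seconds(TB) / HOUR:.1f} CPU hours")
    print(f"Scan 1 PB: {scan_seconds(PB) / YEAR:.1f} CPU years")
```

Under these assumptions, a single machine needs close to 278 CPU hours to scan 1 TB and nearly 32 CPU years to scan 1 PB, while a 1 Gb/s link needs a little over 2 hours per terabyte. Changing the record size, the processing rate, or the link speed shows how quickly the totals move, which is the reason Big Data systems spread storage and processing across many machines.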

 
