Big Data Basics
Big data refers to very large collections of data in many different formats, such as audio, video, text, and images. Facebook, for example, stores text, images, and videos.

IBM defines big data as data that has four V's: velocity, volume, variety, and veracity.

High-Speed Data

Velocity refers to the speed at which data arrives and must be analyzed, as with streaming data.

Bulk Data

Volume refers to the sheer size of the datasets that must be analyzed.

Variety of Data

Variety refers to the different kinds and formats of data that must be analyzed.

Uncertainty of Data

Veracity refers to the uncertainty of data, i.e. how far its quality and accuracy can be trusted. Incoming data is sometimes reliable and sometimes incomplete, inconsistent, or noisy.

Big Data Generators

Social media platforms (such as Facebook, Twitter, Skype, and LinkedIn), space science, sensors, and similar sources generate big data every day, and all of them rely on big data systems to store it.

Why Big Data?

Conventional data storage systems are not able to:

  • Store large amounts of data.
  • Store data in different formats (e.g. a single post may contain video, audio, text, tags, likes, and comments).
  • Process streaming data, i.e. data that arrives continuously and at high speed (e.g. weather sensor data).
  • Scale up as the data grows.

For these reasons, among others, big data technologies are needed for storage and processing: today's world generates data that conventional systems cannot handle or process.
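To make the streaming point in the list above more concrete, here is a minimal sketch in Python of processing readings one at a time instead of storing them all first. The function names and the sample temperatures are hypothetical and chosen only for illustration; a real sensor feed would arrive over a network rather than from a list.

```python
from typing import Iterable, Iterator


def running_mean(readings: Iterable[float]) -> Iterator[float]:
    """Yield the mean of all readings seen so far, one value per reading.

    Only a count and a running total are kept, so memory use stays
    constant no matter how long the stream runs.
    """
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        yield total / count


def sensor_feed() -> Iterator[float]:
    """Hypothetical stand-in for a live weather sensor (temperatures in C)."""
    yield from [20.0, 22.0, 24.0, 22.0]


if __name__ == "__main__":
    # Each reading is handled as it arrives; nothing is buffered.
    for mean in running_mean(sensor_feed()):
        print(f"running mean: {mean:.2f}")
```

The design choice here is to update a small summary as each value arrives, which is the basic idea behind stream processing; production systems add distribution and fault tolerance on top of it.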

Tools to Process Big Data

Apache Hadoop is used for processing big data. It is a highly scalable, distributed framework that is freely available as open source. It is written in Java, and its processing jobs are typically written in Java as well. It uses HDFS (Hadoop Distributed File System) to store and manage files. It has a high degree of fault tolerance and was designed to scale out across many servers that share the distributed work. It runs on simple hardware (multiple systems), with each server acting as a node, and uses the MapReduce technique to process data.

This overview makes clear how important big data is in today's world.
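The MapReduce technique mentioned above can be illustrated with a small word count. The following is a single-machine sketch in plain Python, not Hadoop's actual API: the function names are hypothetical, and in a real cluster the map and reduce steps would run in parallel on different nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big clusters"]))
```

Because each line is mapped independently and each key is reduced independently, the work can be split across many machines, which is what lets this style of processing scale.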
