Friday, March 11, 2016

Big Data overview and keywords

What is Big Data?
Big Data is a collection of large data sets that cannot be processed using traditional data-handling techniques.
What does big data include?
It includes black box data, stock exchange data, social media data, power grid data, transport data, search engine data, and so on.
What are the big data challenges?
Big data challenges include capturing, transferring, storing, searching, sharing, analyzing, and presenting data.
What is MongoDB?
MongoDB is a document-oriented NoSQL database, ranked the 4th most used database at the time of writing. It scales out by distributing (sharding) data across multiple servers.
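MapReduce:- A programming model that divides a job into smaller tasks and assigns them to the computers connected in a cluster. A map step processes the pieces in parallel and a reduce step combines the results.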
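
To make the MapReduce idea concrete, here is a minimal single-machine sketch of a word count in plain Python. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample text are made up for illustration, and the "nodes" are simply function calls run one after another.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the counts collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    # On a real cluster each chunk would be mapped on a different node;
    # here the map calls run sequentially (an assumption of this sketch).
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    return reduce_phase(shuffle(mapped))


# Hypothetical input, already split into two chunks.
chunks = ["big data is big", "data is everywhere"]
counts = word_count(chunks)
print(counts)
```

Each chunk can be mapped independently of the others, which is what lets a real framework hand the chunks to different machines.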
Hadoop:- An Apache open-source framework in which data is processed in parallel on different CPU nodes. It is capable of running applications on a cluster of computers.
HDFS (Hadoop Distributed File System):- It uses a master/slave architecture. The NameNode manages the file system metadata, and the DataNodes store the actual data.
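
The split between metadata and data in HDFS can be pictured with a toy model in Python. This is only a sketch under heavy assumptions: `ToyNameNode` and `ToyDataNode` are hypothetical classes, not part of any Hadoop API, the block size is 4 characters, there is no replication, and the name node hands the blocks over itself, whereas a real HDFS client writes to the DataNodes directly.

```python
from itertools import cycle


class ToyDataNode:
    """Stores the actual data blocks, keyed by block id."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}


class ToyNameNode:
    """Keeps metadata only: which blocks make up a file and where they live."""

    def __init__(self, datanodes, block_size=4):
        self.datanodes = {node.name: node for node in datanodes}
        self.block_size = block_size
        self.block_map = {}  # filename -> list of (datanode name, block id)
        self._next_node = cycle(datanodes)  # round-robin block placement

    def write(self, filename, data):
        locations = []
        for start in range(0, len(data), self.block_size):
            node = next(self._next_node)
            block_id = f"{filename}#{start // self.block_size}"
            # Simplification: the toy name node stores the block itself.
            node.blocks[block_id] = data[start:start + self.block_size]
            locations.append((node.name, block_id))
        self.block_map[filename] = locations

    def read(self, filename):
        # Look up the block locations, then fetch each block from its node.
        return "".join(
            self.datanodes[name].blocks[block_id]
            for name, block_id in self.block_map[filename]
        )


nodes = [ToyDataNode("dn1"), ToyDataNode("dn2")]
namenode = ToyNameNode(nodes)
namenode.write("log.txt", "hello big data")
print(namenode.block_map["log.txt"])
print(namenode.read("log.txt"))
```

Note that the name node's `block_map` never contains file contents, only locations, which mirrors the NameNode/DataNode split described above.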

A few providers in the market offer big data services:


AWS offers many services; a few of them are as follows:

  1. Amazon Elastic Compute Cloud (EC2)
  2. Amazon Elastic MapReduce (EMR, Hadoop)
  3. Amazon DynamoDB
  4. Amazon Simple Storage Service (S3)
  5. Amazon High Performance Computing
  6. Amazon Redshift
Big data services offered by Google:

  1. Google Compute Engine
  2. Google BigQuery
  3. Google Prediction API

