The growth in big data analytics and the Internet of Things (IoT) is estimated to add £322bn to the UK economy by 2020 and to create 182,000 new jobs.


Big data is expected to result in efficiency savings of £220.4bn, innovation benefits of £12.4bn and business creation benefits of £8.1bn in the next five years, according to the SAS report.

The IoT is anticipated to bring efficiency benefits of £72.5bn, innovation benefits of £4.5bn and business creation benefits of £4.3bn.

On adoption, around 56% of the companies surveyed for the SAS report have rolled out big data analytics, a share expected to rise to 67% by 2020.

“Just under half of UK businesses are not using any form of Big Data analytics, and those that are will sometimes be using it infrequently in just one or a few areas of the business. Less than one in three have adopted IoT.”

Manufacturing, professional services, retail banking and telecoms will benefit from the growth in the next few years.


Big Data Analytics – Learning Objectives

After completing the course, students should be able to:

  • Characterize the phenomena of Big Data and Big Data Analytics

  • Analyze and apply different visual analytics concepts and tools for big data sets

  • Analyze and apply different concepts, methods, and tools for analyzing big data in organizational contexts

  • Understand the linkages between business intelligence and business analytics and the potential benefits for organizations

  • Critically assess the ethical and legal issues in Big Data Analytics


  1. What is Big Data & Why Hadoop?

What is Big Data?

Characteristics of big data

Traditional data management systems and their limitations

What is Hadoop?

Why is Hadoop used?

The Hadoop eco-system

Big data/Hadoop use cases


  1. Hadoop Overview & its Ecosystem

Anatomy of a Hadoop cluster

Installing and configuring Hadoop

Hands-on exercise


  1. HDFS – Hadoop Distributed File System

HDFS Architecture

HDFS internals and use cases

HDFS Daemons

Files and blocks

Namenode memory concerns

Secondary namenode

HDFS access options

Installing and configuring Hadoop

Hadoop daemons

Basic Hadoop commands

Hands-on exercise


  1. MapReduce Anatomy

How MapReduce works

The Mapper and Reducer

Input formats and output formats

Data types and custom Writables

Functional programming concepts

List processing

Mapping and reducing lists

Putting them together in MapReduce

Word Count example application

Understanding the driver, mapper and reducer

Closer look at MapReduce data flow

Additional MapReduce functionality

Fault tolerance

Hands-on exercises
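
The map, shuffle and reduce steps behind the Word Count example can be sketched in a few lines. This is a minimal in-memory illustration in Python, not Hadoop code: the function names `mapper`, `shuffle`, `reducer` and `word_count` are hypothetical, and on a real cluster the framework performs the shuffle and runs mappers and reducers on separate nodes.

```python
from collections import defaultdict
from functools import reduce


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key (done by the framework in Hadoop)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(word, counts):
    """Reduce step: fold the list of counts for one word into a single total."""
    return word, reduce(lambda total, n: total + n, counts, 0)


def word_count(lines):
    """Driver: wire the map, shuffle and reduce steps together."""
    mapped = [pair for line in lines for pair in mapper(line)]
    grouped = shuffle(mapped)
    return dict(reducer(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["big data big insight", "data drives insight"]
    print(word_count(sample))
```

Calling `word_count` on the two sample lines counts "big", "data" and "insight" twice each and "drives" once. Keeping the driver, mapper and reducer as separate functions mirrors the split the course covers under "Understanding the driver, mapper and reducer".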


  1. Developing MapReduce Program

Setting up Eclipse Development Environment

Creating MapReduce Projects

Debugging and Unit Testing MapReduce Code

Testing with MRUnit


  1. Hive, Pig & Mahout

Pig program structure and execution process

Joins & filtering using Pig

Group & co-group

Schema merging and redefining functions

Pig functions

Understanding Hive

Using Hive command line interface

Data types and file formats

Basic DDL operations

Schema design

Hands-on exercises


  1. Introduction to Analytics

What is analytics and why is it so important?

Applications of analytics

Different kinds of analytics

Various analytics tools

Analytics methodology

Case study


  1. Basic Analytic Techniques


Introduction to R

Data Exploration with R

Data Preparation with R

Data Visualization with R

  1. Fundamentals of R

Installation of R & R Studio

Getting started with R

Basic and Advanced Data types in R

Variables and operators in R

Working with R data frames

Reading and writing data files in R

R functions and loops

Special utility functions

Merging and sorting data

Case study on data management with R


  1. R and Hadoop Overview

Introduction to the R tool

R and Hadoop Integration

Hadoop Streaming using R

RHadoop Overview

RHive Overview


  1. Getting Data into the R environment

Built-in data

Reading local data

Web data


  1. Predictive Modeling Techniques

Linear Regression

Logistic Regression

Cluster Analysis

Decision Trees

Time Series Analysis


  1. Descriptive statistics

Continuous data

Scatter plot

Box plot

Categorical data

Mosaic plot



  1. Inferential statistics

T-test and non-parametric equivalents

Chi-squared test, logistic regression

Distribution testing

Power testing

  1. Linear Regression

Linear models

Regression plots



  1. Sophisticated Graphics in R



Interactive graphics

Animated GIF




Big Data Hadoop Data Analytics Training in Chennai is primarily hands-on and available as

Classroom / Online / Corporate Training

Call – +91 9789968765 / +91 99627 74619 / +91 9176HADOOP / 044 – 42645495

Big Data Analytics Training in Chennai

Updated on 2016-03-05T10:29:29+00:00, by admin.