Self-Paced + Faculty Support option (20% Off)
Watch a free video through the sample class tab to learn more about Hadoop & Big Data

Get A Huge 20% Off Till 19th Mar! Enrol Now!

  • Get the Globally Recognized Wiley Certificate after completing a final exam at the end of the course
  • 42-hour, self-paced, online video-based course
  • Learn at your own time and pace, from anywhere, through recorded class videos
  • Get faculty support through email, forum & scheduled calls
  • Get 24x7, lifetime access to recorded classes and course material on our learning management system. No time limits!
  • Learn practically through hands-on examples and a real-world project
  • Get 24x7 access to a cloud-based Hadoop lab for 6 months of practice
  • Work on and submit a real-world project at your own time and pace
  • Get post-certification career assistance to help you start a career in Big Data
Fees: Rs. 22,000 / USD 399. Pay only Rs. 17,500 or USD 280 (plus GST) till 19th Mar.
So enrol now and save money!

Enrol Now

We offer a full refund up to 3 days after your enrolment if you would like to cancel, though with such a good deal we wonder who would!

Contact us at info@edvancer.in or call us on +91 8080928948 for more details

Introduction to Big Data & Hadoop
Learning goal: In this module, you will understand the meaning of big data, how traditional systems are limited in their ability to handle big data and how the Hadoop eco-system helps in solving this problem. You will learn about the various parts of the Hadoop eco-system and their roles.
Content:
  • What is Big Data?
  • Characteristics of big data
  • Traditional data management systems and their limitations
  • Business applications of big data
  • What is Hadoop?
  • Why is Hadoop used?
  • The Hadoop eco-system
  • Big data/Hadoop use cases
Managing a Big Data Eco-system
Learning goal: In this module you will learn how a big data eco-system can be implemented in organizations and the benefits it can bring.
Content:
  • Big data technology foundations
  • Big data management systems
  • Approach to big data analytics
  • Models to support big data analytics
  • Integrating big data in organizations
  • Streaming data
  • Big data solutions
HDFS (Hadoop Distributed File System)
Learning goal: In this module you will learn the basic Hadoop shell commands. You will also learn about HDFS, Hadoop's distributed file storage system: why it is used, how it differs from traditional file systems, and how files are read and written in it. You will work hands-on to implement what is taught in this module.
Content:
  • HDFS architecture
  • HDFS internals and use cases
  • HDFS daemons
  • Files and blocks
  • Namenode memory concerns
  • Secondary namenode
  • HDFS access options
  • Hadoop daemons
  • Basic Hadoop commands
  • Understand HDFS federation
  • Hands-on exercise
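To give a flavour of the basic Hadoop shell commands covered here, a few typical invocations look like the following (illustrative only: the paths and file names are made-up examples, and running them requires a working Hadoop installation):

```
hadoop fs -mkdir /user/student/input                 # create a directory in HDFS
hadoop fs -put localfile.txt /user/student/input     # copy a local file into HDFS
hadoop fs -ls /user/student/input                    # list the directory contents
hadoop fs -cat /user/student/input/localfile.txt     # print a file stored in HDFS
hadoop fs -rm /user/student/input/localfile.txt      # delete a file from HDFS
```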
HBase concepts
Learning goal: HBase is a distributed, versioned, column-oriented, multidimensional storage system, designed for high performance and high availability. Learn all about HBase in this module.
Content:
  • Architecture and role of HBase
  • Characteristics of HBase schema design
  • Implement basic programming for HBase
  • Combine the best capabilities of HDFS and HBase
Introduction to MapReduce
Learning goal: In this module, you will understand the MapReduce framework, how it works on HDFS, and learn the basics of MapReduce programming and data flow. (Basic Java knowledge will be required in the MapReduce modules.)
Content:
  • MapReduce basics
  • Functional programming concepts
  • List processing
  • Mapping and reducing lists
  • Putting them together in MapReduce
  • Word Count example application
  • Understanding the driver, mapper and reducer
  • Closer look at MapReduce data flow
  • Build iterative MapReduce applications
  • Hands-on exercises
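The word-count data flow taught in this module (map each word to a (word, 1) pair, group the pairs by word, then reduce each group to a sum) can be sketched in plain Python. This is a conceptual illustration only, not Hadoop code: the real course exercises are written in Java against the Hadoop MapReduce API.

```python
from collections import defaultdict

def mapper(line):
    # Map phase: emit a (word, 1) pair for every word in the input line
    for word in line.lower().split():
        yield (word, 1)

def reducer(word, counts):
    # Reduce phase: sum all counts emitted for the same word
    return (word, sum(counts))

def word_count(lines):
    # Shuffle phase: group intermediate (word, count) pairs by key
    groups = defaultdict(list)
    for line in lines:
        for word, count in mapper(line):
            groups[word].append(count)
    # Run the reducer once per distinct key
    return dict(reducer(w, c) for w, c in groups.items())

print(word_count(["big data is big", "hadoop handles big data"]))
# {'big': 3, 'data': 2, 'is': 1, 'hadoop': 1, 'handles': 1}
```

In real Hadoop, the map and reduce phases run in parallel across the cluster and the shuffle happens over the network, but the logical flow is exactly this.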
Advanced MapReduce concepts
Learning goal: Learn advanced MapReduce algorithms to manage and manipulate data, including unstructured data.
Content:
  • Understand combiners & partitioners
  • Understand input/output formats & record reader/writer
  • Distributed cache
  • Understanding counters
  • Perform unit testing of MapReduce applications using MRUnit
  • Perform local testing of MapReduce applications
  • Logging for Hadoop testing & reporting metrics with job counters
  • Execute a MapReduce WordCount program for analyzing sentiments
  • Hands-on exercise
Analyzing data with Pig
Learning goal: Pig is a platform for analysing large data sets through a high-level language. In this module you will learn to query and analyse large amounts of data stored in distributed storage systems.
Content:
  • Pig architecture, program structure and execution process
  • Introduction to Pig Latin
  • Joins & filtering using Pig
  • Group & co-group
  • Schema merging and redefining functions
  • Pig functions
  • Hands-on examples
Using Hive for Data Warehousing
Learning goal: Hive is data warehouse software for managing and querying large-scale datasets. It uses a SQL-like language, HiveQL, to query the data. Learn Hive in depth in this module.
Content:
  • Introduction to Hive architecture
  • Using the Hive command line interface
  • Create & execute Hive queries
  • Data types, operators & functions in Hive
  • Basic DDL operations
  • Data manipulation using Hive
  • Advanced querying with Hive
  • Different join operations in Hive
  • Performance tuning & query optimization in Hive
  • Security in Hive
  • Hands-on exercise
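Because HiveQL follows familiar SQL patterns, queries of the kind practised in this module read much like ordinary SQL. A small illustrative sketch (the table and column names are hypothetical, and running it requires a Hive installation):

```sql
-- Basic DDL: define a table over tab-delimited data
CREATE TABLE page_views (user_id INT, url STRING, view_time STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- Load a file already sitting in HDFS into the table
LOAD DATA INPATH '/user/student/page_views.tsv' INTO TABLE page_views;

-- Query: count views per URL, most viewed first
SELECT url, COUNT(*) AS views
FROM page_views
GROUP BY url
ORDER BY views DESC;
```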
Automated data processing with Oozie
Learning goal: Oozie is a workflow co-ordination & scheduler system for managing Hadoop jobs. Learn how to use Oozie in this module to automate data processing and analysis activities.
Content:
  • Fundamentals of Oozie
  • Implement Oozie workflows
  • Discuss Oozie co-ordinator & bundle
  • Understand the Oozie execution model
  • Access the Oozie server
  • Design an Oozie application
  • Deploy, test & execute Oozie applications
Distributed process co-ordination using Zookeeper
Learning goal: In this module, learn to use ZooKeeper, a high-performance coordination service that provides distributed applications with services such as naming, configuration and location information, and distributed synchronization.
Content:
  • Role and benefits of ZooKeeper
  • ZooKeeper command line interface
  • Run ZooKeeper
  • Build ZooKeeper applications
Transferring bulk data using Sqoop
Learning goal: Sqoop is a tool designed to transfer data between Hadoop and relational databases. Learn how to use Sqoop in this module.
Content:
  • Basics of Sqoop & Sqoop architecture
  • Import data into Hive using Sqoop
  • Export data from HDFS using Sqoop
  • Drivers and connectors in Sqoop
  • Importing and exporting data in Sqoop
  • Hands-on exercise
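Typical Sqoop transfers of the kind practised in this module look like the following (the JDBC URL, credentials, table names and HDFS paths are placeholders, and running these requires Sqoop and a reachable database):

```
# Import a relational table directly into a Hive table
sqoop import \
  --connect jdbc:mysql://dbhost/sales \
  --username analyst -P \
  --table orders \
  --hive-import

# Export processed results from HDFS back to the database
sqoop export \
  --connect jdbc:mysql://dbhost/sales \
  --username analyst -P \
  --table order_summary \
  --export-dir /user/student/output
```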
Streaming big data into Hadoop using Flume
Learning goal: In this module, learn to work with Flume, a service for efficiently collecting, aggregating, and moving large amounts of streaming data into HDFS.
Content:
  • Flume architecture
  • Using the Flume configuration file
  • Configure & build Flume for data aggregation
  • Hands-on exercise
YARN – Mapreduce 2.0
Learning goal: YARN, which stands for Yet Another Resource Negotiator, is a general-purpose job scheduler and resource manager for Hadoop 2.0 and provides alternatives to MapReduce. Learn all about YARN in this module.
Content:
  • Understand the advantages of YARN over MapReduce
  • Understand the YARN eco-system & architecture
  • Key concepts of the YARN API
  • YARN vs. Mesos
Introduction to Storm
Learning goal: Storm is an open-source distributed stream computation system. Learn how Storm's simple programming abstractions make it easier to write scalable applications for low-latency data processing.
Content:
  • Understand Storm architecture
  • Anatomy of a Storm application
  • Key concepts of the Storm API
  • Understand Storm on YARN
Spark & Scala
Learning goal: Apache Spark is a cluster computing platform designed for fast, general-purpose Big Data processing, and is faster than MapReduce. Spark programs can be written in Java, Scala, or Python; because Spark itself is written in Scala, a JVM language, Scala is the primary language choice. Learn the highly in-demand technologies of Spark and Scala in this module.
Content:
  • Difference between the Spark & Hadoop frameworks
  • Key components of the Spark eco-system
  • Explain a Spark program flow
  • Work with basic Scala constructs
  • Build programs in Spark
  • Hands-on exercise
Real-World Project
You will work on a comprehensive project based on real datasets to demonstrate your learning and qualify for the certificate. You will have 30 days from the end of the course to submit the project.
See the first class video free

We would love to hear from you regarding any query you may have, be it about the course or about your career.

Contact us for more info

Or email us at info@edvancer.in

Or call us at +91 8080928948

  • Edvancer’s content is better than that of the other institutes I enquired with, and at a much lower cost. After the course I got a job as a Campaign Management Analyst at ICICI Lombard.

    Rohit Kashid – Campaign Analyst, ICICI Lombard
  • It was a great experience and pleasure to learn from Edvancer. The online classroom is as good as a real classroom. It was highly interactive, with brainstorming on many ideas, and the course content depicts real-life scenarios. Altogether it was a great learning experience.

    Vinodh S, Sr. Specialist Architect, Sapient Corp.
  • The course was of very high quality and engaging. The interactive atmosphere and live examples were refreshing. The instructor had the real-world experience to understand our needs and was easily reachable at any point of time. I highly recommend this course.

    Sumit Kamra, Project Manager, ICICI Bank
  • The data science course provides an in-depth understanding of analytics with hands-on experience on R & Python using case studies from varied domains. You get all one needs for excelling in the field of analytics. The faculty have a very good grasp of all the concepts and the Edvancer team is very supportive.

    Girish Punjabi, Senior Business Analyst, IKen-IIT Bombay
  • I got a great job as Sr. Analyst with a 75% pay hike post this course! The course is a perfect blend of analytics tools and techniques. If you want to learn real stuff in analytics and not just the theoretical concepts, this course is for you.

    Ashish Kumar – B.Tech, IIT Madras

Benefits of taking the Hadoop & Big Data course

  • Learn to store, manage, retrieve and analyze Big Data using the Hadoop eco-system
  • Become one of the most in-demand Big Data experts in the world today
  • Learn how to analyze large amounts of data to bring out insights
  • Relevant examples and cases make the learning more effective and easier
  • Gain hands-on knowledge through the problem solving based approach of the course along with working on a project at the end of the course

Who should take this course?

This course is designed for anyone who:
  • wants to get into a career in Big Data
  • wants to analyse large amounts of unstructured data
  • wants to architect a big data project using Hadoop and its eco-system components


  • You should be comfortable with a programming language for this course. (We provide our 'Java Primer for Hadoop' course complimentary with this course so that you can learn the basics of Java.)