Cloud Computing with Hadoop training in Washington DC
George Mason University, Volgenau School of Engineering

TAIT 0519: Cloud Computing with Hadoop


The Cloud Computing with Hadoop course combines administration and programming with Hadoop ecosystem components to demonstrate working with Big Data. Topics covered in this hands-on course include MapReduce, Hive, Pig, ZooKeeper, Sqoop, and multi-node setup of a Hadoop cluster. The course trains participants to set up Hadoop infrastructure, write MapReduce programs and Hive and Pig scripts, and work with HDFS, ZooKeeper, and Sqoop.

Course Objectives:

1. Understanding distributed, parallel, and cloud computing, and NoSQL concepts
2. Setting up Hadoop infrastructure with single- and multi-node clusters
3. Understanding the concepts of Map and Reduce and functional programming
4. Writing Map and Reduce programs and working with HDFS
5. Writing Hive and Pig scripts and working with ZooKeeper and Sqoop
6. Designing and developing applications involving large data sets using the Hadoop ecosystem


Audience and Prerequisites

This course is designed for individuals who want to learn Hadoop and who have a basic understanding of Unix, Java, and SQL scripting.


Course Outline Detail

Introduction to Hadoop

  • Distributed computing
  • Parallel computing
  • Concurrency
  • Cloud Computing
  • Data Past, Present and Future
  • Computing Past, Present and Future
  • Hadoop
  • NoSQL

Hadoop Stack

  • MapReduce
  • NoSQL
  • CAP Theorem
  • Databases: Key Value, Document, Graph
  • Hive and Pig
  • HDFS

Lab: Hadoop Hands-on

  • Installing Hadoop Single Node cluster
  • Understanding Hadoop configuration files

MapReduce Introduction

  • Functional – Concept of Map
  • Functional – Concept of Reduce
  • Functional – Ordering, Concurrency, No Locks
  • Functional – Shuffling
  • Functional – Reducing, Key, Concurrency
  • MapReduce Execution framework
  • MapReduce Partitioners and Combiners
  • MapReduce and role of distributed filesystem
  • Role of Key/Value Pairs
  • Hadoop Data Types

Lab: MapReduce Exercises

  • Understanding Sample MapReduce code
  • Executing MapReduce code

HDFS Introduction

  • Architecture
  • File System
  • Data replication
  • NameNode
  • DataNode


Hive Introduction

  • Architecture
  • Data Model
  • Physical Layout
  • DDL, DML, and SQL Operations
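Hive exposes HDFS data through SQL-like DDL and DML. A minimal sketch, with a hypothetical table and input path:

```sql
-- DDL: define a table over tab-delimited files
CREATE TABLE page_views (user STRING, url STRING, view_time TIMESTAMP)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- DML: load a file already in HDFS into the table
LOAD DATA INPATH '/data/page_views.tsv' INTO TABLE page_views;

-- Query: Hive compiles this into MapReduce jobs
SELECT url, COUNT(*) AS views
FROM page_views
GROUP BY url;
```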

Lab: Hive Hands-on

  • Installation
  • Setup
  • Exercises


Pig Introduction

  • Rationale
  • Pig Latin
  • Input, Output and Relational Operators
  • User Defined Functions
  • Analyzing and designing using Pig Latin
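The same group-and-count pattern expressed as a data flow in Pig Latin; the input path and field names here are hypothetical:

```pig
-- Load tab-delimited records from HDFS with a declared schema
views  = LOAD '/data/page_views.tsv' USING PigStorage('\t')
         AS (user:chararray, url:chararray);

-- Relational operators: group by url, then count each group
by_url = GROUP views BY url;
counts = FOREACH by_url GENERATE group AS url, COUNT(views) AS views;

-- Trigger execution and print the result
DUMP counts;
```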

Lab: Pig Hands-on

  • Installation
  • Setup
  • Executing Pig Latin scripts on File system
  • Executing Pig Latin scripts on HDFS
  • Writing custom User Defined Functions

Lab: Accumulo

  • Setup
  • Ingesting Data
  • Querying for Data
  • Differences between Accumulo and Cassandra
  • Writing custom User Defined Functions

Introduction to ZooKeeper

Introduction to Sqoop
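Sqoop transfers data between relational databases and HDFS. A typical import, with hypothetical connection details and table name, might look like:

```shell
# Pull the "orders" table from MySQL into HDFS using 4 parallel map tasks
sqoop import \
  --connect jdbc:mysql://dbhost/sales \
  --username analyst -P \
  --table orders \
  --target-dir /data/orders \
  --num-mappers 4
```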

Hadoop Multi-Node Cluster Setup

  • Installation and Configuration
  • Running MapReduce Jobs on Multi Node cluster

Working with Large Data Sets

  • Steps involved in analyzing large data
  • Lab walk-through


Click here to download the registration form (fax or mail)

Register online*

*Full payment, by Visa or Mastercard only, is required at the time of online registration.


November 8, 2018 - December 13, 2018
Time: 6:00 PM - 10:00 PM
Schedule Details

Location: Loudoun
Section: L16



4.0 CEUs
40 Hours
Onsite Opportunity

Enhance your organization's competitive edge!

George Mason University's TechAdvance Program can tailor programs to meet your organization's needs. Companies or agencies interested in bringing this program on site should contact TechAdvance at 703-993-1551.

Contact Info.

  George Mason University
Volgenau School of Engineering
  3351 Fairfax Drive, Suite 448
  Arlington, VA 22201

Telephone: 703-993-1551