Cloudera Developer Apache Hadoop


Big Data Analytics with R and Hadoop focuses on techniques for integrating R and Hadoop using tools such as RHIPE and RHadoop. The result is a powerful data analytics engine that can run analytics algorithms over large-scale datasets in a scalable manner.
  • Participants should have programming experience.
  • Participants should have knowledge of Java.
  • The program runs for 16 days, with sessions of up to 2 hours each.
  • The format is 40% theory and 60% hands-on.

  • The program runs for 4 days, with sessions of up to 8 hours each.
  • The format is 40% theory and 60% hands-on.
    Private classroom sessions are arranged on request; the minimum batch size is 4 attendees.
Course Content
  • The Motivation For Hadoop
    • Problems with traditional large-scale systems
    • Requirements for a new approach
  • Hadoop Basic Concepts
    • An Overview of Hadoop
    • The Hadoop Distributed File System
    • Hands-On Exercise
    • How MapReduce Works
    • Hands-On Exercise
    • Anatomy of a Hadoop Cluster
    • Other Hadoop Ecosystem Components
  • Writing a MapReduce Program
    • The MapReduce Flow
    • Examining a Sample MapReduce Program
    • Basic MapReduce API Concepts
    • The Driver Code
    • The Mapper
    • The Reducer
    • Hadoop's Streaming API
    • Using Eclipse for Rapid Development
  • Integrating Hadoop Into The Workflow
    • Relational Database Management Systems
    • Storage Systems
    • Creating workflows with Oozie
    • Importing Data from RDBMSs With Sqoop
    • Hands-On Exercise
    • Importing Real-Time Data with Flume
    • Accessing HDFS Using FuseDFS and Hoop
  • Delving Deeper Into The Hadoop API
    • Using Combiners
    • Using LocalJobRunner Mode for Faster Development
    • Reducing Intermediate Data with Combiners
    • The configure and close methods for MapReduce Setup and Teardown
    • Writing Partitioners for Better Load Balancing
    • Directly Accessing HDFS
    • Using The Distributed Cache
  • Using Hive and Pig
    • Hive Basics
    • Pig Basics
  • Common MapReduce Algorithms
    • Sorting and Searching
    • Indexing
    • Machine Learning with Mahout
    • Term Frequency - Inverse Document Frequency
    • Word Co-Occurrence
  • Practical Development Tips and Techniques
    • Testing with MRUnit
    • Debugging MapReduce Code
    • Using LocalJobRunner Mode for Easier Debugging
    • Eclipse development techniques
    • Retrieving Job Information with Counters
    • Logging
    • Splittable File Formats
    • Determining the Optimal Number of Reducers
    • Map-Only MapReduce Jobs
    • Implementing Multiple Mappers using ChainMapper
  • More Advanced MapReduce Programming
    • Custom Writables and WritableComparables
    • Saving Binary Data using SequenceFiles and Avro Files
    • Creating InputFormats and OutputFormats
  • Joining Data Sets in MapReduce Jobs
    • Map-Side Joins
    • The Secondary Sort
    • Reduce-Side Joins
  • Graph Manipulation in Hadoop
    • Introduction to graph techniques
    • Representing Graphs in Hadoop
    • Implementing a sample algorithm: Single Source Shortest Path
  • Creating Workflows with Oozie
    • The Motivation for Oozie
    • Oozie's Workflow Definition Format
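
As a rough illustration of the MapReduce flow, mapper, combiner, and reducer topics listed above, here is a minimal word-count sketch in plain Python. It does not use Hadoop itself: the function names (mapper, combiner, shuffle, reducer, word_count) are hypothetical, and the grouping step only imitates what the framework does between the map and reduce phases. The combiner is applied per input line purely to keep the example small.

```python
from collections import defaultdict


def mapper(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def combiner(pairs):
    """Pre-sum counts on the map side to shrink the intermediate data."""
    partial = defaultdict(int)
    for word, count in pairs:
        partial[word] += count
    return list(partial.items())


def shuffle(pairs):
    """Group values by key, imitating the framework's shuffle-and-sort step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(word, counts):
    """Sum all partial counts for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run the map, combine, shuffle, and reduce steps over the input lines."""
    intermediate = []
    for line in lines:
        # Assumption for this sketch: one combiner call per input line.
        intermediate.extend(combiner(mapper(line)))
    grouped = shuffle(intermediate)
    return dict(reducer(word, counts) for word, counts in sorted(grouped.items()))


if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "Hadoop processes data"]))
```

In a real Hadoop job the same roles are filled by Mapper and Reducer classes (or by scripts run through the Streaming API), and the framework handles the grouping and distribution across the cluster.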