Big data analytics is the process of examining large amounts of data of a variety of types to uncover hidden patterns, unknown correlations, and other useful information. Such information can provide competitive advantages over rival organizations and result in business benefits, such as more effective marketing and increased revenue. New methods of working with big data, such as Hadoop and MapReduce, offer alternatives to traditional data warehousing.
Big Data Analytics with R and Hadoop focuses on techniques for integrating R and Hadoop using tools such as RHIPE and RHadoop. With these tools, a powerful data analytics engine can be built that runs analytics algorithms over large datasets in a scalable manner, combining the data analytics operations of R with the MapReduce and HDFS components of Hadoop.
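To make the MapReduce idea concrete, here is a minimal in-memory sketch of the map, shuffle, and reduce phases, written in Python for brevity. It does not use Hadoop, RHIPE, or RHadoop; the names `map_reduce`, `word_mapper`, and `sum_reducer` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Run a toy, single-process map/shuffle/reduce over an iterable of records."""
    groups = defaultdict(list)
    # Map phase: each record emits zero or more (key, value) pairs.
    for record in records:
        for key, value in mapper(record):
            # Shuffle phase: collect all values that share a key.
            groups[key].append(value)
    # Reduce phase: collapse each key's values into a single result.
    return {key: reducer(key, values) for key, values in groups.items()}

def word_mapper(line):
    # Hypothetical mapper: emit (word, 1) for every word in a line of text.
    for word in line.lower().split():
        yield word, 1

def sum_reducer(key, values):
    # Hypothetical reducer: add up the counts emitted for one word.
    return sum(values)

lines = ["big data with R", "big data with Hadoop"]
counts = map_reduce(lines, word_mapper, sum_reducer)
print(counts)
```

On a real cluster the map and reduce functions run in parallel across many machines and the input is read from HDFS; the sketch only shows the shape of the computation.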
Prerequisites
Working professionals, managers and recent graduates are eligible for the program. We do not specify any academic background requirements.
Elementary programming skills.
Duration
Online
It is a 16-day program, with sessions of up to 2 hours each.
The format is 40% theory, 80% Hands-on.
Corporate
It is a 4-day program, with sessions of up to 8 hours each.
The format is 40% theory, 80% Hands-on.
Classroom
Private classroom sessions are arranged on request; the minimum number of attendees per batch is 4.
Course Content
Introduction to Big Data
Logistics
Analysis through Data Visualization
Understanding the "business case" and defining a solution framework
An introduction to R programming language and environment
Techniques of Pre-processing data (Binning, Normalizing, Filling missing values, removing noise)
Data Pre-processing—continued
Traps and Errors
Confusion matrix, Analyze False positives and False Negatives from a problem perspective
Different error measures used in Forecasting
Model Selection
K-fold validation
Introduction to Decision Trees and their structure
Construction of Decision Trees through simplified examples
Choosing the "best" attribute at each non-leaf node
Entropy
Information Gain
Generalizing Decision Trees
Information Content and Gain Ratio
Dealing with numerical variables; other measures of randomness
Inductive learning from a 500-ft view
Issues in inductive learning like curse of dimensionality
Overfitting
Bias-Variance tradeoff
Pruning a Decision Tree
Cost as a consideration
Unwrapping Trees as rules
A mathematical model for association analysis
Large itemsets and Association Rules
Apriori
Constructing large itemsets that meet minimum support (minsup) through iterations
Interestingness of discovered association rules
Application examples
Association analysis vs. Classification
Using Association Rules to compare stores
Dissociation Rules
Sequential Analysis Using Association Rules
Data visualization and Story-telling
Anatomy of a graph
Animated graphs, BI dashboards and the latest trends in data visualization
An end-to-end case study in R involving understanding the data
Filling the missing values
Applying and assessing models and reporting the results.
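As an illustration of the entropy and information gain topics in the outline above, the following is a minimal sketch of how the "best" attribute at a decision tree node could be chosen. It is written in Python for brevity; the toy dataset and the function names `entropy` and `information_gain` are assumptions made for this example, not course material.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, labels, attribute):
    """Reduction in entropy obtained by splitting the rows on one attribute."""
    total = len(labels)
    partitions = defaultdict(list)
    for row, label in zip(rows, labels):
        partitions[row[attribute]].append(label)
    # Weighted average entropy of the partitions produced by the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder

# Hypothetical toy data: "outlook" separates the classes perfectly, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": True},
    {"outlook": "sunny", "windy": False},
    {"outlook": "rain", "windy": True},
    {"outlook": "rain", "windy": False},
]
labels = ["no", "no", "yes", "yes"]

gains = {attr: information_gain(rows, labels, attr) for attr in ("outlook", "windy")}
best_attribute = max(gains, key=gains.get)
print(gains, best_attribute)
```

Here the attribute with the highest gain would be placed at the node. Gain ratio, also listed in the outline, additionally divides the gain by the entropy of the split itself to avoid favouring attributes with many distinct values.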