Data Science Essentials

This course combines science, information technology, and industry domain knowledge.

This course prepares participants to start a career in data science. It is designed for learners who want to become researchers, analysts, or data visualization specialists.

This is a project-based, hands-on program. All students gain practical development experience by working with an integrated data set.

The program is best suited to graduates from technical and non-technical backgrounds, and to working professionals who want to start a new career as a data analyst in their own industry.

Program Duration

4 months

Daily or Weekend

3 – 4 hours per day

Program Covers

Python Programming
R Programming
Big Data with Hadoop 2
Machine Learning for Data Science & Analytics
Data Visualization with Tableau

Detailed Course Outline

Big Data with Hadoop

• Big Data and its Sources
• RDBMS vs. Hadoop
• Hadoop Architecture and Ecosystem
• When to Use and When Not to Use Hadoop
• HDFS Characteristics and Definitions
• HDFS Design and Architecture Overview
• Accessing HDFS
• HDFS Commands
• Basic File System Operations
• HDFS Administration Commands
• HDFS Features and Benefits

R Programming

Introduction to R
• Math, Variables, and Strings
• Vectors and Factors
• Vector operations

Data structures in R
• Arrays & Matrices
• Lists
• Data frames

R programming fundamentals
• Conditions and loops
• Functions in R
• Objects and Classes
• Debugging

Working with data in R
• Reading CSV and Excel Files
• Reading text files
• Writing and saving data objects to file in R

Strings and Dates in R
• String operations in R
• Regular Expressions
• Dates in R

Python Programming

• Introduction to Python
• Writing Our First Python Program
• Data types in Python
• Operators in Python
• Input and Output
• Control Statements
• Arrays in Python
• Strings and Characters
• Functions
• Lists and Tuples

Machine Learning for Data Science & Analytics

Introduction to Machine Learning
• Machine Learning Languages, Types, and Examples
• Machine Learning vs Statistical Modelling
• Supervised vs Unsupervised Learning
• Supervised Learning Classification
• Unsupervised Learning

Supervised Learning
• Understanding nearest neighbour classification
• The KNN algorithm
• Measuring similarity with distance
• Choosing an Appropriate K
• Use Case

Classification Using Naïve Bayes
• Basic Concepts of Bayesian Methods
• Probabilistic Learning

Classification using Decision Trees
• The C5.0 decision tree algorithm
• Understanding Classification Rules
• Separate and Conquer
• Rules from decision trees
• Advantages & Disadvantages of Decision Trees

Understanding Regression
• Simple Linear Regression
• Ordinary Least Squares Estimation
• Multiple Linear Regression

Support Vector Machines
• Classification with Hyperplanes
• Using Kernels for Non-Linear Spaces

Neural Networks
• Black Box Methods
• Training Neural Networks with Backpropagation

Unsupervised Learning

• Association Rules – Pattern Detection
• K-Means Clustering plus Advantages & Disadvantages
• Hierarchical Clustering plus Advantages & Disadvantages
• Measuring the Distances Between Clusters – Single Linkage Clustering
• Measuring the Distances Between Clusters – Algorithms for Hierarchical Clustering
• Density-Based Clustering

Evaluating Model Performance

Improving Model Performance

Data Visualization with Tableau

Introduction to Tableau Desktop

Connecting to Data

Customizing a Data Source
• Filtering Your Data
• Sorting Your Data
• Creating Groups in Your Data
• Creating Hierarchies in Your Data
• Working with Date Fields: Discrete and Continuous Time
• Working with Date Fields: Custom Dates
• Working with Multiple Measures: Dual Axis and Combo Charts
• Working with Multiple Measures: Combined Axis Charts
• Showing Relationships between Numerical Values
• Mapping Data Geographically
• Using Crosstabs: Totals and Aggregation
• Using Crosstabs: Highlight Tables
• Using Crosstabs: Heat Maps
• Using Calculations: Customize Your Data
• Using Calculations: Working with Strings, Dates, and Type Conversion Functions
• Using Calculations: Working with Aggregations
• Using Quick Table Calculations to Analyze Data
• Showing Breakdowns of the Whole
• Highlighting Data with Reference Lines
• Create a Dashboard: Combining Your Views
• Create a Dashboard: Add Actions for Interactivity
• Sharing Your Work

Working with a Data Extract
• Joining Tables
• Blending Multiple Data Sources
• Blending Data without a Common Field
• Using Split and Custom Split
• Advanced Calculations: Aggregating
• Controlling Table Calculations
• Showing the Biggest and Smallest Values
• Using Level of Detail Expressions
• Filtering and LOD Expressions
• Using Parameters to Control Data in the View
• Parameters: Swap Measures
• Using Sets to Highlight Data
• Advanced Mapping: Modifying Locations
• Advanced Mapping: Customizing Tableau’s Geocoding
• Advanced Mapping: Using a Background Image
• Viewing Distributions
• Comparing Measures Against a Goal
• Showing Statistics and Forecasting: Use the Analytics Pane and Trend Lines
• Advanced Dashboards: Using Design Techniques and Filter Actions
• Telling Stories with Data
