Mastering Data Science & Big Data Analytics

This course combines science, information technology, and industry domain knowledge.

This course enables participating learners to start a career in Data Science. It is designed for learners who want to become researchers, analysts, or visualizers.

This is a project-based, hands-on intensive program. All students gain practical development experience using an integrated data set.

The program is best suited to tech and non-tech graduates and working professionals who want to build a new career as a data analyst in their own industry.

Program Duration

5 months

Daily or Weekend

3 – 4 hours per day

Program Covers

Python Programming core & advanced
Python Scripting & Web Development

Detailed course

Python Programming

Introduction To Python

Installing Python

The Python Interpreter

Basics of Python
• Variables
• Data Types

Python Strings
• String Methods
• Python Numbers and Booleans

Python Lists

Python Sets

Python Tuples

Python Ranges

Python Dictionaries
• Dictionaries – Methods
• Conversions Between Data Types

Python Conditionals, Loops and Exceptions
• Conditionals – If / Elif / Else
• Loops – For / While
• Exceptions
• Try / Except / Else / Finally

Python Regular Expressions

Python Classes and Objects
• Classes – Objects
• Objects and their attributes
• Classes – Inheritance

Python Functions and Modules
• Functions – Basics
• Recursive functions
• Modules – Importing

Python File Operations

Advanced Python Programming

• List / Set / Dictionary Comprehensions
• Lambda Functions
• map() and filter()
• Iterators and Generators
• Decorators
• Threading Basics
• Connecting to a Database using Python
• Create Table / Insert / Update / Delete with Python
• Committing and Rolling Back Transactions

Python for Analytics & Machine Learning

• Fundamentals of Python
• Numpy Arrays
• Introduction to Matrices
• Pandas DataFrames
• Importing data into DataFrames
• Visualization
• Introduction to Matplotlib

Machine Learning with Python

Introduction to Data Science and Machine Learning

Introduction to Python

Statistics for Machine Learning
• Inferential Statistics
• Descriptive Statistics

Theory of Distribution
• Probability distribution
• Sampling Data
• Types of Sampling

Regression & Modelling
• Simple Linear Regression
• Multiple Linear Regression
• Polynomial Regression
• Decision Tree Regression
• Random Forest Regression
• Logistic Regression

K-Nearest Neighbors (K-NN)

Support Vector Machine (SVM)

Naive Bayes

Decision Tree Classification

Random Forest Classification

• K-Means, Hierarchical Clustering

Introduction to Deep Learning

• Apriori
• Upper Confidence Bound (UCB)
• Thompson Sampling
• Natural Language Processing
• Artificial Neural Networks

Big Data with Cloudera Hadoop

Introduction to Big Data & Hadoop
Apache Hadoop
• Apache Hadoop Ecosystem
• Hadoop Core Components

Hadoop Storage: HDFS
• Hadoop Processing
• MapReduce Framework
• Cloudera's Distribution of Hadoop (CDH)
• CDH Architecture

Hadoop Architecture and HDFS
• HDFS Deployments: High Availability (HA) & Non-HA
• HDFS HA Using Quorum Journal Manager (QJM)
• Data Replication & Rack Awareness
• HDFS Commands
• HDFS Administration Commands

Hadoop MapReduce Framework
• MapReduce Architecture
• MapReduce Application Workflow
• Data Locality Optimization in Hadoop

Resource Management Using YARN
• YARN (MRv2) Architecture
• MapReduce Vs Pig

Programming Structure in Pig
• Hive Vs Pig
• Hive Architecture & Components
• Deep Dive in Hive

NoSQL Database HBase
Apache Sqoop, Sqoop Syntax

Cloudera Impala
• Impala with Hive, HDFS, HBase
• Multi-node Hadoop Cluster Setup
• Cluster Maintenance

Apache Spark & Scala Programming

Scala
• Functional Programming Paradigm
• Introduction to Scala
• Data Types, Control Structures & Collections
• Functional Programming using Scala
• Object Oriented Programming
• Singletons and traits
• Scala In Depth
• Advanced Scala concepts
• Extractors, Annotations & Parsing

Apache Spark
• Introduction to Big Data and Spark
• Foundation to Spark
• Working with Resilient Distributed Datasets (RDDs)
• Spark Ecosystem – Spark Streaming & Spark SQL
