Spark and Scala Training in Hyderabad

  1. Introduction to Big Data and Hadoop

  • Big Data
    • What is Big Data?
    • Why are all industries talking about Big Data?
    • What are the issues in Big Data?
      • Storage
        • What are the challenges for storing big data?
      • Processing
        • What are the challenges for processing big data?
    • Which technologies support Big Data?
      • Hadoop
      • Databases
        • Traditional
        • NoSQL
  • Hadoop
    • What is Hadoop?
    • History of Hadoop
    • Why Hadoop?
    • Hadoop Use cases
    • Advantages and Disadvantages of Hadoop
  • Importance of Different Ecosystems of Hadoop
  • Importance of Integration with other Big Data solutions
  • Real-Time Big Data Use Cases
  • Batch vs. Real-Time Big Data Analytics
  • Real-Time Analytics
    • Streaming Data – Storm / Kafka / Flume
    • In-Memory Data – Spark

  1. Introduction to Spark

  • What is Spark
  • Why Spark
  • Who Uses Spark
  • Brief History of Spark
  • Storage Layers for Spark
  • Why Spark can be up to 100 times faster than MapReduce for in-memory workloads
  • Difference between Spark-1.x and Spark-2.x
  • Unified Stack of Spark
    • Spark Core
    • Spark SQL
    • Spark Streaming
    • Spark MLlib
    • Spark GraphX
  • Spark Architecture explanation
    • Master Slave architecture
    • Spark Driver
    • Workers
    • Executors
  • Installation of Spark in different modes
    • Local mode
    • Pseudo-distributed mode
    • Cluster mode

  1. Basics of Spark

  • Creating the Spark Context
  • Creating the Spark Conf
  • Creating the Spark Session
  • Configuring Spark Context with Spark Conf
  • Caching Overview
  • Distributed Persistence
  • Combining Scala and Java seamlessly
  • Deploying Applications with spark-submit
  • Verify spark jobs in Spark Web UI
  • SBT
    • Installing sbt
    • Building a Spark Project with sbt
    • Running Spark Project with sbt
  • Maven
    • Installing Maven
    • Building a Spark Project with Maven
    • Running Spark Project with Maven
  1. Resilient Distributed Dataset (RDD)

  • What is RDD
  • Creating RDDs
  • RDD Operations
    • Transformations
    • Actions
    • Lazy Evaluation
  • Passing Functions to Spark
    • Python, Java, Scala

  1. Working with Key/Value Pairs

  • Creating Pair RDDs
  • Transformations on Pair RDDs
    • Aggregations
    • Grouping Data
    • Joins
    • Sorting Data
  • Data Partitioning
    • Determining an RDD’s Partitioner
    • Custom Partitioners

  1. Loading and Saving Your Data

  • File Formats
    • Text, JSON, CSV, TSV, Object files
    • Hadoop Input and Output Formats
  • Loading Data using RDD
  • Saving Data using RDD
  • MapReduce and Pair RDD Operations
  • Scala and Hadoop Integrations

  1. Broadcast and Accumulators

  • Accumulators
    • Introduction to Accumulators
    • Practical Examples on Accumulators
    • Creating Custom Accumulators
  • Broadcast variables
    • Introduction to Broadcast variables
    • Practical Examples on Broadcast variables
    • Optimizing Broadcasts

  1. Working with Spark in different programming languages

  • Python
    • Installing Python
    • How to use 'pyspark'
    • Practical examples on Spark in Python
  • Scala
    • Installing Scala
    • How to use 'spark-shell'
    • Practical examples on Spark in Scala
  • Java
    • Installing Java
    • How to use 'Java'
    • Practical examples on Spark in Java
  • R
    • Installing R
    • How to use 'SparkR'
    • Practical examples on Spark in R

  1. Apache Spark SQL

  • Spark SQL & Hive Architecture explanation
  • Working with Spark SQL Datasets
  • Working with Spark SQL DataFrames
  • Practice on Spark SQL Context
  • Practical examples on Spark SQL
  • Integrating Spark SQL with
    • Hive
    • Phoenix
    • Cassandra
    • RDBMS
  • Processing different files using Spark
    • Text
    • JSON
    • CSV
    • TSV
    • Parquet
  • Spark SQL UDFs
  • Spark SQL Performance Tuning Options
  • JDBC/ODBC Server

  1. Apache Spark Streaming

  • Spark Streaming Architecture explanation
  • Creating the Streaming Context
  • Discretized Streams (DStreams)
  • Transformations on DStreams
    • UpdateStateByKey Operation
    • Transform Operation
    • Window Operations
    • Join Operations
  • Output Operations on DStreams
  • Streaming UI explanation
  • Spark Streaming Sources
    • Basic Sources
    • Advanced Sources
  • Integrating Spark Streaming with
    • Flume
    • Kafka
    • Twitter
    • HDFS
  • Performance Considerations
  • Practical examples on Spark Streaming

  1. Apache Spark MLlib

  • Machine Learning Basics
  • Machine Learning Algorithms
    • Classification
    • Clustering
    • Collaborative Filtering
  • Performance Considerations
  • Practical examples on Spark MLlib

  1. Apache Spark GraphX

  • Introduction to Spark GraphX
  • Practical Examples on Spark GraphX

  1. Apache Mesos

  • Introduction to Apache Mesos
  • Apache Mesos Architecture explanation
  • Practical Examples on Apache Mesos
  1. Apache Mahout

  • Introduction to Apache Mahout
  • Apache Mahout Architecture explanation
  • Practical Examples on Apache Mahout
  1. Apache Storm

  • Introduction to Apache Storm
  • Apache Storm Architecture explanation
  • Practical Examples on Apache Storm
  1. Apache Kafka

  • Introduction to Apache Kafka
  • Installing Apache Kafka
  • Apache Kafka Architecture explanation
  • Practical Examples on Apache Kafka
  1. Apache Flume

  • Introduction to Flume
  • Flume installation
  • Flume Architecture
    • Agent
    • Sources
    • Channels
    • Sinks
  • Practical Examples on Flume
  1. Apache Phoenix

  • Introduction to Phoenix
  • Installing Phoenix
  • Integrating with HBase
  • Practical Examples on Phoenix
  1. Apache Cassandra

  • Introduction to Cassandra
  • Installing Cassandra
  • Practical Examples on Cassandra
  1. Apache Drill

  • Introduction to Drill
  • Installing Drill
  • Practical Examples on Drill
  1. Apache Zeppelin

  • Introduction to Zeppelin
  • Installing Zeppelin
  • Practical Examples on Zeppelin
  • Data Visualization using Zeppelin
  1. Play Framework

  • Introduction to Play Framework
  • Installing Play Framework
  • Practical Examples on Play Framework
  • Spark Project using Play Framework
  1. Introduction of Scala

  • What is Scala?
  • Why Scala?
  • Advantages of Scala
  • Using the Scala REPL (Read-Eval-Print Loop)
  • What is Type Inference
  • Interoperability between Scala and Java
  1. Scala using Command Line

  • Installing Java & Scala
  • Interactive Scala
  • Writing Scala Scripts
  • Compiling Scala Programs
  1. Basics of Scala

  • Defining Variables
  • Defining Functions
  • String Interpolation
  • IDE for Scala
  1. Scala: Type Less, Do More

  • Semicolons
  • Variable Declarations
  • Method Declarations
  • Type Inference
  • Immutability
  • Reserved Words
  • Operators
  • Precedence Rules
  • Literals
  • Options
  • Arrays, Lists, Ranges, Tuples
  1. Expressions and Conditionals

  • If expressions
  • If-Else expressions
  • Match Expressions
  • For Loops
  • While Loops
  • Do-While Loops
  • Conditional Operators
  • Enumerations
  • Pattern Matching
  • Using try, catch, and finally Clauses
  1. Functional Programming in Scala

  • What is Functional Programming?
  • Functional Literals and Closures
  • Recursions
  • Tail Calls
  • Currying
  • Functional Data Structures
  • Sequences, Maps, Sets
  • Traversing
  • Traversal, Mapping, Filtering, Folding and Reducing
  • Implicit Function Parameters
  • Call by Name, Call by Value
  1. Object-Oriented Programming in Scala

  • Class and Object Basics
  • Value Classes
  • Parent Types
  • Constructors in Scala
  • Fields in Classes
  • Nested Types
  • Traits as Mixins
  • Stackable Traits
  • Creating Traits
  • Visibility Rules
  1. Scala for Big Data

  • Improving MapReduce with Scala
  • Moving Beyond MapReduce
  • Categories for Mathematics
  • A List of Scala-Based Data Tools


Spark with Big Data Integrations:

    • Spark and Hive integration
    • Spark and Phoenix integration
    • Spark and Cassandra integration
    • Spark and Flume integration
    • Spark and Kafka integration
    • Spark and RDBMS integration
  1. Real Time Big Data Projects

  • We will be sharing end-to-end Big Data projects
  • We provide Big Data project practice in our lab
  • We provide important recorded videos on our YouTube channel
  • For more information, search Google or YouTube for the keyword 'Kalyan Hadoop'
  1. Spark Administration topics:

  • Hadoop Installation
  • Hive Installation
  • HBase Installation
  • Zookeeper Installation
  • Phoenix Installation
  • Kafka Installation
  • Flume Installation
  • Zeppelin Installation
  • Play Framework Installation
  • MySQL Installation
  • Java Installation
  • Scala Installation
  • Python Installation
  • R Installation
  • Eclipse Installation
  • Cloudera Distribution installation

Free Big Data Workshops:

  • Spark & Scala
  • Cassandra
  • MongoDB
  • Search engine & E-commerce solutions
  • Big Data Analytics (R, Mahout, Spark ML)
  1. What we are offering to you:


  • Hands-on practice for the Cloudera CCA175 Spark and Hadoop Developer Certification
  • Tips to crack the CCA175 Certification
  • Hands-on practice with Spark & Scala real-time examples
  • 1 major project on Spark
  • 2 mini projects on Spark
  • Real-time Big Data projects will be shared
  • Free Big Data workshops on new & advanced technologies
  • Free Weekly Online Hadoop Certification
  • Hands-on installation of Spark and its related software on your laptop
  • Well-documented Spark & Scala material covering all the topics in the course
  • A well-documented Spark blog with frequently asked interview questions and answers, plus the latest updates on Big Data technology
  • Daily discussion of Spark & Scala interview questions
  • Resume preparation with POCs or projects based on your experience