Spark Training in Hyderabad

  1. Introduction to Spark

  • What is Spark
  • Why Spark
  • Who Uses Spark
  • Brief History of Spark
  • Storage Layers for Spark
  • Why Spark can be up to 100 times faster than MapReduce for in-memory workloads
  • Differences between Spark 1.x and Spark 2.x
  • Unified Stack of Spark
    • Spark Core
    • Spark SQL
    • Spark Streaming
    • Spark MLlib
    • Spark GraphX
  • Spark Architecture explanation
    • Master-slave architecture
    • Spark Driver
    • Workers
    • Executors
  • Installation of Spark in different modes
    • Local mode
    • Pseudo-distributed mode
    • Cluster mode
  2. Basics of Spark

  • Creating the Spark Context
  • Creating the Spark Conf
  • Creating the Spark Session
  • Configuring Spark Context with Spark Conf
  • Caching Overview
  • Distributed Persistence
  • Combining Scala and Java seamlessly
  • Deploying Applications with spark-submit
  • Verifying Spark jobs in the Spark Web UI
  • SBT
    • Installing sbt
    • Building a Spark Project with sbt
    • Running Spark Project with sbt
  • Maven
    • Installing Maven
    • Building a Spark Project with Maven
    • Running Spark Project with Maven
  3. Resilient Distributed Dataset (RDD)

  • What is RDD
  • Creating RDDs
  • RDD Operations
    • Transformations
    • Actions
    • Lazy Evaluation
  • Passing Functions to Spark
    • Python, Java, Scala
  4. Working with Key/Value Pairs

  • Creating Pair RDDs
  • Transformations on Pair RDDs
    • Aggregations
    • Grouping Data
    • Joins
    • Sorting Data
  • Data Partitioning
    • Determining an RDD’s Partitioner
    • Custom Partitioners
  5. Loading and Saving Your Data

  • File Formats
    • Text, JSON, CSV, TSV, and object files
    • Hadoop Input and Output Formats
  • Loading Data using RDD
  • Saving Data using RDD
  • MapReduce and Pair RDD Operations
  • Scala and Hadoop Integrations
  6. Broadcast and Accumulators

  • Accumulators
    • Introduction to Accumulators
    • Practical Examples on Accumulators
    • Creating Custom Accumulators
  • Broadcast variables
    • Introduction to Broadcast variables
    • Practical Examples on Broadcast variables
    • Optimizing Broadcasts
  7. Working with Spark in different programming languages

  • Python
    • Installing Python
    • How to use 'pyspark'
    • Practical examples on Spark in Python
  • Scala
    • Installing Scala
    • How to use 'spark-shell'
    • Practical examples on Spark in Scala
  • Java
    • Installing Java
    • How to use Spark from Java
    • Practical examples on Spark in Java
  • R
    • Installing R
    • How to use 'SparkR'
    • Practical examples on Spark in R
  8. Apache Spark SQL

  • Spark SQL & Hive Architecture explanation
  • Working with Spark SQL Datasets
  • Working with Spark SQL DataFrames
  • Practice on Spark SQL Context
  • Practical examples on Spark SQL
  • Integrating Spark SQL with
    • Hive
    • Phoenix
    • Cassandra
    • RDBMS
  • Processing different files using Spark
    • Text
    • JSON
    • CSV
    • TSV
    • Parquet
  • Spark SQL UDFs
  • Spark SQL Performance Tuning Options
  • JDBC/ODBC Server
  9. Apache Spark Streaming

  • Spark Streaming Architecture explanation
  • Creating the Streaming Context
  • Discretized Streams (DStreams)
  • Transformations on DStreams
    • UpdateStateByKey Operation
    • Transform Operation
    • Window Operations
    • Join Operations
  • Output Operations on DStreams
  • Streaming UI explanation
  • Spark Streaming Sources
    • Basic Sources
    • Advanced Sources
  • Integrating Spark Streaming with
    • Flume
    • Kafka
    • Twitter
    • HDFS
  • Performance Considerations
  • Practical examples on Spark Streaming
  10. Apache Spark MLlib

  • Machine Learning Basics
  • Machine Learning Algorithms
    • Classification
    • Clustering
    • Collaborative Filtering
  • Performance Considerations
  • Practical examples on Spark MLlib
  11. Apache Spark GraphX

  • Introduction to Spark GraphX
  • Practical Examples on Spark GraphX
  12. Apache Mesos

  • Introduction to Apache Mesos
  • Apache Mesos Architecture explanation
  • Practical Examples on Apache Mesos
  13. Apache Mahout

  • Introduction to Apache Mahout
  • Apache Mahout Architecture explanation
  • Practical Examples on Apache Mahout
  14. Apache Storm

  • Introduction to Apache Storm
  • Apache Storm Architecture explanation
  • Practical Examples on Apache Storm
  15. Apache Kafka

  • Introduction to Apache Kafka
  • Installing Apache Kafka
  • Apache Kafka Architecture explanation
  • Practical Examples on Apache Kafka
  16. Apache Flume

  • Introduction to Flume
  • Installing Flume
  • Flume Architecture
    • Agent
    • Sources
    • Channels
    • Sinks
  • Practical Examples on Flume
  17. Apache Phoenix

  • Introduction to Phoenix
  • Installing Phoenix
  • Integrating with HBase
  • Practical Examples on Phoenix
  18. Apache Cassandra

  • Introduction to Cassandra
  • Installing Cassandra
  • Practical Examples on Cassandra
  19. Apache Drill

  • Introduction to Drill
  • Installing Drill
  • Practical Examples on Drill
  20. Apache Zeppelin

  • Introduction to Zeppelin
  • Installing Zeppelin
  • Practical Examples on Zeppelin
  • Data Visualization using Zeppelin
  21. Play Framework

  • Introduction to Play Framework
  • Installing Play Framework
  • Practical Examples on Play Framework
  • Spark Project using Play Framework