Spark Training in Hyderabad

Introduction to Spark

What is Spark
Why Spark
Who Uses Spark
Brief History of Spark
Storage Layers for Spark
Why Spark can be up to 100 times faster than MapReduce for in-memory workloads
Differences between Spark 1.x and Spark 2.x
Unified Stack of Spark
- Spark Core
- Spark SQL
- Spark Streaming
- Spark MLlib
- Spark GraphX
Spark Architecture explanation
- Master-Slave architecture
- Spark Driver
- Workers
- Executors
Installation of Spark in different modes
- Local mode
- Pseudo-distributed mode
- Cluster mode

Basics of Spark

Creating the Spark Context
Creating the Spark Conf
Creating the Spark Session
Configuring Spark Context with Spark Conf
Caching Overview
Distributed Persistence
Combining Scala and Java seamlessly
Deploying Applications with spark-submit
Verifying Spark jobs in the Spark Web UI
SBT
- Installing sbt
- Building a Spark Project with sbt
- Running Spark Project with sbt
Maven
- Installing Maven
- Building a Spark Project with Maven
- Running Spark Project with Maven

Resilient Distributed Dataset (RDD)

What is RDD
Creating RDDs
RDD Operations
- Transformations
- Actions
- Lazy Evaluation
Passing Functions to Spark
- Python, Java, Scala

Working with Key/Value Pairs

Creating Pair RDDs
Transformations on Pair RDDs
- Aggregations
- Grouping Data
- Joins
- Sorting Data
Data Partitioning
- Determining an RDD’s Partitioner
- Custom Partitioners

Loading and Saving Your Data

File Formats
- Text, JSON, CSV, TSV, and object files
- Hadoop Input and Output Formats
Loading Data using RDD
Saving Data using RDD
MapReduce and Pair RDD Operations
Scala and Hadoop Integrations

Broadcast and Accumulators

Accumulators
- Introduction to Accumulators
- Practical Examples on Accumulators
- Creating Custom Accumulators
Broadcast variables
- Introduction to Broadcast variables
- Practical Examples on Broadcast variables
- Optimizing Broadcasts

Working with Spark in different programming languages

Python
- Installing Python
- How to use 'pyspark'
- Practical examples on Spark in Python
Scala
- Installing Scala
- How to use 'spark-shell'
- Practical examples on Spark in Scala
Java
- Installing Java
- How to use the Spark Java API
- Practical examples on Spark in Java
R
- Installing R
- How to use 'SparkR'
- Practical examples on Spark in R

Apache Spark SQL

Spark SQL & Hive Architecture explanation
Working with Spark SQL DataSets
Working with Spark SQL DataFrames
Practice on Spark SQL Context
Practical examples on Spark SQL
Integrating Spark SQL with
- Hive
- Phoenix
- Cassandra
- RDBMS
Processing different files using Spark
- Text
- JSON
- CSV
- TSV
- Parquet
Spark SQL UDFs
Spark SQL Performance Tuning Options
JDBC/ODBC Server

Apache Spark Streaming

Spark Streaming Architecture explanation
Creating the Streaming Context
Discretized Streams (DStreams)
Transformations on DStreams
- UpdateStateByKey Operation
- Transform Operation
- Window Operations
- Join Operations
Output Operations on DStreams
Streaming UI explanation
Spark Streaming Sources
- Basic Sources
- Advanced Sources
Integrating Spark Streaming with
- Flume
- Kafka
- Twitter
- HDFS
Performance Considerations
Practical examples on Spark Streaming

Apache Spark MLlib

Machine Learning Basics
Machine Learning Algorithms
- Classification
- Clustering
- Collaborative Filtering
Performance Considerations
Practical examples on Spark MLlib

Apache Spark GraphX

Introduction to Spark GraphX
Practical Examples on Spark GraphX

Apache Mesos

Introduction to Apache Mesos
Apache Mesos Architecture explanation
Practical Examples on Apache Mesos

Apache Mahout

Introduction to Apache Mahout
Apache Mahout Architecture explanation
Practical Examples on Apache Mahout

Apache Storm

Introduction to Apache Storm
Apache Storm Architecture explanation
Practical Examples on Apache Storm

Apache Kafka

Introduction to Apache Kafka
Installing Apache Kafka
Apache Kafka Architecture explanation
Practical Examples on Apache Kafka

Apache Flume

Introduction to Flume
Installing Flume
Flume Architecture
- Agent
- Sources
- Channels
- Sinks
Practical Examples on Flume

Apache Phoenix

Introduction to Phoenix
Installing Phoenix
Integrating with HBase
Practical Examples on Phoenix

Apache Cassandra

Introduction to Cassandra
Installing Cassandra
Practical Examples on Cassandra

Apache Drill

Introduction to Drill
Installing Drill
Practical Examples on Drill

Apache Zeppelin

Introduction to Zeppelin
Installing Zeppelin
Practical Examples on Zeppelin
Data Visualization using Zeppelin

Play Framework

Introduction to Play Framework
Installing Play Framework
Practical Examples on Play Framework
Spark Project using Play Framework

Kalyan Hadoop and Spark Training in Hyderabad Learn Big Data From Basics... @ Kalyan @
