Friday 9 June 2017

How to Perform Incremental Load in Sqoop

Importing Incremental Data

You can also perform incremental imports using Sqoop. Incremental import is a technique that imports only the newly added rows of a table. To perform an incremental import, you must add the --incremental, --check-column, and --last-value options.

--incremental - Used by Sqoop to determine which rows are new. Legal values for this mode are append and lastmodified.

--check-column - Specifies the column to be examined when determining which rows are candidates for import.

--last-value - The maximum value of the check column from the previous import run.



Example:


sqoop import \
--connect jdbc:mysql://localhost:3306/kalyan \
--username root \
--table sample \
-m 1 \
--incremental append \
--check-column id \
--last-value 1000



How to Enable Transactions in Hive

Hive supports transactions when the correct parameters are set. The following configuration parameters must be set to turn on transaction support in Hive:

hive.support.concurrency - true

hive.enforce.bucketing - true

hive.exec.dynamic.partition.mode - nonstrict

hive.txn.manager - org.apache.hadoop.hive.ql.lockmgr.DbTxnManager

hive.compactor.initiator.on - true on one instance of the Thrift metastore service

hive.compactor.worker.threads - a positive number (e.g. 10) on at least one instance of the Thrift metastore service



Transactional tables must be bucketed and stored as ORC. Use this table format:

CREATE TABLE mytable (
 c1 int,
 c2 string,
 c3 string
)
CLUSTERED BY (c1) INTO x BUCKETS
STORED AS orc
TBLPROPERTIES('transactional' = 'true');



What Is ACID and Why Use It?

ACID stands for four traits of database transactions:

Atomicity - An operation either succeeds completely or fails; operations do not leave incomplete data in the system.

Consistency - Once an operation completes, the results of that operation are visible to every subsequent operation.

Isolation - Operations completed by one user do not cause unexpected side effects for other users.

Durability - Once an operation is complete, it will be preserved even if the machine or system experiences a failure.


Together, these properties define correct transactional behaviour.

If your operations are ACID compliant, the system will ensure your processing is protected against failures.


Friday 7 April 2017

SPARK BASICS Practice on 02 Apr 2017

`Spark` is meant for `In-Memory Distributed Computing`

`Spark` provides 4 libraries:
1. Spark SQL
2. Spark Streaming
3. Spark MLlib
4. Spark GraphX

`Spark Context` is the entry point for all `Spark Operations`

`Resilient Distributed DataSets` => RDD

RDD features:
-------------------
1. immutability
2. lazy evaluation
3. cacheable
4. type inference
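
A minimal sketch of these features, assuming the `sc` that spark-shell provides:

val rdd = sc.parallelize(1 to 5)  // type inference: rdd is RDD[Int], no annotation needed
val doubled = rdd.map(_ * 2)      // immutability + lazy evaluation: a new RDD, nothing computed yet
doubled.cache()                   // cacheable: keep the result in memory once computed
doubled.sum                       // an action triggers the computation => 30.0
doubled.count                     // served from the cache afterwards => 5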

RDD operations:
-----------------
1. Transformations
<old rdd> ----> <new rdd>

2. Actions
<rdd> ---> <result>


Examples on RDD:
-------------------------
list <- {1,2,3,4,5}

1. Transformations:
---------------------
Ex1:
-----
f(x) <- {x + 1}

f(list) <- {2,3,4,5,6}


Ex2:
-----
f(x) <- {x * x}

f(list) <- {1,4,9,16,25}


2. Actions:
---------------------

sum(list) -> 15
min(list) -> 1
max(list) -> 5



How to Start Spark:
-----------------------------
scala => spark-shell
python => pyspark
R => sparkR

Spark-1.x:
--------------------------------------
Spark context available as 'sc'
Spark SQL context available as 'sqlContext'


Spark-2.x:
--------------------------------------
Spark context available as 'sc'
Spark session available as 'spark'
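
In Spark 2.x the Spark session is a SparkSession that wraps the older contexts; the pre-created Spark context is reachable from it:

val sc = spark.sparkContext   // the same SparkContext the shell exposes as 'sc'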


How to Create a RDD:
---------------------------------------
We can create an RDD in 2 ways:
1. from collections (List, Seq, Set, ....)
2. from data sets (text, csv, tsv, json, ...)


1. from collections:
---------------------------------------
val list = List(1,2,3,4,5)

val rdd = sc.parallelize(list)

val rdd = sc.parallelize(list, 2)

Syntax:
-----------------------
val rdd = sc.parallelize(<collection object>, <no.of partitions>)


2. from datasets:
---------------------------------------
val file = "file:///home/orienit/work/input/demoinput"

val rdd = sc.textFile(file)

val rdd = sc.textFile(file, 1)


Syntax:
-----------------------
val rdd = sc.textFile(<file path>, <no.of partitions>)

------------------------------------------------------

scala> val list = List(1,2,3,4,5)
list: List[Int] = List(1, 2, 3, 4, 5)

scala> val rdd = sc.parallelize(list)
rdd: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:26

scala> rdd.getNumPartitions
res0: Int = 4

scala> val rdd = sc.parallelize(list, 2)
rdd: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[1] at parallelize at <console>:26

scala> rdd.getNumPartitions
res1: Int = 2


------------------------------------------------------

scala> val file = "file:///home/orienit/work/input/demoinput"
file: String = file:///home/orienit/work/input/demoinput

scala> val rdd = sc.textFile(file)
rdd: org.apache.spark.rdd.RDD[String] = file:///home/orienit/work/input/demoinput MapPartitionsRDD[3] at textFile at <console>:26

scala> rdd.getNumPartitions
res2: Int = 2

scala> val rdd = sc.textFile(file, 1)
rdd: org.apache.spark.rdd.RDD[String] = file:///home/orienit/work/input/demoinput MapPartitionsRDD[5] at textFile at <console>:26

scala> rdd.getNumPartitions
res3: Int = 1

------------------------------------------------------
Examples on RDD
------------------------------------------------------
val list = List(1,2,3,4,5)

val rdd1 = sc.parallelize(list, 2)

------------------------------------------------------

val rdd2 = rdd1.map(x => x + 1)

val rdd3 = rdd1.map(x => x * x)

------------------------------------------------------

scala> val list = List(1,2,3,4,5)
list: List[Int] = List(1, 2, 3, 4, 5)

scala> val rdd1 = sc.parallelize(list, 2)
rdd1: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[6] at parallelize at <console>:26

------------------------------------------------------

scala> val rdd2 = rdd1.map(x => x + 1)
rdd2: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[7] at map at <console>:28

scala> rdd1.collect
res4: Array[Int] = Array(1, 2, 3, 4, 5)

scala> rdd2.collect
res5: Array[Int] = Array(2, 3, 4, 5, 6)


scala> val rdd3 = rdd1.map(x => x * x)
rdd3: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[8] at map at <console>:28

scala> rdd3.collect
res6: Array[Int] = Array(1, 4, 9, 16, 25)

------------------------------------------------------

rdd1.min

rdd1.max

rdd1.sum


------------------------------------------------------
scala> rdd1.min
res7: Int = 1

scala> rdd1.max
res8: Int = 5

scala> rdd1.sum
res9: Double = 15.0

scala> rdd1.collect
res10: Array[Int] = Array(1, 2, 3, 4, 5)

scala> rdd1.count
res11: Long = 5

------------------------------------------------------
Word Count in Spark using Scala
------------------------------------------------------

val input = "file:///home/orienit/work/input/demoinput"

val output = "file:///home/orienit/work/output/spark-op"

val fileRdd = sc.textFile(input, 1)

val wordsRdd = fileRdd.flatMap(line => line.split(" "))

val tuplesRdd = wordsRdd.map(word => (word, 1))

val wordCountRdd = tuplesRdd.reduceByKey((a,b) => a + b)

wordCountRdd.saveAsTextFile(output)

------------------------------------------------------
Optimize the Code :
------------------------------------------------------

val input = "file:///home/orienit/work/input/demoinput"

val output = "file:///home/orienit/work/output/spark-op"

val fileRdd = sc.textFile(input, 1)

val wordCountRdd = fileRdd.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey((a,b) => a + b)

wordCountRdd.saveAsTextFile(output)
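
The same pipeline written with Scala's placeholder syntax is equivalent, just terser:

val wordCountRdd = fileRdd.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)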


------------------------------------------------------

scala> val input = "file:///home/orienit/work/input/demoinput"
input: String = file:///home/orienit/work/input/demoinput

scala> val output = "file:///home/orienit/work/output/spark-op"
output: String = file:///home/orienit/work/output/spark-op

scala> val fileRdd = sc.textFile(input, 1)
fileRdd: org.apache.spark.rdd.RDD[String] = file:///home/orienit/work/input/demoinput MapPartitionsRDD[11] at textFile at <console>:26

scala> fileRdd.collect
res12: Array[String] = Array(I am going, to hyd, I am learning, hadoop course)

scala> val wordsRdd = fileRdd.flatMap(line => line.split(" "))
wordsRdd: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[12] at flatMap at <console>:28

scala> wordsRdd.collect
res13: Array[String] = Array(I, am, going, to, hyd, I, am, learning, hadoop, course)

scala> val tuplesRdd = wordsRdd.map(word => (word, 1))
tuplesRdd: org.apache.spark.rdd.RDD[(String, Int)] = MapPartitionsRDD[13] at map at <console>:30

scala> tuplesRdd.collect
res14: Array[(String, Int)] = Array((I,1), (am,1), (going,1), (to,1), (hyd,1), (I,1), (am,1), (learning,1), (hadoop,1), (course,1))

scala> val wordCountRdd = tuplesRdd.reduceByKey((a,b) => a + b)
wordCountRdd: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[14] at reduceByKey at <console>:32

scala> wordCountRdd.collect
res15: Array[(String, Int)] = Array((learning,1), (hadoop,1), (am,2), (hyd,1), (I,2), (to,1), (going,1), (course,1))

scala> wordCountRdd.saveAsTextFile(output)




------------------------------------------------------
Grep Job in Spark using Scala:
------------------------------------------------------

val input = "file:///home/orienit/work/input/demoinput"

val output = "file:///home/orienit/work/output/spark-grep-op"

val fileRdd = sc.textFile(input, 1)

val grepRdd = fileRdd.filter(line => line.contains("am"))

grepRdd.saveAsTextFile(output)



------------------------------------------------------
Sed Job in Spark using Scala:
------------------------------------------------------

val input = "file:///home/orienit/work/input/demoinput"

val output = "file:///home/orienit/work/output/spark-sed-op"

val fileRdd = sc.textFile(input, 1)

val sedRdd = fileRdd.map(line => line.replaceAll("am", "xyz"))

sedRdd.saveAsTextFile(output)

------------------------------------------------------
Spark SQL:
------------------------------------------------------

hive> select * from kalyan.student;
scala> spark.sql("select * from kalyan.student").show

------------------------------------------------------

hive> select year, count(*) from kalyan.student group by year;
scala> spark.sql("select year, count(*) from kalyan.student group by year").show

------------------------------------------------------
Data Frames in Spark
------------------------------------------------------
val hiveDf = spark.sql("select * from kalyan.student")

hiveDf.show


hiveDf.registerTempTable("hivetbl")


val prop = new java.util.Properties
prop.setProperty("driver","com.mysql.jdbc.Driver")
prop.setProperty("user","root")
prop.setProperty("password","hadoop")

val jdbcDf = spark.read.jdbc("jdbc:mysql://localhost:3306/kalyan", "student", prop)

jdbcDf.show

jdbcDf.registerTempTable("jdbctbl")

------------------------------------------------------
scala> val hiveDf = spark.sql("select * from kalyan.student")
hiveDf: org.apache.spark.sql.DataFrame = [name: string, id: int ... 2 more fields]

scala> hiveDf.show
+------+---+------+----+
|  name| id|course|year|
+------+---+------+----+
|  arun|  1|   cse|   1|
| sunil|  2|   cse|   1|
|   raj|  3|   cse|   1|
|naveen|  4|   cse|   1|
| venki|  5|   cse|   2|
|prasad|  6|   cse|   2|
| sudha|  7|   cse|   2|
|  ravi|  1|  mech|   1|
|  raju|  2|  mech|   1|
|  roja|  3|  mech|   1|
|  anil|  4|  mech|   2|
|  rani|  5|  mech|   2|
|anvith|  6|  mech|   2|
| madhu|  7|  mech|   2|
|  arun|  1|    it|   3|
| sunil|  2|    it|   3|
|   raj|  3|    it|   3|
|naveen|  4|    it|   3|
| venki|  5|    it|   4|
|prasad|  6|    it|   4|
+------+---+------+----+
only showing top 20 rows

------------------------------------------------------

scala> val jdbcDf = spark.read.jdbc("jdbc:mysql://localhost:3306/kalyan", "student", prop)
jdbcDf: org.apache.spark.sql.DataFrame = [name: string, id: int ... 2 more fields]

scala> jdbcDf.show
+------+---+------+----+
|  name| id|course|year|
+------+---+------+----+
|  anil|  1| spark|2016|
|anvith|  5|hadoop|2015|
|   dev|  6|hadoop|2015|
|   raj|  3| spark|2016|
| sunil|  4|hadoop|2015|
|venkat|  2| spark|2016|
+------+---+------+----+

------------------------------------------------------

scala> hiveDf.registerTempTable("hivetbl")
warning: there was one deprecation warning; re-run with -deprecation for details

scala> jdbcDf.registerTempTable("jdbctbl")
warning: there was one deprecation warning; re-run with -deprecation for details
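
The deprecation warnings appear because registerTempTable was superseded in Spark 2.x; the current equivalent is:

hiveDf.createOrReplaceTempView("hivetbl")

jdbcDf.createOrReplaceTempView("jdbctbl")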

------------------------------------------------------

spark.sql("select hivetbl.*, jdbctbl.* from hivetbl join jdbctbl on hivetbl.name = jdbctbl.name").show

------------------------------------------------------

scala> spark.sql("select hivetbl.*, jdbctbl.* from hivetbl join jdbctbl on hivetbl.name = jdbctbl.name").show
+------+---+------+----+------+---+------+----+
|  name| id|course|year|  name| id|course|year|
+------+---+------+----+------+---+------+----+
|  anil|  4|   ece|   4|  anil|  1| spark|2016|
|  anil|  4|  mech|   2|  anil|  1| spark|2016|
|anvith|  6|   ece|   4|anvith|  5|hadoop|2015|
|anvith|  6|  mech|   2|anvith|  5|hadoop|2015|
|   raj|  3|    it|   3|   raj|  3| spark|2016|
|   raj|  3|   cse|   1|   raj|  3| spark|2016|
| sunil|  2|    it|   3| sunil|  4|hadoop|2015|
| sunil|  2|   cse|   1| sunil|  4|hadoop|2015|
+------+---+------+----+------+---+------+----+



------------------------------------------------------

import org.apache.spark.sql.cassandra._   // cassandraFormat comes from the spark-cassandra-connector

val casDf = sqlContext.read.cassandraFormat("student", "kalyan").load()

casDf.show

------------------------------------------------------


scala> val casDf = sqlContext.read.cassandraFormat("student", "kalyan").load()
casDf: org.apache.spark.sql.DataFrame = [name: string, course: string ... 2 more fields]

scala> casDf.show
+------+------+---+----+
|  name|course| id|year|
+------+------+---+----+
|  anil| spark|  1|2016|
|   raj| spark|  3|2016|
|anvith|hadoop|  5|2015|
|   dev|hadoop|  6|2015|
| sunil|hadoop|  4|2015|
|venkat| spark|  2|2016|
|    kk|hadoop|  7|2015|
+------+------+---+----+

------------------------------------------------------
INSERT INTO kalyan.student(name, id, course, year) VALUES ('rajesh', 8, 'hadoop', 2017);

------------------------------------------------------

scala> casDf.show
+------+------+---+----+
|  name|course| id|year|
+------+------+---+----+
|  anil| spark|  1|2016|
|   raj| spark|  3|2016|
|anvith|hadoop|  5|2015|
|   dev|hadoop|  6|2015|
| sunil|hadoop|  4|2015|
|venkat| spark|  2|2016|
|    kk|hadoop|  7|2015|
|rajesh|hadoop|  8|2017|
+------+------+---+----+

------------------------------------------------------

cqlsh:kalyan> select year, count(*) from kalyan.student group by year;
SyntaxException: line 1:42 missing EOF at 'group' (...(*) from kalyan.student [group] by...)

CQL in this Cassandra version does not support GROUP BY, so the aggregation is done through Spark SQL instead:

------------------------------------------------------

casDf.registerTempTable("castbl")


spark.sql("select year, count(*) from castbl group by year").show

------------------------------------------------------

scala> casDf.registerTempTable("castbl")
warning: there was one deprecation warning; re-run with -deprecation for details


scala> spark.sql("select year, count(*) from castbl group by year").show 
+----+--------+
|year|count(1)|
+----+--------+
|2015|       4|
|2016|       3|
|2017|       1|
+----+--------+



------------------------------------------------------
spark.sql("select castbl.*, jdbctbl.* from castbl join jdbctbl on castbl.name = jdbctbl.name").show
------------------------------------------------------

scala> spark.sql("select castbl.*, jdbctbl.* from castbl join jdbctbl on castbl.name = jdbctbl.name").show
+------+------+---+----+------+---+------+----+
|  name|course| id|year|  name| id|course|year|
+------+------+---+----+------+---+------+----+
|  anil| spark|  1|2016|  anil|  1| spark|2016|
|anvith|hadoop|  5|2015|anvith|  5|hadoop|2015|
|   dev|hadoop|  6|2015|   dev|  6|hadoop|2015|
|   raj| spark|  3|2016|   raj|  3| spark|2016|
| sunil|hadoop|  4|2015| sunil|  4|hadoop|2015|
|venkat| spark|  2|2016|venkat|  2| spark|2016|
+------+------+---+----+------+---+------+----+


------------------------------------------------------
scala> casDf.toJSON.collect.foreach(println)
{"name":"anil","course":"spark","id":1,"year":2016}
{"name":"raj","course":"spark","id":3,"year":2016}
{"name":"anvith","course":"hadoop","id":5,"year":2015}
{"name":"dev","course":"hadoop","id":6,"year":2015}
{"name":"sunil","course":"hadoop","id":4,"year":2015}
{"name":"venkat","course":"spark","id":2,"year":2016}
{"name":"kk","course":"hadoop","id":7,"year":2015}
{"name":"rajesh","course":"hadoop","id":8,"year":2017}


scala> hiveDf.toJSON.collect.foreach(println)
{"name":"arun","id":1,"course":"cse","year":1}
{"name":"sunil","id":2,"course":"cse","year":1}
{"name":"raj","id":3,"course":"cse","year":1}
{"name":"naveen","id":4,"course":"cse","year":1}
{"name":"venki","id":5,"course":"cse","year":2}
{"name":"prasad","id":6,"course":"cse","year":2}
{"name":"sudha","id":7,"course":"cse","year":2}
{"name":"ravi","id":1,"course":"mech","year":1}
{"name":"raju","id":2,"course":"mech","year":1}
{"name":"roja","id":3,"course":"mech","year":1}
{"name":"anil","id":4,"course":"mech","year":2}
{"name":"rani","id":5,"course":"mech","year":2}
{"name":"anvith","id":6,"course":"mech","year":2}
{"name":"madhu","id":7,"course":"mech","year":2}


scala> jdbcDf.toJSON.collect.foreach(println)
{"name":"anil","id":1,"course":"spark","year":2016}
{"name":"anvith","id":5,"course":"hadoop","year":2015}
{"name":"dev","id":6,"course":"hadoop","year":2015}
{"name":"raj","id":3,"course":"spark","year":2016}
{"name":"sunil","id":4,"course":"hadoop","year":2015}
{"name":"venkat","id":2,"course":"spark","year":2016}

------------------------------------------------------

SCALA BASICS Practice on 01 Apr 2017

`Scala` means `Scalable Language`

`Scala` is a Functional + Object Oriented Programming Language

In Java:
------------
1. Primitive data types (int, float, double, long ...)
2. Wrapper Classes (Integer, Float, Double, Long ..)

Wrapper Classes support `Serialization and DeSerialization`

Java Syntax:
-----------------
<data type> <variable name> = <data> ;



In Scala:
-------------------
Everything is an `Object`


Scala Syntax:
-----------------
val <variable name> : <data type> = <data>

var <variable name> : <data type> = <data>


val => value => it is immutable (we can't change the reference)

var => variable => it is mutable (we can change the reference)


orienit@kalyan:~$ scala
Welcome to Scala 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_66-internal).
Type in expressions for evaluation. Or try :help.

scala> 

`Scala` Provides `REPL` functionality.

`REPL` means Read Evaluate Print Loop

`REPL` functionality is already available in the `Python` and `R` programming languages

---------------------------------------------------
scala> val name : String = "kalyan"
name: String = kalyan

scala> name = "xyz"
<console>:12: error: reassignment to val
       name = "xyz"
            ^

scala> var name : String = "kalyan"
name: String = kalyan

scala> name = "xyz"
name: String = xyz

---------------------------------------------------
`Scala` provides `Type Infer`
---------------------------------------------------
Based on the `data`, Scala infers the `data type`


scala> val name : String = "kalyan"
name: String = kalyan

scala> val name = "kalyan"
name: String = kalyan

scala> val id = 1
id: Int = 1

scala> val id = 1.5
id: Double = 1.5

scala> val id = 1l
id: Long = 1

scala> val id = 1f
id: Float = 1.0


---------------------------------------------------
scala> val id = 1
id: Int = 1

scala> val id = 1l
id: Long = 1

scala> val id = 1f
id: Float = 1.0

scala> val id = 1d
id: Double = 1.0

scala> val id : Long = 1
id: Long = 1

scala> val id : Float = 1
id: Float = 1.0

scala> val id : Double = 1
id: Double = 1.0

---------------------------------------------------

scala> val a = 1
a: Int = 1

scala> a.toInt
res4: Int = 1

scala> a.toDouble
res5: Double = 1.0

scala> a.toFloat
res6: Float = 1.0

scala> a.toLong
res7: Long = 1

scala> a.toChar
res8: Char = ?


---------------------------------------------------
`Scala` Provides `Operator Overloading` similar to `C++`
---------------------------------------------------
scala> val a = 1
a: Int = 1

scala> val b = 2
b: Int = 2

scala> val c = a + b
c: Int = 3

scala> val c = a.+(b)
c: Int = 3

scala> val c = a.-(b)
c: Int = -1

scala> val c = a.*(b)
c: Int = 2

---------------------------------------------------

a + b   <===>  a.+(b)

a - b   <===>  a.-(b)

a * b   <===>  a.*(b)

---------------------------------------------------

scala> a < b
res0: Boolean = true

scala> a <= b
res1: Boolean = true

scala> a >= b
res2: Boolean = false

scala> a > b
res3: Boolean = false

---------------------------------------------------

scala> a.
!=   >>            isInfinite      min              toInt           
%    >>>           isInfinity      round            toLong          
&    ^             isNaN           self             toOctalString   
*    abs           isNegInfinity   shortValue       toRadians       
+    byteValue     isPosInfinity   signum           toShort         
-    ceil          isValidByte     to               unary_+         
/    compare       isValidChar     toBinaryString   unary_-         
<    compareTo     isValidInt      toByte           unary_~         
<<   doubleValue   isValidLong     toChar           underlying      
<=   floatValue    isValidShort    toDegrees        until           
==   floor         isWhole         toDouble         |               
>    getClass      longValue       toFloat                          
>=   intValue      max             toHexString                      


---------------------------------------------------
if, if else, if else if expressions in Scala
---------------------------------------------------

if(exp1) { body1 }

if(exp1) { 
body1 
}

---------------------------------------------------

if(exp1) { body1 } else { body2 }

if(exp1) { 
body1 
} else { 
body2 
}


Wrong in `Scala Prompt` (the REPL evaluates a complete `if(exp1) { body1 }` as soon as the line ends, so an `else` that starts on a later line or after a blank line is never attached)
---------------------------------------------------
if(exp1) { 
body1 
}
else { 
body2 
}

if(exp1) { 
body1 
} else 

{ 
body2 
}

---------------------------------------------------
if(exp1) { body1 } else if(exp2) { body2 }

if(exp1) { 
body1 
} else if(exp2) { 
body2 
}


if(exp1) { 
body1 
} else if(exp2) { 
body2 
} else if(exp3) { 
body3 
}

---------------------------------------------------
val a = 10
val b = 20
val c = 30

---------------------------------------------------

scala> val a = 10
a: Int = 10

scala> val b = 20
b: Int = 20

scala> val c = 30
c: Int = 30


scala> if(a > b) println("nothing")

scala> if(a < b) println("i am there")
i am there

scala> if(a < b) println("i am there") else println("nothing")
i am there

scala> if(a > b) println("i am there") else println("nothing")
nothing

scala> if(a > b) println("i am there") else if(a < b) println("nothing") 
nothing

---------------------------------------------------
Arrays in Java:
---------------------------------------------------
<data type>[] <variable name> = {};

<data type>[] <variable name> = new <data type>[size];

String[] names = {"anil", "raj", "venkat"};

or

String[] names =  new String[3];
names[0] = "anil";
names[1] = "raj";
names[2] = "venkat";

---------------------------------------------------
Arrays in Scala:
---------------------------------------------------
val <variable name> : Array[<data type>] = Array[<data type>](...)

val <variable name> : Array[<data type>] = new Array[<data type>](size)


val names : Array[String] = Array[String]("anil", "raj", "venkat")

or

val names : Array[String] = new Array[String](3)
names(0) = "anil"
names(1) = "raj"
names(2) = "venkat"


---------------------------------------------------

scala> val names : Array[String] = Array[String]("anil", "raj", "venkat")
names: Array[String] = Array(anil, raj, venkat)

scala> names(0)
res14: String = anil

scala> names(1)
res15: String = raj

scala> names(2)
res16: String = venkat

---------------------------------------------------

scala> val names : Array[String] = new Array[String](3)
names: Array[String] = Array(null, null, null)

scala> names(0) = "anil"

scala> names(1) = "raj"

scala> names(2) = "venkat"

scala> names
res20: Array[String] = Array(anil, raj, venkat)

scala> names(0)
res21: String = anil

scala> names(1)
res22: String = raj

scala> names(2)
res23: String = venkat

---------------------------------------------------

scala> val names = Array[String]("anil", "raj", "venkat")
names: Array[String] = Array(anil, raj, venkat)

scala> val names = Array("anil", "raj", "venkat")
names: Array[String] = Array(anil, raj, venkat)

---------------------------------------------------

val nums = Array(1,2,3,4,5,6,7,8,9,10)

---------------------------------------------------
scala> val nums = Array(1,2,3,4,5,6,7,8,9,10)
nums: Array[Int] = Array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> for(x <- nums) println(x)
1
2
3
4
5
6
7
8
9
10

scala> for(x <- nums) if(x %2 == 0) println(x)
2
4
6
8
10

scala> for(x <- nums if(x %2 == 0) ) println(x)
2
4
6
8
10

scala> for(x <- nums) if(x %2 == 0) println("Even Number: " + x) else println("Odd Number: " + x)
Odd Number: 1
Even Number: 2
Odd Number: 3
Even Number: 4
Odd Number: 5
Even Number: 6
Odd Number: 7
Even Number: 8
Odd Number: 9
Even Number: 10


---------------------------------------------------
for(x <- nums) 
if(x %2 == 0) {
println("Even Number: " + x) 
} else {
println("Odd Number: " + x)
}
---------------------------------------------------

scala> for(x <- nums) 
     | if(x %2 == 0) {
     | println("Even Number: " + x) 
     | } else {
     | println("Odd Number: " + x)
     | }
Odd Number: 1
Even Number: 2
Odd Number: 3
Even Number: 4
Odd Number: 5
Even Number: 6
Odd Number: 7
Even Number: 8
Odd Number: 9
Even Number: 10
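
A `for` expression with `yield` builds a new collection instead of printing (a small sketch):

val evens = for(x <- nums if(x % 2 == 0)) yield x
// evens: Array[Int] = Array(2, 4, 6, 8, 10)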

---------------------------------------------------
String Interpolation:
---------------------------------------------------
val name = "kalyan"
val course = "spark"
val percentage =  80.5
val count = 100
---------------------------------------------------

val exp1 = "name is " + name + ", course is " + course

val exp2 = "name is $name, course is $course"

val exp3 = s"name is $name, course is $course"

val exp4 = s"name is $name, percentage is $percentage"

val exp5 = s"name is $name, percentage is $percentage%.3f"

val exp6 = f"name is $name, percentage is $percentage%.3f"

val exp7 = s"name is $name\ncourse is $course"

val exp8 = raw"name is $name\ncourse is $course"

---------------------------------------------------
---------------------------------------------------
scala> val name = "kalyan"
name: String = kalyan

scala> val course = "spark"
course: String = spark

scala> val percentage =  80.5
percentage: Double = 80.5

scala> val count = 100
count: Int = 100

---------------------------------------------------

scala> val exp1 = "name is " + name + ", course is " + course
exp1: String = name is kalyan, course is spark

scala> val exp2 = "name is $name, course is $course"
exp2: String = name is $name, course is $course

scala> val exp3 = s"name is $name, course is $course"
exp3: String = name is kalyan, course is spark

scala> val exp4 = s"name is $name, percentage is $percentage"
exp4: String = name is kalyan, percentage is 80.5

scala> val exp5 = s"name is $name, percentage is $percentage%.3f"
exp5: String = name is kalyan, percentage is 80.5%.3f

scala> val exp6 = f"name is $name, percentage is $percentage%.3f"
exp6: String = name is kalyan, percentage is 80.500

scala> val exp7 = s"name is $name\ncourse is $course"
exp7: String =
name is kalyan
course is spark

scala> val exp8 = raw"name is $name\ncourse is $course"
exp8: String = name is kalyan\ncourse is spark


---------------------------------------------------
Collections in `Scala`
---------------------------------------------------
In Scala we have 2 types of collections:

1. immutable collections (scala.collection.immutable)

2. mutable collections (scala.collection.mutable)
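
A small sketch of the practical difference, using the default immutable List and a mutable ListBuffer:

val xs = List(1, 2, 3)          // immutable: operations return new collections
val ys = xs :+ 4                // xs is unchanged; ys is List(1, 2, 3, 4)

import scala.collection.mutable.ListBuffer
val buf = ListBuffer(1, 2, 3)   // mutable: operations modify the collection in place
buf += 4                        // buf is now ListBuffer(1, 2, 3, 4)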

---------------------------------------------------

scala> import scala.collection.immutable.
::                    LongMap                SortedMap        
AbstractMap           LongMapEntryIterator   SortedSet        
BitSet                LongMapIterator        Stack            
DefaultMap            LongMapKeyIterator     Stream           
HashMap               LongMapUtils           StreamIterator   
HashSet               LongMapValueIterator   StreamView       
IndexedSeq            Map                    StreamViewLike   
IntMap                MapLike                StringLike       
IntMapEntryIterator   MapProxy               StringOps        
IntMapIterator        Nil                    Traversable      
IntMapKeyIterator     NumericRange           TreeMap          
IntMapUtils           Page                   TreeSet          
IntMapValueIterator   PagedSeq               TrieIterator     
Iterable              Queue                  Vector           
LinearSeq             Range                  VectorBuilder    
List                  RedBlackTree           VectorIterator   
ListMap               Seq                    VectorPointer    
ListSerializeEnd      Set                    WrappedString    
ListSet               SetProxy                                



---------------------------------------------------

scala> import scala.collection.mutable.
AVLIterator            LinkedListLike              
AVLTree                ListBuffer                  
AbstractBuffer         ListMap                     
AbstractIterable       LongMap                     
AbstractMap            Map                         
AbstractSeq            MapBuilder                  
AbstractSet            MapLike                     
AnyRefMap              MapProxy                    
ArrayBuffer            MultiMap                    
ArrayBuilder           MutableList                 
ArrayLike              ObservableBuffer            
ArrayOps               ObservableMap               
ArraySeq               ObservableSet               
ArrayStack             OpenHashMap                 
BitSet                 PriorityQueue               
Buffer                 PriorityQueueProxy          
BufferLike             Publisher                   
BufferProxy            Queue                       
Builder                QueueProxy                  
Cloneable              ResizableArray              
DefaultEntry           RevertibleHistory           
DefaultMapModel        Seq                         
DoubleLinkedList       SeqLike                     
DoubleLinkedListLike   Set                         
FlatHashTable          SetBuilder                  
GrowingBuilder         SetLike                     
HashEntry              SetProxy                    
HashMap                SortedSet                   
HashSet                Stack                       
HashTable              StackProxy                  
History                StringBuilder               
ImmutableMapAdaptor    Subscriber                  
ImmutableSetAdaptor    SynchronizedBuffer          
IndexedSeq             SynchronizedMap             
IndexedSeqLike         SynchronizedPriorityQueue   
IndexedSeqOptimized    SynchronizedQueue           
IndexedSeqView         SynchronizedSet             
Iterable               SynchronizedStack           
LazyBuilder            Traversable                 
Leaf                   TreeSet                     
LinearSeq              Undoable                    
LinkedEntry            UnrolledBuffer              
LinkedHashMap          WeakHashMap                 
LinkedHashSet          WrappedArray                
LinkedList             WrappedArrayBuilder      

---------------------------------------------------

---------------------------------------------------
val list = List(1,2,3,4,5)

val list = List[Int](1,2,3,4,5)

val seq = Seq[Int](1,2,3,4,5)

val set = Set[Int](1,2,3,4,5)

val vec = Vector[Int](1,2,3,4,5)

val str = Stream[Int](1,2,3,4,5)

---------------------------------------------------

val st = Stack[Int](1,2,3,4,5)

val qu = Queue[Int](1,2,3,4,5)

---------------------------------------------------
scala> val st = Stack[Int](1,2,3,4,5)
<console>:11: error: not found: value Stack
       val st = Stack[Int](1,2,3,4,5)
                ^
scala> val qu = Queue[Int](1,2,3,4,5)
<console>:11: error: not found: value Queue
       val qu = Queue[Int](1,2,3,4,5)
                ^

---------------------------------------------------
val st1 = scala.collection.immutable.Stack[Int](1,2,3,4,5)

val qu1 = scala.collection.immutable.Queue[Int](1,2,3,4,5)

val st2 = scala.collection.mutable.Stack[Int](1,2,3,4,5)

val qu2 = scala.collection.mutable.Queue[Int](1,2,3,4,5)

---------------------------------------------------

scala> val st1 = scala.collection.immutable.Stack[Int](1,2,3,4,5)
st1: scala.collection.immutable.Stack[Int] = Stack(1, 2, 3, 4, 5)

scala> val qu1 = scala.collection.immutable.Queue[Int](1,2,3,4,5)
qu1: scala.collection.immutable.Queue[Int] = Queue(1, 2, 3, 4, 5)

scala> val st2 = scala.collection.mutable.Stack[Int](1,2,3,4,5)
st2: scala.collection.mutable.Stack[Int] = Stack(1, 2, 3, 4, 5)

scala> val qu2 = scala.collection.mutable.Queue[Int](1,2,3,4,5)
qu2: scala.collection.mutable.Queue[Int] = Queue(1, 2, 3, 4, 5)


---------------------------------------------------


scala> val list = List(1,2,3,4,5)
list: List[Int] = List(1, 2, 3, 4, 5)

scala> val list = List[Int](1,2,3,4,5)
list: List[Int] = List(1, 2, 3, 4, 5)

scala> val seq = Seq[Int](1,2,3,4,5)
seq: Seq[Int] = List(1, 2, 3, 4, 5)

scala> val set = Set[Int](1,2,3,4,5)
set: scala.collection.immutable.Set[Int] = Set(5, 1, 2, 3, 4)

scala> val vec = Vector[Int](1,2,3,4,5)
vec: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3, 4, 5)

scala> val str = Stream[Int](1,2,3,4,5)
str: scala.collection.immutable.Stream[Int] = Stream(1, ?)
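
The display `Stream(1, ?)` reflects lazy evaluation: only the head is computed until more elements are forced:

str.take(3).toList   // forces the first three elements: List(1, 2, 3)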


---------------------------------------------------

val list = List(1,2,3,4,5)

or

val list = 1 :: 2 :: 3 :: 4 :: 5 :: Nil

or

val list = 1 :: (2 :: (3 :: (4 :: (5 :: Nil))))

---------------------------------------------------

scala> val list = List(1,2,3,4,5)
list: List[Int] = List(1, 2, 3, 4, 5)

scala> val list = 1 :: 2 :: 3 :: 4 :: 5 :: Nil
list: List[Int] = List(1, 2, 3, 4, 5)

scala> val list = 1 :: (2 :: (3 :: (4 :: (5 :: Nil))))
list: List[Int] = List(1, 2, 3, 4, 5)

---------------------------------------------------

scala> var list = List(4,5,6)
list: List[Int] = List(4, 5, 6)

scala> list :+ 7
res0: List[Int] = List(4, 5, 6, 7)

scala> 3 +: list
res1: List[Int] = List(3, 4, 5, 6)

---------------------------------------------------

var list = List(4,5,6)

list = 3 +: list

list = list :+ 7

list = 2 +: list :+ 8

---------------------------------------------------
scala> var list = List(4,5,6)
list: List[Int] = List(4, 5, 6)

scala> list = 3 +: list
list: List[Int] = List(3, 4, 5, 6)

scala> list = list :+ 7
list: List[Int] = List(3, 4, 5, 6, 7)

scala> list = 2 +: list :+ 8
list: List[Int] = List(2, 3, 4, 5, 6, 7, 8)

---------------------------------------------------

val list1 = List(1,2,3,4,5)

val list2 = List(6,7,8,9,10)

---------------------------------------------------

scala> val list1 = List(1,2,3,4,5)
list1: List[Int] = List(1, 2, 3, 4, 5)

scala> val list2 = List(6,7,8,9,10)
list2: List[Int] = List(6, 7, 8, 9, 10)

scala> val list3 = list1 +: list2
list3: List[Any] = List(List(1, 2, 3, 4, 5), 6, 7, 8, 9, 10)

scala> val list3 = list2 +: list1
list3: List[Any] = List(List(6, 7, 8, 9, 10), 1, 2, 3, 4, 5)

scala> val list3 = list1 :+ list2
list3: List[Any] = List(1, 2, 3, 4, 5, List(6, 7, 8, 9, 10))

scala> val list3 = list1 ::: list2
list3: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

Note: `+:` and `:+` treat the other list as a single element (hence List[Any] above); use `:::` to concatenate two lists.

---------------------------------------------------
Functions in Scala:
---------------------------------------------------
1. anonymous functions
2. named functions
3. curried functions
---------------------------------------------------


1. anonymous functions
---------------------------------------------------
(x : Int, y : Int) => x + y

val add = (x : Int, y : Int) => x + y

add(1,2)

---------------------------------------------------

scala> (x : Int, y : Int) => x + y
res2: (Int, Int) => Int = <function2>

scala> val add = (x : Int, y : Int) => x + y
add: (Int, Int) => Int = <function2>

scala> add(1,2)
res3: Int = 3


2. named functions
---------------------------------------------------

def add(x : Int, y : Int) = x + y

add(1,2)

---------------------------------------------------

scala> def add(x : Int, y : Int) = x + y
add: (x: Int, y: Int)Int

scala> add(1,2)
res4: Int = 3


3. curried functions
---------------------------------------------------

def add(x : Int)(y : Int) = x + y

add(1)(2)

---------------------------------------------------

scala> def add(x : Int)(y : Int) = x + y
add: (x: Int)(y: Int)Int

scala> add(1)(2)
res5: Int = 3
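
Currying enables partial application, a common reason to use it (a small sketch):

val addTen = add(10) _   // fixes x = 10, giving a function Int => Int
addTen(5)                // 15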

---------------------------------------------------

`named functions` importance (default values and named arguments)
---------------------------------------------------

def add(x : Int, y : Int) = x + y

def add(x : Int = 1, y : Int = 2) = x + y

---------------------------------------------------

scala> def add(x : Int = 1, y : Int = 2) = x + y
add: (x: Int, y: Int)Int

scala> add()
res6: Int = 3

scala> add(10)
res7: Int = 12

scala> add(10,20)
res8: Int = 30

scala> add(x = 10, y = 20)
res9: Int = 30

scala> add(y = 20, x = 10)
res10: Int = 30

---------------------------------------------------
factorial of n:
---------------------------------------------------

def factorial(n : Int) = {
 if(n == 1) 1
 else n * factorial(n-1)
}

scala> def factorial(n : Int) = {
     |  if(n == 1) 1
     |  else n * factorial(n-1)
     | }
<console>:13: error: recursive method factorial needs result type
        else n * factorial(n-1)
                 ^

---------------------------------------------------

def factorial(n : Int) : Int = {
 if(n == 1) 1
 else n * factorial(n-1)
}

---------------------------------------------------
scala> def factorial(n : Int) : Int = {
     |  if(n == 1) 1
     |  else n * factorial(n-1)
     | }
factorial: (n: Int)Int

scala> factorial(5)
res11: Int = 120

scala> factorial(4)
res12: Int = 24
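
Each recursive call above grows the stack; a tail-recursive variant (a sketch using @tailrec) runs in constant stack space:

import scala.annotation.tailrec

def factorial(n : Int) : Int = {
 @tailrec
 def loop(n : Int, acc : Int) : Int =
  if(n <= 1) acc else loop(n - 1, n * acc)
 loop(n, 1)
}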

---------------------------------------------------
Object Oriented Programming in Scala
---------------------------------------------------
trait
abstract class
class
object
case class
---------------------------------------------------

case class A (a :Int, b : String) {
 override def toString() : String = s"a is $a, b is $b"
}
---------------------------------------------------
scala> case class A (a :Int, b : String) {
     |  override def toString() : String = s"a is $a, b is $b"
     | }
defined class A

scala> A(1, "kalyan")
res13: A = a is 1, b is kalyan

scala> new A(1, "kalyan")
res14: A = a is 1, b is kalyan

scala> val a1 = A(1, "kalyan")
a1: A = a is 1, b is kalyan

scala> val a2 = new A(1, "kalyan")
a2: A = a is 1, b is kalyan

--------------------------------------------------
class B (a :Int, b : String) {
 override def toString() : String = s"a is $a, b is $b"
}

---------------------------------------------------

scala> class B (a :Int, b : String) {
     |  override def toString() : String = s"a is $a, b is $b"
     | }
defined class B

scala> B(1, "kalyan")
<console>:12: error: not found: value B
       B(1, "kalyan")
       ^

scala> new B(1, "kalyan")
res16: B = a is 1, b is kalyan

scala> val b1 = new B(1, "kalyan")
b1: B = a is 1, b is kalyan
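
B(1, "kalyan") fails because a plain class gets no compiler-generated companion `apply` method (case classes do). Adding a companion object restores that syntax (a sketch):

object B {
 def apply(a : Int, b : String) : B = new B(a, b)   // enables B(1, "kalyan")
}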



---------------------------------------------------