Friday 9 June 2017

How to Perform Incremental Load in Sqoop

Importing Incremental Data

Sqoop supports incremental imports, which bring in only the rows added to a table since the previous import (or, in lastmodified mode, rows that were added or updated). An incremental import requires the --incremental, --check-column, and --last-value options.

--incremental - Sets the mode Sqoop uses to determine which rows are new. Legal values are append and lastmodified.

--check-column - Specifies the column Sqoop examines to determine which rows are candidates for import.

--last-value - Specifies the maximum value of the check column from the previous import run.



Example:


sqoop import \
--connect jdbc:mysql://localhost:3306/kalyan \
--username root \
--table sample \
-m 1 \
--incremental append \
--check-column id \
--last-value 1000
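Editing --last-value by hand before every run is error-prone. A saved Sqoop job stores the last imported value in the Sqoop metastore after each run. The sketch below wraps the example above in such a job. The job name sample_incremental and the DRY_RUN switch are assumptions made for illustration, and authentication options are omitted as in the original example.

```shell
#!/bin/sh
# Sketch only: the job name is hypothetical; connection details come from the example above.
JOB_NAME="sample_incremental"

# With DRY_RUN=1 (the default here) commands are only printed.
# Set DRY_RUN=0 to actually execute them against a Sqoop installation.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create the saved job once; --last-value 1000 is only the starting point.
run sqoop job --create "$JOB_NAME" -- import \
  --connect jdbc:mysql://localhost:3306/kalyan \
  --username root \
  --table sample \
  -m 1 \
  --incremental append \
  --check-column id \
  --last-value 1000

# Each execution imports rows with id greater than the stored last value,
# then records the new maximum for the next run.
run sqoop job --exec "$JOB_NAME"
```

Running the script as-is prints the two commands. Running it with DRY_RUN=0 creates and executes the job.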



How to Enable Transactions in Hive

Hive supports transactions once the following configuration parameters are set appropriately:

hive.support.concurrency - true

hive.enforce.bucketing - true (not required as of Hive 2.0)

hive.exec.dynamic.partition.mode - nonstrict

hive.txn.manager - org.apache.hadoop.hive.ql.lockmgr.DbTxnManager

hive.compactor.initiator.on - true on one instance of the Thrift metastore service

hive.compactor.worker.threads - a positive number (for example, 10) on at least one instance of the Thrift metastore service
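As a rough sketch of how the client-side settings above might be applied per session, the script below writes them to an initialization file that the Hive CLI can load with its -i option. The file name acid-session.hql is a hypothetical choice. The two compactor settings are left out on the assumption that they are configured in hive-site.xml on the metastore host rather than per session.

```shell
#!/bin/sh
# Hypothetical file name; any path readable by the Hive client would do.
INIT_FILE="${INIT_FILE:-./acid-session.hql}"

# Write the client-side transaction settings as HiveQL SET statements.
cat > "$INIT_FILE" <<'EOF'
SET hive.support.concurrency=true;
-- Assumption: only relevant before Hive 2.0; remove this line on newer versions.
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
EOF

echo "Wrote $INIT_FILE; start a session with: hive -i $INIT_FILE"
```

Keeping the settings in one file makes it easier to start every session with the same transaction configuration.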



Create the table bucketed, stored as ORC, and with the transactional property set:

CREATE TABLE mytable (
 c1 int,
 c2 string,
 c3 string
)
CLUSTERED BY (c1) INTO x BUCKETS  -- replace x with the desired number of buckets
STORED AS orc
TBLPROPERTIES('transactional' = 'true');
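Once a table like this exists and transactions are enabled, row-level INSERT, UPDATE, and DELETE statements become available. The sketch below writes a few such statements to a script file for the Hive CLI. The file name mytable-dml.hql and the row values are made up for illustration, and the script assumes mytable has already been created as shown above.

```shell
#!/bin/sh
# Hypothetical file name for the generated HiveQL script.
DML_FILE="${DML_FILE:-./mytable-dml.hql}"

cat > "$DML_FILE" <<'EOF'
-- Sample rows; the values are placeholders.
INSERT INTO TABLE mytable VALUES (1, 'a', 'b'), (2, 'c', 'd');
-- Update a regular column; c1 is the bucketing column and cannot be updated.
UPDATE mytable SET c2 = 'changed' WHERE c1 = 1;
DELETE FROM mytable WHERE c1 = 2;
SELECT * FROM mytable;
EOF

echo "Wrote $DML_FILE; run it with: hive -f $DML_FILE"
```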



What Is ACID and Why Use It?

ACID stands for four traits of database transactions:

Atomicity - An operation either succeeds completely or fails; operations do not leave incomplete data in the system.

Consistency - Once an operation completes, the results of that operation are visible to every subsequent operation.

Isolation - Operations completed by one user do not cause unexpected side effects for other users.

Durability - Once an operation is complete, it will be preserved even if the machine or system experiences a failure.


These behaviours are required for transactions to work correctly.

When operations are ACID compliant, a failure does not leave partial or inconsistent results behind.

