Data Synchronization Overview¶
The Data Synchronization service synchronizes data between a wide range of heterogeneous data sources. It helps data developers with the following tasks:
- Synchronizing data: Synchronizing structured data from external source databases to the EnOS Hive database, and from the EnOS Hive database to external target databases.
- Synchronizing files: Synchronizing files from external data sources to EnOS HDFS file storage (currently supporting the Azure Blob data source).
A data synchronization workflow is a specific type of workflow: in essence, it is a workflow that contains a single task of the data-integration type.
The typical scenarios of data synchronization are as follows.
Full-load Data Synchronization¶
Full-load synchronization usually happens at the initial stage of data synchronization. In this scenario, you synchronize the full set of data from the data source in a single run. To achieve this scenario, see Synchronizing Data from External Data Source to Hive (Manual Scheduling).
Incremental Data Synchronization¶
Incremental synchronization usually happens after the initial stage, when you only want to synchronize new or updated data periodically. In this scenario, you usually select the incremental data to synchronize through a where clause when you configure the synchronization workflow. To achieve this scenario, see Synchronizing Data from External Data Source to Hive (Periodic Scheduling).
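As a rough illustration of how a where clause can narrow a periodic run to incremental data, the following Python sketch builds a time-window filter and applies it to an in-memory SQLite table that stands in for an external source database. The table name `readings`, the column `update_time`, and the helper `incremental_where` are hypothetical and are not part of EnOS. The exact clause syntax accepted by the synchronization workflow may differ.

```python
import sqlite3
from datetime import datetime


def incremental_where(column: str, start: datetime, end: datetime) -> str:
    """Build a where-clause body selecting rows changed in [start, end).

    Hypothetical helper: the column name and timestamp format are assumptions.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    return (
        f"{column} >= '{start.strftime(fmt)}' "
        f"AND {column} < '{end.strftime(fmt)}'"
    )


# Hypothetical source table standing in for an external database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER, value REAL, update_time TEXT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        (1, 10.5, "2024-01-01 08:00:00"),
        (2, 11.0, "2024-01-02 09:30:00"),
        (3, 12.3, "2024-01-02 23:59:59"),
        (4, 9.8, "2024-01-03 00:00:00"),
    ],
)

# Window covered by one periodic run: the whole of 2024-01-02.
clause = incremental_where(
    "update_time", datetime(2024, 1, 2), datetime(2024, 1, 3)
)

# The clause is interpolated as text only because it mimics a filter typed
# into a workflow configuration; use bound parameters in application code.
rows = conn.execute(
    f"SELECT id FROM readings WHERE {clause} ORDER BY id"
).fetchall()
synced_ids = [row[0] for row in rows]

print(clause)
print(synced_ids)
```

The half-open window (inclusive start, exclusive end) is a design choice in this sketch: consecutive runs with adjacent windows then pick up each row exactly once, so a row stamped at exactly midnight belongs to the next run rather than to both.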
Synchronizing Files from External Databases to HDFS¶
In this scenario, you synchronize files from an external data source to EnOS HDFS file storage. To achieve this scenario, see Synchronizing Files from External Databases to HDFS.