Batch Processing Resource


Data developers can use the Batch Processing service to process data through scheduling workflows. For more information, see Batch Data Processing Overview.


Resource Application Scenario

Running the Batch Processing service requires two resources: Batch Processing - Queue and Batch Processing - Container.

Batch Processing - Queue

When the Hive and Spark interpreters process data for offline analysis tasks, they use the default queue resource for data query and processing. With the default queue, tasks cannot be controlled and resources cannot be managed. To run data query and processing tasks that demand more resources, apply for the Batch Processing - Queue resource and configure the resource name in the notebook.

Note

Each OU can apply for a maximum of 3 Batch Processing - Queue resource instances.

Resource Specification

The Batch Processing - Queue resource is requested in computing units (CUs). If the jobs are CPU-heavy, choose the Computing-Intensive specification. If the jobs are memory-heavy, choose the Memory-Intensive specification.

Specification          Allocated Resources
Computing-Intensive    1 CU = 1 Core CPU + 2 GB Memory. Available options are 2 - 5000 CU by default.
Memory-Intensive       1 CU = 1 Core CPU + 4 GB Memory. Available options are 4 - 5000 CU by default.
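
The CU arithmetic in the table above can be sketched in Python. This is an illustrative calculation only, not part of the service: the names QueueSpec, allocated_resources, and smallest_cu are hypothetical, and the sketch assumes the default CU ranges listed in the table.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueSpec:
    """Hypothetical model of one row of the specification table."""
    name: str
    memory_gb_per_cu: int  # each CU always carries 1 CPU core
    min_cu: int            # default lower bound from the table
    max_cu: int            # default upper bound from the table


COMPUTING_INTENSIVE = QueueSpec("Computing-Intensive", 2, 2, 5000)
MEMORY_INTENSIVE = QueueSpec("Memory-Intensive", 4, 4, 5000)


def allocated_resources(spec: QueueSpec, cu: int) -> tuple[int, int]:
    """Return (cpu_cores, memory_gb) for a CU count within the default range."""
    if not spec.min_cu <= cu <= spec.max_cu:
        raise ValueError(
            f"{spec.name} accepts {spec.min_cu} - {spec.max_cu} CU, got {cu}"
        )
    return cu, cu * spec.memory_gb_per_cu


def smallest_cu(spec: QueueSpec, cores_needed: int, memory_gb_needed: int) -> int:
    """Return the smallest CU count that covers both the core and memory needs."""
    cu = max(
        cores_needed,
        math.ceil(memory_gb_needed / spec.memory_gb_per_cu),
        spec.min_cu,
    )
    if cu > spec.max_cu:
        raise ValueError(f"requirement exceeds the default maximum of {spec.max_cu} CU")
    return cu
```

Under these assumptions, a job that needs 8 cores and 64 GB of memory would require 32 CU with the Computing-Intensive specification but only 16 CU with the Memory-Intensive specification, which is why memory-heavy jobs fit the latter better.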

Batch Processing - Container

To run big data analysis tasks with the Batch Processing service, you need to apply for the Batch Processing - Container resource.

  • Design Mode: To use batch script development capabilities, request Design Mode resources in advance.
  • Runtime Mode: To use data synchronization or batch data processing capabilities, request Runtime Mode resources when running manual or periodically scheduled tasks.

Note

Each OU can apply for a maximum of 1 resource instance per resource type.
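
The instance limits in the two notes on this page can be expressed as a simple pre-check before submitting a resource application. The sketch below is illustrative only. The resource type labels and the can_apply function are hypothetical, and it assumes that "resource type" for the container resource means Design Mode and Runtime Mode.

```python
from collections import Counter

# Per-OU instance limits taken from the notes on this page.
# The dictionary keys are hypothetical labels, not service identifiers.
LIMITS_PER_OU = {
    "queue": 3,              # Batch Processing - Queue
    "container-design": 1,   # Batch Processing - Container, Design Mode (assumed type)
    "container-runtime": 1,  # Batch Processing - Container, Runtime Mode (assumed type)
}


def can_apply(existing: list[str], requested: str) -> bool:
    """Return True if the OU is still below the limit for the requested type.

    `existing` lists the resource types the OU already holds, one entry
    per resource instance.
    """
    if requested not in LIMITS_PER_OU:
        raise ValueError(f"unknown resource type: {requested}")
    return Counter(existing)[requested] < LIMITS_PER_OU[requested]
```

For example, an OU that already holds a Design Mode container could still apply for a Runtime Mode container, but not for a second Design Mode container.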

Resource Specification

The Batch Processing - Container resource is calculated in CUs, and different specifications correspond to different processing capabilities. Within the same resource mode, a higher specification gives higher processing efficiency and a larger amount of data processed per unit of time.

  • Design Mode: Container resources used to execute and debug scripts while you develop batch data processing scripts.
  • Runtime Mode: Container resources that task nodes require to run under scheduling (both periodic and immediate) for batch data processing or data synchronization.

Design Mode Specification

Specification    Description
CU               1 CU = 1 Core CPU + 2 GB Memory. Available options are 1 - 5000 CU by default.

Runtime Mode Specification

Specification    Description
CU               1 CU = 1 Core CPU + 2 GB Memory. Available options are 1 - 5000 CU by default.