9.2.8. Base TaskManager API¶
-
class
radical.entk.execman.base.task_manager.
Base_TaskManager
(sid, pending_queue, completed_queue, rmgr, rmq_conn_params, rts)[source]¶ A Task Manager takes the responsibility of dispatching tasks it receives from a pending_queue for execution on to the available resources using a runtime system. Once the tasks have completed execution, they are pushed on to the completed_queue for other components of EnTK to process.
Arguments: pending_queue: (list) List of queue(s) with tasks ready to be executed. Currently, only one queue. completed_queue: (list) List of queue(s) with tasks that have finished execution. Currently, only one queue. rmgr: (ResourceManager) Object to be used to access the Pilot where the tasks can be submitted rmq_conn_params: (pika.connection.ConnectionParameters) object of parameters necessary to connect to RabbitMQ Currently, EnTK is configured to work with one pending queue and one completed queue. In the future, the number of queues can be varied for different throughput requirements at the cost of additional Memory and CPU consumption.
-
_heartbeat
()[source]¶ - Purpose: Method to be executed in the heartbeat thread. This method
- sends a ‘request’ to the heartbeat-req queue. It expects a ‘response’ message from the ‘heartbeart-res’ queue within 10 seconds. This message should contain the same correlation id. If no message if received in 10 seconds, the tmgr is assumed dead. The end_manager() is called to cleanly terminate tmgr process and the heartbeat thread is also terminated.
- Details: The AppManager can re-invoke both if the execution is still
- not complete.
-
_tmgr
(uid, rmgr, pending_queue, completed_queue, rmq_conn_params)[source]¶ - Purpose: Method to be run by the tmgr process. This method receives
a Task from the pending_queue and submits it to the RTS. At all state transititons, they are synced (blocking) with the AppManager in the master process.
In addition, the tmgr also receives heartbeat ‘request’ msgs from the heartbeat-req queue. It responds with a ‘response’ message to the ‘heartbeart-res’ queue.
- Details: The AppManager can re-invoke the tmgr process with this
- function if the execution of the workflow is still incomplete. There is also population of a dictionary, placeholder_dict, which stores the path of each of the tasks on the remote machine.
-
start_heartbeat
()[source]¶ - Purpose: Method to start the heartbeat thread. The heartbeat
- function is not to be accessed directly. The function is started in a separate thread using this method.
-
start_manager
()[source]¶ Purpose: Method to start the tmgr process. The tmgr function is not to be accessed directly. The function is started in a separate thread using this method.
-
9.2.9. RP TaskManager API¶
-
class
radical.entk.execman.rp.task_manager.
TaskManager
(sid, pending_queue, completed_queue, rmgr, rmq_conn_params)[source]¶ A Task Manager takes the responsibility of dispatching tasks it receives from a queue for execution on to the available resources using a runtime system. In this case, the runtime system being used RADICAL Pilot. Once the tasks have completed execution, they are pushed on to another queue for other components of EnTK to access.
Arguments: pending_queue: (list) List of queue(s) with tasks ready to be executed. Currently, only one queue. completed_queue: (list) List of queue(s) with tasks that have finished execution. Currently, only one queue. rmgr: (ResourceManager) Object to be used to access the Pilot where the tasks can be submitted rmq_conn_params: (pika.connection.ConnectionParameters) object of parameters necessary to connect to RabbitMQ Currently, EnTK is configured to work with one pending queue and one completed queue. In the future, the number of queues can be varied for different throughput requirements at the cost of additional Memory and CPU consumption.
-
_process_tasks
(task_queue, rmgr, rmq_conn_params)[source]¶ - Purpose: The new thread that gets spawned by the main tmgr process
- invokes this function. This function receives tasks from ‘task_queue’ and submits them to the RADICAL Pilot RTS.
-
_tmgr
(uid, rmgr, pending_queue, completed_queue, rmq_conn_params)[source]¶ - Purpose: This method has 3 purposes: Respond to a heartbeat thread
indicating the live-ness of the RTS, receive tasks from the pending_queue, start a new thread that processes these tasks and submits to the RTS.
It is important to separate the reception of the tasks from their processing due to RMQ/AMQP design. The channel (i.e., the thread that holds the channel) that is receiving msgs from the RMQ server needs to be non-blocking as it can interfere with the heartbeat intervals of RMQ. Processing a large number of tasks can considerable time and can block the communication channel. Hence, the two are separated.
The new thread is responsible for pushing completed tasks (returned by the RTS) to the dequeueing queue. It also converts Tasks into TDs and CUs into (partially described) Tasks. This conversion is necessary since the current RTS is RADICAL Pilot. Once Tasks are recovered from a CU, they are then pushed to the completed_queue. At all state transititons, they are synced (blocking) with the AppManager in the master process.
In addition the tmgr also receives heartbeat ‘request’ msgs from the heartbeat-request queue. It responds with a ‘response’ message to the heartbeart-response queue.
- Details: The AppManager can re-invoke the tmgr process with this
- function if the execution of the workflow is still incomplete. There is also population of a dictionary, placeholders, which stores the path of each of the tasks on the remote machine.
-
9.2.10. Dummy TaskManager API¶
-
class
radical.entk.execman.mock.task_manager.
TaskManager
(sid, pending_queue, completed_queue, rmgr, rmq_conn_params)[source]¶ A Task Manager takes the responsibility of dispatching tasks it receives from a pending_queue for execution on to the available resources using a runtime system. Once the tasks have completed execution, they are pushed on to the completed_queue for other components of EnTK to process.
Arguments: pending_queue: (list) List of queue(s) with tasks ready to be executed. Currently, only one queue. completed_queue: (list) List of queue(s) with tasks that have finished execution. Currently, only one queue. rmgr: (ResourceManager) Object to be used to access the Pilot where the tasks can be submitted rmq_conn_params: (pika.connection.ConnectionParameters) object of parameters necessary to connect to RabbitMQ Currently, EnTK is configured to work with one pending queue and one completed queue. In the future, the number of queues can be varied for different throughput requirements at the cost of additional Memory and CPU consumption.
-
_process_tasks
(task_queue, rmgr, rmq_conn_params)[source]¶ - Purpose: The new thread that gets spawned by the main tmgr process
- invokes this function. This function receives tasks from ‘task_queue’ and submits them to the RADICAL Pilot RTS.
-
_tmgr
(uid, rmgr, pending_queue, completed_queue, rmq_conn_params)[source]¶ - Purpose: Method to be run by the tmgr process. This method receives
a Task from the pending_queue and submits it to the RTS. Currently, it also converts Tasks into CUDs and CUs into (partially described) Tasks. This conversion is necessary since the current RTS is RADICAL Pilot. Once Tasks are recovered from a CU, they are then pushed to the completed_queue. At all state transititons, they are synced (blocking) with the AppManager in the master process.
In addition the tmgr also receives heartbeat ‘request’ msgs from the heartbeat-req queue. It responds with a ‘response’ message to the ‘heartbeart-res’ queue.
- Details: The AppManager can re-invoke the tmgr process with this
- function if the execution of the workflow is still incomplete. There is also population of a dictionary, placeholders, which stores the path of each of the tasks on the remote machine.
-