9.2.8. Base TaskManager API

class radical.entk.execman.base.task_manager.Base_TaskManager(sid, pending_queue, completed_queue, rmgr, rmq_conn_params, rts)[source]

A Task Manager takes the responsibility of dispatching tasks it receives from a pending_queue for execution on to the available resources using a runtime system. Once the tasks have completed execution, they are pushed on to the completed_queue for other components of EnTK to process.

Arguments:
pending_queue:(list) List of queue(s) with tasks ready to be executed. Currently, only one queue.
completed_queue:
 (list) List of queue(s) with tasks that have finished execution. Currently, only one queue.
rmgr:(ResourceManager) Object to be used to access the Pilot where the tasks can be submitted
rmq_conn_params:
 (pika.connection.ConnectionParameters) object of parameters necessary to connect to RabbitMQ

Currently, EnTK is configured to work with one pending queue and one completed queue. In the future, the number of queues can be varied for different throughput requirements at the cost of additional Memory and CPU consumption.

_advance(obj, obj_type, new_state, channel, conn_params, queue)[source]
_heartbeat()[source]
Purpose: Method to be executed in the heartbeat thread. This method
sends a ‘request’ to the heartbeat-req queue. It expects a ‘response’ message from the ‘heartbeart-res’ queue within 10 seconds. This message should contain the same correlation id. If no message if received in 10 seconds, the tmgr is assumed dead. The end_manager() is called to cleanly terminate tmgr process and the heartbeat thread is also terminated.
Details: The AppManager can re-invoke both if the execution is still
not complete.
_sync_with_master(obj, obj_type, channel, conn_params, queue)[source]
_tmgr(uid, rmgr, pending_queue, completed_queue, rmq_conn_params)[source]
Purpose: Method to be run by the tmgr process. This method receives

a Task from the pending_queue and submits it to the RTS. At all state transititons, they are synced (blocking) with the AppManager in the master process.

In addition, the tmgr also receives heartbeat ‘request’ msgs from the heartbeat-req queue. It responds with a ‘response’ message to the ‘heartbeart-res’ queue.

Details: The AppManager can re-invoke the tmgr process with this
function if the execution of the workflow is still incomplete. There is also population of a dictionary, placeholder_dict, which stores the path of each of the tasks on the remote machine.
check_heartbeat()[source]

Purpose: Check if the heartbeat thread is alive and running

check_manager()[source]

Purpose: Check if the tmgr process is alive and running

start_heartbeat()[source]
Purpose: Method to start the heartbeat thread. The heartbeat
function is not to be accessed directly. The function is started in a separate thread using this method.
start_manager()[source]

Purpose: Method to start the tmgr process. The tmgr function is not to be accessed directly. The function is started in a separate thread using this method.

terminate_heartbeat()[source]

Purpose: Method to terminate the heartbeat thread. This method is blocking as it waits for the heartbeat thread to terminate (aka join).

This is the last method that is executed from the TaskManager and hence closes the profiler.

terminate_manager()[source]

Purpose: Method to terminate the tmgr process. This method is blocking as it waits for the tmgr process to terminate (aka join).

9.2.9. RP TaskManager API

class radical.entk.execman.rp.task_manager.TaskManager(sid, pending_queue, completed_queue, rmgr, rmq_conn_params)[source]

A Task Manager takes the responsibility of dispatching tasks it receives from a queue for execution on to the available resources using a runtime system. In this case, the runtime system being used RADICAL Pilot. Once the tasks have completed execution, they are pushed on to another queue for other components of EnTK to access.

Arguments:
pending_queue:(list) List of queue(s) with tasks ready to be executed. Currently, only one queue.
completed_queue:
 (list) List of queue(s) with tasks that have finished execution. Currently, only one queue.
rmgr:(ResourceManager) Object to be used to access the Pilot where the tasks can be submitted
rmq_conn_params:
 (pika.connection.ConnectionParameters) object of parameters necessary to connect to RabbitMQ

Currently, EnTK is configured to work with one pending queue and one completed queue. In the future, the number of queues can be varied for different throughput requirements at the cost of additional Memory and CPU consumption.

_process_tasks(task_queue, rmgr, rmq_conn_params)[source]
Purpose: The new thread that gets spawned by the main tmgr process
invokes this function. This function receives tasks from ‘task_queue’ and submits them to the RADICAL Pilot RTS.
_tmgr(uid, rmgr, pending_queue, completed_queue, rmq_conn_params)[source]
Purpose: This method has 3 purposes: Respond to a heartbeat thread

indicating the live-ness of the RTS, receive tasks from the pending_queue, start a new thread that processes these tasks and submits to the RTS.

It is important to separate the reception of the tasks from their processing due to RMQ/AMQP design. The channel (i.e., the thread that holds the channel) that is receiving msgs from the RMQ server needs to be non-blocking as it can interfere with the heartbeat intervals of RMQ. Processing a large number of tasks can considerable time and can block the communication channel. Hence, the two are separated.

The new thread is responsible for pushing completed tasks (returned by the RTS) to the dequeueing queue. It also converts Tasks into TDs and CUs into (partially described) Tasks. This conversion is necessary since the current RTS is RADICAL Pilot. Once Tasks are recovered from a CU, they are then pushed to the completed_queue. At all state transititons, they are synced (blocking) with the AppManager in the master process.

In addition the tmgr also receives heartbeat ‘request’ msgs from the heartbeat-request queue. It responds with a ‘response’ message to the heartbeart-response queue.

Details: The AppManager can re-invoke the tmgr process with this
function if the execution of the workflow is still incomplete. There is also population of a dictionary, placeholders, which stores the path of each of the tasks on the remote machine.
_update_resource(pilot)[source]

Update used pilot.

start_manager()[source]
Purpose: Method to start the tmgr process. The tmgr function
is not to be accessed directly. The function is started in a separate thread using this method.

9.2.10. Dummy TaskManager API

class radical.entk.execman.mock.task_manager.TaskManager(sid, pending_queue, completed_queue, rmgr, rmq_conn_params)[source]

A Task Manager takes the responsibility of dispatching tasks it receives from a pending_queue for execution on to the available resources using a runtime system. Once the tasks have completed execution, they are pushed on to the completed_queue for other components of EnTK to process.

Arguments:
pending_queue:(list) List of queue(s) with tasks ready to be executed. Currently, only one queue.
completed_queue:
 (list) List of queue(s) with tasks that have finished execution. Currently, only one queue.
rmgr:(ResourceManager) Object to be used to access the Pilot where the tasks can be submitted
rmq_conn_params:
 (pika.connection.ConnectionParameters) object of parameters necessary to connect to RabbitMQ

Currently, EnTK is configured to work with one pending queue and one completed queue. In the future, the number of queues can be varied for different throughput requirements at the cost of additional Memory and CPU consumption.

_process_tasks(task_queue, rmgr, rmq_conn_params)[source]
Purpose: The new thread that gets spawned by the main tmgr process
invokes this function. This function receives tasks from ‘task_queue’ and submits them to the RADICAL Pilot RTS.
_tmgr(uid, rmgr, pending_queue, completed_queue, rmq_conn_params)[source]
Purpose: Method to be run by the tmgr process. This method receives

a Task from the pending_queue and submits it to the RTS. Currently, it also converts Tasks into CUDs and CUs into (partially described) Tasks. This conversion is necessary since the current RTS is RADICAL Pilot. Once Tasks are recovered from a CU, they are then pushed to the completed_queue. At all state transititons, they are synced (blocking) with the AppManager in the master process.

In addition the tmgr also receives heartbeat ‘request’ msgs from the heartbeat-req queue. It responds with a ‘response’ message to the ‘heartbeart-res’ queue.

Details: The AppManager can re-invoke the tmgr process with this
function if the execution of the workflow is still incomplete. There is also population of a dictionary, placeholders, which stores the path of each of the tasks on the remote machine.
start_manager()[source]
Purpose: Method to start the tmgr process. The tmgr function
is not to be accessed directly. The function is started in a separate thread using this method.