Summary: TarMAC: Targeted Multi-Agent Communication
Author of the paper: Anonymous (ICLR 2019, under review).
- Motivation: sending different messages to different agents enables a more flexible strategy; actively selecting message recipients can reduce the cost of communication.
- Uses soft attention in the communication architecture so that agents can send agent- and goal-specific messages, adapt to variable team sizes, and make it interpretable which message goes to whom.
- Uses multi-stage communication to exchange information.
- Tested in 2D and even 3D environments.
Every agent receives its local observation and the message aggregated for it. All agents share one GRU-based policy network, which outputs both the message to send and the action to take.
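To make the shared-policy idea concrete, here is a minimal NumPy sketch of one time step. It is not the paper's implementation: the class name `SharedGRUPolicy`, the layer sizes, the random initialization, and the omission of bias terms are all assumptions chosen for illustration. The only point it shows is that one set of GRU weights is applied to every agent's (observation, aggregated message) pair and produces both action logits and an outgoing message.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class SharedGRUPolicy:
    """Hypothetical shared policy: one set of weights used by all agents."""

    def __init__(self, obs_dim, in_msg_dim, hidden_dim, n_actions, out_msg_dim, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = obs_dim + in_msg_dim

        def init(rows, cols):
            # Small random weights; biases are omitted to keep the sketch short.
            return rng.normal(0.0, 0.1, size=(rows, cols))

        self.W_z = init(in_dim + hidden_dim, hidden_dim)  # update gate
        self.W_r = init(in_dim + hidden_dim, hidden_dim)  # reset gate
        self.W_h = init(in_dim + hidden_dim, hidden_dim)  # candidate state
        self.W_action = init(hidden_dim, n_actions)       # action head
        self.W_msg = init(hidden_dim, out_msg_dim)        # message head

    def step(self, obs, agg_msg, h):
        # obs: (N, obs_dim), agg_msg: (N, in_msg_dim), h: (N, hidden_dim)
        x = np.concatenate([obs, agg_msg], axis=1)
        xh = np.concatenate([x, h], axis=1)
        z = sigmoid(xh @ self.W_z)
        r = sigmoid(xh @ self.W_r)
        h_tilde = np.tanh(np.concatenate([x, r * h], axis=1) @ self.W_h)
        h_new = (1.0 - z) * h + z * h_tilde
        action_logits = h_new @ self.W_action
        out_msg = h_new @ self.W_msg
        return action_logits, out_msg, h_new
```

Because the weights are shared, two agents with identical inputs and hidden states produce identical outputs; the agents differ only through what they observe and receive.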
The message output by the policy network consists of two parts: $m_i^t=[k_i^t,v_i^t]$. Here $k_i^t$ is the signature that encodes the intended recipients, and $v_i^t$ is the value to be transmitted. Each receiver generates a query and applies it to the signatures of all the messages it received, which determines how much weight each value gets.
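The signature/query matching can be sketched as soft attention over senders. The following is a minimal NumPy illustration under stated assumptions, not the paper's code: the function name `aggregate_messages` is hypothetical, the dot product is scaled by $\sqrt{d_k}$ as in standard scaled dot-product attention, and every agent attends over all senders including itself.

```python
import numpy as np


def aggregate_messages(signatures, values, queries):
    """Soft-attention aggregation of messages (illustrative sketch).

    signatures: (N, d_k) array, row i is k_i from sender i
    values:     (N, d_v) array, row i is v_i from sender i
    queries:    (N, d_k) array, row j is the query of receiver j
    Returns the aggregated message per receiver, shape (N, d_v),
    and the attention weights, shape (N receivers, N senders).
    """
    d_k = signatures.shape[1]
    # Score of each receiver's query against each sender's signature.
    scores = queries @ signatures.T / np.sqrt(d_k)
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Each receiver gets a weighted sum of all senders' values.
    return weights @ values, weights
```

The returned weight matrix is what makes the scheme interpretable: row j shows how strongly receiver j attends to each sender. Note that every value vector still enters every receiver's sum; the targeting happens only through the weights.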
The multi-stage communication was not clear to me; I am waiting for the paper to be updated.
What we can take away:
- Choosing how messages are routed is a good idea, but we should focus on the listener.
- Multi-stage communication can be useful in complex scenes for communicating action intentions before making the final decision.
- Using a GRU to process the messages and the local observations is useful for POMDPs.
- The multi-stage communication is not clearly explained.
- It still transmits all the information to all other agents, which does not seem to match the paper's intuition.