# Introduction

Author of the paper: Anonymous, from ICLR 2019 (under review).

• Motivation: sending different messages to different agents enables a more flexible strategy; actively selecting message receivers can reduce the cost of communication.
• Use soft attention in the communication architecture so that agents can send agent- and goal-specific messages, adapt to variable team sizes, and make it interpretable which message is sent to whom.
• Use multi-stage communication to exchange information.
• Tested in 2D and even 3D environments.
• paper

# Approach

Every agent receives its local observation and the message aggregated for it. All agents share one policy network built on a GRU, which outputs both the message to send and the action to take.
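To make the data flow concrete, here is a minimal sketch of one decision step with a shared recurrent policy. It is not the paper's implementation: the class `SharedGRUPolicy`, its layer sizes, the random initialisation, the omission of bias terms, and the choice to give incoming and outgoing messages the same size are all assumptions made for illustration.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class SharedGRUPolicy:
    """Hypothetical policy: one set of weights used by every agent."""

    def __init__(self, obs_size, message_size, hidden_size, n_actions, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        joint = obs_size + message_size + hidden_size
        # GRU gate weights act on [input, hidden]; biases omitted for brevity.
        self.w_update = init(hidden_size, joint)
        self.w_reset = init(hidden_size, joint)
        self.w_candidate = init(hidden_size, joint)
        # Two output heads read the new hidden state.
        self.w_action = init(n_actions, hidden_size)
        self.w_message = init(message_size, hidden_size)

    def step(self, observation, incoming_message, hidden):
        # Input = local observation concatenated with the aggregated message.
        x = observation + incoming_message
        update = [sigmoid(a) for a in matvec(self.w_update, x + hidden)]
        reset = [sigmoid(a) for a in matvec(self.w_reset, x + hidden)]
        gated = [r * h for r, h in zip(reset, hidden)]
        candidate = [math.tanh(a)
                     for a in matvec(self.w_candidate, x + gated)]
        new_hidden = [(1 - z) * h + z * c
                      for z, h, c in zip(update, hidden, candidate)]
        action_logits = matvec(self.w_action, new_hidden)
        outgoing_message = matvec(self.w_message, new_hidden)
        return action_logits, outgoing_message, new_hidden


# Two agents with different observations run the same policy object.
policy = SharedGRUPolicy(obs_size=3, message_size=2, hidden_size=4, n_actions=2)
observations = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]
hidden_states = [[0.0] * 4 for _ in observations]
incoming = [[0.0, 0.0] for _ in observations]
outputs = [policy.step(o, m, h)
           for o, m, h in zip(observations, incoming, hidden_states)]
```

Because the weights are shared, the agents differ only through their observations, incoming messages, and hidden states, which is what lets the same network handle teams of different sizes.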

## Communication

The message output by the policy network consists of two parts: $m_i^t=[k_i^t,v_i^t]$. Here $k_i^t$ is the signature that identifies the target recipients, and $v_i^t$ is the value the sender wants to transmit. Each receiver generates a query and applies it to all the messages it receives.
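One way to read "applying the query to all messages" is as soft attention over the senders' signatures. The sketch below assumes scaled dot-product scoring followed by a softmax. The function names `softmax` and `aggregate_messages` and the toy vectors are hypothetical, and the paper's exact scoring function may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_messages(query, signatures, values):
    """Soft-attention aggregation for a single receiver.

    query:      the receiver's query vector (length d_k)
    signatures: one signature vector k_i per sender (each length d_k)
    values:     one value vector v_i per sender (each length d_v)
    Returns (attention weights, aggregated message).
    """
    d_k = len(query)
    # Assumption: score = (query . signature) / sqrt(d_k).
    scores = [sum(q * k for q, k in zip(query, sig)) / math.sqrt(d_k)
              for sig in signatures]
    weights = softmax(scores)
    d_v = len(values[0])
    # Weighted sum of the senders' values, one component at a time.
    aggregated = [sum(w * v[j] for w, v in zip(weights, values))
                  for j in range(d_v)]
    return weights, aggregated


# Three senders; the receiver's query aligns best with the first signature.
signatures = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0], [20.0], [30.0]]
weights, message = aggregate_messages([2.0, 0.0], signatures, values)
```

The attention weights sum to one, so inspecting them for each receiver shows how much it listens to each sender. Every sender still contributes with a non-zero weight, since the attention is soft rather than a hard selection.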

The multi-stage communication was not clear to me; I am waiting for the paper to be updated.

# Notes

## What we can take away

• Choosing how messages are routed is a good idea, but we should focus on the listener.
• Multi-stage communication can be useful in complex scenes for communicating action intentions before making the final decision.
• Using a GRU to process the messages and the local observations is useful for POMDPs.

## Confused

• The multi-stage communication is not clearly described.
• It still transmits all the information to all other agents, which does not seem to match the paper's intuition.