Ranking Policy Decisions

Hadrien Pouget
Youcheng Sun

Abstract:

Policies trained via Reinforcement Learning (RL) are often needlessly complex, making them more difficult to analyse and interpret. In a run with $n$ time steps, a policy will decide $n$ times on an action to take, even when only a tiny subset of these decisions deliver value over selecting a simple default action. Given a pre-trained p...
