
Turkish Journal of Electrical Engineering and Computer Sciences

DOI

10.3906/elk-2008-94

Abstract

Reinforcement learning (RL) agents are often designed for a particular problem, and their working processes are generally uninterpretable. Agent algorithms based on statistical methods can be improved in terms of generalizability and interpretability using symbolic artificial intelligence (AI) tools such as logic programming. In this study, we present a model-free RL architecture that is supported with explicit relational representations of the environmental objects. For the first time, we use the PrediNet network architecture in a dynamic decision-making problem rather than image-based tasks, and we use the multi-head dot-product attention network (MHDPA) as a baseline for performance comparisons. We tested the two networks in two environments, i.e., the baseline box-world environment and our novel environment, relational-grid-world (RGW). With the procedurally generated RGW environment, which is complex in terms of visual perceptions and combinatorial selections, it is easy to measure the relational representation performance of the RL agents. The experiments were carried out using different configurations of the environment so that the presented module and the environment were compared with the baselines. We reached similar policy optimization performance results with the PrediNet architecture and MHDPA. Additionally, we were able to extract the propositional representation explicitly, which makes the agent's statistical policy logic more interpretable and tractable. This flexibility in the agent's policy provides convenience for designing non-task-specific agent architectures. The main contributions of this study are two-fold: an RL agent that can explicitly perform relational reasoning, and a new environment that measures the relational reasoning capabilities of RL agents.

First Page

1259

Last Page

1273
