Research – Page 2

Publication

Aug 10, 2019

Adaptive Thompson Sampling Stacks for Memory Bounded Open-Loop Planning

We propose Stable Yet Memory Bounded Open-Loop (SYMBOL) planning, a general memory bounded approach to partially observable open-loop planning. SYMBOL maintains an adaptive stack of Thompson Sampling bandits, whose size is bounded by the planning horizon and can be automatically...

Publication

Aug 10, 2019

Subgoal-Based Temporal Abstraction in Monte-Carlo Tree Search

We propose an approach to general subgoal-based temporal abstraction in MCTS. Our approach approximates a set of available macro-actions locally for each state only requiring a generative model and a subgoal predicate. For that, we modify the expansion step of...

Publication

May 13, 2019

Distributed Policy Iteration for Scalable Approximation of Cooperative Multi-Agent Policies (Extended Abstract)

We propose Strong Emergent Policy (STEP) approximation, a scalable approach to learn strong decentralized policies for cooperative MAS with a distributed variant of policy iteration. For that, we use function approximation to learn from action recommendations of a decentralized multi-agent...

Publication

Jan 27, 2019

Memory Bounded Open-Loop Planning in Large POMDPs using Thompson Sampling

State-of-the-art approaches to partially observable planning like POMCP are based on stochastic tree search. While these approaches are computationally efficient, they may still construct search trees of considerable size, which could limit the performance due to restricted memory resources. In...

Publication

Jul 10, 2018

Leveraging Statistical Multi-Agent Online Planning with Emergent Value Function Approximation

Making decisions is a great challenge in distributed autonomous environments due to enormous state spaces and uncertainty. Many online planning algorithms rely on statistical sampling to avoid searching the whole state space, while still being able to make acceptable decisions....
