Approximate Dynamic Programming with Probabilistic Temporal Logic Constraints. (arXiv:1810.02199v1 [math.OC])
Source: arXiv
In this paper, we develop approximate dynamic programming methods for
stochastic systems modeled as Markov Decision Processes, given both soft
performance criteria and hard constraints in a class of probabilistic temporal
logic called Probabilistic Computation Tree Logic (PCTL). Our approach consists
of two steps: First, we show how to transform a class of PCTL formulas into
chance constraints that can be enforced during planning in stochastic systems.
Second, by integrating randomized optimization and entropy-regulated dynamic
programming, we devise a novel trajectory sampling-based approximate value
iteration method that solves for an upper bound on the value function
while ensuring that the PCTL specifications are satisfied.
In particular, we show that on-policy sampling of the trajectories yields a
tight bound on the gap between the upper bound given by the approximation
and the true value function. The correctness and efficiency of the method are
demonstr
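
As a rough illustration of why an entropy-regulated backup produces an upper bound on the value function, the sketch below compares ordinary value iteration with a log-sum-exp ("soft") backup on a tiny MDP. This is not the paper's sampling-based, PCTL-constrained method: the MDP, the reward-maximization convention, the temperature `TAU`, and all function names are assumptions made up for this example.

```python
import math

# Hypothetical 2-state, 2-action MDP (not taken from the paper).
# P[s][a] lists (next_state, probability) pairs; R[s][a] is the reward.
P = {
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(0, 0.2), (1, 0.8)]},
    1: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 1.0)]},
}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 2.0, 1: 0.5}}
GAMMA = 0.9  # discount factor (assumed)
TAU = 0.1    # temperature of the entropy term (assumed)


def q_values(V, s):
    """One-step lookahead values Q(s, a) for every action a in state s."""
    return [
        R[s][a] + GAMMA * sum(p * V[s2] for s2, p in P[s][a])
        for a in sorted(P[s])
    ]


def hard_backup(qs):
    """Standard Bellman backup: maximum over actions."""
    return max(qs)


def soft_backup(qs, tau=TAU):
    """Entropy-regularized backup: tau * log(sum(exp(q / tau))).

    Shifting by the maximum keeps the exponentials numerically stable.
    The result is never below max(qs) and never above
    max(qs) + tau * log(number of actions).
    """
    m = max(qs)
    return m + tau * math.log(sum(math.exp((q - m) / tau) for q in qs))


def value_iteration(backup, iters=500):
    """Apply the given backup to every state for a fixed number of sweeps."""
    V = {s: 0.0 for s in P}
    for _ in range(iters):
        V = {s: backup(q_values(V, s)) for s in P}
    return V


V_star = value_iteration(hard_backup)
V_soft = value_iteration(soft_backup)

for s in sorted(P):
    print(f"state {s}: V* = {V_star[s]:.4f}, soft upper bound = {V_soft[s]:.4f}")
```

Because the log-sum-exp dominates the maximum, the soft iterates stay above the standard ones at every sweep, and in this sketch the gap is at most `TAU * log(2) / (1 - GAMMA)`. Lowering `TAU` tightens the bound at the cost of a less smooth backup.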