Gradient conjugate priors and deep neural networks. (arXiv:1802.02643v1 [math.ST])
Source: arXiv
The paper deals with learning the probability distribution of the observed
data by artificial neural networks. We suggest a so-called gradient conjugate
prior (GCP) update appropriate for neural networks, which is a modification of
the classical Bayesian update for conjugate priors. We establish a connection
between the gradient conjugate prior update and the maximization of the
log-likelihood of the predictive distribution. Unlike Bayesian neural
networks, we do not impose a prior on the network weights; instead, we
assume that the ground-truth distribution is normal with unknown mean and
variance, and we use neural networks to learn the parameters of a prior
(a normal-gamma distribution) for this unknown mean and variance. The
parameters are updated using a gradient that, at each step, points towards
minimizing the Kullback--Leibler divergence from the prior to the posterior
distribution (both being normal-gamma). We obtain a corresponding dynamical
system …
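
A minimal sketch of the idea the abstract describes, not the authors' code: a small network maps an input x to the four parameters (mu0, lambda, alpha, beta) of a normal-gamma prior over the unknown mean and precision of y given x, and is trained by maximizing the log-likelihood of the predictive distribution, which for a normal-gamma prior is a Student-t. All names here (GCPHead, predictive_nll) are hypothetical, and the KL-based GCP update itself is replaced by this predictive-likelihood surrogate, which the abstract says is connected to it.

import torch
import torch.nn as nn

class GCPHead(nn.Module):
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh())
        self.out = nn.Linear(hidden, 4)  # raw (mu0, lam, alpha, beta)

    def forward(self, x):
        raw = self.out(self.body(x))
        mu0 = raw[:, 0]
        # softplus keeps lam, alpha, beta strictly positive
        lam = nn.functional.softplus(raw[:, 1]) + 1e-6
        alpha = nn.functional.softplus(raw[:, 2]) + 1e-6
        beta = nn.functional.softplus(raw[:, 3]) + 1e-6
        return mu0, lam, alpha, beta

def predictive_nll(mu0, lam, alpha, beta, y):
    # The predictive distribution of a normal-gamma prior
    # NG(mu0, lam, alpha, beta) is Student-t with df = 2*alpha,
    # loc = mu0, scale = sqrt(beta * (lam + 1) / (alpha * lam)).
    scale = torch.sqrt(beta * (lam + 1.0) / (alpha * lam))
    dist = torch.distributions.StudentT(df=2.0 * alpha, loc=mu0, scale=scale)
    return -dist.log_prob(y).mean()

# Usage: fit heteroscedastic noise around a sine wave.
torch.manual_seed(0)
x = torch.rand(512, 1) * 6.0
y = torch.sin(x[:, 0]) + (0.1 + 0.2 * x[:, 0] / 6.0) * torch.randn(512)

model = GCPHead(in_dim=1)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(2000):
    opt.zero_grad()
    loss = predictive_nll(*model(x), y)
    loss.backward()
    opt.step()

Because the network predicts all four prior parameters per input, the sketch captures both an input-dependent mean and an input-dependent noise level; the Student-t predictive has heavier tails than a Gaussian when alpha is small, reflecting remaining uncertainty about the variance.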