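# Entropy
Let $X$ be a discrete random variable taking values in $\{x_1, x_2, \dots, x_n\}$ with probabilities $p(x_i) = P(X = x_i)$. Its entropy is
$$
H(X) = - \sum_{i=1}^{n} p(x_i) \log p(x_i)
$$
> When $p(x_i) = 0$, the term $p(x_i) \log p(x_i)$ is taken to be $0$.
> $-\log p(x)$ is the amount of information needed to encode the state $x$. More probable states should get shorter codes so that the expected total is smaller. Entropy is the theoretical minimum average code length for data that follows a given probability distribution.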
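
The entropy formula above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `entropy`, the base-2 logarithm (so the result is in bits), and the example distributions are assumptions that do not come from the note.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a discrete distribution given as a list of probabilities."""
    # Terms with p_i == 0 are skipped, which matches the convention 0 * log 0 = 0.
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# Hypothetical example distributions over two outcomes.
fair_coin = [0.5, 0.5]
biased_coin = [0.9, 0.1]

print(entropy(fair_coin))    # the uniform case, which needs the most bits on average
print(entropy(biased_coin))  # a more predictable source, which needs fewer bits
```

The more concentrated the distribution, the lower the entropy, which is consistent with the coding-length interpretation.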
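# Cross-Entropy
$$
H(p,q) = \sum_x p(x) \log \frac{1}{q(x)} = -\sum_x p(x) \log q(x)
$$
> The average amount of information needed to encode samples from the true distribution $p(x)$ with a code optimized for the predicted distribution $q(x)$.
> Since entropy is the minimum average code length, $H(p,q) \ge H(p)$ for a fixed $p$, with equality if and only if $p = q$, in which case $H(p,q) = H(q,p) = H(p) = H(q)$.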
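
As a rough numerical check of the cross-entropy definition, the sketch below compares $H(p,q)$ with $H(p,p)$, which equals $H(p)$. The function name `cross_entropy`, the base-2 logarithm, the handling of $q(x) = 0$, and the example distributions are assumptions made for illustration.

```python
import math

def cross_entropy(p, q):
    """Cross-entropy H(p, q) in bits for two discrete distributions of equal length."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue          # 0 * log q contributes nothing
        if qi == 0:
            return math.inf   # q assigns zero probability to an outcome that p can produce
        total -= pi * math.log2(qi)
    return total

# Hypothetical true distribution p and predicted distribution q.
p = [0.5, 0.25, 0.25]
q = [0.25, 0.25, 0.5]

print(cross_entropy(p, p))  # equals the entropy H(p)
print(cross_entropy(p, q))  # larger, because the code is tuned for q rather than p
```

With $p$ held fixed, any $q \ne p$ gives a larger value than $H(p)$, which is why cross-entropy is a common training loss.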
# KL Divergence
The KL divergence (relative entropy) is defined as:
$$
D_{KL}(p||q) = H(p,q) - H(p) = - \sum_x p(x) \log \frac{q(x)}{p(x)}
$$
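
The identity $D_{KL}(p||q) = H(p,q) - H(p)$ in the definition above can be checked numerically. The following is a minimal sketch: the helper names `cross_entropy` and `kl_divergence`, the base-2 logarithm, and the example distributions are assumptions, and both helpers assume $q(x) > 0$ wherever $p(x) > 0$.

```python
import math

def cross_entropy(p, q):
    """H(p, q) in bits; assumes q_i > 0 wherever p_i > 0."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """D_KL(p || q) in bits; assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical example distributions.
p = [0.5, 0.5]
q = [0.9, 0.1]

h_p = cross_entropy(p, p)  # H(p, p) is the entropy H(p)

# Both expressions for D_KL(p || q) should agree up to floating-point error.
print(kl_divergence(p, q), cross_entropy(p, q) - h_p)

# Swapping the arguments gives a different value for this pair.
print(kl_divergence(p, q), kl_divergence(q, p))
```

Computing the divergence in both directions shows that the order of the arguments matters.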
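The KL divergence has the following properties:
1. Non-negativity: $D_{KL}(p||q) \ge 0$, with equality if and only if $p = q$.
2. Asymmetry: in general, $D_{KL}(p||q) \ne D_{KL}(q||p)$.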
3.