-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathsearch.xml
20 lines (9 loc) · 30.5 KB
/
search.xml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
<?xml version="1.0" encoding="utf-8"?>
<search>
<entry>
<title></title>
<link href="2021/05/20/hmm/"/>
<url>2021/05/20/hmm/</url>
<content type="html"><![CDATA[<h2 id="hmm-隐马尔可夫模型">HMM-隐马尔可夫模型</h2><h3 id="基本概念init">1. 基本概念(init)</h3><blockquote><p>定义:隐马尔可夫模型是关于时序的概率模型,描述由一个隐藏的马尔可夫链<strong>随机生成不可观测的状态随机序列</strong>, 再由<strong>各个状态生成一个观测从而产生观测随机序列</strong>的过程。隐藏的马尔可夫链随机生成的状态的序列,称为状态序列( state sequence ) ;每个状态生成一个观测,而由此产生的观测的随机序列,称为观测序列( observation sequence )。序列的每一个位置又可以看作是一个时刻。</p></blockquote><p><span class="math inline">\(I\)</span>是长度为<span class="math inline">\(T\)</span>的状态序列,<span class="math inline">\(O\)</span>是对应的观测序列: <span class="math display">\[I=\left(i_{1}, i_{2}, \cdots, i_{T}\right), \quad O=\left(o_{1}, o_{2}, \cdots, o_{T}\right)\]</span> <span class="math inline">\(A\)</span>是<strong>状态转移概率矩阵</strong>: <span class="math display">\[A=\left[a_{i j}\right]_{N \times N}\]</span> 其中, <span class="math display">\[a_{i j}=P\left(i_{t+1}=q_{j} \mid i_{t}=q_{i}\right), \quad i=1,2, \cdots, N ; \quad j=1,2, \cdots, N\]</span> 是在时刻<span class="math inline">\(t\)</span>处于状态<span class="math inline">\(q_i\)</span>的条件下在时刻<span class="math inline">\(t+1\)</span>转移到状态<span class="math inline">\(q_j\)</span>的概率。</p><p><em>B</em> 是<strong>观测概率矩阵</strong>: <span class="math display">\[B=\left[b_{j}(k)\right]_{N \times M}\]</span> 其中, <span class="math display">\[b_{j}(k)=P\left(o_{t}=v_{k} \mid i_{t}=q_{j}\right), \quad k=1,2, \cdots, M ; \quad j=1,2, \cdots, N\]</span> 是在时刻 <em>t</em> 处于状态<span class="math inline">\(q_i\)</span>的条件下生成观测<span class="math inline">\(v_{k}\)</span>的概率。</p><p><em>π</em> 是<strong>初始状态概率向量</strong>: <span class="math display">\[\pi=\left(\pi_{i}\right)\]</span> 其中,</p><p><span class="math display">\[\pi_{i}=P\left(i_{1}=q_{i}\right), \quad i=1,2, \cdots, N\]</span> 是时刻<span class="math inline">\(t=1\)</span>处于状态<span class="math inline">\(q_i\)</span>的概率。</p><blockquote><p>F:_projects+_entity_recognition.py</p></blockquote><pre class="line-numbers language-python" data-language="python"><code class="language-python"><span class="token keyword">def</span> <span class="token function">__init__</span><span class="token punctuation">(</span>self<span class="token punctuation">,</span> N<span class="token punctuation">,</span> M<span class="token punctuation">)</span><span class="token punctuation">:</span><span class="token triple-quoted-string string">"""Args: N: 状态数,这里对应存在的标注的种类 M: 观测数,这里对应有多少不同的字"""</span> self<span class="token punctuation">.</span>N <span class="token operator">=</span> N self<span class="token punctuation">.</span>M <span class="token operator">=</span> M <span class="token comment"># 状态转移概率矩阵 A[i][j]表示从i状态转移到j状态的概率</span> self<span class="token punctuation">.</span>A <span class="token operator">=</span> torch<span class="token punctuation">.</span>zeros<span class="token punctuation">(</span>N<span class="token punctuation">,</span> N<span class="token punctuation">)</span> <span class="token comment"># 观测概率矩阵, B[i][j]表示i状态下生成j观测的概率</span> self<span class="token punctuation">.</span>B <span class="token operator">=</span> torch<span class="token punctuation">.</span>zeros<span class="token punctuation">(</span>N<span class="token punctuation">,</span> M<span class="token punctuation">)</span> <span class="token comment"># 初始状态概率 Pi[i]表示初始时刻为状态i的概率</span> self<span class="token punctuation">.</span>Pi <span class="token operator">=</span> torch<span class="token punctuation">.</span>zeros<span class="token punctuation">(</span>N<span class="token punctuation">)</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><h3 id="学习算法train">2. 学习算法(train)</h3><p>HMM的训练,即根据训练语料对模型参数进行估计,因为我们有观测序列以及其对应的状态序列,所以我们使用<strong>极大似然估计</strong>的方法来估计隐马尔可夫模型的参数。</p><h4 id="转移概率a_i-j的估计">2.1 转移概率<span class="math inline">\(a_{i j}\)</span>的估计</h4><p>设样本中时刻 <em>t</em> 处于状态 <em>i</em> 时刻 <em>t+1</em> 转移到状态 <em>j</em> 的频数为<span class="math inline">\(A_{ij}\)</span>,那么状态转移概率<span class="math inline">\(a_{i j}\)</span>的估计是 <span class="math display">\[\hat{a}_{i j}=\frac{A_{i j}}{\sum_{j=1}^{N} A_{i j}}, \quad i=1,2, \cdots, N ; \quad j=1,2, \cdots, N\]</span></p><pre class="line-numbers language-python" data-language="python"><code class="language-python"><span class="token triple-quoted-string string">"""参数: word_lists: 列表,由字组成的列表,如 ['担','任','科','员'] tag_lists: 列表,由对应的标注组成的列表,如 ['O','O','B-TITLE', 'E-TITLE']"""</span><span class="token comment"># 估计转移概率矩阵</span><span class="token keyword">for</span> tag_list <span class="token keyword">in</span> tag_lists<span class="token punctuation">:</span> seq_len <span class="token operator">=</span> <span class="token builtin">len</span><span class="token punctuation">(</span>tag_list<span class="token punctuation">)</span> <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token builtin">range</span><span class="token punctuation">(</span>seq_len <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">:</span> current_tagid <span class="token operator">=</span> tag2id<span class="token punctuation">[</span>tag_list<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">]</span> next_tagid <span class="token operator">=</span> tag2id<span class="token punctuation">[</span>tag_list<span class="token punctuation">[</span>i<span class="token operator">+</span><span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">]</span> self<span class="token punctuation">.</span>A<span class="token punctuation">[</span>current_tagid<span class="token punctuation">]</span><span class="token punctuation">[</span>next_tagid<span class="token punctuation">]</span> <span class="token operator">+=</span> <span class="token number">1</span><span class="token comment"># 问题:如果某元素没有出现过,该位置为0,这在后续的计算中是不允许的</span><span class="token comment"># 解决方法:我们将等于0的概率加上很小的数</span>self<span class="token punctuation">.</span>A<span class="token punctuation">[</span>self<span class="token punctuation">.</span>A <span class="token operator">==</span> <span class="token number">0</span><span class="token punctuation">.</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token number">1e</span><span class="token operator">-</span><span class="token number">10</span>self<span class="token punctuation">.</span>A <span class="token operator">=</span> self<span class="token punctuation">.</span>A <span class="token operator">/</span> self<span class="token punctuation">.</span>A<span class="token punctuation">.</span><span class="token builtin">sum</span><span class="token punctuation">(</span>dim<span class="token operator">=</span><span class="token number">1</span><span class="token punctuation">,</span> keepdim<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">)</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><h4 id="观测概率b_jk的估计">2.2 观测概率<span class="math inline">\(b_j(k)\)</span>的估计</h4><p>设样本中状态为 <em>j</em> 并观测为 <em>k</em> 的频数是<span class="math inline">\(B_{j k}\)</span>那么状态为 <em>j</em> 观测为 <em>k</em> 的概率<span class="math inline">\(b_{j k}\)</span>的估计是 <span class="math display">\[\hat{b}_{j}(k)=\frac{B_{j k}}{\sum_{k=1}^{M} B_{j k}}, \quad j=1,2, \cdots, N ; \quad k=1,2, \cdots, M\]</span></p><pre class="line-numbers language-python" data-language="python"><code class="language-python"><span class="token comment"># 估计观测概率矩阵</span><span class="token keyword">for</span> tag_list<span class="token punctuation">,</span> word_list <span class="token keyword">in</span> <span class="token builtin">zip</span><span class="token punctuation">(</span>tag_lists<span class="token punctuation">,</span> word_lists<span class="token punctuation">)</span><span class="token punctuation">:</span> <span class="token keyword">assert</span> <span class="token builtin">len</span><span class="token punctuation">(</span>tag_list<span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token builtin">len</span><span class="token punctuation">(</span>word_list<span class="token punctuation">)</span> <span class="token keyword">for</span> tag<span class="token punctuation">,</span> word <span class="token keyword">in</span> <span class="token builtin">zip</span><span class="token punctuation">(</span>tag_list<span class="token punctuation">,</span> word_list<span class="token punctuation">)</span><span class="token punctuation">:</span> tag_id <span class="token operator">=</span> tag2id<span class="token punctuation">[</span>tag<span class="token punctuation">]</span> word_id <span class="token operator">=</span> word2id<span class="token punctuation">[</span>word<span class="token punctuation">]</span> self<span class="token punctuation">.</span>B<span class="token punctuation">[</span>tag_id<span class="token punctuation">]</span><span class="token punctuation">[</span>word_id<span class="token punctuation">]</span> <span class="token operator">+=</span> <span class="token number">1</span>self<span class="token punctuation">.</span>B<span class="token punctuation">[</span>self<span class="token punctuation">.</span>B <span class="token operator">==</span> <span class="token number">0</span><span class="token punctuation">.</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token number">1e</span><span class="token operator">-</span><span class="token number">10</span>self<span class="token punctuation">.</span>B <span class="token operator">=</span> self<span class="token punctuation">.</span>B <span class="token operator">/</span> self<span class="token punctuation">.</span>B<span class="token punctuation">.</span><span class="token builtin">sum</span><span class="token punctuation">(</span>dim<span class="token operator">=</span><span class="token number">1</span><span class="token punctuation">,</span> keepdim<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">)</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><h4 id="初始状态概率pi_i的估计">2.3 初始状态概率<span class="math inline">\(\pi_{i}\)</span>的估计</h4><p><span class="math inline">\(\hat{\pi}_{i}\)</span>为<span class="math inline">\(S\)</span>个样本中初始状态为<span class="math inline">\(q_i\)</span>的频率</p><p>由于监督学习需要使用标注的训练数据,而人工标注训练数据往往代价很高,有时就会利用无监督学习的方法。</p><pre class="line-numbers language-python" data-language="python"><code class="language-python"><span class="token comment"># 估计初始状态概率</span><span class="token keyword">for</span> tag_list <span class="token keyword">in</span> tag_lists<span class="token punctuation">:</span> init_tagid <span class="token operator">=</span> tag2id<span class="token punctuation">[</span>tag_list<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">]</span> self<span class="token punctuation">.</span>Pi<span class="token punctuation">[</span>init_tagid<span class="token punctuation">]</span> <span class="token operator">+=</span> <span class="token number">1</span>self<span class="token punctuation">.</span>Pi<span class="token punctuation">[</span>self<span class="token punctuation">.</span>Pi <span class="token operator">==</span> <span class="token number">0</span><span class="token punctuation">.</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token number">1e</span><span class="token operator">-</span><span class="token number">10</span>self<span class="token punctuation">.</span>Pi <span class="token operator">=</span> self<span class="token punctuation">.</span>Pi <span class="token operator">/</span> self<span class="token punctuation">.</span>Pi<span class="token punctuation">.</span><span class="token builtin">sum</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><h3 id="预测算法decoding">3. 预测算法(decoding)</h3><p>根据动态规划原理,我们只需从时刻<span class="math inline">\(t=1\)</span>开始,递推地计算在时刻<span class="math inline">\(t\)</span>状态为<span class="math inline">\(i\)</span>的各条部分路径的最大概率,直至得到时刻<span class="math inline">\(t = T\)</span>状态为<span class="math inline">\(i\)</span>的各条路径的最大概率。时刻<span class="math inline">\(t = T\)</span>的最大概率即为最优路径的概率<span class="math inline">\(P^*\)</span>,最优路径的终结点<span class="math inline">\(i^*\)</span>也同时得到。之后,为了找出最优路径的各个结点,从终结点件开始,由后向前逐步求得结点<span class="math inline">\(i^*_{T-1},...,i^*_1\)</span>,得到最优路径<span class="math inline">\(I^*=(i^*_1,...,i^*_{T-1},i^*_T)\)</span>。这就是维特比算法。</p><h4 id="维特比算法">维特比算法</h4><p><span class="math inline">\(\begin{array}{l} \text { 输入: 模型 } \lambda=(A, B, \pi) \text { 和观测 } O=\left(o_{1}, o_{2}, \cdots, o_{T}\right) \text { ; }\\ \text { 输出: 最优路径 } I^{*}=\left(i_{1}^{*}, i_{2}^{*}, \cdots, i_{T}^{*}\right) \text { 。 } \end{array}\)</span></p><ol type="1"><li>初始化 <span class="math display">\[\begin{array}{c}\delta_{1}(i)=\pi_{i} b_{i}\left(o_{1}\right), \quad i=1,2, \cdots, N \\\Psi_{1}(i)=0, \quad i=1,2, \cdots, N\end{array}\]</span></li></ol><pre class="line-numbers language-python" data-language="python"><code class="language-python"><span class="token comment"># 问题:整条链很长的情况下,十分多的小概率相乘,最后可能造成下溢</span><span class="token comment"># 解决办法:采用对数概率,这样源空间中的很小概率,就被映射到对数空间的大的负数</span><span class="token comment"># 同时相乘操作也变成简单的相加操作</span>A <span class="token operator">=</span> torch<span class="token punctuation">.</span>log<span class="token punctuation">(</span>self<span class="token punctuation">.</span>A<span class="token punctuation">)</span>B <span class="token operator">=</span> torch<span class="token punctuation">.</span>log<span class="token punctuation">(</span>self<span class="token punctuation">.</span>B<span class="token punctuation">)</span>Pi <span class="token operator">=</span> torch<span class="token punctuation">.</span>log<span class="token punctuation">(</span>self<span class="token punctuation">.</span>Pi<span class="token punctuation">)</span><span class="token comment"># 初始化 维比特矩阵viterbi 它的维度为[状态数, 序列长度]</span><span class="token comment"># 其中viterbi[i, j]表示标注序列的第j个标注为i的所有单个序列(i_1, i_2, ..i_j)出现的概率最大值</span>seq_len <span class="token operator">=</span> <span class="token builtin">len</span><span class="token punctuation">(</span>word_list<span class="token punctuation">)</span>viterbi <span class="token operator">=</span> torch<span class="token punctuation">.</span>zeros<span class="token punctuation">(</span>self<span class="token punctuation">.</span>N<span class="token punctuation">,</span> seq_len<span class="token punctuation">)</span><span class="token comment"># backpointer是跟viterbi一样大小的矩阵</span><span class="token comment"># backpointer[i, j]存储的是 标注序列的第j个标注为i时,第j-1个标注的id</span><span class="token comment"># 等解码的时候,我们用backpointer进行回溯,以求出最优路径</span>backpointer <span class="token operator">=</span> torch<span class="token punctuation">.</span>zeros<span class="token punctuation">(</span>self<span class="token punctuation">.</span>N<span class="token punctuation">,</span> seq_len<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token builtin">long</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token comment"># self.Pi[i] 表示第一个字的标记为i的概率</span><span class="token comment"># Bt[word_id]表示字为word_id的时候,对应各个标记的概率</span><span class="token comment"># self.A.t()[tag_id]表示各个状态转移到tag_id对应的概率</span>start_wordid <span class="token operator">=</span> word2id<span class="token punctuation">.</span>get<span class="token punctuation">(</span>word_list<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">,</span> <span class="token boolean">None</span><span class="token punctuation">)</span>Bt <span class="token operator">=</span> B<span class="token punctuation">.</span>t<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token keyword">if</span> start_wordid <span class="token keyword">is</span> <span class="token boolean">None</span><span class="token punctuation">:</span> <span class="token comment"># 如果字不再字典里,则假设状态的概率分布是均匀的</span> bt <span class="token operator">=</span> torch<span class="token punctuation">.</span>log<span class="token punctuation">(</span>torch<span class="token punctuation">.</span>ones<span class="token punctuation">(</span>self<span class="token punctuation">.</span>N<span class="token punctuation">)</span> <span class="token operator">/</span> self<span class="token punctuation">.</span>N<span class="token punctuation">)</span> <span class="token keyword">else</span><span class="token punctuation">:</span> bt <span class="token operator">=</span> Bt<span class="token punctuation">[</span>start_wordid<span class="token punctuation">]</span> <span class="token comment"># 转置表示各状态下生成start_wordid观测的概率</span>viterbi<span class="token punctuation">[</span><span class="token punctuation">:</span><span class="token punctuation">,</span> <span class="token number">0</span><span class="token punctuation">]</span> <span class="token operator">=</span> Pi <span class="token operator">+</span> bt <span class="token comment"># 状态路径[0]的各状态下观测为start_wordid的联合概率P(..o_i, ..i_t|λ),为了对照观测序列</span>backpointer<span class="token punctuation">[</span><span class="token punctuation">:</span><span class="token punctuation">,</span> <span class="token number">0</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token operator">-</span><span class="token number">1</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><ol start="2" type="1"><li>递推。对$ t=2,3, , T$ <span class="math display">\[\\\begin{array}{l}\delta_{t}(i)=\max _{1 \leqslant j \leqslant N}\left[\delta_{t-1}(j) a_{j i}\right] b_{i}\left(o_{t}\right), \quad i=1,2, \cdots, N \\\Psi_{t}(i)=\arg \max _{1 \leqslant j \leqslant N}\left[\delta_{t-1}(j) a_{j i}\right], \quad i=1,2, \cdots, N\end{array}\]</span></li></ol><pre class="line-numbers language-python" data-language="python"><code class="language-python"><span class="token comment"># viterbi[tag_id, step] = max(viterbi[:, step-1]* self.A.t()[tag_id] * Bt[word])</span><span class="token comment"># 其中word是step时刻对应的字</span><span class="token keyword">for</span> step <span class="token keyword">in</span> <span class="token builtin">range</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">,</span> seq_len<span class="token punctuation">)</span><span class="token punctuation">:</span> wordid <span class="token operator">=</span> word2id<span class="token punctuation">.</span>get<span class="token punctuation">(</span>word_list<span class="token punctuation">[</span>step<span class="token punctuation">]</span><span class="token punctuation">,</span> <span class="token boolean">None</span><span class="token punctuation">)</span> <span class="token comment"># 处理字不在字典中的情况</span> <span class="token comment"># bt是在t时刻字为wordid时,状态的概率分布</span> <span class="token keyword">if</span> wordid <span class="token keyword">is</span> <span class="token boolean">None</span><span class="token punctuation">:</span> <span class="token comment"># 如果字不再字典里,则假设状态的概率分布是均匀的</span> bt <span class="token operator">=</span> torch<span class="token punctuation">.</span>log<span class="token punctuation">(</span>torch<span class="token punctuation">.</span>ones<span class="token punctuation">(</span>self<span class="token punctuation">.</span>N<span class="token punctuation">)</span> <span class="token operator">/</span> self<span class="token punctuation">.</span>N<span class="token punctuation">)</span> <span class="token keyword">else</span><span class="token punctuation">:</span> bt <span class="token operator">=</span> Bt<span class="token punctuation">[</span>wordid<span class="token punctuation">]</span> <span class="token comment"># 否则从观测概率矩阵中取bt</span> <span class="token keyword">for</span> tag_id <span class="token keyword">in</span> <span class="token builtin">range</span><span class="token punctuation">(</span><span class="token builtin">len</span><span class="token punctuation">(</span>tag2id<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">:</span> <span class="token comment"># max_id:最有可能(step-1->step)转移到tag_id的上一个状态</span> max_prob<span class="token punctuation">,</span> max_id <span class="token operator">=</span> torch<span class="token punctuation">.</span><span class="token builtin">max</span><span class="token punctuation">(</span> viterbi<span class="token punctuation">[</span><span class="token punctuation">:</span><span class="token punctuation">,</span> step <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">]</span> <span class="token operator">+</span> A<span class="token punctuation">[</span><span class="token punctuation">:</span><span class="token punctuation">,</span> tag_id<span class="token punctuation">]</span><span class="token punctuation">,</span> dim<span class="token operator">=</span><span class="token number">0</span> <span class="token punctuation">)</span> viterbi<span class="token punctuation">[</span>tag_id<span class="token punctuation">,</span> step<span class="token punctuation">]</span> <span class="token operator">=</span> max_prob <span class="token operator">+</span> bt<span class="token punctuation">[</span>tag_id<span class="token punctuation">]</span> backpointer<span class="token punctuation">[</span>tag_id<span class="token punctuation">,</span> step<span class="token punctuation">]</span> <span class="token operator">=</span> max_id<span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><ol start="3" type="1"><li>终止 <span class="math display">\[\begin{array}{c}P^{*}=\max _{1 \leqslant i \leqslant N} \delta_{T}(i) \\i_{T}^{*}=\arg \max _{1 \leqslant i \leqslant N}\left[\delta_{T}(i)\right]\end{array}\]</span></li></ol><pre class="line-numbers language-python" data-language="python"><code class="language-python"><span class="token comment"># t=seq_len 即 viterbi[:, seq_len]中的最大概率,就是最优路径的概率,</span><span class="token comment"># 最后一个最佳状态根据viterbi算,其他backpointer</span>best_path_prob<span class="token punctuation">,</span> best_path_pointer <span class="token operator">=</span> torch<span class="token punctuation">.</span><span class="token builtin">max</span><span class="token punctuation">(</span> viterbi<span class="token punctuation">[</span><span class="token punctuation">:</span><span class="token punctuation">,</span> seq_len <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">,</span> dim<span class="token operator">=</span><span class="token number">0</span><span class="token punctuation">)</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span></span></code></pre><ol start="4" type="1"><li>最优路径回溯。对 <span class="math inline">\(t=T-1, T-2, \cdots, 1\)</span> <span class="math display">\[i_{t}^{*}=\Psi_{t+1}\left(i_{t+1}^{*}\right)\]</span> 求得最优路径<span class="math inline">\(I^{*}=\left(i_{1}^{*}, i_{2}^{*}, \cdots, i_{T}^{*}\right)_{\text {。 }}\)</span></li></ol><pre class="line-numbers language-python" data-language="python"><code class="language-python"><span class="token comment"># 回溯,求最优路径</span>best_path_pointer <span class="token operator">=</span> best_path_pointer<span class="token punctuation">.</span>item<span class="token punctuation">(</span><span class="token punctuation">)</span>best_path <span class="token operator">=</span> <span class="token punctuation">[</span>best_path_pointer<span class="token punctuation">]</span><span class="token keyword">for</span> back_step <span class="token keyword">in</span> <span class="token builtin">range</span><span class="token punctuation">(</span>seq_len <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">0</span><span class="token punctuation">,</span> <span class="token operator">-</span><span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">:</span> best_path_pointer <span class="token operator">=</span> backpointer<span class="token punctuation">[</span>best_path_pointer<span class="token punctuation">,</span> back_step<span class="token punctuation">]</span> best_path_pointer <span class="token operator">=</span> best_path_pointer<span class="token punctuation">.</span>item<span class="token punctuation">(</span><span class="token punctuation">)</span> best_path<span class="token punctuation">.</span>append<span class="token punctuation">(</span>best_path_pointer<span class="token punctuation">)</span><span class="token comment"># 将tag_id组成的序列转化为tag</span><span class="token keyword">assert</span> <span class="token builtin">len</span><span class="token punctuation">(</span>best_path<span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token builtin">len</span><span class="token punctuation">(</span>word_list<span class="token punctuation">)</span>id2tag <span class="token operator">=</span> <span class="token builtin">dict</span><span class="token punctuation">(</span><span class="token punctuation">(</span>id_<span class="token punctuation">,</span> tag<span class="token punctuation">)</span> <span class="token keyword">for</span> tag<span class="token punctuation">,</span> id_ <span class="token keyword">in</span> tag2id<span class="token punctuation">.</span>items<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>tag_list <span class="token operator">=</span> <span class="token punctuation">[</span>id2tag<span class="token punctuation">[</span>id_<span class="token punctuation">]</span> <span class="token keyword">for</span> id_ <span class="token keyword">in</span> <span class="token builtin">reversed</span><span class="token punctuation">(</span>best_path<span class="token punctuation">)</span><span class="token punctuation">]</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre>]]></content>
</entry>
</search>