Deng Zuoheng's Blog
2020-11-19T09:22:09+00:00
http://dengzuoheng.github.io
C++ Concurrency Patterns #14: Load Balancing - Work Stealing
2020-01-02T00:00:00+00:00
http://dengzuoheng.github.io/cpp-concurrency-pattern-14-work-stealing
<h2 id="introduction">Introduction</h2>
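<p>Work stealing means that when a worker thread's own task queue is empty, it steals tasks from the task queues of other worker threads and executes them.</p>
<p>In the fork/join article we noted that when a thread pool runs a fairly large task, the task may be split into several smaller tasks along the way (and those may in turn be split into even smaller ones). To reduce contention among worker threads on the common task queue, we give each worker thread its own task queue, and the subtasks a worker splits off while running a task go into that worker's own queue.</p>
<p>This creates a problem, though. The initial tasks vary in size, so some workers finish their own tasks while others are still busy, and the load becomes unbalanced. The work-stealing algorithm was invented to address this. Its core is simple: when the current worker's task queue is empty, it goes to the queue of another worker that still has tasks and takes one (or several) back.</p>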
<p><img src="/images/work_stealing.png" alt="Work Stealing" /></p>
<h2 id="design-and-behavior">Design and Behavior</h2>
<p>To implement a work-stealing thread pool, we need to answer the following questions:</p>
<ul>
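<li>Do we need a common queue?</li>
<li>Why use a double-ended queue?</li>
<li>Which task queue do we steal from?</li>
<li>How many tasks do we steal at a time?</li>
<li>When do we wake a worker?</li>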
</ul>
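<h3 id="需要一个公共队列吗">Do we need a common queue?</h3>
<p>Whether externally submitted tasks go into a common queue or are hashed directly onto the workers' task queues depends mainly on requirements. In terms of contention, hashing should contend less than a common queue. With hashing, however, stealing takes tasks from the back of a queue, so a task submitted later may finish before one submitted earlier, which breaks the expectation that the pool as a whole is first-in, first-out. Java's <code class="language-plaintext highlighter-rouge">ForkJoinPool</code> has a common queue, so here we also use a common queue to buffer externally submitted tasks.</p>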
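<h3 id="为什么使用双端队列">Why use a double-ended queue?</h3>
<p>In the fork/join article we already met the "per-thread deque" scheme, in which every worker thread has its own task queue. Why it has to be a double-ended queue can be seen from two angles.</p>
<p>First, we need to push at both ends. With hashing, external submissions go to the back of a task queue (first-in, first-out), whereas fork/join pushes subtasks to the front (last-in, first-out).</p>
<p>Second, we need to pop at both ends. The front is obvious: the worker thread takes its own tasks from the front. Stealing usually takes from the back, because the two ends of a deque can be protected by two separate locks, which reduces contention. Moreover, under fork/join the tasks at the back are larger, and we prefer to steal large tasks.</p>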
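<p>The two-ended access pattern described above can be sketched roughly as follows. This is only an illustration: <code class="language-plaintext highlighter-rouge">demo_deque</code> is a hypothetical name, tasks are represented by strings, and a single <code class="language-plaintext highlighter-rouge">std::mutex</code> guards both ends for brevity instead of the two separate locks mentioned in the text.</p>

```cpp
#include <deque>
#include <mutex>
#include <string>

// Hypothetical illustration type; tasks are plain strings here.
class demo_deque {
public:
    // fork: the owner pushes a freshly split subtask at the front
    void push_front(const std::string& task) {
        std::lock_guard<std::mutex> lk(m_mtx);
        m_q.push_front(task);
    }
    // external submission (hashed case): append at the back
    void push_back(const std::string& task) {
        std::lock_guard<std::mutex> lk(m_mtx);
        m_q.push_back(task);
    }
    // owner: takes the most recently forked (smallest) task from the front
    bool try_pop_front(std::string& out) {
        std::lock_guard<std::mutex> lk(m_mtx);
        if (m_q.empty()) return false;
        out = m_q.front();
        m_q.pop_front();
        return true;
    }
    // thief: takes the oldest (largest) task from the back
    bool try_pop_back(std::string& out) {
        std::lock_guard<std::mutex> lk(m_mtx);
        if (m_q.empty()) return false;
        out = m_q.back();
        m_q.pop_back();
        return true;
    }

private:
    std::mutex m_mtx;  // one lock for both ends, a simplification
    std::deque<std::string> m_q;
};
```

<p>If the owner forks "big", then "medium", then "small", its own next task is "small", while a thief calling <code class="language-plaintext highlighter-rouge">try_pop_back</code> receives "big".</p>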
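<h3 id="从哪个任务队列窃取">Which task queue do we steal from?</h3>
<p>If tasks are hashed onto the per-thread queues at submission time, the answer is easy: pick a queue at random and traverse the others starting from it.</p>
<p>With a common queue there is one more thing to decide: do we steal from the other queues first, or take from the common queue first? Taking from the common queue first matches intuition, but it actually violates the pool-wide first-in, first-out expectation, because the tasks in the other workers' queues must have been split off from tasks that were submitted earlier. If we steal first, however, the stealing frequency rises sharply: every attempt may have to traverse the other task queues looking for something to steal, which can mean many lock and unlock operations. Java's <code class="language-plaintext highlighter-rouge">ForkJoinPool</code> steals first, so we steal first here as well.</p>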
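<h3 id="一次窃取多少个任务">How many tasks do we steal at a time?</h3>
<p>The number of tasks stolen per attempt is mostly a question of lock contention. Stealing one at a time means many steals and therefore potentially many lock contentions. Stealing several at once may leave the thief with more than it can finish, so that it in turn waits for others to steal from it; after all, the tasks at the back are the larger ones. Java's <code class="language-plaintext highlighter-rouge">ForkJoinPool</code> steals one task at a time. The author has also used an implementation that steals several at once, but it was not for fork/join: it handled a large number of submitted tasks that were hashed onto the queues at submission, so each task could be assumed to be about the same size and a fixed proportion could be stolen. Since this article continues the fork/join article, we stay with the fork/join scenario, where task sizes vary a lot, and steal one task at a time.</p>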
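<p>For the "steal a proportion" variant mentioned above, a minimal sketch might look like the following. The article does not say which proportion was used; taking half of the victim's queue (rounded up) is an assumption made purely for illustration, and <code class="language-plaintext highlighter-rouge">locked_deque</code> and <code class="language-plaintext highlighter-rouge">steal_half</code> are hypothetical names. The pool built in this article does not use it.</p>

```cpp
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical victim queue: a deque of task ids guarded by one mutex.
struct locked_deque {
    std::mutex mtx;
    std::deque<int> q;
};

// Moves half of the victim's tasks (rounded up) from the back of the
// victim's queue into `loot`, taking the lock only once.
// Returns the number of tasks stolen.
std::size_t steal_half(locked_deque& victim, std::deque<int>& loot) {
    std::lock_guard<std::mutex> lk(victim.mtx);
    std::size_t n = (victim.q.size() + 1) / 2;  // assumed ratio: one half
    for (std::size_t i = 0; i < n; ++i) {
        loot.push_back(victim.q.back());
        victim.q.pop_back();
    }
    return n;
}
```

<p>The point of the batch is that one lock acquisition covers several tasks; the cost is that the thief may end up overloaded when task sizes differ widely, which is why the fork/join case prefers one task per steal.</p>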
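<h3 id="什么时候唤醒">When do we wake a worker?</h3>
<p>A worker thread with no tasks has to block and wait. The question is when to wake it. There are two moments to consider: when stealing and when forking.</p>
<p>Naturally, waking means waking one thread rather than several. If a thief sees that the queue still holds many tasks, it should certainly wake someone. But what if only one task is left in the queue? Judging from the implementation of Java's <code class="language-plaintext highlighter-rouge">ForkJoinPool</code>, a thread is still woken; after all, we should not watch a task sit there without executing it.</p>
<p>When a worker thread forks a subtask, the fork is usually followed by a join, and we have to leave one task for <code class="language-plaintext highlighter-rouge">try_executing_one</code> to run at join time. So on fork we wake another thread only when the task queue holds more than one task.</p>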
<h2 id="basic-implementation">Basic Implementation</h2>
<h3 id="blocking_deque">blocking_deque</h3>
<p>To implement a work-stealing thread pool we first need a thread-safe double-ended queue. We can call it <code class="language-plaintext highlighter-rouge">sync_deque</code> or <code class="language-plaintext highlighter-rouge">blocking_deque</code>; its interface is as follows:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">class</span> <span class="nc">blocking_deque</span> <span class="o">:</span> <span class="n">boost</span><span class="o">::</span><span class="n">noncopyable</span> <span class="p">{</span>
<span class="nl">public:</span>
<span class="n">blocking_deque</span><span class="p">();</span>
<span class="n">queue_op_status</span> <span class="n">push_back</span><span class="p">(</span><span class="k">const</span> <span class="n">T</span><span class="o">&</span> <span class="n">val</span><span class="p">);</span>
<span class="n">queue_op_status</span> <span class="n">pop_back</span><span class="p">(</span><span class="n">T</span><span class="o">&</span> <span class="n">val</span><span class="p">);</span>
<span class="n">queue_op_status</span> <span class="n">try_pop_back</span><span class="p">(</span><span class="n">T</span><span class="o">&</span> <span class="n">val</span><span class="p">);</span>
<span class="n">queue_op_status</span> <span class="n">push_front</span><span class="p">(</span><span class="k">const</span> <span class="n">T</span><span class="o">&</span> <span class="n">val</span><span class="p">);</span>
<span class="n">queue_op_status</span> <span class="n">pop_front</span><span class="p">(</span><span class="n">T</span><span class="o">&</span> <span class="n">val</span><span class="p">);</span>
<span class="n">queue_op_status</span> <span class="n">try_pop_front</span><span class="p">(</span><span class="n">T</span><span class="o">&</span> <span class="n">val</span><span class="p">);</span>
<span class="kt">size_t</span> <span class="n">size</span><span class="p">()</span> <span class="k">const</span><span class="p">;</span>
<span class="kt">bool</span> <span class="n">empty</span><span class="p">()</span> <span class="k">const</span><span class="p">;</span>
<span class="kt">bool</span> <span class="n">closed</span><span class="p">()</span> <span class="k">const</span><span class="p">;</span>
<span class="kt">void</span> <span class="n">close</span><span class="p">();</span>
<span class="p">};</span>
</code></pre></div></div>
<p>It can be implemented simply by imitating a blocking queue, so the details are omitted here.</p>
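<p>For readers who want something concrete, here is one possible sketch of such a deque. It is not the author's implementation: it uses the C++11 standard library instead of Boost, a single mutex for both ends, and an assumed <code class="language-plaintext highlighter-rouge">queue_op_status</code> enum with only the three values needed here. The private helpers <code class="language-plaintext highlighter-rouge">push</code> and <code class="language-plaintext highlighter-rouge">pop</code> are hypothetical.</p>

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

// Assumed status codes; the article does not show this definition.
enum class queue_op_status { success, empty, closed };

template <typename T>
class blocking_deque {
public:
    blocking_deque() : m_closed(false) {}
    blocking_deque(const blocking_deque&) = delete;
    blocking_deque& operator=(const blocking_deque&) = delete;

    queue_op_status push_back(const T& val) { return push(val, false); }
    queue_op_status push_front(const T& val) { return push(val, true); }

    // Blocking pops: wait until an element arrives or the deque is closed.
    queue_op_status pop_back(T& val) { return pop(val, false, true); }
    queue_op_status pop_front(T& val) { return pop(val, true, true); }

    // Non-blocking pops: return immediately when nothing is available.
    queue_op_status try_pop_back(T& val) { return pop(val, false, false); }
    queue_op_status try_pop_front(T& val) { return pop(val, true, false); }

    std::size_t size() const {
        std::lock_guard<std::mutex> lk(m_mtx);
        return m_q.size();
    }
    bool empty() const {
        std::lock_guard<std::mutex> lk(m_mtx);
        return m_q.empty();
    }
    bool closed() const {
        std::lock_guard<std::mutex> lk(m_mtx);
        return m_closed;
    }
    void close() {
        {
            std::lock_guard<std::mutex> lk(m_mtx);
            m_closed = true;
        }
        m_cond.notify_all();  // release every thread blocked in pop_*
    }

private:
    queue_op_status push(const T& val, bool at_front) {
        {
            std::lock_guard<std::mutex> lk(m_mtx);
            if (m_closed) return queue_op_status::closed;
            if (at_front) m_q.push_front(val); else m_q.push_back(val);
        }
        m_cond.notify_one();
        return queue_op_status::success;
    }
    queue_op_status pop(T& val, bool at_front, bool blocking) {
        std::unique_lock<std::mutex> lk(m_mtx);
        if (blocking) {
            m_cond.wait(lk, [this] { return !m_q.empty() || m_closed; });
        }
        if (m_q.empty()) {
            return m_closed ? queue_op_status::closed : queue_op_status::empty;
        }
        if (at_front) {
            val = m_q.front();
            m_q.pop_front();
        } else {
            val = m_q.back();
            m_q.pop_back();
        }
        return queue_op_status::success;
    }

    mutable std::mutex m_mtx;        // one lock guards both ends in this sketch
    std::condition_variable m_cond;  // signalled on push and on close
    std::deque<T> m_q;
    bool m_closed;
};
```

<p>In this sketch a closed deque rejects new pushes but can still be drained; a pop on a closed and empty deque reports <code class="language-plaintext highlighter-rouge">closed</code>.</p>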
<h3 id="接口与成员">Interface and members</h3>
<p>We continue to use <code class="language-plaintext highlighter-rouge">boost::function<void()></code> as the task type. Following the discussion in the previous fork/join article, <code class="language-plaintext highlighter-rouge">work_stealing_thread_pool</code> needs to provide <code class="language-plaintext highlighter-rouge">submit_front</code> and <code class="language-plaintext highlighter-rouge">submit_back</code>, where <code class="language-plaintext highlighter-rouge">submit_front</code> is the one used by the <code class="language-plaintext highlighter-rouge">fork</code> function.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">work_stealing_thread_pool</span> <span class="o">:</span> <span class="n">boost</span><span class="o">::</span><span class="n">noncopyable</span> <span class="p">{</span>
<span class="nl">public:</span>
<span class="n">work_stealing_thread_pool</span><span class="p">();</span>
<span class="o">~</span><span class="n">work_stealing_thread_pool</span><span class="p">();</span>
<span class="nl">public:</span>
<span class="kt">void</span> <span class="n">close</span><span class="p">();</span>
<span class="kt">bool</span> <span class="n">closed</span><span class="p">();</span>
<span class="kt">void</span> <span class="n">join</span><span class="p">();</span>
<span class="kt">void</span> <span class="n">submit</span><span class="p">(</span><span class="k">const</span> <span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">);</span>
<span class="kt">void</span> <span class="n">submit_front</span><span class="p">(</span><span class="k">const</span> <span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">);</span>
<span class="kt">void</span> <span class="n">submit_back</span><span class="p">(</span><span class="k">const</span> <span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">);</span>
<span class="kt">bool</span> <span class="n">try_executing_one</span><span class="p">();</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Pred</span><span class="p">></span>
<span class="kt">bool</span> <span class="n">reschedule_until</span><span class="p">(</span><span class="k">const</span> <span class="n">Pred</span><span class="o">&</span> <span class="n">pred</span><span class="p">);</span>
<span class="p">};</span>
</code></pre></div></div>
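<p>Because every worker thread has its own task queue, we can store the thread objects and the task queues in <code class="language-plaintext highlighter-rouge">std::vector</code>s. In addition, when <code class="language-plaintext highlighter-rouge">submit_front</code> is called from a worker thread, the task should go to that worker's own queue, so we also need a map from thread id to vector index. This gives the following data members:</p>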
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">work_stealing_thread_pool</span> <span class="o">:</span> <span class="n">boost</span><span class="o">::</span><span class="n">noncopyable</span> <span class="p">{</span>
<span class="k">typedef</span> <span class="n">blocking_deque</span><span class="o"><</span><span class="n">work</span><span class="o">></span> <span class="n">taskq_t</span><span class="p">;</span>
<span class="k">typedef</span> <span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">taskq_t</span><span class="o">></span> <span class="n">taskq_ptr</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="o">></span> <span class="n">m_threads</span><span class="p">;</span>
<span class="n">boost</span><span class="o">::</span><span class="n">unordered_map</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="o">::</span><span class="n">id</span><span class="p">,</span> <span class="kt">size_t</span><span class="o">></span> <span class="n">m_thm</span><span class="p">;</span>
<span class="n">taskq_ptr</span> <span class="n">m_comm_q</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">taskq_ptr</span><span class="o">></span> <span class="n">m_perth_q</span><span class="p">;</span>
<span class="n">boost</span><span class="o">::</span><span class="n">mutex</span> <span class="n">m_mtx</span><span class="p">;</span>
<span class="n">boost</span><span class="o">::</span><span class="n">condition_variable</span> <span class="n">m_cond</span><span class="p">;</span>
<span class="c1">// ...</span>
<span class="p">};</span>
</code></pre></div></div>
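<p>The <code class="language-plaintext highlighter-rouge">m_mtx</code> and <code class="language-plaintext highlighter-rouge">m_cond</code> members may look puzzling: is there anything here that needs protecting? Actually no. <code class="language-plaintext highlighter-rouge">blocking_deque</code> is thread-safe, and we never modify the vectors or the map while the pool is running. The condition variable is there because <code class="language-plaintext highlighter-rouge">work_stealing_thread_pool</code> must not block when taking a task from a task queue (the reason is explained later). Since taking a task is non-blocking, we have to decide how a worker goes to sleep when every queue is empty and how it is woken again, so we provide a condition variable for the workers to wait on.</p>
<p>With the data members sorted out, we can write the constructor:</p>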
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">work_stealing_thread_pool</span><span class="p">(</span><span class="kt">size_t</span> <span class="n">thread_count</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="o">::</span><span class="n">hardware_concurrency</span><span class="p">()</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">m_comm_q</span><span class="p">.</span><span class="n">reset</span><span class="p">(</span><span class="k">new</span> <span class="n">taskq_t</span><span class="p">());</span>
<span class="n">std</span><span class="o">::</span><span class="n">srand</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">time</span><span class="p">(</span><span class="nb">NULL</span><span class="p">));</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">size_t</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">thread_count</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
<span class="n">m_perth_q</span><span class="p">.</span><span class="n">emplace_back</span><span class="p">(</span><span class="k">new</span> <span class="n">taskq_t</span><span class="p">());</span>
<span class="n">m_threads</span><span class="p">.</span><span class="n">emplace_back</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">bind</span><span class="p">(</span><span class="o">&</span><span class="n">work_stealing_thread_pool</span><span class="o">::</span><span class="n">worker_thread</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">ref</span><span class="p">(</span><span class="o">*</span><span class="k">this</span><span class="p">),</span> <span class="n">i</span><span class="p">));</span>
<span class="n">m_thm</span><span class="p">[</span><span class="n">m_threads</span><span class="p">.</span><span class="n">back</span><span class="p">().</span><span class="n">get_id</span><span class="p">()]</span> <span class="o">=</span> <span class="n">i</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="n">close</span><span class="p">();</span>
<span class="k">throw</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Note that because stealing later needs access to the other task queues, <code class="language-plaintext highlighter-rouge">worker_thread</code> takes a reference to the pool (<code class="language-plaintext highlighter-rouge">*this</code>) as well as the index <code class="language-plaintext highlighter-rouge">i</code> of the current worker thread.</p>
<h3 id="工作线程执行体">Worker thread body</h3>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">void</span> <span class="nf">worker_thread</span><span class="p">(</span><span class="n">work_stealing_thread_pool</span><span class="o">&</span> <span class="n">self</span><span class="p">,</span> <span class="kt">size_t</span> <span class="n">current_thread_idx</span><span class="p">)</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="k">for</span> <span class="p">(;;)</span> <span class="p">{</span>
<span class="n">work</span> <span class="n">task</span><span class="p">;</span>
<span class="k">try</span> <span class="p">{</span>
<span class="c1">// 1. try execute one</span>
<span class="k">if</span> <span class="p">(</span><span class="n">self</span><span class="p">.</span><span class="n">try_executing_one</span><span class="p">(</span><span class="n">current_thread_idx</span><span class="p">))</span> <span class="p">{</span>
<span class="k">continue</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// 2. check closed</span>
<span class="k">if</span> <span class="p">(</span><span class="n">self</span><span class="p">.</span><span class="n">all_closed</span><span class="p">()</span> <span class="o">&&</span> <span class="n">self</span><span class="p">.</span><span class="n">all_empty</span><span class="p">())</span> <span class="p">{</span>
<span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// 3. wait for task</span>
<span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">></span> <span class="n">lk</span><span class="p">(</span><span class="n">self</span><span class="p">.</span><span class="n">m_mtx</span><span class="p">);</span>
<span class="n">self</span><span class="p">.</span><span class="n">m_cond</span><span class="p">.</span><span class="n">wait</span><span class="p">(</span><span class="n">lk</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">catch</span> <span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">thread_interrupted</span><span class="o">&</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span> <span class="c1">// for</span>
<span class="p">}</span>
<span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">terminate</span><span class="p">();</span>
<span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
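<p><code class="language-plaintext highlighter-rouge">worker_thread</code> is the core function. Unlike an ordinary thread pool, where each loop iteration blocks on the task queue, work stealing takes tasks without blocking. Each iteration has three steps:</p>
<ol>
<li>Take a task: try the current worker's own task queue, try stealing from the other task queues, and try the common queue. Because we also need to implement <code class="language-plaintext highlighter-rouge">try_executing_one()</code> later, this logic is extracted into <code class="language-plaintext highlighter-rouge">try_executing_one(size_t current_thread_idx)</code>.</li>
<li>Check whether the thread can exit. Two conditions must both hold: all queues are closed, and all queues are empty.</li>
<li>If we got no task and the exit conditions are not met, the only option left is to block and wait.</li>
</ol>
<p>Let us first implement <code class="language-plaintext highlighter-rouge">try_executing_one(size_t current_thread_idx)</code>:</p>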
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">bool</span> <span class="nf">try_executing_one</span><span class="p">(</span><span class="kt">size_t</span> <span class="n">current_thread_idx</span><span class="p">)</span> <span class="p">{</span>
<span class="n">work</span> <span class="n">task</span><span class="p">;</span>
<span class="k">auto</span><span class="o">&</span> <span class="n">local_q</span> <span class="o">=</span> <span class="n">m_perth_q</span><span class="p">[</span><span class="n">current_thread_idx</span><span class="p">];</span>
<span class="c1">// 1. try local_q first</span>
<span class="k">auto</span> <span class="n">st</span> <span class="o">=</span> <span class="n">local_q</span><span class="o">-></span><span class="n">try_pop_front</span><span class="p">(</span><span class="n">task</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">st</span> <span class="o">==</span> <span class="n">queue_op_status</span><span class="o">::</span><span class="n">success</span><span class="p">)</span> <span class="p">{</span>
<span class="n">task</span><span class="p">();</span>
<span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// 2. try steal others</span>
<span class="n">st</span> <span class="o">=</span> <span class="n">try_steal_one</span><span class="p">(</span><span class="n">current_thread_idx</span><span class="p">,</span> <span class="n">task</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">st</span> <span class="o">==</span> <span class="n">queue_op_status</span><span class="o">::</span><span class="n">success</span><span class="p">)</span> <span class="p">{</span>
<span class="n">task</span><span class="p">();</span>
<span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// 3. try comm_q</span>
<span class="n">st</span> <span class="o">=</span> <span class="n">m_comm_q</span><span class="o">-></span><span class="n">try_pop_front</span><span class="p">(</span><span class="n">task</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">st</span> <span class="o">==</span> <span class="n">queue_op_status</span><span class="o">::</span><span class="n">success</span><span class="p">)</span> <span class="p">{</span>
<span class="n">task</span><span class="p">();</span>
<span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
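<p>This function implements the order discussed in the previous section: try stealing first, then the common queue. It also shows why taking a task must be non-blocking: if any one step blocked, the following attempts could never run.</p>
<h3 id="窃取">Stealing</h3>
<p>The stealing function <code class="language-plaintext highlighter-rouge">try_steal_one</code> has two points to watch. First, the victim is chosen at random. Second, if the victim's queue still holds tasks after the steal, we should notify other worker threads that may be blocked:</p>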
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">queue_op_status</span> <span class="nf">try_steal_one</span><span class="p">(</span><span class="kt">size_t</span> <span class="n">skip_index</span><span class="p">,</span> <span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">)</span> <span class="p">{</span>
<span class="kt">size_t</span> <span class="n">offset</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">rand</span><span class="p">()</span> <span class="o">%</span> <span class="n">m_perth_q</span><span class="p">.</span><span class="n">size</span><span class="p">();</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">size_t</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">m_perth_q</span><span class="p">.</span><span class="n">size</span><span class="p">();</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
<span class="kt">size_t</span> <span class="n">idx</span> <span class="o">=</span> <span class="p">(</span><span class="n">i</span> <span class="o">+</span> <span class="n">offset</span><span class="p">)</span> <span class="o">%</span> <span class="n">m_perth_q</span><span class="p">.</span><span class="n">size</span><span class="p">();</span>
<span class="k">if</span> <span class="p">(</span><span class="n">idx</span> <span class="o">==</span> <span class="n">skip_index</span><span class="p">)</span> <span class="p">{</span>
<span class="k">continue</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">auto</span><span class="o">&</span> <span class="n">q</span> <span class="o">=</span> <span class="n">m_perth_q</span><span class="p">[</span><span class="n">idx</span><span class="p">];</span>
<span class="n">queue_op_status</span> <span class="n">st</span> <span class="o">=</span> <span class="n">q</span><span class="o">-></span><span class="n">try_pop_back</span><span class="p">(</span><span class="n">w</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">st</span> <span class="o">==</span> <span class="n">queue_op_status</span><span class="o">::</span><span class="n">success</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">q</span><span class="o">-></span><span class="n">size</span><span class="p">()</span> <span class="o">></span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="n">m_cond</span><span class="p">.</span><span class="n">notify_one</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">st</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">queue_op_status</span><span class="o">::</span><span class="n">empty</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
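<p>The traversal order used by <code class="language-plaintext highlighter-rouge">try_steal_one</code> can be isolated into a small pure function to make the wrap-around easier to see. <code class="language-plaintext highlighter-rouge">probe_order</code> is a hypothetical helper written for illustration only; it takes the starting offset as a parameter instead of calling <code class="language-plaintext highlighter-rouge">std::rand()</code> so that the result is deterministic.</p>

```cpp
#include <cstddef>
#include <vector>

// Returns the indices of the victim queues in the order they would be
// probed: start at `offset`, wrap around, and skip the caller's own queue.
std::vector<std::size_t> probe_order(std::size_t queue_count,
                                     std::size_t skip_index,
                                     std::size_t offset) {
    std::vector<std::size_t> order;
    for (std::size_t i = 0; i < queue_count; ++i) {
        std::size_t idx = (i + offset) % queue_count;
        if (idx == skip_index) {
            continue;  // never steal from ourselves
        }
        order.push_back(idx);
    }
    return order;
}
```

<p>With four queues, the caller at index 1 and a starting offset of 2, the queues are probed in the order 2, 3, 0. Starting at a random offset spreads thieves across victims instead of having every idle worker hammer queue 0 first.</p>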
<p>Then we add <code class="language-plaintext highlighter-rouge">all_closed</code> and <code class="language-plaintext highlighter-rouge">all_empty</code>, and <code class="language-plaintext highlighter-rouge">worker_thread</code> is complete:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">bool</span> <span class="nf">all_closed</span><span class="p">()</span> <span class="k">const</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">m_comm_q</span><span class="o">-></span><span class="n">closed</span><span class="p">())</span> <span class="p">{</span>
<span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">for</span> <span class="p">(</span><span class="k">auto</span><span class="o">&</span> <span class="n">q</span> <span class="o">:</span> <span class="n">m_perth_q</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">q</span><span class="o">-></span><span class="n">closed</span><span class="p">())</span> <span class="p">{</span>
<span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">bool</span> <span class="nf">all_empty</span><span class="p">()</span> <span class="k">const</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">m_comm_q</span><span class="o">-></span><span class="n">empty</span><span class="p">())</span> <span class="p">{</span>
<span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">for</span> <span class="p">(</span><span class="k">auto</span><span class="o">&</span> <span class="n">q</span> <span class="o">:</span> <span class="n">m_perth_q</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">q</span><span class="o">-></span><span class="n">empty</span><span class="p">())</span> <span class="p">{</span>
<span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<h3 id="reschedule_until">reschedule_until</h3>
<p><code class="language-plaintext highlighter-rouge">reschedule_until</code> can also steal, so it should call the <code class="language-plaintext highlighter-rouge">try_executing_one(size_t current_thread_idx)</code> we have just implemented. But because <code class="language-plaintext highlighter-rouge">reschedule_until</code> does not necessarily run on a worker thread, we also need a <code class="language-plaintext highlighter-rouge">try_executing_one()</code> overload as an adapter:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Pred</span><span class="p">></span>
<span class="kt">bool</span> <span class="nf">reschedule_until</span><span class="p">(</span><span class="k">const</span> <span class="n">Pred</span><span class="o">&</span> <span class="n">pred</span><span class="p">)</span> <span class="p">{</span>
<span class="k">do</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">try_executing_one</span><span class="p">())</span> <span class="p">{</span>
<span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="o">!</span><span class="n">pred</span><span class="p">());</span>
<span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">bool</span> <span class="nf">try_executing_one</span><span class="p">()</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="k">auto</span> <span class="n">id</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">this_thread</span><span class="o">::</span><span class="n">get_id</span><span class="p">();</span>
<span class="k">auto</span> <span class="n">it</span> <span class="o">=</span> <span class="n">m_thm</span><span class="p">.</span><span class="n">find</span><span class="p">(</span><span class="n">id</span><span class="p">);</span>
<span class="c1">// 1. worker thread, try execute its task</span>
<span class="k">if</span> <span class="p">(</span><span class="n">it</span> <span class="o">!=</span> <span class="n">m_thm</span><span class="p">.</span><span class="n">end</span><span class="p">())</span> <span class="p">{</span>
<span class="kt">size_t</span> <span class="n">idx</span> <span class="o">=</span> <span class="n">it</span><span class="o">-></span><span class="n">second</span><span class="p">;</span>
<span class="k">return</span> <span class="n">try_executing_one</span><span class="p">(</span><span class="n">idx</span><span class="p">);</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="c1">// 2. main thread or other, try execute comm task</span>
<span class="n">work</span> <span class="n">task</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">m_comm_q</span><span class="o">-></span><span class="n">try_pop_front</span><span class="p">(</span><span class="n">task</span><span class="p">)</span> <span class="o">==</span> <span class="n">queue_op_status</span><span class="o">::</span><span class="n">success</span><span class="p">)</span> <span class="p">{</span>
<span class="n">task</span><span class="p">();</span>
<span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="c1">// 3. no task in comm, random try execute one</span>
<span class="kt">size_t</span> <span class="n">idx</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">rand</span><span class="p">()</span> <span class="o">%</span> <span class="n">m_perth_q</span><span class="p">.</span><span class="n">size</span><span class="p">();</span>
<span class="k">return</span> <span class="n">try_executing_one</span><span class="p">(</span><span class="n">idx</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">terminate</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
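<p>In this <code class="language-plaintext highlighter-rouge">try_executing_one()</code> we first check whether the current thread is a worker thread. If it is, we call <code class="language-plaintext highlighter-rouge">try_executing_one(idx)</code>, which makes every attempt described above. If it is not a worker thread (the main thread, for example), we first try the common queue; if that has no task, we pick a random idx and then call <code class="language-plaintext highlighter-rouge">try_executing_one(idx)</code>.</p>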
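<p>The control flow of <code class="language-plaintext highlighter-rouge">reschedule_until</code> is easy to misread, so here is a single-threaded sketch that keeps only the loop. <code class="language-plaintext highlighter-rouge">tiny_executor</code> is a hypothetical stand-in with one plain <code class="language-plaintext highlighter-rouge">std::deque</code> and <code class="language-plaintext highlighter-rouge">std::function</code> tasks; it has no threads, no stealing and no locking, and only mirrors the do/while structure shown above.</p>

```cpp
#include <deque>
#include <functional>

typedef std::function<void()> work;

// Hypothetical single-threaded stand-in for the pool.
struct tiny_executor {
    std::deque<work> q;

    // Runs one queued task if there is one; returns whether a task ran.
    bool try_executing_one() {
        if (q.empty()) {
            return false;
        }
        work w = q.front();
        q.pop_front();
        w();
        return true;
    }

    // Keeps running tasks until pred() holds. Returns false as soon as
    // no task is available, even though pred() is still unsatisfied.
    template <typename Pred>
    bool reschedule_until(const Pred& pred) {
        do {
            if (!try_executing_one()) {
                return false;
            }
        } while (!pred());
        return true;
    }
};
```

<p>Two properties follow from the do/while form: at least one task is attempted before the predicate is first checked, and a <code class="language-plaintext highlighter-rouge">false</code> return means "ran out of tasks", not "predicate failed", so a caller that joins on a result still has to wait for it by other means.</p>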
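<h3 id="任务提交">Task submission</h3>
<p>When a task is submitted, we first check whether the submitter is a worker thread. If it is, the task goes to that worker's task queue; otherwise it goes to the common queue. In either case we should call <code class="language-plaintext highlighter-rouge">notify_one</code>.</p>
<p>Someone may ask: when a worker submits to its own task queue, should it call <code class="language-plaintext highlighter-rouge">notify_one</code> at all? Isn't it cache-unfriendly if another worker takes the task? This is a good question. One view is that after submitting a subtask the worker does not necessarily start waiting right away; it may do other things first, so waking another worker helps the subtask get processed promptly. The other view is that the worker starts waiting immediately after submitting, in which case we should leave one task for <code class="language-plaintext highlighter-rouge">reschedule_until</code>. Either approach works, but as mentioned in the earlier section, the Java implementation leaves one task, so we leave one task here as well:</p>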
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">submit</span><span class="p">(</span><span class="k">const</span> <span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">)</span> <span class="p">{</span>
<span class="n">m_comm_q</span><span class="o">-></span><span class="n">push_back</span><span class="p">(</span><span class="n">w</span><span class="p">);</span>
<span class="n">m_cond</span><span class="p">.</span><span class="n">notify_one</span><span class="p">();</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">submit_front</span><span class="p">(</span><span class="k">const</span> <span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">)</span> <span class="p">{</span>
<span class="k">auto</span> <span class="n">id</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">this_thread</span><span class="o">::</span><span class="n">get_id</span><span class="p">();</span>
<span class="k">auto</span> <span class="n">it</span> <span class="o">=</span> <span class="n">m_thm</span><span class="p">.</span><span class="n">find</span><span class="p">(</span><span class="n">id</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">it</span> <span class="o">!=</span> <span class="n">m_thm</span><span class="p">.</span><span class="n">end</span><span class="p">())</span> <span class="p">{</span>
<span class="kt">size_t</span> <span class="n">idx</span> <span class="o">=</span> <span class="n">it</span><span class="o">-></span><span class="n">second</span><span class="p">;</span>
<span class="n">m_perth_q</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span><span class="o">-></span><span class="n">push_front</span><span class="p">(</span><span class="n">w</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">m_perth_q</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span><span class="o">-></span><span class="n">size</span><span class="p">()</span> <span class="o">></span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="n">m_cond</span><span class="p">.</span><span class="n">notify_one</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="n">m_comm_q</span><span class="o">-></span><span class="n">push_front</span><span class="p">(</span><span class="n">w</span><span class="p">);</span>
<span class="n">m_cond</span><span class="p">.</span><span class="n">notify_one</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">submit_back</span><span class="p">(</span><span class="k">const</span> <span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">)</span> <span class="p">{</span>
<span class="k">auto</span> <span class="n">id</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">this_thread</span><span class="o">::</span><span class="n">get_id</span><span class="p">();</span>
<span class="k">auto</span> <span class="n">it</span> <span class="o">=</span> <span class="n">m_thm</span><span class="p">.</span><span class="n">find</span><span class="p">(</span><span class="n">id</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">it</span> <span class="o">!=</span> <span class="n">m_thm</span><span class="p">.</span><span class="n">end</span><span class="p">())</span> <span class="p">{</span>
<span class="kt">size_t</span> <span class="n">idx</span> <span class="o">=</span> <span class="n">it</span><span class="o">-></span><span class="n">second</span><span class="p">;</span>
<span class="n">m_perth_q</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span><span class="o">-></span><span class="n">push_back</span><span class="p">(</span><span class="n">w</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">m_perth_q</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span><span class="o">-></span><span class="n">size</span><span class="p">()</span> <span class="o">></span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="n">m_cond</span><span class="p">.</span><span class="n">notify_one</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="n">m_comm_q</span><span class="o">-></span><span class="n">push_back</span><span class="p">(</span><span class="n">w</span><span class="p">);</span>
<span class="n">m_cond</span><span class="p">.</span><span class="n">notify_one</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
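<p>The "leave one task behind" rule in the worker branch of <code class="language-plaintext highlighter-rouge">submit_front</code> can be looked at in isolation. The following is only a minimal sketch: <code class="language-plaintext highlighter-rouge">local_queue</code> and <code class="language-plaintext highlighter-rouge">count_notifications</code> are hypothetical names that do not exist in the pool, and a plain counter stands in for <code class="language-plaintext highlighter-rouge">m_cond.notify_one()</code>.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for one worker's task queue; the names here are
// illustrative and not part of work_stealing_thread_pool.
struct local_queue {
    std::deque<std::function<void()>> tasks;
    std::size_t notifications = 0;  // how often a sleeping worker would be woken

    // Mirrors the worker branch of submit_front(): push to the front and only
    // signal other workers when more than one task is queued, so one task
    // always stays behind for the submitter's own reschedule_until().
    void submit_front(std::function<void()> w) {
        assert(w);  // an empty std::function would throw when executed
        tasks.push_front(std::move(w));
        if (tasks.size() > 1) {
            ++notifications;  // the real pool calls m_cond.notify_one() here
        }
    }
};

// Submits n no-op tasks and reports how many wake-ups were requested.
inline std::size_t count_notifications(std::size_t n) {
    local_queue q;
    for (std::size_t i = 0; i < n; ++i) {
        q.submit_front([] {});
    }
    return q.notifications;
}
```

<p>In this sketch the first subtask never triggers a wake-up, and every further subtask triggers exactly one, so a worker that forks two subtasks wakes at most one other worker.</p>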
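<p>At this point the core functions of <code class="language-plaintext highlighter-rouge">work_stealing_thread_pool</code> are all implemented; the remaining necessary functions are left as an exercise.</p>
<h3 id="实验">Experiment</h3>
<p>The example follows the one in the fork/join post, with a few changes so that it compiles with GCC 7.3:</p>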
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="cp">#define BOOST_THREAD_PROVIDES_FUTURE
#include "blocking_deque.h"
#include "work_stealing_thread_pool.h"
</span>
<span class="cp">#include <iostream>
#include <memory>
#include <type_traits>
#include <boost/thread.hpp>
#include <boost/thread/future.hpp>
</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">F</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">Ex</span><span class="p">></span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">fork</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">,</span> <span class="n">F</span><span class="o">&&</span> <span class="n">func</span><span class="p">)</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">promise</span><span class="o"><</span><span class="n">T</span><span class="o">>></span> <span class="n">pr</span><span class="p">(</span><span class="k">new</span> <span class="n">boost</span><span class="o">::</span><span class="n">promise</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">());</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">ft</span> <span class="o">=</span> <span class="n">pr</span><span class="o">-></span><span class="n">get_future</span><span class="p">();</span>
<span class="k">auto</span> <span class="n">task</span> <span class="o">=</span> <span class="p">[</span><span class="n">pr</span><span class="p">,</span> <span class="n">f</span><span class="o">=</span><span class="n">std</span><span class="o">::</span><span class="n">forward</span><span class="o"><</span><span class="n">F</span><span class="o">></span><span class="p">(</span><span class="n">func</span><span class="p">)]</span> <span class="p">()</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">pr</span><span class="o">-></span><span class="n">set_value</span><span class="p">(</span><span class="n">f</span><span class="p">());</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="c1">// capture the in-flight exception without slicing it to std::exception</span>
<span class="n">pr</span><span class="o">-></span><span class="n">set_exception</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">current_exception</span><span class="p">());</span>
<span class="p">}</span>
<span class="p">};</span>
<span class="n">ex</span><span class="p">.</span><span class="n">submit_front</span><span class="p">(</span><span class="n">task</span><span class="p">);</span>
<span class="k">return</span> <span class="n">ft</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Ex</span><span class="p">></span>
<span class="kt">int</span> <span class="nf">fib</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">,</span> <span class="kt">int</span> <span class="n">n</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f1</span> <span class="o">=</span> <span class="n">fork</span><span class="o"><</span><span class="kt">int</span><span class="o">></span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">bind</span><span class="p">(</span><span class="n">fib</span><span class="o"><</span><span class="n">Ex</span><span class="o">></span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">ref</span><span class="p">(</span><span class="n">ex</span><span class="p">),</span> <span class="n">n</span><span class="o">-</span><span class="mi">1</span><span class="p">));</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f2</span> <span class="o">=</span> <span class="n">fork</span><span class="o"><</span><span class="kt">int</span><span class="o">></span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">bind</span><span class="p">(</span><span class="n">fib</span><span class="o"><</span><span class="n">Ex</span><span class="o">></span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">ref</span><span class="p">(</span><span class="n">ex</span><span class="p">),</span> <span class="n">n</span><span class="o">-</span><span class="mi">2</span><span class="p">));</span>
<span class="n">ex</span><span class="p">.</span><span class="n">reschedule_until</span><span class="p">([</span><span class="o">&</span><span class="p">]()</span><span class="o">-></span><span class="kt">bool</span><span class="p">{</span>
<span class="k">return</span> <span class="n">f1</span><span class="p">.</span><span class="n">is_ready</span><span class="p">()</span> <span class="o">&&</span> <span class="n">f2</span><span class="p">.</span><span class="n">is_ready</span><span class="p">();</span>
<span class="p">});</span>
<span class="k">return</span> <span class="n">f1</span><span class="p">.</span><span class="n">get</span><span class="p">()</span> <span class="o">+</span> <span class="n">f2</span><span class="p">.</span><span class="n">get</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
<span class="n">work_stealing_thread_pool</span> <span class="n">pool</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">ret</span> <span class="o">=</span> <span class="n">fib</span><span class="p">(</span><span class="n">pool</span><span class="p">,</span> <span class="mi">32</span><span class="p">);</span>
<span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o"><<</span> <span class="n">ret</span> <span class="o"><<</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
<span class="n">pool</span><span class="p">.</span><span class="n">close</span><span class="p">();</span>
<span class="n">pool</span><span class="p">.</span><span class="n">join</span><span class="p">();</span>
<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
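<p>To check the output of the experiment, a sequential reference value is handy. The helper below is a small sketch that is independent of the pool; <code class="language-plaintext highlighter-rouge">fib_reference</code> is a hypothetical name introduced only for this check.</p>

```cpp
#include <cassert>
#include <cstdint>

// Iterative Fibonacci, used only as a reference value for the parallel run.
inline std::int64_t fib_reference(int n) {
    assert(n >= 0);
    std::int64_t a = 0;  // fib(0)
    std::int64_t b = 1;  // fib(1)
    for (int i = 0; i < n; ++i) {
        std::int64_t next = a + b;
        a = b;
        b = next;
    }
    return a;  // after n steps, a holds fib(n)
}
```

<p><code class="language-plaintext highlighter-rouge">fib_reference(32)</code> is 2178309, so that is the number the pool-based program above should print.</p>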
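<h2 id="总结">Summary</h2>
<p>This post discussed the implementation of a work stealing thread pool. Following Java, we implemented these features:</p>
<ul>
<li>There is a common queue</li>
<li>Every task queue is a double-ended queue</li>
<li>A worker steals from other workers' task queues before it takes from the common queue</li>
<li>One task is stolen at a time</li>
<li>Both submitting and stealing may wake a sleeping worker thread</li>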
</ul>
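<p>The lookup order in this list can be sketched as a single-threaded model. Everything here is an assumption made for illustration: <code class="language-plaintext highlighter-rouge">toy_pool</code>, <code class="language-plaintext highlighter-rouge">next_task</code> and <code class="language-plaintext highlighter-rouge">demo_pick_order</code> are hypothetical names, tasks are plain non-negative integers, no locking is modelled, and the victim scan starts at the next index instead of a random one.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical single-threaded model of the lookup order. Tasks are ints so
// the caller can see which one was picked; -1 means "nothing to do".
struct toy_pool {
    std::vector<std::deque<int>> per_thread;  // one deque per worker
    std::deque<int> common;                   // queue for external submissions

    explicit toy_pool(std::size_t workers) : per_thread(workers) {}

    int next_task(std::size_t self) {
        assert(self < per_thread.size());
        // 1. Own queue, from the front (the most recently forked subtask).
        if (!per_thread[self].empty()) {
            int t = per_thread[self].front();
            per_thread[self].pop_front();
            return t;
        }
        // 2. Steal exactly one task from the back of another worker's queue.
        const std::size_t n = per_thread.size();
        for (std::size_t i = 1; i < n; ++i) {
            std::deque<int>& victim = per_thread[(self + i) % n];
            if (!victim.empty()) {
                int t = victim.back();
                victim.pop_back();
                return t;
            }
        }
        // 3. Only then fall back to the common queue.
        if (!common.empty()) {
            int t = common.front();
            common.pop_front();
            return t;
        }
        return -1;
    }
};

// Worker 0 drains everything it can reach; returns the order it picked.
inline std::vector<int> demo_pick_order() {
    toy_pool pool(2);
    pool.per_thread[0] = {10, 11};
    pool.per_thread[1] = {20, 21, 22};
    pool.common = {30};
    std::vector<int> order;
    for (int t = pool.next_task(0); t != -1; t = pool.next_task(0)) {
        order.push_back(t);
    }
    return order;
}
```

<p>In this model worker 0 picks 10 and 11 from its own queue, then steals 22, 21 and 20 from the back of worker 1's queue, and only at the end takes 30 from the common queue.</p>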
<p><strong>Reference:</strong></p>
<ul>
<li class="ref">[1] Robert D. Blumofe, Charles E. Leiserson, <a href="https://www.csd.uwo.ca/~mmorenom/CS433-CS9624/Resources/Scheduling_multithreaded_computations_by_work_stealing.pdf">Scheduling Multithreaded Computations by Work Stealing</a>, Journal of the ACM, Vol. 46, No. 5, Sept. 1999, pp. 720-748</li>
<li class="ref">[2] houbb, <a href="https://houbb.github.io/2019/01/18/jcip-39-fork-join">JCIP-39-Fork/Join 框架、工作窃取算法</a>, Jan. 2019</li>
<li class="ref">[3] Doug Lea, <a href="http://gee.cs.oswego.edu/dl/papers/fj.pdf">A Java Fork/Join Framework</a>, <a href="https://www.cnblogs.com/suxuan/p/4970498.html">Chinese translation</a>, trans. 素轩, Nov. 2015</li>
<li class="ref">[4] rakyll, <a href="https://rakyll.org/scheduler/">Go’s work-stealing scheduler</a>, July 2017</li>
</ul>
C++ Concurrency Patterns #13: Dynamic Task Decomposition - fork/join
2019-12-09T00:00:00+00:00
http://dengzuoheng.github.io/cpp-concurrency-pattern-13-fork-join
<h2 id="introduction">Introduction</h2>
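<p>Breaking a complex task into simpler ones and solving them one by one, so that each subroutine is easier to understand and to verify, is a method we use all the time. Naming functions is painful, but most of the time we are happy to decompose a problem this way.</p>
<p>In a non-recursive setting, we might have code like this:</p>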
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">foobar</span><span class="p">(</span><span class="kt">int</span> <span class="n">k</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">k</span> <span class="o">%</span> <span class="mi">2</span><span class="p">)</span> <span class="p">{</span>
<span class="n">foo</span><span class="p">();</span>
<span class="n">bar</span><span class="p">();</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="n">foo</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>For the recursive case, the Fibonacci sequence is the usual example:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">fib</span><span class="p">(</span><span class="kt">int</span> <span class="n">n</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">fib</span><span class="p">(</span><span class="n">n</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="o">+</span> <span class="n">fib</span><span class="p">(</span><span class="n">n</span><span class="o">-</span><span class="mi">2</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
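<p>Now that we have multithreading and the executor framework, we naturally want subproblems that do not depend directly on each other to be solved in parallel, with ample concurrency. For example:</p>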
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="kt">void</span> <span class="nf">foobar</span><span class="p">(</span><span class="kt">int</span> <span class="n">k</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">k</span> <span class="o">%</span> <span class="mi">2</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">void</span><span class="o">></span> <span class="n">f1</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">async</span><span class="p">(</span><span class="n">foo</span><span class="p">);</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">void</span><span class="o">></span> <span class="n">f2</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">async</span><span class="p">(</span><span class="n">bar</span><span class="p">);</span>
<span class="n">f1</span><span class="p">.</span><span class="n">wait</span><span class="p">();</span>
<span class="n">f2</span><span class="p">.</span><span class="n">wait</span><span class="p">();</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="n">foo</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="kt">int</span> <span class="nf">fib</span><span class="p">(</span><span class="kt">int</span> <span class="n">n</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f1</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">async</span><span class="p">(</span><span class="n">fib</span><span class="p">,</span> <span class="n">n</span><span class="o">-</span><span class="mi">1</span><span class="p">);</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f2</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">async</span><span class="p">(</span><span class="n">fib</span><span class="p">,</span> <span class="n">n</span><span class="o">-</span><span class="mi">2</span><span class="p">);</span>
<span class="k">return</span> <span class="n">f1</span><span class="p">.</span><span class="n">get</span><span class="p">()</span> <span class="o">+</span> <span class="n">f2</span><span class="p">.</span><span class="n">get</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Ex</span><span class="p">></span>
<span class="kt">int</span> <span class="nf">fib</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">,</span> <span class="kt">int</span> <span class="n">n</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f1</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">async</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">fib</span><span class="p">,</span> <span class="n">n</span><span class="o">-</span><span class="mi">1</span><span class="p">);</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f2</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">async</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">fib</span><span class="p">,</span> <span class="n">n</span><span class="o">-</span><span class="mi">2</span><span class="p">);</span>
<span class="k">return</span> <span class="n">f1</span><span class="p">.</span><span class="n">get</span><span class="p">()</span> <span class="o">+</span> <span class="n">f2</span><span class="p">.</span><span class="n">get</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
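<p>A task dynamically creates (forks) subtasks as needed while it runs and then aggregates their results; this way of handling subproblems concurrently is the <code class="language-plaintext highlighter-rouge">fork/join</code> pattern [6]. Here <code class="language-plaintext highlighter-rouge">boost::async</code> is the <code class="language-plaintext highlighter-rouge">fork</code>, and calling <code class="language-plaintext highlighter-rouge">get</code> and adding the results is the <code class="language-plaintext highlighter-rouge">join</code>. It looks simple, but this naive style runs into many problems, for example:</p>
<ul>
<li>If the executor is not a fixed-size thread pool, say <code class="language-plaintext highlighter-rouge">boost::thread_executor</code>, it spawns a very large number of threads</li>
<li>If the executor is a fixed-size thread pool, many tasks end up waiting for their subtasks and no thread is left to run those subtasks</li>
<li>The parent task returns before its subtasks have finished</li>
<li>Subtasks depend on each other, which leads to odd deadlocks</li>
</ul>
<p>Below, we tackle these problems one by one.</p>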
<h2 id="forkjoin-in-fixed-thread-pool">fork/join in fixed thread pool</h2>
<p>Compared with <code class="language-plaintext highlighter-rouge">fork/join</code> on an unbounded number of threads, we would rather have <code class="language-plaintext highlighter-rouge">fork/join</code> on a fixed-size thread pool, but done naively this deadlocks.</p>
<p>Why does a fixed-size thread pool deadlock? The problem is easy to reproduce. Suppose we compute <code class="language-plaintext highlighter-rouge">fib</code> with <code class="language-plaintext highlighter-rouge">n=3</code> on a pool with only two threads, and the main thread has submitted t0<code class="language-plaintext highlighter-rouge">fib(3)</code>.</p>
<p>At the start, thread 1 takes t0<code class="language-plaintext highlighter-rouge">fib(3)</code> and thread 2 is idle. Thread 1 then <code class="language-plaintext highlighter-rouge">fork</code>s two tasks, t1<code class="language-plaintext highlighter-rouge">fib(2)</code> and t2<code class="language-plaintext highlighter-rouge">fib(1)</code>, and blocks. Thread 2 takes t1<code class="language-plaintext highlighter-rouge">fib(2)</code>, <code class="language-plaintext highlighter-rouge">fork</code>s two more tasks, t3<code class="language-plaintext highlighter-rouge">fib(1)</code> and t4<code class="language-plaintext highlighter-rouge">fib(0)</code>, and blocks as well. The task queue now holds three tasks: t2<code class="language-plaintext highlighter-rouge">fib(1)</code>, the second subtask of t0<code class="language-plaintext highlighter-rouge">fib(3)</code> submitted by thread 1, plus t3<code class="language-plaintext highlighter-rouge">fib(1)</code> and t4<code class="language-plaintext highlighter-rouge">fib(0)</code> submitted by thread 2. But both threads are blocked, so no thread is free to run them.</p>
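<p>The scenario above can be replayed with a small deterministic model instead of real threads. This is only a sketch under stated assumptions: <code class="language-plaintext highlighter-rouge">stall_report</code> and <code class="language-plaintext highlighter-rouge">simulate_blocking_join</code> are hypothetical names, tasks are dispatched strictly in FIFO order, and a worker that blocks in its join is never released again.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Outcome of the simulated run: how many workers are stuck in a blocking join
// and how many tasks are still queued at that moment.
struct stall_report {
    std::size_t blocked_workers;
    std::size_t queued_tasks;
};

// Hypothetical model, not real threading: every task fib(n) with n >= 2 forks
// fib(n-1) and fib(n-2) onto a shared FIFO queue and then blocks its worker;
// tasks with n < 2 finish immediately and leave the worker free.
inline stall_report simulate_blocking_join(int n, std::size_t workers) {
    assert(n >= 0);
    std::deque<int> queue{n};  // the main thread submitted fib(n)
    std::size_t blocked = 0;
    // Keep dispatching while a free worker and a queued task both exist.
    while (blocked < workers && !queue.empty()) {
        int task = queue.front();
        queue.pop_front();
        if (task >= 2) {
            queue.push_back(task - 1);
            queue.push_back(task - 2);
            ++blocked;  // this worker now waits on its two futures
        }
    }
    return stall_report{blocked, queue.size()};
}
```

<p>For <code class="language-plaintext highlighter-rouge">simulate_blocking_join(3, 2)</code> the model ends with both workers blocked and three tasks still queued, which is the deadlock described above. With a third worker the queue drains, so the blocked parents could continue.</p>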
<p>The root cause is that <code class="language-plaintext highlighter-rouge">join</code> blocks the current thread. Is there a way not to block? <code class="language-plaintext highlighter-rouge">reschedule_until</code> is one. It means: keep taking a task out of the executor's task queue and running it on the current thread until some condition holds or the task queue is empty. We can review it using the <code class="language-plaintext highlighter-rouge">reschedule_until</code> of <code class="language-plaintext highlighter-rouge">basic_thread_pool</code>:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">template</span> <span class="o"><</span><span class="k">typename</span> <span class="nc">Pred</span><span class="p">></span>
<span class="kt">bool</span> <span class="n">basic_thread_pool</span><span class="o">::</span><span class="n">reschedule_until</span><span class="p">(</span><span class="k">const</span> <span class="n">Pred</span><span class="o">&</span> <span class="n">pred</span><span class="p">)</span> <span class="p">{</span>
<span class="k">do</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">try_executing_one</span><span class="p">())</span> <span class="p">{</span>
<span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="o">!</span><span class="n">pred</span><span class="p">());</span>
<span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">bool</span> <span class="n">basic_thread_pool</span><span class="o">::</span><span class="n">try_executing_one</span><span class="p">()</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">work</span> <span class="n">task</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">m_tasks</span><span class="p">.</span><span class="n">try_pull</span><span class="p">(</span><span class="n">task</span><span class="p">)</span> <span class="o">==</span> <span class="n">queue_op_status</span><span class="o">::</span><span class="n">success</span><span class="p">)</span> <span class="p">{</span>
<span class="n">task</span><span class="p">();</span>
<span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">terminate</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>With this we can rework <code class="language-plaintext highlighter-rouge">fib</code>:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Ex</span><span class="p">></span>
<span class="kt">int</span> <span class="nf">fib</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">,</span> <span class="kt">int</span> <span class="n">n</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f1</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">async</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">fib</span><span class="p">,</span> <span class="n">n</span><span class="o">-</span><span class="mi">1</span><span class="p">);</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f2</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">async</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">fib</span><span class="p">,</span> <span class="n">n</span><span class="o">-</span><span class="mi">2</span><span class="p">);</span>
<span class="n">ex</span><span class="p">.</span><span class="n">reschedule_until</span><span class="p">([</span><span class="o">&</span><span class="p">]()</span><span class="o">-></span><span class="kt">bool</span><span class="p">{</span>
<span class="k">return</span> <span class="n">f1</span><span class="p">.</span><span class="n">is_ready</span><span class="p">()</span> <span class="o">&&</span> <span class="n">f2</span><span class="p">.</span><span class="n">is_ready</span><span class="p">();</span>
<span class="p">});</span>
<span class="k">return</span> <span class="n">f1</span><span class="p">.</span><span class="n">get</span><span class="p">()</span> <span class="o">+</span> <span class="n">f2</span><span class="p">.</span><span class="n">get</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Now let us analyze <code class="language-plaintext highlighter-rouge">fib(3)</code> again. For simplicity, we first consider the case of a single thread:</p>
<p>After submitting its two tasks, thread 1 enters <code class="language-plaintext highlighter-rouge">reschedule_until</code>. The task queue holds the two tasks just submitted: t1<code class="language-plaintext highlighter-rouge">fib(2)</code> and t2<code class="language-plaintext highlighter-rouge">fib(1)</code>. Neither <code class="language-plaintext highlighter-rouge">f1</code> nor <code class="language-plaintext highlighter-rouge">f2</code> is <code class="language-plaintext highlighter-rouge">ready</code>, so <code class="language-plaintext highlighter-rouge">reschedule_until</code> takes t1<code class="language-plaintext highlighter-rouge">fib(2)</code> out and runs it.</p>
<p>Running t1<code class="language-plaintext highlighter-rouge">fib(2)</code> submits t3<code class="language-plaintext highlighter-rouge">fib(1)</code> and t4<code class="language-plaintext highlighter-rouge">fib(0)</code>, so the queue is now t2<code class="language-plaintext highlighter-rouge">fib(1)</code>, t3<code class="language-plaintext highlighter-rouge">fib(1)</code>, t4<code class="language-plaintext highlighter-rouge">fib(0)</code>. It then enters a new <code class="language-plaintext highlighter-rouge">reschedule_until</code> (t1<code class="language-plaintext highlighter-rouge">fib(2)</code> also has to wait for two subtasks), takes t2<code class="language-plaintext highlighter-rouge">fib(1)</code> from the front and solves it directly. The subtasks it is waiting for are still not done, so it takes the next task, t3<code class="language-plaintext highlighter-rouge">fib(1)</code>, and solves it directly, then takes t4<code class="language-plaintext highlighter-rouge">fib(0)</code> and solves that too. Now the two subtasks that t1<code class="language-plaintext highlighter-rouge">fib(2)</code> was waiting for are complete, so it leaves its own <code class="language-plaintext highlighter-rouge">reschedule_until</code> and t1<code class="language-plaintext highlighter-rouge">fib(2)</code> finishes. Because t2<code class="language-plaintext highlighter-rouge">fib(1)</code>, submitted by t0<code class="language-plaintext highlighter-rouge">fib(3)</code>, was already solved inside the <code class="language-plaintext highlighter-rouge">reschedule_until</code> in which t1<code class="language-plaintext highlighter-rouge">fib(2)</code> waited for its subtasks, the subtasks that t0<code class="language-plaintext highlighter-rouge">fib(3)</code> waits for are complete as well, and t0<code class="language-plaintext highlighter-rouge">fib(3)</code> finishes.</p>
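<p>The single-thread walkthrough above can be replayed with a toy executor. This is a sketch under stated assumptions, not the boost implementation: <code class="language-plaintext highlighter-rouge">toy_executor</code>, <code class="language-plaintext highlighter-rouge">slot</code>, <code class="language-plaintext highlighter-rouge">toy_fork</code>, <code class="language-plaintext highlighter-rouge">toy_fib</code> and <code class="language-plaintext highlighter-rouge">run_toy_fib</code> are hypothetical names, the queue is a plain FIFO <code class="language-plaintext highlighter-rouge">std::deque</code>, and a flag plus a value stand in for <code class="language-plaintext highlighter-rouge">boost::future</code>.</p>

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical single-threaded executor with a FIFO queue; it only models the
// control flow of reschedule_until() and uses no real threads or futures.
struct toy_executor {
    std::deque<std::function<void()>> tasks;
    std::vector<int> trace;  // the n of every queued fib(n), in execution order

    void submit(std::function<void()> w) { tasks.push_back(std::move(w)); }

    bool try_executing_one() {
        if (tasks.empty()) return false;
        std::function<void()> w = std::move(tasks.front());
        tasks.pop_front();
        w();
        return true;
    }

    template <typename Pred>
    bool reschedule_until(const Pred& pred) {
        do {
            if (!try_executing_one()) return false;
        } while (!pred());
        return true;
    }
};

// Stand-in for a future<int>: a shared value plus a ready flag.
struct slot {
    bool ready = false;
    int value = 0;
};

inline int toy_fib(toy_executor& ex, int n);

// Queues fib(n) and returns the slot that will receive its result.
inline std::shared_ptr<slot> toy_fork(toy_executor& ex, int n) {
    auto s = std::make_shared<slot>();
    ex.submit([&ex, s, n] {
        ex.trace.push_back(n);
        s->value = toy_fib(ex, n);
        s->ready = true;
    });
    return s;
}

inline int toy_fib(toy_executor& ex, int n) {
    assert(n >= 0);
    if (n < 2) return n;
    auto f1 = toy_fork(ex, n - 1);
    auto f2 = toy_fork(ex, n - 2);
    ex.reschedule_until([&] { return f1->ready && f2->ready; });
    assert(f1->ready && f2->ready);  // holds because only one thread exists
    return f1->value + f2->value;
}

struct toy_run {
    int result;
    std::vector<int> trace;
};

// Runs fib(n) directly on the calling thread, as t0 in the walkthrough.
inline toy_run run_toy_fib(int n) {
    toy_executor ex;
    int r = toy_fib(ex, n);
    return toy_run{r, ex.trace};
}
```

<p>For <code class="language-plaintext highlighter-rouge">run_toy_fib(3)</code> the trace is 2, 1, 1, 0, which corresponds to t1, t2, t3, t4 in the text, and the result is 2.</p>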
<p>这样的改良存在两个问题:</p>
<ul>
<li>
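<p>With multiple worker threads, the subtasks submitted by <code class="language-plaintext highlighter-rouge">fib(3)</code> may be taken by other threads, so <code class="language-plaintext highlighter-rouge">reschedule_until</code> finds nothing to pull and returns. The task queue is empty at that point and the current thread still falls back to a blocking wait. That is harmless: the subtasks it is waiting for are already running, so no deadlock can occur.</p>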
</li>
<li>
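<p>An executor is normally first-in, first-out, so <code class="language-plaintext highlighter-rouge">reschedule_until</code> will not necessarily run the subtasks it just submitted. It may instead work through a long backlog of other callers' tasks already sitting in the queue. That is unfair to the waiting task, whose own subtasks may not get their turn for a long time, and it is unfriendly to the cache. Worse, those other tasks probably have subtasks of their own, so unbounded nesting of <code class="language-plaintext highlighter-rouge">reschedule_until</code> makes the call stack grow deep, possibly deep enough to overflow.[5]</p>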
</li>
</ul>
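<p>For these reasons, <code class="language-plaintext highlighter-rouge">fork/join</code> implementations generally use a double-ended queue[4]. Subtasks are submitted at the head, so whichever thread takes the head task runs the subtasks first. This makes nested <code class="language-plaintext highlighter-rouge">reschedule_until</code> calls less frequent, and very deep call stacks occur less often than with a single-ended queue.</p>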
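<p>The effect of submitting at the head can be seen in a small single-threaded sketch. Everything in it (<code class="language-plaintext highlighter-rouge">toy_executor</code>, <code class="language-plaintext highlighter-rouge">run_order</code>, the task names) is hypothetical and only illustrates the ordering; it is not the pool developed in this article and has no thread safety at all.</p>

```cpp
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy single-threaded executor: a plain std::deque stands in for the task queue.
struct toy_executor {
    std::deque<std::function<void()>> tasks;
    std::vector<std::string> log;  // order in which tasks ran

    void submit_back(std::function<void()> t)  { tasks.push_back(std::move(t)); }
    void submit_front(std::function<void()> t) { tasks.push_front(std::move(t)); }

    // Run tasks from the head of the queue until it is empty.
    void drain() {
        while (!tasks.empty()) {
            std::function<void()> t = std::move(tasks.front());
            tasks.pop_front();
            t();
        }
    }
};

// Queue a parent followed by two unrelated tasks; the parent forks two children.
// children_to_front selects which end of the queue the children are submitted to.
std::vector<std::string> run_order(bool children_to_front) {
    toy_executor ex;
    ex.submit_back([&ex, children_to_front] {
        ex.log.push_back("parent");
        auto child1 = [&ex] { ex.log.push_back("child1"); };
        auto child2 = [&ex] { ex.log.push_back("child2"); };
        if (children_to_front) {
            ex.submit_front(child1);
            ex.submit_front(child2);
        } else {
            ex.submit_back(child1);
            ex.submit_back(child2);
        }
    });
    ex.submit_back([&ex] { ex.log.push_back("other1"); });
    ex.submit_back([&ex] { ex.log.push_back("other2"); });
    ex.drain();
    return ex.log;
}
```

<p>With <code class="language-plaintext highlighter-rouge">run_order(true)</code> the children run directly after the parent (last submitted first); with <code class="language-plaintext highlighter-rouge">run_order(false)</code> they wait behind the two unrelated tasks.</p>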
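<h2 id="using-deque-for-tasks">Using a deque for tasks</h2>
<p>To use a double-ended queue, boost's executor concept with its single <code class="language-plaintext highlighter-rouge">submit</code> is no longer enough, and we have to rewrite <code class="language-plaintext highlighter-rouge">basic_thread_pool</code> on top of a deque. Fortunately boost provides <code class="language-plaintext highlighter-rouge">sync_deque</code>, so for now we do not have to implement a double-ended task queue ourselves.</p>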
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">deque_thread_pool</span> <span class="p">{</span>
<span class="nl">public:</span>
<span class="kt">void</span> <span class="n">submit</span><span class="p">(</span><span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">)</span> <span class="p">{</span> <span class="n">submit_back</span><span class="p">(</span><span class="n">w</span><span class="p">);</span> <span class="p">}</span>
<span class="kt">void</span> <span class="n">submit_back</span><span class="p">(</span><span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">);</span>
<span class="kt">void</span> <span class="n">submit_front</span><span class="p">(</span><span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">);</span>
<span class="p">};</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">F</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">Ex</span><span class="p">></span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">fork</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">,</span> <span class="n">F</span><span class="o">&&</span> <span class="n">func</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">promise</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">pr</span><span class="p">;</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">ft</span> <span class="o">=</span> <span class="n">pr</span><span class="p">.</span><span class="n">get_future</span><span class="p">();</span>
<span class="n">ex</span><span class="p">.</span><span class="n">submit_front</span><span class="p">([</span><span class="n">p</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">pr</span><span class="p">),</span> <span class="n">f</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">forward</span><span class="o"><</span><span class="n">F</span><span class="o">></span><span class="p">(</span><span class="n">func</span><span class="p">)]()</span> <span class="k">mutable</span> <span class="p">{</span> <span class="c1">// mutable: set_value() is a non-const member</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">p</span><span class="p">.</span><span class="n">set_value</span><span class="p">(</span><span class="n">f</span><span class="p">());</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="n">p</span><span class="p">.</span><span class="n">set_exception</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">current_exception</span><span class="p">());</span> <span class="c1">// forward the exception without slicing it</span>
<span class="p">}</span>
<span class="p">});</span>
<span class="k">return</span> <span class="n">ft</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
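<p>The <code class="language-plaintext highlighter-rouge">submit_front</code>/<code class="language-plaintext highlighter-rouge">submit_back</code> pair presupposes a thread-safe queue that accepts and hands out work at both ends. As a rough idea of what such a queue involves, here is a minimal sketch built on <code class="language-plaintext highlighter-rouge">std::deque</code> and a single mutex. The name <code class="language-plaintext highlighter-rouge">toy_sync_deque</code> and its member names are assumptions made for this illustration; they are not boost's interface, and the sketch leaves out blocking pulls and closing.</p>

```cpp
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a two-ended task queue.
// One mutex guards both ends; a real implementation may lock more finely.
template <typename T>
class toy_sync_deque {
    std::deque<T> m_items;
    mutable std::mutex m_mutex;
public:
    void push_front(T item) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_items.push_front(std::move(item));
    }
    void push_back(T item) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_items.push_back(std::move(item));
    }
    // Non-blocking pull from the head; returns false when the queue is empty.
    bool try_pull_front(T& out) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_items.empty()) return false;
        out = std::move(m_items.front());
        m_items.pop_front();
        return true;
    }
    // Non-blocking pull from the tail, the end a thief would use.
    bool try_pull_back(T& out) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_items.empty()) return false;
        out = std::move(m_items.back());
        m_items.pop_back();
        return true;
    }
    bool empty() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_items.empty();
    }
};
```

<p>The owner of the queue would push and pull at the head, while <code class="language-plaintext highlighter-rouge">try_pull_back</code> is the operation a stealing thread would call.</p>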
<p>This gives us a new version of <code class="language-plaintext highlighter-rouge">fib</code>:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Ex</span><span class="p">></span>
<span class="kt">int</span> <span class="nf">fib</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">,</span> <span class="kt">int</span> <span class="n">n</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
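<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f1</span> <span class="o">=</span> <span class="n">fork</span><span class="o"><</span><span class="kt">int</span><span class="o">></span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="p">[</span><span class="o">&</span><span class="n">ex</span><span class="p">,</span> <span class="n">n</span><span class="p">]</span> <span class="p">{</span> <span class="k">return</span> <span class="n">fib</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">n</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span> <span class="p">});</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f2</span> <span class="o">=</span> <span class="n">fork</span><span class="o"><</span><span class="kt">int</span><span class="o">></span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="p">[</span><span class="o">&</span><span class="n">ex</span><span class="p">,</span> <span class="n">n</span><span class="p">]</span> <span class="p">{</span> <span class="k">return</span> <span class="n">fib</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">n</span> <span class="o">-</span> <span class="mi">2</span><span class="p">);</span> <span class="p">});</span>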
<span class="n">ex</span><span class="p">.</span><span class="n">reschedule_until</span><span class="p">([</span><span class="o">&</span><span class="p">]()</span><span class="o">-></span><span class="kt">bool</span><span class="p">{</span>
<span class="k">return</span> <span class="n">f1</span><span class="p">.</span><span class="n">is_ready</span><span class="p">()</span> <span class="o">&&</span> <span class="n">f2</span><span class="p">.</span><span class="n">is_ready</span><span class="p">();</span>
<span class="p">});</span>
<span class="k">return</span> <span class="n">f1</span><span class="p">.</span><span class="n">get</span><span class="p">()</span> <span class="o">+</span> <span class="n">f2</span><span class="p">.</span><span class="n">get</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
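<p>Even so, the cache-unfriendliness remains, because the two subtasks you submit may be taken by other threads right away. What your <code class="language-plaintext highlighter-rouge">reschedule_until</code> ends up running may still be a long backlog of other callers' tasks.</p>
<p>To finish the subtasks a thread submits on that same thread as far as possible, each worker thread has to maintain its own task queue. A double-ended queue makes the thread's own subtasks last-in, first-out, and <code class="language-plaintext highlighter-rouge">reschedule_until</code> pulls from the current thread's queue first. (We use a deque rather than a stack so that other threads can later come and steal work.)</p>
<p>The <code class="language-plaintext highlighter-rouge">reschedule_until</code> we wrote above cannot pull from a per-thread queue, so we need to write a new <code class="language-plaintext highlighter-rouge">fork_join_thread_pool</code>.</p>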
<h2 id="deque-per-worker-thread">Deque per worker thread</h2>
<p>When every worker thread has its own double-ended task queue, the interface looks like this:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">fork_join_thread_pool</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">map</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="o">::</span><span class="n">id</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="o">></span> <span class="o">></span> <span class="n">m_threads</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="n">map</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="o">::</span><span class="n">id</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">sync_deque</span><span class="o"><</span><span class="n">work</span><span class="o">></span> <span class="o">></span> <span class="o">></span> <span class="n">m_per_thread_tasks</span><span class="p">;</span>
<span class="n">sync_deque</span><span class="o"><</span><span class="n">work</span><span class="o">></span> <span class="n">m_tasks</span><span class="p">;</span> <span class="c1">// shared queue; a deque because submit_front() may push to its head</span>
<span class="nl">public:</span>
<span class="n">fork_join_thread_pool</span><span class="p">(</span><span class="kt">size_t</span> <span class="n">thread_count</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="o">::</span><span class="n">hardware_concurrency</span><span class="p">()</span> <span class="o">+</span> <span class="mi">1</span><span class="p">);</span>
<span class="o">~</span><span class="n">fork_join_thread_pool</span><span class="p">();</span>
<span class="nl">public:</span>
<span class="kt">bool</span> <span class="n">try_executing_one</span><span class="p">();</span>
<span class="kt">void</span> <span class="n">close</span><span class="p">();</span>
<span class="kt">bool</span> <span class="n">closed</span><span class="p">();</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Pred</span><span class="p">></span>
<span class="kt">bool</span> <span class="n">reschedule_until</span><span class="p">(</span><span class="k">const</span> <span class="n">Pred</span><span class="o">&</span><span class="p">);</span>
<span class="kt">void</span> <span class="n">submit</span><span class="p">(</span><span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">);</span>
<span class="kt">void</span> <span class="n">submit_front</span><span class="p">(</span><span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">);</span>
<span class="kt">void</span> <span class="n">submit_back</span><span class="p">(</span><span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">);</span>
<span class="kt">void</span> <span class="n">run</span><span class="p">();</span>
<span class="p">};</span>
</code></pre></div></div>
<p>We store these in a <code class="language-plaintext highlighter-rouge">std::map</code> so that <code class="language-plaintext highlighter-rouge">submit</code> and <code class="language-plaintext highlighter-rouge">reschedule_until</code> can look things up by the current thread id. The map is small, so we expect its performance to be acceptable; another data structure can of course be substituted if needed.</p>
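<p>One possible substitute for the map lookup is a <code class="language-plaintext highlighter-rouge">thread_local</code> pointer that each worker sets to its own queue when it starts. The sketch below is an assumption-laden illustration, not part of the pool in this article: <code class="language-plaintext highlighter-rouge">t_local_queue</code>, <code class="language-plaintext highlighter-rouge">queue_for_current_thread</code> and <code class="language-plaintext highlighter-rouge">worker_sees_own_queue</code> are made-up names, and a plain <code class="language-plaintext highlighter-rouge">std::deque</code> stands in for the synchronized queue.</p>

```cpp
#include <deque>
#include <functional>
#include <thread>

using task_queue = std::deque<std::function<void()>>;

// Set by each worker when it starts; stays null on non-worker threads.
thread_local task_queue* t_local_queue = nullptr;

// Returns the queue a submission from the calling thread should go to:
// the thread's own queue if it registered one, otherwise the shared queue.
task_queue& queue_for_current_thread(task_queue& shared_queue) {
    return t_local_queue ? *t_local_queue : shared_queue;
}

// Demonstration helper: on a fresh thread, register a local queue and
// report whether the lookup resolves to it.
bool worker_sees_own_queue(task_queue& shared_queue) {
    bool result = false;
    std::thread worker([&] {
        task_queue local;
        t_local_queue = &local;
        result = (&queue_for_current_thread(shared_queue) == &local);
        t_local_queue = nullptr;  // unregister before the queue goes away
    });
    worker.join();
    return result;
}
```

<p>A single global <code class="language-plaintext highlighter-rouge">thread_local</code> like this only works if a thread belongs to at most one pool; with several pools the pointer would have to be paired with the owning pool.</p>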
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">fork_join_thread_pool</span><span class="o">::</span><span class="n">fork_join_thread_pool</span><span class="p">(</span><span class="kt">size_t</span> <span class="n">thread_count</span><span class="p">)</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">latch</span> <span class="n">lt</span><span class="p">(</span><span class="n">thread_count</span><span class="p">);</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">size_t</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">thread_count</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="o">></span> <span class="n">tr</span><span class="p">(</span><span class="k">new</span> <span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="p">([</span><span class="o">&</span><span class="p">]{</span>
<span class="n">lt</span><span class="p">.</span><span class="n">wait</span><span class="p">();</span>
<span class="k">this</span><span class="o">-></span><span class="n">run</span><span class="p">();</span>
<span class="p">}));</span>
<span class="n">m_per_thread_tasks</span><span class="p">[</span><span class="n">tr</span><span class="o">-></span><span class="n">get_id</span><span class="p">()].</span><span class="n">reset</span><span class="p">(</span><span class="k">new</span> <span class="n">sync_deque</span><span class="o"><</span><span class="n">work</span><span class="o">></span><span class="p">);</span>
<span class="n">m_threads</span><span class="p">[</span><span class="n">tr</span><span class="o">-></span><span class="n">get_id</span><span class="p">()]</span> <span class="o">=</span> <span class="n">tr</span><span class="p">;</span>
<span class="n">lt</span><span class="p">.</span><span class="n">count_down</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span> <span class="k">catch</span><span class="p">(...)</span> <span class="p">{</span>
<span class="n">close</span><span class="p">();</span>
<span class="k">throw</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
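<p>Because the thread id is the key, each thread object is constructed before its task queue. For thread safety, the constructor uses <code class="language-plaintext highlighter-rouge">boost::latch</code>, which keeps <code class="language-plaintext highlighter-rouge">run</code> from executing before all worker threads and task queues have been constructed.</p>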
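<p>The same start-up ordering can be sketched with only the standard library. In the hypothetical example below, <code class="language-plaintext highlighter-rouge">start_gate</code> plays the role of the latch and a map of <code class="language-plaintext highlighter-rouge">int</code> stands in for <code class="language-plaintext highlighter-rouge">m_per_thread_tasks</code>; all names are assumptions for the illustration. The gate is kept alive until every worker has been joined, so no worker can still be waiting on it when it is destroyed.</p>

```cpp
#include <condition_variable>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// One-shot gate: workers block in wait() until the starting thread calls open().
class start_gate {
    std::mutex m_mutex;
    std::condition_variable m_cv;
    bool m_open = false;
public:
    void open() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_open = true;
        }
        m_cv.notify_all();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cv.wait(lock, [this] { return m_open; });
    }
};

// Starts thread_count workers and returns how many of them found their own
// entry in the id-keyed map once the gate opened.
std::size_t count_workers_seeing_own_entry(std::size_t thread_count) {
    start_gate gate;
    std::map<std::thread::id, int> per_thread;  // stands in for m_per_thread_tasks
    std::vector<std::thread> threads;
    std::mutex found_mutex;
    std::size_t found = 0;

    threads.reserve(thread_count);
    for (std::size_t i = 0; i < thread_count; ++i) {
        threads.emplace_back([&] {
            gate.wait();  // the map is not read before this returns
            const bool ok = per_thread.count(std::this_thread::get_id()) == 1;
            std::lock_guard<std::mutex> lock(found_mutex);
            if (ok) ++found;
        });
        // The id is only known once the thread exists, so register it afterwards.
        per_thread[threads.back().get_id()] = static_cast<int>(i);
    }
    gate.open();  // every entry exists; the map is read-only from here on
    for (std::thread& t : threads) t.join();
    return found;
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">count_workers_seeing_own_entry(4)</code> should report that all four workers found their entry, since none of them looks at the map before the gate opens.</p>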
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="n">fork_join_thread_pool</span><span class="o">::</span><span class="n">run</span><span class="p">()</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">assert</span><span class="p">(</span><span class="n">m_per_thread_tasks</span><span class="p">.</span><span class="n">find</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">this_thread</span><span class="o">::</span><span class="n">get_id</span><span class="p">())</span> <span class="o">!=</span> <span class="n">m_per_thread_tasks</span><span class="p">.</span><span class="n">end</span><span class="p">());</span>
<span class="n">sync_deque</span><span class="o"><</span><span class="n">work</span><span class="o">>&</span> <span class="n">local_tasks</span> <span class="o">=</span> <span class="o">*</span><span class="n">m_per_thread_tasks</span><span class="p">.</span><span class="n">at</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">this_thread</span><span class="o">::</span><span class="n">get_id</span><span class="p">());</span>
<span class="k">for</span> <span class="p">(;;)</span> <span class="p">{</span>
<span class="n">work</span> <span class="n">task</span><span class="p">;</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">concurrent</span><span class="o">::</span><span class="n">queue_op_status</span> <span class="n">st</span> <span class="o">=</span> <span class="n">local_tasks</span><span class="p">.</span><span class="n">try_pull</span><span class="p">(</span><span class="n">task</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">st</span> <span class="o">==</span> <span class="n">boost</span><span class="o">::</span><span class="n">concurrent</span><span class="o">::</span><span class="n">queue_op_status</span><span class="o">::</span><span class="n">success</span><span class="p">)</span> <span class="p">{</span>
<span class="n">task</span><span class="p">();</span>
<span class="k">continue</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">st</span> <span class="o">=</span> <span class="n">m_tasks</span><span class="p">.</span><span class="n">wait_pull</span><span class="p">(</span><span class="n">task</span><span class="p">);</span> <span class="c1">// reuse st; redeclaring it in the same scope does not compile</span>
<span class="k">if</span> <span class="p">(</span><span class="n">st</span> <span class="o">==</span> <span class="n">boost</span><span class="o">::</span><span class="n">concurrent</span><span class="o">::</span><span class="n">queue_op_status</span><span class="o">::</span><span class="n">closed</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">task</span><span class="p">();</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">thread_interrupted</span><span class="o">&</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span> <span class="c1">// for</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">terminate</span><span class="p">();</span>
<span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>In <code class="language-plaintext highlighter-rouge">run</code>, we first try to pull tasks from this thread's own queue and execute them until that queue is empty, and only then pull from the pool's shared task queue.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="n">fork_join_thread_pool</span><span class="o">::</span><span class="n">submit_front</span><span class="p">(</span><span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">)</span> <span class="p">{</span>
<span class="k">const</span> <span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="o">::</span><span class="n">id</span> <span class="n">this_id</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">this_thread</span><span class="o">::</span><span class="n">get_id</span><span class="p">();</span>
<span class="k">auto</span> <span class="n">it</span> <span class="o">=</span> <span class="n">m_per_thread_tasks</span><span class="p">.</span><span class="n">find</span><span class="p">(</span><span class="n">this_id</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">it</span> <span class="o">!=</span> <span class="n">m_per_thread_tasks</span><span class="p">.</span><span class="n">end</span><span class="p">())</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">sync_deque</span><span class="o"><</span><span class="n">work</span><span class="o">></span> <span class="o">></span> <span class="n">q</span> <span class="o">=</span> <span class="n">it</span><span class="o">-></span><span class="n">second</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">q</span><span class="p">)</span> <span class="p">{</span>
<span class="n">q</span><span class="o">-></span><span class="n">push_front</span><span class="p">(</span><span class="n">w</span><span class="p">);</span>
<span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="n">m_tasks</span><span class="p">.</span><span class="n">push_front</span><span class="p">(</span><span class="n">w</span><span class="p">);</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">fork_join_thread_pool</span><span class="o">::</span><span class="n">submit_back</span><span class="p">(</span><span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">)</span> <span class="p">{</span>
<span class="k">const</span> <span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="o">::</span><span class="n">id</span> <span class="n">this_id</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">this_thread</span><span class="o">::</span><span class="n">get_id</span><span class="p">();</span>
<span class="k">auto</span> <span class="n">it</span> <span class="o">=</span> <span class="n">m_per_thread_tasks</span><span class="p">.</span><span class="n">find</span><span class="p">(</span><span class="n">this_id</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">it</span> <span class="o">!=</span> <span class="n">m_per_thread_tasks</span><span class="p">.</span><span class="n">end</span><span class="p">())</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">sync_deque</span><span class="o"><</span><span class="n">work</span><span class="o">></span> <span class="o">></span> <span class="n">q</span> <span class="o">=</span> <span class="n">it</span><span class="o">-></span><span class="n">second</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">q</span><span class="p">)</span> <span class="p">{</span>
<span class="n">q</span><span class="o">-></span><span class="n">push_back</span><span class="p">(</span><span class="n">w</span><span class="p">);</span>
<span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="n">m_tasks</span><span class="p">.</span><span class="n">push_back</span><span class="p">(</span><span class="n">w</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Both <code class="language-plaintext highlighter-rouge">submit_front</code> and <code class="language-plaintext highlighter-rouge">submit_back</code> first look for a task queue belonging to the current thread, and submit to the pool's task queue only if there is none.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Pred</span><span class="p">></span>
<span class="kt">bool</span> <span class="n">fork_join_thread_pool</span><span class="o">::</span><span class="n">reschedule_until</span><span class="p">(</span><span class="k">const</span> <span class="n">Pred</span><span class="o">&</span> <span class="n">pred</span><span class="p">)</span> <span class="p">{</span>
<span class="k">const</span> <span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="o">::</span><span class="n">id</span> <span class="n">this_id</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">this_thread</span><span class="o">::</span><span class="n">get_id</span><span class="p">();</span>
<span class="k">auto</span> <span class="n">it</span> <span class="o">=</span> <span class="n">m_per_thread_tasks</span><span class="p">.</span><span class="n">find</span><span class="p">(</span><span class="n">this_id</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">it</span> <span class="o">!=</span> <span class="n">m_per_thread_tasks</span><span class="p">.</span><span class="n">end</span><span class="p">())</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">sync_deque</span><span class="o"><</span><span class="n">work</span><span class="o">></span> <span class="o">></span> <span class="n">q</span> <span class="o">=</span> <span class="n">it</span><span class="o">-></span><span class="n">second</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">q</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">reschedule_until</span><span class="p">(</span><span class="n">pred</span><span class="p">,</span> <span class="n">q</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">do</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">try_executing_one</span><span class="p">(</span><span class="n">m_tasks</span><span class="p">))</span> <span class="p">{</span>
<span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="o">!</span><span class="n">pred</span><span class="p">());</span>
<span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Pred</span><span class="p">></span>
<span class="kt">bool</span> <span class="n">fork_join_thread_pool</span><span class="o">::</span><span class="n">reschedule_until</span><span class="p">(</span><span class="k">const</span> <span class="n">Pred</span><span class="o">&</span> <span class="n">pred</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">sync_deque</span><span class="o"><</span><span class="n">work</span><span class="o">></span> <span class="o">></span> <span class="n">local_tasks</span><span class="p">)</span> <span class="p">{</span>
<span class="k">do</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">try_executing_one</span><span class="p">(</span><span class="o">*</span><span class="n">local_tasks</span><span class="p">))</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">try_executing_one</span><span class="p">(</span><span class="n">m_tasks</span><span class="p">))</span> <span class="p">{</span>
<span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="o">!</span><span class="n">pred</span><span class="p">());</span>
<span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">bool</span> <span class="n">fork_join_thread_pool</span><span class="o">::</span><span class="n">try_executing_one</span><span class="p">(</span><span class="n">sync_deque</span><span class="o"><</span><span class="n">work</span><span class="o">>&</span> <span class="n">queue</span><span class="p">)</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">work</span> <span class="n">task</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">queue</span><span class="p">.</span><span class="n">try_pull</span><span class="p">(</span><span class="n">task</span><span class="p">)</span> <span class="o">==</span> <span class="n">boost</span><span class="o">::</span><span class="n">concurrent</span><span class="o">::</span><span class="n">queue_op_status</span><span class="o">::</span><span class="n">success</span><span class="p">)</span> <span class="p">{</span>
<span class="n">task</span><span class="p">();</span>
<span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">terminate</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
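<p><code class="language-plaintext highlighter-rouge">reschedule_until</code> is a little more involved: the current thread may have its own task queue, but that queue may be empty, in which case we also have to check whether the pool's shared task queue has any tasks.</p>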
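<p>The fallback order is easier to see in isolation. The following single-threaded sketch mirrors the loop's shape with plain <code class="language-plaintext highlighter-rouge">std::deque</code> queues passed in explicitly; the free-function signatures and the <code class="language-plaintext highlighter-rouge">task_queue</code> alias are assumptions for this illustration and drop the synchronization the real pool needs.</p>

```cpp
#include <deque>
#include <functional>
#include <utility>

using task = std::function<void()>;
using task_queue = std::deque<task>;

// Pull one task from the head of the queue and run it; false if the queue is empty.
bool try_executing_one(task_queue& queue) {
    if (queue.empty()) return false;
    task t = std::move(queue.front());
    queue.pop_front();  // remove before running: the task may push new work
    t();
    return true;
}

// Keep running tasks, local queue first and shared queue second, until pred() holds.
// Returns false when both queues run dry before pred() becomes true.
template <typename Pred>
bool reschedule_until(const Pred& pred, task_queue& local_tasks, task_queue& shared_tasks) {
    do {
        if (!try_executing_one(local_tasks) && !try_executing_one(shared_tasks)) {
            return false;
        }
    } while (!pred());
    return true;
}
```

<p>Note that the <code class="language-plaintext highlighter-rouge">do</code>/<code class="language-plaintext highlighter-rouge">while</code> shape runs one task before <code class="language-plaintext highlighter-rouge">pred</code> is checked for the first time, and that a <code class="language-plaintext highlighter-rouge">false</code> return tells the caller to fall back to a blocking wait.</p>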
<p>When <code class="language-plaintext highlighter-rouge">reschedule_until</code> is called from a thread that is not a worker, any subtasks submitted by the tasks that <code class="language-plaintext highlighter-rouge">try_executing_one</code> runs all go to the pool's task queue.</p>
<p>That completes <code class="language-plaintext highlighter-rouge">fork_join_thread_pool</code>. For convenience we can write a <code class="language-plaintext highlighter-rouge">join</code> function:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">Ex</span><span class="p">></span>
<span class="kt">void</span> <span class="nf">join</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">>&</span> <span class="n">f</span><span class="p">)</span> <span class="p">{</span>
<span class="k">const</span> <span class="kt">bool</span> <span class="n">ret</span> <span class="o">=</span> <span class="n">ex</span><span class="p">.</span><span class="n">reschedule_until</span><span class="p">([</span><span class="o">&</span><span class="p">]()</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">f</span><span class="p">.</span><span class="n">is_ready</span><span class="p">();</span>
<span class="p">});</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ret</span><span class="p">)</span> <span class="p">{</span>
<span class="n">f</span><span class="p">.</span><span class="n">wait</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T1</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T2</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">Ex</span><span class="p">></span>
<span class="kt">void</span> <span class="nf">join</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="n">T1</span><span class="o">>&</span> <span class="n">f1</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="n">T2</span><span class="o">>&</span> <span class="n">f2</span><span class="p">)</span> <span class="p">{</span>
<span class="k">const</span> <span class="kt">bool</span> <span class="n">ret</span> <span class="o">=</span> <span class="n">ex</span><span class="p">.</span><span class="n">reschedule_until</span><span class="p">([</span><span class="o">&</span><span class="p">]()</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">f1</span><span class="p">.</span><span class="n">is_ready</span><span class="p">()</span> <span class="o">&&</span> <span class="n">f2</span><span class="p">.</span><span class="n">is_ready</span><span class="p">();</span>
<span class="p">});</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ret</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">wait_for_all</span><span class="p">(</span><span class="n">f1</span><span class="p">,</span> <span class="n">f2</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>This gives us a new version of <code class="language-plaintext highlighter-rouge">fib</code>:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">fib</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">,</span> <span class="kt">int</span> <span class="n">n</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f1</span> <span class="o">=</span> <span class="n">fork</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">fib</span><span class="p">,</span> <span class="n">n</span><span class="o">-</span><span class="mi">1</span><span class="p">);</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f2</span> <span class="o">=</span> <span class="n">fork</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">fib</span><span class="p">,</span> <span class="n">n</span><span class="o">-</span><span class="mi">2</span><span class="p">);</span>
<span class="n">join</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">f1</span><span class="p">,</span> <span class="n">f2</span><span class="p">);</span>
<span class="k">return</span> <span class="n">f1</span><span class="p">.</span><span class="n">get</span><span class="p">()</span> <span class="o">+</span> <span class="n">f2</span><span class="p">.</span><span class="n">get</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
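<p>Of course, this is still not the final version. Tasks vary in size: what if a thread's own large task, once split, takes a very long time to finish while other threads sit idle? The answer people came up with is to let a thread that has run out of tasks help the other threads. This approach is called <code class="language-plaintext highlighter-rouge">work stealing</code>[1][4][6]; it is somewhat involved and deserves an article of its own, so we will not go into detail here.</p>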
<h2 id="forkjoin-future-task">fork/join future task</h2>
<p>A future can be passed around as an argument or a return value, but when we return one we naturally do not return the executor along with it. The <code class="language-plaintext highlighter-rouge">join</code> above, however, needs an executor, so we have to either add an interface to future or change the behavior of <code class="language-plaintext highlighter-rouge">wait</code>. For convenience, we add a <code class="language-plaintext highlighter-rouge">join</code> method.</p>
<p>When we added executor and then support to our future, we stored an <code class="language-plaintext highlighter-rouge">executor_ptr</code>, a pointer wrapper around the executor, in <code class="language-plaintext highlighter-rouge">shared_state_base</code>. Our <code class="language-plaintext highlighter-rouge">shared_state_base::join</code> can therefore call <code class="language-plaintext highlighter-rouge">reschedule_until</code> through it.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="kt">void</span> <span class="n">shared_state_base</span><span class="o">::</span><span class="n">join</span><span class="p">()</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">policy</span> <span class="o">==</span> <span class="n">launch_policy</span><span class="o">::</span><span class="n">policy_executor</span> <span class="o">&&</span> <span class="n">ex</span><span class="p">)</span> <span class="p">{</span>
<span class="k">const</span> <span class="kt">bool</span> <span class="n">ret</span> <span class="o">=</span> <span class="n">ex</span><span class="o">-></span><span class="n">reschedule_until</span><span class="p">([</span><span class="o">&</span><span class="p">](){</span>
<span class="k">return</span> <span class="k">this</span><span class="o">-></span><span class="n">is_ready</span><span class="p">();</span>
<span class="p">});</span>
<span class="k">if</span> <span class="p">(</span><span class="n">ret</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="n">wait</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Likewise, we can have a free function <code class="language-plaintext highlighter-rouge">join</code> that takes no executor:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T1</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T2</span><span class="p">></span>
<span class="kt">void</span> <span class="nf">join</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T1</span><span class="o">>&</span> <span class="n">f1</span><span class="p">,</span> <span class="n">future</span><span class="o"><</span><span class="n">T2</span><span class="o">>&</span> <span class="n">f2</span><span class="p">)</span> <span class="p">{</span>
<span class="n">f1</span><span class="p">.</span><span class="n">join</span><span class="p">();</span>
<span class="n">f2</span><span class="p">.</span><span class="n">join</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The version that takes an executor can be adapted slightly as well:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T1</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T2</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">Ex</span><span class="p">></span>
<span class="kt">void</span> <span class="nf">join</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">,</span> <span class="n">future</span><span class="o"><</span><span class="n">T1</span><span class="o">>&</span> <span class="n">f1</span><span class="p">,</span> <span class="n">future</span><span class="o"><</span><span class="n">T2</span><span class="o">>&</span> <span class="n">f2</span><span class="p">)</span> <span class="p">{</span>
<span class="k">const</span> <span class="kt">bool</span> <span class="n">ret</span> <span class="o">=</span> <span class="n">ex</span><span class="p">.</span><span class="n">reschedule_until</span><span class="p">([</span><span class="o">&</span><span class="p">]()</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">f1</span><span class="p">.</span><span class="n">is_ready</span><span class="p">()</span> <span class="o">&&</span> <span class="n">f2</span><span class="p">.</span><span class="n">is_ready</span><span class="p">();</span>
<span class="p">});</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ret</span><span class="p">)</span> <span class="p">{</span>
<span class="n">join</span><span class="p">(</span><span class="n">f1</span><span class="p">,</span> <span class="n">f2</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<h2 id="task_region--task_block">task_region / task_block</h2>
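<p>Forking and joining through free functions is convenient, but nothing forces the current task to wait for its child tasks before it exits. Logically there may well be tasks that do not need to wait for their children, but that flexibility brings extra mental burden and makes debugging harder. On top of that, a child task that never gets joined, whether because an exception was thrown or simply because the programmer made a mistake, can cause a whole series of problems. Finally, stricter rules may allow the compiler to apply more targeted optimizations. So the C++ community chose the <code class="language-plaintext highlighter-rouge">fully-strict</code> rule: a child task must complete before its direct parent task completes. (The less strict rule is called <code class="language-plaintext highlighter-rouge">terminally-strict</code>; it relaxes the requirement from the direct parent to any ancestor task.)[2]</p>
<p><code class="language-plaintext highlighter-rouge">task_region</code> is the proposal that came out of this: the programmer does not write <code class="language-plaintext highlighter-rouge">join</code> by hand; instead, the <code class="language-plaintext highlighter-rouge">join</code> happens automatically when the <code class="language-plaintext highlighter-rouge">task_region</code> ends.</p>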
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">fib</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">,</span> <span class="kt">int</span> <span class="n">n</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="kt">int</span> <span class="n">f1</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">f2</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">task_region</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="p">[</span><span class="o">&</span><span class="p">](</span><span class="n">task_region_handle_gen</span><span class="o"><</span><span class="n">Ex</span><span class="o">>&</span> <span class="n">trh</span><span class="p">)</span> <span class="p">{</span>
<span class="n">trh</span><span class="p">.</span><span class="n">run</span><span class="p">([</span><span class="o">&</span><span class="p">]()</span> <span class="p">{</span> <span class="n">f1</span> <span class="o">=</span> <span class="n">fib</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">n</span><span class="o">-</span><span class="mi">1</span><span class="p">);</span> <span class="p">});</span>
<span class="n">trh</span><span class="p">.</span><span class="n">run</span><span class="p">([</span><span class="o">&</span><span class="p">]()</span> <span class="p">{</span> <span class="n">f2</span> <span class="o">=</span> <span class="n">fib</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">n</span><span class="o">-</span><span class="mi">2</span><span class="p">);</span> <span class="p">});</span>
<span class="p">});</span>
<span class="k">return</span> <span class="n">f1</span> <span class="o">+</span> <span class="n">f2</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>(If you look at the documentation example for <code class="language-plaintext highlighter-rouge">boost::experimental::parallel::task_region</code>, you may notice that it differs slightly from the code above: Boost does not submit a task for <code class="language-plaintext highlighter-rouge">f2</code>. This is because the current (boost1.7) <code class="language-plaintext highlighter-rouge">task_region</code> implementation still does not call <code class="language-plaintext highlighter-rouge">reschedule_until</code> or any other scheduling strategy inside <code class="language-plaintext highlighter-rouge">wait</code>, so to avoid needless waiting the computation of <code class="language-plaintext highlighter-rouge">f2</code> is left on the current thread.)</p>
<p><code class="language-plaintext highlighter-rouge">task_region</code> is a free function that usually comes in two versions: one takes only a callable, the other takes an executor and a callable. There is no real difference between them; the former simply supplies a default executor.</p>
<p>The callable it accepts has a prescribed form: it must take a <code class="language-plaintext highlighter-rouge">task_region_handle_gen<Ex>&</code> parameter, and every task submitted inside the <code class="language-plaintext highlighter-rouge">task_region</code> must go through that parameter. Recalling the purpose of <code class="language-plaintext highlighter-rouge">task_region</code>, it is easy to see that <code class="language-plaintext highlighter-rouge">task_region_handle_gen<Ex></code> waits for the child tasks we submitted to it before it is destroyed.</p>
<p>With that, we can guess at the implementation of <code class="language-plaintext highlighter-rouge">task_region</code>:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="n">Ex</span><span class="p">,</span> <span class="k">typename</span> <span class="n">F</span><span class="o">></span>
<span class="kt">void</span> <span class="nf">task_region</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">,</span> <span class="n">F</span><span class="o">&&</span> <span class="n">f</span><span class="p">)</span> <span class="p">{</span>
<span class="n">task_region_handle_gen</span><span class="o"><</span><span class="n">Ex</span><span class="o">></span> <span class="n">trh</span><span class="p">(</span><span class="n">ex</span><span class="p">);</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">f</span><span class="p">(</span><span class="n">trh</span><span class="p">);</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="c1">// handle task region exception</span>
<span class="p">}</span>
<span class="n">trh</span><span class="p">.</span><span class="n">wait_all</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">wait_all</code> simply waits for all child tasks.</p>
<p>Because <code class="language-plaintext highlighter-rouge">wait_all</code> is only called when <code class="language-plaintext highlighter-rouge">task_region_handle_gen<Ex></code> is destroyed, or explicitly just before the <code class="language-plaintext highlighter-rouge">task_region</code> ends, a child task submitted inside a <code class="language-plaintext highlighter-rouge">task_region</code> should not capture <code class="language-plaintext highlighter-rouge">trh</code> and keep submitting tasks to it from within the child. If we need to split the work further, we open another <code class="language-plaintext highlighter-rouge">task_region</code>:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">task_region</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="p">[</span><span class="o">&</span><span class="p">](</span><span class="k">auto</span><span class="o">&</span> <span class="n">trh</span><span class="p">)</span> <span class="p">{</span>
<span class="n">trh</span><span class="p">.</span><span class="n">run</span><span class="p">([</span><span class="o">&</span><span class="p">]{</span>
<span class="n">task_region</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="p">[</span><span class="o">&</span><span class="p">](</span><span class="k">auto</span><span class="o">&</span> <span class="n">inner_trh</span><span class="p">)</span> <span class="p">{</span>
<span class="n">inner_trh</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">f</span><span class="p">);</span>
<span class="p">});</span>
<span class="c1">// ...</span>
<span class="p">});</span>
<span class="c1">// ...</span>
<span class="p">});</span>
</code></pre></div></div>
<p>Ignoring exception handling, we can implement <code class="language-plaintext highlighter-rouge">task_region_handle_gen<Ex></code> as follows:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Ex</span><span class="p">></span>
<span class="k">class</span> <span class="nc">task_region_handle_gen</span> <span class="p">{</span>
<span class="n">Ex</span><span class="o">&</span> <span class="n">m_ex</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">void</span><span class="o">></span> <span class="o">></span> <span class="n">m_futures</span><span class="p">;</span>
<span class="nl">public:</span>
<span class="n">task_region_handle_gen</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">)</span><span class="o">:</span> <span class="n">m_ex</span><span class="p">(</span><span class="n">ex</span><span class="p">)</span> <span class="p">{}</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">F</span><span class="p">></span>
<span class="kt">void</span> <span class="n">run</span><span class="p">(</span><span class="n">F</span><span class="o">&&</span> <span class="n">f</span><span class="p">)</span> <span class="p">{</span>
<span class="n">m_futures</span><span class="p">.</span><span class="n">push_back</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">async</span><span class="p">(</span><span class="n">m_ex</span><span class="p">,</span> <span class="n">std</span><span class="o">::</span><span class="n">forward</span><span class="o"><</span><span class="n">F</span><span class="o">></span><span class="p">(</span><span class="n">f</span><span class="p">)));</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">wait_all</span><span class="p">()</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">wait_for_all</span><span class="p">(</span><span class="n">m_futures</span><span class="p">.</span><span class="n">begin</span><span class="p">(),</span> <span class="n">m_futures</span><span class="p">.</span><span class="n">end</span><span class="p">());</span>
<span class="c1">// handle exceptions if you need to</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
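<p>As we can see, because the proposal does not say which strategy <code class="language-plaintext highlighter-rouge">wait_all</code> should use to <code class="language-plaintext highlighter-rouge">join</code>, the basic implementation simply calls <code class="language-plaintext highlighter-rouge">wait_for_all</code>. If we want to bring in the results of the previous sections, it is just as easy to write another implementation:</p>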
<pre><code class="language-C++">template<typename Ex>
class task_region_handle_gen {
Ex& m_ex;
std::vector<boost::future<void> > m_futures;
public:
task_region_handle_gen(Ex& ex): m_ex(ex) {}
template<typename F>
void run(F&& f) {
m_futures.push_back(fork(m_ex, std::forward<F>(f)));
}
void wait_all() {
join(m_ex, m_futures.begin(), m_futures.end());
// handle exceptions if you need to
}
};
</code></pre>
<p>(Implementing the iterator version of <code class="language-plaintext highlighter-rouge">join</code> is left as an exercise.)</p>
<p><code class="language-plaintext highlighter-rouge">task_region</code> is not a good name, so a later proposal (N4411) changed it, replacing <code class="language-plaintext highlighter-rouge">task_region</code> with <code class="language-plaintext highlighter-rouge">define_task_block</code> and <code class="language-plaintext highlighter-rouge">task_region_handle_gen</code> with <code class="language-plaintext highlighter-rouge">task_block</code>[3]:</p>
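<p>One possible shape for the iterator version of <code class="language-plaintext highlighter-rouge">join</code> is sketched below. Everything apart from the names <code class="language-plaintext highlighter-rouge">fork</code>, <code class="language-plaintext highlighter-rouge">join</code> and <code class="language-plaintext highlighter-rouge">reschedule_until</code> is an assumption made for illustration: <code class="language-plaintext highlighter-rouge">toy_executor</code> is a hypothetical single-threaded stand-in for the thread pool, and <code class="language-plaintext highlighter-rouge">std::future</code> with a zero-timeout poll replaces <code class="language-plaintext highlighter-rouge">boost::future::is_ready</code>.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <utility>

// Hypothetical stand-in for the article's executor: tasks sit in a deque
// until some caller drains them through reschedule_until.
class toy_executor {
    std::deque<std::function<void()>> m_tasks;
public:
    void submit(std::function<void()> task) { m_tasks.push_back(std::move(task)); }

    // Runs queued tasks on the calling thread until pred() holds or the
    // queue is empty; returns whether pred() holds at that point.
    template<typename Pred>
    bool reschedule_until(Pred pred) {
        while (!pred() && !m_tasks.empty()) {
            std::function<void()> task = std::move(m_tasks.front());
            m_tasks.pop_front();
            task();
        }
        return pred();
    }
};

// std::future has no is_ready(); a zero-timeout wait_for emulates it.
template<typename T>
bool is_ready(const std::future<T>& f) {
    assert(f.valid());
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}

// Simplified fork: wraps f in a packaged_task and queues it on the executor.
template<typename F>
auto fork(toy_executor& ex, F f) -> std::future<decltype(f())> {
    using R = decltype(f());
    auto task = std::make_shared<std::packaged_task<R()>>(std::move(f));
    std::future<R> result = task->get_future();
    ex.submit([task]() { (*task)(); });
    return result;
}

// Iterator version of join: help run queued tasks until every future in
// [first, last) is ready, and fall back to blocking waits otherwise.
template<typename Ex, typename It>
void join(Ex& ex, It first, It last) {
    const bool ret = ex.reschedule_until([&]() {
        return std::all_of(first, last,
                           [](const auto& f) { return is_ready(f); });
    });
    if (!ret) {
        for (It it = first; it != last; ++it) {
            it->wait();
        }
    }
}
```

<p>With a real multi-threaded pool the fallback loop matters, because the remaining tasks may be running on other threads; with this toy executor the queue is always drained by the caller first.</p>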
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">define_task_block</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="p">[</span><span class="o">&</span><span class="p">](</span><span class="n">task_block</span><span class="o">&</span> <span class="n">tb</span><span class="p">)</span> <span class="p">{</span>
<span class="n">tb</span><span class="p">.</span><span class="n">run</span><span class="p">([</span><span class="o">&</span><span class="p">]{</span>
<span class="n">define_task_block</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="p">[</span><span class="o">&</span><span class="p">](</span><span class="k">auto</span><span class="o">&</span> <span class="n">inner_tb</span><span class="p">)</span> <span class="p">{</span>
<span class="n">inner_tb</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">f</span><span class="p">);</span>
<span class="p">});</span>
<span class="c1">// ...</span>
<span class="p">});</span>
<span class="c1">// ...</span>
<span class="p">});</span>
</code></pre></div></div>
<p>Doesn't that look much clearer (convinced.jpg)?</p>
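<p>To see the pieces of this section working together, here is a small self-contained sketch. It is not the proposal's API: <code class="language-plaintext highlighter-rouge">mini_task_block</code> and <code class="language-plaintext highlighter-rouge">define_mini_task_block</code> are hypothetical names, <code class="language-plaintext highlighter-rouge">std::async</code> stands in for the executor, and exceptions from child tasks are not collected.</p>

```cpp
#include <cassert>
#include <future>
#include <utility>
#include <vector>

// Hypothetical, simplified counterpart of task_block: it remembers the
// futures of its child tasks so that wait_all can wait for every one.
class mini_task_block {
    std::vector<std::future<void>> m_futures;
public:
    template<typename F>
    void run(F&& f) {
        // Assumption: one thread per child task instead of a thread pool.
        m_futures.push_back(std::async(std::launch::async, std::forward<F>(f)));
    }
    void wait_all() {
        for (auto& f : m_futures) {
            f.wait();
        }
        m_futures.clear();
    }
};

// Runs f with a fresh block and joins all children before returning,
// which is the fully-strict rule: children finish before the parent.
template<typename F>
void define_mini_task_block(F&& f) {
    mini_task_block tb;
    try {
        f(tb);
    } catch (...) {
        tb.wait_all();  // children may still reference the caller's locals
        throw;
    }
    tb.wait_all();
}

int fib(int n) {
    assert(n >= 0);
    if (n < 2) {
        return n;
    }
    int f1 = 0;
    int f2 = 0;
    define_mini_task_block([&](mini_task_block& tb) {
        tb.run([&]() { f1 = fib(n - 1); });
        // As in the Boost documentation example mentioned above, the second
        // half stays on the current thread instead of being submitted.
        f2 = fib(n - 2);
    });
    return f1 + f2;
}
```

<p>Since every nested call opens its own block, no child ever submits work to its parent's block, which matches the usage rule described earlier.</p>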
<h2 id="directed-acyclic-graph">directed acyclic graph</h2>
<p>We can write code in which child tasks depend on one another:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">foobar</span><span class="p">()</span> <span class="p">{</span>
<span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f1</span> <span class="o">=</span> <span class="n">fork</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">foo</span><span class="p">);</span>
<span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f2</span> <span class="o">=</span> <span class="n">fork</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="p">[</span><span class="o">&</span><span class="p">]()</span> <span class="p">{</span>
<span class="n">join</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">f1</span><span class="p">);</span>
<span class="k">return</span> <span class="n">bar</span><span class="p">();</span>
<span class="p">});</span>
<span class="n">join</span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">f1</span><span class="p">,</span> <span class="n">f2</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Assuming neither <code class="language-plaintext highlighter-rouge">foo</code> nor <code class="language-plaintext highlighter-rouge">bar</code> forks any further, can this deadlock? Let's analyze it.</p>
<p>At the time of the <code class="language-plaintext highlighter-rouge">join</code>, the task queue can be in one of several states:</p>
<ul>
<li>both <code class="language-plaintext highlighter-rouge">foo</code> and <code class="language-plaintext highlighter-rouge">bar</code> are still in the queue;</li>
<li><code class="language-plaintext highlighter-rouge">foo</code> is being run by another thread and <code class="language-plaintext highlighter-rouge">bar</code> is still in the queue;</li>
<li><code class="language-plaintext highlighter-rouge">foo</code> and <code class="language-plaintext highlighter-rouge">bar</code> are being run by two different threads;</li>
<li><code class="language-plaintext highlighter-rouge">foo</code> and <code class="language-plaintext highlighter-rouge">bar</code> are being run by the same thread.</li>
</ul>
<p>If both <code class="language-plaintext highlighter-rouge">foo</code> and <code class="language-plaintext highlighter-rouge">bar</code> are still in the queue, <code class="language-plaintext highlighter-rouge">join</code> first takes out <code class="language-plaintext highlighter-rouge">foo</code> and runs it. Two things can happen next. Either it goes on to take out <code class="language-plaintext highlighter-rouge">bar</code> and runs it, in which case the dependency on <code class="language-plaintext highlighter-rouge">f1</code> is already satisfied; or <code class="language-plaintext highlighter-rouge">bar</code> is run by another thread, which will <code class="language-plaintext highlighter-rouge">join</code> <code class="language-plaintext highlighter-rouge">f1</code>, but <code class="language-plaintext highlighter-rouge">foo</code> has already been picked up for execution, so there is no deadlock. Either way, this case cannot deadlock.</p>
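<p>If <code class="language-plaintext highlighter-rouge">foo</code> is being run by another thread and <code class="language-plaintext highlighter-rouge">bar</code> is still in the queue, <code class="language-plaintext highlighter-rouge">join</code> takes out <code class="language-plaintext highlighter-rouge">bar</code> and runs it. Since <code class="language-plaintext highlighter-rouge">foo</code> is being run by another thread, <code class="language-plaintext highlighter-rouge">f1</code> becomes <code class="language-plaintext highlighter-rouge">ready</code> after a short wait, so again there is no deadlock.</p>
<p>If <code class="language-plaintext highlighter-rouge">foo</code> and <code class="language-plaintext highlighter-rouge">bar</code> are both being run by other threads, whether the same one or two different ones, there is clearly no way to deadlock.</p>
<p>So even with a certain amount of dependency there is no deadlock. In fact, it is enough for the dependency graph to be a directed acyclic graph (DAG)[4]; it does not even have to be a directed tree. Why is that?</p>
<p>The argument resembles Kahn's algorithm for topological sorting. Let each task that is depended on contribute one out-degree, and each task that depends on another contribute one in-degree. Because the graph is a directed acyclic graph, we can always find at least one node with in-degree 0. If we remove that node together with its outgoing edges, we are left with either a new directed acyclic graph or an empty graph. Repeating this, as long as there is no cycle, we can remove every node in the graph.</p>
<p>The question, then, is whether the <code class="language-plaintext highlighter-rouge">reschedule_until</code> inside our <code class="language-plaintext highlighter-rouge">join</code> is guaranteed to find such a node with in-degree 0. It is: a last-in-first-out fork is a depth-first traversal and a first-in-first-out fork is a breadth-first traversal, and both visit the whole graph. Once <code class="language-plaintext highlighter-rouge">reschedule_until</code> finds a task that depends on no other task, it completes that task, which is equivalent to removing the task's outgoing edges.</p>
<p>By the same reasoning, a graph that contains a cycle is bound to deadlock.</p>
<p>Of course, the discussion above assumes that the graph is unfolded by forking from a single task in the task queue. We can construct nastier cases. Suppose we have n threads and n+1 tasks, where the first n tasks depend on the last one. If we hold back the last task, <code class="language-plaintext highlighter-rouge">reschedule_until</code> fails on every thread and they all fall into a blocking wait. If we then submit the last task, no thread is left to run it, and we have a real deadlock.</p>
<p>This is not an easy problem to solve. One possible approach is to switch to busy waiting, which guarantees that a new task always finds a thread to run it, at the cost of wasted CPU. Another is to have <code class="language-plaintext highlighter-rouge">join</code> register a condition variable with the task queue or the thread pool, so that submitting a new task <code class="language-plaintext highlighter-rouge">notify</code>s the registered condition variables; the registering and unregistering then add contention. Which one to use depends on the workload: if tasks are numerous and dense, busy waiting works well; if tasks are sparse, the extra contention from registering condition variables is hardly noticeable.</p>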
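<p>The node-removal argument can be sketched as a small check in the style of Kahn's algorithm. The function <code class="language-plaintext highlighter-rouge">is_acyclic</code> and its edge representation are hypothetical helpers for illustration only; they are not part of the thread pool. An edge <code class="language-plaintext highlighter-rouge">(a, b)</code> is assumed to mean that task <code class="language-plaintext highlighter-rouge">b</code> joins (depends on) task <code class="language-plaintext highlighter-rouge">a</code>.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Returns true if the dependency graph over tasks 0..n-1 has no cycle,
// i.e. if repeatedly "completing" tasks with in-degree 0 removes them all.
bool is_acyclic(std::size_t n,
                const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
    std::vector<std::vector<std::size_t>> dependents(n);
    std::vector<std::size_t> in_degree(n, 0);
    for (const auto& e : edges) {
        assert(e.first < n && e.second < n);
        dependents[e.first].push_back(e.second);  // out-edge of the dependency
        ++in_degree[e.second];                    // in-edge of the dependent
    }

    // Tasks with in-degree 0 wait on nobody, so they can run right away.
    std::queue<std::size_t> ready;
    for (std::size_t v = 0; v < n; ++v) {
        if (in_degree[v] == 0) {
            ready.push(v);
        }
    }

    std::size_t finished = 0;
    while (!ready.empty()) {
        const std::size_t v = ready.front();
        ready.pop();
        ++finished;
        // Completing v removes its outgoing edges from the graph.
        for (std::size_t w : dependents[v]) {
            if (--in_degree[w] == 0) {
                ready.push(w);
            }
        }
    }

    // Every task could be removed exactly when the graph has no cycle.
    return finished == n;
}
```

<p>For the <code class="language-plaintext highlighter-rouge">foobar</code> example, <code class="language-plaintext highlighter-rouge">foo</code> would be task 0 and <code class="language-plaintext highlighter-rouge">bar</code> task 1 with the single edge <code class="language-plaintext highlighter-rouge">(0, 1)</code>. A diamond-shaped graph also passes the check even though it is not a tree.</p>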
<h2 id="总结">Summary</h2>
<p>In summary, fork/join on a thread pool with a fixed number of threads requires the following:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">reschedule_until</code></li>
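<li>a deque maintained by each worker thread</li>
<li>work stealing for load balancing</li>
<li>a task set that forms a directed acyclic graph</li>
<li>(as needed) newly submitted tasks waking up blocked joins</li>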
</ul>
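<p>As for the overly deep call stacks that <code class="language-plaintext highlighter-rouge">reschedule_until</code> can cause, making <code class="language-plaintext highlighter-rouge">fork</code> last-in-first-out alleviates the problem to some extent, but the more fundamental fix is to "switch call stacks directly". That is the n:m stackful coroutine approach, as in the goroutine scheduler of the Go language. We will discuss it in detail much later, in the chapters on coroutines.</p>
<p>Once work stealing has given us load balancing, for many algorithms we can recursively decompose the problem for concurrency until its "size" falls below some threshold, which makes full use of the available concurrency. Work stealing itself comes in many flavors[2]; the next article discusses the topic in detail.</p>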
<p><strong>Reference:</strong></p>
<ul>
<li class="ref">[1] Doug Lea, <a href="http://gee.cs.oswego.edu/dl/papers/fj.pdf">A Java Fork/Join Framework</a>, June. 2000</li>
<li class="ref">[2] Pablo Halpern, Arch Robison, Hong Hong, Artur Laksberg, Gor Nishanov, Herb Sutter, <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4088.pdf">Task Region R3 | N4088</a>, June. 2014</li>
<li class="ref">[3] Pablo Halpern, Arch Robison, Hong Hong, Artur Laksberg, Gor Nishanov, Herb Sutter, <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4411.pdf">N4411 | Task Block (formerly Task Region) R4</a>, June. 2014</li>
<li class="ref">[4] IPCC, <a href="http://ipcc.cs.uoregon.edu/lectures/lecture-9-fork-join.pdf">Fork-Join Pattern</a>, UO CIS, 2014</li>
<li class="ref">[5] James Reinders, trans. 聂雪军, <em>Intel Threading Building Blocks编程指南</em>. Beijing: China Machine Press (机械工业出版社), 1st ed., Jan. 2009</li>
<li class="ref">[6] Timothy G. Mattson, Beverly A. Sanders, Berna L. Massingill, trans. 张云泉, 贾海鹏, 袁良, <em>并行编程模式</em>. Beijing: China Machine Press (机械工业出版社), Nov. 2014, pp. 120-124</li>
</ul>
C++ Concurrency Patterns #12: condition_variable_any
2019-07-16T00:00:00+00:00
http://dengzuoheng.github.io/cpp-concurrency-pattern-12-condition-variable-any
<h2 id="从condition_variable开始">Starting with condition_variable</h2>
<p>If we read the source of <code class="language-plaintext highlighter-rouge">boost::condition_variable</code>, we find that it is a wrapper around the pthread API; for example, <code class="language-plaintext highlighter-rouge">condition_variable::wait</code> actually calls <code class="language-plaintext highlighter-rouge">pthread_cond_wait</code>. <code class="language-plaintext highlighter-rouge">pthread_cond_wait</code> naturally accepts only a <code class="language-plaintext highlighter-rouge">pthread_mutex_t</code>, and consequently <code class="language-plaintext highlighter-rouge">condition_variable::wait</code> accepts only a <code class="language-plaintext highlighter-rouge">unique_lock<mutex></code>.</p>
<p>It takes a <code class="language-plaintext highlighter-rouge">unique_lock</code> rather than a <code class="language-plaintext highlighter-rouge">mutex</code> because in C++ <code class="language-plaintext highlighter-rouge">Lock</code> and <code class="language-plaintext highlighter-rouge">Mutex</code> are different concepts. For reasons of space we will not go into the details; here we simply take a <code class="language-plaintext highlighter-rouge">Lock</code> to be a <code class="language-plaintext highlighter-rouge">Mutex</code> plus <code class="language-plaintext highlighter-rouge">owns_lock</code>, and the semantics of <code class="language-plaintext highlighter-rouge">condition_variable</code> require that a thread trying to wait through a <code class="language-plaintext highlighter-rouge">condition_variable</code> holds the lock.</p>
<p>With that, <code class="language-plaintext highlighter-rouge">condition_variable::wait</code> can be written simply as:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">inline</span> <span class="kt">void</span> <span class="n">condition_variable</span><span class="o">::</span><span class="n">wait</span><span class="p">(</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">mutex</span><span class="o">>&</span> <span class="n">m</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">m</span><span class="p">.</span><span class="n">owns_lock</span><span class="p">())</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">throw_exception</span><span class="p">(</span><span class="n">condition_error</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="s">"mutex not owned"</span><span class="p">));</span>
<span class="p">}</span>
<span class="kt">int</span> <span class="n">res</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">pthread_mutex_t</span><span class="o">*</span> <span class="n">the_mutex</span> <span class="o">=</span> <span class="n">m</span><span class="p">.</span><span class="n">mutex</span><span class="p">()</span><span class="o">-></span><span class="n">native_handle</span><span class="p">();</span>
<span class="n">pthread_cond_t</span><span class="o">*</span> <span class="n">this_cond</span> <span class="o">=</span> <span class="k">this</span><span class="o">-></span><span class="n">native_handle</span><span class="p">();</span>
<span class="n">res</span> <span class="o">=</span> <span class="n">pthread_cond_wait</span><span class="p">(</span><span class="n">this_cond</span><span class="p">,</span> <span class="n">the_mutex</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">res</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">throw_exception</span><span class="p">(</span><span class="n">condition_error</span><span class="p">(</span><span class="n">res</span><span class="p">,</span> <span class="s">"failed in pthread_cond_wait"</span><span class="p">));</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
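<p>In everyday use this requirement causes no trouble. But once we want to implement <code class="language-plaintext highlighter-rouge">boost::wait_for_any</code> and other unusual things, we need custom locks, for example one that locks several objects (such as several mutexes) at once. <code class="language-plaintext highlighter-rouge">condition_variable</code> does not accept such custom locks.</p>
<p>Fortunately both Boost and the standard library provide <code class="language-plaintext highlighter-rouge">condition_variable_any</code>, which accepts any object that satisfies the Lock concept. Clearly such a <code class="language-plaintext highlighter-rouge">condition_variable_any</code> cannot be a thin wrapper over the system API, so how is it implemented?</p>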
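<p>To make "a custom lock that locks several mutexes" concrete, here is a minimal sketch. It uses <code class="language-plaintext highlighter-rouge">std::condition_variable_any</code> rather than the Boost one, and <code class="language-plaintext highlighter-rouge">dual_lock</code> and <code class="language-plaintext highlighter-rouge">wait_with_dual_lock</code> are hypothetical names invented for this illustration, not part of any library.</p>

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical lock type: treats two mutexes as a single lockable unit.
class dual_lock {
public:
    dual_lock(std::mutex& a, std::mutex& b) : a_(a), b_(b) {}
    void lock() { std::lock(a_, b_); }  // acquires both without deadlocking
    void unlock() { a_.unlock(); b_.unlock(); }
private:
    std::mutex& a_;
    std::mutex& b_;
};

// Waits on a condition_variable_any while holding both mutexes.
int wait_with_dual_lock() {
    std::mutex m1, m2;
    std::condition_variable_any cond;
    bool ready = false;
    int value = 0;

    std::thread producer([&] {
        dual_lock both(m1, m2);
        std::lock_guard<dual_lock> hold(both);  // works with any BasicLockable
        value = 42;
        ready = true;
        cond.notify_one();
    });

    dual_lock both(m1, m2);
    both.lock();
    cond.wait(both, [&] { return ready; });  // unlocks both mutexes while waiting
    int result = value;
    both.unlock();
    producer.join();
    return result;
}
```

<p>The same <code class="language-plaintext highlighter-rouge">dual_lock</code> would be rejected by <code class="language-plaintext highlighter-rouge">std::condition_variable</code>, whose <code class="language-plaintext highlighter-rouge">wait</code> only takes <code class="language-plaintext highlighter-rouge">std::unique_lock&lt;std::mutex&gt;</code>.</p>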
<h2 id="实现condition_variable_any">Implementing condition_variable_any</h2>
<p><code class="language-plaintext highlighter-rouge">condition_variable_any</code> accepts a lock of any type. Its interface looks like this:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">condition_variable_any</span>
<span class="p">{</span>
<span class="nl">public:</span>
<span class="n">condition_variable_any</span><span class="p">();</span>
<span class="o">~</span><span class="n">condition_variable_any</span><span class="p">();</span>
<span class="nl">public:</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Lock</span><span class="p">></span>
<span class="kt">void</span> <span class="n">wait</span><span class="p">(</span><span class="n">Lock</span><span class="o">&</span> <span class="n">m</span><span class="p">);</span>
<span class="kt">void</span> <span class="n">notify_one</span><span class="p">();</span>
<span class="kt">void</span> <span class="n">notify_all</span><span class="p">();</span>
<span class="p">};</span>
</code></pre></div></div>
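<p>To implement this unusual <code class="language-plaintext highlighter-rouge">wait</code>, we first need to know what the <code class="language-plaintext highlighter-rouge">wait</code> of <code class="language-plaintext highlighter-rouge">condition_variable</code> does.</p>
<p>Semantically, wait has three steps: unlock, wait, relock. That sounds simple enough, and we can write one off the cuff:</p>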
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// buggy version 1</span>
<span class="k">class</span> <span class="nc">condition_variable_any</span>
<span class="p">{</span>
<span class="nl">public:</span>
<span class="n">condition_variable_any</span><span class="p">()</span> <span class="p">{}</span>
<span class="o">~</span><span class="n">condition_variable_any</span><span class="p">()</span> <span class="p">{}</span>
<span class="nl">public:</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Lock</span><span class="p">></span>
<span class="kt">void</span> <span class="n">wait</span><span class="p">(</span><span class="n">Lock</span><span class="o">&</span> <span class="n">external</span><span class="p">)</span> <span class="p">{</span>
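<span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">></span> <span class="n">lk</span><span class="p">(</span><span class="n">m_mutex</span><span class="p">);</span>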
<span class="n">external</span><span class="p">.</span><span class="n">unlock</span><span class="p">();</span>
<span class="n">m_cond</span><span class="p">.</span><span class="n">wait</span><span class="p">(</span><span class="n">lk</span><span class="p">);</span>
<span class="n">external</span><span class="p">.</span><span class="n">lock</span><span class="p">();</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">notify_one</span><span class="p">()</span> <span class="p">{</span>
<span class="n">m_cond</span><span class="p">.</span><span class="n">notify_one</span><span class="p">();</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">notify_all</span><span class="p">()</span> <span class="p">{</span>
<span class="n">m_cond</span><span class="p">.</span><span class="n">notify_all</span><span class="p">();</span>
<span class="p">}</span>
<span class="nl">private:</span>
<span class="n">boost</span><span class="o">::</span><span class="n">mutex</span> <span class="n">m_mutex</span><span class="p">;</span>
<span class="n">boost</span><span class="o">::</span><span class="n">condition_variable</span> <span class="n">m_cond</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>
<p>What is wrong with this? The <code class="language-plaintext highlighter-rouge">condition_variable::wait</code> we implemented in the previous section can throw. If it does, <code class="language-plaintext highlighter-rouge">condition_variable_any::wait</code> exits with <code class="language-plaintext highlighter-rouge">external</code> still unlocked, while the caller expects to hold the lock whenever <code class="language-plaintext highlighter-rouge">wait</code> returns.</p>
<p>To fix this we can write an RAII guard that unlocks on construction and relocks on destruction:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Lock</span><span class="p">></span>
<span class="k">struct</span> <span class="nc">relock_guard</span> <span class="p">{</span>
<span class="n">Lock</span><span class="o">&</span> <span class="n">lk</span><span class="p">;</span>
<span class="n">relock_guard</span><span class="p">(</span><span class="n">Lock</span><span class="o">&</span> <span class="n">_lk</span><span class="p">)</span> <span class="o">:</span> <span class="n">lk</span><span class="p">(</span><span class="n">_lk</span><span class="p">)</span> <span class="p">{</span>
<span class="n">lk</span><span class="p">.</span><span class="n">unlock</span><span class="p">();</span>
<span class="p">}</span>
<span class="o">~</span><span class="n">relock_guard</span><span class="p">()</span> <span class="p">{</span>
<span class="n">lk</span><span class="p">.</span><span class="n">lock</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
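<p>The effect of such a guard can be checked in isolation. The following is a minimal sketch using <code class="language-plaintext highlighter-rouge">std::mutex</code> and <code class="language-plaintext highlighter-rouge">std::unique_lock</code> in place of the Boost types; the function <code class="language-plaintext highlighter-rouge">relocks_after_throw</code> and the simulated failure are assumptions made for the illustration.</p>

```cpp
#include <mutex>
#include <stdexcept>

// Same shape as the guard above: unlock on construction, relock on destruction.
template <typename Lock>
struct relock_guard {
    Lock& lk;
    explicit relock_guard(Lock& l) : lk(l) { lk.unlock(); }
    ~relock_guard() { lk.lock(); }
};

// Returns true when the external lock is owned again after the
// "wait" step fails with an exception.
bool relocks_after_throw() {
    std::mutex m;
    std::unique_lock<std::mutex> external(m);
    bool unlocked_inside = false;
    try {
        relock_guard<std::unique_lock<std::mutex>> guard(external);
        unlocked_inside = !external.owns_lock();  // released while "waiting"
        throw std::runtime_error("simulated wait failure");
    } catch (const std::runtime_error&) {
        // guard's destructor has already run during stack unwinding
    }
    return unlocked_inside && external.owns_lock();
}
```

<p>Stack unwinding runs the guard's destructor, so the caller gets the lock back on both the normal and the exceptional path.</p>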
<p>This makes us much more exception-safe:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// buggy version 2</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Lock</span><span class="p">></span>
<span class="kt">void</span> <span class="n">condition_variable_any</span><span class="o">::</span><span class="n">wait</span><span class="p">(</span><span class="n">Lock</span><span class="o">&</span> <span class="n">external</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">></span> <span class="n">lk</span><span class="p">(</span><span class="n">m_mutex</span><span class="p">);</span>
<span class="n">relock_guard</span><span class="o"><</span><span class="n">Lock</span><span class="o">></span> <span class="n">guard</span><span class="p">(</span><span class="n">external</span><span class="p">);</span>
<span class="n">m_cond</span><span class="p">.</span><span class="n">wait</span><span class="p">(</span><span class="n">lk</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>However, there is still a problem. Condition variable semantics require that, when <code class="language-plaintext highlighter-rouge">wait</code> is called, the unlock and wait steps be indivisible. Our <code class="language-plaintext highlighter-rouge">wait</code> above does hold a lock protecting the internal state of <code class="language-plaintext highlighter-rouge">condition_variable_any</code>, but <code class="language-plaintext highlighter-rouge">notify_one/notify_all</code> never acquire that lock, which leads to a race condition.</p>
<p>Consider threads a and b. At some moment thread a enters <code class="language-plaintext highlighter-rouge">condition_variable_any::wait</code>, locks <code class="language-plaintext highlighter-rouge">m_mutex</code>, unlocks <code class="language-plaintext highlighter-rouge">external</code>, and is then preempted. Thread b is scheduled onto the CPU and calls <code class="language-plaintext highlighter-rouge">notify_one</code>; since it takes no lock, <code class="language-plaintext highlighter-rouge">notify_one</code> runs to completion unhindered. When thread a is scheduled back and enters <code class="language-plaintext highlighter-rouge">m_cond.wait</code>, it has already missed this notification. The sequence is:</p>
<table>
<thead>
<tr>
<th>thread a</th>
<th>thread b</th>
</tr>
</thead>
<tbody>
<tr>
<td><code class="language-plaintext highlighter-rouge">external.lock()</code></td>
<td> </td>
</tr>
<tr>
<td>check predicate, decide to wait</td>
<td> </td>
</tr>
<tr>
<td>enter <code class="language-plaintext highlighter-rouge">condition_variable_any::wait</code></td>
<td> </td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">boost::unique_lock<boost::mutex> lk(m_mutex)</code></td>
<td> </td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">external.unlock()</code></td>
<td> </td>
</tr>
<tr>
<td> </td>
<td><code class="language-plaintext highlighter-rouge">external.lock()</code></td>
</tr>
<tr>
<td> </td>
<td>change predicate, decide to wake thread a</td>
</tr>
<tr>
<td> </td>
<td>enter <code class="language-plaintext highlighter-rouge">condition_variable_any::notify_one()</code></td>
</tr>
<tr>
<td> </td>
<td><code class="language-plaintext highlighter-rouge">m_cond.notify_one()</code></td>
</tr>
<tr>
<td> </td>
<td>exit <code class="language-plaintext highlighter-rouge">condition_variable_any::notify_one()</code></td>
</tr>
<tr>
<td> </td>
<td><code class="language-plaintext highlighter-rouge">external.unlock()</code></td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">m_cond.wait()</code></td>
<td> </td>
</tr>
</tbody>
</table>
<p>A condition variable requires that, once a thread has entered <code class="language-plaintext highlighter-rouge">wait</code>, no notification issued after <code class="language-plaintext highlighter-rouge">external</code> is unlocked gets missed, so this has to be fixed. The fix is simple: take the lock in <code class="language-plaintext highlighter-rouge">notify_one/notify_all</code>:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// buggy version 3</span>
<span class="kt">void</span> <span class="n">condition_variable_any</span><span class="o">::</span><span class="n">notify_one</span><span class="p">()</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">></span> <span class="n">lk</span><span class="p">(</span><span class="n">m_mutex</span><span class="p">);</span>
<span class="n">m_cond</span><span class="p">.</span><span class="n">notify_one</span><span class="p">();</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">condition_variable_any</span><span class="o">::</span><span class="n">notify_all</span><span class="p">()</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">></span> <span class="n">lk</span><span class="p">(</span><span class="n">m_mutex</span><span class="p">);</span>
<span class="n">m_cond</span><span class="p">.</span><span class="n">notify_all</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Now our atomicity is in order.</p>
<p>However, as the scroll bar gives away, this implementation still has a problem.</p>
<p>In buggy version 2, for the sake of atomicity we lock <code class="language-plaintext highlighter-rouge">m_mutex</code> first and unlock <code class="language-plaintext highlighter-rouge">external</code> afterwards. That part is fine. But for exception safety we used RAII, which means <code class="language-plaintext highlighter-rouge">lk</code> is constructed first and <code class="language-plaintext highlighter-rouge">guard</code> second. C++ destroys local variables in reverse order of construction, so <code class="language-plaintext highlighter-rouge">guard</code> is destroyed before <code class="language-plaintext highlighter-rouge">lk</code>. In other words, we relock <code class="language-plaintext highlighter-rouge">external</code> first and unlock <code class="language-plaintext highlighter-rouge">m_mutex</code> afterwards.</p>
<p>Does that sound like a deadlock waiting to happen? It is: this can deadlock!</p>
<p>Consider threads a and b. At some moment thread a is inside <code class="language-plaintext highlighter-rouge">m_cond.wait</code>, gets woken, and returns from <code class="language-plaintext highlighter-rouge">m_cond.wait</code>. At this point <code class="language-plaintext highlighter-rouge">external</code> is unlocked and thread a holds <code class="language-plaintext highlighter-rouge">m_mutex</code>. Thread a is then preempted. Thread b is scheduled onto the CPU, locks <code class="language-plaintext highlighter-rouge">external</code>, enters <code class="language-plaintext highlighter-rouge">condition_variable_any::wait</code> or <code class="language-plaintext highlighter-rouge">condition_variable_any::notify_one</code>, and tries to acquire <code class="language-plaintext highlighter-rouge">m_mutex</code>. Thread a already holds <code class="language-plaintext highlighter-rouge">m_mutex</code>, so thread b cannot get it. But because thread b holds <code class="language-plaintext highlighter-rouge">external</code>, thread a cannot relock <code class="language-plaintext highlighter-rouge">external</code>, so <code class="language-plaintext highlighter-rouge">wait</code> cannot finish, <code class="language-plaintext highlighter-rouge">lk</code> cannot be destroyed, and <code class="language-plaintext highlighter-rouge">m_mutex</code> cannot be unlocked. The two threads are happily deadlocked. The sequence is:</p>
<table>
<thead>
<tr>
<th>thread a</th>
<th>thread b</th>
</tr>
</thead>
<tbody>
<tr>
<td><code class="language-plaintext highlighter-rouge">external.lock()</code></td>
<td> </td>
</tr>
<tr>
<td>check predicate, decide to wait</td>
<td> </td>
</tr>
<tr>
<td>enter <code class="language-plaintext highlighter-rouge">condition_variable_any::wait</code></td>
<td> </td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">boost::unique_lock<boost::mutex> lk(m_mutex)</code></td>
<td> </td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">external.unlock()</code></td>
<td> </td>
</tr>
<tr>
<td>enter <code class="language-plaintext highlighter-rouge">m_cond.wait()</code></td>
<td> </td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">m_mutex.unlock()</code> for system cond wait</td>
<td> </td>
</tr>
<tr>
<td><strong><code class="language-plaintext highlighter-rouge">m_mutex.lock()</code> for system cond wake</strong></td>
<td><strong><code class="language-plaintext highlighter-rouge">external.lock()</code></strong></td>
</tr>
<tr>
<td> </td>
<td>change predicate, decide to wake thread a</td>
</tr>
<tr>
<td> </td>
<td>enter <code class="language-plaintext highlighter-rouge">condition_variable_any::notify_one()</code></td>
</tr>
<tr>
<td> </td>
<td><strong>going to <code class="language-plaintext highlighter-rouge">m_mutex.lock()</code></strong></td>
</tr>
<tr>
<td><strong>going to <code class="language-plaintext highlighter-rouge">external.lock()</code></strong></td>
<td> </td>
</tr>
</tbody>
</table>
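<p>The destructor ordering behind this deadlock can be checked without any threads. A minimal sketch, assuming a hypothetical <code class="language-plaintext highlighter-rouge">tracer</code> type that merely records when it is constructed and destroyed; the two locals stand in for <code class="language-plaintext highlighter-rouge">lk</code> and <code class="language-plaintext highlighter-rouge">guard</code> in buggy version 2:</p>

```cpp
#include <string>

// Hypothetical helper: appends "+name " on construction and "-name " on destruction.
struct tracer {
    std::string& log;
    std::string name;
    tracer(std::string& l, const std::string& n) : log(l), name(n) {
        log += "+" + name + " ";
    }
    ~tracer() { log += "-" + name + " "; }
};

// Mirrors the declaration order inside the buggy wait().
std::string trace_scope() {
    std::string log;
    {
        tracer lk(log, "lk");        // stands in for locking m_mutex
        tracer guard(log, "guard");  // stands in for unlocking external
    }  // locals are destroyed in reverse order of construction
    return log;
}
```

<p>The recorded order is <code class="language-plaintext highlighter-rouge">+lk +guard -guard -lk</code>: the stand-in for <code class="language-plaintext highlighter-rouge">guard</code> is destroyed (relocking <code class="language-plaintext highlighter-rouge">external</code>) before the stand-in for <code class="language-plaintext highlighter-rouge">lk</code> (unlocking <code class="language-plaintext highlighter-rouge">m_mutex</code>).</p>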
<p>So we need to release <code class="language-plaintext highlighter-rouge">m_mutex</code> earlier: unlock <code class="language-plaintext highlighter-rouge">m_mutex</code> first, then relock <code class="language-plaintext highlighter-rouge">external</code>:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// good</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Lock</span><span class="p">></span>
<span class="kt">void</span> <span class="n">condition_variable_any</span><span class="o">::</span><span class="n">wait</span><span class="p">(</span><span class="n">Lock</span><span class="o">&</span> <span class="n">external</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">></span> <span class="n">lk</span><span class="p">(</span><span class="n">m_mutex</span><span class="p">);</span>
<span class="n">relock_guard</span><span class="o"><</span><span class="n">Lock</span><span class="o">></span> <span class="n">guard</span><span class="p">(</span><span class="n">external</span><span class="p">);</span>
<span class="n">boost</span><span class="o">::</span><span class="n">lock_guard</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">></span> <span class="o">></span> <span class="n">unlocker</span><span class="p">(</span><span class="n">lk</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">adopt_lock</span><span class="p">);</span>
<span class="n">m_cond</span><span class="p">.</span><span class="n">wait</span><span class="p">(</span><span class="n">lk</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
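<p>Because <code class="language-plaintext highlighter-rouge">unlocker</code> is constructed last, it is destroyed first and releases <code class="language-plaintext highlighter-rouge">m_mutex</code> before <code class="language-plaintext highlighter-rouge">guard</code> relocks <code class="language-plaintext highlighter-rouge">external</code>. Only now do we have a safe and reliable <code class="language-plaintext highlighter-rouge">condition_variable_any</code>.</p>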
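<p>Putting the pieces together, the whole class can be sketched on top of the standard library. This is an illustration under assumptions, not the Boost or standard library source: it substitutes <code class="language-plaintext highlighter-rouge">std::mutex</code> and <code class="language-plaintext highlighter-rouge">std::condition_variable</code> for the Boost types, and the names <code class="language-plaintext highlighter-rouge">my_condition_variable_any</code> and <code class="language-plaintext highlighter-rouge">ping_once</code> are made up for the example.</p>

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

template <typename Lock>
struct relock_guard {
    Lock& lk;
    explicit relock_guard(Lock& l) : lk(l) { lk.unlock(); }
    ~relock_guard() { lk.lock(); }
};

// Hypothetical name, chosen to avoid clashing with std::condition_variable_any.
class my_condition_variable_any {
public:
    template <typename Lock>
    void wait(Lock& external) {
        std::unique_lock<std::mutex> lk(m_mutex);  // 1. lock the internal mutex
        relock_guard<Lock> guard(external);        // 2. unlock external
        // 3. adopt lk so that m_mutex is released BEFORE guard relocks external
        std::lock_guard<std::unique_lock<std::mutex>> unlocker(lk, std::adopt_lock);
        m_cond.wait(lk);
    }
    void notify_one() {
        std::lock_guard<std::mutex> lk(m_mutex);
        m_cond.notify_one();
    }
    void notify_all() {
        std::lock_guard<std::mutex> lk(m_mutex);
        m_cond.notify_all();
    }
private:
    std::mutex m_mutex;
    std::condition_variable m_cond;
};

// One waiter, one notifier; returns 1 once the waiter has observed the flag.
int ping_once() {
    my_condition_variable_any cond;
    std::mutex m;
    bool ready = false;
    int seen = 0;
    std::thread waiter([&] {
        std::unique_lock<std::mutex> external(m);
        while (!ready) cond.wait(external);  // loop guards against spurious wake-ups
        seen = 1;
    });
    {
        std::lock_guard<std::mutex> hold(m);
        ready = true;
    }
    cond.notify_one();
    waiter.join();
    return seen;
}
```

<p>The waiter rechecks the predicate in a loop, as with any condition variable, so the sketch does not depend on which thread happens to run first.</p>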
<h2 id="总结">Summary</h2>
<p>By now it is probably clear why both the standard library and Boost provide <code class="language-plaintext highlighter-rouge">condition_variable_any</code> instead of leaving it to users: writing a correct <code class="language-plaintext highlighter-rouge">condition_variable_any</code> really is not easy. You have to consider exception safety, the atomicity semantics of <code class="language-plaintext highlighter-rouge">unlock/wait</code>, and a possible deadlock when leaving <code class="language-plaintext highlighter-rouge">wait</code>. It is only a few lines of code in total, yet even professionals slip up easily.</p>
<p>Incidentally, the extra internal mutex likely costs some performance. So although <code class="language-plaintext highlighter-rouge">condition_variable_any</code> is convenient and works with any lock type, <code class="language-plaintext highlighter-rouge">condition_variable</code> may perform better when it only needs to work with <code class="language-plaintext highlighter-rouge">unique_lock<mutex></code> [2].</p>
<p><strong>Reference:</strong></p>
<ul>
<li class="ref">[1] Howard E. Hinnant, <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2406.html">Mutex, Lock, Condition Variable Rationale</a>, Sept. 2007</li>
<li class="ref">[2] cppreference, <a href="https://en.cppreference.com/w/cpp/thread/condition_variable_any">std::condition_variable_any</a>, Jan. 2019</li>
</ul>
C++ Concurrency Patterns #11: Extending future - async/then/when_any/when_all
2019-05-19T00:00:00+00:00
http://dengzuoheng.github.io/cpp-concurrency-pattern-11-extended-future
<h2 id="从boostasync开始">Starting from boost::async</h2>
<p>Earlier we had this example of using a future:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">boost</span><span class="o">::</span><span class="n">promise</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">pr</span><span class="p">;</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f</span> <span class="o">=</span> <span class="n">pr</span><span class="p">.</span><span class="n">get_future</span><span class="p">();</span>
<span class="n">boost</span><span class="o">::</span><span class="kr">thread</span> <span class="nf">tr</span><span class="p">([</span><span class="o">&</span><span class="p">]()</span> <span class="p">{</span>
<span class="n">pr</span><span class="p">.</span><span class="n">set_value</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span>
<span class="p">});</span>
<span class="n">assert</span><span class="p">(</span><span class="n">f</span><span class="p">.</span><span class="n">get</span><span class="p">()</span> <span class="o">==</span> <span class="mi">42</span><span class="p">);</span>
</code></pre></div></div>
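<p>As in this example, a worker thread usually uses the future only to return a single result, and once that result is delivered the worker is done. So what we really want is a function (or something similar) that creates the promise, starts the thread, and simply hands us the future. For example:</p>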
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">F</span><span class="p">></span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">async</span><span class="p">(</span><span class="n">F</span><span class="o">&&</span> <span class="n">func</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">promise</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">pr</span><span class="p">;</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">f</span> <span class="o">=</span> <span class="n">pr</span><span class="p">.</span><span class="n">get_future</span><span class="p">();</span>
<span class="c1">// move the callable into the thread; a reference would dangle once async() returns</span>
<span class="n">boost</span><span class="o">::</span><span class="kr">thread</span> <span class="n">tr</span><span class="p">([</span><span class="n">p</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">pr</span><span class="p">),</span> <span class="n">fn</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">forward</span><span class="o"><</span><span class="n">F</span><span class="o">></span><span class="p">(</span><span class="n">func</span><span class="p">)]()</span> <span class="k">mutable</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">p</span><span class="p">.</span><span class="n">set_value</span><span class="p">(</span><span class="n">fn</span><span class="p">());</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="n">p</span><span class="p">.</span><span class="n">set_exception</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">current_exception</span><span class="p">());</span>
<span class="p">}</span>
<span class="p">});</span>
<span class="n">tr</span><span class="p">.</span><span class="n">detach</span><span class="p">();</span>
<span class="k">return</span> <span class="n">f</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f</span> <span class="o">=</span> <span class="n">async</span><span class="o"><</span><span class="kt">int</span><span class="o">></span><span class="p">([](){</span> <span class="k">return</span> <span class="mi">42</span><span class="p">;});</span>
<span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o"><<</span> <span class="n">f</span><span class="p">.</span><span class="n">get</span><span class="p">()</span> <span class="o"><<</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
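<p>Here <code class="language-plaintext highlighter-rouge">async</code> is just a name and has nothing to do with async/await in C#. The similar function in Qt, for example, is called <code class="language-plaintext highlighter-rouge">QtConcurrent::run</code>.</p>
<p>Of course, Boost's <code class="language-plaintext highlighter-rouge">async</code> is not this simple. First, Boost cannot rely on such a recent flavor of lambda expressions (the init-capture above needs C++14). Second, Boost's <code class="language-plaintext highlighter-rouge">async</code> has to forward the arguments of the asynchronous function. Third, there are launch policies.</p>
<p>Launch policies are a complicated matter. Boost has several, mainly <code class="language-plaintext highlighter-rouge">boost::launch::async</code> and <code class="language-plaintext highlighter-rouge">boost::launch::deferred</code>. <code class="language-plaintext highlighter-rouge">boost::launch::async</code> immediately starts a thread to run the asynchronous function, while <code class="language-plaintext highlighter-rouge">boost::launch::deferred</code> runs it on the current thread only when the result is waited for or retrieved (Boost 1.62). The policies are bit flags combined with bitwise OR; when several are set, a priority decides among them. See the documentation [1] for details.</p>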
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">async</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">launch</span><span class="o">::</span><span class="n">deferred</span><span class="p">,</span> <span class="p">[](){</span> <span class="k">return</span> <span class="mi">42</span><span class="p">;});</span>
</code></pre></div></div>
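<p>The "runs on the current thread when the result is retrieved" behavior of a deferred launch can be observed directly. The sketch below uses <code class="language-plaintext highlighter-rouge">std::async</code> with <code class="language-plaintext highlighter-rouge">std::launch::deferred</code> as a stand-in for the Boost call, on the assumption that the two behave alike here; <code class="language-plaintext highlighter-rouge">deferred_runs_on_caller</code> is a hypothetical helper name.</p>

```cpp
#include <future>
#include <thread>

// Returns true when the deferred task ran on the thread that called get().
bool deferred_runs_on_caller() {
    std::thread::id ran_on;  // default-constructed: "no thread"
    std::future<int> f = std::async(std::launch::deferred, [&ran_on] {
        ran_on = std::this_thread::get_id();
        return 42;
    });
    // Nothing has executed yet; the task starts inside get().
    int value = f.get();
    return value == 42 && ran_on == std::this_thread::get_id();
}
```

<p>With <code class="language-plaintext highlighter-rouge">std::launch::async</code> the recorded id would instead belong to a different thread.</p>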
<p>In newer versions of Boost, launch policy is no longer the only thing to consider: we can also pass an executor instance (here we treat the executor as a kind of policy too):</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">boost</span><span class="o">::</span><span class="n">executors</span><span class="o">::</span><span class="n">basic_thread_pool</span> <span class="n">pool</span><span class="p">;</span>
<span class="n">boost</span><span class="o">::</span><span class="n">future</span><span class="o"><</span><span class="kt">int</span><span class="o">></span> <span class="n">f</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">async</span><span class="p">(</span><span class="n">pool</span><span class="p">,</span> <span class="p">[](){</span> <span class="k">return</span> <span class="mi">42</span><span class="p">;});</span>
<span class="n">assert</span><span class="p">(</span><span class="mi">42</span> <span class="o">==</span> <span class="n">f</span><span class="p">.</span><span class="n">get</span><span class="p">());</span>
</code></pre></div></div>
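<p>To support such a complex <code class="language-plaintext highlighter-rouge">boost::async</code>, our original future implementation is no longer enough and needs many new features. Historically Boost also took the opportunity to refactor future [2], improving the naming among other things. Below we write the new version of future from scratch.</p>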
<h2 id="async-with-policy">async with policy</h2>
<h3 id="重构future">Refactoring future</h3>
<p>The basic structure is the same as before. There is still a class that maintains the state of the future, which our earlier posts called <code class="language-plaintext highlighter-rouge">future_object_base</code> and which Boost now gives the better name <code class="language-plaintext highlighter-rouge">shared_state_base</code>. There is also one that stores the result, previously called <code class="language-plaintext highlighter-rouge">future_object</code> and now renamed <code class="language-plaintext highlighter-rouge">shared_state</code>. Their data members can stay unchanged for now:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">struct</span> <span class="nc">shared_state_base</span> <span class="o">:</span> <span class="n">boost</span><span class="o">::</span><span class="n">enable_shared_from_this</span><span class="o"><</span><span class="n">shared_state_base</span><span class="o">></span> <span class="p">{</span>
<span class="k">typedef</span> <span class="n">std</span><span class="o">::</span><span class="n">list</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">condition_variable_any</span><span class="o">*></span> <span class="n">waiter_list</span><span class="p">;</span>
<span class="k">typedef</span> <span class="n">waiter_list</span><span class="o">::</span><span class="n">iterator</span> <span class="n">notify_when_ready_handle</span><span class="p">;</span>
<span class="n">boost</span><span class="o">::</span><span class="n">exception_ptr</span> <span class="n">exception</span><span class="p">;</span>
<span class="kt">bool</span> <span class="n">done</span><span class="p">;</span>
<span class="k">mutable</span> <span class="n">boost</span><span class="o">::</span><span class="n">mutex</span> <span class="n">mutex</span><span class="p">;</span>
<span class="n">boost</span><span class="o">::</span><span class="n">condition_variable</span> <span class="n">cond</span><span class="p">;</span>
<span class="n">waiter_list</span> <span class="n">external_waiters</span><span class="p">;</span>
<span class="c1">// ...</span>
<span class="p">};</span>
<span class="k">template</span> <span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">struct</span> <span class="nc">shared_state</span><span class="o">:</span> <span class="n">shared_state_base</span> <span class="p">{</span>
<span class="k">typedef</span> <span class="n">boost</span><span class="o">::</span><span class="n">unique_ptr</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">storage_type</span><span class="p">;</span>
<span class="n">storage_type</span> <span class="n">result</span><span class="p">;</span>
<span class="c1">// ...</span>
<span class="p">};</span>
</code></pre></div></div>
<p>In the new version, <code class="language-plaintext highlighter-rouge">unique_future</code> is renamed to <code class="language-plaintext highlighter-rouge">future</code>. But <code class="language-plaintext highlighter-rouge">future</code> itself does not hold the <code class="language-plaintext highlighter-rouge">shared_state</code> instance; its parent class <code class="language-plaintext highlighter-rouge">basic_future</code> does. <code class="language-plaintext highlighter-rouge">basic_future</code> even has a type-erased parent class <code class="language-plaintext highlighter-rouge">base_future</code>, although this <code class="language-plaintext highlighter-rouge">base_future</code> serves no purpose whatsoever:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">base_future</span> <span class="p">{};</span>
<span class="k">template</span> <span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">class</span> <span class="nc">basic_future</span> <span class="o">:</span> <span class="k">public</span> <span class="n">base_future</span> <span class="p">{</span>
<span class="nl">public:</span>
<span class="k">typedef</span> <span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="n">future_ptr</span><span class="p">;</span>
<span class="n">future_ptr</span> <span class="n">m_future</span><span class="p">;</span>
<span class="n">basic_future</span><span class="p">(</span><span class="n">future_ptr</span> <span class="n">shared_state</span><span class="p">)</span><span class="o">:</span> <span class="n">m_future</span><span class="p">(</span><span class="n">shared_state</span><span class="p">)</span> <span class="p">{}</span>
<span class="c1">// ...</span>
<span class="p">};</span>
<span class="k">template</span> <span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">class</span> <span class="nc">future</span> <span class="o">:</span> <span class="k">public</span> <span class="n">basic_future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="p">{</span>
<span class="k">friend</span> <span class="k">class</span> <span class="nc">promise</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">;</span>
<span class="k">friend</span> <span class="k">class</span> <span class="nc">shared_future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">;</span>
<span class="c1">// ...</span>
<span class="p">};</span>
<span class="k">template</span> <span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">class</span> <span class="nc">promise</span> <span class="p">{</span>
<span class="k">typedef</span> <span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="n">future_ptr</span><span class="p">;</span>
<span class="n">future_ptr</span> <span class="n">m_future</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>
<h3 id="async函数">async函数</h3>
<p>Boost supports many launch policies, and we will implement them one by one. To keep things simple, we start with <code class="language-plaintext highlighter-rouge">policy_async</code> and <code class="language-plaintext highlighter-rouge">policy_deferred</code>, and look at what interface has to be added to the future so that <code class="language-plaintext highlighter-rouge">async</code> can accept a launch policy.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">enum</span> <span class="n">launch_policy</span> <span class="p">{</span>
<span class="n">policy_none</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span>
<span class="n">policy_async</span> <span class="o">=</span> <span class="mi">1</span><span class="p">,</span>
<span class="n">policy_deferred</span> <span class="o">=</span> <span class="mi">2</span><span class="p">,</span>
<span class="n">policy_executor</span> <span class="o">=</span> <span class="mi">4</span><span class="p">,</span>
<span class="n">policy_inherit</span> <span class="o">=</span> <span class="mi">8</span><span class="p">,</span>
<span class="n">policy_sync</span> <span class="o">=</span> <span class="mi">16</span><span class="p">,</span>
<span class="n">policy_any</span> <span class="o">=</span> <span class="n">policy_async</span> <span class="o">|</span> <span class="n">policy_deferred</span>
<span class="p">};</span>
</code></pre></div></div>
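<p>Because the enumerators are distinct bits, a combined value such as <code class="language-plaintext highlighter-rouge">policy_any</code> can be tested with a bitwise AND. The sketch below repeats the enum and adds two helpers, <code class="language-plaintext highlighter-rouge">has_policy</code> and <code class="language-plaintext highlighter-rouge">resolve</code>, which are hypothetical names introduced only for illustration; they show why a request for <code class="language-plaintext highlighter-rouge">policy_any</code> ends up on the async path when async is checked first.</p>

```cpp
#include <cassert>

// Same bit layout as the enum above.
enum launch_policy {
    policy_none = 0,
    policy_async = 1,
    policy_deferred = 2,
    policy_executor = 4,
    policy_inherit = 8,
    policy_sync = 16,
    policy_any = policy_async | policy_deferred
};

// Hypothetical helper: true when `requested` contains the bit of `flag`.
inline bool has_policy(launch_policy requested, launch_policy flag) {
    return (requested & flag) != 0;
}

// Hypothetical helper: checks async first, then deferred, and reports
// policy_none when neither bit is set.
inline launch_policy resolve(launch_policy requested) {
    if (has_policy(requested, policy_async)) return policy_async;
    if (has_policy(requested, policy_deferred)) return policy_deferred;
    return policy_none;
}

inline void policy_demo() {
    assert(has_policy(policy_any, policy_async));
    assert(has_policy(policy_any, policy_deferred));
    assert(resolve(policy_any) == policy_async);       // async wins the tie
    assert(resolve(policy_deferred) == policy_deferred);
    assert(resolve(policy_sync) == policy_none);       // neither bit is set
}
```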
<p>If we restrict things further and accept only <code class="language-plaintext highlighter-rouge">boost::function<T()></code>, the <code class="language-plaintext highlighter-rouge">async</code> function becomes simpler still:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">async</span><span class="p">(</span><span class="n">launch_policy</span> <span class="n">policy</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">T</span><span class="p">()</span><span class="o">></span> <span class="n">func</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">policy</span> <span class="o">&</span> <span class="n">policy_async</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">make_future_async_shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">(</span><span class="n">func</span><span class="p">);</span>
<span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">policy</span> <span class="o">&</span> <span class="n">policy_deferred</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">make_future_deferred_shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">(</span><span class="n">func</span><span class="p">);</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">terminate</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
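<p>The difference between the two policies is easy to observe with the standard library's counterparts. The following sketch assumes C++11 and uses <code class="language-plaintext highlighter-rouge">std::async</code> with <code class="language-plaintext highlighter-rouge">std::launch</code> instead of the hand-written <code class="language-plaintext highlighter-rouge">async</code> above; the helper names are made up for the example. A deferred task runs on the thread that calls <code class="language-plaintext highlighter-rouge">get()</code>, while an async task runs on another thread.</p>

```cpp
#include <cassert>
#include <future>
#include <thread>

// Returns the id of whichever thread ends up running the task.
inline std::thread::id current_id() { return std::this_thread::get_id(); }

// With std::launch::deferred the task runs lazily, on the thread that calls get().
inline bool deferred_runs_on_caller() {
    std::future<std::thread::id> f = std::async(std::launch::deferred, current_id);
    return f.get() == std::this_thread::get_id();
}

// With std::launch::async the task runs on a separate thread.
inline bool async_runs_elsewhere() {
    std::future<std::thread::id> f = std::async(std::launch::async, current_id);
    return f.get() != std::this_thread::get_id();
}

inline void policy_semantics_demo() {
    assert(deferred_runs_on_caller());
    assert(async_runs_elsewhere());
}
```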
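<p>As you can see, it simply picks a factory function based on the policy, and each factory creates a different kind of instance. Let's look at the two factory functions.</p>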
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">make_future_async_shared_state</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">T</span><span class="p">()</span><span class="o">></span> <span class="n">func</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">future_async_shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="n">h</span><span class="p">(</span><span class="k">new</span> <span class="n">future_async_shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">());</span>
<span class="n">h</span><span class="o">-></span><span class="n">init</span><span class="p">(</span><span class="n">func</span><span class="p">);</span>
<span class="k">return</span> <span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">(</span><span class="n">h</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">make_future_deferred_shared_state</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">T</span><span class="p">()</span><span class="o">></span> <span class="n">func</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">future_deferred_shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="n">h</span><span class="p">(</span><span class="k">new</span> <span class="n">future_deferred_shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">(</span><span class="n">func</span><span class="p">));</span>
<span class="k">return</span> <span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">(</span><span class="n">h</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Here <code class="language-plaintext highlighter-rouge">future_async_shared_state</code> and <code class="language-plaintext highlighter-rouge">future_deferred_shared_state</code> both derive from <code class="language-plaintext highlighter-rouge">shared_state</code>. The two factories differ only slightly. For <code class="language-plaintext highlighter-rouge">policy_async</code>, the smart pointer is constructed first and the state is then initialized in a second step (<code class="language-plaintext highlighter-rouge">init</code> is not virtual; the split may be for exception safety, and in Boost the <code class="language-plaintext highlighter-rouge">func</code> here is an rvalue reference). For <code class="language-plaintext highlighter-rouge">policy_deferred</code>, the state is constructed directly from <code class="language-plaintext highlighter-rouge">func</code>. Both build the <code class="language-plaintext highlighter-rouge">future</code> from a smart pointer to the <code class="language-plaintext highlighter-rouge">shared_state</code>.</p>
<p>Going one level deeper, here is the implementation of <code class="language-plaintext highlighter-rouge">future_async_shared_state</code>:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">struct</span> <span class="nc">future_async_shared_state</span><span class="o">:</span> <span class="n">shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="p">{</span>
<span class="k">typedef</span> <span class="n">shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">super</span><span class="p">;</span>
<span class="n">future_async_shared_state</span><span class="p">()</span> <span class="o">:</span> <span class="n">super</span><span class="p">()</span> <span class="p">{}</span>
<span class="kt">void</span> <span class="n">init</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">T</span><span class="p">()</span><span class="o">></span> <span class="n">func</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">future_async_shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="n">self</span><span class="p">;</span>
<span class="n">self</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">static_pointer_cast</span><span class="o"><</span><span class="n">future_async_shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">></span><span class="p">(</span><span class="k">this</span><span class="o">-></span><span class="n">shared_from_this</span><span class="p">());</span>
<span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="kt">void</span><span class="p">()</span><span class="o">></span> <span class="n">task</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">bind</span><span class="p">(</span><span class="o">&</span><span class="n">future_async_shared_state</span><span class="o">::</span><span class="n">run</span><span class="p">,</span> <span class="n">self</span><span class="p">,</span> <span class="n">func</span><span class="p">);</span>
<span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="p">(</span><span class="n">task</span><span class="p">).</span><span class="n">detach</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">static</span> <span class="kt">void</span> <span class="n">run</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">future_async_shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="n">that</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">T</span><span class="p">()</span><span class="o">></span> <span class="n">func</span><span class="p">)</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">that</span><span class="o">-></span><span class="n">mark_finished_with_result</span><span class="p">(</span><span class="n">func</span><span class="p">());</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="n">that</span><span class="o">-></span><span class="n">mark_exceptional_finish</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
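<p>One practical consequence of calling <code class="language-plaintext highlighter-rouge">shared_from_this()</code> inside <code class="language-plaintext highlighter-rouge">init</code> is that the object must already be owned by a smart pointer, which fits the construct-then-init order used by the factory. The sketch below uses the std equivalents; <code class="language-plaintext highlighter-rouge">state</code> and <code class="language-plaintext highlighter-rouge">two_step_demo</code> are hypothetical names, and a plain <code class="language-plaintext highlighter-rouge">std::function</code> stands in for the detached thread. It shows that the captured self-reference keeps the state alive after the creator drops its handle.</p>

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for future_async_shared_state.
struct state : std::enable_shared_from_this<state> {
    int value;
    state() : value(0) {}

    // Second construction step: only valid once a shared_ptr owns *this.
    // The returned callable plays the role of the thread body.
    std::function<void()> init(int v) {
        std::shared_ptr<state> self = shared_from_this();
        return [self, v]() { self->value = v; };
    }
};

inline int two_step_demo() {
    std::shared_ptr<state> h(new state());     // step 1: construct under a shared_ptr
    std::function<void()> body = h->init(42);  // step 2: capture a self-reference
    std::weak_ptr<state> watch = h;
    h.reset();                                 // the creator lets go...
    assert(!watch.expired());                  // ...but the task keeps the state alive
    body();                                    // "thread" runs and publishes the value
    return watch.lock()->value;
}
```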
<p>Its core methods are <code class="language-plaintext highlighter-rouge">init</code> and <code class="language-plaintext highlighter-rouge">run</code>. <code class="language-plaintext highlighter-rouge">init</code> starts a thread whose body is <code class="language-plaintext highlighter-rouge">run</code>, and what <code class="language-plaintext highlighter-rouge">run</code> does is simple: it executes <code class="language-plaintext highlighter-rouge">func</code> and stores the result in the <code class="language-plaintext highlighter-rouge">shared_state</code>. <code class="language-plaintext highlighter-rouge">mark_finished_with_result</code> works just like <code class="language-plaintext highlighter-rouge">set_value</code> on a <code class="language-plaintext highlighter-rouge">promise</code>:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="kt">void</span> <span class="n">shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">>::</span><span class="n">mark_finished_with_result</span><span class="p">(</span><span class="k">const</span> <span class="n">T</span><span class="o">&</span> <span class="n">res</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">></span> <span class="n">lock</span><span class="p">(</span><span class="k">this</span><span class="o">-></span><span class="n">mutex</span><span class="p">);</span>
<span class="k">this</span><span class="o">-></span><span class="n">mark_finished_with_result_internal</span><span class="p">(</span><span class="n">res</span><span class="p">,</span> <span class="n">lock</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="kt">void</span> <span class="n">shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">>::</span><span class="n">mark_finished_with_result_internal</span><span class="p">(</span><span class="k">const</span> <span class="n">T</span><span class="o">&</span> <span class="n">res</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span> <span class="n">lock</span><span class="p">)</span> <span class="p">{</span>
<span class="n">result</span><span class="p">.</span><span class="n">reset</span><span class="p">(</span><span class="k">new</span> <span class="n">T</span><span class="p">(</span><span class="n">res</span><span class="p">));</span>
<span class="k">this</span><span class="o">-></span><span class="n">mark_finished_internal</span><span class="p">(</span><span class="n">lock</span><span class="p">);</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">shared_state_base</span><span class="o">::</span><span class="nf">mark_finished_internal</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span> <span class="n">lock</span><span class="p">)</span> <span class="p">{</span>
<span class="n">done</span> <span class="o">=</span> <span class="nb">true</span><span class="p">;</span>
<span class="n">cond</span><span class="p">.</span><span class="n">notify_all</span><span class="p">();</span>
<span class="k">for</span> <span class="p">(</span><span class="n">waiter_list</span><span class="o">::</span><span class="n">const_iterator</span> <span class="n">it</span> <span class="o">=</span> <span class="n">external_waiters</span><span class="p">.</span><span class="n">begin</span><span class="p">();</span>
<span class="n">it</span> <span class="o">!=</span> <span class="n">external_waiters</span><span class="p">.</span><span class="n">end</span><span class="p">();</span>
<span class="o">++</span><span class="n">it</span><span class="p">)</span> <span class="p">{</span>
<span class="p">(</span><span class="o">*</span><span class="n">it</span><span class="p">)</span><span class="o">-></span><span class="n">notify_all</span><span class="p">();</span>
<span class="p">}</span>
<span class="c1">// TODO: do_continuation(lock);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>We implemented <code class="language-plaintext highlighter-rouge">mark_finished_internal</code> earlier. The only difference is that when we implement <code class="language-plaintext highlighter-rouge">then</code> later, we will also need <code class="language-plaintext highlighter-rouge">do_continuation</code>, hence the TODO here.</p>
<p>Now for <code class="language-plaintext highlighter-rouge">future_deferred_shared_state</code>. Unlike <code class="language-plaintext highlighter-rouge">policy_async</code>, <code class="language-plaintext highlighter-rouge">policy_deferred</code> means that <code class="language-plaintext highlighter-rouge">func</code> is not executed until the user calls <code class="language-plaintext highlighter-rouge">future::get()</code> or <code class="language-plaintext highlighter-rouge">future::wait()</code>.</p>
<p>To get this behavior, <code class="language-plaintext highlighter-rouge">shared_state</code> or its base class needs special handling in <code class="language-plaintext highlighter-rouge">wait</code> and <code class="language-plaintext highlighter-rouge">get</code>, and to decide when that handling applies we need an extra attribute or flag. The <code class="language-plaintext highlighter-rouge">func</code> that runs at that point has to be invoked through either a callback or a virtual function; Boost uses a virtual function:</p>
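<p>To see the publish-and-notify pattern in isolation, here is a minimal sketch using the std equivalents of the Boost primitives. <code class="language-plaintext highlighter-rouge">mini_state</code> is a hypothetical, trimmed-down class (no exceptions, no external waiters, no continuations), not the shared state built in this series. The public method takes the lock and delegates to an <code class="language-plaintext highlighter-rouge">_internal</code> variant that assumes the lock is already held.</p>

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical, trimmed-down shared state.
template <typename T>
struct mini_state {
    std::mutex mutex;
    std::condition_variable cond;
    bool done;
    std::unique_ptr<T> result;

    mini_state() : done(false) {}

    // Public entry point: takes the lock, then delegates.
    void mark_finished_with_result(const T& res) {
        std::unique_lock<std::mutex> lock(mutex);
        mark_finished_with_result_internal(res, lock);
    }

    // *_internal variants assume the caller already holds `lock`.
    void mark_finished_with_result_internal(const T& res,
                                            std::unique_lock<std::mutex>& lock) {
        assert(lock.owns_lock());
        result.reset(new T(res));
        done = true;
        cond.notify_all();  // wake every thread blocked in get()
    }

    // Blocks until a result has been published, then returns a copy of it.
    T get() {
        std::unique_lock<std::mutex> lock(mutex);
        while (!done) cond.wait(lock);
        return *result;
    }
};

inline int mini_state_demo() {
    mini_state<int> s;
    std::thread producer([&s]() { s.mark_finished_with_result(7); });
    int v = s.get();  // waits for the producer if it has not finished yet
    producer.join();
    return v;
}
```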
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">struct</span> <span class="nc">shared_state</span> <span class="o">:</span> <span class="n">shared_state_base</span> <span class="p">{</span>
<span class="c1">// ...</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">execute</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span><span class="p">)</span> <span class="p">{}</span>
<span class="c1">// ...</span>
<span class="p">};</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">struct</span> <span class="nc">future_deferred_shared_state</span> <span class="o">:</span> <span class="n">shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">T</span><span class="p">()</span><span class="o">></span> <span class="n">m_func</span><span class="p">;</span>
<span class="k">explicit</span> <span class="n">future_deferred_shared_state</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">T</span><span class="p">()</span><span class="o">></span> <span class="n">func</span><span class="p">)</span> <span class="o">:</span> <span class="n">m_func</span><span class="p">(</span><span class="n">func</span><span class="p">)</span> <span class="p">{</span>
<span class="k">this</span><span class="o">-></span><span class="n">shared_state_base</span><span class="o">::</span><span class="n">set_deferred</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">execute</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span> <span class="n">lock</span><span class="p">)</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">lock</span><span class="p">.</span><span class="n">unlock</span><span class="p">();</span>
<span class="n">T</span> <span class="n">res</span> <span class="o">=</span> <span class="n">m_func</span><span class="p">();</span>
<span class="n">lock</span><span class="p">.</span><span class="n">lock</span><span class="p">();</span>
<span class="k">this</span><span class="o">-></span><span class="n">mark_finished_with_result_internal</span><span class="p">(</span><span class="n">res</span><span class="p">,</span> <span class="n">lock</span><span class="p">);</span>
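<span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="c1">// m_func may have thrown while the lock was released; re-acquire it first</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">lock</span><span class="p">.</span><span class="n">owns_lock</span><span class="p">())</span> <span class="n">lock</span><span class="p">.</span><span class="n">lock</span><span class="p">();</span>
<span class="k">this</span><span class="o">-></span><span class="n">mark_exceptional_finish_internal</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">current_exception</span><span class="p">(),</span> <span class="n">lock</span><span class="p">);</span>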
<span class="p">}</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
<p>Note that <code class="language-plaintext highlighter-rouge">execute</code> is called from <code class="language-plaintext highlighter-rouge">wait</code>, so it runs with the lock already held. That is why it calls the <code class="language-plaintext highlighter-rouge">xxx_internal</code> interfaces, which expect the caller to supply the lock. We also want <code class="language-plaintext highlighter-rouge">m_func</code> to run outside the lock, so the lock is released while it executes.</p>
<p>How do we get to <code class="language-plaintext highlighter-rouge">execute</code>? We can trace this starting from <code class="language-plaintext highlighter-rouge">wait</code>:</p>
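<p>Unlocking by hand around user code is easy to get wrong on the exception path, because the lock has to be held again before any <code class="language-plaintext highlighter-rouge">_internal</code> call. One common way to make this robust is a small RAII guard. The sketch below uses std types, and <code class="language-plaintext highlighter-rouge">relocker_sketch</code> and <code class="language-plaintext highlighter-rouge">lock_held_in_catch</code> are hypothetical names for illustration: the guard unlocks on construction and guarantees the lock is re-acquired when the scope exits, whether normally or by an exception.</p>

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical RAII helper: releases the lock on construction and makes sure
// it is held again when the scope exits, even if an exception is thrown.
class relocker_sketch {
    std::unique_lock<std::mutex>& lock_;
public:
    explicit relocker_sketch(std::unique_lock<std::mutex>& lk) : lock_(lk) {
        lock_.unlock();
    }
    ~relocker_sketch() {
        if (!lock_.owns_lock()) lock_.lock();
    }
    // Early re-acquire for the normal path.
    void lock() {
        if (!lock_.owns_lock()) lock_.lock();
    }
    relocker_sketch(const relocker_sketch&) = delete;
    relocker_sketch& operator=(const relocker_sketch&) = delete;
};

// Simulates execute(): runs "user code" outside the lock and reports whether
// the lock is held again at the point where the result would be recorded.
inline bool lock_held_in_catch(bool throwing) {
    std::mutex m;
    std::unique_lock<std::mutex> lock(m);
    try {
        relocker_sketch relock(lock);  // lock released here
        if (throwing) throw std::runtime_error("task failed");
        relock.lock();                 // normal path: re-acquire before publishing
        assert(lock.owns_lock());
        return lock.owns_lock();
    } catch (...) {
        // The guard's destructor ran during unwinding and re-locked.
        return lock.owns_lock();
    }
}
```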
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template <typename T>
class basic_future : public base_future {
public:
typedef boost::shared_ptr<shared_state<T> > future_ptr;
future_ptr m_future;
// ...
void wait() const {
if (!m_future) {
boost::throw_exception(...);
}
m_future->wait(false);
}
};
struct shared_state_base : boost::enable_shared_from_this<shared_state_base> {
// ...
bool is_deferred;
launch_policy policy;
// ...
void wait(bool rethrow = true) {
boost::unique_lock<boost::mutex> lock(this->mutex);
wait_internal(lock, rethrow);
}
void wait_internal(boost::unique_lock<boost::mutex>& lock,
bool rethrow=true) {
if (is_deferred) {
is_deferred = false;
this->execute(lock);
}
while(!done) {
cond.wait(lock);
}
if (rethrow && exception) {
boost::rethrow_exception(exception);
}
}
void set_deferred() {
is_deferred = true;
policy = launch_policy::policy_deferred;
}
};
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">wait_internal</code> sets <code class="language-plaintext highlighter-rouge">is_deferred</code> to <code class="language-plaintext highlighter-rouge">false</code> while holding the lock, which guarantees that <code class="language-plaintext highlighter-rouge">execute</code> runs only once.</p>
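<p>The run-once guarantee can be checked with a small self-contained sketch. It assumes std types and a fixed <code class="language-plaintext highlighter-rouge">int</code> result; <code class="language-plaintext highlighter-rouge">deferred_state</code> and <code class="language-plaintext highlighter-rouge">deferred_run_count</code> are hypothetical names, and exception handling is left out to keep it short. The function does not run at construction time, and repeated calls to <code class="language-plaintext highlighter-rouge">wait</code> or <code class="language-plaintext highlighter-rouge">get</code> run it exactly once.</p>

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>

// Hypothetical, single-type sketch of a deferred shared state.
// Exception handling is intentionally omitted.
struct deferred_state {
    std::mutex mutex;
    std::condition_variable cond;
    bool done;
    bool is_deferred;
    int result;
    std::function<int()> func;

    explicit deferred_state(std::function<int()> f)
        : done(false), is_deferred(true), result(0), func(f) {}

    void wait() {
        std::unique_lock<std::mutex> lock(mutex);
        if (is_deferred) {
            is_deferred = false;  // cleared under the lock: only one caller gets in
            lock.unlock();        // run user code outside the lock
            int r = func();
            lock.lock();
            result = r;
            done = true;
            cond.notify_all();
        }
        while (!done) cond.wait(lock);  // other callers wait for the first one
    }

    int get() {
        wait();
        return result;
    }
};

// Returns how many times the deferred function was actually invoked.
inline int deferred_run_count() {
    int calls = 0;
    deferred_state s([&calls]() { ++calls; return 5; });
    assert(calls == 0);  // nothing runs at construction time
    s.wait();
    s.wait();
    int v = s.get();
    assert(v == 5);
    return calls;
}
```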
<h2 id="then-continuation">then continuation</h2>
<p>While our future is still simple, let's implement a <code class="language-plaintext highlighter-rouge">then</code> that supports only launch policies. As the discussion above shows, the <code class="language-plaintext highlighter-rouge">then</code> operation is called a continuation. To keep things simple we cover only three policies here; <code class="language-plaintext highlighter-rouge">policy_inherit</code> just inherits the policy of <code class="language-plaintext highlighter-rouge">this</code>, and we come back to <code class="language-plaintext highlighter-rouge">policy_executor</code> later.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">class</span> <span class="nc">future</span> <span class="o">:</span> <span class="k">public</span> <span class="n">basic_future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="p">{</span>
<span class="k">friend</span> <span class="k">class</span> <span class="nc">promise</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">;</span>
<span class="k">friend</span> <span class="k">class</span> <span class="nc">shared_future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">;</span>
<span class="c1">// ...</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">R</span><span class="p">></span>
<span class="n">future</span><span class="o"><</span><span class="n">R</span><span class="o">></span> <span class="n">then</span><span class="p">(</span><span class="n">launch_policy</span> <span class="n">policy</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">R</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">)</span><span class="o">></span> <span class="n">func</span><span class="p">)</span> <span class="p">{</span>
<span class="n">assert</span><span class="p">(</span><span class="n">m_future</span><span class="p">);</span>
<span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">shared_state_base</span><span class="o">></span> <span class="n">sentinel</span><span class="p">(</span><span class="n">m_future</span><span class="p">);</span>
<span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">></span> <span class="n">lock</span><span class="p">(</span><span class="n">sentinel</span><span class="o">-></span><span class="n">mutex</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">policy</span> <span class="o">&</span> <span class="n">launch_policy</span><span class="o">::</span><span class="n">policy_async</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">make_future_async_continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span><span class="p">(</span><span class="n">lock</span><span class="p">,</span> <span class="o">*</span><span class="k">this</span><span class="p">,</span> <span class="n">func</span><span class="p">);</span>
<span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">policy</span> <span class="o">&</span> <span class="n">launch_policy</span><span class="o">::</span><span class="n">policy_deferred</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">make_future_deferred_continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span><span class="p">(</span><span class="n">lock</span><span class="p">,</span> <span class="o">*</span><span class="k">this</span><span class="p">,</span> <span class="n">func</span><span class="p">);</span>
<span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">policy</span> <span class="o">&</span> <span class="n">launch_policy</span><span class="o">::</span><span class="n">policy_sync</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">make_future_sync_continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span><span class="p">(</span><span class="n">lock</span><span class="p">,</span> <span class="o">*</span><span class="k">this</span><span class="p">,</span> <span class="n">func</span><span class="p">);</span>
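<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="c1">// other policies are not handled yet; fail loudly, as async does</span>
<span class="n">std</span><span class="o">::</span><span class="n">terminate</span><span class="p">();</span>
<span class="p">}</span>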
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
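<p>It looks intimidating, but it is really just a factory function. Because a continuation inevitably has to register something with the future it is attached to, <code class="language-plaintext highlighter-rouge">*this</code> is passed on to the more specific factories.</p>
<p>These factories also construct classes derived from <code class="language-plaintext highlighter-rouge">shared_state</code>. Let's look at <code class="language-plaintext highlighter-rouge">make_future_async_continuation_shared_state</code> first:</p>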
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">R</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="n">future</span><span class="o"><</span><span class="n">R</span><span class="o">></span> <span class="n">make_future_async_continuation_shared_state</span><span class="p">(</span>
<span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span> <span class="n">lock</span><span class="p">,</span>
<span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">parent</span><span class="p">,</span>
<span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">R</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">)</span><span class="o">></span> <span class="n">cont</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">future_async_continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="n">h</span><span class="p">(</span>
<span class="k">new</span> <span class="n">future_async_continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span><span class="p">(</span><span class="n">parent</span><span class="p">,</span> <span class="n">cont</span><span class="p">));</span>
<span class="n">h</span><span class="o">-></span><span class="n">init</span><span class="p">(</span><span class="n">lock</span><span class="p">);</span>
<span class="k">return</span> <span class="n">future</span><span class="o"><</span><span class="n">R</span><span class="o">></span><span class="p">(</span><span class="n">h</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Since there are several derived classes named xx_continuation_shared_state (one per policy, plus the executor variant later), it is natural to have a common base class called <code class="language-plaintext highlighter-rouge">continuation_shared_state</code>:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">R</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">struct</span> <span class="nc">continuation_shared_state</span><span class="o">:</span> <span class="n">shared_state</span><span class="o"><</span><span class="n">R</span><span class="o">></span> <span class="p">{</span>
<span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">m_parent</span><span class="p">;</span>
<span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">R</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">)</span><span class="o">></span> <span class="n">m_continuation</span><span class="p">;</span>
<span class="nl">public:</span>
<span class="n">continuation_shared_state</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">parent</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">R</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">)</span><span class="o">></span> <span class="n">func</span><span class="p">)</span>
<span class="o">:</span> <span class="n">m_parent</span><span class="p">(</span><span class="n">parent</span><span class="p">),</span> <span class="n">m_continuation</span><span class="p">(</span><span class="n">func</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// pass</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">init</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span> <span class="n">lock</span><span class="p">)</span> <span class="p">{</span>
<span class="n">m_parent</span><span class="p">.</span><span class="n">m_future</span><span class="o">-></span><span class="n">add_continuation_ptr</span><span class="p">(</span><span class="k">this</span><span class="o">-></span><span class="n">shared_from_this</span><span class="p">(),</span> <span class="n">lock</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
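<p>Here <code class="language-plaintext highlighter-rouge">init</code> registers the object in the parent's list of continuations. Since this modifies the parent's state, the factory function must also be passed the parent's lock.</p>
<p>So what does the parent do with its continuations? Let us go back to <code class="language-plaintext highlighter-rouge">mark_finished_internal</code>:</p>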
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="n">shared_state_base</span><span class="o">::</span><span class="n">mark_finished_internal</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span> <span class="n">lock</span><span class="p">)</span> <span class="p">{</span>
<span class="n">done</span> <span class="o">=</span> <span class="nb">true</span><span class="p">;</span>
<span class="n">cond</span><span class="p">.</span><span class="n">notify_all</span><span class="p">();</span>
<span class="k">for</span> <span class="p">(</span><span class="n">waiter_list</span><span class="o">::</span><span class="n">const_iterator</span> <span class="n">it</span> <span class="o">=</span> <span class="n">external_waiters</span><span class="p">.</span><span class="n">begin</span><span class="p">();</span>
<span class="n">it</span> <span class="o">!=</span> <span class="n">external_waiters</span><span class="p">.</span><span class="n">end</span><span class="p">();</span>
<span class="o">++</span><span class="n">it</span><span class="p">)</span> <span class="p">{</span>
<span class="p">(</span><span class="o">*</span><span class="n">it</span><span class="p">)</span><span class="o">-></span><span class="n">notify_all</span><span class="p">();</span>
<span class="p">}</span>
<span class="n">do_continuation</span><span class="p">(</span><span class="n">lock</span><span class="p">);</span> <span class="c1">// !!!</span>
<span class="p">}</span>
</code></pre></div></div>
<p>And what does <code class="language-plaintext highlighter-rouge">do_continuation</code> do? As you would expect, it launches the continuations one by one:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">shared_state_base</span> <span class="o">:</span> <span class="n">enable_shared_from_this</span><span class="o"><</span><span class="n">shared_state_base</span><span class="o">></span> <span class="p">{</span>
<span class="c1">// ...</span>
<span class="k">typedef</span> <span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">shared_state_base</span><span class="o">></span> <span class="n">continuation_ptr</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">continuation_ptr</span><span class="o">></span> <span class="n">continuations</span><span class="p">;</span>
<span class="c1">// ...</span>
<span class="kt">void</span> <span class="n">do_continuation</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span> <span class="n">lock</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="k">this</span><span class="o">-></span><span class="n">continuations</span><span class="p">.</span><span class="n">empty</span><span class="p">())</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">continuation_ptr</span><span class="o">></span> <span class="n">to_launch</span> <span class="o">=</span> <span class="k">this</span><span class="o">-></span><span class="n">continuations</span><span class="p">;</span>
<span class="k">this</span><span class="o">-></span><span class="n">continuations</span><span class="p">.</span><span class="n">clear</span><span class="p">();</span>
<span class="n">lock</span><span class="p">.</span><span class="n">unlock</span><span class="p">();</span>
<span class="k">for</span> <span class="p">(</span><span class="k">auto</span> <span class="n">it</span> <span class="o">=</span> <span class="n">to_launch</span><span class="p">.</span><span class="n">begin</span><span class="p">();</span> <span class="n">it</span> <span class="o">!=</span> <span class="n">to_launch</span><span class="p">.</span><span class="n">end</span><span class="p">();</span> <span class="o">++</span><span class="n">it</span><span class="p">)</span> <span class="p">{</span>
<span class="p">(</span><span class="o">*</span><span class="n">it</span><span class="p">)</span><span class="o">-></span><span class="n">launch_continuation</span><span class="p">();</span>
<span class="p">}</span>
<span class="n">lock</span><span class="p">.</span><span class="n">lock</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">add_continuation_ptr</span><span class="p">(</span><span class="n">continuation_ptr</span> <span class="n">cont</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span> <span class="n">lock</span><span class="p">)</span> <span class="p">{</span>
<span class="n">continuations</span><span class="p">.</span><span class="n">push_back</span><span class="p">(</span><span class="n">cont</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">done</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_continuation</span><span class="p">(</span><span class="n">lock</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">launch_continuation</span><span class="p">()</span> <span class="p">{</span>
<span class="c1">// pass</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
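<p>Continuations are executed outside the lock, so they are first copied out of the list before being run. This is a common technique for implementing a thread-safe Observer.</p>
<p>If the future has already finished when a new continuation is added, <code class="language-plaintext highlighter-rouge">do_continuation</code> is called right away. Note that the previous call to <code class="language-plaintext highlighter-rouge">do_continuation</code> has already cleared <code class="language-plaintext highlighter-rouge">continuations</code>, so nothing is executed twice.</p>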
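<p>The copy-out-then-run pattern can be tried in isolation. The following is only a minimal sketch under some assumptions: it uses the standard library instead of Boost, plain callbacks instead of <code class="language-plaintext highlighter-rouge">continuation_ptr</code>, and the names <code class="language-plaintext highlighter-rouge">completion_list</code>, <code class="language-plaintext highlighter-rouge">add</code> and <code class="language-plaintext highlighter-rouge">run_pending</code> are hypothetical, not part of the code discussed above.</p>

```cpp
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for shared_state_base: plain callbacks play the
// role of continuations.
class completion_list {
    std::mutex m_mutex;
    bool m_done = false;
    std::vector<std::function<void()>> m_callbacks;

    // Precondition: `lock` is held. The callbacks run with the lock released,
    // and the lock is held again on normal return.
    void run_pending(std::unique_lock<std::mutex>& lock) {
        if (m_callbacks.empty()) {
            return;
        }
        std::vector<std::function<void()>> to_run;
        to_run.swap(m_callbacks);  // take the whole list, leaving it empty
        lock.unlock();
        for (auto& cb : to_run) {
            cb();                  // a callback may safely call add() again
        }
        lock.lock();
    }

public:
    void add(std::function<void()> cb) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_callbacks.push_back(std::move(cb));
        if (m_done) {
            run_pending(lock);     // already finished: run right away
        }
    }

    void mark_finished() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_done = true;
        run_pending(lock);
    }
};
```

<p>Because the list is emptied before any callback runs, a callback registered before completion runs exactly once, and one registered afterwards runs immediately on the registering thread.</p>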
<p><code class="language-plaintext highlighter-rouge">launch_continuation</code> is a virtual function. It is overridden by the classes derived from <code class="language-plaintext highlighter-rouge">continuation_shared_state</code>, which decide what to do based on the <code class="language-plaintext highlighter-rouge">launch_policy</code>. For example, <code class="language-plaintext highlighter-rouge">policy_async</code> starts a thread:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">R</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">struct</span> <span class="nc">future_async_continuation_shared_state</span><span class="o">:</span> <span class="n">continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span> <span class="p">{</span>
<span class="k">typedef</span> <span class="n">continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span> <span class="n">super</span><span class="p">;</span>
<span class="nl">public:</span>
<span class="n">future_async_continuation_shared_state</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">parent</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">R</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">)</span><span class="o">></span> <span class="n">func</span><span class="p">)</span>
<span class="o">:</span> <span class="n">super</span><span class="p">(</span><span class="n">parent</span><span class="p">,</span> <span class="n">func</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// pass</span>
<span class="p">}</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">launch_continuation</span><span class="p">()</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">shared_state_base</span><span class="o">></span> <span class="n">self</span> <span class="o">=</span> <span class="k">this</span><span class="o">-></span><span class="n">shared_from_this</span><span class="p">();</span>
<span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="p">(</span><span class="o">&</span><span class="n">super</span><span class="o">::</span><span class="n">run</span><span class="p">,</span> <span class="n">self</span><span class="p">).</span><span class="n">detach</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
<p>Here <code class="language-plaintext highlighter-rouge">run</code> is the thread body; it executes <code class="language-plaintext highlighter-rouge">m_continuation</code> and stores the result:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">R</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">struct</span> <span class="nc">continuation_shared_state</span><span class="o">:</span> <span class="n">shared_state</span><span class="o"><</span><span class="n">R</span><span class="o">></span> <span class="p">{</span>
<span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">m_parent</span><span class="p">;</span>
<span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">R</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">)</span><span class="o">></span> <span class="n">m_continuation</span><span class="p">;</span>
<span class="c1">// ...</span>
<span class="k">static</span> <span class="kt">void</span> <span class="n">run</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">shared_state_base</span><span class="o">></span> <span class="n">that</span><span class="p">)</span> <span class="p">{</span>
<span class="n">continuation_shared_state</span><span class="o">*</span> <span class="n">f</span> <span class="o">=</span> <span class="k">static_cast</span><span class="o"><</span><span class="n">continuation_shared_state</span><span class="o">*></span><span class="p">(</span><span class="n">that</span><span class="p">.</span><span class="n">get</span><span class="p">());</span>
<span class="k">if</span> <span class="p">(</span><span class="n">f</span><span class="p">)</span> <span class="p">{</span>
<span class="n">f</span><span class="o">-></span><span class="n">call</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">call</span><span class="p">()</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="k">this</span><span class="o">-></span><span class="n">mark_finished_with_result</span><span class="p">(</span><span class="n">m_continuation</span><span class="p">(</span><span class="n">m_parent</span><span class="p">));</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="k">this</span><span class="o">-></span><span class="n">mark_exceptional_finish</span><span class="p">();</span>
<span class="p">}</span>
<span class="n">m_parent</span><span class="p">.</span><span class="n">reset</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
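<p>The reason <code class="language-plaintext highlighter-rouge">launch_continuation</code> hands a shared pointer to the detached thread is lifetime: the thread may outlive every other owner of the state. Here is a minimal sketch of that idea, assuming the standard library instead of Boost; <code class="language-plaintext highlighter-rouge">async_task</code>, <code class="language-plaintext highlighter-rouge">launch</code> and <code class="language-plaintext highlighter-rouge">start_detached</code> are hypothetical names used only for illustration.</p>

```cpp
#include <future>
#include <memory>
#include <thread>

// Hypothetical task object. The detached thread owns a shared_ptr to it,
// so the object stays alive until the thread body has returned.
struct async_task : std::enable_shared_from_this<async_task> {
    std::promise<int> m_result;

    // Thread body: takes the shared_ptr by value to keep the task alive.
    static void run(std::shared_ptr<async_task> self) {
        self->m_result.set_value(42);
    }

    std::future<int> launch() {
        std::future<int> f = m_result.get_future();
        std::thread(&async_task::run, shared_from_this()).detach();
        return f;
    }
};

std::future<int> start_detached() {
    std::shared_ptr<async_task> task = std::make_shared<async_task>();
    // `task` goes out of scope on return; the copy held by the thread
    // is then the only owner.
    return task->launch();
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">start_detached().get()</code> blocks until the detached thread has stored its value, even though the caller never held on to the task itself.</p>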
<p><code class="language-plaintext highlighter-rouge">policy_deferred</code> is different. Just as the deferred future returned by <code class="language-plaintext highlighter-rouge">async(policy_deferred, ...)</code> only calls back into <code class="language-plaintext highlighter-rouge">execute</code> on <code class="language-plaintext highlighter-rouge">wait</code> or <code class="language-plaintext highlighter-rouge">get</code>, so does the future returned by <code class="language-plaintext highlighter-rouge">then(policy_deferred, ...)</code>. This means that when the parent future calls the derived <code class="language-plaintext highlighter-rouge">launch_continuation</code> from <code class="language-plaintext highlighter-rouge">do_continuation</code>, nothing happens; everything waits until you call <code class="language-plaintext highlighter-rouge">wait</code> or <code class="language-plaintext highlighter-rouge">get</code> on the new future. So what <code class="language-plaintext highlighter-rouge">future_deferred_continuation_shared_state</code> really needs to override is the <code class="language-plaintext highlighter-rouge">execute</code> method:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">R</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">struct</span> <span class="nc">future_deferred_continuation_shared_state</span><span class="o">:</span> <span class="n">continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span> <span class="p">{</span>
<span class="k">typedef</span> <span class="n">continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span> <span class="n">super</span><span class="p">;</span>
<span class="nl">public:</span>
<span class="n">future_deferred_continuation_shared_state</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">parent</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">R</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">)</span><span class="o">></span> <span class="n">func</span><span class="p">)</span>
<span class="o">:</span> <span class="n">super</span><span class="p">(</span><span class="n">parent</span><span class="p">,</span> <span class="n">func</span><span class="p">)</span> <span class="p">{</span>
<span class="n">super</span><span class="o">::</span><span class="n">set_deferred</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">execute</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span> <span class="n">lk</span><span class="p">)</span> <span class="p">{</span>
<span class="k">this</span><span class="o">-></span><span class="n">m_parent</span><span class="p">.</span><span class="n">wait</span><span class="p">();</span>
<span class="k">this</span><span class="o">-></span><span class="n">call</span><span class="p">(</span><span class="n">lk</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">launch_continuation</span><span class="p">()</span> <span class="p">{</span>
<span class="c1">// pass</span>
<span class="p">}</span>
<span class="p">};</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">R</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">struct</span> <span class="nc">continuation_shared_state</span><span class="o">:</span> <span class="n">shared_state</span><span class="o"><</span><span class="n">R</span><span class="o">></span> <span class="p">{</span>
<span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">m_parent</span><span class="p">;</span>
<span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">R</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">)</span><span class="o">></span> <span class="n">m_continuation</span><span class="p">;</span>
<span class="c1">// ...</span>
<span class="kt">void</span> <span class="n">call</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span> <span class="n">lk</span><span class="p">)</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">lk</span><span class="p">.</span><span class="n">unlock</span><span class="p">();</span>
<span class="n">R</span> <span class="n">res</span> <span class="o">=</span> <span class="n">m_continuation</span><span class="p">(</span><span class="n">m_parent</span><span class="p">);</span>
<span class="n">m_parent</span><span class="p">.</span><span class="n">reset</span><span class="p">();</span>
<span class="n">lk</span><span class="p">.</span><span class="n">lock</span><span class="p">();</span>
<span class="k">this</span><span class="o">-></span><span class="n">mark_finished_with_result_internal</span><span class="p">(</span><span class="n">res</span><span class="p">,</span> <span class="n">lk</span><span class="p">);</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="c1">// the continuation may have thrown while the lock was released</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">lk</span><span class="p">.</span><span class="n">owns_lock</span><span class="p">())</span> <span class="p">{</span>
<span class="n">lk</span><span class="p">.</span><span class="n">lock</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">this</span><span class="o">-></span><span class="n">mark_exceptional_finish_internal</span><span class="p">(</span><span class="n">current_exception</span><span class="p">(),</span> <span class="n">lk</span><span class="p">);</span>
<span class="n">lk</span><span class="p">.</span><span class="n">unlock</span><span class="p">();</span>
<span class="n">m_parent</span><span class="p">.</span><span class="n">reset</span><span class="p">();</span>
<span class="n">lk</span><span class="p">.</span><span class="n">lock</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
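<p>The <code class="language-plaintext highlighter-rouge">call</code> used here is the version that takes the lock. As noted above, the invocation of <code class="language-plaintext highlighter-rouge">m_continuation</code> must stay outside the lock; the details of the implementation are left as an exercise.</p>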
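<p>One way to keep the unlock/relock pairing correct even when the callback throws is a small RAII guard. The sketch below is an illustration under assumptions, not the implementation discussed in this article: it uses the standard library instead of Boost, and <code class="language-plaintext highlighter-rouge">scoped_unlock</code> and <code class="language-plaintext highlighter-rouge">call_unlocked</code> are hypothetical names.</p>

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical helper: releases the lock on construction and re-acquires
// it on destruction, including during stack unwinding.
class scoped_unlock {
    std::unique_lock<std::mutex>& m_lock;

public:
    explicit scoped_unlock(std::unique_lock<std::mutex>& lock) : m_lock(lock) {
        m_lock.unlock();
    }
    ~scoped_unlock() {
        if (!m_lock.owns_lock()) {
            m_lock.lock();
        }
    }
    scoped_unlock(const scoped_unlock&) = delete;
    scoped_unlock& operator=(const scoped_unlock&) = delete;
};

// Runs `fn` with the lock released. Returns true if `fn` completed and
// false if it threw; in both cases the lock is held again on return.
template <typename F>
bool call_unlocked(std::unique_lock<std::mutex>& lock, F fn) {
    try {
        scoped_unlock guard(lock);
        fn();
        return true;
    } catch (...) {
        // Unwinding has already destroyed `guard`, so the lock is held here.
        return false;
    }
}
```

<p>With such a guard, the code in the <code class="language-plaintext highlighter-rouge">catch</code> block can rely on the lock being held, whether the exception came from the callback or from a later step.</p>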
<p>Now let us fill in the <code class="language-plaintext highlighter-rouge">make_future_deferred_continuation_shared_state</code> factory function:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">R</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="n">future</span><span class="o"><</span><span class="n">R</span><span class="o">></span> <span class="n">make_future_deferred_continuation_shared_state</span><span class="p">(</span>
<span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span> <span class="n">lock</span><span class="p">,</span>
<span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">parent</span><span class="p">,</span>
<span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">R</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">)</span><span class="o">></span> <span class="n">cont</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">future_deferred_continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="n">h</span><span class="p">(</span>
<span class="k">new</span> <span class="n">future_deferred_continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span><span class="p">(</span><span class="n">parent</span><span class="p">,</span> <span class="n">cont</span><span class="p">)</span>
<span class="p">);</span>
<span class="n">h</span><span class="o">-></span><span class="n">init</span><span class="p">(</span><span class="n">lock</span><span class="p">);</span>
<span class="k">return</span> <span class="n">future</span><span class="o"><</span><span class="n">R</span><span class="o">></span><span class="p">(</span><span class="n">h</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>So what about the newly introduced <code class="language-plaintext highlighter-rouge">policy_sync</code>? Its factory function is much the same:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">R</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="n">future</span><span class="o"><</span><span class="n">R</span><span class="o">></span> <span class="n">make_future_sync_continuation_shared_state</span><span class="p">(</span>
<span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span> <span class="n">lock</span><span class="p">,</span>
<span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">parent</span><span class="p">,</span>
<span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">R</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">)</span><span class="o">></span> <span class="n">cont</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">future_sync_continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="n">h</span><span class="p">(</span>
<span class="k">new</span> <span class="n">future_sync_continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span><span class="p">(</span><span class="n">parent</span><span class="p">,</span> <span class="n">cont</span><span class="p">)</span>
<span class="p">);</span>
<span class="n">h</span><span class="o">-></span><span class="n">init</span><span class="p">(</span><span class="n">lock</span><span class="p">);</span>
<span class="k">return</span> <span class="n">future</span><span class="o"><</span><span class="n">R</span><span class="o">></span><span class="p">(</span><span class="n">h</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
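<p>But looking at its implementation, we see that it simply calls <code class="language-plaintext highlighter-rouge">call</code> without starting a new thread. In other words, it runs on whichever thread the parent finishes on:</p>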
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">R</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">struct</span> <span class="nc">future_sync_continuation_shared_state</span><span class="o">:</span> <span class="n">continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span> <span class="p">{</span>
<span class="k">typedef</span> <span class="n">continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span> <span class="n">super</span><span class="p">;</span>
<span class="nl">public:</span>
<span class="n">future_sync_continuation_shared_state</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">parent</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">R</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">)</span><span class="o">></span> <span class="n">func</span><span class="p">)</span>
<span class="o">:</span> <span class="n">super</span><span class="p">(</span><span class="n">parent</span><span class="p">,</span> <span class="n">func</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// pass</span>
<span class="p">}</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">launch_continuation</span><span class="p">()</span> <span class="p">{</span>
<span class="k">this</span><span class="o">-></span><span class="n">call</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
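<p>The claim that a sync continuation runs on the thread that completes the parent can be checked with a small experiment. This is a sketch under assumptions: it uses the standard library instead of Boost, and <code class="language-plaintext highlighter-rouge">sync_state</code>, <code class="language-plaintext highlighter-rouge">then_sync</code> and <code class="language-plaintext highlighter-rouge">continuation_runs_on_completing_thread</code> are hypothetical names rather than parts of the future implementation above.</p>

```cpp
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical miniature shared state that supports only "sync" continuations.
class sync_state {
    std::mutex m_mutex;
    bool m_done = false;
    std::vector<std::function<void()>> m_continuations;

public:
    void then_sync(std::function<void()> cont) {
        std::unique_lock<std::mutex> lock(m_mutex);
        if (m_done) {
            lock.unlock();
            cont();  // parent already finished: runs on the registering thread
            return;
        }
        m_continuations.push_back(std::move(cont));
    }

    void mark_finished() {
        std::vector<std::function<void()>> to_run;
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_done = true;
            to_run.swap(m_continuations);
        }
        for (auto& cont : to_run) {
            cont();  // called directly: no new thread is started
        }
    }
};

// Returns true if the continuation ran on the worker thread that completed
// the parent, and not on the thread that registered it.
bool continuation_runs_on_completing_thread() {
    sync_state state;
    std::thread::id continuation_thread;
    std::thread::id completing_thread;
    state.then_sync([&] { continuation_thread = std::this_thread::get_id(); });
    std::thread worker([&] {
        completing_thread = std::this_thread::get_id();
        state.mark_finished();
    });
    worker.join();
    return continuation_thread == completing_thread
        && continuation_thread != std::this_thread::get_id();
}
```

<p>One consequence worth keeping in mind: a slow sync continuation delays whichever thread finishes the parent, so this policy suits short callbacks best.</p>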
<h2 id="when_anywhen_all">when_any/when_all</h2>
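<p>Before introducing the executor, let us first implement when_all and when_any.</p>
<p>We have already implemented wait_for_all and wait_for_any, both of which block while waiting. Now that we have <code class="language-plaintext highlighter-rouge">then</code>, we would like non-blocking versions. These are when_all and when_any: they return a new future instead of blocking.</p>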
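<p>To make the difference concrete, here is a minimal sketch of a non-blocking when_all built on the standard library, where a helper thread does the blocking wait. It is an illustration under assumptions, not the implementation developed in this article: <code class="language-plaintext highlighter-rouge">when_all_sketch</code> is a hypothetical name, and it takes <code class="language-plaintext highlighter-rouge">std::shared_future</code> so that the inputs can be copied.</p>

```cpp
#include <chrono>
#include <future>
#include <vector>

// Hypothetical sketch: a helper thread blocks on every input, while the
// caller immediately receives a future for the whole vector.
template <typename T>
std::future<std::vector<std::shared_future<T>>>
when_all_sketch(std::vector<std::shared_future<T>> inputs) {
    return std::async(std::launch::async, [inputs]() {
        for (const auto& f : inputs) {
            f.wait();   // blocking, but on the helper thread
        }
        return inputs;  // the result is the vector that was passed in
    });
}
```

<p>Note that a future obtained from <code class="language-plaintext highlighter-rouge">std::async</code> blocks in its destructor until the task has finished, so the caller should keep the returned future around rather than discard it.</p>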
<p>The principle behind when_all and when_any is simple: start another thread and run wait_for_all or wait_for_any on it. But how are the <code class="language-plaintext highlighter-rouge">deferred</code> futures we discussed at length above handled in <code class="language-plaintext highlighter-rouge">wait_for_any</code>? Let us start with <code class="language-plaintext highlighter-rouge">when_any</code>; for convenience, we use a future of a vector:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="n">future</span><span class="o"><</span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="n">when_any</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">>&</span> <span class="n">those</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">future_when_any_vector_shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="n">h</span><span class="p">(</span>
<span class="k">new</span> <span class="n">future_when_any_vector_shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">(</span><span class="n">those</span><span class="p">)</span>
<span class="p">);</span>
<span class="n">h</span><span class="o">-></span><span class="n">init</span><span class="p">();</span>
<span class="k">return</span> <span class="n">future</span><span class="o"><</span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="o">></span><span class="p">(</span><span class="n">h</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Here we accept a vector of <code class="language-plaintext highlighter-rouge">future<T></code> and return a future of <code class="language-plaintext highlighter-rouge">std::vector<future<T> ></code>. That is, the return value is a future whose result is the very vector you passed in. It does not indicate which future has finished, so the caller has to iterate over the vector to find out.</p>
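<p>The scan over the returned vector might look like the following sketch. It assumes standard-library futures instead of the article's own <code class="language-plaintext highlighter-rouge">future</code> class, and <code class="language-plaintext highlighter-rouge">first_ready_index</code> is a hypothetical helper name. It also assumes every future in the vector is valid.</p>

```cpp
#include <chrono>
#include <cstddef>
#include <future>
#include <vector>

// Returns the index of the first ready future, or futures.size() if none
// is ready. A zero timeout turns wait_for into a non-blocking poll.
template <typename T>
std::size_t first_ready_index(const std::vector<std::shared_future<T>>& futures) {
    for (std::size_t i = 0; i < futures.size(); ++i) {
        if (futures[i].wait_for(std::chrono::seconds(0)) == std::future_status::ready) {
            return i;
        }
    }
    return futures.size();
}
```

<p>With standard futures, a deferred future reports <code class="language-plaintext highlighter-rouge">std::future_status::deferred</code> rather than ready, so this scan skips it; that is one reason deferred futures need special treatment in when_any.</p>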
<p>Back to the main topic. Its structure is very similar to the factory functions discussed above, so once again we need to implement a shared state, <code class="language-plaintext highlighter-rouge">future_when_any_vector_shared_state</code> (in Boost 1.59, look for <code class="language-plaintext highlighter-rouge">future_when_all_tuple_shared_state</code>):</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">struct</span> <span class="nc">future_when_any_vector_shared_state</span> <span class="o">:</span> <span class="n">shared_state</span><span class="o"><</span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="n">m_futures</span><span class="p">;</span>
<span class="nl">public:</span>
<span class="n">future_when_any_vector_shared_state</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">>&</span> <span class="n">futures</span><span class="p">)</span>
<span class="o">:</span> <span class="n">m_futures</span><span class="p">(</span><span class="n">futures</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// pass</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">init</span><span class="p">()</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">run_deferred</span><span class="p">())</span> <span class="p">{</span>
<span class="n">future_when_any_vector_shared_state</span><span class="o">::</span><span class="n">run</span><span class="p">(</span><span class="k">this</span><span class="o">-></span><span class="n">shared_from_this</span><span class="p">());</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="p">(</span>
<span class="o">&</span><span class="n">future_when_any_vector_shared_state</span><span class="o">::</span><span class="n">run</span><span class="p">,</span> <span class="k">this</span><span class="o">-></span><span class="n">shared_from_this</span><span class="p">()</span>
<span class="p">).</span><span class="n">detach</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">static</span> <span class="kt">void</span> <span class="n">run</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">shared_state_base</span><span class="o">></span> <span class="n">that_</span><span class="p">);</span>
<span class="kt">bool</span> <span class="n">run_deferred</span><span class="p">();</span>
<span class="p">};</span>
</code></pre></div></div>
<p>As you can see, the <code class="language-plaintext highlighter-rouge">deferred</code> case is handled through the return value of <code class="language-plaintext highlighter-rouge">run_deferred()</code>: if it returns <code class="language-plaintext highlighter-rouge">true</code>, we call <code class="language-plaintext highlighter-rouge">run</code> directly, and once <code class="language-plaintext highlighter-rouge">run</code> returns, <code class="language-plaintext highlighter-rouge">when_any</code> is complete; if it returns <code class="language-plaintext highlighter-rouge">false</code>, we start another thread to carry on.</p>
<p><code class="language-plaintext highlighter-rouge">run_deferred</code> in Boost iterates over <code class="language-plaintext highlighter-rouge">m_futures</code> and, if it finds a <code class="language-plaintext highlighter-rouge">deferred</code> future, executes it. So when <code class="language-plaintext highlighter-rouge">run_deferred</code> returns, we are naturally in the state "some future has completed", and <code class="language-plaintext highlighter-rouge">when_any</code> is complete as well. However, Boost executes the first future that is not ready and is <code class="language-plaintext highlighter-rouge">deferred</code>. We can improve on that: scan the whole vector first, and only if no future is already ready, execute the first <code class="language-plaintext highlighter-rouge">deferred</code> future we found:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">bool</span> <span class="nf">run_deferred</span><span class="p">()</span> <span class="p">{</span>
<span class="kt">int</span> <span class="n">idx_deferred_not_ready</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">m_futures</span><span class="p">.</span><span class="n">size</span><span class="p">();</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
<span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">f</span> <span class="o">=</span> <span class="n">m_futures</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
<span class="k">if</span> <span class="p">(</span><span class="n">f</span><span class="p">.</span><span class="n">is_ready</span><span class="p">())</span> <span class="p">{</span>
<span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">idx_deferred_not_ready</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span> <span class="o">&&</span> <span class="n">f</span><span class="p">.</span><span class="n">is_deferred</span><span class="p">())</span> <span class="p">{</span>
<span class="c1">// remember the first deferred future, but keep scanning for a ready one</span>
<span class="n">idx_deferred_not_ready</span> <span class="o">=</span> <span class="n">i</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">idx_deferred_not_ready</span> <span class="o">!=</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">f</span> <span class="o">=</span> <span class="n">m_futures</span><span class="p">[</span><span class="n">idx_deferred_not_ready</span><span class="p">];</span>
<span class="k">return</span> <span class="n">f</span><span class="p">.</span><span class="n">run_if_is_deferred_or_ready</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>What does the <code class="language-plaintext highlighter-rouge">run_if_is_deferred_or_ready</code> method, newly added to <code class="language-plaintext highlighter-rouge">shared_state_base</code>, do? First, if the state is already ready, it returns <code class="language-plaintext highlighter-rouge">true</code>, so that <code class="language-plaintext highlighter-rouge">when_any</code> does not need to start a new thread. Second, if the state is <code class="language-plaintext highlighter-rouge">deferred</code>, it executes the task and returns <code class="language-plaintext highlighter-rouge">true</code>. The only case in which it returns <code class="language-plaintext highlighter-rouge">false</code> is therefore "not <code class="language-plaintext highlighter-rouge">deferred</code> and not ready":</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">bool</span> <span class="n">shared_state_base</span><span class="o">::</span><span class="n">run_if_is_deferred_or_ready</span><span class="p">()</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">></span> <span class="n">lk</span><span class="p">(</span><span class="k">this</span><span class="o">-></span><span class="n">mutex</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="k">this</span><span class="o">-></span><span class="n">is_deferred</span><span class="p">)</span> <span class="p">{</span>
<span class="k">this</span><span class="o">-></span><span class="n">is_deferred</span> <span class="o">=</span> <span class="nb">false</span><span class="p">;</span>
<span class="k">this</span><span class="o">-></span><span class="n">execute</span><span class="p">(</span><span class="n">lk</span><span class="p">);</span>
<span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="k">return</span> <span class="k">this</span><span class="o">-></span><span class="n">done</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
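<p>The scan described above (look for a ready future first, and only if there is none run the first deferred one) can be modelled without any threading at all. The following is a toy sketch: <code class="language-plaintext highlighter-rouge">toy_state</code> is a hypothetical stand-in for <code class="language-plaintext highlighter-rouge">shared_state_base</code> that keeps only the two flags involved, and "executing" a deferred task is reduced to flipping them:</p>

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a shared state: only the two flags the scan looks at.
struct toy_state {
    bool done;
    bool is_deferred;

    // Same contract as run_if_is_deferred_or_ready: run a deferred task in
    // place, otherwise just report whether the state is already done.
    bool run_if_is_deferred_or_ready() {
        if (is_deferred) {
            is_deferred = false;
            done = true;  // in this toy, "executing" the task completes it
            return true;
        }
        return done;
    }
};

// Returns true when some state is ready on return, so that the caller does
// not have to start an extra thread.
bool run_deferred(std::vector<toy_state>& states) {
    int first_deferred = -1;
    for (std::size_t i = 0; i < states.size(); ++i) {
        if (states[i].done) {
            return true;  // something is already ready: run nothing
        }
        if (first_deferred == -1 && states[i].is_deferred) {
            first_deferred = static_cast<int>(i);  // remember it, keep scanning
        }
    }
    if (first_deferred != -1) {
        return states[first_deferred].run_if_is_deferred_or_ready();
    }
    return false;  // only running, non-deferred states are left
}
```

<p>Note that a deferred state placed before a ready one is left untouched, which is exactly the improvement over running the first deferred future unconditionally.</p>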
<p>Now let's go back and implement <code class="language-plaintext highlighter-rouge">future_when_any_vector_shared_state::run</code>:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">struct</span> <span class="nc">future_when_any_vector_shared_state</span> <span class="o">:</span> <span class="n">shared_state</span><span class="o"><</span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="o">></span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="n">m_futures</span><span class="p">;</span>
<span class="nl">public:</span>
<span class="c1">// ...</span>
<span class="k">static</span> <span class="kt">void</span> <span class="n">run</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">shared_state_base</span><span class="o">></span> <span class="n">that_</span><span class="p">)</span> <span class="p">{</span>
<span class="n">future_when_any_vector_shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">>*</span> <span class="n">that</span> <span class="o">=</span> <span class="k">static_cast</span><span class="o"><</span><span class="n">future_when_any_vector_shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">>*></span><span class="p">(</span><span class="n">that_</span><span class="p">.</span><span class="n">get</span><span class="p">());</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">wait_for_any</span><span class="p">(</span><span class="n">that</span><span class="o">-></span><span class="n">m_futures</span><span class="p">);</span>
<span class="n">that</span><span class="o">-></span><span class="n">mark_finished_with_result</span><span class="p">(</span><span class="n">that</span><span class="o">-></span><span class="n">m_futures</span><span class="p">);</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="n">that</span><span class="o">-></span><span class="n">mark_exceptional_finished</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
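<p>Here <code class="language-plaintext highlighter-rouge">wait_for_any</code> is the one we implemented earlier, with just an added overload for vector (an iterator-range version would actually be better); its implementation is left as an exercise.</p>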
<p>With <code class="language-plaintext highlighter-rouge">when_any</code> done, <code class="language-plaintext highlighter-rouge">when_all</code> is even less of a problem: the only difference is that it executes all the <code class="language-plaintext highlighter-rouge">deferred</code> futures. Its implementation is also left as an exercise.</p>
<h2 id="via-executor">via executor</h2>
<h3 id="async-via-executor">async via executor</h3>
<p>Now we can turn to the executor.</p>
<p>Let's start with the executor version of async, which again creates a class derived from <code class="language-plaintext highlighter-rouge">shared_state</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// the factory is defined first so that async can see it
template<typename Ex, typename T>
future<T> make_future_executor_shared_state(Ex& ex, boost::function<T()> func) {
boost::shared_ptr<future_executor_shared_state<T> > h(
new future_executor_shared_state<T>()
);
h->init(ex, func);
return future<T>(h);
}
template<typename Ex, typename T>
future<T> async(Ex& ex, boost::function<T()> func) {
return make_future_executor_shared_state<Ex, T>(ex, func);
}
</code></pre></div></div>
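<p>Although the executor is a template parameter here, the future itself has no executor template parameter. We could simply submit the task in <code class="language-plaintext highlighter-rouge">init</code> and be done with it, but our <code class="language-plaintext highlighter-rouge">then</code> supports <code class="language-plaintext highlighter-rouge">policy_inherit</code>, so the future has to store the executor for continuations to inherit. The executor type therefore has to be erased somehow. Assuming for now that we already know how to erase it, let's look at the implementation of <code class="language-plaintext highlighter-rouge">future_executor_shared_state</code>:</p>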
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">struct</span> <span class="nc">future_executor_shared_state</span><span class="o">:</span> <span class="n">shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="p">{</span>
<span class="k">typedef</span> <span class="n">shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">super</span><span class="p">;</span>
<span class="nl">public:</span>
<span class="n">future_executor_shared_state</span><span class="p">()</span> <span class="p">{}</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Ex</span><span class="p">></span>
<span class="kt">void</span> <span class="n">init</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">T</span><span class="p">()</span><span class="o">></span> <span class="n">func</span><span class="p">)</span> <span class="p">{</span>
<span class="k">this</span><span class="o">-></span><span class="n">set_executor_policy</span><span class="p">(</span><span class="n">executor_ptr</span><span class="p">(</span><span class="k">new</span> <span class="n">executor_ref</span><span class="o"><</span><span class="n">Ex</span><span class="o">></span><span class="p">(</span><span class="n">ex</span><span class="p">)));</span>
<span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="kt">void</span><span class="p">()</span><span class="o">></span> <span class="n">task</span> <span class="o">=</span> <span class="p">[</span><span class="n">self_</span> <span class="o">=</span> <span class="k">this</span><span class="o">-></span><span class="n">shared_from_this</span><span class="p">(),</span> <span class="n">func</span><span class="p">]()</span> <span class="p">{</span>
<span class="k">auto</span> <span class="n">self</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">static_pointer_cast</span><span class="o"><</span><span class="n">shared_state</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="o">></span><span class="p">(</span><span class="n">self_</span><span class="p">);</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">self</span><span class="o">-></span><span class="n">mark_finished_with_result</span><span class="p">(</span><span class="n">func</span><span class="p">());</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
<span class="n">self</span><span class="o">-></span><span class="n">mark_exceptional_finished</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">};</span>
<span class="n">ex</span><span class="p">.</span><span class="n">submit</span><span class="p">(</span><span class="n">task</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
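<p>For simplicity, we use a lambda expression here. First, <code class="language-plaintext highlighter-rouge">ex</code> is type-erased and stored in the future. Then we package a task whose job is to execute <code class="language-plaintext highlighter-rouge">func</code> and put the result into the future. Finally the task is submitted to the executor; how the executor actually runs it is not our concern.</p>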
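<p>The same "package a task, then submit it" flow can be tried out with standard-library parts only. This is a sketch under assumptions, not the implementation above: <code class="language-plaintext highlighter-rouge">std::promise</code> plays the role of our shared state, <code class="language-plaintext highlighter-rouge">manual_executor</code> is a hypothetical single-threaded executor, and <code class="language-plaintext highlighter-rouge">async_via</code> is a made-up name (it also assumes a non-void <code class="language-plaintext highlighter-rouge">T</code>):</p>

```cpp
#include <deque>
#include <exception>
#include <functional>
#include <future>
#include <memory>
#include <utility>

// Hypothetical single-threaded executor: submit() only queues the task,
// try_executing_one() runs the oldest queued task, if there is one.
class manual_executor {
    std::deque<std::function<void()> > m_tasks;
public:
    void submit(std::function<void()> task) { m_tasks.push_back(std::move(task)); }
    bool try_executing_one() {
        if (m_tasks.empty()) return false;
        std::function<void()> task = std::move(m_tasks.front());
        m_tasks.pop_front();
        task();
        return true;
    }
};

// Assumes T is not void. The promise stands in for the shared state.
template <typename Ex, typename T>
std::future<T> async_via(Ex& ex, std::function<T()> func) {
    // std::function requires a copyable callable, so the promise is shared
    std::shared_ptr<std::promise<T> > state = std::make_shared<std::promise<T> >();
    std::future<T> result = state->get_future();
    ex.submit([state, func]() {
        try {
            state->set_value(func());  // publish the result
        } catch (...) {
            state->set_exception(std::current_exception());  // or the failure
        }
    });
    return result;
}
```

<p>Nothing runs until the executor decides to: the returned future only becomes ready after <code class="language-plaintext highlighter-rouge">try_executing_one()</code> has picked up the task.</p>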
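<p>Next, the type-erasure part. Start with <code class="language-plaintext highlighter-rouge">executor_ref</code>: it is a utility from the boost.executor framework. The framework also provides an abstract executor base class built on runtime polymorphism, and <code class="language-plaintext highlighter-rouge">executor_ref</code> wraps any type that satisfies the compile-time Executor concept into a class derived from that polymorphic executor:</p>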
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">typedef</span> <span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="kt">void</span><span class="p">()</span><span class="o">></span> <span class="n">work</span><span class="p">;</span>
<span class="k">class</span> <span class="nc">executor</span> <span class="p">{</span>
<span class="nl">public:</span>
<span class="n">executor</span><span class="p">(){}</span>
<span class="k">virtual</span> <span class="o">~</span><span class="n">executor</span><span class="p">(){}</span>
<span class="nl">public:</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">close</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">virtual</span> <span class="kt">bool</span> <span class="n">closed</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">submit</span><span class="p">(</span><span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">)</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">virtual</span> <span class="kt">bool</span> <span class="n">try_executing_one</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">};</span>
<span class="k">typedef</span> <span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">executor</span><span class="o">></span> <span class="n">executor_ptr</span><span class="p">;</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Ex</span><span class="p">></span>
<span class="k">class</span> <span class="nc">executor_ref</span> <span class="o">:</span> <span class="k">public</span> <span class="n">executor</span> <span class="p">{</span>
<span class="n">Ex</span><span class="o">&</span> <span class="n">m_ex</span><span class="p">;</span>
<span class="nl">public:</span>
<span class="n">executor_ref</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">)</span> <span class="o">:</span> <span class="n">m_ex</span><span class="p">(</span><span class="n">ex</span><span class="p">)</span> <span class="p">{}</span>
<span class="o">~</span><span class="n">executor_ref</span><span class="p">(){}</span>
<span class="nl">public:</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">close</span><span class="p">()</span> <span class="p">{</span>
<span class="n">m_ex</span><span class="p">.</span><span class="n">close</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">virtual</span> <span class="kt">bool</span> <span class="n">closed</span><span class="p">()</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">m_ex</span><span class="p">.</span><span class="n">closed</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">submit</span><span class="p">(</span><span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">)</span> <span class="p">{</span>
<span class="n">m_ex</span><span class="p">.</span><span class="n">submit</span><span class="p">(</span><span class="n">w</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">virtual</span> <span class="kt">bool</span> <span class="n">try_executing_one</span><span class="p">()</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">m_ex</span><span class="p">.</span><span class="n">try_executing_one</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
<p>Because executor now has an abstract base class, the future can store a pointer to the base class, and the type of the derived class <code class="language-plaintext highlighter-rouge">executor_ref<Ex></code> is erased:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">shared_state_base</span> <span class="o">:</span> <span class="n">enable_shared_from_this</span><span class="o"><</span><span class="n">shared_state_base</span><span class="o">></span> <span class="p">{</span>
<span class="c1">// ...</span>
<span class="n">executor_ptr</span> <span class="n">ex</span><span class="p">;</span>
<span class="kt">void</span> <span class="n">set_executor_policy</span><span class="p">(</span><span class="n">executor_ptr</span> <span class="n">aex</span><span class="p">)</span> <span class="p">{</span>
<span class="n">set_executor</span><span class="p">();</span>
<span class="n">ex</span> <span class="o">=</span> <span class="n">aex</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">set_executor_policy</span><span class="p">(</span><span class="n">executor_ptr</span> <span class="n">aex</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span><span class="p">)</span> <span class="p">{</span>
<span class="n">set_executor</span><span class="p">();</span>
<span class="n">ex</span> <span class="o">=</span> <span class="n">aex</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">set_executor</span><span class="p">()</span> <span class="p">{</span>
<span class="n">is_deferred</span> <span class="o">=</span> <span class="nb">false</span><span class="p">;</span>
<span class="n">policy</span> <span class="o">=</span> <span class="n">launch_policy</span><span class="o">::</span><span class="n">policy_executor</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">executor_ptr</span> <span class="n">get_executor</span><span class="p">()</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">ex</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
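<p>To see the erasure at work in isolation, here is a reduced, self-contained sketch using <code class="language-plaintext highlighter-rouge">std::function</code> and <code class="language-plaintext highlighter-rouge">std::shared_ptr</code> in place of their Boost counterparts. The interface is cut down to <code class="language-plaintext highlighter-rouge">submit</code>, and <code class="language-plaintext highlighter-rouge">inline_executor</code>, <code class="language-plaintext highlighter-rouge">pending_executor</code> and <code class="language-plaintext highlighter-rouge">erase_type</code> are hypothetical names used only for illustration:</p>

```cpp
#include <functional>
#include <memory>
#include <vector>

typedef std::function<void()> work;

// Runtime-polymorphic interface, reduced to the one call this sketch needs.
class executor {
public:
    virtual ~executor() {}
    virtual void submit(work& w) = 0;
};

// Adapts any type with a compatible submit() to the interface above.
// It only holds a reference, so the wrapped executor must outlive the wrapper.
template <typename Ex>
class executor_ref : public executor {
    Ex& m_ex;
public:
    explicit executor_ref(Ex& ex) : m_ex(ex) {}
    virtual void submit(work& w) { m_ex.submit(w); }
};

// Two unrelated concrete executors (both hypothetical).
struct inline_executor {
    void submit(work& w) { w(); }                   // runs the work immediately
};
struct pending_executor {
    std::vector<work> pending;
    void submit(work& w) { pending.push_back(w); }  // stores it for later
};

// Hypothetical helper: hide the concrete type behind a shared_ptr<executor>.
template <typename Ex>
std::shared_ptr<executor> erase_type(Ex& ex) {
    return std::shared_ptr<executor>(new executor_ref<Ex>(ex));
}
```

<p>Both executors can now live in the same <code class="language-plaintext highlighter-rouge">std::shared_ptr<executor></code> member, which is what lets the shared state stay free of an executor template parameter. The price is the lifetime requirement noted in the comment: the wrapper does not own the executor.</p>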
<h3 id="then-via-executor">then via executor</h3>
<p>Now we can write the executor version of <code class="language-plaintext highlighter-rouge">then</code>:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Ex</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">R</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="n">future</span><span class="o"><</span><span class="n">R</span><span class="o">></span> <span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">>::</span><span class="n">then</span><span class="p">(</span><span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">R</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">)</span><span class="o">></span> <span class="n">cont</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">shared_state_base</span><span class="o">></span> <span class="n">sentinel</span><span class="p">(</span><span class="n">m_future</span><span class="p">);</span>
<span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">></span> <span class="n">lock</span><span class="p">(</span><span class="n">sentinel</span><span class="o">-></span><span class="n">mutex</span><span class="p">);</span>
<span class="k">return</span> <span class="n">make_future_executor_continuation_shared_state</span><span class="o"><</span><span class="n">Ex</span><span class="p">,</span> <span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span><span class="p">(</span><span class="n">ex</span><span class="p">,</span> <span class="n">lock</span><span class="p">,</span> <span class="o">*</span><span class="k">this</span><span class="p">,</span> <span class="n">cont</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>This factory function is also much like the ones we wrote above:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Ex</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">R</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="n">future</span><span class="o"><</span><span class="n">R</span><span class="o">></span> <span class="n">make_future_executor_continuation_shared_state</span><span class="p">(</span>
<span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">,</span>
<span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span> <span class="n">lock</span><span class="p">,</span>
<span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">parent</span><span class="p">,</span>
<span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">R</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">)</span><span class="o">></span> <span class="n">cont</span><span class="p">)</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">future_executor_continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span> <span class="o">></span> <span class="n">h</span><span class="p">(</span>
<span class="k">new</span> <span class="n">future_executor_continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span><span class="p">(</span><span class="n">parent</span><span class="p">,</span> <span class="n">cont</span><span class="p">)</span>
<span class="p">);</span>
<span class="n">h</span><span class="o">-></span><span class="n">init</span><span class="p">(</span><span class="n">lock</span><span class="p">,</span> <span class="n">ex</span><span class="p">);</span>
<span class="k">return</span> <span class="n">future</span><span class="o"><</span><span class="n">R</span><span class="o">></span><span class="p">(</span><span class="n">h</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">future_executor_continuation_shared_state</code> simply submits the task in <code class="language-plaintext highlighter-rouge">launch_continuation</code>:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">R</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">></span>
<span class="k">struct</span> <span class="nc">future_executor_continuation_shared_state</span><span class="o">:</span> <span class="n">continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span> <span class="p">{</span>
<span class="k">typedef</span> <span class="n">continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">></span> <span class="n">super</span><span class="p">;</span>
<span class="nl">public:</span>
<span class="n">future_executor_continuation_shared_state</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">parent</span><span class="p">,</span> <span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="n">R</span><span class="p">(</span><span class="n">future</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">)</span><span class="o">></span> <span class="n">cont</span><span class="p">)</span>
<span class="o">:</span> <span class="n">super</span><span class="p">(</span><span class="n">parent</span><span class="p">,</span> <span class="n">cont</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// pass</span>
<span class="p">}</span>
<span class="o">~</span><span class="n">future_executor_continuation_shared_state</span><span class="p">(){}</span>
<span class="nl">public:</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Ex</span><span class="p">></span>
<span class="kt">void</span> <span class="n">init</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">unique_lock</span><span class="o"><</span><span class="n">boost</span><span class="o">::</span><span class="n">mutex</span><span class="o">>&</span> <span class="n">lk</span><span class="p">,</span> <span class="n">Ex</span><span class="o">&</span> <span class="n">ex</span><span class="p">)</span> <span class="p">{</span>
<span class="k">this</span><span class="o">-></span><span class="n">set_executor_policy</span><span class="p">(</span><span class="n">executor_ptr</span><span class="p">(</span><span class="k">new</span> <span class="n">executor_ref</span><span class="o"><</span><span class="n">Ex</span><span class="o">></span><span class="p">(</span><span class="n">ex</span><span class="p">)));</span>
<span class="n">super</span><span class="o">::</span><span class="n">init</span><span class="p">(</span><span class="n">lk</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">launch_continuation</span><span class="p">()</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">function</span><span class="o"><</span><span class="kt">void</span><span class="p">()</span><span class="o">></span> <span class="n">task</span> <span class="o">=</span> <span class="p">[</span><span class="n">self_</span> <span class="o">=</span> <span class="n">shared_from_this</span><span class="p">()]()</span> <span class="p">{</span>
<span class="n">continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">>*</span> <span class="n">self</span> <span class="o">=</span> <span class="k">static_cast</span><span class="o"><</span><span class="n">continuation_shared_state</span><span class="o"><</span><span class="n">R</span><span class="p">,</span> <span class="n">T</span><span class="o">>*></span><span class="p">(</span><span class="n">self_</span><span class="p">.</span><span class="n">get</span><span class="p">());</span>
<span class="n">self</span><span class="o">-></span><span class="n">call</span><span class="p">();</span>
<span class="p">};</span>
<span class="n">get_executor</span><span class="p">()</span><span class="o">-></span><span class="n">submit</span><span class="p">(</span><span class="n">task</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<h2 id="总结">Summary</h2>
<p>Both <code class="language-plaintext highlighter-rouge">async</code> and <code class="language-plaintext highlighter-rouge">then</code> construct a different class derived from <code class="language-plaintext highlighter-rouge">shared_state</code> depending on a condition, which can be either a policy or an executor. For <code class="language-plaintext highlighter-rouge">async</code>, <code class="language-plaintext highlighter-rouge">policy_async</code> starts a thread to run the asynchronous function as soon as the <code class="language-plaintext highlighter-rouge">shared_state</code> is constructed; <code class="language-plaintext highlighter-rouge">policy_deferred</code> overrides the virtual function <code class="language-plaintext highlighter-rouge">execute</code> and runs the asynchronous function only when the user calls <code class="language-plaintext highlighter-rouge">wait</code> or <code class="language-plaintext highlighter-rouge">get</code>. With an executor, a task wrapping the asynchronous function is submitted to that executor.</p>
<p><code class="language-plaintext highlighter-rouge">then</code> works much like <code class="language-plaintext highlighter-rouge">async</code>: it constructs a different class derived from <code class="language-plaintext highlighter-rouge">shared_state</code> and registers it with the parent future. When the parent future completes, it calls the continuation's virtual function <code class="language-plaintext highlighter-rouge">launch_continuation</code>. For <code class="language-plaintext highlighter-rouge">policy_async</code>, <code class="language-plaintext highlighter-rouge">launch_continuation</code> also starts a thread immediately to run the cont function. <code class="language-plaintext highlighter-rouge">policy_deferred</code> is again the special case: its <code class="language-plaintext highlighter-rouge">launch_continuation</code> does nothing, and its function still runs only when the user calls <code class="language-plaintext highlighter-rouge">wait</code> or <code class="language-plaintext highlighter-rouge">get</code>. With an executor, a task wrapping the cont function is submitted to that executor.</p>
<p><strong>Reference:</strong></p>
<ul>
<li class="ref">[1] boost, <a href="https://www.boost.org/doc/libs/1_61_0/doc/html/thread/synchronization.html#thread.synchronization.futures">Futures</a>, 1.70</li>
<li class="ref">[2] Vicente J. Botet Escriba, <a href="https://github.com/boostorg/thread/commit/45c87d392f78f5e123107c17a675fee4e2b19f5b">Refactor futures by adding a basic_future common class</a>, Nov.2012</li>
<li class="ref">[3] N. Gustafsson, A. Laksberg, H. Sutter, S. Mithani, <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3634.pdf">N3634 - Improvements to std::future&lt;T&gt; and related APIs</a>, May 2013</li>
</ul>
C++ Concurrency Patterns #10: Task Execution Policy - Executor
2019-03-25T00:00:00+00:00
http://dengzuoheng.github.io/cpp-concurrency-pattern-10-executor
<h2 id="introduction">Introduction</h2>
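<p>In multithreaded programming we often break a task into discrete units of work (each possibly very small) so that they can be processed in parallel. However, creating a thread for every unit of work (as <code class="language-plaintext highlighter-rouge">boost::async</code> does), especially in large numbers, has several drawbacks:</p>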
<ul>
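<li>Thread lifecycle overhead is high: creating and destroying a thread both take time.</li>
<li>Resource consumption: active threads consume system resources, memory in particular, and the number of threads that can be created is limited and platform-dependent.</li>
<li>Frequent resource contention and context switching reduce CPU efficiency.</li>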
</ul>
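<p>So when units of work are small and numerous, we do not want to create a new thread every time. What we need is some mechanism that controls which thread executes which unit of work. This is what we call the Executor framework: it abstracts the execution policy of tasks.</p>
<p>That policy can take many forms: a thread pool, a new thread per unit of work, or plain serial execution on a single thread…</p>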
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename Executor>
void do_some_work(Executor& ex) {
ex.submit([]() {
std::cout << "hello world" << std::endl;
});
}
int main() {
boost::executors::basic_thread_pool ex1(4);
boost::executors::thread_executor ex2;
do_some_work(ex1);
do_some_work(ex2);
// wait for finished
return 0;
}
</code></pre></div></div>
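<p class="lang-cpp">Through templates (or interfaces) we can choose the executor flexibly, or assign different executors to tasks of different kinds.</p>
<p>In fact, depending on the number of threads (number of execution contexts), the task ordering policy (how they are prioritized), and the selection policy (how they are selected), executors fall into several broad categories with many variants [1]:</p>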
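<p>The same idea can be sketched with the standard library alone. This is a minimal illustration, not boost's implementation: <code class="language-plaintext highlighter-rouge">inline_exec</code>, <code class="language-plaintext highlighter-rouge">thread_per_task_exec</code>, <code class="language-plaintext highlighter-rouge">record_thread</code> and <code class="language-plaintext highlighter-rouge">executors_differ</code> are hypothetical names made up for this example, and <code class="language-plaintext highlighter-rouge">std::thread</code> stands in for boost's thread types. The generic function depends only on <code class="language-plaintext highlighter-rouge">submit</code>, so the execution policy can be swapped without touching it.</p>

```cpp
#include <cassert>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical toy executor: runs the closure immediately on the caller's thread.
struct inline_exec {
    template <typename Closure>
    void submit(Closure&& c) { std::forward<Closure>(c)(); }
};

// Hypothetical toy executor: one new thread per closure, joined in the destructor.
class thread_per_task_exec {
    std::vector<std::thread> m_threads;
public:
    template <typename Closure>
    void submit(Closure&& c) {
        m_threads.emplace_back(std::forward<Closure>(c));
    }
    ~thread_per_task_exec() {
        for (auto& t : m_threads) {
            if (t.joinable()) t.join();
        }
    }
};

// Generic code depends only on submit(), not on the concrete policy.
template <typename Executor>
void record_thread(Executor& ex, std::thread::id& out) {
    ex.submit([&out]() { out = std::this_thread::get_id(); });
}

// Returns true if the inline executor ran on the caller's thread
// and the thread-per-task executor ran somewhere else.
bool executors_differ() {
    std::thread::id inline_id, spawned_id;
    inline_exec iex;
    record_thread(iex, inline_id);
    {
        thread_per_task_exec tex;
        record_thread(tex, spawned_id);
    } // the destructor joins, so spawned_id is written before it is read
    return inline_id == std::this_thread::get_id() && spawned_id != inline_id;
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">executors_differ()</code> from any thread exercises both policies through the same template.</p>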
<ol>
<li>
<p>Thread Pools</p>
<ul>
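<li><strong>simple unbounded thread pool</strong>: units of work are put into a task queue, and a set of threads is maintained; each thread repeatedly takes a unit of work from the queue and executes it.</li>
<li><strong>bounded thread pool</strong>: much like the unbounded thread pool, but its task queue is bounded, which limits the number of units of work queued in the pool.</li>
<li><strong>thread-spawning executor</strong>: always creates a new thread for each new task.</li>
<li><strong>prioritized thread pool</strong>: the task queue is a priority queue.</li>
<li><strong>work stealing thread pool</strong>: the pool has a main task queue, and each worker thread also maintains its own task queue. When a worker's own queue is empty, it takes a task from the main queue or "steals" one from another worker. This suits workloads of small tasks, because it avoids frequent contention on the main queue.</li>
<li><strong>fork-join thread pool</strong>: a task may keep decomposing (fork) and submitting tasks, recursively. When it then waits for them, it does not sit idle: it executes tasks from its worker thread's queue or "steals" a task to run. Once the subtasks complete, it merges (join) them into its own result. It is usually built on a work stealing thread pool, as in Java's ForkJoin framework.</li>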
</ul>
</li>
<li>
<p>Mutual exclusion executors</p>
<ul>
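<li><strong>serial executor</strong>: executes tasks serially, possibly on another thread, but tasks never run concurrently with each other, so no extra mutual exclusion is needed.</li>
<li><strong>loop executor</strong>: similar to the serial executor, but the executing thread is not created by the executor; it is "donated" by some caller. Often used for testing.</li>
<li><strong>GUI thread executor</strong>: listed by boost; I do not know exactly what it means either.</li>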
</ul>
</li>
<li>
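<p>Inline Executor: runs the task at submit time (on the submitter's thread), so it needs neither a queue nor a thread. It is typically used when a task is too small to be worth handing to another thread, or when running it directly is better for performance, yet the interface insists on an executor.</p>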
</li>
</ol>
<p>That is all boost lists; in fact we could list many more (from folly or java.util.concurrent, for example). This article does not try to cover them all at once <del>I am not that good</del>; it covers the ones boost already has: <code class="language-plaintext highlighter-rouge">basic_thread_pool</code>, <code class="language-plaintext highlighter-rouge">serial_executor</code>, <code class="language-plaintext highlighter-rouge">loop_executor</code>, <code class="language-plaintext highlighter-rouge">inline_executor</code>, and <code class="language-plaintext highlighter-rouge">thread_executor</code> (the thread-spawning executor).</p>
<p>Work stealing and fork-join each get an article of their own.</p>
<h2 id="boostexecutor">boost.executor</h2>
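<p>boost's executors represent a unit of work as a closure, meaning here a callable object that takes no arguments and returns void. In the interface the closure is usually a template parameter, but internally the executor stores a <code class="language-plaintext highlighter-rouge">boost::function<void()></code>.</p>
<p>An interface that accepts an executor requires it to model a concept with the following interface:</p>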
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>typedef boost::function<void()> work;
class executor {
public:
template<typename Closure> void submit(Closure&& closure);
template<typename Closure> void submit(Closure& closure);
void close();
bool closed();
bool try_executing_one();
template <typename Pred> bool reschedule_until(const Pred& pred);
};
</code></pre></div></div>
<p>Of these, <code class="language-plaintext highlighter-rouge">try_executing_one</code> and <code class="language-plaintext highlighter-rouge">reschedule_until</code> execute on the caller's thread.</p>
<p>The most typical functions that take an executor as a parameter are <code class="language-plaintext highlighter-rouge">boost::async</code> and <code class="language-plaintext highlighter-rouge">boost::future::then</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>boost::executors::basic_thread_pool pool(4);
boost::executors::inline_executor iex;
boost::executors::serial_executor ser(pool);
auto f = boost::async(ser, []() {
std::cout << boost::this_thread::get_id() << std::endl;
}).then(iex, [](boost::future<void> f) {
std::cout << boost::this_thread::get_id() << std::endl;
}).then(pool, [](boost::future<void> f) {
std::cout << boost::this_thread::get_id() << std::endl;
});
f.wait();
</code></pre></div></div>
<p>First, <code class="language-plaintext highlighter-rouge">async</code> submits a task to <code class="language-plaintext highlighter-rouge">ser</code>. When that task completes, the callback <code class="language-plaintext highlighter-rouge">submit</code>s the closure of the first <code class="language-plaintext highlighter-rouge">then</code> to <code class="language-plaintext highlighter-rouge">iex</code>; <code class="language-plaintext highlighter-rouge">iex</code> runs it at <code class="language-plaintext highlighter-rouge">submit</code> time, so the thread id it prints should match the previous one. Then another callback submits the closure of the second <code class="language-plaintext highlighter-rouge">then</code> to pool, where any pool thread may pick it up, so the third thread id can differ from the first two.</p>
<p>If no executor is specified, each step of this chain should run on a new thread:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>auto f2 = boost::async([](){
std::cout << boost::this_thread::get_id() << std::endl;
}).then([](boost::future<void> f) {
std::cout << boost::this_thread::get_id() << std::endl;
}).then([](boost::future<void> f) {
std::cout << boost::this_thread::get_id() << std::endl;
});
f2.wait();
</code></pre></div></div>
<h3 id="boostinline_executor">boost.inline_executor</h3>
<p>Let us start with the simplest one, <code class="language-plaintext highlighter-rouge">inline_executor</code>, which executes on submit:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class inline_executor {
bool m_closed;
mutable boost::mutex m_mtx;
public:
inline_executor() : m_closed(false) {}
~inline_executor() { close(); }
void close() {
boost::lock_guard<boost::mutex> lk(m_mtx);
m_closed = true;
}
bool closed() {
boost::lock_guard<boost::mutex> lk(m_mtx);
return closed(lk);
}
bool closed(boost::lock_guard<boost::mutex>&) {
return m_closed;
}
template<typename Pred>
bool reschedule_until(const Pred&) {
return false;
}
bool try_executing_one() {
return false;
}
public:
void submit(work& w);
};
</code></pre></div></div>
<p>Because submitting means executing, <code class="language-plaintext highlighter-rouge">try_executing_one</code> and <code class="language-plaintext highlighter-rouge">reschedule_until</code> both always return <code class="language-plaintext highlighter-rouge">false</code>. You may wonder what these two are for; we will get to that later.</p>
<p>We have not written <code class="language-plaintext highlighter-rouge">submit</code> yet, because one point needs to be made first: boost.executor requires that a closure does not throw when executed, and if it does, <code class="language-plaintext highlighter-rouge">std::terminate()</code> is called. In addition, to honor the semantics of <code class="language-plaintext highlighter-rouge">close</code> and <code class="language-plaintext highlighter-rouge">closed</code>, even <code class="language-plaintext highlighter-rouge">inline_executor</code> has to check whether it has been closed, and throws if so. Which exception is thrown is up to the implementation; boost's <code class="language-plaintext highlighter-rouge">inline_executor</code>, for example, throws <code class="language-plaintext highlighter-rouge">sync_queue_is_closed</code> when a closure is submitted after closing, even though it has no task queue at all (shrug.jpg):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void inline_executor::submit(work& w) {
{
boost::lock_guard<boost::mutex> lk(m_mtx);
if (closed(lk)) {
BOOST_THROW_EXCEPTION( boost::sync_queue_is_closed() );
}
}
try {
w();
} catch(...) {
std::terminate();
return;
}
}
</code></pre></div></div>
<h3 id="boostthread_executor">boost.thread_executor</h3>
<p>Next we can implement the slightly more complex <code class="language-plaintext highlighter-rouge">thread_executor</code>, which creates a thread on every submit. In fact, apart from submit, its members are the same as those of <code class="language-plaintext highlighter-rouge">inline_executor</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class thread_executor {
typedef boost::scoped_thread<> thread_t;
std::vector<thread_t> m_threads;
bool m_closed;
mutable boost::mutex m_mtx;
public:
void submit(work& w) {
boost::lock_guard<boost::mutex> lk(m_mtx);
if (closed(lk)) {
BOOST_THROW_EXCEPTION( boost::sync_queue_is_closed() );
}
m_threads.reserve(m_threads.size() + 1); // make sure memory is available before creating the thread
boost::thread th(w);
m_threads.push_back(thread_t(boost::move(th)));
}
};
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">scoped_thread<></code> makes the threads <code class="language-plaintext highlighter-rouge">join</code> when <code class="language-plaintext highlighter-rouge">m_threads</code> is destroyed. In other words, the destructor of <code class="language-plaintext highlighter-rouge">thread_executor</code> waits for all threads, and therefore all tasks, to finish.</p>
<h3 id="boostbasic_thread_pool">boost.basic_thread_pool</h3>
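<p>boost's <code class="language-plaintext highlighter-rouge">basic_thread_pool</code> is a fairly simple thread pool: it creates all worker threads on construction, uses a plain <code class="language-plaintext highlighter-rouge">sync_queue</code> as its task queue, and interrupts all worker threads on destruction.</p>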
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">basic_thread_pool</span> <span class="p">{</span>
<span class="n">boost</span><span class="o">::</span><span class="n">thread_group</span> <span class="n">m_threads</span><span class="p">;</span>
<span class="n">sync_queue</span><span class="o"><</span><span class="n">work</span><span class="o">></span> <span class="n">m_tasks</span><span class="p">;</span>
<span class="nl">public:</span>
<span class="n">basic_thread_pool</span><span class="p">(</span><span class="kt">size_t</span> <span class="n">thread_count</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="kr">thread</span><span class="o">::</span><span class="n">hardware_concurrency</span><span class="p">()</span> <span class="o">+</span> <span class="mi">1</span><span class="p">);</span>
<span class="o">~</span><span class="n">basic_thread_pool</span><span class="p">();</span>
<span class="nl">public:</span>
<span class="kt">bool</span> <span class="n">try_executing_one</span><span class="p">();</span>
<span class="kt">void</span> <span class="n">close</span><span class="p">();</span>
<span class="kt">bool</span> <span class="n">closed</span><span class="p">();</span>
<span class="k">template</span><span class="o"><</span><span class="k">typename</span> <span class="nc">Pred</span><span class="p">></span>
<span class="kt">bool</span> <span class="n">reschedule_until</span><span class="p">(</span><span class="k">const</span> <span class="n">Pred</span><span class="o">&</span><span class="p">);</span>
<span class="kt">void</span> <span class="n">submit</span><span class="p">(</span><span class="n">work</span><span class="o">&</span> <span class="n">w</span><span class="p">);</span>
<span class="p">};</span>
</code></pre></div></div>
<p>First, the constructor creates the worker threads:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">basic_thread_pool</span><span class="o">::</span><span class="n">basic_thread_pool</span><span class="p">(</span><span class="kt">size_t</span> <span class="n">thread_count</span><span class="p">)</span> <span class="p">{</span>
<span class="k">try</span> <span class="p">{</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">size_t</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">thread_count</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
<span class="n">m_threads</span><span class="p">.</span><span class="n">create_thread</span><span class="p">(</span><span class="n">boost</span><span class="o">::</span><span class="n">bind</span><span class="p">(</span><span class="o">&</span><span class="n">basic_thread_pool</span><span class="o">::</span><span class="n">worker_thread</span><span class="p">,</span> <span class="k">this</span><span class="p">));</span>
<span class="p">}</span>
<span class="p">}</span> <span class="k">catch</span><span class="p">(...)</span> <span class="p">{</span>
<span class="n">close</span><span class="p">();</span>
<span class="k">throw</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Here <code class="language-plaintext highlighter-rouge">worker_thread</code> is the worker thread function. It simply keeps pulling tasks from <code class="language-plaintext highlighter-rouge">m_tasks</code> and executing them, but it has to handle the <code class="language-plaintext highlighter-rouge">thread_interrupted</code> exception:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void basic_thread_pool::worker_thread() {
try {
for (;;) {
work task;
try {
boost::concurrent::queue_op_status st = m_tasks.wait_pull(task);
if (st == boost::concurrent::queue_op_status::closed) {
return;
}
task();
} catch (boost::thread_interrupted&) {
return;
}
} // for
} catch (...) {
std::terminate();
return;
}
}
</code></pre></div></div>
<p>From the check on the status returned by <code class="language-plaintext highlighter-rouge">wait_pull</code> we can tell that <code class="language-plaintext highlighter-rouge">basic_thread_pool</code> delegates both <code class="language-plaintext highlighter-rouge">close</code> and <code class="language-plaintext highlighter-rouge">closed</code> to its task queue:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void basic_thread_pool::close() {
m_tasks.close();
}
bool basic_thread_pool::closed() {
return m_tasks.closed();
}
</code></pre></div></div>
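<p>The worker loop and the queue-driven shutdown can be sketched with the standard library alone. This is only an illustration under stated assumptions: <code class="language-plaintext highlighter-rouge">closable_queue</code> is a hypothetical stand-in for <code class="language-plaintext highlighter-rouge">sync_queue</code>, <code class="language-plaintext highlighter-rouge">tiny_pool</code> and <code class="language-plaintext highlighter-rouge">run_increments</code> are made-up names, and because <code class="language-plaintext highlighter-rouge">std::thread</code> has no interruption, this sketch drains the queue on shutdown instead of interrupting the workers.</p>

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

using work = std::function<void()>;

// Hypothetical stand-in for sync_queue<work>: a closable blocking queue.
class closable_queue {
    std::deque<work> m_items;
    bool m_closed = false;
    std::mutex m_mtx;
    std::condition_variable m_cond;
public:
    void push(work w) {
        {
            std::lock_guard<std::mutex> lk(m_mtx);
            m_items.push_back(std::move(w));
        }
        m_cond.notify_one();
    }
    void close() {
        {
            std::lock_guard<std::mutex> lk(m_mtx);
            m_closed = true;
        }
        m_cond.notify_all();
    }
    // Blocks until an item is available; returns false once closed and drained.
    bool wait_pull(work& out) {
        std::unique_lock<std::mutex> lk(m_mtx);
        m_cond.wait(lk, [this] { return m_closed || !m_items.empty(); });
        if (m_items.empty()) return false;
        out = std::move(m_items.front());
        m_items.pop_front();
        return true;
    }
};

// Toy pool: each worker loops on wait_pull until the queue reports closed.
class tiny_pool {
    closable_queue m_tasks;
    std::vector<std::thread> m_threads;
public:
    explicit tiny_pool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            m_threads.emplace_back([this] {
                work task;
                while (m_tasks.wait_pull(task)) task();
            });
        }
    }
    void submit(work w) { m_tasks.push(std::move(w)); }
    ~tiny_pool() {
        m_tasks.close();
        for (auto& t : m_threads) t.join();
    }
};

// Submits n increments and returns the counter after the pool is destroyed.
int run_increments(int n) {
    std::atomic<int> counter{0};
    {
        tiny_pool pool(4);
        for (int i = 0; i < n; ++i) pool.submit([&counter] { ++counter; });
    } // the destructor closes the queue; workers drain it and exit
    return counter.load();
}
```

<p>Letting the queue carry the closed state keeps the pool itself free of any extra flag, which is the same division of labor the text describes.</p>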
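<p>Next come <code class="language-plaintext highlighter-rouge">reschedule_until</code> and <code class="language-plaintext highlighter-rouge">try_executing_one</code>. In the previous executors these two functions returned right away without doing anything, but that will not do for basic_thread_pool.</p>
<p>For <code class="language-plaintext highlighter-rouge">reschedule_until</code>, the documentation says it may only be called from within a work ("This must be called from a scheduled work"), and I never quite understood what that meant. Judging by the implementation, it may be intended for doing fork-join by hand, so let us look at the implementation first:</p>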
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template <typename Pred>
bool basic_thread_pool::reschedule_until(const Pred& pred) {
do {
if (!try_executing_one()) {
return false;
}
} while (!pred());
return true;
}
bool basic_thread_pool::try_executing_one() {
try {
work task;
if (m_tasks.try_pull(task) == queue_op_status::success) {
task();
return true;
}
return false;
} catch (...) {
std::terminate();
}
}
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">reschedule_until</code> keeps calling <code class="language-plaintext highlighter-rouge">try_executing_one</code> until the predicate is true. <code class="language-plaintext highlighter-rouge">try_executing_one</code> here pulls a task from the task queue and executes it. When the queue is empty, <code class="language-plaintext highlighter-rouge">try_executing_one</code> returns <code class="language-plaintext highlighter-rouge">false</code>, which also makes <code class="language-plaintext highlighter-rouge">reschedule_until</code> return. So what <code class="language-plaintext highlighter-rouge">reschedule_until</code> does is keep executing tasks until the predicate is true or the task queue is empty.</p>
<p>Why can it be used for manual fork-join? Normally, if a task adds more tasks to the thread pool and then waits for them, it is easy to deadlock, because while waiting it occupies a thread without doing any work:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// will deadlock
basic_thread_pool pool;
for (int i = 0; i < 100; ++i) {
pool.submit([&pool]() {
std::vector<boost::future<int> > vec;
for (int i = 0; i < 100; ++i) {
vec.push_back(boost::async(pool, []()->int{
return 42;
}));
}
boost::wait_for_all(vec.begin(), vec.end());
});
}
pool.join();
</code></pre></div></div>
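<p>With <code class="language-plaintext highlighter-rouge">reschedule_until</code> you no longer wait directly; instead you pass "all subtasks are done" as the predicate to <code class="language-plaintext highlighter-rouge">reschedule_until</code>. The thread you occupy then keeps doing work rather than idling, so the pool does not deadlock.</p>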
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
// won't deadlock
basic_thread_pool pool;
for (int i = 0; i < 100; ++i) {
pool.submit([&pool]() {
std::vector<boost::future<int> > vec;
for (int i = 0; i < 100; ++i) {
vec.push_back(boost::async(pool, []()->int {
return 42;
}));
}
pool.reschedule_until([&vec]()->bool {
return boost::algorithm::all_of(vec, [](const auto& f){
return f.is_ready();
});
});
});
}
pool.join();
</code></pre></div></div>
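<p>The "help out instead of blocking" idea can be tried without boost. The sketch below is a toy under stated assumptions: <code class="language-plaintext highlighter-rouge">helping_pool</code> and <code class="language-plaintext highlighter-rouge">fork_and_help</code> are hypothetical names, an atomic counter replaces the futures, and the workers spin with <code class="language-plaintext highlighter-rouge">yield</code> rather than block. It uses a single worker thread, the case where a blocking wait inside the parent task could never finish.</p>

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

using work = std::function<void()>;

// Hypothetical toy pool exposing the two "help out" entry points.
class helping_pool {
    std::deque<work> m_tasks;
    std::mutex m_mtx;
    std::atomic<bool> m_stop{false};
    std::vector<std::thread> m_threads;
public:
    explicit helping_pool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            m_threads.emplace_back([this] {
                while (!m_stop.load()) {
                    if (!try_executing_one()) std::this_thread::yield();
                }
            });
        }
    }
    ~helping_pool() {
        m_stop.store(true);
        for (auto& t : m_threads) t.join();
    }
    void submit(work w) {
        std::lock_guard<std::mutex> lk(m_mtx);
        m_tasks.push_back(std::move(w));
    }
    // Runs one queued task on the calling thread; false if the queue is empty.
    bool try_executing_one() {
        work task;
        {
            std::lock_guard<std::mutex> lk(m_mtx);
            if (m_tasks.empty()) return false;
            task = std::move(m_tasks.front());
            m_tasks.pop_front();
        }
        task();
        return true;
    }
    // Keeps running queued tasks until pred() holds or the queue is empty.
    template <typename Pred>
    bool reschedule_until(const Pred& pred) {
        do {
            if (!try_executing_one()) return false;
        } while (!pred());
        return true;
    }
};

// A parent task forks `children` subtasks on a single-worker pool and helps
// run them instead of blocking; returns how many subtasks completed.
int fork_and_help(int children) {
    std::atomic<int> done{0};
    std::atomic<bool> parent_finished{false};
    helping_pool pool(1);
    pool.submit([&] {
        for (int i = 0; i < children; ++i) pool.submit([&done] { ++done; });
        auto all_done = [&] { return done.load() == children; };
        while (!all_done()) {
            // false only means the queue was empty; subtasks may still be running elsewhere
            if (!pool.reschedule_until(all_done)) std::this_thread::yield();
        }
        parent_finished.store(true);
    });
    while (!parent_finished.load()) std::this_thread::yield();
    return done.load();
}
```

<p>Note that the parent loops around <code class="language-plaintext highlighter-rouge">reschedule_until</code>: a <code class="language-plaintext highlighter-rouge">false</code> return only says the queue was empty at that moment, not that the predicate holds.</p>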
<p>What remains is the destructor, which closes the task queue, then interrupts and waits for all worker threads:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>basic_thread_pool::~basic_thread_pool() {
close();
join();
}
void basic_thread_pool::join() {
m_threads.interrupt_all();
m_threads.join_all();
}
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">submit</code> simply adds the task to the task queue:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void basic_thread_pool::submit(work& w) {
m_tasks.push(w);
}
</code></pre></div></div>
<h3 id="boostserial_executor">boost.serial_executor</h3>
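<p>serial_executor guarantees that no two units of work run concurrently, but it does not guarantee that they all run on the same thread. serial_executor therefore needs an underlying executor; if that is a basic_thread_pool, for example, units of work may run on different threads, yet they are still guaranteed not to run concurrently.</p>
<p>The mechanism that prevents concurrency is simply… using future/promise to wait until the previous task has finished before running the next one.</p>
<p>Its <code class="language-plaintext highlighter-rouge">try_executing_one</code> shows this well:</p>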
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bool serial_executor::try_executing_one() {
work task;
try {
if (queue_op_status::success == m_tasks.try_pull(task)) {
boost::promise<void> p;
m_ex.submit([&](){
try {
task();
p.set_value();
} catch (...) {
p.set_exception(boost::current_exception());
}
});
p.get_future().wait();
return true;
} // if
return false;
} catch (...) {
std::terminate();
}
}
</code></pre></div></div>
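<p>Here m_ex is the underlying executor passed in when <code class="language-plaintext highlighter-rouge">serial_executor</code> is constructed. To erase the type of the underlying executor, boost wraps it in <code class="language-plaintext highlighter-rouge">generic_executor_ref</code>; see <code class="language-plaintext highlighter-rouge">boost/thread/executor/generic_executor_ref.hpp</code> for the code. We will not go into it here and simply pretend that only one executor type is supported, held by reference.</p>
<p>boost does not use a lambda here, of course; the lambda is only for convenience, and the behavior is the same. Although the exception is caught here, it is rethrown when waiting on the future, and then terminate is called.</p>
<p>Its <code class="language-plaintext highlighter-rouge">worker_thread</code> is rather distinctive: it calls its own <code class="language-plaintext highlighter-rouge">try_executing_one</code>:</p>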
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void serial_executor::worker_thread() {
while (!closed()) {
schedule_one_or_yield();
}
while (try_executing_one()) {
}
}
void serial_executor::schedule_one_or_yield() {
if (!try_executing_one()) {
boost::this_thread::yield();
}
}
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">schedule_one_or_yield</code> tries to execute one task, and otherwise calls <code class="language-plaintext highlighter-rouge">yield</code> to give up the CPU. When the first while loop ends, the task queue is certainly closed:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bool serial_executor::closed() {
return m_tasks.closed();
}
void serial_executor::close() {
m_tasks.close();
}
</code></pre></div></div>
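<p>A closed <code class="language-plaintext highlighter-rouge">sync_queue</code> still accepts <code class="language-plaintext highlighter-rouge">try_pull</code>, however, so we can keep taking the remaining elements out of the queue. The second loop is therefore there to finish the remaining tasks.</p>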
<h3 id="boostserial_executor_cont">boost.serial_executor_cont</h3>
<p>Similar to <code class="language-plaintext highlighter-rouge">serial_executor</code>, boost has an odd executor called <code class="language-plaintext highlighter-rouge">serial_executor_cont</code>.</p>
<p>Why "cont"? Because it serializes through future continuations, that is, with <code class="language-plaintext highlighter-rouge">then</code>, so it needs neither a task queue nor a thread. It just holds a future and chains each submit onto it with then, and… the tasks end up serialized.</p>
<p>Let us look at its magical <code class="language-plaintext highlighter-rouge">submit</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
void serial_executor_cont::submit(work& w) {
boost::lock_guard<boost::mutex> lk(m_mtx);
if (closed(lk)) {
BOOST_THROW_EXCEPTION( boost::sync_queue_is_closed() );
}
m_future = m_future.then(m_ex, [task = std::move(w)](boost::future<void> f) {
try {
task();
} catch (...) {
std::terminate();
}
});
}
</code></pre></div></div>
<p>Never mind the capture syntax used here; boost does not use a lambda anyway. The point is that <code class="language-plaintext highlighter-rouge">w</code> is wrapped in another closure, which is then passed to <code class="language-plaintext highlighter-rouge">then</code>. We wrap it rather than handing <code class="language-plaintext highlighter-rouge">w</code> straight to <code class="language-plaintext highlighter-rouge">then</code> so that <code class="language-plaintext highlighter-rouge">terminate</code> is called if <code class="language-plaintext highlighter-rouge">task</code> throws.</p>
<p>As we know <del>I do not think I have written about that yet</del>, <code class="language-plaintext highlighter-rouge">then</code> is essentially a callback, and a <code class="language-plaintext highlighter-rouge">then</code> with an executor submits the closure to that executor when the callback fires. So is it fundamentally any different from the <code class="language-plaintext highlighter-rouge">serial_executor</code> above?</p>
<p>Also, since there is no task queue, <code class="language-plaintext highlighter-rouge">reschedule_until</code> and <code class="language-plaintext highlighter-rouge">try_executing_one</code> are meaningless; indeed, in boost <code class="language-plaintext highlighter-rouge">serial_executor_cont</code> does not implement <code class="language-plaintext highlighter-rouge">reschedule_until</code> at all.</p>
<p>Where does the initial <code class="language-plaintext highlighter-rouge">m_future</code> come from? It comes from <code class="language-plaintext highlighter-rouge">boost::make_ready_future</code> when <code class="language-plaintext highlighter-rouge">serial_executor_cont</code> is constructed.</p>
<h3 id="boostloop_executor">boost.loop_executor</h3>
<p><code class="language-plaintext highlighter-rouge">loop_executor</code> has a task queue but no thread, because it expects us to "donate" one; that is, we find a thread to run the tasks in it:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>boost::executors::loop_executor ex;
ex.submit([]() {
std::cout << "hello world" << std::endl;
});
ex.close(); // loop() only returns once the queue is closed and drained
boost::thread tr(&boost::executors::loop_executor::loop, &ex); // pass a pointer: the executor is not copyable
tr.join();
</code></pre></div></div>
<p>It provides a <code class="language-plaintext highlighter-rouge">loop</code> function, and we create a dedicated thread to run it:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void loop_executor::loop() {
while (execute_one(/*wait=*/true)) {
}
while (try_executing_one()) {
}
}
bool loop_executor::execute_one(bool wait) {
work task;
try {
queue_op_status st = wait ? m_tasks.wait_pull(task) : m_tasks.try_pull(task);
if (st == queue_op_status::success) {
task();
return true;
}
return false;
} catch (...) {
std::terminate();
}
}
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">execute_one</code> is the function that actually executes tasks; the <code class="language-plaintext highlighter-rouge">wait</code> parameter only selects how the task is pulled, which is no different from the executors above. Quite obviously, it is also used to implement <code class="language-plaintext highlighter-rouge">try_executing_one</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bool loop_executor::try_executing_one() {
return execute_one(false);
}
</code></pre></div></div>
<p>Besides <code class="language-plaintext highlighter-rouge">loop</code>, <code class="language-plaintext highlighter-rouge">loop_executor</code> also provides <code class="language-plaintext highlighter-rouge">run_queued_closures</code>, which lets the user execute tasks on the calling thread, such as the main thread:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void loop_executor::run_queued_closures() {
sync_queue<work>::underlying_queue_type q = m_tasks.underlying_queue();
while (!q.empty()) {
work& task = q.front();
task();
q.pop_front();
}
}
</code></pre></div></div>
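<p>This is probably mostly used for testing. You may find it odd that it takes the underlying_queue out; well, so did I. The reason is that the member function <code class="language-plaintext highlighter-rouge">underlying_queue()</code> is thread-safe, and it "moves" the internal data out. In other words, this step takes out all the tasks present at that moment and ignores any added later. As for whether the task queue is still usable after being moved from: I tried it, and it is.</p>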
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>boost::executors::loop_executor ex;
boost::mutex mtx;
work f = [&]() {
mtx.lock();
std::cout << boost::this_thread::get_id() << std::endl;
mtx.unlock();
};
ex.submit(f);
ex.submit(f);
ex.run_queued_closures();
ex.submit(f);
ex.run_queued_closures();
</code></pre></div></div>
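<p>The "take the current backlog out, then run it" behavior can be reproduced with a plain mutex and deque. This is a sketch under assumptions, not boost's code: <code class="language-plaintext highlighter-rouge">tiny_loop_executor</code> and <code class="language-plaintext highlighter-rouge">demo_batches</code> are hypothetical names, and a <code class="language-plaintext highlighter-rouge">swap</code> under the lock plays the role of moving the underlying queue out.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

using work = std::function<void()>;

// Hypothetical toy loop executor: it owns a queue but no thread.
class tiny_loop_executor {
    std::deque<work> m_tasks;
    std::mutex m_mtx;
public:
    void submit(work w) {
        std::lock_guard<std::mutex> lk(m_mtx);
        m_tasks.push_back(std::move(w));
    }
    // Moves the current backlog out under the lock, then runs it unlocked.
    // Tasks submitted while the batch runs wait for the next call.
    // Returns the number of tasks executed by this call.
    std::size_t run_queued_closures() {
        std::deque<work> batch;
        {
            std::lock_guard<std::mutex> lk(m_mtx);
            batch.swap(m_tasks);
        }
        for (auto& task : batch) task();
        return batch.size();
    }
};

// Returns {tasks run by the first call, tasks run by the second call}.
std::pair<std::size_t, std::size_t> demo_batches() {
    tiny_loop_executor ex;
    int hits = 0;
    ex.submit([&] {
        ++hits;
        ex.submit([&] { ++hits; }); // re-submitted while the first batch is running
    });
    ex.submit([&] { ++hits; });
    const std::size_t first = ex.run_queued_closures();  // only the two queued tasks
    const std::size_t second = ex.run_queued_closures(); // the re-submitted task
    return {first, second};
}
```

<p>Because the lock is released before the batch runs, a task may safely call <code class="language-plaintext highlighter-rouge">submit</code> on the same executor, and the queue remains usable for later calls.</p>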
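<h2 id="总结">Summary</h2>
<p>The boost executor framework gives us a range of executor implementations, including a fairly simple thread pool. Its design deliberately provides ways to actively execute tasks pending in an executor, namely <code class="language-plaintext highlighter-rouge">try_executing_one</code> and <code class="language-plaintext highlighter-rouge">reschedule_until</code>, which makes it fairly natural to split a task further from within the task.</p>
<p>But boost executor is also incomplete: it does not yet provide the more mature facilities found in Java, such as a work-stealing thread pool or a fork-join thread pool. We will discuss them in later articles.</p>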
<p><strong>Reference:</strong></p>
<ul>
<li class="ref">[1] boost, <a href="https://www.boost.org/doc/libs/1_69_0/doc/html/thread/synchronization.html#thread.synchronization.executors">Executors and Schedulers – EXPERIMENTAL</a>, 1.69.0</li>
<li class="ref">[2] Chris Mysen, Niklas Gustafsson, Matt Austern, Jeffrey Yasskin, <a href="http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2013/n3785.pdf">Executors and schedulers, revision 3</a>, Oct. 2013</li>
<li class="ref">[3] Brian Goetz et al., translated by Tong Yunlan et al., Java并发编程实战 (Chinese edition of Java Concurrency in Practice), Beijing, China Machine Press, Feb. 2012, pp. 93-109</li>
</ul>
C++ Concurrency Patterns #9: Scheduled Tasks - Scheduler
2019-03-09T00:00:00+00:00
http://dengzuoheng.github.io/cpp-concurrency-pattern-9-scheduler
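<p>The Scheduler discussed here is what Wikipedia calls the "Scheduled-task pattern"[1], not the "Scheduling(computing)"[2] of system resource scheduling. After all, in a thread-based discussion we do not intend to control how the system schedules threads. (Perhaps when we get to fibers we will need to schedule fibers ourselves.)</p>
<p>So a Scheduler here means (the scheduling of) timed tasks, such as doing something one second from now, or doing something at 8:20. In code terms: execute a task after a duration, or execute a task at a time point.</p>
<h2 id="c中的时间">Time in C++</h2>
<p>So far we have avoided discussing time, because time really is complicated, and boost's time-related libraries even more so. Fortunately boost::chrono made it into the standard as std::chrono, so we will discuss only boost::chrono.</p>
<h3 id="时钟">Clocks</h3>
<p>A clock (Clock) in chrono is a Concept, or Requirement, that requires a clock class to provide the following:</p>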
<ul>
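<li>the current time (now)</li>
<li>the type of the time value obtained from the clock (the representation type, usually int, long, or similar), together with typedefs for duration and time_point</li>
<li>the clock's tick period (tick ratio)</li>
<li>whether the clock is steady</li>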
</ul>
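<p>chrono provides at least three clocks, <code class="language-plaintext highlighter-rouge">system_clock</code>, <code class="language-plaintext highlighter-rouge">steady_clock</code>, and <code class="language-plaintext highlighter-rouge">high_resolution_clock</code>, each of which satisfies the Concept (or Requirement) above.</p>
<p>First, the notion of steady. A clock (class) is steady if it ticks at a uniform rate and cannot be adjusted; <code class="language-plaintext highlighter-rouge">boost::chrono::steady_clock</code> is one. That may sound like a tautology, but the catch is that system time is usually not steady. System time can be adjusted, and the system may even adjust it automatically to correct for local clock drift. Two successive calls to <code class="language-plaintext highlighter-rouge">boost::chrono::system_clock::now()</code> are therefore not guaranteed to return increasing values, whereas <code class="language-plaintext highlighter-rouge">boost::chrono::steady_clock::now()</code> never goes backwards. In multithreaded programming a steady clock is an advantage: at the very least, a system clock adjustment will not cause any surprises.</p>
<p>The tick period is the length of one tick in seconds, which fixes how many ticks the clock makes per second; for 25 ticks per second we can define <code class="language-plaintext highlighter-rouge">boost::ratio<1,25></code>. Obviously this is normally fixed at compile time, and we usually do not care about it (at least not yet).</p>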
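<p>The 25-ticks-per-second example can be made concrete. The sketch below uses <code class="language-plaintext highlighter-rouge">std::chrono</code> and <code class="language-plaintext highlighter-rouge">std::ratio</code> in place of their boost counterparts (an assumption made so that it builds without boost); <code class="language-plaintext highlighter-rouge">frames</code> and <code class="language-plaintext highlighter-rouge">frames_to_ms</code> are hypothetical names.</p>

```cpp
#include <cassert>
#include <chrono>
#include <ratio>

// steady_clock is required to be steady; for system_clock the value of
// is_steady is left to the implementation.
static_assert(std::chrono::steady_clock::is_steady, "steady_clock must be steady");

// A hypothetical duration type that ticks 25 times per second,
// i.e. its tick period is 1/25 of a second.
using frames = std::chrono::duration<long long, std::ratio<1, 25>>;

// Converts a tick count to whole milliseconds (one tick is 40 ms).
long long frames_to_ms(long long n) {
    return std::chrono::duration_cast<std::chrono::milliseconds>(frames(n)).count();
}
```

<p>Since the period is part of the type, the conversion factor is worked out at compile time.</p>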
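<h3 id="时间间隔">Durations</h3>
<p>A duration represents a time interval, a span of time. It is implemented as a template whose parameters are the representation type and tick period described above. We usually do not care how exactly it is instantiated, because chrono already defines some typedefs for us: nanoseconds, microseconds, milliseconds, seconds, minutes, hours. chrono also provides arithmetic operators between them, as well as the conversion function <code class="language-plaintext highlighter-rouge">boost::chrono::duration_cast</code>:</p>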
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>boost::chrono::milliseconds d1(2333);
boost::chrono::seconds d2 = boost::chrono::duration_cast<boost::chrono::seconds>(d1);
// d2 should be 2 seconds
</code></pre></div></div>
<p>Because the representation type is some integer type, converting from a finer unit to a coarser one truncates (it does not round).</p>
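<p>The truncation is easy to check. The sketch below uses <code class="language-plaintext highlighter-rouge">std::chrono</code> instead of <code class="language-plaintext highlighter-rouge">boost::chrono</code> (an assumption made so that it builds without boost); <code class="language-plaintext highlighter-rouge">to_whole_seconds</code> and <code class="language-plaintext highlighter-rouge">to_milliseconds</code> are hypothetical helper names.</p>

```cpp
#include <cassert>
#include <chrono>

// Finer to coarser: needs duration_cast, and the fraction is dropped, not rounded.
long long to_whole_seconds(long long ms) {
    using namespace std::chrono;
    return duration_cast<seconds>(milliseconds(ms)).count();
}

// Coarser to finer: lossless for integer durations, so it converts implicitly.
long long to_milliseconds(long long s) {
    std::chrono::milliseconds ms = std::chrono::seconds(s);
    return ms.count();
}
```

<p>The asymmetry is deliberate: the lossy direction needs an explicit cast, while the lossless direction does not.</p>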
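<h3 id="时间点">Time points</h3>
<p>A time_point represents a point in time, an instant. It is also implemented as a template, with even more parameters than duration, but again we usually need not care, because we use the typedefs provided by the clock.</p>
<p>Although time points are often used to express absolute time, we rarely define an explicit one such as "March 12, 9102, 20:37:00"; usually we start from now and add or subtract a duration.</p>
<p>Time points are often used for timed waits on condition variables. For example, in some scenario we wait at most 500 milliseconds:</p>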
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>boost::condition_variable cond;
boost::mutex mtx;
bool done = false;
const auto timeout = boost::chrono::steady_clock::now() + boost::chrono::milliseconds(500);
boost::unique_lock<boost::mutex> lk(mtx);
while (!done) {
if (cond.wait_until(lk, timeout) == boost::cv_status::timeout) {
break;
}
}
</code></pre></div></div>
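<p>Because spurious wakeups have to be handled, a while loop is needed here. With the duration-based <code class="language-plaintext highlighter-rouge">wait_for</code> you would also have to subtract the time already elapsed inside the loop; otherwise repeated spurious wakeups could keep restarting the full wait, and the loop might never time out. A time point avoids this: even after a spurious wakeup, the next wait still targets the same deadline.</p>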
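<p>A minimal sketch of the deadline-based loop, assuming the standard-library <code class="language-plaintext highlighter-rouge">std::condition_variable</code> in place of the boost one. The type <code class="language-plaintext highlighter-rouge">flag_waiter</code> and its member names are hypothetical:</p>

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper type: waits for a flag with one overall time budget.
struct flag_waiter {
    std::mutex mtx;
    std::condition_variable cond;
    bool done = false;

    // Returns true if the flag was set before the budget ran out.
    bool wait_done_for(std::chrono::milliseconds total) {
        // Convert the duration to a deadline once, outside the loop.
        const auto deadline = std::chrono::steady_clock::now() + total;
        std::unique_lock<std::mutex> lk(mtx);
        while (!done) {
            // A spurious wakeup simply loops and waits for the same deadline.
            if (cond.wait_until(lk, deadline) == std::cv_status::timeout) {
                return done;
            }
        }
        return true;
    }

    void set_done() {
        {
            std::lock_guard<std::mutex> lk(mtx);
            done = true;
        }
        cond.notify_all();
    }
};
```

<p>The caller still thinks in terms of "wait at most this long", while the loop itself only ever sees the fixed deadline.</p>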
<h2 id="scheduler">scheduler</h2>
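<p>boost's scheduler can be given an executor (executors are covered in the next article; an executor decides on which thread or thread pool a task runs). If we drop the executor-related interface, the scheduler simply runs tasks on a single thread, and its interface looks roughly like this:</p>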
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class work;
class scheduler {
public:
typedef boost::chrono::steady_clock::time_point time_point;
typedef boost::chrono::steady_clock::duration duration;
public:
scheduler();
~scheduler();
public:
void submit_at(work w, const time_point& tp);
void submit_after(work w, const duration& dura);
};
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">submit_at</code> executes <code class="language-plaintext highlighter-rouge">w</code> at the time point <code class="language-plaintext highlighter-rouge">tp</code>; <code class="language-plaintext highlighter-rouge">submit_after</code> executes <code class="language-plaintext highlighter-rouge">w</code> once the duration <code class="language-plaintext highlighter-rouge">dura</code> has elapsed.</p>
<p>The <code class="language-plaintext highlighter-rouge">class work</code> here is a task class that we have to define and implement ourselves. It could look like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class work_base {
public:
    virtual ~work_base() {}  // allow deletion through a base pointer
    virtual void call() = 0;
};
class work : public work_base {
public:
    virtual void call() {
        //...
    }
};
</code></pre></div></div>
<p>Derived classes override the virtual function <code class="language-plaintext highlighter-rouge">call</code>, and the task to be executed is implemented inside <code class="language-plaintext highlighter-rouge">call</code>.</p>
<p>It can also look like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>typedef boost::function<void()> work;
</code></pre></div></div>
<p>boost of course leans towards the latter, using a closure as the work, which is more general. For the related discussion see <a href="https://www.boost.org/doc/libs/1_69_0/doc/html/thread/synchronization.html#thread.synchronization.executors.rationale.closure">Closure</a>.</p>
<p>Usage looks like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>boost::executors::scheduler<boost::chrono::steady_clock> sc;
sc.submit_after([] {
std::cout << "hello world" << std::endl;
}, boost::chrono::seconds(5));
</code></pre></div></div>
<p>So, to sum up, a scheduler is something that runs tasks at a specified time. The execution usually happens on another thread, which makes it one of the concurrency patterns.</p>
<h2 id="scheduler的实现">Implementing scheduler</h2>
<p>boost's scheduler derives from <code class="language-plaintext highlighter-rouge">boost::executors::detail::scheduled_executor_base</code>, which provides the basic <code class="language-plaintext highlighter-rouge">submit_at</code> and <code class="language-plaintext highlighter-rouge">submit_after</code> implementations [3].</p>
<p><code class="language-plaintext highlighter-rouge">boost::executors::detail::scheduled_executor_base</code> in turn derives from <code class="language-plaintext highlighter-rouge">boost::executors::detail::priority_executor_base</code>, which provides the body of the thread that runs our tasks (along with the task queue).</p>
<p><code class="language-plaintext highlighter-rouge">boost::executors::scheduler</code> itself owns the thread that runs the tasks (plus the family of operations for specifying an executor, which we come back to later).</p>
<p>Ignoring templates, the inheritance structure is:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class priority_executor_base {};
class scheduled_executor_base : public priority_executor_base {};
class scheduler : public scheduled_executor_base {};
</code></pre></div></div>
<p>Let's look at scheduler first. It creates a thread on construction, and the thread body is a member function of <code class="language-plaintext highlighter-rouge">priority_executor_base</code>; let's assume this member function is called <code class="language-plaintext highlighter-rouge">loop</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class priority_executor_base {
public:
    void loop();
    void close();
};
class scheduled_executor_base : public priority_executor_base {};
class scheduler : public scheduled_executor_base {
    boost::thread m_thread;
public:
    scheduler() : scheduled_executor_base(), m_thread(&priority_executor_base::loop, this) {}
    ~scheduler() {
        priority_executor_base::close();
        m_thread.interrupt();
        m_thread.join();
    }
};
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">priority_executor_base::close()</code> closes the task queue. The task queue is a derivative of the blocking queue discussed in the previous article, and closing it wakes every waiting thread, so <code class="language-plaintext highlighter-rouge">m_thread</code> can exit cleanly.</p>
<p><code class="language-plaintext highlighter-rouge">submit_at</code> and <code class="language-plaintext highlighter-rouge">submit_after</code>, implemented by <code class="language-plaintext highlighter-rouge">scheduled_executor_base</code>, just put the task into the task queue. The queue is a priority queue whose priority is determined by time. Naturally, what the queue stores is time points, so <code class="language-plaintext highlighter-rouge">submit_after</code> adds <code class="language-plaintext highlighter-rouge">now()</code> to the duration to obtain a time point:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class sync_timed_queue;
class priority_executor_base {
public:
sync_timed_queue m_workq;
void close();
bool closed() const;
void loop();
};
class scheduled_executor_base : public priority_executor_base {
public:
typedef boost::chrono::steady_clock clock;
typedef typename clock::time_point time_point;
typedef typename clock::duration duration;
public:
scheduled_executor_base(){}
~scheduled_executor_base() {
if (!priority_executor_base::closed()) {
priority_executor_base::close();
}
}
void submit_at(work w, const time_point& tp) {
priority_executor_base::m_workq.push(w, tp);
}
void submit_after(work w, const duration& dura) {
priority_executor_base::m_workq.push(w, clock::now() + dura);
}
};
</code></pre></div></div>
<p>In fact the implementation of <code class="language-plaintext highlighter-rouge">priority_executor_base</code> is not complicated either, because ordering, timeouts and so on are all encapsulated in <code class="language-plaintext highlighter-rouge">sync_timed_queue</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class sync_timed_queue;
class priority_executor_base {
public:
sync_timed_queue m_workq;
void close() {
m_workq.close();
}
bool closed() const {
return m_workq.closed();
}
void loop() {
// maybe support thread interrupted here, so use try catch
try {
for (;;) {
try {
work task;
queue_op_status st = m_workq.wait_pull(task);
if (st == queue_op_status::closed) {
return;
}
// execute task !
task();
} catch (boost::thread_interrupted&) {
return;
}
} // end for
} catch (...) { // task() may throw an exception
std::terminate();
return;
} // try
}
};
</code></pre></div></div>
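<p>The <code class="language-plaintext highlighter-rouge">loop</code> function simply keeps taking tasks from the task queue and executing them, leaving all timing concerns to the queue. Once the queue is closed, the thread finishes and returns. So the hard part of the implementation is really <code class="language-plaintext highlighter-rouge">sync_timed_queue</code>.</p>
<p>A lot of discussion has been skipped here. For instance, some scheduler examples [4] do not run everything on one thread; instead they create a thread for every submitted task and sleep until the scheduled time. The problem is obvious: there may be many tasks, while the number of threads the system allows is limited. Moreover, this style produces a large number of threads, which makes debugging harder.</p>
<h3 id="sync_timed_queue">sync_timed_queue</h3>
<p>boost's sync_timed_queue naturally has a large interface, but we only used a few of its functions above, so we can simplify it:</p>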
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>struct scheduled_type;
class sync_timed_queue : public sync_queue_base<scheduled_type> {
public:
sync_timed_queue();
~sync_timed_queue();
public:
void push(const work& w, const time_point& tp);
queue_op_status wait_pull(work& w);
};
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">sync_queue_base</code> here is the <code class="language-plaintext highlighter-rouge">sync_queue_base</code> analysed in the previous article, except that <code class="language-plaintext highlighter-rouge">underlying_queue_type</code> has to be changed to <code class="language-plaintext highlighter-rouge">std::priority_queue</code>.</p>
<p><code class="language-plaintext highlighter-rouge">scheduled_type</code> is a struct that bundles a <code class="language-plaintext highlighter-rouge">work</code> with a <code class="language-plaintext highlighter-rouge">time_point</code> and serves as the element type of <code class="language-plaintext highlighter-rouge">std::priority_queue</code>. <code class="language-plaintext highlighter-rouge">scheduled_type</code> has to implement <code class="language-plaintext highlighter-rouge">operator<</code>, and this <code class="language-plaintext highlighter-rouge">operator<</code> must define a strict weak ordering. Fortunately <code class="language-plaintext highlighter-rouge">boost::chrono::steady_clock::time_point</code> values are already ordered, so comparing the time points is enough:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>struct scheduled_type {
typedef boost::function<void()> work;
typedef boost::chrono::steady_clock::time_point time_point;
work data;
time_point time;
scheduled_type(const work& w, const time_point& tp);
scheduled_type(const scheduled_type& other);
scheduled_type& operator=(const scheduled_type& other);
};
bool operator < (const scheduled_type& lhs, const scheduled_type& rhs) {
return lhs.time > rhs.time; // earlier time points come out of the queue first
}
</code></pre></div></div>
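<p>The inverted comparison is easy to get backwards, so here is a small sketch that checks it. It assumes <code class="language-plaintext highlighter-rouge">std::function</code> and <code class="language-plaintext highlighter-rouge">std::chrono</code> instead of the boost types; <code class="language-plaintext highlighter-rouge">scheduled_item</code> and <code class="language-plaintext highlighter-rouge">run_in_time_order</code> are illustrative names, not boost ones:</p>

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <queue>
#include <vector>

typedef std::function<void()> work;
typedef std::chrono::steady_clock::time_point time_point;

// Illustrative stand-in for scheduled_type.
struct scheduled_item {
    work data;
    time_point time;
};

// std::priority_queue is a max-heap under operator<, so "less" has to
// mean "later" for the earliest item to end up at top().
inline bool operator<(const scheduled_item& lhs, const scheduled_item& rhs) {
    return lhs.time > rhs.time;
}

// Pushes three items out of order, then runs them in queue order.
// Each work records its own delay so the resulting order is visible.
inline std::vector<int> run_in_time_order() {
    const time_point base = std::chrono::steady_clock::now();
    std::vector<int> order;
    std::priority_queue<scheduled_item> q;
    const int delays_ms[] = {30, 10, 20};
    for (int d : delays_ms) {
        q.push(scheduled_item{[&order, d] { order.push_back(d); },
                              base + std::chrono::milliseconds(d)});
    }
    while (!q.empty()) {
        work w = q.top().data;  // copy out before pop() destroys the element
        q.pop();
        w();
    }
    return order;
}
```

<p>An alternative with the same effect is to keep a natural <code class="language-plaintext highlighter-rouge">operator<</code> and pass <code class="language-plaintext highlighter-rouge">std::greater</code> as the comparator of the priority queue.</p>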
<p>What we care about most now is how <code class="language-plaintext highlighter-rouge">wait_pull</code> is implemented:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>queue_op_status sync_timed_queue::wait_pull(work& w) {
boost::unique_lock<boost::mutex> lk(m_mtx);
return wait_pull(lk, w);
}
queue_op_status sync_timed_queue::wait_pull(boost::unique_lock<boost::mutex>& lk, work& w) {
const bool has_been_closed = wait_until_not_empty_time_reached_or_closed(lk);
if (has_been_closed) {
return queue_op_status::closed;
}
pull(lk, w);
return queue_op_status::success;
}
void sync_timed_queue::pull(boost::unique_lock<boost::mutex>& lk, work& w) {
w = m_data.top().data;
m_data.pop();
}
</code></pre></div></div>
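<p><code class="language-plaintext highlighter-rouge">wait_pull</code> calls <code class="language-plaintext highlighter-rouge">wait_until_not_empty_time_reached_or_closed</code> to wait; we have not implemented it yet because it is the more complicated part. Once pulling is possible, the <code class="language-plaintext highlighter-rouge">top</code> of <code class="language-plaintext highlighter-rouge">m_data</code> is taken and <code class="language-plaintext highlighter-rouge">pop</code>ped. Everything is clear, then, except for <code class="language-plaintext highlighter-rouge">wait_until_not_empty_time_reached_or_closed</code>.</p>
<p>What does <code class="language-plaintext highlighter-rouge">wait_until_not_empty_time_reached_or_closed</code> have to do? Quite a lot, judging by the name. First, as in the simple <code class="language-plaintext highlighter-rouge">sync_queue</code>, it waits until the queue is not empty. Second, even when the queue is not empty, it still has to wait if the scheduled time has not arrived. And when a new task comes in, it has to check whether the new task is due sooner, and so on until the queue is non-empty and the time of the head has been reached.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// returns true if the queue has been closed, false if an element can be pulled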
bool sync_timed_queue::wait_until_not_empty_time_reached_or_closed(boost::unique_lock<boost::mutex>& lk) {
for (;;) {
if (sync_queue_base::closed(lk)) {
return true;
}
while (!sync_queue_base::empty(lk)) {
if (time_reached(lk)) {
return false;
}
const time_point tp(m_data.top().time);
m_cond_not_empty.wait_until(lk, tp);
if (sync_queue_base::closed(lk)) {
return true;
}
}
if (sync_queue_base::closed(lk)) {
return true;
}
m_cond_not_empty.wait(lk);
}
}
</code></pre></div></div>
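<p>We see that it has a loop. In the loop body it first checks whether the queue has been closed. Then, if the queue is not empty, it enters a timed wait whose deadline is refreshed on every iteration of the inner <code class="language-plaintext highlighter-rouge">while</code> loop. Because push notifies <code class="language-plaintext highlighter-rouge">m_cond_not_empty</code>, the <code class="language-plaintext highlighter-rouge">wait_until</code> in the inner <code class="language-plaintext highlighter-rouge">while</code> loop wakes up when a new task arrives; then (perhaps the head has changed) if the time has still not been reached, it enters the timed wait again.</p>
<p>If the queue is empty, it waits to be woken by <code class="language-plaintext highlighter-rouge">push</code> or <code class="language-plaintext highlighter-rouge">close</code>. So when this function returns, either the queue has been closed or the time of the head has been reached.</p>
<p><code class="language-plaintext highlighter-rouge">time_reached</code> is fairly simple; it just queries the state:</p>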
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bool sync_timed_queue::time_reached(boost::unique_lock<boost::mutex>& lk) const {
return clock::now() >= m_data.top().time;
}
</code></pre></div></div>
<p>Next we implement <code class="language-plaintext highlighter-rouge">push</code>; most of the code is the same as in the <code class="language-plaintext highlighter-rouge">sync_queue</code> we implemented:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void sync_timed_queue::push(const work& w, const time_point& tp) {
push(scheduled_type(w, tp));
}
void sync_timed_queue::push(const scheduled_type& elem) {
boost::unique_lock<boost::mutex> lk(m_mtx);
sync_queue_base::throw_if_closed(lk);
push(elem, lk);
}
void sync_timed_queue::push(const scheduled_type& elem, boost::unique_lock<boost::mutex>& lk) {
m_data.push(elem);
sync_queue_base::notify_not_empty_if_needed(lk);
}
</code></pre></div></div>
<p>(This is how boost 1.66 does it; 1.67 to 1.69 may have a bug, see <a href="https://github.com/boostorg/thread/issues/271">issue 271</a>.)</p>
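<p>To see the pieces work together without Boost, here is a condensed sketch built on the standard library. It is not boost's code: <code class="language-plaintext highlighter-rouge">timed_queue_sketch</code> and its nested names are hypothetical, and the two nested loops are flattened into a single loop that is intended to behave the same way:</p>

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical, simplified counterpart of sync_timed_queue.
class timed_queue_sketch {
public:
    typedef std::function<void()> work;
    typedef std::chrono::steady_clock clock;
    enum class status { success, closed };

    void push(work w, clock::time_point tp) {
        {
            std::lock_guard<std::mutex> lk(m_mtx);
            m_data.push(item{std::move(w), tp});
        }
        m_cond.notify_all();  // the new item may be the new head
    }

    void close() {
        {
            std::lock_guard<std::mutex> lk(m_mtx);
            m_closed = true;
        }
        m_cond.notify_all();
    }

    // Blocks until the head is due (success) or the queue is closed.
    status wait_pull(work& out) {
        std::unique_lock<std::mutex> lk(m_mtx);
        for (;;) {
            if (m_closed) return status::closed;
            if (m_data.empty()) {
                m_cond.wait(lk);  // woken by push or close
            } else if (clock::now() >= m_data.top().time) {
                out = m_data.top().data;
                m_data.pop();
                return status::success;
            } else {
                // Copy the deadline: top() may change while we wait.
                const clock::time_point tp = m_data.top().time;
                m_cond.wait_until(lk, tp);
            }
        }
    }

private:
    struct item {
        work data;
        clock::time_point time;
        // Inverted on purpose so the earliest item is at top().
        bool operator<(const item& rhs) const { return time > rhs.time; }
    };
    std::mutex m_mtx;
    std::condition_variable m_cond;
    std::priority_queue<item> m_data;
    bool m_closed = false;
};
```

<p>As in the code above, a closed queue reports <code class="language-plaintext highlighter-rouge">closed</code> even if tasks are still pending; whether pending tasks should be drained first is a design decision this sketch does not try to settle.</p>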
<h3 id="on-executor">on executor</h3>
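<p>By now you may have spotted a problem: our tasks are all executed by a single thread, so if one task takes a long time, the tasks behind it may be delayed.</p>
<p>A natural idea is to run each task on a new thread. That reduces the delay, but with many tasks people will complain that scheduling wastes too many system resources, and so on. Running them in a thread pool may in turn feel like too much latency.</p>
<p>So, in very C++ fashion, let the user decide. How should the task run? It runs on whatever executor you pass in.</p>
<p>Of course the scheduler's own thread is still there. We merely wrap the task and hand the wrapper to the scheduler; when the time comes, the wrapper submits the actual task to the executor.</p>
<p>In boost, executor is a concept. For convenience we only require that this concept provides <code class="language-plaintext highlighter-rouge">void submit(work w)</code>, where <code class="language-plaintext highlighter-rouge">submit</code> also accepts a <code class="language-plaintext highlighter-rouge">boost::function<void()></code>. Let's call the class that wraps the task <code class="language-plaintext highlighter-rouge">resubmitter</code>:</p>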
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template <typename Executor>
class resubmitter {
Executor& ex;
work func;
public:
resubmitter(Executor& ex, work w) : ex(ex), func(w) {}
void operator()() {
ex.submit(func);
}
};
</code></pre></div></div>
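<p>The following sketch shows that invoking a <code class="language-plaintext highlighter-rouge">resubmitter</code> only forwards the task and does not run it. It assumes <code class="language-plaintext highlighter-rouge">std::function</code> as the work type, and <code class="language-plaintext highlighter-rouge">manual_executor</code> is a made-up executor that runs work only when asked:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

typedef std::function<void()> work;

// Made-up executor for illustration: stores work and runs it on demand.
class manual_executor {
    std::vector<work> m_pending;
public:
    void submit(work w) { m_pending.push_back(std::move(w)); }
    std::size_t pending() const { return m_pending.size(); }
    void run_all() {
        std::vector<work> batch;
        batch.swap(m_pending);
        for (work& w : batch) w();
    }
};

// Same shape as the resubmitter above.
template <typename Executor>
class resubmitter {
    Executor& ex;
    work func;
public:
    resubmitter(Executor& ex, work w) : ex(ex), func(std::move(w)) {}
    void operator()() { ex.submit(func); }
};

inline void resubmitter_demo() {
    int hits = 0;
    manual_executor ex;
    // This is what the scheduler would store in its timed queue.
    work wrapped = resubmitter<manual_executor>(ex, [&hits] { ++hits; });

    wrapped();  // the scheduler thread "runs" the wrapper when it is due
    assert(hits == 0 && ex.pending() == 1);  // forwarded, not executed

    ex.run_all();  // the executor decides where and when it really runs
    assert(hits == 1);
}
```

<p>The scheduler thread therefore spends almost no time per task, which is exactly what keeps later tasks from being delayed.</p>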
<p>So how is resubmitter used? boost wraps things yet again; in any case, usage looks like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>scheduler sc;
basic_thread_pool ex;
sc.on(ex).after(boost::chrono::milliseconds(500)).submit([](){
std::cout << "hello world" << std::endl;
});
</code></pre></div></div>
<p>Here <code class="language-plaintext highlighter-rouge">on</code> returns a <code class="language-plaintext highlighter-rouge">scheduler_executor_wrapper</code> and <code class="language-plaintext highlighter-rouge">after</code> returns a <code class="language-plaintext highlighter-rouge">resubmit_at_executor</code>. Well... in any case, we know they need the following interfaces:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template <typename Executor>
class resubmit_at_executor {
scheduler& sch;
Executor& ex;
public:
typedef typename scheduler::clock clock;
private:
clock::time_point tp; // the time point at which the work is resubmitted
public:
resubmit_at_executor(scheduler& sch, Executor& ex, const clock::time_point& tp);
~resubmit_at_executor();
public:
void submit(work w);
};
template <typename Executor>
class scheduler_executor_wrapper {
scheduler& sch;
Executor& ex;
public:
typedef typename scheduler::clock clock;
public:
scheduler_executor_wrapper(scheduler& sch, Executor& ex);
~scheduler_executor_wrapper();
public:
resubmit_at_executor<Executor> after(const clock::duration& dura);
resubmit_at_executor<Executor> at(const clock::time_point& tp);
};
class scheduler {
public:
template<typename Ex>
scheduler_executor_wrapper<Ex> on(Ex& ex);
};
</code></pre></div></div>
<p>Let's start with <code class="language-plaintext highlighter-rouge">on</code>. Its whole purpose is to produce a <code class="language-plaintext highlighter-rouge">scheduler_executor_wrapper</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename Ex>
scheduler_executor_wrapper<Ex> scheduler::on(Ex& ex) {
return scheduler_executor_wrapper<Ex>(*this, ex);
}
</code></pre></div></div>
<p>The constructor of <code class="language-plaintext highlighter-rouge">scheduler_executor_wrapper</code> just initializes the two reference members <code class="language-plaintext highlighter-rouge">sch</code> and <code class="language-plaintext highlighter-rouge">ex</code>, so we skip it.</p>
<p><code class="language-plaintext highlighter-rouge">after</code> and <code class="language-plaintext highlighter-rouge">at</code> in turn exist to produce a <code class="language-plaintext highlighter-rouge">resubmit_at_executor</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename Ex>
resubmit_at_executor<Ex> scheduler_executor_wrapper<Ex>::after(const clock::duration& dura) {
return at(clock::now() + dura);
}
template<typename Ex>
resubmit_at_executor<Ex> scheduler_executor_wrapper<Ex>::at(const clock::time_point& tp) {
return resubmit_at_executor<Ex>(sch, ex, tp);
}
</code></pre></div></div>
<p>Finally there is <code class="language-plaintext highlighter-rouge">resubmit_at_executor</code>, whose constructor likewise just initializes the members, so we skip it. <code class="language-plaintext highlighter-rouge">submit</code> constructs a <code class="language-plaintext highlighter-rouge">resubmitter</code> and submits it to the referenced scheduler:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename Ex>
void resubmit_at_executor<Ex>::submit(work w) {
sch.submit_at(resubmitter<Ex>(ex, w), tp);
}
</code></pre></div></div>
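<p>In reality both scheduler and executor can be closed, so <code class="language-plaintext highlighter-rouge">submit</code> has to consider whether they are already closed; that part of the code is not hard, though <del>left as an exercise</del>.</p>
<h2 id="总结">Summary</h2>
<p>scheduler uses a priority queue to order tasks by time. Whether the interface takes a time point or a duration, what the internal data structure stores is a time point, which lets us execute tasks in order as they become due. scheduler maintains one internal thread for running tasks; when the time point at the head of the queue has not yet been reached, the thread enters a timed wait. A newly enqueued task wakes this wait, because the new task may become the new head.</p>
<p>Since the execution time of a task is unpredictable, boost lets the user specify an executor, such as a thread pool, to avoid delays. When the scheduled time point arrives, the task is submitted to the executor.</p>
<p>Other sources may mention the notion of a "Timer", which also submits timed tasks. Is it the same thing as a scheduler? The short answer: I don't know! Possible differences are that a Timer allows periodic tasks, or skips a task that has been delayed too much, and so on.</p>
<p>A detailed discussion of executors is left for the next article. An executor abstracts how we run tasks: it may be a single thread, a thread pool, a new thread per task, or even a sophisticated "work stealing fork join thread pool" (though boost probably won't go that way, since fork-join already has the task_region proposal).</p>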
<p><strong>Reference:</strong></p>
<ul>
<li class="ref">[1] Wikipedia, <a href="https://en.wikipedia.org/wiki/Scheduled-task_pattern">Scheduled-task pattern</a></li>
<li class="ref">[2] Wikipedia, <a href="https://en.wikipedia.org/wiki/Scheduling_(computing)">Scheduling (computing)</a></li>
<li class="ref">[3] boost, <a href="https://www.boost.org/doc/libs/1_69_0/doc/html/thread/synchronization.html#thread.synchronization.executors">Executors and Schedulers – EXPERIMENTAL</a>, 1.69.0</li>
<li class="ref">[4] 罗剑锋, <em>Boost程序库探秘 – 深度解析C++准标准库</em>, 2nd ed. Beijing: 清华大学出版社 (Tsinghua University Press), 2014, pp. 578-580</li>
</ul>
C++并发型模式#8: Blocking Queue
2019-01-14T00:00:00+00:00
http://dengzuoheng.github.io/cpp-concurrency-pattern-8-blocking-queue
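<h2 id="前言">Preface</h2>
<p>I want to write about async and then for futures, which requires executors; to write about executors I need a thread pool; and before the thread pool I would like to get the scheduler written first.</p>
<p>In boost's plan [2], however, scheduler, thread pool and executor are all discussed under one topic, which is quite long. So I think it is better to first explain the components in <code class="language-plaintext highlighter-rouge">boost/thread/concurrent_queues</code>, such as sync_queue, which the later articles on scheduler, thread pool and executor will rely on.</p>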
<h2 id="blocking-queue">Blocking Queue</h2>
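<p>Thread-safe queues are a vast topic, a journey of a thousand miles. A journey of a thousand miles begins with a single step, and Blocking Queue should be simple enough as the first one.</p>
<p><code class="language-plaintext highlighter-rouge">boost::concurrent::sync_queue</code> is also an implementation of a Blocking Queue. Before facing sync_queue's elaborate interface, let's implement a simple Blocking Queue directly.</p>
<p>First, declare the interface:</p>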
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename T>
class blocking_queue : boost::noncopyable {
std::queue<T> m_queue;
boost::condition_variable m_cond;
mutable boost::mutex m_mutex;
public:
blocking_queue() {}
void push(const T& val);
void pop(T& val);
bool try_pop(T& val);
size_t size() const;
bool empty() const;
};
</code></pre></div></div>
<p>In fact <code class="language-plaintext highlighter-rouge">size()</code> and <code class="language-plaintext highlighter-rouge">empty()</code> are not very meaningful, because from outside a thread-safe object any operation that needs two method calls is subject to a race condition (some libraries simply name them <code class="language-plaintext highlighter-rouge">size_unsafe</code> and <code class="language-plaintext highlighter-rouge">empty_unsafe</code>). That is why <code class="language-plaintext highlighter-rouge">pop(T& val)</code> here both fetches the head and removes it, instead of fetching the head with <code class="language-plaintext highlighter-rouge">front()</code> and then removing it with <code class="language-plaintext highlighter-rouge">pop()</code> as <code class="language-plaintext highlighter-rouge">std::queue</code> does.</p>
<p>push is simple: lock, enqueue, then notify_one:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename T>
void blocking_queue<T>::push(const T& val) {
    boost::unique_lock<boost::mutex> lk(m_mutex);
    m_queue.push(val);
    m_cond.notify_one();
}
</code></pre></div></div>
<p>pop, on the other hand, has to wait until the queue is not empty, since it may be empty:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename T>
void blocking_queue<T>::pop(T& val) {
    boost::unique_lock<boost::mutex> lk(m_mutex);
    while (m_queue.empty()) {
        m_cond.wait(lk);
    }
    val = m_queue.front();
    m_queue.pop();
}
template<typename T>
bool blocking_queue<T>::try_pop(T& val) {
    boost::unique_lock<boost::mutex> lk(m_mutex);
    if (!m_queue.empty()) {
        val = m_queue.front();
        m_queue.pop();
        return true;
    }
    return false;
}
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">size()</code> and <code class="language-plaintext highlighter-rouge">empty()</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename T>
size_t blocking_queue<T>::size() const {
    boost::unique_lock<boost::mutex> lk(m_mutex);
    return m_queue.size();
}
template<typename T>
bool blocking_queue<T>::empty() const {
    boost::unique_lock<boost::mutex> lk(m_mutex);
    return m_queue.empty();
}
</code></pre></div></div>
<p>Because the lock is coarse-grained and covers the whole <code class="language-plaintext highlighter-rouge">m_queue</code>, this implementation cannot have one thread <code class="language-plaintext highlighter-rouge">push</code> while another thread <code class="language-plaintext highlighter-rouge">pop</code>s at the same time. But it wins on simplicity; in fact muduo's <code class="language-plaintext highlighter-rouge">BlockingQueue</code> is written the same way [1].</p>
<h2 id="bounded-blocking-queue">Bounded Blocking Queue</h2>
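<p>The biggest difference between a bounded and an unbounded blocking queue is that a bounded one can become full, and once it is full it blocks subsequent pushes. The upper bound is generally given at initialization, so that much memory can be allocated up front, which avoids allocating on every write.</p>
<p>The interface differs slightly: push now has a try version as well, and because both <code class="language-plaintext highlighter-rouge">push</code> and <code class="language-plaintext highlighter-rouge">pop</code> can block, two condition variables are needed:</p>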
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename T>
class bounded_blocking_queue : boost::noncopyable {
boost::circular_buffer<T> m_queue;
boost::condition_variable m_cond_not_full;
boost::condition_variable m_cond_not_empty;
mutable boost::mutex m_mutex;
public:
bounded_blocking_queue(size_t max_size);
void push(const T& val);
bool try_push(const T& val);
void pop(T& val);
bool try_pop(T& val);
size_t size() const;
size_t capacity() const;
bool empty() const;
bool full() const;
};
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">boost::circular_buffer</code> is a common choice for the underlying container, so the implementation of <code class="language-plaintext highlighter-rouge">bounded_blocking_queue</code> is fairly simple as well:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename T>
bounded_blocking_queue<T>::bounded_blocking_queue(size_t max_size) : m_queue(max_size) {}
template<typename T>
void bounded_blocking_queue<T>::push(const T& val) {
    boost::unique_lock<boost::mutex> lk(m_mutex);
    while (m_queue.full()) {
        m_cond_not_full.wait(lk);
    }
    m_queue.push_back(val);
    m_cond_not_empty.notify_one();
}
template<typename T>
bool bounded_blocking_queue<T>::try_push(const T& val) {
    boost::unique_lock<boost::mutex> lk(m_mutex);
    if (!m_queue.full()) {
        m_queue.push_back(val);
        m_cond_not_empty.notify_one();
        return true;
    }
    return false;
}
template<typename T>
void bounded_blocking_queue<T>::pop(T& val) {
    boost::unique_lock<boost::mutex> lk(m_mutex);
    while (m_queue.empty()) {
        m_cond_not_empty.wait(lk);
    }
    val = m_queue.front();
    m_queue.pop_front();
    m_cond_not_full.notify_one();
}
template<typename T>
bool bounded_blocking_queue<T>::try_pop(T& val) {
    boost::unique_lock<boost::mutex> lk(m_mutex);
    if (!m_queue.empty()) {
        val = m_queue.front();
        m_queue.pop_front();
        m_cond_not_full.notify_one();
        return true;
    }
    return false;
}
template<typename T>
size_t bounded_blocking_queue<T>::size() const {
    boost::unique_lock<boost::mutex> lk(m_mutex);
    return m_queue.size();
}
template<typename T>
size_t bounded_blocking_queue<T>::capacity() const {
    boost::unique_lock<boost::mutex> lk(m_mutex);
    return m_queue.capacity();
}
template<typename T>
bool bounded_blocking_queue<T>::empty() const {
    boost::unique_lock<boost::mutex> lk(m_mutex);
    return m_queue.empty();
}
template<typename T>
bool bounded_blocking_queue<T>::full() const {
    boost::unique_lock<boost::mutex> lk(m_mutex);
    return m_queue.full();
}
</code></pre></div></div>
<p>Of those const methods only <code class="language-plaintext highlighter-rouge">capacity()</code> is really reliable; the others merely report the state at the instant of the call. As you can see, bounded_blocking_queue is much like the channel we wrote earlier, except that it has no select.</p>
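<p>Since <code class="language-plaintext highlighter-rouge">full()</code> and <code class="language-plaintext highlighter-rouge">empty()</code> are only snapshots, callers should not write "check, then act" with two calls; the check and the action belong under one lock, which is what the try versions do. A reduced sketch, assuming <code class="language-plaintext highlighter-rouge">std::deque</code> with a manual capacity check in place of <code class="language-plaintext highlighter-rouge">boost::circular_buffer</code> (the class name <code class="language-plaintext highlighter-rouge">bounded_queue_sketch</code> is made up):</p>

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>

// Made-up, non-blocking subset of a bounded queue.
template <typename T>
class bounded_queue_sketch {
    std::deque<T> m_queue;
    const std::size_t m_capacity;
    mutable std::mutex m_mutex;
public:
    explicit bounded_queue_sketch(std::size_t capacity) : m_capacity(capacity) {}

    // The "is it full?" check and the insertion share one critical section.
    bool try_push(const T& val) {
        std::lock_guard<std::mutex> lk(m_mutex);
        if (m_queue.size() >= m_capacity) return false;
        m_queue.push_back(val);
        return true;
    }

    // Likewise for "is it empty?" and the removal.
    bool try_pop(T& val) {
        std::lock_guard<std::mutex> lk(m_mutex);
        if (m_queue.empty()) return false;
        val = m_queue.front();
        m_queue.pop_front();
        return true;
    }

    // Fixed at construction, so it needs no lock and cannot go stale.
    std::size_t capacity() const { return m_capacity; }
};

inline void bounded_queue_demo() {
    bounded_queue_sketch<int> q(2);
    assert(q.try_push(1));
    assert(q.try_push(2));
    assert(!q.try_push(3));  // full: rejected instead of blocking
    int v = 0;
    assert(q.try_pop(v) && v == 1);  // FIFO order
    assert(q.try_push(3));           // room again after the pop
}
```

<p>Writing <code class="language-plaintext highlighter-rouge">if (!q.full()) q.push(v);</code> instead could still block, because another thread may fill the queue between the two calls.</p>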
<h2 id="boostsync_queue">boost.sync_queue</h2>
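<p>sync_queue is in essence also a blocking queue, but being a boost product its interface and implementation are considerably more complex. The biggest difference is that sync_queue supports close.</p>
<p>On close, all blocked calls wake up and return, so the various push/pop functions have a return value saying whether the push/pop really happened, the queue was closed, or the try failed. This return value has to be an enum, declared as follows:</p>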
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// boost/thread/concurrent_queues/queue_op_status.hpp
enum queue_op_status {
success = 0,
empty,
full,
closed,
busy,
timeout,
not_ready
};
</code></pre></div></div>
<p>boost really does have this many, although I do not plan to discuss timeout.</p>
<p>Then we declare the interface:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template <typename T>
class sync_queue: public sync_queue_base<T> {
public:
typedef T value_type;
sync_queue();
~sync_queue();
public:
void push(const value_type& x);
queue_op_status try_push(const value_type& x);
queue_op_status nonblocking_push(const value_type& x);
queue_op_status wait_push(const value_type& x);
// we happily ignore the rvalue overloads
void pull(value_type& elem);
value_type pull();
queue_op_status try_pull(value_type& elem);
queue_op_status nonblocking_pull(value_type& elem);
queue_op_status wait_pull(value_type& elem);
};
</code></pre></div></div>
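<p>The functions returning <code class="language-plaintext highlighter-rouge">queue_op_status</code> are straightforward: they also return if the queue is closed. The difference between try_xx and nonblocking_xx is that try_xx acquires the lock protecting the data (blocking on the lock if necessary), whereas nonblocking_xx only tries to acquire even the lock. The two functions returning void throw an exception if the queue is closed.</p>
<p><code class="language-plaintext highlighter-rouge">sync_queue_base</code> provides some state-query functions, close, and the data members:</p>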
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template <typename T>
class sync_queue_base {
public:
typedef T value_type;
typedef std::queue<T> underlying_queue_type;
typedef typename std::queue<T>::size_type size_type;
sync_queue_base();
~sync_queue_base();
public:
bool empty() const;
bool full() const;
size_type size() const;
bool closed() const;
void close();
protected:
mutable boost::mutex m_mtx;
boost::condition_variable m_cond_not_empty;
underlying_queue_type m_data;
bool m_closed;
};
</code></pre></div></div>
<p>In boost, <code class="language-plaintext highlighter-rouge">underlying_queue_type</code> is actually determined by a template parameter; here we take a shortcut and use <code class="language-plaintext highlighter-rouge">std::queue</code> directly.</p>
<p>There are also many protected members that take a lock (<code class="language-plaintext highlighter-rouge">unique_lock</code>, <code class="language-plaintext highlighter-rouge">lock_guard</code>) as a parameter; for convenience only the <code class="language-plaintext highlighter-rouge">unique_lock</code> versions are written here:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template <typename T>
class sync_queue_base {
// ...
protected:
bool empty(boost::unique_lock<boost::mutex>&) const {
return m_data.empty();
}
size_type size(boost::unique_lock<boost::mutex>&) const {
return m_data.size();
}
bool closed(boost::unique_lock<boost::mutex>&) const {
return m_closed;
}
bool full(boost::unique_lock<boost::mutex>&) const {
return false;
}
// some of these are intended for derived classes
void throw_if_closed(boost::unique_lock<boost::mutex>& lk) {
if (closed(lk)) {
BOOST_THROW_EXCEPTION( sync_queue_is_closed() );
}
}
bool not_empty_or_closed(boost::unique_lock<boost::mutex>& lk) {
return !m_data.empty() || m_closed;
}
bool wait_until_not_empty_or_closed(boost::unique_lock<boost::mutex>& lk) {
while (empty(lk) && !closed(lk)) {
m_cond_not_empty.wait(lk);
}
if (!empty(lk)) {
return false; // success
}
return true; // closed;
}
void notify_not_empty_if_needed(boost::unique_lock<boost::mutex>& lk) {
m_cond_not_empty.notify_all();
}
// ...
};
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">close</code> has to notify all waiters:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void close() {
{
boost::unique_lock<boost::mutex> lk(m_mtx);
m_closed = true;
}
m_cond_not_empty.notify_all();
}
</code></pre></div></div>
<p>The remaining ones such as <code class="language-plaintext highlighter-rouge">bool empty() const</code> should be easy to write, so we skip them.</p>
<p>Now let's implement <code class="language-plaintext highlighter-rouge">push</code> first:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename T>
void sync_queue<T>::push(const T& elem) {
boost::unique_lock<boost::mutex> lk(sync_queue_base<T>::m_mtx);
sync_queue_base<T>::throw_if_closed(lk);
push(elem, lk);
}
template<typename T>
void sync_queue<T>::push(const T& elem, boost::unique_lock<boost::mutex>& lk) {
sync_queue_base<T>::m_data.push(elem);
sync_queue_base<T>::notify_not_empty_if_needed(lk);
}
</code></pre></div></div>
<p>If the queue is found to be closed, throw; otherwise enqueue and notify. Throwing and notifying are already written in the base class <code class="language-plaintext highlighter-rouge">sync_queue_base</code>, so the code is quite concise.</p>
<p><code class="language-plaintext highlighter-rouge">pull</code> is similar: it throws if the wait ends because the queue was closed:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename T>
void sync_queue<T>::pull(T& elem) {
    boost::unique_lock<boost::mutex> lk(sync_queue_base<T>::m_mtx);
    const bool has_been_closed = sync_queue_base<T>::wait_until_not_empty_or_closed(lk);
    if (has_been_closed) {
        sync_queue_base<T>::throw_if_closed(lk);
    }
    pull(elem, lk);
}
template<typename T>
void sync_queue<T>::pull(T& elem, boost::unique_lock<boost::mutex>& lk) {
    elem = sync_queue_base<T>::m_data.front(); // a move would be better here
    sync_queue_base<T>::m_data.pop();
}
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">try_push</code> has to return a status, which in practice is only <code class="language-plaintext highlighter-rouge">closed</code> or <code class="language-plaintext highlighter-rouge">success</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename T>
queue_op_status sync_queue<T>::try_push(const T& elem) {
boost::unique_lock<boost::mutex> lk(sync_queue_base<T>::m_mtx);
return try_push(elem, lk);
}
template<typename T>
queue_op_status sync_queue<T>::try_push(const T& elem, boost::unique_lock<boost::mutex>& lk) {
if (sync_queue_base<T>::closed(lk)) {
return queue_op_status::closed;
}
push(elem, lk);
return queue_op_status::success;
}
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">wait_push</code> is similar to <code class="language-plaintext highlighter-rouge">push</code> but returns a status when the queue is closed. Because there is no capacity limit, it does not actually need to wait for anything:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename T>
queue_op_status sync_queue<T>::wait_push(const T& elem) {
    boost::unique_lock<boost::mutex> lk(sync_queue_base<T>::m_mtx);
    return wait_push(elem, lk);
}
template<typename T>
queue_op_status sync_queue<T>::wait_push(const T& elem, boost::unique_lock<boost::mutex>& lk) {
    if (sync_queue_base<T>::closed(lk)) {
        return queue_op_status::closed;
    }
    push(elem, lk);
    return queue_op_status::success;
}
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">nonblocking_push</code> has one more status, <code class="language-plaintext highlighter-rouge">busy</code>, which means another thread holds the lock; that is why <code class="language-plaintext highlighter-rouge">lk</code> is constructed with <code class="language-plaintext highlighter-rouge">try_to_lock</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename T>
queue_op_status sync_queue<T>::nonblocking_push(const T& elem) {
boost::unique_lock<boost::mutex> lk(sync_queue_base<T>::m_mtx, boost::try_to_lock);
if (!lk.owns_lock()) {
return queue_op_status::busy;
}
return try_push(elem, lk);
}
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">pull</code> family can be written in the same way, but <code class="language-plaintext highlighter-rouge">pull</code> has to wait until the queue is non-empty, so it is a little more involved:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename T>
queue_op_status sync_queue<T>::try_pull(T& elem) {
boost::unique_lock<boost::mutex> lk(sync_queue_base<T>::m_mtx);
return try_pull(elem, lk);
}
template<typename T>
queue_op_status sync_queue<T>::try_pull(T& elem, boost::unique_lock<boost::mutex>& lk) {
if (sync_queue_base<T>::empty(lk)) {
if (sync_queue_base<T>::closed(lk)) {
return queue_op_status::closed;
}
return queue_op_status::empty;
}
pull(elem, lk);
return queue_op_status::success;
}
template<typename T>
queue_op_status sync_queue<T>::nonblocking_pull(T& elem) {
boost::unique_lock<boost::mutex> lk(sync_queue_base<T>::m_mtx, boost::try_to_lock);
if (!lk.owns_lock()) {
return queue_op_status::busy;
}
return try_pull(elem, lk);
}
template<typename T>
queue_op_status sync_queue<T>::wait_pull(T& elem) {
boost::unique_lock<boost::mutex> lk(sync_queue_base<T>::m_mtx);
return wait_pull(elem, lk);
}
template<typename T>
queue_op_status sync_queue<T>::wait_pull(T& elem, boost::unique_lock<boost::mutex>& lk) {
const bool has_been_closed = sync_queue_base<T>::wait_until_not_empty_or_closed(lk);
if (has_been_closed) {
return queue_op_status::closed;
}
pull(elem, lk);
return queue_op_status::success;
}
template<typename T>
typename sync_queue<T>::value_type sync_queue<T>::pull() {
boost::unique_lock<boost::mutex> lk(sync_queue_base<T>::m_mtx);
const bool has_been_closed = sync_queue_base<T>::wait_until_not_empty_or_closed(lk);
if (has_been_closed) {
sync_queue_base<T>::throw_if_closed(lk);
}
return pull(lk);
}
template<typename T>
T sync_queue<T>::pull(boost::unique_lock<boost::mutex>& lk) {
// this overload is best provided only when move is available
T ret = std::move(sync_queue_base<T>::m_data.front());
sync_queue_base<T>::m_data.pop_front();
return ret;
}
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">wait_pull</code> can only return <code class="language-plaintext highlighter-rouge">closed</code> or <code class="language-plaintext highlighter-rouge">success</code>. Note that elements can still be taken out with <code class="language-plaintext highlighter-rouge">pull</code> from a <code class="language-plaintext highlighter-rouge">sync_queue</code> that has already been closed; the exception is thrown only once the queue is empty.</p>
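<p>The "closed but not yet drained" behaviour can be sketched as follows. This is only a minimal illustration, not boost's code: <code class="language-plaintext highlighter-rouge">closable_queue</code> and its members are hypothetical names, and <code class="language-plaintext highlighter-rouge">std::mutex</code> / <code class="language-plaintext highlighter-rouge">std::condition_variable</code> stand in for their boost counterparts.</p>

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>

enum class queue_op_status { success, empty, closed };

// Hypothetical minimal unbounded queue (not boost::sync_queue).
template <typename T>
class closable_queue {
    std::mutex m_mtx;
    std::condition_variable m_not_empty;
    std::deque<T> m_data;
    bool m_closed = false;
public:
    // Pushing is refused as soon as the queue has been closed.
    queue_op_status try_push(T elem) {
        std::lock_guard<std::mutex> lk(m_mtx);
        if (m_closed) return queue_op_status::closed;
        m_data.push_back(std::move(elem));
        m_not_empty.notify_one();
        return queue_op_status::success;
    }
    // Blocks until an element is available, or the queue is closed AND empty.
    queue_op_status wait_pull(T& elem) {
        std::unique_lock<std::mutex> lk(m_mtx);
        m_not_empty.wait(lk, [this] { return !m_data.empty() || m_closed; });
        assert(!m_data.empty() || m_closed);
        if (m_data.empty()) return queue_op_status::closed;  // closed and drained
        elem = std::move(m_data.front());
        m_data.pop_front();
        return queue_op_status::success;
    }
    void close() {
        {
            std::lock_guard<std::mutex> lk(m_mtx);
            m_closed = true;
        }
        m_not_empty.notify_all();  // wake every waiting consumer
    }
};
```

<p>With this sketch, elements pushed before <code class="language-plaintext highlighter-rouge">close()</code> can still be pulled afterwards, and only the pull that finds the queue empty reports <code class="language-plaintext highlighter-rouge">closed</code>.</p>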
<h2 id="boostsync_bounded_queue">boost.sync_bounded_queue</h2>
<p>Although it is simply the bounded version of sync_queue, again with one mutex and two condition_variables, boost's <code class="language-plaintext highlighter-rouge">sync_bounded_queue</code> does not use <code class="language-plaintext highlighter-rouge">boost::circular_buffer</code>; it allocates a contiguous block of memory itself and uses it as a ring buffer.</p>
<p>Unlike <code class="language-plaintext highlighter-rouge">sync_queue</code>, it has a <code class="language-plaintext highlighter-rouge">shared_ptr</code> version of pull, implemented as follows (it requires move support):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>inline boost::shared_ptr<value_type> ptr_pull(unique_lock<mutex>& lk)
{
boost::shared_ptr<value_type> res =
boost::make_shared<value_type>(boost::move(data_[out_]));
out_ = inc(out_);
notify_not_full_if_needed(lk);
return res;
}
</code></pre></div></div>
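<p>The xxx_if_needed functions exist because <code class="language-plaintext highlighter-rouge">sync_bounded_queue</code> keeps counts of the threads waiting to enqueue and waiting to dequeue.</p>
<p>There is, however, no <code class="language-plaintext highlighter-rouge">shared_ptr</code> version of push. Oh well, as long as they are happy.</p>
<h2 id="性能测试">Performance test</h2>
<p>To compare against the channel we wrote earlier, we use the following code (which passes a <code class="language-plaintext highlighter-rouge">shared_ptr</code>) to measure performance:</p>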
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>int test(const int concurrency) {
const int num = 1000 * 1000;
typedef boost::shared_ptr<int> data_type;
blocking_queue<data_type> queue;
boost::thread_group thg;
const auto begin = boost::chrono::steady_clock::now();
for (int tr = 0; tr < concurrency; ++tr) {
thg.create_thread([&]() {
data_type dat;
for (int i = 0; i < num; ++i) {
queue.wait_pull_front(dat);
}
});
}
for (int tr = 0; tr < concurrency; ++tr) {
thg.create_thread([&]() {
data_type dat(new int(42));
for (int i = 0; i < num; ++i) {
queue.wait_push_back(dat);
}
});
}
thg.join_all();
const auto end = boost::chrono::steady_clock::now();
return boost::chrono::duration_cast<boost::chrono::milliseconds>(end - begin).count();
}
</code></pre></div></div>
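<p><code class="language-plaintext highlighter-rouge">concurrency</code> is the number of reader threads and of writer threads to start; when <code class="language-plaintext highlighter-rouge">concurrency</code> is 1 there is one reader thread and one writer thread. The results are as follows, where xxx(n) means a buffer size of n:</p>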
<table>
<thead>
<tr>
<th>(ms)</th>
<th>1</th>
<th>2</th>
<th>4</th>
<th>6</th>
<th>8</th>
<th>16</th>
<th>32</th>
</tr>
</thead>
<tbody>
<tr>
<td>blocking queue</td>
<td>149</td>
<td>282</td>
<td>606</td>
<td>842</td>
<td>1145</td>
<td>2355</td>
<td>4700</td>
</tr>
<tr>
<td>sync_queue</td>
<td>130</td>
<td>350</td>
<td>733</td>
<td>1109</td>
<td>1470</td>
<td>3056</td>
<td>6246</td>
</tr>
<tr>
<td>channel(100)</td>
<td>340</td>
<td>1484</td>
<td>5194</td>
<td>8812</td>
<td>12687</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>bounded blocking queue(100)</td>
<td>140</td>
<td>344</td>
<td>677</td>
<td>1038</td>
<td>1423</td>
<td>2902</td>
<td>6084</td>
</tr>
<tr>
<td>boost.sync_bounded_queue(100)</td>
<td>268</td>
<td>1467</td>
<td>3432</td>
<td>6556</td>
<td>11642</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>channel(1000)</td>
<td>161</td>
<td>417</td>
<td>874</td>
<td>1325</td>
<td>1743</td>
<td>3642</td>
<td>7396</td>
</tr>
<tr>
<td>bounded blocking queue(1000)</td>
<td>178</td>
<td>326</td>
<td>664</td>
<td>993</td>
<td>1351</td>
<td>2836</td>
<td>5894</td>
</tr>
<tr>
<td>boost.sync_bounded_queue(1000)</td>
<td>120</td>
<td>431</td>
<td>826</td>
<td>1463</td>
<td>2465</td>
<td>9696</td>
<td>-</td>
</tr>
<tr>
<td>channel(10000)</td>
<td>152</td>
<td>343</td>
<td>677</td>
<td>1013</td>
<td>1372</td>
<td>2740</td>
<td>5591</td>
</tr>
<tr>
<td>bounded blocking queue(10000)</td>
<td>155</td>
<td>308</td>
<td>672</td>
<td>1039</td>
<td>1393</td>
<td>2896</td>
<td>5831</td>
</tr>
<tr>
<td>boost.sync_bounded_queue(10000)</td>
<td>98</td>
<td>284</td>
<td>580</td>
<td>798</td>
<td>1152</td>
<td>3006</td>
<td>19308</td>
</tr>
</tbody>
</table>
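<p>Test platform: VS 2017, Intel i3 7100 (2 cores, 4 threads; please forgive my poverty), Windows 10, optimizations enabled, averaged over 50 runs.</p>
<p>As the table shows, when the buffer size is small, channel and boost.sync_bounded_queue are clearly slower than the others, but once the buffer gets larger the gap is much less pronounced.</p>
<h2 id="总结">Summary</h2>
<p>Blocking Queue is a thread-safe data structure we use all the time, for example as the task queue inside a thread pool. Its implementation can be very simple, as shown above. boost's <code class="language-plaintext highlighter-rouge">sync_queue</code> and <code class="language-plaintext highlighter-rouge">sync_bounded_queue</code> are implementations of Blocking Queue and Bounded Blocking Queue; although boost has a pile of code for them, it is still the classic implementation with no black magic, just a lot of overloads (boost 1.68). The queue can also be implemented with separate locks for enqueueing and dequeueing, which performs somewhat better [3].</p>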
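<p>A rough sketch of the "separate locks for enqueueing and dequeueing" idea follows, in the spirit of the two-lock queue with a dummy node described in [3]. The class name <code class="language-plaintext highlighter-rouge">two_lock_queue</code> and its members are made up for illustration, it uses <code class="language-plaintext highlighter-rouge">std::mutex</code> rather than boost, and it only offers a non-blocking <code class="language-plaintext highlighter-rouge">try_pull</code>; treat it as an assumption-laden outline rather than a tuned implementation.</p>

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical two-lock queue: producers only take m_tail_mtx,
// consumers take m_head_mtx (and m_tail_mtx briefly, to read the tail).
template <typename T>
class two_lock_queue {
    struct node {
        std::unique_ptr<T> value;    // empty in the dummy node
        std::unique_ptr<node> next;
    };
    std::mutex m_head_mtx;
    std::unique_ptr<node> m_head;    // oldest node; equals tail when empty
    std::mutex m_tail_mtx;
    node* m_tail;                    // always points at the dummy node

    node* get_tail() {
        std::lock_guard<std::mutex> lk(m_tail_mtx);
        return m_tail;
    }
public:
    two_lock_queue() : m_head(new node), m_tail(m_head.get()) {}
    two_lock_queue(const two_lock_queue&) = delete;
    two_lock_queue& operator=(const two_lock_queue&) = delete;

    void push(T elem) {
        // Allocate outside the lock to keep the critical section short.
        std::unique_ptr<T> value(new T(std::move(elem)));
        std::unique_ptr<node> dummy(new node);
        node* const new_tail = dummy.get();
        std::lock_guard<std::mutex> lk(m_tail_mtx);
        m_tail->value = std::move(value);   // fill the old dummy
        m_tail->next = std::move(dummy);    // append a fresh dummy
        m_tail = new_tail;
    }
    bool try_pull(T& elem) {
        std::unique_ptr<node> old_head;
        {
            std::lock_guard<std::mutex> lk(m_head_mtx);
            if (m_head.get() == get_tail()) return false;  // only the dummy is left
            old_head = std::move(m_head);
            m_head = std::move(old_head->next);
        }
        assert(old_head->value);  // every non-dummy node carries a value
        elem = std::move(*old_head->value);
        return true;
    }
};
```

<p>The dummy node is what lets <code class="language-plaintext highlighter-rouge">push</code> and <code class="language-plaintext highlighter-rouge">try_pull</code> touch different nodes, so a producer and a consumer rarely contend on the same mutex. Note that destroying a very long queue recurses through the <code class="language-plaintext highlighter-rouge">next</code> chain in this simplified version.</p>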
<p><strong>Reference:</strong></p>
<ul>
<li class="ref">[1] 陈硕, <em>Linux多线程服务端编程: 使用muduo C++网络库</em>. 北京, 电子工业出版社, 2013, p64</li>
<li class="ref">[2] boost, <a href="https://www.boost.org/doc/libs/1_69_0/doc/html/thread/synchronization.html#thread.synchronization.executors">Executors and Schedulers – EXPERIMENTAL</a>, 1.69.0</li>
<li class="ref">[3] Anthony Williams, <em>C++并发编程实战</em>. 北京, 人民邮电出版社, 2015, p149~p160</li>
</ul>
C++ Concurrency Patterns #7: Read-Write Lock - shared_mutex
2019-01-07T00:00:00+00:00
http://dengzuoheng.github.io/cpp-concurency-pattern-7-rwlock
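<h2 id="读者-写者问题">The readers-writers problem</h2>
<p>Consider a piece of shared memory and a number of threads that need to access it. We could simply use a mutex and make every access mutually exclusive, but if writes are rare, making reads exclusive as well seems unnecessary: why not read concurrently? How to let multiple readers access a shared resource at the same time is the so-called readers-writers problem.</p>
<p>A read-write lock, also called a "shared-exclusive lock", tries to solve this problem: read operations may proceed concurrently, while write operations are mutually exclusive.</p>
<p>Read-write locks come with different priority policies. One is reader preference: a write may proceed only after all reads have finished. But if reads keep coming, the writer starves, waiting and waiting without ever seeing a moment with no readers.</p>
<p>The other is writer preference: wait for the reads that have already started, and admit no new readers until the write has completed.</p>
<p>A reader-preferring read-write lock can be implemented simply with two mutexes and a counter [2]:</p>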
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class shared_mutex {
int m_shared_count;
boost::mutex m_mutex_count;
boost::mutex m_mutex_write;
public:
shared_mutex() : m_shared_count(0) {}
void lock() {
m_mutex_write.lock();
}
void unlock() {
m_mutex_write.unlock();
}
void lock_shared() {
m_mutex_count.lock();
m_shared_count++;
if (m_shared_count == 1) {
m_mutex_write.lock();
}
m_mutex_count.unlock();
}
void unlock_shared() {
m_mutex_count.lock();
m_shared_count--;
if (m_shared_count == 0) {
m_mutex_write.unlock();
}
m_mutex_count.unlock();
}
};
</code></pre></div></div>
<p>Because boost and C++17 call the read-write lock shared_mutex, the interface here follows boost: the read lock is <code class="language-plaintext highlighter-rouge">lock_shared()</code> and the write lock is <code class="language-plaintext highlighter-rouge">lock()</code>.</p>
<p>Here <code class="language-plaintext highlighter-rouge">m_mutex_count</code> protects <code class="language-plaintext highlighter-rouge">m_shared_count</code>. The first reader to lock also locks <code class="language-plaintext highlighter-rouge">m_mutex_write</code>, and the last reader to unlock releases it, so as long as any reader remains, <code class="language-plaintext highlighter-rouge">lock()</code> cannot acquire <code class="language-plaintext highlighter-rouge">m_mutex_write</code>. Consequently, if readers keep arriving, the write lock can never be acquired.</p>
<h2 id="boost实现">The boost implementation</h2>
<p>boost's shared_mutex is based on an algorithm proposed by Alexander Terekhov [1], <del>though I have never managed to find the original source</del>.</p>
<h3 id="shared_lock_guard-和-shared_lock">shared_lock_guard and shared_lock</h3>
<p>For an ordinary mutex we have the RAII lock_guard; for shared_mutex there is naturally a shared_lock_guard:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename SharedMutex>
class shared_lock_guard : boost::noncopyable {
SharedMutex& m_shared_mutex;
public:
explicit shared_lock_guard(SharedMutex& m) : m_shared_mutex(m) {
m_shared_mutex.lock_shared();
}
~shared_lock_guard() {
m_shared_mutex.unlock_shared();
}
};
</code></pre></div></div>
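<p>For an ordinary mutex we have the more flexible RAII unique_lock; for shared_mutex there is naturally a shared_lock<del>in fact there are also upgrade_lock and all sorts of locks that convert into one another; just remembering the names is hard enough</del>:</p>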
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
struct defer_lock_t{};
struct try_to_lock_t{};
struct adopt_lock_t{};
const defer_lock_t defer_lock={};
const try_to_lock_t try_to_lock={};
const adopt_lock_t adopt_lock={};
template<typename SharedMutex>
class shared_lock : boost::noncopyable {
SharedMutex* m_shared_mutex;
bool m_is_locked;
public:
shared_lock() : m_shared_mutex(NULL), m_is_locked(false) {}
explicit shared_lock(SharedMutex& m) : m_shared_mutex(&m), m_is_locked(false) {
lock();
}
shared_lock(SharedMutex& m, adopt_lock_t) : m_shared_mutex(&m), m_is_locked(true) {
}
shared_lock(SharedMutex& m, defer_lock_t) : m_shared_mutex(&m), m_is_locked(false) {
}
shared_lock(SharedMutex& m, try_to_lock_t) : m_shared_mutex(&m), m_is_locked(false) {
try_lock();
}
~shared_lock() {
if (owns_lock()) {
m_shared_mutex->unlock_shared();
}
}
void lock() {
if(owns_lock()) {
throw boost::lock_error();
}
m_shared_mutex->lock_shared();
m_is_locked = true;
}
bool try_lock() {
if(owns_lock()) {
throw boost::lock_error();
}
m_is_locked = m_shared_mutex->try_lock_shared();
return m_is_locked;
}
void unlock() {
if(!owns_lock()) {
throw boost::lock_error();
}
m_shared_mutex->unlock_shared();
m_is_locked = false;
}
bool owns_lock() const {
return m_is_locked;
}
};
</code></pre></div></div>
<p>Because <code class="language-plaintext highlighter-rouge">unique_lock</code> and <code class="language-plaintext highlighter-rouge">shared_lock</code> are generally required to be movable, a <code class="language-plaintext highlighter-rouge">SharedMutex*</code> is used rather than a reference.</p>
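<p>To illustrate why a pointer makes moving straightforward, here is a small sketch of a move constructor for such a lock. Everything in it is hypothetical: <code class="language-plaintext highlighter-rouge">movable_shared_lock</code> is a cut-down stand-in for the class above, and <code class="language-plaintext highlighter-rouge">counting_mutex</code> is a single-threaded test double (not a real mutex) that merely counts read locks so the effect of the move can be observed.</p>

```cpp
#include <cassert>
#include <utility>

// Test double, NOT thread-safe: only counts outstanding read locks.
struct counting_mutex {
    int readers = 0;
    void lock_shared() { ++readers; }
    void unlock_shared() { --readers; }
};

// Hypothetical, simplified lock; the real shared_lock has more constructors.
template <typename SharedMutex>
class movable_shared_lock {
    SharedMutex* m_shared_mutex;  // a pointer can be reset to null after a move
    bool m_is_locked;
public:
    explicit movable_shared_lock(SharedMutex& m)
        : m_shared_mutex(&m), m_is_locked(false) {
        m_shared_mutex->lock_shared();
        m_is_locked = true;
    }
    // Moving transfers ownership and leaves the source empty.
    movable_shared_lock(movable_shared_lock&& other) noexcept
        : m_shared_mutex(other.m_shared_mutex), m_is_locked(other.m_is_locked) {
        other.m_shared_mutex = nullptr;
        other.m_is_locked = false;
    }
    movable_shared_lock(const movable_shared_lock&) = delete;
    movable_shared_lock& operator=(const movable_shared_lock&) = delete;
    ~movable_shared_lock() {
        if (m_is_locked) m_shared_mutex->unlock_shared();
    }
    bool owns_lock() const { return m_is_locked; }
};

// Usage: after the move exactly one read lock is still held.
inline int demo_move_keeps_one_read_lock() {
    counting_mutex m;
    movable_shared_lock<counting_mutex> a(m);
    movable_shared_lock<counting_mutex> b(std::move(a));
    assert(!a.owns_lock() && b.owns_lock());
    return m.readers;  // 1: the lock was transferred, not duplicated
}
```

<p>A reference member could not be re-seated or nulled out, so the moved-from lock would have no way to express "I no longer refer to any mutex".</p>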
<h3 id="shared_mutex">shared_mutex</h3>
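<p>boost's read-write lock does not use pthread_rwlock; it is implemented with a mutex and condition_variables. This may partly be for portability, and partly because boost supports upgrading a read lock to a write lock, which pthread does not. boost calls this upgrade, and <code class="language-plaintext highlighter-rouge">shared_mutex</code> also has <code class="language-plaintext highlighter-rouge">lock_upgrade</code> for acquiring an upgradable read lock, but for simplicity we ignore upgrade for now. (The code fragments below may come from boost 1.41 or from 1.68, but apart from some simple refactoring there is no great difference between the two versions.)</p>
<p>boost's shared_mutex has no explicit priority. Since it is not reader-preferring, acquiring the write lock must first set a flag that marks a write lock as pending and blocks new readers. The writer still has to wait for read locks that are already held, so we need two condition variables, one for readers and one for writers. In addition, mutual exclusion between writers is not implemented with a mutex; instead another flag marks that the write lock is held, and other writers wait.</p>
<p>boost.shared_mutex gathers these flags, together with the reader count, into an internal struct called <code class="language-plaintext highlighter-rouge">state_data</code>:</p>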
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class shared_mutex {
struct state_data {
unsigned shared_count;
bool exclusived;
bool exclusive_entered;
};
state_data m_state;
boost::mutex m_mutex_state;
boost::condition_variable m_shared_cond;
boost::condition_variable m_exclusive_cond;
public:
shared_mutex(){}
~shared_mutex(){}
void lock_shared();
bool try_lock_shared();
void unlock_shared();
void lock();
bool try_lock();
void unlock();
};
</code></pre></div></div>
<p>Here <code class="language-plaintext highlighter-rouge">m_mutex_state</code> protects <code class="language-plaintext highlighter-rouge">m_state</code>. <code class="language-plaintext highlighter-rouge">exclusive_entered</code> means a write lock is about to be taken; while <code class="language-plaintext highlighter-rouge">exclusive_entered</code> is true, no more read locks can be acquired. <code class="language-plaintext highlighter-rouge">exclusived</code> means the write lock is held and we are in the exclusive state. <code class="language-plaintext highlighter-rouge">shared_count</code> is the number of readers.</p>
<p>Because upgrade-related flags still have to be added later, <code class="language-plaintext highlighter-rouge">state_data</code> will get more complicated, so the shared_mutex implementation gives state_data some methods for convenience:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
class shared_mutex {
struct state_data {
unsigned shared_count;
bool exclusived;
bool exclusive_entered;
state_data() :
shared_count(0),
exclusived(false),
exclusive_entered(false) {}
bool can_lock_shared() const { return !(exclusived || exclusive_entered);}
bool no_shared() const { return shared_count == 0;}
bool one_shared() const { return shared_count == 1;}
bool can_lock() const { return no_shared() && !exclusived;}
void lock() {
exclusived = true;
}
void unlock() {
exclusived = false;
exclusive_entered = false;
}
void lock_shared() {
++shared_count;
}
void unlock_shared() {
--shared_count;
}
};
};
</code></pre></div></div>
<p>Let us look at the write lock <code class="language-plaintext highlighter-rouge">shared_mutex::lock()</code> first, since it is the part we understand best so far:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void shared_mutex::lock() {
boost::unique_lock<boost::mutex> lk(m_mutex_state);
while (!m_state.can_lock()) {
m_state.exclusive_entered = true;
m_exclusive_cond.wait(lk);
}
m_state.exclusived = true;
}
</code></pre></div></div>
<p>It first sets <code class="language-plaintext highlighter-rouge">exclusive_entered</code> to <code class="language-plaintext highlighter-rouge">true</code>, then waits for the existing read locks to be released, and finally sets <code class="language-plaintext highlighter-rouge">exclusived</code> to <code class="language-plaintext highlighter-rouge">true</code>.</p>
<p>Why is <code class="language-plaintext highlighter-rouge">exclusive_entered</code> set inside the while loop? Because boost's shared_mutex favours neither side, when the last read lock is released the waiting readers and writers must compete fairly (they are all woken up, and whoever grabs the lock gets it). So when the last read lock is released, <code class="language-plaintext highlighter-rouge">exclusive_entered</code> is set to false to give readers a chance to compete. As a result, a writer may wake up only to find that readers got there first, and it has to keep waiting. To keep things fair it must set <code class="language-plaintext highlighter-rouge">exclusive_entered</code> to <code class="language-plaintext highlighter-rouge">true</code> again, otherwise it might never win against the readers.</p>
<p><code class="language-plaintext highlighter-rouge">shared_mutex::try_lock()</code> is different, because it does not wait for existing read locks (<code class="language-plaintext highlighter-rouge">lk</code> could in fact also be constructed with <code class="language-plaintext highlighter-rouge">try_to_lock</code>):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bool shared_mutex::try_lock() {
boost::unique_lock<boost::mutex> lk(m_mutex_state);
if (!m_state.can_lock()) {
return false;
}
m_state.exclusived = true;
return true;
}
</code></pre></div></div>
<p>Besides changing <code class="language-plaintext highlighter-rouge">m_state</code>, <code class="language-plaintext highlighter-rouge">shared_mutex::unlock</code> has to notify the waiting readers and writers; writers take priority here, so a writer is notified first:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void shared_mutex::unlock() {
boost::unique_lock<boost::mutex> lk(m_mutex_state);
m_state.exclusived = false;
m_state.exclusive_entered = false;
m_exclusive_cond.notify_one();
m_shared_cond.notify_all();
}
</code></pre></div></div>
<p>Notifying the waiting readers and writers will be needed many more times, so we extract it into a private method of <code class="language-plaintext highlighter-rouge">shared_mutex</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void shared_mutex::notify_waiters() {
m_exclusive_cond.notify_one();
m_shared_cond.notify_all();
}
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">shared_mutex::lock_shared()</code> is also simple; it just changes the count:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void shared_mutex::lock_shared() {
boost::unique_lock<boost::mutex> lk(m_mutex_state);
while (!m_state.can_lock_shared()) {
m_shared_cond.wait(lk);
}
m_state.lock_shared();
}
bool shared_mutex::try_lock_shared() {
boost::unique_lock<boost::mutex> lk(m_mutex_state);
if (m_state.can_lock_shared()) {
m_state.lock_shared();
return true;
}
return false;
}
</code></pre></div></div>
<p>The key point of <code class="language-plaintext highlighter-rouge">shared_mutex::unlock_shared()</code> was already mentioned when explaining <code class="language-plaintext highlighter-rouge">shared_mutex::lock()</code>: the last reader to unlock needs special handling:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void shared_mutex::unlock_shared() {
boost::unique_lock<boost::mutex> lk(m_mutex_state);
m_state.unlock_shared();
if (m_state.no_shared()) {
m_state.exclusive_entered = false;
notify_waiters();
}
}
</code></pre></div></div>
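<h3 id="升级">Upgrade</h3>
<p>boost's shared_mutex supports upgrading, i.e. going from a read lock to a write lock, via <code class="language-plaintext highlighter-rouge">upgrade_lock</code> (or possibly <code class="language-plaintext highlighter-rouge">upgrade_mutex</code>). The upgrade is not as simple as releasing the read lock and then taking a write lock: it carries an implicit guarantee that the data has not been modified by the time the upgrade completes. This means only one read lock can be upgradable at a time; otherwise upgrades could race, and after upgrading we could not know whether another thread had modified the data. [1]</p>
<p>To achieve this, the upgrade gets the highest priority: when the last read lock is released, the upgrading lock is notified first and the others afterwards. This requires one more condition variable.</p>
<p>Now for the implementation. First we add a flag to <code class="language-plaintext highlighter-rouge">state_data</code> that guarantees there is only one upgradable lock, then we add some new interfaces to shared_mutex:</p>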
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class shared_mutex {
struct state_data {
// ...
state_data() : /*...,*/ upgrade(false) {}
bool upgrade;
bool can_lock_upgrade() const { return can_lock_shared() && !upgrade;}
void lock_upgrade() {
++shared_count;
upgrade = true;
}
void unlock_upgrade() {
upgrade = false;
--shared_count;
}
// ...
};
boost::condition_variable m_upgrade_cond;
// ...
void lock_upgrade();
bool try_lock_upgrade();
void unlock_upgrade();
void unlock_upgrade_and_lock();
};
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">shared_mutex::lock_upgrade()</code> is much like <code class="language-plaintext highlighter-rouge">shared_mutex::lock_shared()</code>; it only additionally takes the new <code class="language-plaintext highlighter-rouge">upgrade</code> flag into account:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void shared_mutex::lock_upgrade() {
boost::unique_lock<boost::mutex> lk(m_mutex_state);
while (!m_state.can_lock_upgrade()) {
m_shared_cond.wait(lk);
}
m_state.lock_upgrade();
}
bool shared_mutex::try_lock_upgrade() {
boost::unique_lock<boost::mutex> lk(m_mutex_state);
if (!m_state.can_lock_upgrade()) {
return false;
}
m_state.lock_upgrade();
return true;
}
</code></pre></div></div>
<p>Note that in <code class="language-plaintext highlighter-rouge">shared_mutex::unlock_upgrade()</code>, if read locks are still held, we can notify readers that may be waiting in <code class="language-plaintext highlighter-rouge">lock_upgrade()</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void shared_mutex::unlock_upgrade() {
boost::unique_lock<boost::mutex> lk(m_mutex_state);
m_state.unlock_upgrade();
if (m_state.no_shared()) {
m_state.exclusive_entered = false;
notify_waiters();
} else {
m_shared_cond.notify_all();
}
}
</code></pre></div></div>
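<p><code class="language-plaintext highlighter-rouge">shared_mutex::unlock_upgrade_and_lock()</code> really is just releasing the read lock and then taking the write lock, because the priority of the upgrade is not guaranteed here but in <code class="language-plaintext highlighter-rouge">unlock_shared()</code>, which we will modify in a moment:</p>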
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void shared_mutex::unlock_upgrade_and_lock() {
boost::unique_lock<boost::mutex> lk(m_mutex_state);
m_state.unlock_shared();
while (!m_state.no_shared()) {
m_upgrade_cond.wait(lk);
}
m_state.lock();
m_state.upgrade = false;
}
</code></pre></div></div>
<p>Note that what is waited for here is <code class="language-plaintext highlighter-rouge">m_state.no_shared()</code> rather than <code class="language-plaintext highlighter-rouge">can_lock()</code>; there is a reason for this, explained shortly.</p>
<p><code class="language-plaintext highlighter-rouge">shared_mutex::unlock_shared()</code> needs to be changed:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void shared_mutex::unlock_shared() {
boost::unique_lock<boost::mutex> lk(m_mutex_state);
m_state.unlock_shared();
if (m_state.no_shared()) {
if(m_state.upgrade) {
// As there is a thread doing a unlock_upgrade_and_lock that is waiting for state.no_shared()
// avoid other threads to lock, lock_upgrade or lock_shared, so only this thread is notified.
m_state.upgrade = false;
m_state.exclusived = true;
m_upgrade_cond.notify_one();
} else {
m_state.exclusive_entered = false;
}
notify_waiters();
}
}
</code></pre></div></div>
<p>Note that if this is the last read lock and <code class="language-plaintext highlighter-rouge">m_state.upgrade</code> is still true, an upgrade_lock is in the middle of upgrading,
so <code class="language-plaintext highlighter-rouge">m_state.exclusived</code> must be set to true. After that no other <code class="language-plaintext highlighter-rouge">lock</code>, <code class="language-plaintext highlighter-rouge">lock_upgrade</code> or <code class="language-plaintext highlighter-rouge">lock_shared</code> can proceed, only the <code class="language-plaintext highlighter-rouge">unlock_upgrade_and_lock</code> that is about to be notified. And because <code class="language-plaintext highlighter-rouge">m_state.exclusived</code> is now <code class="language-plaintext highlighter-rouge">true</code>, <code class="language-plaintext highlighter-rouge">unlock_upgrade_and_lock</code> can only wait for <code class="language-plaintext highlighter-rouge">no_shared()</code>, not for <code class="language-plaintext highlighter-rouge">can_lock()</code>.</p>
<p>As for why <code class="language-plaintext highlighter-rouge">m_state.upgrade</code> is set to false, I do not fully understand it. It has been there since the very first version more than ten years ago, yet nothing seems to need it to be false, since <code class="language-plaintext highlighter-rouge">exclusived</code> alone already ensures that no other lock can be acquired. I asked a <a href="https://stackoverflow.com/questions/54105754/why-boost-shared-mutex-unlock-shared-need-to-set-state-upgrade-to-false-in-the-l">question</a> about this on Stack Overflow, and someone pointed out that, from a state-machine point of view, <code class="language-plaintext highlighter-rouge">exclusived</code> and <code class="language-plaintext highlighter-rouge">upgrade</code> should not both be <code class="language-plaintext highlighter-rouge">true</code> at the same time.</p>
<p>We like RAII, so <code class="language-plaintext highlighter-rouge">lock_upgrade()</code> has a corresponding <code class="language-plaintext highlighter-rouge">upgrade_lock</code>, and <code class="language-plaintext highlighter-rouge">unlock_upgrade_and_lock()</code> is what is used when moving from an <code class="language-plaintext highlighter-rouge">upgrade_lock</code> to a <code class="language-plaintext highlighter-rouge">unique_lock</code>, assuming we have a move constructor:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template<typename Mutex>
unique_lock<Mutex>::unique_lock(upgrade_lock<Mutex>&& other):
m(other.m),is_locked(other.is_locked)
{
other.is_locked=false;
other.m = NULL;
if(is_locked)
{
m->unlock_upgrade_and_lock();
}
}
</code></pre></div></div>
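<h2 id="stl实现">The STL implementation</h2>
<p>The shared_mutex in the standard library is based on Howard E. Hinnant's proposal [3], but the C++17 standard does not support upgrading, so upgrade is not discussed below either.</p>
<p>Put simply, this implementation uses two condition variables as two "gates". The first gate means no write is in progress, and the second gate means no read is in progress. A reader can take the read lock as soon as it passes the first gate. A writer passes the first gate, closes it behind itself, then passes the second gate, at which point it holds the write lock.</p>
<p>A single <code class="language-plaintext highlighter-rouge">unsigned</code> stores all the state: the highest bit holds <code class="language-plaintext highlighter-rouge">exclusive_entered</code> and the remaining bits hold the number of readers, so all the operations are bit manipulation. The reason for using just one <code class="language-plaintext highlighter-rouge">unsigned</code> is the hope that it can later be turned into an atomic variable, as a way of improving the read-write lock's performance.</p>
<p>Let us declare the interface first:</p>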
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
class shared_mutex {
std::mutex mut_;
std::condition_variable gate1_;
std::condition_variable gate2_;
unsigned state_;
/* example:
* sizeof(unsigned) == 4;
* CHAR_BIT == 8;
* EXCLUSIVE_ENTERED_MASK == 0x80000000;
* MAX_SHARED_COUNT_MASK == 0x7fffffff;
* NO_EXCLUSIVE_NO_SHARED == 0x00000000;
*/
static const unsigned EXCLUSIVE_ENTERED_MASK = 1U << (sizeof(unsigned) * CHAR_BIT - 1);
static const unsigned MAX_SHARED_COUNT_MASK = ~EXCLUSIVE_ENTERED_MASK;
static const unsigned NO_EXCLUSIVE_NO_SHARED = 0;
public:
shared_mutex() : state_(NO_EXCLUSIVE_NO_SHARED) {}
// Exclusive ownership
void lock();
bool try_lock();
void unlock();
// Shared ownership
void lock_shared();
bool try_lock_shared();
void unlock_shared();
};
</code></pre></div></div>
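<p>Reading bit-manipulation code directly is hard on the eyes, so here I tidy it up by replacing the original bit-manipulation statements with private functions. As in the discussion above, these private functions all operate on <code class="language-plaintext highlighter-rouge">state_</code>, and all assume that <code class="language-plaintext highlighter-rouge">mut_</code> is already held when they are called:</p>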
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class shared_mutex {
// ...
private:
bool _exclusive_entered() const { return (state_ & EXCLUSIVE_ENTERED_MASK); }
unsigned _shared_count() const { return (state_ & MAX_SHARED_COUNT_MASK); }
bool _no_shared() const { return _shared_count() == 0;}
bool _full_shared() const { return _shared_count() == MAX_SHARED_COUNT_MASK; }
bool _can_lock() const { return state_ == NO_EXCLUSIVE_NO_SHARED; }
bool _can_lock_shared() const { return (!_exclusive_entered() && !_full_shared());}
void _lock_shared() {
const unsigned num = _shared_count() + 1;
state_ &= ~MAX_SHARED_COUNT_MASK;
state_ |= num;
}
void _unlock_shared() {
const unsigned num = _shared_count() - 1;
state_ &= ~MAX_SHARED_COUNT_MASK;
state_ |= num;
}
void _lock() {
state_ = EXCLUSIVE_ENTERED_MASK;
assert(_no_shared() && _exclusive_entered());
}
void _unlock() {
state_ = NO_EXCLUSIVE_NO_SHARED;
assert(_no_shared() && !_exclusive_entered());
}
void _enter_exclusive() {
state_ |= EXCLUSIVE_ENTERED_MASK;
}
// ...
};
</code></pre></div></div>
<p>An <code class="language-plaintext highlighter-rouge">unsigned</code> is finite, after all, so the number of readers has an upper bound, and no more readers are admitted once it is reached. Hence <code class="language-plaintext highlighter-rouge">_full_shared()</code> indicates that the count is full, and <code class="language-plaintext highlighter-rouge">_can_lock_shared()</code> also requires that it is not.</p>
<p>Now let us look directly at <code class="language-plaintext highlighter-rouge">shared_mutex::lock()</code> and <code class="language-plaintext highlighter-rouge">shared_mutex::try_lock()</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void shared_mutex::lock()
{
std::unique_lock<std::mutex> lk(mut_);
while (_exclusive_entered()) {
gate1_.wait(lk);
}
_enter_exclusive();
while (!_no_shared()) {
gate2_.wait(lk);
}
_lock(); // unnecessary
}
bool shared_mutex::try_lock()
{
std::unique_lock<std::mutex> lk(mut_, std::try_to_lock);
if (lk.owns_lock() && _can_lock()) {
_lock();
return true;
}
return false;
}
</code></pre></div></div>
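<p>At the first gate, if no other writer has entered, the current writer enters and closes the gate behind it (<code class="language-plaintext highlighter-rouge">_enter_exclusive()</code>), so that no other reader or writer can get in. It then waits at the second gate for all readers to leave and goes in itself, and the write lock is held. So the <code class="language-plaintext highlighter-rouge">_lock()</code> call is not actually necessary, because the state is necessarily exclusive at that point.</p>
<p>For <code class="language-plaintext highlighter-rouge">try_lock</code>, even <code class="language-plaintext highlighter-rouge">mut_</code> is only tried. <code class="language-plaintext highlighter-rouge">_can_lock()</code> means there are neither readers nor a writer inside the first gate, so we can pass straight through the second gate and complete the lock; in this case <code class="language-plaintext highlighter-rouge">_lock()</code> is required.</p>
<p><code class="language-plaintext highlighter-rouge">shared_mutex::unlock()</code> returns <code class="language-plaintext highlighter-rouge">state_</code> to the state with no readers and no writer:</p>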
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void shared_mutex::unlock()
{
{
std::lock_guard<std::mutex> _(mut_);
_unlock();
}
gate1_.notify_all();
}
</code></pre></div></div>
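<p>While a write lock is held, readers are all blocked outside the first gate, so what gets notified here is <code class="language-plaintext highlighter-rouge">gate1_</code>.</p>
<p><code class="language-plaintext highlighter-rouge">shared_mutex::lock_shared()</code>, then, is the story of readers waiting at the first gate:</p>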
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void shared_mutex::lock_shared()
{
std::unique_lock<std::mutex> lk(mut_);
while (!_can_lock_shared()) {
gate1_.wait(lk);
}
_lock_shared();
}
bool shared_mutex::try_lock_shared()
{
std::unique_lock<std::mutex> lk(mut_, std::try_to_lock);
if (lk.owns_lock() && _can_lock_shared()) {
_lock_shared();
return true;
}
return false;
}
</code></pre></div></div>
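<p><code class="language-plaintext highlighter-rouge">shared_mutex::unlock_shared()</code> is slightly more complicated. As mentioned earlier, <code class="language-plaintext highlighter-rouge">std::shared_mutex</code> accounts for the reader count being full, so on unlock, if the count was full beforehand it no longer is afterwards, and a reader waiting outside the gate has to be notified. Also, if a writer is inside the first gate, the last reader to leave has to notify that writer that it may pass the second gate:</p>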
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void shared_mutex::unlock_shared()
{
std::lock_guard<std::mutex> lk(mut_);
const bool full_shared_before = _full_shared();
_unlock_shared();
if (_exclusive_entered()) {
if (_no_shared()) {
gate2_.notify_one();
}
} else {
if (full_shared_before) {
gate1_.notify_one();
}
}
}
</code></pre></div></div>
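<p>Because upgrading does not have to be considered, the code is somewhat simpler and easier to follow. Having understood the code above, which I have "tidied up", the versions in references [3,4] should be easier to read.</p>
<p>This implementation favours writers more than boost's does. In boost, when the last reader unlocks, both the waiting readers and the waiting writers are notified and all of them compete. Hinnant considered this to put writers at risk of starvation: readers outnumber writers, so a writer that misses its chance may wait a very long time. So in the STL implementation, if a writer has reached the second gate, only that writer is notified.</p>
<h2 id="被批判的读写锁">Criticism of read-write locks</h2>
<p>People have criticized the performance of read-write locks quite a lot [5,6,7].</p>
<p>As the two implementations above show, whether in boost or in the STL, a shared_mutex always needs some state and a count, and naturally a mutex to protect that state. This means that whether we take a read lock or a write lock, the shared_mutex itself has to lock a mutex, so the cost cannot be lower than locking a mutex ourselves [8].</p>
<p>So when the critical section is very small, a read-write lock may be no faster than a plain, brute-force mutex; and a very large critical section suggests the code is poorly written, since shrinking critical sections is our lifelong ambition. Whether to use a read-write lock can only be decided by measuring.</p>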
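<p>As a starting point for such a measurement, the sketch below times uncontended lock/unlock pairs on a <code class="language-plaintext highlighter-rouge">std::mutex</code> and read-lock pairs on a <code class="language-plaintext highlighter-rouge">std::shared_mutex</code> (C++17). The helper names <code class="language-plaintext highlighter-rouge">time_lock_pairs</code>, <code class="language-plaintext highlighter-rouge">lock_costs</code> and <code class="language-plaintext highlighter-rouge">measure_uncontended</code> are invented for this example, and a single-threaded loop says nothing about behaviour under contention, so it is only a first sanity check, not a real benchmark.</p>

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <shared_mutex>

// Runs `iterations` lock/unlock pairs and returns the elapsed nanoseconds.
template <typename Lock, typename Unlock>
long long time_lock_pairs(int iterations, Lock lock, Unlock unlock) {
    const auto begin = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        lock();
        unlock();  // empty critical section: we only measure the lock itself
    }
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin).count();
}

struct lock_costs {
    long long mutex_ns;   // std::mutex lock()/unlock()
    long long shared_ns;  // std::shared_mutex lock_shared()/unlock_shared()
};

// Single-threaded, uncontended comparison of the two primitives.
inline lock_costs measure_uncontended(int iterations) {
    assert(iterations > 0);
    std::mutex m;
    std::shared_mutex sm;
    lock_costs r;
    r.mutex_ns = time_lock_pairs(iterations,
                                 [&] { m.lock(); }, [&] { m.unlock(); });
    r.shared_ns = time_lock_pairs(iterations,
                                  [&] { sm.lock_shared(); }, [&] { sm.unlock_shared(); });
    return r;
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">measure_uncontended(1000000)</code> and comparing the two fields gives a rough per-operation cost on a given platform; a realistic test would add several reader and writer threads and a critical section of the size the application actually has.</p>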
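<p>If very high performance is needed, RCU (Read-Copy Update) is a viable option [9], though it needs system support. When we discuss RCU in the future<del>part of the "I'll get to it someday" series</del>, we will benchmark the performance difference between read-write locks and RCU in detail.</p>
<p>In addition, as far as correctness goes, it is quite possible to perform a write while holding only a read lock, which is the same as unprotected concurrent writes. Implementation-wise, read locks are reentrant while a write lock blocks other read locks, which can cause a deadlock when a read lock is re-entered [8].</p>
<p>I have not run into a need for read-write locks in my own work, and so have never been bitten by them, so I will refrain from passing judgment here.</p>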
<p><strong>Reference:</strong></p>
<ul>
<li class="ref">[1] Anthony Williams. <a href="https://www.boost.org/doc/libs/1_69_0/doc/html/thread/synchronization.html#thread.synchronization.mutex_types.shared_mutex">Synchronization - Boost 1.69</a>, Dec.2018</li>
<li class="ref">[2] Raynal, Michel, <em>Concurrent Programming: Algorithms, Principles, and Foundations</em>. Springer. 2012</li>
<li class="ref">[3] Howard E. Hinnant, <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2406.html#shared_mutex">Mutex, Lock, Condition Variable Rationale</a>, Sept.2007</li>
<li class="ref">[4] Howard E. Hinnant, <a href="https://stackoverflow.com/a/28140784/5570232">How to make a multiple-read/single-write lock from more basic synchronization primitives?</a>, Jan.2015</li>
<li class="ref">[5] viboes, <a href="https://svn.boost.org/trac10/ticket/11798">Implementation of boost::shared_mutex on POSIX is suboptimal</a>, Nov.2015</li>
<li class="ref">[6] AlexeyAB, <a href="https://www.codeproject.com/Articles/1183423/%2FArticles%2F1183423%2FWe-make-a-std-shared-mutex-times-faster">We make a std::shared_mutex 10 times faster</a>, Jun. 2017</li>
<li class="ref">[7] Bryan Cantrill, Jeff Bonwick, <a href="https://queue.acm.org/detail.cfm?id=1454462">Real-world Concurrency</a>, <a href="http://delivery.acm.org/10.1145/1460000/1454462/p16-cantrill.pdf?ip=202.79.203.99&id=1454462&acc=OPEN&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E6D218144511F3437&__acm__=1547107640_5649aab2cc6362093d444b794ca3c087">PDF</a>, Oct. 2008</li>
<li class="ref">[8] 陈硕, <em>Linux多线程服务端编程: 使用muduo C++网络库</em>. 北京, 电子工业出版社, 2013, p43 ~ 44</li>
<li class="ref">[9] 杨燚, <a href="https://www.ibm.com/developerworks/cn/linux/l-rcu/">Linux 2.6内核中新的锁机制–RCU</a>, July. 2005</li>
</ul>
A bug caused by timed waits: an unavoidable timeout race
2019-01-02T00:00:00+00:00
http://dengzuoheng.github.io/a-bug-of-timedwait-an-unavoidable-race
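<h2 id="疑云">The mystery</h2>
<p>One day QA filed a bug against me: after using feature A, which I had written, our software might freeze when another feature B was used, and it looked like my fault. After stripping away the business logic, the code boiled down to something like this:</p>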
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include <QApplication>
#include <QMainWindow>
#include <QtConcurrentMap>
#include <QtConcurrentRun>
#include <QFuture>
#include <QThreadPool>
#include <QtTest/QTest>
#include <QFutureSynchronizer>
#include <QVector>
#include <QDebug>
#include <cassert>
struct Task2 { // only calculation
typedef void result_type;
void operator()(int count) {
int k = 0;
for (int i = 0; i < count * 10; ++i) {
for (int j = 0; j < count * 10; ++j) {
k++;
}
}
assert(k >= 0);
}
};
struct Task1 { // will launch some other concurrent map
typedef void result_type;
void operator()(int count) {
QVector<int> vec;
for (int i = 0; i < 5; ++i) {
vec.push_back(i+count);
}
Task2 task;
QFuture<void> f = QtConcurrent::map(vec.begin(), vec.end(), task);
{
// without releaseThread before wait, it will hang directly
QThreadPool::globalInstance()->releaseThread();
f.waitForFinished(); // BUG: may hang there
QThreadPool::globalInstance()->reserveThread();
}
}
};
int main() {
QThreadPool* gtpool = QThreadPool::globalInstance();
gtpool->setExpiryTimeout(50);
int count = 0;
for (;;) {
QVector<int> vec;
for (int i = 0; i < 40 ; i++) {
vec.push_back(i);
}
// feature A, launch a task with nested map
Task1 task; // Task1 will have nested concurrent map
QFuture<void> f = QtConcurrent::map(vec.begin(), vec.end(),task);
f.waitForFinished(); // BUG: may hang there
count++;
// wait until no thread in the pool is active (most will have expired)
while (QThreadPool::globalInstance()->activeThreadCount() > 0) {
QTest::qSleep(50);
}
// feature B, launch a task only calculation
Task2 task2;
QFuture<void> f2 = QtConcurrent::map(vec.begin(), vec.end(), task2);
f2.waitForFinished(); // BUG: may hang there
qDebug() << count;
}
return 0;
}
</code></pre></div></div>
<p>The bug is that this code cannot run forever: it may hang in <code class="language-plaintext highlighter-rouge">waitForFinished</code>. It reproduces in the following environments:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Linux version 2.6.32-696.18.7.el6.x86_64; Qt4.7.4; GCC 3.4.5
Windows 7; Qt4.7.4; mingw 4.4.0
</code></pre></div></div>
<p>A few words on why the code is written this way:</p>
<p>First, <code class="language-plaintext highlighter-rouge">Task1</code> nests a <code class="language-plaintext highlighter-rouge">QtConcurrent::map</code> because <code class="language-plaintext highlighter-rouge">Task1</code> only knows how many <code class="language-plaintext highlighter-rouge">Task2</code> instances to launch after it has done part of its own work, and that part is time-consuming as well.</p>
<p>What about the <code class="language-plaintext highlighter-rouge">QThreadPool::globalInstance()->releaseThread()</code> in the middle of <code class="language-plaintext highlighter-rouge">Task1</code>? Waiting on the QFuture returned by <code class="language-plaintext highlighter-rouge">QtConcurrent::map</code> blocks (by contrast, the QFuture returned by <code class="language-plaintext highlighter-rouge">QtConcurrent::run</code> may "steal" its task back and run it on the waiting thread if the task has not started yet), so the wait ties up a pool thread that does nothing. With enough such idle waiters, every thread in the pool is taken. Nesting <code class="language-plaintext highlighter-rouge">QtConcurrent::map</code> therefore requires adjusting the thread count of the global pool, which is what <code class="language-plaintext highlighter-rouge">releaseThread</code> does here:</p>
<blockquote>
<p><strong>void QThreadPool::releaseThread()</strong>
Releases a thread previously reserved by a call to <code class="language-plaintext highlighter-rouge">reserveThread()</code>.</p>
<p>Note: Calling this function without previously reserving a thread temporarily increases <code class="language-plaintext highlighter-rouge">maxThreadCount()</code>. <strong>This is useful when a thread goes to sleep waiting for more work, allowing other threads to continue. Be sure to call <code class="language-plaintext highlighter-rouge">reserveThread()</code> when done waiting, so that the thread pool can correctly maintain the <code class="language-plaintext highlighter-rouge">activeThreadCount()</code></strong>.</p>
<p>See also <code class="language-plaintext highlighter-rouge">reserveThread()</code>.</p>
<p><strong>void QThreadPool::reserveThread()</strong>
Reserves one thread, disregarding <code class="language-plaintext highlighter-rouge">activeThreadCount()</code> and <code class="language-plaintext highlighter-rouge">maxThreadCount()</code>.</p>
<p>Once you are done with the thread, call <code class="language-plaintext highlighter-rouge">releaseThread()</code> to allow it to be reused.</p>
<p>Note: This function will always increase the number of active threads. This means that by using this function, it is possible for <code class="language-plaintext highlighter-rouge">activeThreadCount()</code> to return a value greater than <code class="language-plaintext highlighter-rouge">maxThreadCount()</code>.</p>
<p>See also <code class="language-plaintext highlighter-rouge">releaseThread()</code>.</p>
</blockquote>
<p>If you still suspect these two functions, here is their source:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// qtlib4.7.4/src/corelib/concurrent/qthreadpool.cpp
void QThreadPool::reserveThread()
{
Q_D(QThreadPool);
QMutexLocker locker(&d->mutex);
++d->reservedThreads;
}
void QThreadPool::releaseThread()
{
Q_D(QThreadPool);
QMutexLocker locker(&d->mutex);
--d->reservedThreads;
d->tryToStartMoreThreads();
}
</code></pre></div></div>
<p>There is a lock, and <code class="language-plaintext highlighter-rouge">tryToStartMoreThreads()</code> is called too. So what is <code class="language-plaintext highlighter-rouge">reservedThreads</code> for? It is used in <code class="language-plaintext highlighter-rouge">activeThreadCount()</code> and <code class="language-plaintext highlighter-rouge">tooManyThreadsActive()</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>int QThreadPoolPrivate::activeThreadCount() const
{
// To improve scalability this function is called without holding
// the mutex lock -- keep it thread-safe.
return (allThreads.count()
- expiredThreads.count()
- waitingThreads
+ reservedThreads);
}
void QThreadPoolPrivate::tryToStartMoreThreads()
{
// try to push tasks on the queue to any available threads
while (!queue.isEmpty() && tryStart(queue.first().first))
queue.removeFirst();
}
bool QThreadPoolPrivate::tooManyThreadsActive() const
{
const int activeThreadCount = this->activeThreadCount();
return activeThreadCount > maxThreadCount && (activeThreadCount - reservedThreads) > 1;
}
</code></pre></div></div>
<p>The smaller <code class="language-plaintext highlighter-rouge">reservedThreads</code> is, the smaller <code class="language-plaintext highlighter-rouge">activeThreadCount()</code> becomes and the more threads can be started. Nothing looks wrong.</p>
<p>So by rights the code above should run forever. Why does it hang?</p>
<h2 id="线索">Clues</h2>
<p>Attaching gdb during a hang and running <code class="language-plaintext highlighter-rouge">info threads</code> shows far fewer threads than expected: apart from the main thread there are barely two. My guess was that this might be a Qt bug, with QThreadPool failing to maintain <code class="language-plaintext highlighter-rouge">activeThreadCount()</code> correctly.</p>
<p>To find out how it goes wrong, we need to look at the values that feed into <code class="language-plaintext highlighter-rouge">activeThreadCount()</code>. In gdb, set a breakpoint inside <code class="language-plaintext highlighter-rouge">activeThreadCount()</code>, then, while the program is hung, run <code class="language-plaintext highlighter-rouge">print gti->activeThreadCount()</code> to hit it (gti is a temporary variable pointing to <code class="language-plaintext highlighter-rouge">QThreadPool::globalInstance()</code>).</p>
<p>As for which value looks wrong: <code class="language-plaintext highlighter-rouge">reservedThreads</code> is fine, it is just what we set; <code class="language-plaintext highlighter-rouge">allThreads</code> and <code class="language-plaintext highlighter-rouge">expiredThreads</code> do not tell us much either way. <code class="language-plaintext highlighter-rouge">waitingThreads</code>, however, is negative at this point, which is very suspicious.</p>
<p>Suspicious indeed: at first glance nothing in the code can make <code class="language-plaintext highlighter-rouge">waitingThreads</code> negative. Only two places change it, <code class="language-plaintext highlighter-rouge">QThreadPoolThread::run()</code> and <code class="language-plaintext highlighter-rouge">QThreadPoolPrivate::tryStart</code>.</p>
<p><code class="language-plaintext highlighter-rouge">tryStart</code> looks roughly like this (abridged):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bool QThreadPoolPrivate::tryStart(QRunnable *task)
{
    QMutexLocker locker(&mutex);
    taskQueue.append(task); // Place the task on the task queue
    if (waitingThreads > 0) {
        // There are already idle threads running. They are waiting on the
        // 'runnableReady' QWaitCondition. Wake one of them up.
        waitingThreads--;
        runnableReady.wakeOne();
    } else if (runningThreadCount < maxThreadCount) {
        startNewThread(task);
    }
    // return value omitted in this abridged listing
}
</code></pre></div></div>
<p>And <code class="language-plaintext highlighter-rouge">run</code> looks roughly like this (abridged):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void QThreadPoolThread::run()
{
    QMutexLocker locker(&manager->mutex);
    for(;;) {
        QRunnable *r = manager->queue.takeFirst();
        do {
            if (r) {
                // run the task
                locker.unlock();
                r->run();
                locker.relock();
            }
            // if too many threads are active, expire this thread
            if (manager->tooManyThreadsActive())
                break;
            r = manager->queue.takeFirst();
        } while (r != 0);
        // if too many threads are active, expire this thread
        bool expired = manager->tooManyThreadsActive();
        if (!expired) {
            ++manager->waitingThreads;
            registerTheadInactive();
            // wait for work, exiting after the expiry timeout is reached
            expired = !manager->runnableReady.wait(locker.mutex(), manager->expiryTimeout);
            ++manager->activeThreads;
            if (expired)
                --manager->waitingThreads; //<- break here
        }
        if (expired) {
            manager->expiredThreads.enqueue(this);
            registerTheadInactive();
            break;
        }
    }
}
</code></pre></div></div>
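<p>In <code class="language-plaintext highlighter-rouge">tryStart</code>, <code class="language-plaintext highlighter-rouge">waitingThreads</code> is only decremented when it is greater than 0, and the whole sequence is protected by the lock. So any problem has to be in <code class="language-plaintext highlighter-rouge">run</code>, because <code class="language-plaintext highlighter-rouge">QWaitCondition::wait</code> releases the lock. I therefore put a conditional breakpoint at <code class="language-plaintext highlighter-rouge">//<- break here</code> that triggers when <code class="language-plaintext highlighter-rouge">waitingThreads</code> equals 0. If execution stops there, the decrement that follows takes the value negative.</p>
<p>And yes, it does stop there.</p>
<h2 id="真相">The truth</h2>
<p>You have probably spotted the problem already: during a timed wait on a condition variable, what happens if the notify arrives at the very moment the timeout expires? Does it count as a timeout or as a signal?</p>
<p>Here is what the pthread <a href="http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_cond_timedwait.html">documentation</a> says:</p>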
<blockquote>
<p>It is important to note that when pthread_cond_wait() and pthread_cond_timedwait() return without error, the associated predicate may still be false. Similarly, when pthread_cond_timedwait() returns with the timeout error, the associated predicate may be true due to an <strong>unavoidable race</strong> between the expiration of the timeout and the predicate state change.</p>
</blockquote>
<p>And here is what boost 1.66 says:</p>
<blockquote>
<p>When this function returns true:</p>
<ul>
<li>A notification (or sometimes a spurious OS signal) has been received</li>
<li>Do not assume that the timeout has not been reached</li>
<li>Do not assume that the predicate has been changed</li>
</ul>
<p>When this function returns false:</p>
<ul>
<li>The timeout has been reached</li>
<li><strong>Do not assume that a notification has not been received</strong></li>
<li>Do not assume that the predicate has not been changed</li>
</ul>
</blockquote>
<p>In other words, we can tell that the timeout really expired, but not whether a signal was sent. That brings us very close to the truth.</p>
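<p>One common way to live with this race is to decide on the predicate rather than on the status the wait reports. The following is a minimal sketch with <code class="language-plaintext highlighter-rouge">std::condition_variable</code>, not Qt code; <code class="language-plaintext highlighter-rouge">WorkSignal</code>, <code class="language-plaintext highlighter-rouge">post</code> and <code class="language-plaintext highlighter-rouge">wait_for_work</code> are hypothetical names. The predicate overload of <code class="language-plaintext highlighter-rouge">wait_for</code> returns the value of the predicate, so a notification that lands together with the timeout is still seen:</p>

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper, not part of Qt: a one-slot "work available" flag.
struct WorkSignal {
    std::mutex mutex;
    std::condition_variable ready;
    bool has_work = false;

    void post() {
        {
            std::lock_guard<std::mutex> lk(mutex);
            has_work = true;  // change the predicate under the lock
        }
        ready.notify_one();
    }

    // Returns true if work was claimed, false if the wait expired with
    // nothing to do. The decision is based on has_work, never on whether
    // the underlying wait reported a timeout.
    bool wait_for_work(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lk(mutex);
        const bool got = ready.wait_for(lk, timeout, [this] { return has_work; });
        if (got) {
            has_work = false;  // consume the work item
        }
        return got;
    }
};
```

<p>With a zero timeout the wait expires at once, yet a previously posted item is still reported as work, which is the behavior the counter-based pool code was missing.</p>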
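<p>Let us reconstruct the scene. At some point thread A finishes all its tasks, executes <code class="language-plaintext highlighter-rouge">++manager->waitingThreads</code> and enters the timed wait. A little later thread B adds a task to the pool; because <code class="language-plaintext highlighter-rouge">manager->waitingThreads > 0</code>, <code class="language-plaintext highlighter-rouge">tryStart</code> claims the waiting thread A, decrements <code class="language-plaintext highlighter-rouge">waitingThreads</code> and wakes it with a notify. As luck would have it, thread A's timed wait expires just as the notify arrives. Thread A believes it has really expired, decrements <code class="language-plaintext highlighter-rouge">waitingThreads</code> itself, stops working and joins the expired queue. <code class="language-plaintext highlighter-rouge">waitingThreads</code> has now been decremented twice for the same thread and goes negative, corrupting the pool's state.</p>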
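<p>The sequence can be replayed deterministically on a toy model that keeps only the counters. This is a simplified sketch, not Qt code: the struct and function names are made up, and the steps of the two threads are simply called in the order the race produces them.</p>

```cpp
#include <cassert>

// Toy model of the Qt 4.7 bookkeeping; only the counters are kept.
struct PoolCounters {
    int allThreads = 0;
    int expiredThreads = 0;
    int waitingThreads = 0;
    int reservedThreads = 0;

    // Same arithmetic as QThreadPoolPrivate::activeThreadCount() quoted above.
    int activeThreadCount() const {
        return allThreads - expiredThreads - waitingThreads + reservedThreads;
    }
};

// Thread A: finished its tasks and enters the timed wait.
inline void enter_timed_wait(PoolCounters& c) { ++c.waitingThreads; }

// Thread B in tryStart(): sees an idle thread and claims it.
inline bool try_start(PoolCounters& c) {
    if (c.waitingThreads > 0) {
        --c.waitingThreads;  // in Qt this is followed by runnableReady.wakeOne()
        return true;
    }
    return false;
}

// Thread A again: its wait reports a timeout although it was already claimed.
inline void leave_wait_as_expired(PoolCounters& c) {
    --c.waitingThreads;      // second decrement for the same thread
    ++c.expiredThreads;
}
```

<p>Starting from one thread, calling <code class="language-plaintext highlighter-rouge">enter_timed_wait</code>, <code class="language-plaintext highlighter-rouge">try_start</code> and <code class="language-plaintext highlighter-rouge">leave_wait_as_expired</code> in that order leaves <code class="language-plaintext highlighter-rouge">waitingThreads</code> at -1, and <code class="language-plaintext highlighter-rouge">activeThreadCount()</code> reports 1 even though no thread is running any more.</p>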
<h2 id="结案">Case closed</h2>
<p>In fact this bug was found as early as 2013, as <a href="https://bugreports.qt.io/browse/QTBUG-3786">QTBUG-3786</a>, and it was fixed in Qt4.8.6 (<a href="https://github.com/nonrational/qt-everywhere-opensource-src-4.8.6/blob/master/changes-4.8.6">release log</a>); see this <a href="https://github.com/qt/qtbase/commit/a9b6a78e54670a70b96c122b10ad7bd64d166514#diff-6d5794cef91df41c39b5e7cc6b71d041">diff</a>.</p>
<p>Because a single integer cannot reliably track <code class="language-plaintext highlighter-rouge">waitingThreads</code>, QThreadPool replaced it with a waitingThreads queue. A thread enqueues itself before the timed wait and tries to remove itself after being woken or timing out. If <code class="language-plaintext highlighter-rouge">tryStart</code> has already claimed and dequeued it, the thread is no longer in the queue and the removal fails; conversely, if the removal succeeds, there was no notify and the thread has really expired.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// if too many threads are active, expire this thread
bool expired = manager->tooManyThreadsActive();
if (!expired) {
manager->waitingThreads.enqueue(this);
registerThreadInactive();
// wait for work, exiting after the expiry timeout is reached
runnableReady.wait(locker.mutex(), manager->expiryTimeout);
++manager->activeThreads;
if (manager->waitingThreads.removeOne(this))
expired = true;
}
if (expired) {
manager->expiredThreads.enqueue(this);
registerThreadInactive();
break;
}
</code></pre></div></div>
<p>Note that the condition variable has been moved into QThreadPoolThread, so each QThreadPoolThread has its own. This lets <code class="language-plaintext highlighter-rouge">tryStart</code> dequeue first and notify afterwards, and only the chosen thread is woken.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if (waitingThreads.count() > 0) {
// recycle an available thread
enqueueTask(task);
waitingThreads.takeFirst()->runnableReady.wakeOne();
return true;
}
</code></pre></div></div>
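<p>Why the queue settles the ambiguity can be seen on a small single-threaded model. This is a sketch under assumptions: <code class="language-plaintext highlighter-rouge">Worker</code> and <code class="language-plaintext highlighter-rouge">WaitingList</code> are hypothetical stand-ins, <code class="language-plaintext highlighter-rouge">std::deque</code> replaces the Qt container, and locking is left out because in Qt both sides run under the pool mutex.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <deque>

struct Worker { int id; };  // stand-in for QThreadPoolThread

struct WaitingList {
    std::deque<Worker*> waiting;

    // Worker side: called just before entering the timed wait.
    void enqueue(Worker* w) { waiting.push_back(w); }

    // tryStart() side: claim the first idle worker, or nullptr if none.
    Worker* take_first() {
        if (waiting.empty()) {
            return nullptr;
        }
        Worker* w = waiting.front();
        waiting.pop_front();
        return w;
    }

    // Worker side, after the timed wait returns for whatever reason:
    // true means nobody claimed this worker, so it has really expired.
    bool remove_one(Worker* w) {
        auto it = std::find(waiting.begin(), waiting.end(), w);
        if (it == waiting.end()) {
            return false;  // already claimed by take_first()
        }
        waiting.erase(it);
        return true;
    }
};
```

<p>Whether the wait itself reported a timeout no longer matters: membership in the list is the single source of truth, and it changes only under the lock.</p>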
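<p>Changing QWaitCondition is another option, but it requires that wake-ups from the condition variable really be queued, which depends on the implementation and costs performance; see references [2][3] for a detailed analysis.</p>
<p>In production we could not upgrade to Qt4.8.7 right away, so one more <code class="language-plaintext highlighter-rouge">releaseThread()</code> call had to be added before <code class="language-plaintext highlighter-rouge">waitForFinished()</code> to let the pool add another thread. That avoids the hang, because at least the newly added thread is doing work.</p>
<h2 id="后日谈">Epilogue</h2>
<p>You thought that was the end? That after upgrading to 4.8.6 the code at the top would run forever? No, the truth is still shrouded in fog: even with Qt4.8.7 this code still hangs, so there is yet another bug (because Qt5.12 no longer hangs)! But that is another story…</p>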
<p><strong>Reference:</strong></p>
<ul>
<li class="ref">[1] tunglt, <a href="https://stackoverflow.com/a/53760809/5570232">QtConcurrent: why releaseThread and reserveThread cause deadlock?
</a>, 2018</li>
<li class="ref">[2] Olivier Goffart, <a href="https://woboq.com/blog/qwaitcondition-solving-unavoidable-race.html">QWaitCondition: Solving the Unavoidable Race</a>, 2014</li>
<li class="ref">[3] Cort Ammon, Nemo, <a href="https://stackoverflow.com/questions/18642385/why-does-pthread-cond-timedwait-doc-talk-about-an-unavoidable-race">Why does pthread_cond_timedwait doc talk about an “unavoidable race”?</a>, 2013</li>
</ul>
C++ Concurrency Patterns #6: Monitor
2018-12-31T00:00:00+00:00
http://dengzuoheng.github.io/cpp-concurency-pattern-6-monitor
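<h2 id="管程是什么">What is a monitor</h2>
<p>Two things have puzzled me since university: why "monitor" is translated into Chinese as 管程, and why the thing is called a monitor in the first place! Perhaps every article about monitors has to start by explaining what a monitor is, which goes to show that naming is the hardest part of programming, possibly bar none.</p>
<p>In the distant past (the 1970s) there were few synchronization tools, so people made do with semaphores. As discussed earlier, a semaphore carries both mutual-exclusion and signaling semantics, so it has to be used with great care. To make correct code easier to write, Brinch Hansen (1973) and Hoare (1974) proposed a higher-level synchronization primitive called the monitor [1].</p>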
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- An example from the Mesa language
StorageAllocator.MONITOR =
BEGIN
StorageAvailable: CONDITION;
FreeList: POINTER;
Allocate: ENTRY PROCEDURE RETURNS [p: POINTER] =
BEGIN
WHILE FreeList = NIL DO
WAIT StorageAvailable
ENDLOOP;
p <- FreeList; FreeList <- p.next;
END;
Free: ENTRY PROCEDURE [p: POINTER] =
BEGIN
p.next <- FreeList; FreeList <-p;
NOTIFY StorageAvailable
END;
END.
</code></pre></div></div>
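<p>Why "higher-level"? Because a monitor is a collection of procedures, variables and data structures grouped into a special kind of module or package [1]. Just as <code class="language-plaintext highlighter-rouge">MONITOR</code> qualifies the whole module in the example above, we speak of a piece of code, or a user-defined class, being a monitor. A semaphore is low-level by comparison.</p>
<p>A monitor guarantees that at most one thread is inside it at any time, which means it provides mutually exclusive access. This is usually enforced by the compiler, so monitors are part of the programming language. <del>C++ obviously has none.</del></p>
<p>Because of this mutual exclusion, and because a monitor contains both data and procedures, languages without syntax-level support also call it a thread-safe object [4]. A thread-safe queue, for example, can be called a monitor. But not every thread-safe object is a monitor: under the classic definition all method bodies of a monitor are mutually exclusive, whereas a thread-safe object has no such requirement.</p>
<p>So how do programming languages support monitors? Usually by having the object contain a semaphore, or a mutex and a condition_variable. Hence mutex + condition_variable is one way to implement a monitor, while the monitor itself is the higher-level concept and does not care how mutual exclusion and signaling are implemented.</p>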
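<p>As an illustration, the Mesa StorageAllocator above can be hand-translated into C++ with a mutex and a condition variable. This is only a sketch: <code class="language-plaintext highlighter-rouge">Node</code>, <code class="language-plaintext highlighter-rouge">allocate</code> and <code class="language-plaintext highlighter-rouge">release</code> are names chosen here, and nothing but convention makes every public method take the lock, which is the part a language with monitors would enforce for us.</p>

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

struct Node { Node* next = nullptr; };

class StorageAllocator {
public:
    // ENTRY PROCEDURE Allocate: blocks until a node is available.
    Node* allocate() {
        std::unique_lock<std::mutex> lk(mutex_);
        while (free_list_ == nullptr) {    // WHILE FreeList = NIL
            storage_available_.wait(lk);   // WAIT StorageAvailable
        }
        Node* p = free_list_;
        free_list_ = p->next;
        p->next = nullptr;
        return p;
    }

    // ENTRY PROCEDURE Free: puts a node back at the head of the free list.
    void release(Node* p) {
        std::lock_guard<std::mutex> lk(mutex_);
        p->next = free_list_;
        free_list_ = p;
        storage_available_.notify_one();   // NOTIFY StorageAvailable
    }

private:
    std::mutex mutex_;                           // the monitor lock
    std::condition_variable storage_available_;  // StorageAvailable: CONDITION
    Node* free_list_ = nullptr;                  // FreeList: POINTER
};
```

<p>The <code class="language-plaintext highlighter-rouge">while</code> loop around the wait mirrors the Mesa original: a woken thread rechecks the condition because another thread may have taken the node first.</p>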
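<h2 id="管程的语义">Monitor semantics</h2>
<p>Suppose there are two threads: thread B is inside the monitor and thread A is waiting, say for a resource. Thread B then notifies that the resource is available. What happens next? Which thread should be inside the monitor?</p>
<p>The answer gives rise to three different semantics [2]: Mesa semantics, Hoare semantics and Brinch Hansen semantics (yes, the two inventors' monitors have different semantics!).</p>
<p>Mesa was one of the early programming languages to support monitors. A Mesa monitor has a wait queue and an entry queue, so a thread is in the wait queue, in the entry queue, or inside the monitor. When the thread inside the monitor leaves, the thread at the head of the entry queue enters.</p>
<p>Under Mesa semantics, after thread A is signaled, thread B stays in the monitor and thread A moves to the entry queue; once thread B leaves the monitor, thread A enters.</p>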
<p><img src="/images/monitor_mesa.jpg" alt="Mesa" /></p>
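<p>Brinch Hansen semantics are very similar, again with a wait queue and an entry queue, but they require the signal to happen as thread B leaves the monitor. That is, once the signal has been sent thread B has left the monitor, and thread A naturally enters.</p>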
<p><img src="/images/monitor_bh.jpg" alt="Brinch Hansen" /></p>
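<p>Hoare semantics are the most complex, because there is also a signal queue. The waiting thread A sits in the wait queue; when the signal occurs, thread B is moved out of the monitor into the signal queue and thread A moves from the wait queue into the monitor. After thread A leaves the monitor, thread B comes back in.</p>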
<p><img src="/images/monitor_hoare.jpg" alt="Hoare" /></p>
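<p>Reference [2] explains the semantics very clearly and is worth reading. Many languages also implement Mesa semantics, Java for example [3]. For C++ users who notify through a condition variable, however, whether you get Mesa or Brinch Hansen semantics depends on when you release the lock.</p>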
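<p>A sketch of the two choices with <code class="language-plaintext highlighter-rouge">std::condition_variable</code>; the class <code class="language-plaintext highlighter-rouge">IntChannel</code> and its method names are hypothetical. Notifying while still holding the lock is roughly the Mesa case, since the notifier stays inside and the woken thread has to queue for the mutex. Notifying only after the lock is released is closer to the Brinch Hansen case, where the signal is the last thing done on the way out.</p>

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>

class IntChannel {
public:
    // Mesa-like: notify inside the critical section. The notifier keeps the
    // lock, so the woken thread must wait for the mutex before it can run.
    void push_notify_inside(int v) {
        std::lock_guard<std::mutex> lk(mutex_);
        data_.push(v);
        not_empty_.notify_one();
    }

    // Brinch-Hansen-like: leave the critical section first and notify as the
    // very last step.
    void push_notify_on_exit(int v) {
        {
            std::lock_guard<std::mutex> lk(mutex_);
            data_.push(v);
        }
        not_empty_.notify_one();
    }

    int pop() {
        std::unique_lock<std::mutex> lk(mutex_);
        while (data_.empty()) {  // recheck: another consumer may have won
            not_empty_.wait(lk);
        }
        int v = data_.front();
        data_.pop();
        return v;
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<int> data_;
};
```

<p>Either way the woken thread still has to reacquire the mutex and may lose the race to a newcomer, so the <code class="language-plaintext highlighter-rouge">while</code> loop in <code class="language-plaintext highlighter-rouge">pop</code> stays necessary in both variants.</p>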
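<h2 id="c中的管程">Monitors in C++</h2>
<h3 id="基于派生的管程">Inheritance-based monitors</h3>
<p>Can the monitor model be written as a template class or something similar? Such implementations are rare, but they exist. The <a href="https://github.com/zeroc-ice/ice">ice library</a> lets a class become a monitor by inheriting from a monitor base class, and it implements mesa semantics [3].</p>
<p>Copying it over to boost (based on the ice3.7 <a href="https://github.com/zeroc-ice/ice/blob/3.7/cpp/include/IceUtil/Monitor.h">source</a>) gives the following:</p>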
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include <boost/noncopyable.hpp>
#include <boost/thread.hpp>
class mesa_monitor : boost::noncopyable {
public:
    typedef boost::unique_lock<mesa_monitor> lock_type;
    mesa_monitor() : m_notify(0) {}
public:
    void lock() const {
        m_mutex.lock();
        m_notify = 0; // reset m_notify when entering the monitor
    }
    void unlock() const {
        notify_impl(m_notify); // leaving the monitor: deliver pending notifications
        m_mutex.unlock();
    }
    bool try_lock() const {
        bool ret = m_mutex.try_lock();
        if (ret) {
            m_notify = 0;
        }
        return ret;
    }
    void wait() const {
        notify_impl(m_notify); // waiting also leaves the monitor
        m_cond.wait(m_mutex);
        m_notify = 0;
    }
    void notify_one() {
        if (m_notify != -1) { // -1 means a notify_all is already pending
            ++m_notify;
        }
    }
    void notify_all() {
        m_notify = -1;
    }
private:
    void notify_impl(int nnotify) const {
        if (nnotify != 0) {
            if (nnotify == -1) {
                m_cond.notify_all();
                return;
            } else {
                while (nnotify > 0) {
                    m_cond.notify_one();
                    --nnotify;
                }
            }
        }
    }
private:
    mutable boost::condition_variable_any m_cond;
    mutable boost::mutex m_mutex;
    mutable int m_notify;
};
</code></pre></div></div>
<p>This looks a little odd: notify only records how many notifications are pending, and the actual calls to <code class="language-plaintext highlighter-rouge">condition_variable_any::notify_one</code> are made by <code class="language-plaintext highlighter-rouge">wait</code> and <code class="language-plaintext highlighter-rouge">unlock</code>. Here wait and unlock are defined as the operations that leave the monitor, so that is when waiting threads are woken. As a result <code class="language-plaintext highlighter-rouge">notify_one</code> does not wake other threads immediately.</p>
<p>The pile of <code class="language-plaintext highlighter-rouge">const</code> and <code class="language-plaintext highlighter-rouge">mutable</code> is there so that classes using mesa_monitor can call it from const methods. A threadsafe_queue built on mesa_monitor looks like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template <typename T>
class threadsafe_queue : mesa_monitor {
std::queue<T> m_data;
public:
threadsafe_queue() {}
void pop(T& val) {
mesa_monitor::lock_type lk(*this);
while (m_data.empty()) {
wait();
}
val = m_data.front();
m_data.pop();
}
bool try_pop(T& val) {
mesa_monitor::lock_type lk(*this);
if (m_data.empty()) {
return false;
}
val = m_data.front();
m_data.pop();
return true;
}
void push(const T& val) {
mesa_monitor::lock_type lk(*this);
m_data.push(val);
notify_one();
}
};
</code></pre></div></div>
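<h3 id="管程包装器">Monitor wrapper</h3>
<p>An inheritance-based monitor is intrusive, after all. If all we want is mutually exclusive access, we can use some darker magic, such as overloading <code class="language-plaintext highlighter-rouge">operator-></code> (std::forward requires C++11) [5]:</p>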
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// tested with Visual C++ 2017
template<class T>
class monitorized
{
public:
template<typename ...Args>
monitorized(Args&&... args) : m_obj(std::forward<Args>(args)...) {}
struct monitorized_helper
{
monitorized_helper(monitorized* mon) : m_mon(mon), m_lk(mon->m_lock) {}
T* operator->() { return &m_mon->m_obj; }
monitorized* m_mon;
std::unique_lock<std::mutex> m_lk;
};
monitorized_helper operator->() { return monitorized_helper(this); }
monitorized_helper lock() { return monitorized_helper(this); }
T& unsafe_ref() { return m_obj; }
private:
T m_obj;
std::mutex m_lock;
};
</code></pre></div></div>
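<p>The idea is that calling <code class="language-plaintext highlighter-rouge">operator->()</code> on <code class="language-plaintext highlighter-rouge">monitorized</code> returns a <code class="language-plaintext highlighter-rouge">monitorized_helper</code> instance, whose constructor takes the lock, while the member function is actually reached through <code class="language-plaintext highlighter-rouge">monitorized_helper</code>'s <code class="language-plaintext highlighter-rouge">operator->()</code>. The lock is released when the temporary helper is destroyed at the end of the full expression. This relies on an odd <a href="https://stackoverflow.com/a/12365484">feature</a>: when an overloaded <code class="language-plaintext highlighter-rouge">operator-></code> returns a class type, <code class="language-plaintext highlighter-rouge">operator-></code> is applied again to the result until a raw pointer is reached [6]. So in the example below any number of wrapping layers works:</p>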
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>struct example {
void foo() {}
};
struct first_wapper {
explicit first_wapper(example* _e) : e(_e) {}
first_wapper(const first_wapper& rhs) : e(rhs.e) {}
example* operator->() { return e; }
example* e;
};
struct second_wapper {
explicit second_wapper(example* _e) : e(_e) {}
second_wapper(const second_wapper& rhs) : e(rhs.e) {}
first_wapper operator->() { return first_wapper(e); }
example* e;
};
struct third_wapper {
second_wapper operator->() { return second_wapper(&e);}
example e;
};
int main() {
third_wapper w;
w->foo();
return 0;
}
</code></pre></div></div>
<p>So monitorized is used like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>monitorized<std::queue<int> > q;
boost::thread tr1([&]() {
for (int i = 0; i < 100; ++i) {
q->push(i);
}
});
</code></pre></div></div>
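<p>Of course this does not really give us a thread-safe queue: every member function call is locked, but a sequence of calls is not atomic as a whole.</p>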
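<p>The gap shows up in check-then-act sequences. The sketch below repeats a trimmed copy of the wrapper (the nested helper is shortened to <code class="language-plaintext highlighter-rouge">helper</code>) so that it is self-contained; <code class="language-plaintext highlighter-rouge">racy_try_pop</code> and <code class="language-plaintext highlighter-rouge">safe_try_pop</code> are hypothetical helpers written for this comparison. The first takes and releases the lock once per statement, the second keeps one guard from <code class="language-plaintext highlighter-rouge">lock()</code> across the whole sequence.</p>

```cpp
#include <cassert>
#include <mutex>
#include <queue>
#include <utility>

template <class T>
class monitorized {
public:
    template <typename... Args>
    monitorized(Args&&... args) : m_obj(std::forward<Args>(args)...) {}

    struct helper {
        explicit helper(monitorized* mon) : m_mon(mon), m_lk(mon->m_lock) {}
        T* operator->() { return &m_mon->m_obj; }
        monitorized* m_mon;
        std::unique_lock<std::mutex> m_lk;  // held for the helper's lifetime
    };

    helper operator->() { return helper(this); }
    helper lock() { return helper(this); }

private:
    T m_obj;
    std::mutex m_lock;
};

// Racy under concurrency: each statement locks and unlocks on its own, so
// another thread may empty the queue between empty() and front().
inline bool racy_try_pop(monitorized<std::queue<int> >& q, int& out) {
    if (q->empty()) {
        return false;
    }
    out = q->front();
    q->pop();
    return true;
}

// One guard, one lock, held across the whole check-then-act sequence.
inline bool safe_try_pop(monitorized<std::queue<int> >& q, int& out) {
    auto guard = q.lock();
    if (guard->empty()) {
        return false;
    }
    out = guard->front();
    guard->pop();
    return true;
}
```

<p>While a guard is alive, the same thread must not go through <code class="language-plaintext highlighter-rouge">q-></code> again, because the underlying <code class="language-plaintext highlighter-rouge">std::mutex</code> is not recursive.</p>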
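<h2 id="总结">Summary</h2>
<p>Monitors ought to be supported by the programming language, and C++ does not support them. We can write code that looks like a monitor, but it is no more reliable than using mutex and condition_variable directly. As for other properties, reference [7] sums them up well, so I will not repeat them here.</p>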
<p><strong>Reference:</strong></p>
<ul>
<li class="ref">[1] Andrew S. Tanenbaum. 陈向群, 马洪兵等译. 现代操作系统(第三版). 机械工业出版社. 2012.</li>
<li class="ref">[2] Gregory Kesden, <a href="https://cseweb.ucsd.edu/classes/sp16/cse120-a/applications/ln/lecture9.html">Monitors and Condition Variables</a></li>
<li class="ref">[3] Mark Spruiell, <a href="https://doc.zeroc.com/pages/viewpage.action?pageId=5048235">The C++ Monitor Class</a>. Apr.2011</li>
<li class="ref">[4] wikipedia, <a href="https://en.wikipedia.org/wiki/Monitor_(synchronization)">Monitor (synchronization)</a></li>
<li class="ref">[5] Mike Vine, <a href="https://stackoverflow.com/a/48408987">Making a C++ class a Monitor (in the concurrent sense)</a></li>
<li class="ref">[6] David Rodríguez - dribeas, <a href="https://stackoverflow.com/a/10678920">How arrow-> operator overloading works internally in c++?</a></li>
<li class="ref">[7] Fruit_初, <a href="https://www.jianshu.com/p/8b3ed769bc9f">Monitors</a>, March, 2017.</li>
</ul>