Use Intel TBB's parallel_sort even for nested parallelism.

TBB has a global task scheduler (that's one of the reasons TBB is linked dynamically rather than statically). The scheduler controls all worker threads, so nested parallelism works and the scheduler does all the task allocation itself. That is, nested parallel execution such as in

    parallel_for(seq, [](const auto& rng){ parallel_sort(rng); });

is no problem at all, as the scheduler retains control over the global environment.

Therefore, use the `parallel_sort` Range overload where possible.

References:
- https://www.threadingbuildingblocks.org/docs/help/hh_goto.htm#reference/algorithms.htm
- https://www.threadingbuildingblocks.org/docs/help/hh_goto.htm#reference/algorithms/parallel_sort_func.htm
- https://www.threadingbuildingblocks.org/docs/help/hh_goto.htm#reference/task_scheduler.htm
- https://www.threadingbuildingblocks.org/docs/help/hh_goto.htm#reference/task_scheduler/task_scheduler_init_cls.htm
- https://www.threadingbuildingblocks.org/docs/help/hh_goto.htm#tbb_userguide/Initializing_and_Terminating_the_Library.htm
2015-09-09 17:22:51 +02:00
parent dfac34beac
commit 9231335eef
11 changed files with 93 additions and 71 deletions
@@ -143,7 +143,7 @@ int main(int argc, char *argv[])
        std::vector<TarjanEdge> graph_edge_list;
        auto number_of_nodes = LoadGraph(argv[1], coordinate_list, graph_edge_list);

-        tbb::parallel_sort(graph_edge_list.begin(), graph_edge_list.end());
+        tbb::parallel_sort(graph_edge_list);
        const auto graph = std::make_shared<TarjanGraph>(number_of_nodes, graph_edge_list);
        graph_edge_list.clear();
        graph_edge_list.shrink_to_fit();