Commit Graph

4772 Commits

Author SHA1 Message Date
Daniel J. Hofmann
9b952ff48c Improve debug build performance while keeping symbols.
- remove profiling/coverage mix from debug build, as it is useless as of
  now, re-enable this for a separate coverage build in the future

- use gcc's `-ggdb` and `-Og` flag (requires recent gcc) to provide
  better debug information targeted for gdb and optimize what we can

- use `-fno-inline` and `-fno-omit-stack-pointer`, in order to be able
  to jump around in gdb without functions being gone and keeping the
  stack reference
2015-09-30 18:20:00 +02:00
Daniel J. Hofmann
c5064710a8 Re-enable position independent code, but in a portable way.
CMake 2.8.9 introduce a `POSITION_INDEPENDENT_CODE` property.

This sets `-fPIE` on executables, giving us back optimizations such as
inlining of global variables and functions, while setting `-fPIC` on
libraries.

Although we do not need position independent code on executables, it
seems like some gcc versions (like 4.9.2) have issues in combinations
with `_FORTIFY_SOURCE`.

On shared libraries, CMake should per documentation even use position
independent code by default.

References:

- http://www.cmake.org/cmake/help/v3.0/prop_tgt/POSITION_INDEPENDENT_CODE.html#prop_tgt:POSITION_INDEPENDENT_CODE
- http://public.kitware.com/pipermail/cmake-developers/2012-May/015839.html
- https://github.com/Project-OSRM/osrm-backend/pull/1647
- cae59c7395
2015-09-30 18:20:00 +02:00
Daniel J. Hofmann
57e522065a Add linker optimizations and dead code and data elimination.
Linkers also have options we can configure! The most usefull feature is
to give every function its own section. This results in some bloat at
compile time, but at link time now the linker can do dead code and data
elimination by simply discarding appropriate sections.

This works by splitting the `.text` section in a way that makes it
possible to later only pull in sections that are actually referenced.

That is, the basic idea is to keep the matching between sections and
functions intact, so we can optimize based on it in the linking stage.

Note: there's still an issue with how `libOSRM.a` gets build. CMake
currently passes the linker flags on to ar, in order to create a static
library. But ar does not understand the linker's flags.

Referenes:

- https://sourceware.org/binutils/docs/ld/Options.html#Options
- http://elinux.org/images/2/2d/ELC2010-gc-sections_Denys_Vlasenko.pdfMCþ"
- http://www.cmake.org/cmake/help/v3.0/variable/CMAKE_EXE_LINKER_FLAGS.html
- http://www.cmake.org/cmake/help/v3.0/variable/CMAKE_MODULE_LINKER_FLAGS.html
- http://www.cmake.org/cmake/help/v3.0/variable/CMAKE_SHARED_LINKER_FLAGS.html
- http://www.cmake.org/cmake/help/v3.0/variable/CMAKE_STATIC_LINKER_FLAGS.html
2015-09-30 18:20:00 +02:00
Daniel J. Hofmann
7143daf500 There is no CMAKE_LINKER_FLAGS variable.
There really isn't; deal with it.

Also, those are not linker flags but instead meant for the compiler.

References:

- http://www.cmake.org/cmake/help/v3.0/manual/cmake-variables.7.html
2015-09-30 18:20:00 +02:00
Daniel J. Hofmann
71a00fc01b Make lto detection more robust and not resetting cxx flags when lto fails.
This refines the last commit of parallelizing lto.

Discussion: this is ugly as hell, dispatching 1/ on the availability of
the `-flto` flag, then 2/ on the compiler since GCC allows `-flto=n`
whereas Clang for example does not.

I tried setting the CMake property `INTERPROCEDURAL_OPTIMIZATION`,
without any effect. All I could see was some lto related utilities in
the cmake debug output, but not in the actual compiler or linker
invocation.

This would eliminate the need for our hacks, with 1/ using an option
`WITH_LTO` setting `ON` by default, and based on this value setting the
`INTERPROCEDURAL_OPTIMIZATION` flag with CMake doing the actual work of
selecting the best LTO method on the target platform.

By the way, this also fixes a bug where we reset the `CMAKE_CXX_FLAGS`
to a variable that was never defined, resulting in setting the flags to
an empty string. Yay CMake, as usual.

References:

- http://www.cmake.org/cmake/help/v3.0/prop_tgt/INTERPROCEDURAL_OPTIMIZATION.html
2015-09-30 18:20:00 +02:00
Daniel J. Hofmann
941483c14d Parallelize optimization and code generation for link time optimization.
This parallelizes the `-flto` feature resulting in parallel optimization
and code generation for link time optimization based on the number of
logical processors available.

Note: this has the side-effect of using more memory during linking.

References:

- https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html (see: -flto)
- http://www.cmake.org/cmake/help/v3.0/module/ProcessorCount.htmMC
2015-09-30 18:20:00 +02:00
Daniel J. Hofmann
17d8e65c64 Silence unused variable warnings 2015-09-30 18:20:00 +02:00
Daniel J. Hofmann
72c0feb048 Silence warnings for system headers that we or third_party transitively includes.
GCC with link time optimizations does not to respect this mode
unfortunately, reuslting in warnings in release (default) build
mode from system includes such as boost, luabind and so on.
2015-09-30 18:18:36 +02:00
Daniel J. Hofmann
9e20dbe226 Remove -fPIC flag from build system.
This remove the `-fPIC` flag, indicating position independant code
generation, from the build system.

Citing GCC's official code generation docs:

> This option makes a difference on the m68k, PowerPC and SPARC.

We do not support any of these architectures, so remove the flag!

References:

- https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html#Code-Gen-Options
2015-09-30 18:18:36 +02:00
Daniel J. Hofmann
06f2738c03 Add stricter compiler warnings to build system.
These are for standard compliance and should on by default:

    -Wall -Wextra -pedantic

The problem is that even `-Wall` and `-Wextra` does not cover all
warnings, as to not break backward compatibility. Clang therefore
has the `-Weverything` flag, that really includes everything but is
overkill for the day to day development.

Thus, we in addition add:

    -Wuninitialized -Wunreachable-code

to guard against undefined behavior from reading uninitialized variables
and warn for unreachable code.

With:

    -Wstrict-overflow=1

the compiler warns us when it's doing optimizations based on the fact
that signed integer overflows are undefined behavior.

With:

    -D_FORTIFY_SOURCE=2

we tell the compiler to replace functions like strcpy with strncpy where
it can do so, resulting in cheap and useful buffer overflow protection.

References:

- https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
- https://securityblog.redhat.com/2014/03/26/fortify-and-you/
- https://wiki.debian.org/Hardening
2015-09-30 18:18:36 +02:00
Daniel J. Hofmann
5a257416ca Completely rip out Boost's Spirit / Karma for casting.
This rips out the Bost Spirit / Karma conversion code, using the stdlib
and lightweight alternatives instead.

The main benefit is an immense decrease in compilation times, for every
translation unit that requires the `util/cast.hpp` header.

Note: compared to the version before, there is a minor change in
behavior: the double `-0` was printed as `0` before and is now printed
as `-0`. This comes from the IEE754 standard, specifying signed zeros,
that is `+0` and `-0`. Interesting for us: JavaScript uses IEE754,
resulting in no breakage if used in arithmetic.

Small test case, left hand side was before, right hand side is now:

    $ ./a.out
    -1.123457 vs -1.123457
    -1 vs -1
    -1.3 vs -1.3
    0 vs -0
    0 vs 0
    0 vs 0
    1.3 vs 1.3
    1.123457 vs 1.123457

References:

- https://en.wikipedia.org/wiki/Signed_zero
- http://www.boost.org/doc/libs/1_59_0/doc/html/boost/algorithm/trim_right_if.html
- http://www.boost.org/doc/libs/1_59_0/doc/html/boost/algorithm/is_any_of.html
2015-09-29 16:15:54 +02:00
Daniel J. Hofmann
f9f0ffb64d Remove hand written conversion code and replace with stdlib features.
With C++11 the stdlib gains:

- `std::stoi` function family to convert from `std::string` to integral type

- `std::to_string` to convert from number types to `std::string`

The only reason for hand-writing the conversion code therefore is
performance. I benchmarked an `osrm-extract` with the hand-written code
against one with the stdlib conversion features and could not find any
significant difference (we switch back and forth between C++ and Lua,
shaving off a few us in conversion doesn't gain us much).

Formatting arithmetic types in the default format with given precision
requires streams, but is doable in a few lines of idiomatic stdlib code.

For this, there is now the following function template available:

    template <Arithmetic T, int Precision = 6>
    inline std::string to_string_with_precision(const T);

that requires integral or floating point types and returns a formatted
string in the defaukt format with the given precision applied.

In addition this completely rips out Boost.Spirit from the `casts.hpp`
header, resulting in faster compile times.

Boom!

References:

- http://en.cppreference.com/w/cpp/string/basic_string/stol
- http://en.cppreference.com/w/cpp/string/basic_string/to_string
- http://www.kumobius.com/2013/08/c-string-to-int/
2015-09-29 16:15:54 +02:00
Daniel J. Hofmann
31cf8a8813 Remove Boost.Filesystem v3 fix for Boost < 1.48, refactor call sites.
We needed this for Boost < 1.48, but per our Wiki on building OSRM:

> On Ubuntu 12.04 you will be limited to OSRM tag v0.3.10 because
> later versions **require Boost v1.49+** and installing this
> causes problems with libluabind-dev package.

Thus, rip it out!

To keep the commits atomic and isolated, I also refactored all call
sites that used the functionality from the portability fix.

While doing this, I also simplified the monster of around ~100 lines of
file path checking --- lambda's are awesome' use them!

References:

- http://stackoverflow.com/a/1750710
- https://github.com/Project-OSRM/osrm-backend/wiki/Building-on-Ubuntu
2015-09-29 16:15:54 +02:00
Daniel J. Hofmann
98b7e0a407 Refactor bearing implementation.
- removes `noexcept` specifier as we can not guarantee for not throwing

- uses a namespace instead of a struct + static function combination

- asserts for heading degree in [0, 360] range (both sides inclusive!)

- header only since implementation does not hide anything

- adds `inline` specifier as compiler hint
2015-09-29 16:15:54 +02:00
Daniel J. Hofmann
7ed63d2ab5 Remove TBB usage from hot code paths 2015-09-28 20:37:09 +02:00
Daniel J. Hofmann
6e6b38e8e9 Revert the usage of TBB's iterator pair taking overloads.
This reverts the range based overload usage introduced in @6b2bf495.

Old TBB versions do not provide the range overloads.
2015-09-28 20:37:09 +02:00
Daniel J. Hofmann
829b9d96e4 Revert parallelization on algorithms that are used in the server. Let node do this.
This reverts @6b2bf49 on the server algorithms.
2015-09-28 20:26:29 +02:00
Daniel J. Hofmann
85cef7e37c Revert parallelization on util that is used in the server. Let node do this.
This reverts @6b2bf49 on the server component utils.
2015-09-28 20:26:29 +02:00
Daniel J. Hofmann
c526bec798 Revert parallelization on server part. Let node do this.
This reverts @6b2bf49 on the server components.

We do not want to parallelize there, as node should be used for
parallelizing the user requests onto multiple processes.
2015-09-28 20:26:03 +02:00
Daniel J. Hofmann
9231335eef Use Intel TBB's parallel_sort even for nested parallelism.
TBB has a global task scheduler (that's one of the reason TBB is not
linked statically but dyanmically instead). This allows control over all
running threads, enabling us to use nested parallelism and the scheduler
doing all the task allocation itself.

That is, nested parallel execution such as in

    parallel_for(seq, [](const auto& rng){
      parallel_sort(rng);
    });

is no problem at all, as the scheduler still claims control over the
global environment.

Therefore, use `parallel_sort` Range overload where possible.

References:

- https://www.threadingbuildingblocks.org/docs/help/hh_goto.htm#reference/algorithms.htm
- https://www.threadingbuildingblocks.org/docs/help/hh_goto.htm#reference/algorithms/parallel_sort_func.htm
- https://www.threadingbuildingblocks.org/docs/help/hh_goto.htm#reference/task_scheduler.htm
- https://www.threadingbuildingblocks.org/docs/help/hh_goto.htm#reference/task_scheduler/task_scheduler_init_cls.htm
- https://www.threadingbuildingblocks.org/docs/help/hh_goto.htm#tbb_userguide/Initializing_and_Terminating_the_Library.htm
2015-09-28 20:26:03 +02:00
Daniel J. Hofmann
dfac34beac Do not use an incomplete type with value semantics 2015-09-28 16:50:36 +02:00
Daniel J. Hofmann
82dd5d8ccf Use Boost.Optional instead of custom optional monad implementation.
This switches out the `<variant/optional.hpp>` implementation of the
optional monad to the one from Boost.

The following trick makes sure we keep compile times down:

- use `<boost/optional/optional_fwd.hpp>` to forward declare the
  optional type in header, then include the full blown optional header
  only in the implementation file.

- do the same for the files we touch, e.g. forward declare osmium types,
  allowing us to remove the osmium header dependency from our headers:

      `namespace osmium { class Relation; }

  and then include the appropriate osmium headers in the implementation
  file only. We should do this globally...

References:

- http://www.boost.org/doc/libs/1_59_0/libs/optional/doc/html/index.html
- https://github.com/osmcode/libosmium/issues/123
2015-09-28 15:00:21 +02:00
Daniel J. Hofmann
be506f7121 Change integer_range's .size() member function return type to size_t.
Instead of the return type being the templated `Integer` parameter.

The integer type and the size of the range are not connected.
2015-09-28 15:00:21 +02:00
Daniel J. Hofmann
2470494009 Implement saity checks for irange and its returned type iterator_range.
The implementation does not support backwards counting ranges, but fails
to assert on this condition. Fix this once and for all.
2015-09-28 15:00:21 +02:00
Daniel J. Hofmann
f95a4b9b46 Remove iterator_range dead code 2015-09-28 15:00:21 +02:00
Daniel J. Hofmann
6b444a0877 Do not include Boost.Thread is a sub-header is good enough.
`boost::thread_specific_ptr` lives in `<boost/thread/tss.hpp>`.

In addition, fix the includes in the touched header.

Reference:

- http://www.boost.org/doc/libs/1_59_0/doc/html/thread/thread_local_storage.html
2015-09-28 15:00:20 +02:00
Daniel J. Hofmann
5c4a845b55 Remove template-heavy Boost.MPL headers where not needed.
This removed mpl headers from the code base, where not needed.

This mostly affects unit tests, where mpl's type list is actually only
used once to automatically generate tests for multiple types (see ref).

In addition, this commit also fixes the includes in the touched headers.

Resulting in 1/ reduces build times and 2/ proper includes.

Reference:

- http://www.boost.org/doc/libs/1_59_0/libs/test/doc/html/boost_test/tests_organization/test_cases/test_organization_templates.html#ref_BOOST_AUTO_TEST_CASE_TEMPLATE
2015-09-28 15:00:20 +02:00
Daniel J. Hofmann
468c01056f Replace custom replace utility with the stdlib's replace algorithm.
This removes the custom `replaceAll` function, replacing it with
`std::replace` from the stdlib's `<algorithm>` header.

This also removes the respective unit test.

More importantly, this removes the dependency on the
`<boost/algorithm/string.hpp>` header in the `string_util.hpp` header.
2015-09-28 15:00:20 +02:00
Daniel J. Hofmann
397078758e Remove boost/thread from rtree, include header for hash_combine in unit test.
The `static_rtree.hpp` header included `<booost/thread.hpp>` without using
anything from this header.

Removing it showed why:

the unit test for the rtree no longer built, since it was missing symbols
for Boost's `hash_combine`, used in the unit test.

Instead of relying on `<boost/thread.hpp>` including the proper header
for `hash_combine` by chance that we only use in the unit test, do the
following:

- remove `<boost/thread.hpp>` from the rtree implementation
- add `<boost/functional/hash.hpp>` to the rtree unit test

As always, include what you use.
2015-09-28 15:00:20 +02:00
Daniel J. Hofmann
c9af06c9e0 Remove hand-written ConcurrentQueue class template.
We already rely on Intel TBB, which provides battle-tested
concurrency containers, such as:

- `concurrent_queue`,
- `concurrent_bounded_queue`,
- `concurrent_priority_queue`.

The `ConcurrentQueue` class template was never used. If the need
comes up again, we should strongly prefer those instead of writing
one ourselves.

References:

- https://www.threadingbuildingblocks.org/docs/help/reference/containers_overview/concurrent_queue_cls.htm
- https://www.threadingbuildingblocks.org/docs/help/reference/containers_overview/concurrent_bounded_queue_cls.htm
- https://www.threadingbuildingblocks.org/docs/help/reference/containers_overview/concurrent_priority_queue_cls.htm
2015-09-28 15:00:20 +02:00
Patrick Niklaus
5a7e663b1d Merge pull request #1707 from arnekaiser/develop
Bugfix: allow POST request without POST data
2015-09-27 17:57:31 +02:00
akaiser
e0550cd20b Bugfix: allow POST request without POST data 2015-09-24 14:40:35 +02:00
Daniel Patterson
5844231a37 Include (road) name of matched nodes in addition to coordinate. 2015-09-23 17:53:34 +02:00
Lauren Budorick
8d435638e1 Delete accidental/extraneous files 2015-09-23 10:33:27 -04:00
Freenerd
55cad1b3ac Refactor alternative route test 2015-09-23 15:54:23 +02:00
Daniel J. Hofmann
9deadc1371 Static analysis: integration with the Static Analyzer.
This provides a wrapper script to invoke the Static Analyzer on the code
base. The script simply wraps your commands, that is you have to do the
following:

    ..scripts/analyze cmake ..
    ..scripts/analyze cmake --build .

Note: the Static Analyzer is integrated in Xcode, so if you are on a
Mac, consider using Xcode natively instead of this wrapper script that
will only give you HTML output.

Reference:

- http://clang-analyzer.llvm.org/
2015-09-22 17:32:32 +02:00
Daniel J. Hofmann
998abf05ba Integration scripts for Clang's Modernize and Tidy tool.
New directory: `scripts/`, in which small scripts for developers reside.

- `modernize`: runs all cpp files through `clang-modernize`, respecting
  out targeted compiler versions, applying C++11 transformations, doing
  syntax checks and formatting --- in parallel.

- `tidy`: runs all cpp files through `clang-tidy`, with selected
  warnings only, since we do not want to warn on every small detail.

Please check the talk slides for `clang-tidy` linked in the references!

References:

- http://clang.llvm.org/extra/clang-tidy/
- http://llvm.org/devmtg/2014-04/PDFs/Talks/clang-tidy%20LLVM%20Euro%202014.pdf
- http://clang.llvm.org/extra/clang-tidy/checks/list.html
- https://github.com/Project-OSRM/osrm-backend/pull/1603
2015-09-22 17:32:32 +02:00
Daniel J. Hofmann
aab5092da3 Use Readme.md as mainpage untill we have something better. 2015-09-22 16:26:21 +02:00
Daniel J. Hofmann
65ee5c4bbb Exclude unit tests and benchmarks from doxygen and make it more robust.
Only specify the flags we change from the default.

    doxygen -g Doxyfile

Generates a default Doxyfile.

Also, make the docs not depend on `dot`, but conditionally create graphs
if `dot` is available, and if not still generate docs.
2015-09-22 16:26:21 +02:00
Daniel J. Hofmann
42ab938a19 No longer generate XML from Doxygen, was used for Breathe+Sphinx integration. 2015-09-22 16:26:21 +02:00
Daniel J. Hofmann
2891de2fcd Add dependency on Dot to CMakeLists for Doxygen integration.
Reference:

- http://www.cmake.org/cmake/help/v3.0/module/FindDoxygen.html
2015-09-22 16:26:21 +02:00
Daniel J. Hofmann
ed3758874d Target developers with doxygen output, more callgraphs, internals.
See the changed flags for their detailed description, in short: this
makes the doxygen output even more awesome for developers.
2015-09-22 16:26:21 +02:00
Daniel Patterson
895d8179a2 Adds basic Doxygen support. Run and docs will end up in 2015-09-22 16:26:21 +02:00
Freenerd
e1ac1c4fdc Test that alternative route exists
Complement to a6b44a1470
2015-09-18 17:30:53 +02:00
Daniel Patterson
a6b44a1470 Revert alternative instructions array nesting to previous behaviour. 2015-09-17 09:06:51 -07:00
Daniel J. Hofmann
e8834a68f3 Script for fully automated test bisecting.
Automate cucumber tests bisecting by providing a `git bisect` script.

Because it is stored in source control, but bisecting changes the HEAD,
it is advised to first copy over the script to a place outside source
control, e.g. `/tmp`.

Usage:

    git bisect start HEAD HEAD~10
    bit bisect run /tmp/bisect_cucumber.sh

This automatically configures and builds OSRM, spawns the cucumber tests
and communicates with `git bisect` based on its return code.

Reference:

- man git-bisect
2015-09-16 19:13:31 +02:00
Daniel J. Hofmann
3279cbac24 Extend compressed output lifetime till the async write function finishes.
This extends the compressed output vector's lifetime, as we issue an
asynchronous write operation that only receives a non-owning buffer to
the compressed data.

When the compressed output vector then goes out of scope, its destructor
is called and the data gets (potentially) destroyed. If the asynchronous
write happens afterwards, it's accessing data that is no longer there.

This is the reason for race conditions --- well, for undefined behavior
in general, but it manifests in the routed _sometimes_ not responding at
all.

The fix works like this: keep the compressed output associated with a
connection. Connections inherit from `std::enable_shared_from_this` and
issues a `shared_from_this()` call, passing a `std::shared_ptr` to the
asynchronous write function, thus extending their lifetime.

Connecitons thus manage their lifetime by themselves, extending it when
needed (and of course via the `std::shared_pointers` pointing to it).

Buffer's non owning property, from the `async_write` documentation:

> One or more buffers containing the data to be written. Although
> the buffers object may be copied as necessary, ownership of the
> underlying memory blocks is retained by the caller, which must
> guarantee that they remain valid until the handler is called.

Reference:

- http://www.boost.org/doc/libs/1_59_0/doc/html/boost_asio/reference/async_write/overload1.html
2015-09-16 02:06:58 +02:00
bergwerkgis
5094bad838 kick off AppVeyor to test new binary Windows deps package, refs #1628 2015-09-15 12:23:25 +00:00
Daniel J. Hofmann
94af9b7f13 Caches iterators instead of invoking function calls on every iteration.
This caches iterators, i.e. especially the end iterator when possible.

The problem:

    for (auto it = begin(seq); it != end(seq); ++it)

this has to call `end(seq)` on every iteration, since the compiler is
not able to reason about the call's site effects (to bad, huh).

Instead do it like this:

    for (auto it = begin(seq), end = end(seq); it != end; ++it)

caching the end iterator.

Of course, still better would be:

    for (auto&& each : seq)

if all you want is value semantics.

Why `auto&&` you may ask? Because it binds to everything and never copies!

Skim the referenced proposal (that was rejected, but nevertheless) for a
detailed explanation on range-based for loops and why `auto&&` is great.

Reference:

- http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3853.htm
2015-09-15 12:09:39 +02:00
Patrick Niklaus
8e02263084 Fix off-by one error in decoder and make padding deterministic. 2015-09-14 23:01:38 +02:00