Skip to content

Reduce binary size #79

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
May 10, 2016
Merged

Reduce binary size #79

merged 17 commits into from
May 10, 2016

Conversation

glynos
Copy link
Member

@glynos glynos commented May 8, 2016

This is a major change coming in.

In a private correspondence, someone told me that they were not able to use the URI on an embedded project because of the binary size. I measured the size of the libnetwork-uri.a file that we output:

Using Ubuntu 15.10, clang 3.7, boost 1.60

glynos@trillian:~/Development/uri/_build$ ls src/libnetwork-uri.a -l
-rw-rw-r-- 1 glynos glynos 11010822 May  8 18:47 src/libnetwork-uri.a

And optimizing for size (-Os):

glynos@trillian:~/Development/uri/_build$ ls src/libnetwork-uri.a -l
-rw-rw-r-- 1 glynos glynos 1373778 May  8 18:50 src/libnetwork-uri.a

This is considerable for a library that does little more than process a string. The explanation for the size comes from the fact that we use Boost.Spirit to perform the parsing. There are other reasons to move away from Boost.Spirit (the number of source files it pulls in, compile times, the error messages (!)), so I decided just to dive in and write a URI parser from scratch.

Now, I have updated the library so that all existing tests succeed with the new parser. It didn't take as long as I had first feared. It is pretty clean, and does not allocate any memory (apart from the original std::string object). It is faster to run and the compile times are much reduced. And the motivating factor (binary sizes) show that this was the right thing to do:

glynos@trillian:~/Development/uri/_build$ ls src/libnetwork-uri.a -l
-rw-rw-r-- 1 glynos glynos 667262 May  8 18:51 src/libnetwork-uri.a

And optimizing for size (-Os):

glynos@trillian:~/Development/uri/_build$ ls src/libnetwork-uri.a -l
-rw-rw-r-- 1 glynos glynos 183204 May  8 18:51 src/libnetwork-uri.a

That's a 16.5x improvement, and 7.5x with the optimization flag set.

Also, I feel that the new code is much cleaner. You don't need to be Boost.Spirit guru to understand it.

So what would be the downsides to re-writing the parser from scratch (apart from me losing part of a weekend)?

  1. Parts of the validation are incomplete - i.e. right now, the new parser is pretty permissive to errors in the host name and path. However, our test suite is fully green and showed no regression. Improvements can be made continuously.
  2. I didn't really look at the the other parts of the URI, though I eventually want to redesign these anyway. With one exception, all of the normalization, comparison, and resolution tests passed. The exception is very, very weird though, and I have commented out the offending code for now (on GCC and Clang on my local machine it works, but Travis totally rejects it - see uri.cpp lines 356-361).
  3. It doesn't work on MSVC at all. I tried with MSVC 2013, and I don't yet have access to MSVC 2015.

@deanberris
Copy link
Member

LGTM

Just to be clear, we're now using a hand-rolled parser. That's good. However, you were measuring a debug build of the library, right? Have you tried a release build to see if the symbols that are exported are still there?

I think it's the right direction and it's better we do this sooner than later.

@deanberris deanberris merged commit adba3f9 into cpp-netlib:master May 10, 2016
@glynos glynos deleted the reduce_binary_size branch May 18, 2016 18:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants