-
Notifications
You must be signed in to change notification settings - Fork 91
Include and path handling optimization #447
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It sounds reasonable to match compilers. However gcc will look in current directory in this little test case:
|
This test case works because
However, this works:
|
I don't remember the motivations. but basically it sounds reasonable to match gcc. |
I don't mind at all that realFilename is removed. So imho feel free to continue. I would like to see how the tests will be affected.. |
06cd050
to
51842f7
Compare
I would like that you make a draft PR in cppcheck repo which updates simplecpp in cppcheck. It would be interesting to see if all the tests in Cppcheck will also be happy with these changes. |
I've made some additional changes to the path handling, including a pretty big simplification of how path strings are computed. This means that many helper functions have been able to be removed, which I think makes the code cleaner and easier to follow, and also a bit faster. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nits.
I am pretty happy about this and feel that the code is a lot cleaner overall.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to get the has_include done first but then I approve this..
This is a draft with some suggested changes to how includes are handled. There are several style issues at the moment, but I would like to discuss the overall approach before addressing those.
The changes to the include lookup procedure are based on the observation that files are very often included multiple times, but very rarely included with a different name. To take advantage of this, the lookup procedure first tries to do an optimistic match using the quick and simple path canonicalization provided by
simplifyPath
. In the rare case that a file is included with a name that hasn't been seen before (due to absoluteness change, capitalization, or link), then and only then is that file compared against the cached files using a more accurate but expensive procedure.For the case where the file identity cannot be determined with
simplifyPath
, the file id is used (FILE_ID_INFO
on windows,st_dev
&st_ino
on linux). This combined with the fact that file identity checking is delayed until absolutely necessary means that all calls torealFilename
have been removed. This is where most of the timesave comes from.Using file id means that the lookup procedure has less platform-specific code (only for getting and comparing the file id), and the behavior for case-sensitivty and links is uniform across platforms. The behavior for symlinks matches that of gcc and clang, which is to allows pragma once to work when a file is included with a different name. There's no standard definition for what pragma once should do, but personally I think this is the desired behavior. Intuitively, there are no situations where it is desirable for
#pragma once
to produce redefinition errors if a file is included multiple times using different names.The code runs about 2 minutes faster for a large test case on windows with many nested includes (12m40s down from 14m40s), though it's a contrived test case so I'm not sure how representative that is of a real use case.
Some other things;
-I.
is explicitly passed, but simplecpp does. I would like to clarify whether or not this is intentional or if the behavior should match other compilers.