Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for incremental update #8

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

d3m3t3r
Copy link
Contributor

@d3m3t3r d3m3t3r commented Aug 12, 2024

Adds support for incremental update of the compile database if it already exists. Basically, if a new compilation entry is detected, it overrides the entry referring to the same source file if it already exists, otherwise it's simply added. The source file is uniquely identified by "canonized" path constructed from directory and file values.

An external library - nlohmann/json (https://github.com/nlohmann/json) - is used as JSON handling is significantly more complex now.

d3m3t3r added 8 commits June 28, 2024 16:38
The existing compile commands database is no longer completely
overwritten but rather updated.  Entries for removed sources are removed
and entries for recompiled sources are updated.
Avoid having multiple entries for the same source file when multiple the commands
refer to it non-canonically. E.g.:
 "cc foo.c" in "bar/" directory vs "cc ../foo.c" in "bar/baz/" directory
 "cc foo.c" vs "cc ./foo.c"
 "cc foo.c" in "bar" directory vs "cc ../bar/foo.cc" in "bar/" directory
 "cc foo.c" vs "cc <symlink to foo.c>"
 ...
Copy link
Owner

@i-ky i-ky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your work!
I've left few comments, please have a look.

Makefile Outdated Show resolved Hide resolved
@@ -61,6 +61,7 @@ Extra options can precede `--` if needed.
[clang]: https://clang.llvm.org
[compile-db-gen]: https://github.com/sunlin7/compile-db-gen
[compiledb]: https://github.com/nickdiego/compiledb
[nlohmann/json]: https://github.com/nlohmann/json
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess from basset user's perspective the most useful link will be this one:

Suggested change
[nlohmann/json]: https://github.com/nlohmann/json
[nlohmann/json]: https://json.nlohmann.me/integration/package_managers/

Comment on lines -265 to +263
bool CompilationDatabase::is_source_file(const string &argument) {
bool is_source_file(const string &argument) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you think that it is a bad idea to encapsulate is_source_file() into CompilationDtabase?

src/basset.cpp Outdated Show resolved Hide resolved
Comment on lines +200 to +202
if (!ifs) {
return true;
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check seems insufficient. Input file may not exist, in which case it is OK to return true; early. But it could as well be permission issue that prevented opening the file, in which case it makes more sense to return false;.

src/basset.cpp Outdated

if (exists(path(directory) / file)) {
index[make_index_key(directory, file)] =
make_pair(directory, (*it)["arguments"]);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will definitely work with compilation database that was generated by basset. However, according to format description there may be "command" instead of "arguments". Would be nice to support both formats. But it means extra complexity to parse shell command into separate arguments. What do you think, is it doable?

src/basset.cpp Outdated
auto json = nlohmann::json::array();
for (const auto &kv : index) {
json += nlohmann::json::object({{"directory", kv.second.first},
{"file", kv.first},
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand, using key as "file" will have an effect that it will always be an absolute path, right?

Comment on lines +229 to +231
if (!ofs) {
return false;
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am afraid it may be too late for this check. save() is called in the very end, after (potentially very time-consuming) build process is over. It would be very disappointing for the user to find out at this point that output file is not writable and all the collected data has nowhere to be dumped. I think this check needs to be done earlier. Maybe when constructing CompilationDatabase?

src/basset.cpp Outdated Show resolved Hide resolved

if (!cdb) {
cerr << "cannot open '" << output << "'\n";
CompilationDatabase cdb(output, is_source_file);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, maybe we should introduce separate arguments for input and output?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants