Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Script license header check and update in rust #2067

Closed
wants to merge 1 commit into from

Conversation

masih
Copy link
Member

@masih masih commented Oct 21, 2024

The previous license header update, implemented in bash, would duplicate the headers if run. The rules defined in that script also seem to have been inconsistently applied. Having said that it is not clear what the desired license header rule for this repo is.

The changes here implement the license check and update in rust. It runs far faster, is more maintainable, does not duplicate the license header if already present and would add it if it is missing.

The rules programmed in the rust implementation are specifically picked to reduce the churn in code, and hopefully by the grace of reviewers, learn what the intended rules for this repo should be.

The previous license header update, implemented in bash, would duplicate
the headers if run. The rules defined in that script also seem to have
been inconsistently applied. Having said that it is not clear what the
desired license header rule for this repo is.

The changes here implement the license check and update in rust. It runs
far faster, is more maintainable, does not duplicate the license header
if already present and would add it if it is missing.

The rules programmed in the rust implementation are specifically picked
to reduce the churn in code, and hopefully by the grace of reviewers,
learn what the intended rules for this repo should be.
@codecov-commenter
Copy link

codecov-commenter commented Oct 21, 2024

Codecov Report

Attention: Patch coverage is 0% with 54 lines in your changes missing coverage. Please review.

Project coverage is 75.46%. Comparing base (4cb7342) to head (6823eeb).

Files with missing lines Patch % Lines
scripts/check-license/src/main.rs 0.00% 54 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2067      +/-   ##
==========================================
- Coverage   75.72%   75.46%   -0.26%     
==========================================
  Files         153      154       +1     
  Lines       15677    15731      +54     
==========================================
  Hits        11871    11871              
- Misses       3806     3860      +54     
Files with missing lines Coverage Δ
ipld/blockstore/src/block.rs 61.29% <ø> (ø)
ipld/encoding/src/ipld_block.rs 58.41% <ø> (ø)
scripts/check-license/src/main.rs 0.00% <0.00%> (ø)

@masih masih requested a review from Stebalien October 21, 2024 20:05
@masih masih marked this pull request as ready for review October 21, 2024 20:05
@Stebalien
Copy link
Member

The previous license header update, implemented in bash, would duplicate the headers if run. The rules defined in that script also seem to have been inconsistently applied. Having said that it is not clear what the desired license header rule for this repo is.

I think the idea is:

  1. In general, make sure we have an SPDX line for all checked-in rust files.
  2. If we touch it, add a "copyright PL" line.

The rules are mostly consistent for new files, although they can end up duplicating lines on occasion.

The changes here implement the license check and update in rust. It runs far faster, is more maintainable, does not duplicate the license header if already present and would add it if it is missing.

I'm concerned that in this case, especially because this is rust, the cost of compiling likely exceeds the cost of running them (unless already cached). Even with the cache, It's only slightly faster to use rust and wouldn't even compare with a little bit of bash optimization.

@Stebalien
Copy link
Member

If we just want to make it faster: #2068

@@ -0,0 +1,12 @@
[package]
name = "check-license"
version.workspace = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use a separate version here and specify "publish = false".

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random rust comments.

Comment on lines +97 to +98
file.write_all(license.as_bytes())?;
file.write_all(b"\n")?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd usually write this as:

writeln!(&file, "{license}")?;

String formatting uses compiler magic to do most of the work at compile time, so it'll be very fast.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although, if you want performance a bufwriter might be in order.

println!("Adding license header to: {}", path.display());

let file_content = fs::read_to_string(path)?;
let mut file = OpenOptions::new().write(true).truncate(true).open(path)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: this doesn't actually need to be mutable because Write is implemented for &File (a trick to allow multiple concurrent writers without locking).

.collect();

let mut missing_licenses: Vec<&str> = Vec::new();
for (regex, is_required, text_to_add) in LICENSE_CHECKS.iter() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (regex, is_required, text_to_add) in LICENSE_CHECKS.iter() {
for (regex, is_required, text_to_add) in &LICENSE_CHECKS {

(does the same thing)

}
}

if !found_match && *is_required {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not skip early if not required?

Comment on lines +77 to +83
let mut found_match = false;
for line in &lines {
if regex.is_match(line) {
found_match = true;
break;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let found_match = lines.iter().find(|&l|regex.is_match(l)).is_some()

if !missing_licenses.is_empty() {
println!("Adding license header to: {}", path.display());

let file_content = fs::read_to_string(path)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if you're just going to write it back to a file, you might as well read it as bytes.

file.write_all(b"\n")?;
}

file.write_all(file_content.as_bytes())?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General note: there's no need to mimic cat here. You can just:

  1. Open the file.
  2. Write at the beginning.
  3. Close it.

continue;
}

if path.is_file() && path.extension().and_then(|s| s.to_str()) == Some("rs") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if path.is_file() && path.extension().and_then(|s| s.to_str()) == Some("rs") {
if path.is_file() && path.extension().map(|e| e == "rs").unwrap_or(false) {

(slightly shorter and &OsStr == &str is valid, IIRC)

/// checks all `.rs` files within the specified directory, skipping any directories that are listed
/// in the `IGNORED_DIRECTORIES` set. If a required license is missing, it is automatically added
/// to the beginning of the file.
fn main() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: main can return errors.

/// IGNORED_DIRECTORIES is a static set containing directories to ignore during the license check.
static ref IGNORED_DIRECTORIES: HashSet<PathBuf> = {
let mut set = HashSet::new();
set.insert(PathBuf::from("./target/"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
set.insert(PathBuf::from("./target/"));
set.insert("./target/".into());

Rust is all about the magic (when it works).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it looks like this is being used as a vec.

@masih
Copy link
Member Author

masih commented Oct 22, 2024

Thanks Steven, I have got more than I wanted out of this PR. I'm specifically grateful for random rust comments 🍻

@masih masih closed this Oct 22, 2024
@masih masih deleted the masih/check-license branch October 22, 2024 07:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants