Remove limit on maximum number of files #11

Open
DavidVentura wants to merge 7 commits into master
Conversation

DavidVentura (Contributor)

Fixes #10.

After the naive implementation (just bumping the u16 and the protocol version), fastsync crashed with "too many open files" -- this is why I changed the fds to be opened on demand (which in turn led to the vanishing-file precautions).

The test with 20k files runs in 0.8s on my machine.
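
A minimal sketch of the "open on demand" approach described above; `FileEntry` and `send_file` are placeholder names for illustration, not the actual fastsync types:

```rust
use std::fs::File;
use std::io::{self, ErrorKind};
use std::path::PathBuf;

/// Placeholder for one entry from the file list.
struct FileEntry {
    path: PathBuf,
}

/// Open each file only when it is its turn to be sent, so each send thread
/// holds at most one fd at a time instead of one fd per listed file.
fn send_file(entry: &FileEntry) -> io::Result<()> {
    let file = match File::open(&entry.path) {
        Ok(f) => f,
        // The file was listed earlier but has vanished since; skip it
        // instead of treating this as a fatal error.
        Err(err) if err.kind() == ErrorKind::NotFound => return Ok(()),
        Err(err) => return Err(err),
    };
    // ... read `file` in chunks and hand them to the network writer ...
    let _ = file;
    Ok(())
}
```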

DavidVentura requested a review from ruuda on November 26, 2024 at 17:23
kucharskim left a comment:

Not ready yet.

src/main.rs Outdated
Comment on lines 481 to 489
// Resize the file to its final size already:
// * So that the file system can do a better job of allocating
//   a single extent for it, and it doesn't have to fragment
//   the file.
// * If we run out of space, we learn about that before we waste
//   time on the transfer (although then maybe we should do it
//   before we receive a chunk after all?).
// This can make debugging a bit harder, because when you look
// at just the file size you might think it's fully transferred.
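
The quoted comment describes preallocating the destination file. A minimal sketch of that idea using std::fs::File::set_len; the `preallocate` helper and its surrounding details are illustrative, not the actual fastsync code:

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::path::Path;

/// Hypothetical helper: create the destination file and resize it to its
/// final size up front, so the filesystem can try to allocate a single
/// contiguous extent and running out of space is detected before any data
/// is transferred.
fn preallocate(path: &Path, final_len: u64) -> io::Result<File> {
    let file = OpenOptions::new().write(true).create(true).open(path)?;
    file.set_len(final_len)?;
    Ok(file)
}
```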

kucharskim:
Hm, this is not resume-friendly. If the transfer fails in the middle, we wouldn't be able to tell from the size alone that the transfer wasn't done. I'm not sure this is worth it.

DavidVentura (author):
If we want to implement the ability to resume, we need to make more changes (e.g. the file list currently does not send metadata), so I wouldn't worry about it now.

kucharskim:
I guess it would be fine if the target were a temporary location that gets renamed into place after the transfer. That would be better.
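
A minimal sketch of that temporary-location idea (write under a temporary name, then rename into place once the transfer completed); the `.part` suffix and the `finalize` helper are illustrative, not something fastsync actually does:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: the receiver writes incoming data to `<dest>.part`
/// and only renames it to `dest` once the transfer finished, so an
/// interrupted transfer never leaves a full-sized but incomplete file at
/// the final path.
fn finalize(dest: &Path) -> io::Result<()> {
    let mut tmp = dest.as_os_str().to_owned();
    tmp.push(".part");
    let tmp = PathBuf::from(tmp);
    // ... the data was written to `tmp` during the transfer ...
    // On the same filesystem this rename is atomic on POSIX systems.
    fs::rename(&tmp, dest)
}
```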

ruuda (Contributor) left a comment:

I have some suggestions. I admit that I am totally over-engineering this, but on the other hand, this is kind of the toy project where we get to do that and it’s not a lot of work, so let’s make it faster 😈

src/main.rs Outdated
let (events_tx, events_rx) = std::sync::mpsc::channel::<SenderEvent>();
let cwd = env::current_dir().unwrap();
thread::spawn(|| {
    let td = TempDir::new_in(".").unwrap();
ruuda (Contributor):

Probably better to run this on a ramdisk. I didn't do the math to check whether my worry is justified, but I am worried about burning through my SSD if I run this on my laptop.

DavidVentura (author):
It's now done in this weird fashion because we don't allow sending absolute paths.

ruuda (Contributor):

Ah, right, yes. But we could chdir into a directory in /tmp and run from there? (Or does that break the Rust test runner, which by default runs tests in parallel?) (Sorry for being so paranoid, but I seem to be very good at corrupting btrfs file systems even when I'm not actively trying, and my root is on btrfs; I'm not very eager to tempt fate like this 🙊)

DavidVentura (author):

No worries, it is now like that (chdir to /tmp), which is what I meant by "a weird fashion".
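
For reference, a minimal sketch of what that can look like in a test, assuming the tempfile crate; note that std::env::set_current_dir changes the working directory for the whole process, which is where the concern about the parallel test runner comes from:

```rust
#[test]
fn transfer_many_files() {
    // Create a scratch directory under /tmp and work from there, so the
    // test never writes large amounts of data into the repository checkout.
    let scratch = tempfile::TempDir::new_in("/tmp").unwrap();
    std::env::set_current_dir(scratch.path()).unwrap();
    // ... set up sender and receiver and run the transfer ...
}
```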

src/main.rs Outdated
// we effectively limit the amount of open files to the amount of send threads.
// However, this now introduces the possibility of files getting deleted between
// getting listed and their turn for transfer.
// A vanishing file is expected, it is not a transfer-terminating event.
ruuda (Contributor):

It does mess up the progress measurement, but I suppose we could treat a vanished file as instantly completing all of its size bytes. (Which would mess up transfer rate estimates, but I think that's okay. Ah, actually we could decrement the total we expect if we see a vanishing file.)
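
A minimal sketch of that second option (subtract a vanished file's size from the expected total); the `Progress` struct here is illustrative, not fastsync's actual progress tracking:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Illustrative progress counters shared between send threads.
struct Progress {
    bytes_total: AtomicU64,
    bytes_sent: AtomicU64,
}

impl Progress {
    /// A listed file no longer exists: remove its size from the total, so
    /// the percentage and rate estimates stay meaningful instead of counting
    /// bytes that were never actually transferred as "sent".
    fn file_vanished(&self, size: u64) {
        self.bytes_total.fetch_sub(size, Ordering::Relaxed);
    }

    /// Called after each chunk that actually went over the wire.
    fn add_sent(&self, n: u64) {
        self.bytes_sent.fetch_add(n, Ordering::Relaxed);
    }
}
```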

ruuda (Contributor) left a comment:
Looks almost good, but I caught a few more issues.

ruuda (Contributor) left a comment:

LGTM :shipit:

ruuda dismissed kucharskim’s stale review on January 7, 2025 at 18:45

“Not ready yet” comment no longer applies.

Successfully merging this pull request may close these issues.

2^16 file limit too small for a RocksDB directory
3 participants