
According to an HN discussion, RG decides its read strategy (mmap vs. no-mmap) based on the predicted workload. In theory, could this be optimized if the file size were known ahead of time? #1769

Answered by BurntSushi
Qix- asked this question in General

I think you've kind of answered the question yourself already. You've already mentioned the main caveat: the additional stat call could indeed doom the strategy in many workloads. There are two circumstances where ripgrep will stat every file (on Unix at least), but both are pretty uncommon. The first is if ripgrep has its output redirected to a file. It will stat each file it searches to ensure that it doesn't search the file its output is being redirected to (otherwise you could end up with an infinite feedback loop). The other is if the --max-filesize flag is used. A stat is necessary to determine whether a file's size is too big to search. Doing the memory map optimization you mention …
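To make the trade-off concrete, here is a minimal sketch of what a size-based decision could look like. It is not ripgrep's actual code: the `ReadStrategy` enum, the function names, and the `mmap_threshold` parameter are all hypothetical, and the idea that larger files should prefer a memory map is an assumption made purely for illustration. The only real API used is `std::fs::metadata`, which is where the extra stat call comes from.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical read strategies; these names are illustrative only
/// and are not taken from ripgrep.
#[derive(Debug, PartialEq, Eq)]
pub enum ReadStrategy {
    /// File exceeds the size limit and is not searched at all.
    Skip,
    /// Read through an ordinary buffered reader.
    Buffered,
    /// Search via a memory map.
    Mmap,
}

/// Pick a strategy from an already-known file length.
/// `max_filesize` mirrors the idea behind a size-limit flag;
/// `mmap_threshold` is an assumed cutoff, not a real ripgrep setting.
pub fn choose_strategy(
    len: u64,
    max_filesize: Option<u64>,
    mmap_threshold: u64,
) -> ReadStrategy {
    if let Some(max) = max_filesize {
        if len > max {
            return ReadStrategy::Skip;
        }
    }
    if len >= mmap_threshold {
        ReadStrategy::Mmap
    } else {
        ReadStrategy::Buffered
    }
}

/// Same decision, but starting from a path. The `fs::metadata` call
/// is the additional stat per file that the answer warns about.
pub fn strategy_for_path(
    path: &Path,
    max_filesize: Option<u64>,
    mmap_threshold: u64,
) -> io::Result<ReadStrategy> {
    let len = fs::metadata(path)?.len();
    Ok(choose_strategy(len, max_filesize, mmap_threshold))
}
```

The decision itself is trivial once the length is known. The cost is entirely in `strategy_for_path`, which pays for one metadata lookup on every file searched, and that is the overhead that could outweigh any gain when a workload consists of many small files.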

Answer selected by Qix-