Multiline Regex #1694
-
I'm curious: how does the multiline regex mode work? When I use the -U mode to match newlines, I assumed that ripgrep reads the entire file.
on http://opus.nlpl.eu/download.php?f=OpenSubtitles/v2018/mono/OpenSubtitles.raw.en.gz, |
Replies: 1 comment 3 replies
-
ripgrep does read the entire file, so I think I would have to scrutinize how you are measuring that it's only using 8MB of memory. On my Linux system, my `time` command is configured to tell me peak memory usage. Here are a few different variations of your command:

The first command is yours exactly, while the second command is the same but explicitly disables the use of memory maps. I suspect that's what is confounding you here. Namely, instead of reading the entire file onto the heap, ripgrep will try to memory map the file first. In cases where you're searching a huge file, and especially in cases where ripgrep has to have the entire file in a single contiguous memory region, memory mapping can be substantially faster.

On Linux, when I memory map a file like this, it's included in the "RSS" or "resident set size" of the process, so the maximum memory usage reported for that process correctly includes it. Note though that memory mapping a large file will not necessarily use an amount of memory equal to the size of the file. It is really up to the operating system to manage it.

That means it's possible that the way you're measuring memory usage isn't quite correct. But it might also be possible that Windows really isn't loading more than 8MB of the file into memory at a time. Whatever the case, this is transparent to ripgrep. Sadly, I'm not a Windows user or expert, so I'm not sure which is the case here and wouldn't know how to discover it without research.

The last command above tweaks the pattern slightly to remove the |
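As a rough illustration of the heap-versus-memory-map distinction described in this reply (this is not one of the commands from the thread, and it is not ripgrep's actual implementation), here is a minimal Python sketch. The pattern, the function names, and the sample file are all hypothetical. In both cases the regex sees the whole file as one contiguous buffer; the difference is whether the process copies the file onto its heap or lets the operating system page it in on demand.

```python
import mmap
import os
import re
import tempfile

# Hypothetical multiline pattern: a match has to span a line break.
PATTERN = re.compile(rb"foo\nbar")


def count_heap(path):
    """Copy the whole file onto the heap, then search it as one buffer."""
    with open(path, "rb") as f:
        data = f.read()
    return len(PATTERN.findall(data))


def count_mmap(path):
    """Memory map the file and search the mapping as one buffer.

    How much of the file is resident at any moment is decided by the OS.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return len(PATTERN.findall(mm))


def make_sample(lines=1000):
    """Write a small throwaway file; every tenth entry contains a match."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(lines):
            f.write(b"foo\nbar\n" if i % 10 == 0 else b"baz\n")
    return path
```

Both functions should return the same count for the same file. Any difference would show up in how the memory is accounted for, which depends on the operating system and on the tool used to measure it.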