Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Any file changes that occurred before InotifyFileWatcher was started but after the preceding EOF would not be detected #1

Merged
merged 2 commits into from
Aug 31, 2021

Conversation

theencee
Copy link

Description

Tail starts the FileWatcher (both INotify and Polling versions) once it detects an EOF for the first time (if follow is enabled).

If in between the EOF and when the INotifyFileWatcher starts, the file is modified, it won't notice the changes made to the file. This fix checks manually right after the watcher begins (without using fsnotify) if the file was modified and sends a delete, modify or truncate notification as appropriate.

This isn't a major concern if the file that's being tailed has a lot of modifications (e.g. audit log), because the next write will end up picking up all modifications, but it's still accurate.

Note: The upstream has a PR that tries to start the watcher right before the tailing begins. Unfortunately this causes a few tests related to re-seeking to fail. That is because the watcher relies on the fact that when it's started, its initial position is the size of the file at the point it's started. It uses that to detect if a file was truncated. If the position is 0 (as it would be at the start), it would never detect the first truncation and lines would be missed. While that could be re-written, I feel it's safer to keep that behavior as others could be relying on it (it's been there since the original fork of this project).

Testing Performed

It's not straight forward to simulate this behavior via a unit test. However it can be faked by adding in explict sleep commands within the tail itself. Namely in the waitForChanges method of tail.go:

func (tail *Tail) waitForChanges() error {
	if tail.changes == nil {
                 time.Sleep(5 * time.Second)
		pos, err := tail.file.Seek(0, io.SeekCurrent)
		if err != nil {
			return err
		}
		tail.changes, err = tail.watcher.ChangeEvents(&tail.Tomb, pos)
		if err != nil {
			return err
		}
	}
        // snip...
}

Once that sleep is added it recreates the scenario where things could get written to post-EOF and pre-watcher start. Thanks @misberner for the suggestion!

For more details, ref the Confluence page about this section.

Copy link

@misberner misberner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to go back and compare this with the experiment I did a while back. If I am not mistaken, I needed to make changes in two different places

watch/inotify.go Outdated
util.Fatal("Failed to stat file %v: %v", fw.Filename, err)
}

if fw.Size > 0 && fw.Size > fi.Size() { // old file size was larger than now => truncated

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can fi.Size() ever be negative? If not, fw.Size() > 0 is redundant

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't imagine os.Stat would return a negative file size. Removed

watch/inotify.go Outdated
changes.NotifyDeleted()
return nil, true
}
util.Fatal("Failed to stat file %v: %v", fw.Filename, err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems in line with the library, but I find it highly concerning if I can make a program crash by concurrently removing read access from the directory containing a watched file :/

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense even though it wouldn't apply to our use case. There are even comments like: // XXX: report this error back to the user in the code which I highly doubt would ever actually be fixed. In this case I think we should log an error and treat it as a deleted file

Copy link
Author

@theencee theencee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@misberner I think the 2nd change you are referring to is a separate issue that I'm not addressing here. Your suggested fix was in tail.go but I moved it to the itnotify watcher itself

watch/inotify.go Outdated
util.Fatal("Failed to stat file %v: %v", fw.Filename, err)
}

if fw.Size > 0 && fw.Size > fi.Size() { // old file size was larger than now => truncated
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't imagine os.Stat would return a negative file size. Removed

watch/inotify.go Outdated
changes.NotifyDeleted()
return nil, true
}
util.Fatal("Failed to stat file %v: %v", fw.Filename, err)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense even though it wouldn't apply to our use case. There are even comments like: // XXX: report this error back to the user in the code which I highly doubt would ever actually be fixed. In this case I think we should log an error and treat it as a deleted file

}
_ = RemoveWatch(fw.Filename)
changes.NotifyDeleted()
return nil, true
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

until I looked up the usage I thought that the bool indicated whether the file was modified or not.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's if the file was deleted or not

Copy link

@misberner misberner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked at my local working copy and couldn't really figure out what I thought was missing here, so let's go ahead with that fix

@theencee theencee merged commit 4070356 into master Aug 31, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants