Comments and notes are lost after changing the save location of the document #229

Open
cary-rowen opened this issue Jun 26, 2023 · 8 comments
Assignees
Labels
bug Something isn't working

Comments

@cary-rowen
Collaborator

Describe the Problem

If the user changes the save location of the document, they will not be able to view comments and notes previously saved for this document.

To Reproduce

Steps to reproduce the behavior:

  1. Open a document.
  2. Add some notes or comments.
  3. Close the document and move it to another location.
  4. Reopen the document from the new location.
  5. The previously saved notes and comments can no longer be found.

Expected behavior

Users can still view saved comments and annotations despite changing the save location.

If the problem is related to a file, indicate the file you have opened

None

Desktop (please complete the following information):

  • OS: [Windows 10 22H2 (AMD64) build 19045.3086]
  • Bookworm version [Latest Build]
  • Recent settings you may have changed in Bookworm [None]

Additional context

Can a document URI be an MD5 checksum?
@mush42

@mush42
Collaborator

mush42 commented Jun 26, 2023

Hi @cary-rowen

Initially, Bookworm used the content hash as an identifier, but this was overkill for large documents and is not applicable to documents that are loaded incrementally into memory.

Nevertheless, your point is valid, and the issue will be fixed as soon as possible.

Best

@cary-rowen
Collaborator Author

Hi @mush42

Do you have any new ideas for a solution to this, or would it require too much of a change to implement?
When I reinstalled the OS, the books saved in OneDrive were forced to change locations, so I couldn't see the notes I had saved before, LOL.

Best

@cary-rowen
Collaborator Author

Not long ago, @mush42 provided a solution to this issue in a private chat with me:

#229: That one is easily fixable. For documents we load from the file system, we can use the sha1 hash of the document file and use it as an identifier in the database. Regardless of the path, as long as the bytes are the same, we load the same annotations.
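A minimal sketch of the hashing idea quoted above, assuming Python's standard `hashlib`. The function name `file_sha1` and the chunk size are illustrative and are not taken from Bookworm's code. Reading in fixed-size chunks keeps memory use constant for large files, although the whole file still has to be read once.

```python
import hashlib


def file_sha1(path, chunk_size=1024 * 1024):
    """Return the hex SHA-1 digest of the file at `path`.

    Hypothetical helper: the file is read in chunks so that large
    documents are never loaded into memory all at once.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two copies of the same file in different folders produce the same digest, so the digest could serve as a path-independent key for annotations.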

However, @mush42, you stated in your previous comment that using a file hash may add performance overhead.

Do you or @pauliyobo have any thoughts to share about this?

Thanks

@cary-rowen
Collaborator Author

Hi @pauliyobo

I just saw someone asking this question again in the Telegram group:

Hello, friends, greetings from Colombia. First of all, many thanks to the developers of this program, because it is truly excellent. I would like to ask you a question. I was recently reading a document in Bookworm. I set some bookmarks and made comments in different parts of the text. Then I wanted to change the document's location; that is, I moved it from the downloads folder to a new folder, but when I tried to retrieve the bookmarks and comments, they were no longer available. Is there any way to resolve this situation?

I noticed you made some changes to Bookworm's database and did some in-depth research. Would you be willing to fix this?

Thanks,
Cary

@pauliyobo
Collaborator

Hello.
I think the fundamental issue is that whenever the location of a document changes, Bookworm has no way to detect it.
So when the document is reopened from the new location, it is treated as an entirely new document, which is why all of its annotations appear to be lost.
There are a couple of approaches we could use.

  1. We could save the content hash once, and whenever a book is opened, query the database for that hash and act accordingly. I don't think we'd have significant overhead, even more so if we use a fast hashing algorithm.
  2. Since books do already have identifiers, we could just use what we already have and try to detect a location change through the document URI. We could query the database for a record with the same document title and the same document type. Though it's probably a good idea to prompt the user with a dialog asking whether the document detected is correct, and if so, if they would like the database record to be updated accordingly. If users are confident that they always want this behaviour, it might as well be put behind a configuration option.
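The two approaches above could be combined into a single lookup, sketched below with the standard `sqlite3` module. The `document` table, its columns, and the functions `open_db`, `find_document`, and `relink` are assumptions made for illustration and do not reflect Bookworm's actual schema. A title-and-type match is treated as a weak match that the caller should confirm with the user before relinking.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical `document` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE document ("
        "id INTEGER PRIMARY KEY, uri TEXT, title TEXT, "
        "doc_type TEXT, content_hash TEXT)"
    )
    return conn


def find_document(conn, uri, title, doc_type, content_hash=None):
    """Return (row_id, match_kind) for a known document, else (None, None).

    match_kind is "uri" (same location), "hash" (same bytes, new location),
    or "title" (same title and type only; ask the user to confirm).
    """
    row = conn.execute(
        "SELECT id FROM document WHERE uri = ?", (uri,)
    ).fetchone()
    if row:
        return row[0], "uri"
    if content_hash is not None:
        row = conn.execute(
            "SELECT id FROM document WHERE content_hash = ?", (content_hash,)
        ).fetchone()
        if row:
            return row[0], "hash"
    row = conn.execute(
        "SELECT id FROM document WHERE title = ? AND doc_type = ?",
        (title, doc_type),
    ).fetchone()
    if row:
        return row[0], "title"
    return None, None


def relink(conn, doc_id, new_uri):
    """Point an existing record at the document's new location."""
    conn.execute("UPDATE document SET uri = ? WHERE id = ?", (new_uri, doc_id))
```

A caller might relink silently on a `"hash"` match, show a confirmation dialog on a `"title"` match, and create a new record when nothing matches.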

@cary-rowen @mush42 let me know which approach could be better. I'm leaning more toward the second, simply because we'd be using elements we already have.

@mush42
Collaborator

mush42 commented Dec 31, 2024

@pauliyobo I did use content hashes earlier, but I removed them for some reason.

The justification for using hashes is that for editable documents such as Word and plain text, there is no such thing as "the same document". Since we use offsets, they become invalid the moment the user edits the document. Content hashes can guard against this.
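One way to read "guard against this" is to store the hash of the document content alongside each annotation and to trust the offsets only while the hash still matches. The sketch below assumes that reading. The `Annotation` class and the helper names are hypothetical and not part of Bookworm.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Annotation:
    start: int          # offset where the annotated range begins
    end: int            # offset where the annotated range ends
    note: str
    content_hash: str   # hash of the document text when the note was made


def text_hash(text):
    """Hash the document text (hypothetical helper)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def usable_annotations(annotations, current_text):
    """Keep only annotations made against the current, unedited content."""
    current = text_hash(current_text)
    return [a for a in annotations if a.content_hash == current]
```

With this scheme an edited document simply shows no stale annotations, which avoids misplaced notes but does not recover them.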

@pauliyobo
Collaborator

pauliyobo commented Dec 31, 2024

@mush42
Makes sense. I didn't think of that scenario as I assumed the one we wanted to cover was specifically the one in which the location of the document would change, not the content.
Handling edited content is actually tricky, because then, yeah, there's probably not much we can do to save the annotations related to that document. If there are ways to update the annotations accordingly, do let me know.

@cary-rowen
Collaborator Author

Even with the current solution (using document position), we are not able to handle annotations properly. If the annotated range of an editable document is edited, the offset may not change. Although the annotations still exist, they may already have lost their context.
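A possible way to detect this case, offered only as an illustrative sketch, is to store the annotated text itself next to the offsets and compare it when the document is reopened. The `AnchoredNote` class and both helpers are hypothetical names, not existing Bookworm code.

```python
from dataclasses import dataclass


@dataclass
class AnchoredNote:
    start: int
    end: int
    quoted: str   # the text that was annotated, captured at creation time
    note: str


def make_note(text, start, end, note):
    """Create a note that remembers the text it was attached to."""
    return AnchoredNote(start, end, text[start:end], note)


def still_in_context(text, n):
    """True if the text at the stored offsets is still what was annotated."""
    return text[n.start:n.end] == n.quoted
```

For example, replacing "quick" with "slow!" keeps every offset the same, yet `still_in_context` returns `False` because the annotated words have changed.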


3 participants