Retry Policy and Batch Download/Clean Issue During Synchronization #1230

Open
AbderrahmaneAhmam opened this issue Aug 22, 2024 · 0 comments
Hi,

I have found an issue with the retry policy and the handling of batches during the synchronization process. Specifically, the problem occurs when an error is encountered during the "applying changes" step. Here's a summary:

Current Behavior

When changes are being applied and an error occurs, the downloaded batches are deleted automatically (when the `CleanFolder` option is enabled) by the following code:

// If the option to clean the folder is set && the message allows cleaning (it won't allow cleaning if the batch is an error batch)
if (this.Options.CleanFolder)
{
    // Before cleaning, check if we are not applying changes from a snapshot directory
    var cleanFolder = await this.InternalCanCleanFolderAsync(scopeInfo.Name, context.Parameters, message.Changes, cancellationToken, progress).ConfigureAwait(false);

    // Clear the changes because we don't need them anymore
    if (cleanFolder)
    {
        this.Logger.LogInformation($@"[InternalApplyChangesAsync]. Cleaning directory {{directoryName}}.", message.Changes.DirectoryName);
        message.Changes.TryRemoveDirectory();
    }
}

However, if a retry is initiated after an error, only the last batch is downloaded and applied again. This can lead to failures such as foreign key constraint violations when the table in that batch depends on rows from earlier batches that were never applied, because those batch files no longer exist. As a result, the reported error may not reflect the root cause. Worse, if the last table has no dependencies, the synchronization may appear to succeed even though the earlier tables were never properly applied.

Suggested Solutions

To address this issue, I suggest the following possible solutions:

  1. Separate Step for Applying Data: Introduce a separate step specifically for applying data. If an error occurs, the system should retry until the maximum retry count is reached. If the maximum retry count is reached without success, the batch files should be deleted.

  2. Retry from the First Batch: When an error occurs, the retry process should start from the first batch. This would involve re-downloading all the data. While this approach could lead to performance issues, it would ensure that all data is applied correctly.

  3. Batch Folder Management: Create a batch folder keyed by the client ID and avoid deleting it until the next synchronization. The previous batch can then be either deleted or retained based on the outcome of the next sync, keeping disk usage under control without losing the files a retry would need.
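To make solution 3 concrete, here is a minimal sketch of what per-client batch folder management could look like. All names here (`BatchFolderManager`, `GetClientBatchDirectory`, `CleanPreviousBatches`) are hypothetical and are not part of the current API; this is only an illustration of the retention idea, not a proposed implementation:

```csharp
using System;
using System.IO;

// Sketch of per-client batch folder management (solution 3).
// Every identifier here is hypothetical, not an existing API.
public static class BatchFolderManager
{
    // Each client gets a stable folder derived from its client id,
    // e.g. <baseDir>/batches_<clientId>.
    public static string GetClientBatchDirectory(string baseDirectory, Guid clientId)
        => Path.Combine(baseDirectory, $"batches_{clientId:N}");

    // Called at the START of the next sync: the previous batches are
    // only deleted once we know the last sync completed successfully,
    // so a failed apply leaves the files available for a full retry.
    public static void CleanPreviousBatches(string baseDirectory, Guid clientId, bool lastSyncSucceeded)
    {
        var dir = GetClientBatchDirectory(baseDirectory, clientId);
        if (lastSyncSucceeded && Directory.Exists(dir))
            Directory.Delete(dir, recursive: true);
        Directory.CreateDirectory(dir);
    }
}
```

With this layout, a failed apply leaves the batch files on disk, so a retry can re-apply every batch in order instead of only the last one, and the folder is reclaimed at the start of the next successful sync.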

These improvements would help to ensure that data is applied correctly and that errors are handled more effectively during synchronization.
