Handle read errors in alloc_cluster() to avoid "infinite" loop #16

Open
wants to merge 1 commit into
base: master

Conversation

FreddieChopin
Contributor

alloc_cluster() ignores read errors: it simply tries the next cluster until it either finds an empty one or has checked every cluster on the volume. A large volume can have a lot of clusters, e.g. a 16 GB SD card with a standard format has about 2 million. When a "persistent" read error occurs during alloc_cluster() (after a successful mount), for example because the volume is physically disconnected at a very inconvenient moment, or because the volume is or becomes damaged and all further reads fail, this loop becomes practically infinite. In one application we found a damaged SD card on which reads of the first 600-700 blocks work perfectly fine, but any read beyond that makes the SDIO interface time out (the card does not switch to the expected state within the specified time). With a timeout of ~100 ms per operation, the function would loop for over 2 days. The same card also fails in a PC, where any read beyond the first ~350 kB (about 700 blocks) fails with an I/O error.
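As a rough check of the "over 2 days" figure, here is a minimal sketch of the arithmetic. It assumes, as the description does, one timed-out read per cluster checked; the function name and its parameters are hypothetical and are not part of the library.

```c
#include <stdint.h>

/*
 * Worst-case scan time in seconds if every cluster check ends in a
 * timed-out read. Both arguments are illustrative inputs, not values
 * taken from the driver.
 */
static uint64_t worst_case_scan_seconds(uint64_t num_clusters,
                                        uint64_t timeout_ms)
{
    return (num_clusters * timeout_ms) / 1000u;
}
```

With about 2 million clusters and a ~100 ms timeout this gives 200,000 seconds, which is a little over 2 days (one day is 86,400 seconds).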

Fix this by returning from alloc_cluster() with an error when any read operation fails.
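The idea can be sketched as follows. This is not the actual patch: the error codes, the `read_entry_fn` callback, the `fake_fat` device and `alloc_cluster_sketch()` are all hypothetical names made up for illustration. The only point is that a failed read ends the scan immediately instead of moving on to the next cluster.

```c
#include <stdint.h>

/* Hypothetical error codes, for this sketch only. */
#define SKETCH_ERR_IO        (-1)
#define SKETCH_ERR_NO_SPACE  (-2)

/* Hypothetical callback: reads the FAT entry for one cluster. */
typedef int (*read_entry_fn)(void *ctx, uint32_t cluster, uint32_t *value);

/* Fake device: every read at or beyond fail_from returns an I/O error. */
struct fake_fat {
    uint32_t fail_from;
    uint32_t reads;      /* number of read attempts made so far */
};

static int fake_read_entry(void *ctx, uint32_t cluster, uint32_t *value)
{
    struct fake_fat *f = ctx;

    f->reads++;
    if (cluster >= f->fail_from)
        return SKETCH_ERR_IO;

    *value = 1;          /* non-zero: cluster is in use */
    return 0;
}

/* Scan count clusters starting at first, looking for a free entry. */
static int alloc_cluster_sketch(void *ctx, read_entry_fn read_entry,
                                uint32_t first, uint32_t count,
                                uint32_t *out)
{
    for (uint32_t i = 0; i < count; i++) {
        uint32_t cluster = first + i;
        uint32_t value;
        int err = read_entry(ctx, cluster, &value);

        /* Propagate the read error instead of trying the next cluster. */
        if (err < 0)
            return err;

        if (value == 0) {
            *out = cluster;
            return 0;
        }
    }

    return SKETCH_ERR_NO_SPACE;
}
```

With a fake device that fails from cluster 700 onwards, a scan over 2 million clusters starting at cluster 2 stops after 699 read attempts rather than paying a timeout for every remaining cluster.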

Fixes #15

Successfully merging this pull request may close these issues.

Ignoring read errors in alloc_cluster() can cause an "almost infinite" loop with broken SD cards