Handle read errors in alloc_cluster() to avoid "infinite" loop #16

FreddieChopin · 2023-04-25T09:55:14Z

alloc_cluster() ignores read errors and simply tries the next cluster until it either succeeds (finds an empty one) or runs out of clusters (after checking all of them). A large volume may have a lot of clusters, e.g. a 16 GB SD card with a standard format has about 2 million of them. When a "persistent" read error happens during alloc_cluster() (after a successful mount operation), for example because the volume is physically disconnected at a very inconvenient moment or because the volume is or becomes damaged so that all further reads fail, this loop becomes practically infinite.

In one application we found a damaged SD card for which reads of the first 600-700 blocks work perfectly fine, but any read beyond that makes the SDIO interface time out (the card does not switch to the expected state within the specified time). As the timeout for the operation is ~100 ms, the function would loop for over 2 days (about 2 million failed reads at ~100 ms each is roughly 200,000 seconds). The same card also fails to work in a PC, where any read beyond the first ~350 kB (which is about 700 blocks) fails with an I/O error.

Fix this by returning from alloc_cluster() with an error when any read operation fails.
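
For illustration only, here is a minimal sketch of the intended behaviour, assuming a simplified in-memory volume. All names here (`fake_volume`, `read_fat_entry`, `alloc_cluster_sketch`, `worst_case_seconds` and the `ALLOC_*` codes) are hypothetical and are not the library's actual API or the actual patch. It shows the scan stopping at the first failed read instead of moving on to the next cluster, plus the worst-case time estimate from the description.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types and error codes, for illustration only. */
typedef uint32_t cluster_t;

enum { ALLOC_OK = 0, ALLOC_ERR_IO = -1, ALLOC_ERR_NO_SPACE = -2 };

struct fake_volume {
    cluster_t num_clusters;  /* total number of FAT entries */
    cluster_t first_bad;     /* reads of entries >= first_bad fail */
    const uint32_t *fat;     /* FAT entries; 0 means "free" */
    unsigned long reads;     /* number of read attempts so far */
};

/* Simulated FAT entry read: fails for every entry at or past first_bad. */
static int read_fat_entry(struct fake_volume *v, cluster_t c, uint32_t *out)
{
    v->reads++;
    if (c >= v->first_bad)
        return ALLOC_ERR_IO;
    *out = v->fat[c];
    return ALLOC_OK;
}

/* Scan for a free cluster, returning immediately on the first read error
 * rather than ignoring it and trying the next cluster. */
static int alloc_cluster_sketch(struct fake_volume *v, cluster_t *out)
{
    /* Data clusters on FAT volumes start at index 2. */
    for (cluster_t c = 2; c < v->num_clusters; c++) {
        uint32_t entry;
        int err = read_fat_entry(v, c, &entry);

        if (err < 0)
            return err;          /* propagate the I/O error to the caller */
        if (entry == 0) {
            *out = c;            /* found a free cluster */
            return ALLOC_OK;
        }
    }
    return ALLOC_ERR_NO_SPACE;   /* every entry was read and none was free */
}

/* Worst-case duration if every cluster read fails after a full timeout
 * and the error is ignored. */
static uint64_t worst_case_seconds(uint64_t clusters, uint64_t timeout_ms)
{
    return clusters * timeout_ms / 1000u;
}
```

With the figures above, `worst_case_seconds(2000000, 100)` gives 200000 seconds, which is a bit over 2 days. With the early return, the same failure costs a single timeout.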

Fixes #15

