Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tree: fix lba_count size calculation #803

Merged
merged 1 commit into from
Apr 3, 2024
Merged

Conversation

igaw
Copy link
Collaborator

@igaw igaw commented Mar 28, 2024

The kernel reports the size in 512 byte units, which is might not match with lba_size.

Fixes: linux-nvme/nvme-cli#2260

src/nvme/tree.c Outdated
* size is in 512 bytes units and lba_count is in lba_size which are not
* necessarily the same.
*/
ns->lba_count = size >> (ns->lba_shift - 9);
Copy link
Contributor

@ikegami-t ikegami-t Mar 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems better to use SECTOR_SHIFT instead of the hard coded value 9 and also seems more better to use lba_size and SECTOR_SIZE to calculate lba_count as below as you mentioned on the issue ticket.

ns->lba_count = size * SECTOR_SIZE / ns->lba_size;

or

ns->lba_count = size / (ns->lba_size / SECTOR_SIZE);

lba_count = size / (logical_block_size/512)

By the way seems also the driver implementation itself better to be changed as correctly use the logical block size for NVMe specification to convert to the size instead of the sector size.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The kernel uses nvme_sec_to_lba for converting the sector numbers to LBA numbers. I'd rather go with this version and avoid multiplication which can easily overflow.

/*
 * Convert a 512B sector number to a device logical block number.
 */
static inline u64 nvme_sect_to_lba(struct nvme_ns_head *head, sector_t sector)
{
	return sector >> (head->lba_shift - SECTOR_SHIFT);
}

SECTOR_SIZE: alright, let me add it, we don't have it defined yet.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way seems also the driver implementation itself better to be changed as correctly use the logical block size for NVMe specification to convert to the size instead of the sector size.

Not sure if I understand you correctly. Are you suggesting that the sysfs entry size should report size in bytes and not sector numbers? If so, the sysfs entries are Linux API and thus can't be changed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay thanks for the reply to add the SECTOR_SIZE. Sorry about the size I tried to mention as change the sector size unit itself from 512 bytes to 4,096 bytes as same with the logical_block_size. By the way is there any API documentation about the sys entry size? Just confirmed some ABI sysfs block desciptions related but seems not described about the block device sector size.

  1. /sys/block//queue/chunk_sectors
    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/stable/sysfs-block#n145
  2. /sys/block//queue/max_hw_sectors_kb
    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/stable/sysfs-block#n395
  3. /sys/block//queue/physical_block_size
    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/stable/sysfs-block#n514

Also just reconfirmed the SECTOR_SHIFT implementation at first I thought the definition can be changed for the hardware device but the linux kernel should be supported on multiple hardware devices so now I am thinking it is only supported the one size unit as implemented 512 bytes. Is this understanding correct? (But still the block device size seems not described as the 512 bytes unit.)
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/blk_types.h#n31

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, all sysfs entries are considered API, but not all of them are documented. The size entries is not documented yet. Feel free to submit a patch.

The block layer supports larger physical block sizes, e.g. 4k bytes, but the core calculates all in
fixed sector size of 512 bytes. So there is no real limitation at this level. What is not supported is
larger block size I/O. There is an active discussion going on https://lore.kernel.org/linux-block/03ebbc5f-2ff5-4f3c-8c5b-544413c55257@suse.de/

Copy link
Contributor

@ikegami-t ikegami-t Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for your advice and information. I will do try to understand them more later.

(Added)
Notes:

  1. If the larger physical block size supported in future the NVMe max segments size also can be increased since in the past tried to do it as below.
    https://lore.kernel.org/linux-nvme/20200323182324.3243-1-ikegami.t@gmail.com/
  2. The NVMe max segments size increased to 8,192 KB from 4,092 KB before as below.
    drivers/nvme/host/pci.c:
- #define NVME_MAX_KB_SZ  4096
+ #define NVME_MAX_KB_SZ  8192
  #define NVME_MAX_SEGS   128

The kernel reports the size in 512 byte units, which is might not match
with lba_size.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
@igaw igaw merged commit a347a09 into linux-nvme:master Apr 3, 2024
14 checks passed
@igaw igaw deleted the fix-lba-size branch April 3, 2024 15:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wrong capacity for drive under usage with nvme-cli 2.8.
2 participants