Warning: specifying a directory and file structure is easy. However, it is also very easy to make it too large. Width=5,depth=5,files=5
results in 3905 directories and 15625 files!
Vdbench allows 32 million files per FSD, 128 million when running 64bit java. About 64 bytes of Java heapspace is needed per file, possibly causing memory problems. You may have to update the Xmx
parameter in your ./vdbench
script.
警告:指定目录和文件结构很容易,但也很容易使其过大。Width=5,depth=5,files=5
会产生3905个目录和15625个文件!
Vdbench允许每个FSD最多有3200万个文件,在64位Java上运行时为1.28亿个文件。每个文件大约需要64字节的Java堆空间,可能会导致内存问题。您可能需要更新./vdbench
脚本中的Xmx
参数。
fsd=
uniquely identifies each File System Definition. The FSD name is used by the Filesystem Workload Definition (FWD) parameter to identify which FSD(s) to use for this workload.
When you specify default
as the FSD name, the values entered will be used as default for all FSD parameters that follow.
fsd=
用于唯一标识每个文件系统定义。该FSD名称由文件系统工作负载定义(FWD)参数使用,以确定要用于此工作负载的哪个FSD(或多个FSD)。
当您将default
指定为FSD名称时,输入的值将用作随后所有FSD参数的默认值。
The name of the directory where the directory structure will be created. This anchor may not be a parent or child directory of an anchor defined in a different FSD. If this anchor directory is the same as an anchor directory in a different FSD the directory structure (width, depth etc) must be identical. If this directory does not exist, Vdbench will create it for you. If you also want Vdbench to create this directory’s parent directories, specify ‘create_anchor=yes’.
这是一个目录的名称,用于创建目录结构。这个锚点不能是另一个FSD中定义的锚点的父目录或子目录。如果这个锚点目录与另一个FSD中的锚点目录相同,则目录结构(宽度、深度等)必须相同。如果这个目录不存在,Vdbench会为您创建它。如果您还想让Vdbench创建这个目录的父目录,请指定create_anchor=yes
。
# Example:
anchor=/file/system/
This parameter allows you to quickly create a sequence of FSDs, e.g. fsd=fsd,anchor=/dir,count=(1,5)
results in fsd1-fsd5 for /dir1 through /dir5
该参数允许您快速创建一系列FSDs,例如,fsd=fsd,anchor=/dir,count=(1,5)
将在/dir1
到/dir5
上生成fsd1
到fsd5
。
With Vdbench running multiple slaves and optionally multiple hosts, communications between slaves and hosts about a file’s status becomes difficult. The overhead involved to have all these slaves communicate with each other about what they are doing with the files just becomes too expensive. You don’t want one slave to delete a file that a different slave is currently reading or writing. Vdbench therefore does not allow you to share FSDs across slaves and hosts.
That of course all sounds great until you start working with huge file systems. You just filled up 500 terabytes of disk files and you then decide that you want to share that data with one or more remote hosts. Recreating this whole file structure from scratch just takes too long. What to do?
When specifying shared=yes
, Vdbench will allow you to share a File System Definition (FSD). It does this by allowing each slave to use only every nth
file as is defined in the FSD file structure, where n
equals the amount of slaves.
This means that the different hosts won’t step on each other’s toes, with one exception: When you specify format=yes
, Vdbench first deletes an already existing file structure. Since this can be an old file structure, Vdbench cannot spread the file deletes around, letting each slave delete his nth
file. Each slave therefore tries to delete ALL files, but will not generate an error message if a delete fails (because a different slave just deleted it). These failed deletes will be counted and reported however in the ‘Miscellaneous statistics’, under the FILE_DELETE_SHARED
counter. The FILE_DELETES
counter however can contain a count higher than the amount of files that existed. I have seen situations where multiple slaves were able to delete the same file at the same time without the OS passing any errors to Java.
If you're sure you will want to delete an existing file structure each time you run, you can of course also code startcmd="rm -rf /file/anchor"
which will do the delete for you. Be careful though; Vdbench only deletes its own files, while rm -rf /root
deletes anything it finds.
在Vdbench运行多个从机和可选的多个主机时,主从节点之间关于文件状态的通信变得困难。所有从机相互通信以了解它们对文件的操作会带来巨大开销。您不希望一个从机删除另一个从机当前正在读取或写入的文件。因此,Vdbench不允许在从机和主机之间共享FSDs。
然而,当您开始处理巨大的文件系统时,情况就有所不同了。例如,您填满了500TB的磁盘文件,然后决定要与一个或多个远程主机共享这些数据。重新创建整个文件结构就变得太耗时了。该怎么办呢?
通过指定shared=yes
,Vdbench允许您共享文件系统定义(FSD)。它通过允许每个从机仅使用FSD文件结构中定义的每第n
个文件来实现。这里的n
等于从机的数量。
这意味着不同的主机不会干扰彼此的操作,但有一个例外情况:当您指定format=yes
时,Vdbench首先会删除已存在的文件结构。由于这可能是旧的文件结构,Vdbench无法将文件删除操作分散到各个从机中,以便每个从机删除其第n
个文件。因此,每个从机都会尝试删除所有文件,但如果删除失败(因为另一个从机刚刚删除了它),则不会生成错误消息。但这些失败的删除操作会计数并在“其他统计信息”下的FILE_DELETE_SHARED
计数器中进行报告。然而,FILE_DELETES
计数器中可能包含的计数可能超过实际存在的文件数量。我曾经遇到过多个从机能够同时删除同一个文件而不会导致操作系统向Java传递任何错误的情况。
如果您确定每次运行都要删除现有文件结构,当然也可以编写startcmd="rm -rf /file/anchor"
来进行删除操作。但请注意,Vdbench只会删除其自己的文件,而rm -rf /root
会删除它找到的任何文件。
This parameter specifies how many directories to create in each new directory. See above for an example.
这个参数指定在每个新目录中创建多少个子目录。
This parameter specifies how many levels of directories to create under the anchor. See above for an example.
这个参数指定在锚点下创建多少级目录。
This parameter specifies how many files to create in the lowest level of directories. See above for an example. Note that you need at least one file per fwd=xxx,threads=
parameter specified. If there are not enough files, a thread may try to find an available file up to 10,000 times before it gives up.
Vdbench for each directory or file keeps track of a lot of information. That requires about 100 bytes per directory/file in the java heap. I suggest that you plan for about 200 bytes worth of java heap size though, this to accommodate very fast selection and switching of different files, and then avoid possible Garbage Collection issues. See 'GcTracker' information in thism document.
该参数指定在最深层目录中创建多少个文件。请参见上文的示例。请注意,每个fwd=xxx,threads=
参数至少需要一个文件匹配。如果文件不足,线程可能会尝试最多10,000次查找可用文件,然后放弃。
Vdbench会跟踪每个目录或文件的大量信息,这需要大约每个目录/文件100字节的Java堆空间。我建议您计划使用大约200字节的Java堆空间,以便快速选择和切换不同的文件,并避免可能的垃圾回收问题。请参阅本文档中有关“GcTracker”信息的部分。
This parameter specifies the size of the files. Either specify a single file size, or a set of pairs, where the first number in a pair represents file size, and the second number represents the percentage of the files that must be of this size. E.g. sizes=(32k,50,64k,50)
When you specify sizes=(nnn,0)
, Vdbench will create files with an average size of nnn
bytes. There are some rules though related to the file size that is ultimately used:
-
If size > 10m, size will be a multiple of 1m
-
If size > 1m, size will be multiple of 100k
-
If size > 100k, size will be multiple of 10k
-
If size < 100k, size will be multiple of 1k.
这个参数指定文件的大小。您可以指定单个文件大小,也可以指定一组大小和对应的百分比。例如,sizes=(32k,50,64k,50)
表示50%的文件大小为32k,另外50%的文件大小为64k。
当您指定sizes=(nnn,0)
时,Vdbench将创建平均大小为nnn
字节的文件。但是,最终使用的文件大小会根据以下规则确定:
- >10m,则大小将是1m的倍数。
- >1m,则大小将是100k的倍数。
- >100k,则大小将是10k的倍数。
- <100k,则大小将是1k的倍数。
Use this parameter to pass extra flags to open request (See: man open(2)) Flags currently supported: O_DSYNC, O_RSYNC, O_SYNC
.
See also ‘openflags=’: Control over open() system call.
使用此参数向打开请求传递额外的标志(参见:man open(2))。目前支持的标志有:O_DSYNC, O_RSYNC, O_SYNC
。
另请参阅'openflags=':对open()系统调用的控制。
This parameter stops the creation of new files after the requested total amount of file space is reached. Be aware that the depth=
, width=
, files=
and sizes=
parameter values must be large enough to accommodate this request. See also the RD fortotal=
parameter.
Example: totalsize=100g
See also 'format=limited'
该参数在达到请求的总文件空间后停止创建新文件。请注意,depth=
、width=
、files=
和sizes=
参数的值必须足够大以满足此请求。另请参阅RD的fortotal=
参数。
例如:totalsize=100g
。还请参阅'format=limited'。
While the depth, width, files and sizes parameters define the maximum possible file structure, totalsize=
if used specifies the amount of files and file space to create.
workingsetsize=
creates a subset of the file structure of those files that will be used during this run. If for instance you have 200g worth of files, and 32g of file system cache, you can specify wss=32g
to make sure that after a warmup period, all your file space fits in file system cache. Can also be used with forworkingsetsize
or forwss
.
尽管深度、宽度、文件数和大小参数定义了最大可能的文件结构,但如果使用了 totalsize=
参数,则指定要创建的文件数量和文件空间。
workingsetsize=
创建文件结构的子集,即在此运行期间将要使用的文件。例如,如果您有200GB的文件,并且有32GB的文件系统缓存,您可以指定 wss=32g
来确保在热身期之后,所有文件空间都适合文件系统缓存。也可与 forworkingsetsize
或 forwss
一起使用。