siebenmann/cks-dtrace
== Our DTrace scripts

These are DTrace scripts that we've found useful for troubleshooting things in our Solaris + ZFS + iSCSI fileserver environment; you can find some details of that environment at
http://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSFileserverSetup

These scripts are for OmniOS r151008j or so (as of 2014-05-15) and thus more or less current Illumos, and may not work on any later or earlier version. They work for us but as usual there are no warranties. I make no claims that they are general about what they report; we wrote them to find out the information that we needed at the time.

Please note that some of these scripts contain local assumptions and things that are very specific to our environment here. You definitely want to read the start of the scripts before using them blindly.

=== Using the scripts

There are four different things you can use these scripts for, depending on what you need.

nfs3-mon.d and zfs-mon.d provide top-like periodic reports on NFS v3 server activity and ZFS activity, suitable for running either when there's a load problem or when you just want to know generally what sort of things are going on. nfs3-mon.d is focused on the sources of client activity so you can zero in on where load is coming from. zfs-mon.d focuses on what pools are active and how.

nfs3-stats.d and iscsi-stats.d gather aggregate timing and activity information, over however long you run them, for NFS v3 server and iSCSI initiator activity. Their goal is to characterize what your IO looks like and how things seem to be performing: what activities are fast or slow, and how things break down. These two scripts were developed first and are the most primitive and peculiar.

nfs3-long.d and iscsi-long.d provide immediate reports of long operations at the NFS v3 server or iSCSI initiator level. You would run these if you think (or know) that you have unusually slow operations and want to know what they look like and how often they happen.
Right now the NFS stuff is focused on reads, writes, and commits (although it reports on most ops); the iSCSI initiator stuff looks at all iSCSI SCSI operations.

zfstrace.d reports every ZFS, ZIO, and iSCSI initiator operation, with duration and other information. This isn't the kind of thing you read directly, live; instead it's the kind of thing you feed to detailed analysis that needs full information about each event, for example working out your 99% performance point or graphing the distribution of IO times.

In addition, fdrwmon.d will report per-FD read/write IO activity for a particular process, with a bunch of limitations based on what system calls it currently monitors.

=== Potential errors when using the scripts

[NOTE: this information is probably obsolete for OmniOS. It's retained for historical notes and because I don't have the time to thoroughly revise this README right now.]

Suppose that you get an error like:

    nfs3-mon.d: line 63: probe description fbt::rfs3_read:entry does not match any probes

If you're on a Solaris 10 machine, what's probably going on is that you haven't exported any filesystems via NFS (or perhaps haven't set up any iSCSI stuff). Solaris does many things through loadable kernel modules, including both NFS serving and iSCSI. These modules are loaded on demand, when needed by things you're doing, and if the modules aren't loaded then any trace points they define aren't available (both fbt function tracepoints and any general providers).

You can see a list of what kernel modules are loaded with modinfo. In Solaris 10 update 8, the NFS server module is 'nfssrv (NFS server module)' and the iSCSI initiator is 'iscsi (Sun iSCSI Initiator v20100524-0)'.
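
The module check described above can be scripted. What follows is only a rough sketch in Python, not part of the scripts here: it assumes that each row of modinfo output ends with "name (description)", as in the two examples quoted above, and the function names (loaded_modules, missing_modules) are made up for illustration. The module names are the Solaris 10 update 8 ones and may well differ on other releases.

```python
import re

# Module names as quoted in this README for Solaris 10 update 8.
# Assumption: other releases may use different names.
REQUIRED_MODULES = ("nfssrv", "iscsi")


def loaded_modules(modinfo_text):
    """Return the set of module names found in modinfo output.

    Assumes each module row ends with "<name> (<description>)";
    rows that do not look like that (eg the header) are skipped.
    """
    names = set()
    for line in modinfo_text.splitlines():
        match = re.search(r"(\S+) \(.*\)\s*$", line)
        if match:
            names.add(match.group(1))
    return names


def missing_modules(modinfo_text, required=REQUIRED_MODULES):
    """List the required modules that do not appear in the output."""
    present = loaded_modules(modinfo_text)
    return [name for name in required if name not in present]
```

You would feed it the captured output of modinfo (for instance via subprocess.run(["modinfo"], capture_output=True, text=True).stdout). An empty result suggests that both modules are loaded, so their fbt probes should be available.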
Next, suppose you get an error like:

    dtrace: failed to compile script nfs3-mon.d: line 118: os is not a member of struct objset

What's happening here is that the contents of 'struct objset' changed between Solaris 10 update 8 (what we're using and what these scripts are written for) and later versions of Solaris. Look in the scripts for comments about the ZNODE_TO_POOLNAME() macro and how to change it.

=== Script details

The current DTrace scripts here:

- fdrwmon.d: reports on read/write IO per-second bandwidth for a particular process on a per-file-descriptor basis. See the comments at the start of the script for how to run it and what information it prints. Currently only read(), write(), readv(), and writev() activity is tracked. (A thorough script would probably also look at, at least, the recv*() and send*() families of socket calls. We haven't needed this ourselves so far.)

- nfs3-stats.d: gathers timing information on NFS v3 server performance and ZFS performance (both high level and low level). This is mostly presented as distributions.
  - rfs3_* operations are NFS v3 server operations.
  - zfs_read and zfs_write are the high-level interface into ZFS, used by eg the NFS v3 server. They can hit the ARC instead of doing real IO.
  - zio_* operations are low level ZFS ZIO IO operations; as far as I know they come after the ARC. I don't really understand what zio_free does. In our environment, zio_ioctl operations are only used for sending disk flushes to the disks and this script ignores them for reasons beyond the scope of this README.

- iscsi-stats.d: gathers relatively low-level timing information on iSCSI initiator performance. This accumulates not just averages (which iostat will sort of tell you) but also distributions, and breaks down the numbers in various ways.

- nfs3-long.d: reports on long NFS v3 operations, primarily on read, write, and commit operations.
  Other operations have fewer details reported about them, and the nominal filename and inode number associated with the operation need operation-specific knowledge to interpret (eg, sometimes it is the directory the operation was done in, sometimes it is the thing the operation was working on, and so on).

- iscsi-long.d: reports on long iSCSI SCSI operations.

- nfs3-mon.d: provides an every-10-seconds report of NFS v3 activity, primarily of read and write activity. The activity is broken down in various ways so that we/you can see who is causing it. See the comments at the start of the script for some information on what it reports.

- zfs-mon.d: provides an every-10-seconds report of ZFS activity, again primarily of read and write activity. See the comments at the start of the script for information on what the reports mean.

- zfstrace.d: provides timing traces of zfs_read()/zfs_write(), ZIO operations, and iSCSI operations; every operation is reported with the time it took and various other details. The output is cryptic and compact due to volume and speed concerns.

Understanding some of the details of this information requires some knowledge of (Open)Solaris kernel internals, but I think that much of it will be relatively obvious.

Right now we ignore the IO sizes and make no attempt to, eg, normalize the operation durations by the amount of data they were working on. In our environment this works out fine, but it may not in yours.

Note that one zfs_read() or zfs_write() will generally result in multiple zio_* operations. Some of these operations are synchronous and some are not (eg, readahead or writeback). nfs3-dtrace.d makes some attempt to only really track synchronous operations that delay the completion of the higher-level IO but is probably not perfect about it.

ZIO operations seem to bubble down (and up) through the layers of vdevs involved in any particular ZFS pool.
We attempt to only follow ZIO operations directed against physical disks instead of the intermediate vdevs (because this is really the useful level to report ZIOs) but our handling of this may not be complete. Also, this has only been tested in a setup with mirror vdevs and there may be things I don't know about ZIO against raidz vdevs.

(Chris Siebenmann, November 23 2012/May 15 2014 for OmniOS revisions)
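
As an illustration of the sort of offline analysis that per-operation traces such as zfstrace.d's are meant to feed (a 99% performance point, or the distribution of IO times), here is a small Python sketch. It is not part of the scripts. It assumes you have already extracted the operation durations into a list of numbers, since this README does not describe the trace output format, and it uses the nearest-rank definition of a percentile, which is only one of several common ones. The function names are made up for illustration.

```python
import math


def percentile(durations, pct):
    """Nearest-rank percentile: the smallest sample such that at
    least pct percent of all samples are at or below it."""
    if not durations:
        raise ValueError("need at least one duration")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(durations)
    # 1-based rank; multiply before dividing to limit rounding error.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def histogram(durations, bucket_width):
    """Count durations per fixed-width bucket, keyed by the lower
    bound of each bucket, as raw material for a distribution graph."""
    counts = {}
    for value in durations:
        lower = int(value // bucket_width) * bucket_width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))
```

For example, percentile(durations, 99) gives the 99% point in whatever time unit the durations are in, and histogram(durations, 10) groups them into buckets 10 units wide. Like the scripts themselves, this ignores IO sizes.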