Skip to content

Commit

Permalink
Boost subshell performance by avoiding expensive pwd related syscalls (
Browse files Browse the repository at this point in the history
…ksh93#805)

In ksh93v- a sh.pwdfd variable was introduced, ostensibly for
implementing 'cd -f', 'pwd -f' and '${.sh.pwdfd}' as new features.
I might backport these in the future, but those aren't the focus of
this commit. Rather, what caught my interest was the effect of this
change on subshell performance (results from shbench[*] with 20
iterations, compiled with CCFLAGS='-O2 -fPIC'):

-------------------------------------------------------------
name           /tmp/ksh93u+2012-08-01  /tmp/ksh93u+m-dev
-------------------------------------------------------------
fibonacci.ksh  0.230 [0.227-0.235]     0.216 [0.21f1-0.222]
parens.ksh     0.208 [0.204-0.212]     0.173 [0.170-0.182]
-------------------------------------------------------------
name           /tmp/ksh93v-2012-08-24  /tmp/ksh93v-2014-12-24
-------------------------------------------------------------
fibonacci.ksh  0.269 [0.266-0.273]     0.247 [0.242-0.251]
parens.ksh     0.147 [0.145-0.149]     0.125 [0.121-0.127]
-------------------------------------------------------------
name           /tmp/ksh93u+redhat
-------------------------------------------------------------
fibonacci.ksh  0.264 [0.260-0.269]
parens.ksh     0.140 [0.137-0.144]
-------------------------------------------------------------

As can be seen above, 93u+ and 93u+m perform worse in the parens
benchmark than the ksh93v- release which implements the change, as
well as a copy of ksh93u+ compiled with Red Hat's patches for cd
(fdstatus, rmdirfix, cdfix and cdfix2 applied in that order).
However, 93u+ and 93u+m perform better in the fibonacci benchmark.

93v- is faster in the parens benchmark because it obtains the file
descriptor for the current working directory via openat(2) in
sh_diropenat, then caches it in sh.pwdfd. This fd is then reused in
sh_subshell(), which allows 93v- to avoid taxing open(2) and
fcntl(2) syscalls.

However, the 93v- implementation has a flaw. Unlike in 93u+, it
always duplicates sh.pwdfd with fcntl, even when it's a superfluous
operation. This is the cause of the performance regression in the
fibonacci benchmark.

To increase ksh93u+m's performance beyond both 93u+ and 93v-, I
have backported 93v-'s code, but with a crucial change:
- sh_subshell() no longer duplicates sh.pwdfd with fcntl during
  virtual subshell scope initialization. Rather, a sh_pwdupdate
  function is used in tandem with sh_diropenat in b_cd() to do all
  of this when the directory is being changed. This fixes the
  regression in the fibonacci benchmark. sh_pwdupdate will be
  called whenever sh.pwdfd changes to prevent file descriptor leaks
  while also avoiding erroneous close operations on the parent
  shell's file descriptor for the PWD.

Other changes applied include:
- The removal of support for systems which lack fchdir. ksh93v- did
  this twelve years ago in the oldest archived alpha release, which
  I think is a more than long enough deprecation window.
- If sh.pwdfd is -1 (i.e., the shell was initialized in a
  nonexistent directory), don't fork preemptively. It's more
  efficient to let cd check sp->pwdfd via a short function before
  attempting to change the directory. This allows subshells run in
  nonexistent directories to avoid forking, provided cd is never
  invoked.
- Removed <=1995 hackery from nv_open and env_init. During SH_INIT
  we should simply always invoke path_pwd(); giving $PWD NV_TAGGED
  is not just unnecessary, but possibly harmful when one attempts
  to use 'typeset -t' with the PWD variable.
- With the removal of that code, sh.pwd is guaranteed to get set
  during shell initialization (if it wasn't already). As such, the
  code no longer needs to cope with a NULL pwd and many such checks
  have been removed.
- sh.pwd no longer has a pointless and misleading const modifier.
  This allows for the elimination of many casts involving sh.pwd.
- sh.pwdfd is reset every time the pwd is (re)initialized in
  path_pwd().
- libast/features/fcntl.c: Account for systems which don't
  implement O_SEARCH and/or O_DIRECTORY. (I avoided backporting the
  ksh93v- feature test code because it's ginormous and filled with
  far too much compatibility hackery. If anyone wants to backport
  the mess that are the libast syscall intercepts, be my guest. In
  any case ksh doesn't need any of that to function correctly.)
- Goat sacrifices are hereby banned from 93u+m (for the curious,
  please see att#567).

With these changes ksh93u+m now handily beats the previous ksh
releases in the fibonacci and parens benchmarks (ksh2020 included
for comparison):

-------------------------------------------------------------
name           /tmp/ksh93u+2012-08-01  /tmp/ksh93v-2014-12-24  
-------------------------------------------------------------
fibonacci.ksh  0.230 [0.224-0.236]     0.247 [0.244-0.252]
parens.ksh     0.207 [0.203-0.213]     0.124 [0.121-0.127]
-------------------------------------------------------------
name           /tmp/ksh2020            /tmp/ksh93u+m-dev
-------------------------------------------------------------
fibonacci.ksh  0.369 [0.364-0.375]     0.217 [0.213-0.222]
parens.ksh     0.165 [0.162-0.169]     0.173 [0.171-0.176]
-------------------------------------------------------------
name           /tmp/ksh93u+m-patch
-------------------------------------------------------------
fibonacci.ksh  0.202 [0.198-0.206]
parens.ksh     0.082 [0.080-0.085]
-------------------------------------------------------------

[*] shbench: http://fossil.0branch.com/csb
  • Loading branch information
JohnoKing authored Jan 3, 2025
1 parent 9b0b260 commit 3d7b7e1
Show file tree
Hide file tree
Showing 10 changed files with 131 additions and 108 deletions.
71 changes: 56 additions & 15 deletions src/cmd/ksh93/bltins/cd_pwd.c
Original file line number Diff line number Diff line change
Expand Up @@ -46,14 +46,32 @@ static void rehash(Namval_t *np,void *data)
nv_rehash(np,data);
}

/*
* Obtain a file handle to the directory "path" relative to directory "dir"
*/
int sh_diropenat(int dir, const char *path)
{
int fd,shfd;
if((fd = openat(dir, path, O_DIRECTORY|O_NONBLOCK|O_cloexec)) < 0)
#if O_SEARCH
if(errno != EACCES || (fd = openat(dir, path, O_SEARCH|O_DIRECTORY|O_NONBLOCK|O_cloexec)) < 0)
#endif
return fd;
/* Move fd to a number > 10 and register the fd number with the shell */
shfd = sh_fcntl(fd, F_dupfd_cloexec, 10);
close(fd);
return shfd;
}

int b_cd(int argc, char *argv[],Shbltin_t *context)
{
char *dir;
Pathcomp_t *cdpath = 0;
const char *dp;
int saverrno=0;
int rval,pflag=0,eflag=0,ret=1;
char *oldpwd;
int newdirfd;
char *oldpwd, *cp;
Namval_t *opwdnod, *pwdnod;
NOT_USED(context);
while((rval = optget(argv,sh_optcd))) switch(rval)
Expand Down Expand Up @@ -118,13 +136,8 @@ int b_cd(int argc, char *argv[],Shbltin_t *context)
* If sh_subshell() in subshell.c cannot use fchdir(2) to restore the PWD using a saved file descriptor,
* we must fork any virtual subshell now to avoid the possibility of ending up in the wrong PWD on exit.
*/
if(sh.subshell && !sh.subshare)
{
#if _lib_fchdir
if(!test_inode(sh.pwd,e_dot))
#endif
sh_subfork();
}
if(sh.subshell && !sh.subshare && (!sh_validate_subpwdfd() || !test_inode(sh.pwd,e_dot)))
sh_subfork();
/*
* Do $CDPATH processing, except if the path is absolute or the first component is '.' or '..'
*/
Expand Down Expand Up @@ -181,21 +194,49 @@ int b_cd(int argc, char *argv[],Shbltin_t *context)
}
if(!pflag)
{
char *cp;
stkseek(sh.stk,PATH_MAX+PATH_OFFSET);
if(*(cp=stkptr(sh.stk,PATH_OFFSET))=='/')
if(!pathcanon(cp,PATH_DOTDOT))
continue;
}
if((rval=chdir(path_relative(stkptr(sh.stk,PATH_OFFSET)))) >= 0)
goto success;
if(errno!=ENOENT && saverrno==0)
cp = path_relative(stkptr(sh.stk,PATH_OFFSET));
rval = newdirfd = sh_diropenat((sh.pwdfd>0)?sh.pwdfd:AT_FDCWD,cp);
if(newdirfd>0)
{
/* chdir for directories on HSM/tapeworms may take minutes */
if((rval=fchdir(newdirfd)) >= 0)
{
sh_pwdupdate(newdirfd);
goto success;
}
sh_close(newdirfd);
}
#if !O_SEARCH
else if((rval=chdir(cp)) >= 0)
sh_pwdupdate(sh_diropenat(AT_FDCWD,cp));
#endif
if(saverrno==0)
saverrno=errno;
}
while(cdpath);
if(rval<0 && *dir=='/' && *(path_relative(stkptr(sh.stk,PATH_OFFSET)))!='/')
rval = chdir(dir);
/* use absolute chdir() if relative chdir() fails */
{
rval = newdirfd = sh_diropenat((sh.pwdfd>0)?sh.pwdfd:AT_FDCWD,dir);
if(newdirfd>0)
{
/* chdir for directories on HSM/tapeworms may take minutes */
if((rval=fchdir(newdirfd)) >= 0)
{
sh_pwdupdate(newdirfd);
goto success;
}
sh_close(newdirfd);
}
#if !O_SEARCH
else if((rval=chdir(dir)) >= 0)
sh_pwdupdate(sh_diropenat(AT_FDCWD,dir));
#endif
}
if(rval<0)
{
if(saverrno)
Expand All @@ -221,7 +262,7 @@ int b_cd(int argc, char *argv[],Shbltin_t *context)
if(*dp && (*dp!='.'||dp[1]) && strchr(dir,'/'))
sfputr(sfstdout,dir,'\n');
nv_putval(opwdnod,oldpwd,NV_RDONLY);
free((void*)sh.pwd);
free(sh.pwd);
if(*dir == '/')
{
size_t len = strlen(dir);
Expand Down
2 changes: 1 addition & 1 deletion src/cmd/ksh93/features/externs
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
hdr nc
mem exception.name,_exception.name math.h
lib setreuid,setregid,nice,fork,fchdir
lib setreuid,setregid,nice,fork
lib pathnative,pathposix
lib memcntl sys/mman.h

Expand Down
3 changes: 3 additions & 0 deletions src/cmd/ksh93/include/defs.h
Original file line number Diff line number Diff line change
Expand Up @@ -114,6 +114,7 @@ extern char *sh_checkid(char*,char*);
extern void sh_chktrap(void);
extern void sh_deparse(Sfio_t*,const Shnode_t*,int,int);
extern int sh_debug(const char*,const char*,const char*,char *const[],int);
extern int sh_diropenat(int,const char *);
extern char **sh_envgen(void);
extern Sfdouble_t sh_arith(const char*);
extern void *sh_arithcomp(char*);
Expand Down Expand Up @@ -141,6 +142,8 @@ extern Dt_t *sh_subfuntree(int);
extern void sh_subjobcheck(pid_t);
extern int sh_subsavefd(int);
extern void sh_subtmpfile(void);
extern void sh_pwdupdate(int);
extern int sh_validate_subpwdfd(void);
extern char *sh_substitute(const char*,const char*,char*);
extern void sh_timetraps(void);
extern const char *_sh_translate(const char*);
Expand Down
3 changes: 2 additions & 1 deletion src/cmd/ksh93/include/shell.h
Original file line number Diff line number Diff line change
Expand Up @@ -371,7 +371,8 @@ struct Shell_s
int offsets[10];
Sfio_t **sftable;
unsigned char *fdstatus;
const char *pwd;
char *pwd;
int pwdfd; /* file descriptor for pwd */
void *jmpbuffer;
void *mktype;
Sfio_t *strbuf;
Expand Down
2 changes: 2 additions & 0 deletions src/cmd/ksh93/sh/fault.c
Original file line number Diff line number Diff line change
Expand Up @@ -687,6 +687,8 @@ noreturn void sh_done(int sig)
if(sh_isoption(SH_NOEXEC))
kiaclose((Lex_t*)sh.lex_context);
#endif /* SHOPT_KIA */
if(sh.pwdfd > 0)
close(sh.pwdfd);
/* Exit with portable 8-bit status (128 + signum) if last child process exits due to signal */
if(sh.chldexitsig)
savxit = savxit & ~SH_EXITSIG | 0200;
Expand Down
6 changes: 1 addition & 5 deletions src/cmd/ksh93/sh/init.c
Original file line number Diff line number Diff line change
Expand Up @@ -1919,11 +1919,7 @@ static void env_init(void)
}
if(save_env_n)
sh.save_env_n = save_env_n;
if(nv_isnull(PWDNOD) || nv_isattr(PWDNOD,NV_TAGGED))
{
nv_offattr(PWDNOD,NV_TAGGED);
path_pwd();
}
path_pwd();
if((cp = nv_getval(SHELLNOD)) && (sh_type(cp)&SH_TYPE_RESTRICTED))
sh_onoption(SH_RESTRICTED); /* restricted shell */
/*
Expand Down
4 changes: 0 additions & 4 deletions src/cmd/ksh93/sh/name.c
Original file line number Diff line number Diff line change
Expand Up @@ -1507,11 +1507,7 @@ Namval_t *nv_open(const char *name, Dt_t *root, int flags)
{
cp++;
if(sh_isstate(SH_INIT))
{
nv_putval(np, cp, NV_RDONLY);
if(np==PWDNOD)
nv_onattr(np,NV_TAGGED);
}
else
{
char *sub=0, *prefix= sh.prefix;
Expand Down
8 changes: 5 additions & 3 deletions src/cmd/ksh93/sh/path.c
Original file line number Diff line number Diff line change
Expand Up @@ -213,8 +213,8 @@ char *path_pwd(void)
if(sh.pwd)
{
if(*sh.pwd=='/')
return (char*)sh.pwd;
free((void*)sh.pwd);
return sh.pwd;
free(sh.pwd);
}
/* First see if PWD variable is correct */
pwdnod = sh_scoped(PWDNOD);
Expand Down Expand Up @@ -249,7 +249,9 @@ char *path_pwd(void)
if(!tofree)
cp = sh_strdup(cp);
sh.pwd = cp;
return (char*)sh.pwd;
/* Set sh.pwdfd */
sh_pwdupdate(sh_diropenat(AT_FDCWD,cp));
return sh.pwd;
}

/*
Expand Down
126 changes: 47 additions & 79 deletions src/cmd/ksh93/sh/subshell.c
Original file line number Diff line number Diff line change
Expand Up @@ -39,14 +39,6 @@
# define PIPE_BUF 512
#endif

#ifndef O_SEARCH
# ifdef O_PATH
# define O_SEARCH O_PATH
# else
# define O_SEARCH O_RDONLY
# endif
#endif

/*
* Note that the following structure must be the same
* size as the Dtlink_t structure
Expand Down Expand Up @@ -90,12 +82,9 @@ static struct subshell
char comsub;
unsigned int rand_seed; /* parent shell $RANDOM seed */
int rand_last; /* last random number from $RANDOM in parent shell */
int rand_state; /* 0 means sp->rand_seed hasn't been set, 1 is the opposite */
char rand_state; /* 0 means sp->rand_seed hasn't been set, 1 is the opposite */
uint32_t srand_upper_bound; /* parent shell's upper bound for $SRANDOM */
#if _lib_fchdir
int pwdfd; /* file descriptor for PWD */
char pwdclose;
#endif /* _lib_fchdir */
int pwdfd; /* parent shell's file descriptor for PWD */
} *subshell_data;

static unsigned int subenv;
Expand Down Expand Up @@ -457,6 +446,28 @@ void sh_subjobcheck(pid_t pid)
}
}

/*
* Set the file descriptor for the current shell's PWD without wiping
* out the stored file descriptor for the parent shell's PWD.
*/
void sh_pwdupdate(int fd)
{
struct subshell *sp = subshell_data;
if(!(sh.subshell && !sh.subshare && sh.pwdfd == sp->pwdfd) && sh.pwdfd > 0)
sh_close(sh.pwdfd);
sh.pwdfd = fd;
}

/*
* Check if the parent shell has a valid PWD fd to return to.
* Only for use by cd inside of virtual subshells.
*/
int sh_validate_subpwdfd(void)
{
struct subshell *sp = subshell_data;
return sp->pwdfd > 0;
}

/*
* Run command tree <t> in a virtual subshell
* If comsub is not null, then output will be placed in temp file (or buffer)
Expand Down Expand Up @@ -505,8 +516,6 @@ Sfio_t *sh_subshell(Shnode_t *t, volatile int flags, int comsub)
path_get(e_dot);
sh.pathinit = 0;
}
if(!sh.pwd)
path_pwd();
sp->bckpid = sh.bckpid;
if(comsub)
sh_stats(STAT_COMSUB);
Expand All @@ -521,38 +530,9 @@ Sfio_t *sh_subshell(Shnode_t *t, volatile int flags, int comsub)
sh.comsub = comsub;
if(!sh.subshare)
{
struct subshell *xp;
char *save_debugtrap = 0;
#if _lib_fchdir
sp->pwdfd = -1;
for(xp=sp->prev; xp; xp=xp->prev)
{
if(xp->pwdfd>0 && xp->pwd && strcmp(xp->pwd,sh.pwd)==0)
{
sp->pwdfd = xp->pwdfd;
break;
}
}
if(sp->pwdfd<0)
{
int n = open(e_dot,O_SEARCH);
if(n>=0)
{
sp->pwdfd = n;
if(n<10)
{
sp->pwdfd = sh_fcntl(n,F_DUPFD,10);
close(n);
}
if(sp->pwdfd>0)
{
fcntl(sp->pwdfd,F_SETFD,FD_CLOEXEC);
sp->pwdclose = 1;
}
}
}
#endif /* _lib_fchdir */
sp->pwd = (sh.pwd?sh_strdup(sh.pwd):0);
sp->pwd = sh_strdup(sh.pwd);
sp->pwdfd = sh.pwdfd;
sp->mask = sh.mask;
sh_stats(STAT_SUBSHELL);
/* save trap table */
Expand Down Expand Up @@ -626,21 +606,6 @@ Sfio_t *sh_subshell(Shnode_t *t, volatile int flags, int comsub)
if(sh.savesig < 0)
{
sh.savesig = 0;
#if _lib_fchdir
if(sp->pwdfd < 0 && !sh.subshare) /* if we couldn't get a file descriptor to our PWD ... */
sh_subfork(); /* ...we have to fork, as we cannot fchdir back to it. */
#else
if(!sh.subshare)
{
if(sp->pwd && access(sp->pwd,X_OK)<0)
{
free(sp->pwd);
sp->pwd = NULL;
}
if(!sp->pwd)
sh_subfork();
}
#endif /* _lib_fchdir */
/* Virtual subshells are not safe to suspend (^Z, SIGTSTP) in the interactive main shell. */
if(sh_isstate(SH_INTERACTIVE))
{
Expand Down Expand Up @@ -824,25 +789,29 @@ Sfio_t *sh_subshell(Shnode_t *t, volatile int flags, int comsub)
free(savsig);
}
sh.options = sp->options;
/* restore the present working directory */
#if _lib_fchdir
if(sp->pwdfd > 0 && fchdir(sp->pwdfd) < 0)
#else
if(sp->pwd && strcmp(sp->pwd,sh.pwd) && chdir(sp->pwd) < 0)
#endif /* _lib_fchdir */
if(sh.pwdfd != sp->pwdfd)
{
saveerrno = errno;
fatalerror = 2;
/*
* Restore the parent shell's present working directory.
* Note: cd will always fork if sp->pwdfd is -1 (after calling sh_validate_subpwdfd()),
* which only occurs when a subshell is started with sh.pwdfd == -1. As such, in this
* if block sp->pwdfd is always > 0 (whilst sh.pwdfd is guaranteed to differ, and
* might not be valid).
*/
if(fchdir(sp->pwdfd) < 0)
{
/* Couldn't fchdir back; close the fd and cope with the error */
sh_close(sp->pwdfd);
saveerrno = errno;
fatalerror = 2;
}
else if(sp->pwd && strcmp(sp->pwd,sh.pwd))
path_newdir(sh.pathlist);
if(fatalerror != 2)
sh_pwdupdate(sp->pwdfd);
}
else if(sp->pwd && strcmp(sp->pwd,sh.pwd))
path_newdir(sh.pathlist);
if(sh.pwd)
free((void*)sh.pwd);
free(sh.pwd);
sh.pwd = sp->pwd;
#if _lib_fchdir
if(sp->pwdclose)
close(sp->pwdfd);
#endif /* _lib_fchdir */
if(sp->mask!=sh.mask)
umask(sh.mask=sp->mask);
if(sh.coutpipe!=sp->coutpipe)
Expand Down Expand Up @@ -918,8 +887,7 @@ Sfio_t *sh_subshell(Shnode_t *t, volatile int flags, int comsub)
UNREACHABLE();
case 2:
/* reinit PWD as it will be wrong */
if(sh.pwd)
free((void*)sh.pwd);
free(sh.pwd);
sh.pwd = NULL;
path_pwd();
errno = saveerrno;
Expand Down
Loading

0 comments on commit 3d7b7e1

Please sign in to comment.