on-disk, on-memory file system, mounting, process and file system, file system calls
x=open("/d1/d2/f1", .....); // find the inode of "/d1/d2/f1"
- read the super block and find the location of the group descriptor
- read the group descriptor and find the location of the inode table
- read the inode table, find inode 2, find the block locations of "/"
- read the blocks of "/" and find the inode number of "d1"
- find the inode of "/d1" and find the block locations of "/d1"
- read the blocks of "/d1" and find the inode number of "d2"
- find the inode of "/d1/d2" and find the block locations of "/d1/d2"
- read the blocks of "/d1/d2" and find the inode number of f1
- find the inode of "/d1/d2/f1"
- disk is slow => open, read, write take too much time
- we cache frequently-used data (superblock, inode, group descriptor,...) into memory
- when caching, some additional information is added
- each disk has its own file system, and we need to know which meta block came from which disk
- (1)
- on-disk : ext2_super_block{}
- on-mem: super_block{}
- (2) additional info in super_block{} (include/linux/fs.h)
- s_list : next superblock
- s_dev: device number. which disk this superblock came from?
- s_type: file system type?
- s_op : operations on superblock
- s_root : root directory of the file system of this superblock
- s_files : linked list of file{} belonging to this file system
- s_id : device name of this super block
- (3) all cached superblocks form a linked list pointed to by "super_blocks" (fs/super.c)
Individual inode is cached when accessed by the system.
- (1)
- on-disk : ext2_inode{}
- on-mem: inode{} (include/linux/fs.h)
- (2) additional info
- i_list : next inode
- i_dentry: corresponding dentry list for this inode
- i_ino : inode number
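- i_rdev: device number of the device this inode represents (meaningful for device special files; the disk holding the inode is found through i_sb->s_dev)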
- i_count: usage counter
- i_op: operations on this inode
- i_sb: pointer to super_block{} this inode belongs to
- i_pipe: used if a pipe
- (3) all cached inodes form a linked list pointed to by "inode_in_use" (fs/inode.c)
- (1) added info
- a buffer_head{} structure is attached to each cached block: (include/linux/buffer_head.h)
- b_blocknr : block number
- b_bdev : device this block belongs to
- b_size : block size
- b_data : original block
- (2) all cached blocks are attached to a hash table, "hash_table_array" (linux 2.4)
- (1) for each cached directory entry, dentry{} structure is defined
- For example, when reading "/aa/bb", three dentry objects are created: one for "/", another for "aa", and the last for "bb".
- (2) dentry{} (include/linux/dcache.h)
- d_inode: pointer to the corresponding inode
- d_op : operations on this dentry
- d_mounted: this dentry is a mounting point if d_mounted > 0
- d_name: corresponding file name (d_name.name is the actual file name)
All cached file systems are connected into one virtual file system through "mounting"
- other file systems are mounted on this root file system
- meaning: mount the file system in /dev/x on /y/z
- mounted file system: /dev/x
- mounting point: /y/z
- mounting process:
- cache the file system in /dev/x
- cache superblock of /dev/x : sb
- cache the root inode of /dev/x : rinode
- sb->s_root = rinode
- connect the new file system to the mounting point
- d_mounted of /y/z += 1
- allocate vfsmount{} and set
- mnt_mountpoint = /y/z
- mnt_root = rinode
- mnt_sb = sb
- insert this vfsmount{} into mount_hashtable
```c
struct vfsmount {                     // include/linux/mount.h. mounting info of this fs
    struct vfsmount *mnt_parent;      // parent vfsmount
    struct dentry *mnt_mountpoint;    // mounting point
    struct dentry *mnt_root;          // root of this file system
    struct super_block *mnt_sb;       // super block of this file system
    char *mnt_devname;                // dev name
    .......
};
```
Suppose we have two disks: dev0 and dev1. Suppose they have the file trees as below:
Assume dev0 is the root device (one which has the root file system).
- mount_root() caches the root file system:
- cache the superblock
- cache the root inode
After this, the system has:
- cache the file system in /dev/fd0
- cache the superblock of /dev/fd0
- cache the root inode of /dev/fd0
- cache the inode of /d1
- cache the block of "/"
- cache the inode of /d1
- connect the root inode of /dev/fd0 to /d1
After caching the file system of /dev/fd0:
After caching the block of "/":
After caching the inode of "/d1" and connecting the new file system with this:
After mounting, the final tree looks like:
The above tree will look as below to the user:
- each process has "root" and "pwd" to access the root of the file system and to access the current working directory, respectively.
- example
- p1's root is what p1 thinks as "root"
- p1's pwd is the current location of p1
- when p1 says "/aa/bb", the system starts at p1's root for the search
- when p1 says "aa/bb", the system starts at p1's pwd for the search
chroot() changes "root" to a "new root".
chdir() changes "pwd" to a "new pwd".
- example
- each process has "fd table" for file accessing
- the system has "file table" to control the file accessing by a process
- the on-mem file system is represented by inode_in_use, super_blocks, hash_table_array
- for each opened file, we have file{} structure (include/linux/fs.h)
- f_list: next file{}
- f_dentry: link to the inode (actually dentry{}) of this file
- f_op : operations on this file{} (open, read, write, ...)
- f_pos : file read/write pointer. shows how much has been read/written
- f_count: number of links to this file{}
- ..........
- super_block{}->s_files contains a linked list of file{} for each file system
- each process has (in task_struct) -- include/linux/sched.h
struct fs_struct *fs;
struct files_struct *files;
struct nsproxy *nsproxy; // namespace
struct nsproxy{ // include/linux/nsproxy.h
struct mnt_namespace *mnt_ns;
......
};
struct mnt_namespace{ // include/linux/mnt_namespace.h
struct vfsmount * root; // vfsmount of this process
.........
};
- fs contains root, pwd info
struct fs_struct{ // include/linux/fs_struct.h
struct path root, // the root inode of the file system
pwd; // the present working directory
.........
};
struct path { // include/linux/path.h
struct vfsmount *mnt;
struct dentry *dentry;
};
- files contains fd table
struct files_struct{ // include/linux/file.h
struct fdtable *fdt;
...........
};
struct fdtable{
struct file **fd; // fd table. file{} pointer array.
.......
};
- fork system call copies this fs, files structure, too, so the child inherits the root, pwd, and fd table of the parent.
x = open("/aa/bb", O_RDWR, 00777);
- meaning: find the inode of /aa/bb and open it
- algorithm:
  - find the inode of /aa/bb
  - cache it into memory
  - connect it to the file table
    - allocate file{} y and insert it into the sb->s_files linked list (sb is the superblock of the file system the file belongs to)
    - y->f_dentry = inode of /aa/bb
    - y->f_pos = 0
  - find an empty entry z in the fd table and link it to y
    - fd[z] = y
  - return z
- Example:
y = read(x, buf, 10)
- meaning: go to the file pointed to by fd[x] and read 10 bytes into buf with f_op->read()
- algorithm:
  - go to file{} pointed to by fd[x]
  - go to inode{} pointed to by file{}->f_dentry
  - find the block location we want
  - find the block in hash_table_array
  - if not there, cache the block first
  - read max 10 bytes starting from file{}->f_pos into buf
  - increase file{}->f_pos by the actual num of bytes read
  - return the actual num of bytes read
y = write(x, buf, 10)
- meaning: go to the file pointed to by fd[x], write max 10 bytes starting from the corresponding f_pos, increase f_pos by the actual num of bytes written, and return the actual num of bytes written.
close(x);
- meaning: close the file pointed to by fd[x]
- algorithm:
  - fd[x] = 0
  - file{}->f_count--, where file{} is the one pointed to by fd[x]
lseek(x, 20, 0)
- meaning: modify f_pos to 20, where f_pos is the file pointer of file x. The third argument 0 is SEEK_SET, so the offset is counted from the beginning of the file.
- example:
x = open("/aa/bb", .......); // open file /aa/bb
read(x, buf, 10); // read first 10 bytes into "buf"
lseek(x, 50, SEEK_SET); // move f_pos to offset 50
read(x, buf, 10); // read 10 bytes starting from offset 50
y = dup(x);
- meaning: copy fd[x] into fd[y]
- example:
x = open("/aa/bb", ........); // fd[x] points to /aa/bb
y = dup(x); // fd[y] also points to /aa/bb
read(x, buf, 10); // read first 10 bytes
read(y, buf, 10); // read next 10 bytes
y = link("/aa/bb", "/aa/newbb");
- meaning: /aa/newbb is now pointing to the same file as /aa/bb
- algorithm:
  - make file newbb in /aa directory
  - give it the same inode as /aa/bb
1) Your Gentoo Linux has two disks: /dev/sda3 and /dev/sda1. Which one is the root file system? Where is the mounting point for the other one? Use the mount command to answer this.
$ mount
/dev/sda3 is mounted on / and /dev/sda1 is mounted on /boot. Therefore /dev/sda3 is the root file partition and /dev/sda1 is the boot partition.
$ mkdir temp
$ mount -o loop myfd temp # connect myfd to temp directory, which is called mounting
$ mount
We can confirm that /root/linux-2.6.25.10/myfd is now additionally mounted on /root/linux-2.6.25.10/temp.
2) Add another entry in /boot/grub/grub.conf as below. This boot selection does not use the initrd directive, which prevents initramfs loading (initramfs is a temporary in-RAM file system used for performance improvement).
The entry below was added to /boot/grub/grub.conf.
$ vi /boot/grub/grub.conf
title=MyLinux3
root (hd0,0)
kernel /boot/bzImage root=/dev/sda3
After that, the changes were compiled and the machine was rebooted.
$ cd linux-2.6.25.10
$ make bzImage
$ cp arch/x86/boot/bzImage /boot/bzImage
$ reboot
After rebooting, My Linux3 was used.
3) The kernel calls mount_root to cache the root file system. Starting from start_kernel, find out the chain of intermediate functions that eventually calls mount_root. Confirm your prediction by printing out a message at each intermediate function of this chain until you reach mount_root().
init/main.c - start_kernel:
start_kernel calls rest_init.
init/main.c - rest_init:
rest_init calls kernel_init.
init/main.c - kernel_init:
kernel_init calls prepare_namespace, which is in init/do_mounts.c.
init/do_mounts.c - prepare_namespace:
prepare_namespace calls mount_root.
init/do_mounts.c - mount_root:
mount_root caches the root file system.
include/linux/fs.h:
struct super_block {
struct list_head s_list; /* Keep this first */
dev_t s_dev; /* search index; _not_ kdev_t */
unsigned long s_blocksize;
unsigned char s_blocksize_bits;
unsigned char s_dirt;
unsigned long long s_maxbytes; /* Max file size */
struct file_system_type *s_type;
const struct super_operations *s_op;
struct dquot_operations *dq_op;
struct quotactl_ops *s_qcop;
const struct export_operations *s_export_op;
unsigned long s_flags;
unsigned long s_magic;
struct dentry *s_root;
struct rw_semaphore s_umount;
struct mutex s_lock;
...
};
struct inode {
struct hlist_node i_hash;
struct list_head i_list;
struct list_head i_sb_list;
struct list_head i_dentry;
unsigned long i_ino;
atomic_t i_count;
unsigned int i_nlink;
uid_t i_uid;
gid_t i_gid;
dev_t i_rdev;
u64 i_version;
loff_t i_size;
#ifdef __NEED_I_SIZE_ORDERED
seqcount_t i_size_seqcount;
#endif
struct timespec i_atime;
struct timespec i_mtime;
struct timespec i_ctime;
unsigned int i_blkbits;
blkcnt_t i_blocks;
unsigned short i_bytes;
umode_t i_mode;
spinlock_t i_lock; /* i_blocks, i_bytes, maybe i_size */
struct mutex i_mutex;
...
};
include/linux/buffer_head.h:
struct buffer_head {
unsigned long b_state; /* buffer state bitmap (see above) */
struct buffer_head *b_this_page; /* circular list of page's buffers */
struct page *b_page; /* the page this bh is mapped to */
sector_t b_blocknr; /* start block number */
size_t b_size; /* size of mapping */
char *b_data; /* pointer to data within the page */
struct block_device *b_bdev;
bh_end_io_t *b_end_io; /* I/O completion */
void *b_private; /* reserved for b_end_io */
struct list_head b_assoc_buffers; /* associated with another mapping */
struct address_space *b_assoc_map; /* mapping this buffer is associated with */
atomic_t b_count; /* users using this buffer_head */
};
include/linux/dcache.h:
struct dentry {
atomic_t d_count;
unsigned int d_flags; /* protected by d_lock */
spinlock_t d_lock; /* per dentry lock */
struct inode *d_inode; /* Where the name belongs to - NULL is negative */
/*
* The next three fields are touched by __d_lookup. Place them here
* so they all fit in a cache line.
*/
struct hlist_node d_hash; /* lookup hash list */
struct dentry *d_parent; /* parent directory */
struct qstr d_name;
struct list_head d_lru; /* LRU list */
/*
* d_child and d_rcu can share memory
*/
union {
struct list_head d_child; /* child of parent list */
struct rcu_head d_rcu;
} d_u;
struct list_head d_subdirs; /* our children */
struct list_head d_alias; /* inode alias list */
unsigned long d_time; /* used by d_revalidate */
struct dentry_operations *d_op;
struct super_block *d_sb; /* The root of the dentry tree */
void *d_fsdata; /* fs-specific data */
#ifdef CONFIG_PROFILING
struct dcookie_struct *d_cookie; /* cookie, if any */
#endif
int d_mounted;
unsigned char d_iname[DNAME_INLINE_LEN_MIN]; /* small names */
};
5) Change the kernel such that it displays all superblocks before it calls mount_root and after mount_root. Boot with "My Linux3" to see what happens.
To display all superblocks, the code below was added before the definition of the prepare_namespace function.
void display_superblocks(void)
{
    struct super_block *sb;

    list_for_each_entry(sb, &super_blocks, s_list) {
        /* s_root may not be set for every superblock, so guard the dereference */
        unsigned long root_ino = (sb->s_root && sb->s_root->d_inode) ?
                                 sb->s_root->d_inode->i_ino : 0;

        /* i_ino is unsigned long, so it is printed with %lu */
        printk("dev name:%s dev maj num:%d dev minor num:%d root ino:%lu\n",
               sb->s_id, MAJOR(sb->s_dev), MINOR(sb->s_dev), root_ino);
    }
}
Then, inside the definition of prepare_namespace, display_superblocks() was called right before and right after the call to mount_root.
To apply the changes, the kernel was compiled and rebooted, and the boot messages were checked.
$ make bzImage
$ cp arch/x86/boot/bzImage /boot/bzImage
$ reboot
# Boot with "My Linux3"
$ dmesg > x
$ vi x
After mount_root is called, one more line is printed: "dev name: sda3, dev major num: 8, dev minor num: 3, root ino: 2".
The device number is a unique number for each device. The file name of each device is listed under "/dev", and ls -l shows the major and minor numbers of each device. The major number identifies the kind of device, and the minor number distinguishes individual devices within that kind. More details can be found in "Documentation/devices.txt".
6) Change the kernel such that it displays all cached inodes before it calls mount_root and after mount_root. Boot with "My Linux3" to see what happens.
To display all cached inodes, use the code below.
extern struct list_head inode_in_use;
void display_all_inodes(void)
{
    struct inode *in;

    list_for_each_entry(in, &inode_in_use, i_list) {
        /* i_ino is unsigned long, so it is printed with %lu */
        printk("dev maj num:%d dev minor num:%d inode num:%lu sb dev:%s\n",
               MAJOR(in->i_rdev), MINOR(in->i_rdev), in->i_ino, in->i_sb->s_id);
    }
}
6-1) Modify display_all_inodes such that it can also display the file name and file byte size of each file represented by the inode.
6-2) Make a system call that displays file name and file byte size of all inodes in use. Show only the first 100 files. Look at the result with dmesg command.
6-3) Modify your system call in 6-2) so that it can display mounting points. Mount myfd to temp directory and confirm your system call can detect it.
7) The pid=1 process (kernel_init) eventually execs to /sbin/init with run_init_process("/sbin/init"); by calling kernel_execve("/sbin/init", ....) in init/main.c/init_post(). Change the kernel such that it execs to /bin/sh. Boot the kernel, and you will find you cannot access /boot/grub/grub.conf. Explain why.
When the kernel is loaded, it initializes and configures memory, processors, I/O, and other hardware. It reads the compressed initramfs image from a predetermined location in memory, decompresses it directly into "/sysroot/", and loads all necessary drivers. After that, the kernel creates the root device, mounts the root partition read-only, and frees unused memory.
Once the kernel is loaded, it runs the "/sbin/init" program to set up the user environment. "/sbin/init" is the top-level process (pid = 1); it directs the rest of the boot process and sets up the environment for the user.
"/sbin/init" checks the structure of the file systems, mounts them, starts the server daemons, waits for user logins, and so on. If "/bin/sh" is executed instead of "/sbin/init", none of this happens, so "/dev/sda1" is never mounted on "/boot". The /boot directory of the root file system is then only an empty mounting point, and /boot/grub/grub.conf cannot be reached.
8) Try the following code. Make /aa/bb and type some text with length longer than 50 bytes. Explain the result.
$ cd / # go to /
$ mkdir aa # make the /aa directory
$ cd aa # move into aa
$ vi bb # create /aa/bb with vi
$ vi ex1.c
ex1.c:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
int main(void)
{
char buf[100];
int x = open("/aa/bb", O_RDONLY, 00777);
int y = read(x, buf, 10);
buf[y] = '\0';
printf("we read %s\n", buf);
lseek(x, 20, SEEK_SET);
y = read(x, buf, 10);
buf[y] = '\0';
printf("we read %s\n", buf);
int x1 = dup(x);
y = read(x1, buf, 10);
buf[y] = '\0';
printf("we read %s\n", buf);
link("/aa/bb", "/aa/newbb");
int x2 = open("/aa/newbb", O_RDONLY, 00777);
y = read(x2, buf, 10);
buf[y] = '\0';
printf("we read %s\n", buf);
return 0;
}
At the first printf, buf holds the first 10 bytes read from /aa/bb, so "0123456789" was printed.
At the second printf, lseek has moved the file pointer of x to offset 20 from the beginning of the file (SEEK_SET). buf holds the 10 bytes read starting at byte 20, so "9876543210" was printed.
At the third printf, x1 has been duplicated from x with dup. x and x1 share the same f_pos, so buf holds the 10 bytes that follow the position where the second read stopped, and "klmnopqrst" was printed.
At the fourth printf, link has made /aa/newbb point to the same file as /aa/bb. /aa/newbb is opened separately, so reading starts at offset 0 again; buf holds its first 10 bytes, and "0123456789" was printed.
$ ls -i /aa/*
The inode numbers of /aa/bb and /aa/newbb were confirmed to be the same, "502947".
ex2.c:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
int main(void)
{
char buf[100];
int x = open("/aa/bb", O_RDONLY, 00777);
int y = fork();
int z;
if (y == 0)
{
z = read(x, buf, 10);
buf[z] = '\0';
printf("child read %s\n", buf);
}
else
{
z = read(x, buf, 10);
buf[z] = '\0';
printf("parent read %s\n", buf);
}
return 0;
}
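We can confirm that the parent and the child access the same file. When a process forks, its fd table is copied, but the copied fd[x] still points to the same file{}, which is where f_pos is stored, so the two processes share it. Therefore whichever process reads second continues from where the first one stopped; here the parent continues from where the child had read.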
$ cd /
$ echo hello1 > f1
$ cd
$ echo hello2 > f1
$ mkdir d1
$ echo hello3 > d1/f1
b. Make ex3.c that will display "/f1" before and after chroot, and "f1" before and after chdir as follows.
ex3.c:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
void display_root_f1(void) // display the content of "/f1"
{
char buf[100];
int x = open("/f1", O_RDONLY);
int y = read(x, buf, 100);
buf[y] = '\0';
printf("%s\n", buf);
}
void display_f1(void) // display the content of "f1"
{
char buf[100];
int x = open("f1", O_RDONLY);
int y = read(x, buf, 100);
buf[y] = '\0';
printf("%s\n", buf);
}
int main(void)
{
display_root_f1(); // display the content of "/f1"
chroot(".");
display_root_f1(); // display the content of "/f1"
display_f1(); // display the content of "f1"
chdir("d1");
display_f1(); // display the content of "f1"
return 0;
}
- The first display_root_f1 shows the content of the f1 that was created after cd /, so hello1 is printed.
- chroot(".") changes the root to the current directory, and the current directory is the home directory.
- When display_root_f1 runs again after the root has changed, the current directory is now the root, so the content of the f1 in the current directory is printed, which is hello2.
- The first display_f1 prints the content of f1 in the current directory, so hello2 is printed again.
- After chdir("d1") changes the current directory to d1, display_f1 prints the f1 that was created inside d1, so hello3 is printed.
12) Make a new system call, my_show_fpos(), which will display the current process ID and the file position for fd=3 and fd=4 of the current process. Use this system call to examine the file position as follows. (Use %lld to print the file position since f_pos is a long long integer.)
arch/x86/kernel/syscall_table_32.S:
The my_show_fpos system call is registered as number 56.
fs/read_write.c:
asmlinkage void my_show_fpos(void)
{
printk("fd=3, f_pos=%lld\n", current->files->fdt->fd[3]->f_pos);
printk("fd=4, f_pos=%lld\n", current->files->fdt->fd[4]->f_pos);
}
ex4.c:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
void my_show_fpos()
{
syscall(56);
}
int main(void)
{
char buf[25];
int x = open("f1", O_RDONLY);
int y = open("f2", O_RDONLY);
my_show_fpos(); // f_pos right after opening two files
read(x, buf, 10);
read(y, buf, 20);
my_show_fpos(); // f_pos after reading some bytes
return 0;
}
$ make bzImage
$ cp arch/x86/boot/bzImage /boot/bzImage
$ reboot
# after reboot
$ echo 8 > /proc/sys/kernel/printk
$ ./ex4
x and y correspond to file descriptors 3 and 4, respectively. Since 10 characters were read from x and 20 characters from y, f_pos of fd 3 went from 0 to 10 and f_pos of fd 4 went from 0 to 20.
13) Modify your my_show_fpos() such that it also displays the address of the f_op->read and f_op->write functions for fd 0, fd 1, fd 2, fd 3, and fd 4, respectively. Find the corresponding function names in System.map. Why does the system use different functions for fd 0, 1, 2 and fd 3 or 4?
fs/read_write.c:
asmlinkage void my_show_fpos(void)
{
    int i; /* declared at the top: kernel builds warn about declarations after statements */

    printk("fd=3, f_pos=%lld\n", current->files->fdt->fd[3]->f_pos);
    printk("fd=4, f_pos=%lld\n", current->files->fdt->fd[4]->f_pos);
    // Update
    for (i = 0; i < 5; i++) {
        printk("fd=%d, read=%p\n", i, current->files->fdt->fd[i]->f_op->read);
        printk("fd=%d, write=%p\n", i, current->files->fdt->fd[i]->f_op->write);
    }
}
$ make bzImage
$ cp arch/x86/boot/bzImage /boot/bzImage
$ reboot
# after reboot
$ echo 8 > /proc/sys/kernel/printk
$ ./ex4
The system call now prints the addresses of the read and write functions. Looking up the printed addresses in System.map in the Linux source directory gives the result below. System.map is regenerated in the Linux source directory every time the kernel is compiled.
14) Use my_show_fpos() to explain the result of the following code. File f1 has "ab" and File f2 has "q". When you run the program, File f2 will have "ba". Explain why f2 has "ba" after the execution.
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
void my_show_fpos()
{
syscall(56);
}
int main(void)
{
char buf[10];
int f1 = open("./f1", O_RDONLY, 00777);
int f2 = open("./f2", O_WRONLY, 00777);
printf("f1 and f2 are %d %d\n", f1, f2); // make sure they are 3 and 4
if (fork() == 0)
{
my_show_fpos();
read(f1, buf, 1);
sleep(2);
my_show_fpos();
write(f2, buf, 1);
}
else
{
sleep(1);
my_show_fpos();
read(f1, buf, 1);
write(f2, buf, 1);
}
return 0;
}
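fork splits the program into two processes that share f_pos.
First, the child process prints the initial state of f1 and f2; f_pos is 0 for both.
It then reads one character from file "f1" into buf. At this point the child's buf holds ['a'].
While the child process waits for 2 seconds, the parent process prints the state of f1 and f2, and we can see that f_pos of f1 has increased by the number of characters the child read.
The parent then reads one more character into its buf, which now holds ['b'].
Local variables such as buf are not shared between the two processes.
The parent process writes its buf to "f2", and about 1 second later the child process writes its buf to "f2". Because f_pos of f2 is shared as well, the child's 'a' lands right after the parent's 'b', so "f2" becomes "ba".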
x=open(fpath, .......);
- 1) find empty fd
- 2) search the inode for "fpath"
- 2-1) if "fpath" starts with "/", start from "fs->root" of the current process
- 2-2) otherwise, start from "fs->pwd"
- 2-3) visit each directory in "fpath" to find the inode of the "fpath"
- 2-4) whenever a visited directory is a mounting point, follow the file system mounted on it
- 3) find empty file{} entry and fill in relevant information.
- 4) chaining
- 5) return fd
read(x, buf, n);
- 1) go to the inode for x
- 2) read n bytes starting from the current file position
- 3) save the data in buf
- 4) increase the file position by n
$ cd /
$ vi f1
..........
$
Try to read this file before "mount_root", after "mount_root", after sys_mount(".", "/", ...), and after sys_chroot(".") in init/do_mounts.c/prepare_namespace(). Explain what happens and why. For this problem, the kernel_init process should exec to /sbin/init.