- interrupt, process, file, memory, I/O
- process definition
- Multiprogramming, Timesharing
- process descriptor, process queue, run queue
- pid, process state, time slice, mm, eip
- task_struct{}, thread_union{}, KMS (Kernel Mode Stack), init_task, current
- process schedule example
- How processes are created and destroyed
  - fork, exec, exit, wait
  - fork: duplicate the parent
  - exec: transform the current process into another
  - exit: stop the current process
  - wait: wait until the child dies
- All processes in Linux are created through fork and exec
- process vs thread
pthread_create
kernel_thread
- shell
- Linux initialization
- Where is CPU
A process is a program loaded in the memory.
- 1945 ~ 1955 :
- No OS.
- 1955 ~ 1965 :
- FMS (Fortran Monitor System). Runs a set of programs one by one.
- CPU utilization problem: the CPU was idle over 90% of the time.
- 1965 ~ :
- Multiprogramming. Load several programs at once and run them one by one. If the current one blocks on I/O, switch to the next process. Fairness problem: a CPU-bound program is favored.
- Now :
- Time Sharing. Allocate x ticks to each process. A process is now stopped either when it executes an I/O instruction or when its timer expires.
- process descriptor:
- one for each process.
- contains information about the corresponding process such as process ID, remaining ticks, register values at the last stop point, memory location, size, etc.
- task_struct{} is the process descriptor in Linux (include/linux/sched.h).
- "current" points to the current process.
- thread_union{} = task_struct{} + Kernel Mode Stack
- init_task is the kernel's process descriptor (arch/x86/kernel/init_task.c, include/linux/init_task.h)
struct task_struct init_task = INIT_TASK(init_task);
INIT_TASK(init_task) = {
    ........
    .parent = &init_task,
    .comm = "swapper",
    ........
}
- process queue:
- linked list of process descriptors
- &init_task is the start of this queue.
- traversing the process queue
void display_processes(void) {
    struct task_struct *temp = &init_task;
    for (;;) {
        /* print process id, program name, and process state for temp */
        printk("pid: %d, pname: %s, state: %ld\n", temp->pid, temp->comm, temp->state);
        temp = next_task(temp); /* find next process */
        if (temp == &init_task) break;
    }
    printk("\n");
}
- run queue
- A linked list of runnable processes.
- The scheduler looks at this queue to pick the next process after each interrupt
- Each cpu has its own run queue
- Each run queue is an array of queues based on the priority: struct prio_array{}.queue[]
3.1) task_struct is defined in include/linux/sched.h (search for "task_struct {").
Which fields of the task_struct contain information for process ID, parent process ID, user ID, process status, children processes, the memory location of the process, the files opened, the priority of the process, program name?
- Process ID: pid_t pid
- Parent process ID: struct task_struct *parent; /* parent process */ (the ID itself is parent->pid)
- User ID: uid_t uid
- Process status: volatile long state; /* -1 unrunnable, 0 runnable, >0 stopped */
- Children processes: struct list_head children; /* list of my children */
- The memory location of the process: struct mm_struct *mm
- The files opened: struct files_struct *files; /* open file information */
- The priority of the process: int prio, static_prio, normal_prio;
- Program name: char comm[TASK_COMM_LEN]
3.2) Display all processes with ps -ef. Find the pid of ps -ef, the process you have just executed. Find the pid and program name of its parent process, then the parent of this parent, and so on, until you see the init_task whose process ID is 0.
$ ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 1 00:04 ? 00:00:00 init [3]
... ... ... ... ... ... ... ...
root 4467 1 0 00:05 tty1 00:00:00 /bin/login --
... ... ... ... ... ... ... ...
root 4486 4467 0 00:05 tty1 00:00:00 -bash
root 4491 4486 0 00:05 tty1 00:00:00 ps -ef
The PID of ps -ef is 4491.
The PPID of ps -ef is 4486, so its parent process is -bash (PID 4486).
The PPID of -bash is 4467, so its parent process is /bin/login (PID 4467).
The PPID of /bin/login is 1, so its parent process is init [3] (PID 1).
The PPID of init [3] is 0, so its parent process is init_task (PID 0).
3.3) Define display_processes() in init/main.c (right before the first function definition). Call this function at the beginning of start_kernel().
Confirm that there is only one process in the beginning. Find the location where the number of processes becomes 2. Find the location where the number of processes becomes 3. Find the location where the number of processes is the greatest. Use dmesg to see the result of display_processes().
display_processes() was defined in init/main.c. The code works as follows.
- Declare temp, a pointer to struct task_struct, and make it point to init_task (the kernel's process descriptor).
- For each process, print its pid, pname, and state, then make temp point to the next process. When the next process is init_task again, leave the for loop and return from the function.
display_processes() was then called first thing when start_kernel runs.
To apply the changes, the kernel was recompiled and the machine rebooted.
$ make bzImage
$ cp arch/x86/boot/bzImage /boot/bzImage
$ reboot
The boot messages were checked with dmesg.
pid: 0, pname: swapper, state: 0
Only one process existed at that point: the swapper process, created with pid 0.
The first point at which two processes exist is when display_processes is called at the very top of kernel_init.
The greatest number of processes is seen after init_post is called in kernel_init.
3.4) Make a system call that, when called, displays all processes in the system. Run an application program that calls this system call and see if this program displays all processes in the system.
Since the goal is a system call that prints process information, the function was added to fs/read_write.c.
The function was then registered at the unused index 44 in arch/x86/kernel/syscall_table_32.S.
arch/x86/kernel/syscall_table_32.S:
Next, the Linux kernel was compiled with make and rebooted to apply the changes.
$ make bzImage
$ cp arch/x86/boot/bzImage /boot/bzImage
$ reboot
After rebooting, a program that calls the new system call was written.
syscall_44.c:
#include <unistd.h>
int main(void) {
    syscall(44);
    return 0;
}
After changing the log level, running the program invokes the system call and displays the process list.
$ echo 8 > /proc/sys/kernel/printk
$ ./syscall_44
3.4.1) Make a system call that, when called, displays all ancestor processes of the calling process in the system. For example, if ex1 calls this system call, you should see: ex1, ex1's parent, ex1's parent's parent, etc. until you reach pid=0 which is Linux itself.
Since the goal is a system call that prints the information of every ancestor process, the function was added to fs/read_write.c.
In my_sys_display_all_ancestors, the kernel function get_current, which returns the current pointer, was used to obtain the information of the current process.
The function was then registered at the unused index 53 in arch/x86/kernel/syscall_table_32.S.
arch/x86/kernel/syscall_table_32.S:
Next, the Linux kernel was compiled with make and rebooted to apply the changes.
$ make bzImage
$ cp arch/x86/boot/bzImage /boot/bzImage
$ reboot
After rebooting, a program that calls the new system call was written.
syscall_53.c:
#include <unistd.h>
int main(void) {
    syscall(53);
    return 0;
}
After changing the log level, running the program invokes the system call and displays the current process and all of its ancestors.
$ echo 8 > /proc/sys/kernel/printk
$ ./syscall_53
3.5) Run three user programs, f1, f2, and f3, and run another program that calls the above system call as follows.
State 0 means runnable and 1 means blocked. Observe the state changes in f1, f2, f3 and explain what these changes mean.
f1.c
:
int i,j; double x=1.2;
for(i=0;i<100;i++){
for(j=0;j<10000000;j++){ // make f1 busy for a while
x=x*x;
}
// and then sleep 1sec
usleep(1000000);
}
f2.c
:
int i,j; double x=1.2;
for(i=0;i<100;i++){
for(j=0;j<10000000;j++){ // make f2 busy for a while
x=x*x;
}
// and then sleep 2sec
usleep(2000000);
}
f3.c
:
int i,j; double x=1.2;
for(i=0;i<100;i++){
for(j=0;j<10000000;j++){ // make f3 busy for a while
x=x*x;
}
// and then sleep 3sec
usleep(3000000);
}
ex1.c
:
for(i=0;i<100;i++){
sleep(5);
syscall(44); // show all processes
// assuming the system call number in exercise (3.4) is 44
}
$ echo 8 > /proc/sys/kernel/printk
$ ./f1&
$ ./f2&
$ ./f3&
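$ ./ex1
In the output, the states of f1, f2, and f3 kept changing: a process shows state 0 while it is in its busy loop and state 1 while it is sleeping in usleep. The state of ex1 stayed at 0, since ex1 is the process running when the system call prints the list.
The programs finished in the order f1 > f2 > f3 > ex1.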
3.6) Modify your my_sys_display_all_processes() so that it can also display the remaining time slice of each process (current->rt.time_slice) and repeat 3.5) as below to see the effect.
chrt -rr 30 ./f1 will run f1 with priority value = max_priority - 30 (a lower priority value means a higher priority).
-rr sets the scheduling policy to SCHED_RR (whose max_priority is 99).
current->rt.time_slice was added to the printk in the display_processes function,
and the kernel was recompiled and rebooted to apply the change.
$ make bzImage
$ cp arch/x86/boot/bzImage /boot/bzImage
$ reboot
chrt is a command that modifies the real-time attributes of a process, and -rr sets the scheduling policy to SCHED_RR.
Under this policy, a process with a lower number has a higher priority.
$ chrt -rr 30 ./f1&
$ chrt -rr 30 ./f2&
$ chrt -rr 30 ./f3&
$ chrt -rr 30 ./ex1
The output shows the remaining time slice of each process while it runs.
The unit is presumably timer ticks. Since 6 to 9 processes show the same value, many processes are evidently handled within a very short time.
About 250 is given at first, and when the value reaches 0 it is refilled with about 25. This repeated until ex1 finished.
include/linux/sched.h
:
struct task_struct {
long state; // 0: runnable, >0 : stopped or dead
int prio; // priority
const struct sched_class *sched_class; // scheduling functions depending on
// scheduling class of this process
struct sched_entity se; // scheduling info
struct list_head tasks; // points to next task
struct mm_struct *mm; // memory occupied by this process
pid_t pid;
struct task_struct *parent;
struct list_head children;
uid_t uid; // owner of this process
char comm[TASK_COMM_LEN]; // program name
struct thread_struct thread; // pointer to saved registers
struct fs_struct *fs;
struct files_struct *files;
struct signal_struct *signal;
struct sighand_struct *sighand;
}
struct sched_class { // fair class has func name such as task_tick_fair, enqueue_task_fair..
// rt class has task_tick_rt, enqueue_task_rt, ...
void (*enqueue_task)(struct rq *rq, struct task_struct *p, ...);
void (*dequeue_task)(struct rq *rq, struct task_struct *p, ...);
struct task_struct *(*pick_next_task)(struct rq *rq);
void (*task_tick)(struct rq *rq, struct task_struct *p,...);
.........
}
struct sched_entity {
u64 sum_exec_runtime;
u64 vruntime; // actual runtime normalized(weighted) by the number of
// runnable processes. unit is nanosecond
................
}
struct list_head is a little tricky. It does not point to the next item directly.
For example, the field struct list_head tasks; does not mean that (current->tasks).next points to the next task.
include/linux/list.h:
struct list_head{
    struct list_head *next, *prev;
};
(current->tasks).next simply points to another struct list_head that is embedded in the next process. To find the next process, use the macro list_entry() or next_task():
list_entry( (current->tasks).next, struct task_struct, tasks)
- or
next_task(current)
We can display all processes in the system by
struct task_struct *temp;
temp = &init_task;
for(;;) {
printk("pid %d ",temp->pid);
temp = list_entry(temp->tasks.next, struct task_struct, tasks);
if (temp == &init_task) break;
}
Displaying the run queues is more difficult.
Each process has a priority (the "prio" field in task_struct), and the priorities range from 0 to 139.
Each priority has its own run queue.
run_list points to the next process in the run queue for the priority of the corresponding process.
this_rq() returns a pointer to the "struct rq" structure for the current cpu. This structure contains 140 run queues.
kernel/sched.c
:
union thread_union{
struct thread_info thread_info;
unsigned long stack[THREAD_SIZE/sizeof(long)]; // 8192 bytes
}
#define next_task(p) list_entry(rcu_dereference((p)->tasks.next), struct task_struct, tasks)
#define for_each_process(p) for(p=&init_task;(p=next_task(p))!=&init_task;)
arch/i386/kernel/init_task.c
:
union thread_union init_thread_union;
struct task_struct init_task = INIT_TASK(init_task);
include/asm-i386/thread_info.h
:
struct thread_info{
struct task_struct *task; // main task structure
struct exec_domain *exec_domain; // execution domain
long flags;
long status;
__u32 cpu; // cpu for this thread
mm_segment_t addr_limit; // 0-0xBFFFFFFF (3G bytes) for user-thread
// 0-0xFFFFFFFF (4G bytes) for kernel-thread
}
kernel/sched.c
:
#define DEF_TIMESLICE (100*HZ/1000) // 100 ms for default time slice. HZ=1000
// HZ is num of timer interrupts per second.
struct prio_array { // run queue
unsigned int nr_active;
struct list_head queue[MAX_PRIO]; // run queues for various priorities
};
void __activate_task(p, struct rq *rq) { // wake up p
    struct prio_array * target = rq->active;
    enqueue_task(p, target);
    p->array = target; // remember which array p is queued on
}
process priority: each process has a priority in "prio" (value 0..139). 0..99 is for real-time tasks; 100..139 is for user processes.
140 priority lists
The kernel calls schedule() at the end of each ISR (Interrupt Service Routine) to pick the next process.
kernel/sched.c
:
void schedule() {
struct task_struct *prev, *next;
struct prio_array *array;
prev = current;
rq = this_rq(); // run queue of the belonging cpu
array = rq->active; // active run queue
deactivate_task(prev, rq);
idx = sched_find_first_bit(array->bitmap);
queue = array->queue + idx;
/** old code
next = list_entry(queue->next, struct task_struct, run_list); // next task
array = next->array;
**/
next=pick_next_task(rq, prev);
rq->curr = next; // next is the curr task
context_switch(rq, prev,next); // move to next
}
struct task_struct * pick_next_task(..){
class=sched_class_highest;
p=class->pick_next_task(rq);
return p;
}
struct task_struct *pick_next_task_fair(rq){ // for cfs case
cfs_rq=&rq->cfs;
se=pick_next_entity(cfs_rq);
p = task_of(se);
return p;
}
struct sched_entity *pick_next_entity(..){
rb_entry(.....); // find the next task in rb tree
}
#define this_rq() (&__get_cpu_var(runqueues))
static DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
#define DEFINE_PER_CPU_SHARED_ALIGNED(type, name) \
__attribute__((__section__(".data.percpu"))) __typeof__(type) per_cpu__##name
The above in effect declares
static struct rq per_cpu__runqueues;
void wake_up_new_task(struct task_struct *p, ..){
struct rq *rq, *this_rq;
int this_cpu, cpu;
rq = task_rq_lock(p, ...); // the runqueue of the cpu this task belongs to
cpu = task_cpu(p); // cpu p belongs to
__activate_task(p, rq); // insert p in rq
}
void scheduler_tick(){ // timer interrupt calls this to
// decrease time slice of current p (in old code)
// in new code (after 2.6.23), it increases curr->se->vruntime
int cpu=smp_processor_id();
struct rq *rq=cpu_rq(cpu);
struct task_struct *curr=rq->curr;
curr->sched_class->task_tick(rq, curr, 0); // task_running_tick() in old code
.......
}
/** old code
void task_running_tick(rq, p){
if (!--p->time_slice){ // decrease it. if 0, reschedule
dequeue_task(p, rq->active);
set_tsk_need_resched(p);
p->time_slice=task_timeslice(p); // reset time slice
enqueue_task(p, rq->active); // put back at the end
}
}
**/
void task_tick_fair(rq, curr, ..){
se=&curr->se; // sched_entity
cfs_rq=cfs_rq_of(se);
entity_tick(cfs_rq, se, ...);
}
void entity_tick(cfs_rq, ...){
update_curr(cfs_rq);
.........
}
void update_curr(cfs_rq){
struct sched_entity *curr=cfs_rq->curr;
now=rq_of(cfs_rq)->clock;
delta_exec=now - curr->exec_start; // running time so far for curr
__update_curr(cfs_rq, curr, delta_exec);
curr->exec_start = now;
}
void __update_curr(cfs_rq, struct sched_entity *curr, delta_exec){
curr->sum_exec_runtime += delta_exec;
delta_exec_weighted=calc_delta_fair(delta_exec, curr);
curr->vruntime += delta_exec_weighted;
}
When the kernel starts, we have only one process: init_task, which represents the kernel itself. Other processes are created by fork.
example:
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
int x;
void main(){
x= fork();
if (x!=0) {
printf("korea %d\n", x);
while (1);
}
else {
printf("china\n");
while (1);
}
}
What is the result of the above code?
fork() ==> mov $2, %eax; int $0x80 ==> system_call ==> arch/x86/kernel/process_32.c/sys_fork ==> kernel/fork.c/do_fork()
fork is translated into 2 assembly instructions as below by the C library:
mov $2, %eax
int $0x80
- The ISR for interrupt 0x80 is system_call, which in turn calls sys_fork when eax=2.
- sys_fork calls do_fork, and do_fork does the following:
  - (1) copy the body of the parent process
  - (2) copy thread_union of the parent process, and adjust some information
  - (3) insert child into the process queue
  - (4) return 0 to the child, and return the child's pid to the parent
pthread_create is similar to fork() except that it does not copy the body of the parent.
p1.c:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
void * foo(void * aa){
printf("hello from child\n");
return NULL;
}
void main(){
pthread_t x;
pthread_create(&x, NULL, foo, NULL); // make a child which starts at foo
printf("hello from parent\n");
}
$ gcc -o p1 p1.c -lpthread
$ ./p1
hello from child
hello from parent
In the Linux kernel, use kernel_thread(), which is similar to pthread_create().
start_kernel() {
trap_init();
init_IRQ();
time_init();
console_init();
...............
rest_init();
}
rest_init() {
.........
kernel_thread(kernel_init, ...........);
pid=kernel_thread(kthreadd, ....);
schedule();
cpu_idle();
}
The above Linux code calls kernel_thread(kernel_init, ....).
After this call the kernel is duplicated (but only the thread_union of the kernel is duplicated), and the child's starting location is kernel_init().
Similarly, after kernel_thread(kthreadd, ...), another child is born whose starting location is kthreadd.
Since we now have three processes, they are scheduled one by one.
ex1.c
:
void main(){
printf("I am ex1\n");
}
ex2.c
:
void main(){
execve("./ex1", 0,0);
printf("I am ex2\n");
}
$ gcc ex1.c -o ex1
$ gcc ex2.c -o ex2
$ ex2
What will be the result?
exec ==> mov $11, %eax; int $0x80 ==> system_call ==> sys_execve
exec is translated into 2 assembly instructions as below by the C library:
mov $11, %eax
int $0x80
- The ISR for interrupt 0x80 is system_call, which in turn calls sys_execve when eax=11. (sys_execve is in arch/x86/kernel/process_32.c)
- sys_execve calls do_execve (fs/exec.c), which does the following:
  - (1) remove old body
  - (2) load new body
  - (3) update the task_struct
  - (4) update the KMS (the stack portion in thread_union) such that
    - KMS.eip = starting location of the new body
exec1.c
:
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
int x;
void main(){
char * exec2 = "./exec2";
char * argv[2];
argv[0] = exec2;
argv[1] = 0;
x=fork();
if (x!=0){
printf("korea %d\n",x);
execve("./exec2", argv, 0);
printf("exec failed\n");
}
else{
printf("japan\n");
for(;;);
}
}
exec2.c
:
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
int x;
void main(){
printf("china\n");
}
$ gcc -o exec2 exec2.c
$ gcc -o exec1 exec1.c
exec1
The shell uses fork() and exec() to run a command:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <sys/wait.h>
int main() {
    int x, y;
    char buf[50];
    char * argv[2];
    for(;;) {
        printf("$ ");
        // get command (at most 49 chars). no arg can be input this way
        if (scanf("%49s", buf) != 1) break; // stop at end of input
        argv[0] = buf;
        argv[1] = NULL;
        x = fork();
        if (x == 0) { // child
            printf("I am child to execute %s\n", buf);
            y = execve(buf, argv, 0);
            if (y < 0) {
                printf("exec failed. errno is %d\n", errno);
                exit(1);
            }
        } else {
            wait(NULL); // parent sleeps until the child exits
        }
    }
    return 0;
}
start_kernel()
==> rest_init()
==> kernel_thread(kernel_init, ....); // now we have two processes (init_task and kernel_init)
==> kernel_thread(kthreadd, ...); // init_task runs first and create another thread.
// now we have three processes (init_task, kernel_init, kthreadd)
==> schedule(); // init_task calls schedule. the scheduler picks kernel_init.
// prio of init_task is 140. prio of the other two is 120.
==> kernel_init()
==> do_basic_setup()
...........
init_post();
==> init_post()
==> run_init_process("/sbin/init", ......);
==> kernel_execve("/sbin/init", ...); //kernel_init is transformed into /sbin/init.
==> /sbin/init
==> for (i=0;i < number of programs listed in /etc/inittab; i++) {
x=fork();
if (x==0){ // child
execve(next program listed in /etc/inittab, ...);
}
} // parent goes back to the loop to create the next child
for(;;){ // parent
waits here;
}
==> fork();
==> child init calls execve("/sbin/agetty",..); // child init
// is transformed into /sbin/agetty
==> when a user logs in to this server, /sbin/agetty execs to /bin/login
and displays
login:
==> when user types login ID and password correctly /bin/login makes a child and
execs the child to the shell as specified in /etc/passwd, which is usually /bin/bash
root:x:0:0:root:/root:/bin/bash
...............
==> /bin/bash runs the shell code: display '#', read command, fork, let the child exec to the command, etc.
==> when user types ps -ef, shell forks and execs the child to ps -ef
init_task->/sbin/init->/sbin/agetty->/bin/login->/bin/bash->ps -ef
Shell code again
.........
for(;;){
printf("$ ");
scanf("%s", buf); // get command. no arg can be input this way
.................
x=fork();
if (x==0){ // child
y=execve(buf, argv, 0);
............
} else wait();
}
- Shell runs printf("$"), and this library function calls write(1, "$", 1), which displays the prompt.
(printf => write => INT 128 => sys_write => display "$")
- Shell runs scanf("%s", buf);, and this library function calls read(0, buf, n), which makes the shell sleep until the user enters a command.
Making the shell sleep means setting the shell's state to TASK_INTERRUPTIBLE (a blocked state) and taking it out of the run queue.
Since the shell cannot be scheduled, the scheduler picks the kernel (pid=0) as the next process and runs cpu_idle().
(scanf => read => INT 128 => sys_read => make shell sleep; cpu jumps to cpu_idle)
- A user types a command.
$ ls<Enter>
- Each keystroke raises a keyboard interrupt.
press l => INT 33 => atkbd_interrupt => store l => cpu goes back to cpu_idle()
=> release l => INT 33 => atkbd_interrupt => ignore key release => cpu goes back to cpu_idle()
=> press s => INT 33 => ......
=> press <Enter> => INT 33 => atkbd_interrupt => store ls in shell's buf and wake up the shell.
Waking up the shell means setting its state to TASK_RUNNING and putting it back in the run queue. Now the scheduler picks the shell as the next process and the shell resumes execution.
- shell runs x=fork(), and fork() makes a child.
(fork => INT 128 => do_fork() => make a child; assume the scheduler picks the parent first)
- parent shell runs wait(), and wait() makes it sleep. Now the scheduler picks the child.
(wait => INT 128 => sys_wait)
- child shell runs execve("ls", ....), which changes it into the /bin/ls program. The scheduler picks the child again (the parent is still sleeping).
(execve => INT 128 => do_execve)
- /bin/ls runs and shows all file names in the current directory on the screen.
At the end, it calls exit(). exit() makes it a zombie (sets its state to TASK_ZOMBIE) and sends a signal to the parent.
This signal wakes up the parent (sets its state to TASK_RUNNING).
The scheduler now picks the parent.
- parent goes back to the beginning of the for(;;) loop and runs printf("$").
$
ex1.c
:
void main(){
int x;
x=fork();
printf("x:%d\n", x);
}
fork copies the caller's body and process descriptor to create a child process.
When fork succeeds, it returns 0 in the child process and the child's pid in the parent process (on failure it returns -1 to the caller).
The 0 is the value printed by the child's printf, and 4506 is the value printed by the parent's printf.
ex1.c
:
void main(){
fork();
fork();
fork();
for(;;);
}
The code above calls fork three times and then loops forever.
$ gcc -o ex1 ex1.c
$ ./ex1 &
$ ps -ef
A total of 8 ./ex1 processes are created: every existing process executes each of the three fork calls, so 2^3 = 8 processes exist.
ex1.c
:
#include <stdio.h>
#include <unistd.h>
void main(){
int i; float y=3.14;
fork();
fork();
for(;;){
for(i=0;i<1000000000;i++) y=y*0.4;
printf("%d\n", getpid());
}
}
Similar to the previous example, but there are two fork() calls, so a total of 4 (= 2^2) ./ex1 processes are created.
Each of the 4 processes performs the computation on y and prints its own PID.
Because the loop is infinite, each process keeps computing and printing its PID until it is terminated.
ex1.c
:
void main(){
char *argv[10];
argv[0] = "./ex2";
argv[1] = 0;
execve(argv[0], argv, 0);
}
ex2.c
:
void main(){
printf("korea\n");
}
execve replaces the current process with the given program and starts it anew. It takes the program path as its first argument and the values passed to the program's argv as its second argument.
$ gcc -o ex1 ex1.c
$ gcc -o ex2 ex2.c
$ ./ex1
Because ex1 calls execve, the body of the current process is replaced by the body of ex2 and execution starts over, so "korea", the output of ex2, was printed.
void main() {
char *argv[10];
argv[0] = "/bin/ls";
argv[1] = 0;
execve(argv[0], argv, 0);
}
Similar to the previous example, but the file to execute is changed to "/bin/ls".
The ls command runs the program /bin/ls.
Because execve runs /bin/ls, the output is the same as running ls.
argv is an array of strings. Just as a C string marks its end with a terminating null character, the array marks its end with a NULL last element.
void main() {
char *argv[10];
argv[0] = "/bin/ls";
argv[1] = "-a";
argv[2] = 0;
execve(argv[0], argv, 0);
}
Similar to the previous example, but "-a" is added as the second element of argv.
The -a option of ls also lists hidden files and directories.
Running ls -a directly produced the same output.
p1.c
:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
void * foo(void * aa) {
printf("hello from child\n");
return NULL;
}
void main() {
pthread_t x;
pthread_create(&x, NULL, foo, NULL); // make a child which starts at foo
printf("hi from parent\n");
pthread_join(x, NULL); // wait for the child
}
A process can simply be described as a program that is being executed.
That is, it is a program written by a user that has been allocated memory space by the operating system and is running. Such a process consists of the data used by the program, resources such as memory, and threads.
A thread is the entity that actually performs work inside a process. Every process has at least one thread that carries out its work.
Source: TCP School - 69) The concept of threads
pthread_create is the function that creates a thread. Its first argument, thread, receives a value that identifies the created thread. Its third argument is the function the thread runs when it starts.
$ gcc -o p1 p1.c -lpthread
$ ./p1
To use the functions declared in <pthread.h>, the PThread library must be linked. In gcc, -l is the option that links a library, so -lpthread links /usr/lib/libpthread.so.
p1.c
:
#include <stdio.h>
#include <unistd.h>
int y=0;
int main() {
int x;
x = fork();
if (x == 0) {
y = y + 2;
printf("process child:%d\n", y);
} else {
y = y + 2;
printf("process parent:%d\n", y);
}
return 0;
}
p2.c
:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
int y = 0;
void * foo(void *aa) { // aa is arguments passed by parent, if any.
y = y + 2;
printf("thread child:%d\n", y);
return NULL;
}
void main() {
pthread_t x;
pthread_create(&x, NULL, foo, NULL);
y = y + 2;
printf("thread parent:%d\n", y);
pthread_join(x, NULL); // wait for the child
}
p1.c compares the parent and child of a process, and p2.c compares the parent and child of a thread.
Both p1.c and p2.c declare a global variable y.
In p1.c, the parent and the child both printed 2 for y, but each process has its own separate y, so the two values were computed without affecting each other.
In p2.c, the threads share the global variable y of the single process, so in the observed run the child thread printed 2 and the parent thread added 2 to the y that was already 2 and printed 4. Which thread prints first depends on scheduling.
1) Try the shell code in section 7. Try Linux commands such as /bin/ls, /bin/date, etc.
shell.c:
A shell program was written based on the shell code in Section 7.
The commands /bin/ls, /bin/date, and /bin/pwd were entered, and each produced the correct result.
2) Print the pid of the current process (current->pid) inside rest_init() and kernel_init(). The pid printed inside rest_init() will be 0, but the pid inside kernel_init() is 1. 0 is the pid of the kernel itself.
Why do we have pid=1 inside kernel_init()?
Find a location where current->pid will print 2.
Code that prints the current PID was inserted at the beginning of rest_init and kernel_init.
To apply the changes, the kernel was recompiled and the machine rebooted.
$ make bzImage
$ cp arch/x86/boot/bzImage /boot/bzImage
$ reboot
The boot messages were checked with dmesg.
The PID printed in rest_init was 0, and the PID printed in kernel_init was 1.
Inside rest_init, kernel_thread creates a task that starts at kernel_init and is given PID 1. Because kernel_init is created as a kernel thread, it shares the process body of init_task.
Next, the ps command was used to find where current->pid prints 2.
$ ps -ef
The PID of kthreadd is 2, so current->pid can be expected to print 2 once kthreadd is running.
3) The last function call in start_kernel() is rest_init(). If you insert printk() after rest_init(), it is not displayed during the system booting. Explain the reason.
init/main.c
:
void start_kernel(){
............
printk("before rest_init\n"); // this will be printed out
rest_init();
printk("after rest_init\n"); // but this will not.
}
rest_init, the last function called in start_kernel(), calls cpu_idle at the end of its body.
arch/x86/kernel/process_64.c:
cpu_idle is an infinite loop that the kernel (pid 0) runs whenever no other process can be scheduled, for example while "login:" is on the screen and the system waits for user input. Because this loop never returns, the code after rest_init in the same process is never reached. Therefore code placed after rest_init cannot run during booting.
4) The CPU is either in some application program or in the Linux kernel. You should always be able to say where the CPU currently is. Suppose we have the following program (ex1.c).
ex1.c
:
void main(){
printf("korea\n");
}
When the shell runs this, the CPU could be in the shell program, in ex1, or in the kernel.
Explain where the CPU is for each major step of this program. You should indicate the CPU location whenever the CPU changes its location among these three programs.
Start the tracing from the moment when the shell prints a prompt until it prints the next prompt.
shell: printf("$"); // CPU is in the shell; prints the prompt character that signals the shell is ready for input
=> write(1, "$", 1); // CPU is in the C library; writes the 1 character "$" to STDOUT_FILENO (=1)
=> mov eax, 4 // CPU is in the C library; 4 is the system call number of write
=> INT 128 // CPU is in the C library; raises interrupt 128, the system call interrupt
kernel: sys_write() // CPU is in the kernel; system call that outputs the string "$"
shell: scanf("%s", buf); // CPU is in the shell; reads the string typed by the user
=> read(0, buf, n); // CPU is in the C library; reads up to n bytes from STDIN_FILENO (=0) into buf
=> mov eax, 3 // CPU is in the C library; 3 is the system call number of read
=> INT 128 // CPU is in the C library; raises interrupt 128, the system call interrupt
kernel: sys_read(); // CPU is in the kernel; system call that reads the typed string
/* (a key is pressed) */
hardware: INT 33 // the keyboard raises interrupt 33, the keyboard interrupt; the CPU enters the kernel
kernel: atkbd_interrupt // CPU is in the kernel; stores the typed character in the keyboard buffer
/* (the steps above repeat until ENTER is typed) */
/* (ENTER is typed) */ // CPU returns to the shell
shell: x=fork(); // CPU is in the shell; creates a child process to run the program
=> INT 128 // CPU is in the C library; raises interrupt 128, the system call interrupt
kernel: sys_fork() // CPU is in the kernel; the fork system call
shell (child): printf("I am child~%s\n", buf); // CPU is in the shell (child process)
=> write(1, "I am~", n) // CPU is in the C library; writes n characters of "I am~" to STDOUT_FILENO (=1)
=> INT 128 // CPU is in the C library; raises interrupt 128, the system call interrupt
kernel: sys_write() // CPU is in the kernel; system call that outputs the string "I am~"
shell (child): y=execve(buf, argv, 0); // CPU is in the shell (child process); runs the program the user typed
=> INT 128 // CPU is in the C library; raises interrupt 128, the system call interrupt
kernel: sys_execve() // CPU is in the kernel; the execve system call
ex1: printf("korea\n"); // CPU is in ex1
=> write(1, "korea\n", 6); // CPU is in the C library; writes the 6 characters of "korea\n" to STDOUT_FILENO (=1)
=> INT 128 // CPU is in the C library; raises interrupt 128, the system call interrupt
kernel: sys_write() // CPU is in the kernel; system call that outputs the string "korea"
=> exit(0)
shell (parent): else wait(); // CPU is in the shell (parent process)
=> INT 128 // CPU is in the C library; raises interrupt 128, the system call interrupt
kernel: sys_wait4 // CPU is in the kernel
shell: printf("$"); // CPU is back in the shell; once the program has terminated, the shell prints the prompt again
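The fork / execve / wait part of the trace above can be sketched in user space as follows. This is a minimal sketch, not the shell's actual code: `run_program` is a hypothetical helper name, and the exit code 127 for a failed `execve` is a convention borrowed from common shells, not something the trace specifies.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run argv[0] in a child process and return its exit
 * status, or -1 if the child could not be created or did not exit normally. */
int run_program(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t x = fork();                     /* shell: x = fork() -> sys_fork */
    if (x < 0)
        return -1;
    if (x == 0) {                         /* child: becomes the new program */
        execve(argv[0], argv, environ);   /* -> sys_execve */
        _exit(127);                       /* reached only if execve failed */
    }

    int status;
    if (waitpid(x, &status, 0) < 0)       /* parent: wait -> sys_wait4 */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling `run_program` with `{"/bin/sh", "-c", "exit 3", NULL}` should return 3 on a system where `/bin/sh` exists; the parent only gets the CPU back in user mode after the child has exited.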
5) What happens if the kernel calls kernel_init directly instead of calling kernel_thread(kernel_init, ...) in rest_init()?
Call kernel_init with a NULL argument and explain why the kernel panics.
In the definition of rest_init, the kernel_thread call was commented out and replaced with a direct call, kernel_init(NULL);.
The kernel was then recompiled and rebooted.
The last line printed was the error message "Kernel panic", and booting did not proceed any further.
kernel_thread copies the process descriptor to create a new process. If kernel_init runs without kernel_thread, it runs in the caller's own context, and the infinite loop inside kernel_init prevents any later process from running.
In addition, when kernel_execve runs another program, it removes the current body, which causes an error.
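A user-space analogy may help here. It is only an analogy, with `fork` standing in for kernel_thread and a hypothetical `never_returns` function standing in for kernel_init; it does not reproduce the kernel's behavior. The point it sketches is that a function that never returns must run in its own process if the caller is to keep going.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for kernel_init: it never returns to its caller. */
static void never_returns(void)
{
    for (;;)
        pause();                  /* sleep until a signal arrives */
}

/* Returns 1 if the caller kept running after starting never_returns(). */
int spawn_then_continue(void)
{
    pid_t pid = fork();           /* plays the role of kernel_thread() */
    if (pid < 0)
        return 0;
    if (pid == 0)
        never_returns();          /* only the child is stuck in the loop */

    /* The parent gets here because the loop runs in another process.
     * Calling never_returns() directly would hang the caller forever. */
    assert(pid > 0);
    int status;
    kill(pid, SIGKILL);           /* clean up the demo child */
    if (waitpid(pid, &status, 0) != pid)
        return 0;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL;
}
```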
6) Trace the fork, exec, exit, and wait system calls to find the corresponding code for the major steps of each system call.
fork calls the sys_fork function.
sys_fork -> do_fork -> copy_process -> dup_task_struct
The nr value returned by do_fork is the PID of the copied (child) process.
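The return value can be observed from user space. In the sketch below (the function name `fork_return_values_ok` is made up for illustration), the parent receives the child's PID, which corresponds to nr above, while the child receives 0.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if fork() gave 0 to the child and the child's PID to the parent. */
int fork_return_values_ok(void)
{
    pid_t self = getpid();
    pid_t nr = fork();
    if (nr < 0)
        return 0;
    if (nr == 0) {
        /* Child: fork() returned 0; we are a new process whose parent is self. */
        int ok = (getpid() != self) && (getppid() == self);
        _exit(ok ? 0 : 1);
    }

    /* Parent: nr is the PID of the new process, so waiting on it must succeed. */
    assert(nr != self);
    int status;
    pid_t reaped = waitpid(nr, &status, 0);
    return reaped == nr && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```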
exec calls the sys_execve function.
sys_execve -> do_execve -> open_exec -> sched_exec
do_execve opens the program file, calls sched_exec for scheduling, and passes on values such as argv.
exit calls the sys_exit function.
do_exit:
......
......
......
sys_exit -> do_exit -> exit_signals -> exit_mm -> exit_thread -> exit_notify -> schedule
do_exit sends signals (calling the registered functions), reclaims the memory, terminates the thread, notifies the parent process (by sending a signal), and finally calls the scheduler so that the process never runs again.
wait(&wstatus) is equivalent to waitpid(-1, &wstatus, 0), so the function to look for is waitpid.
waitpid calls the sys_waitpid function.
Looking at sys_waitpid, a comment there says that sys_waitpid remains only for compatibility and is implemented with sys_wait4.
So we look at sys_wait4 next.
sys_waitpid -> sys_wait4 -> do_wait -> wait_task_stopped / wait_task_zombie / wait_task_continued
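The equivalence of wait(&wstatus) and waitpid(-1, &wstatus, 0) can be checked with a small sketch. The helper names `spawn_exiting` and `wait_equivalence_ok` are hypothetical, and the sketch assumes the calling process has no other unreaped children.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits immediately with the given code; returns its PID. */
static pid_t spawn_exiting(int code)
{
    assert(code >= 0 && code <= 255);
    pid_t pid = fork();
    if (pid == 0)
        _exit(code);
    return pid;
}

/* Returns 1 if wait() and waitpid(-1, ..., 0) both collect "any child". */
int wait_equivalence_ok(void)
{
    int st1 = 0, st2 = 0;

    pid_t a = spawn_exiting(11);
    if (a < 0)
        return 0;
    pid_t r1 = wait(&st1);                /* collects any child: here, a */

    pid_t b = spawn_exiting(22);
    if (b < 0)
        return 0;
    pid_t r2 = waitpid(-1, &st2, 0);      /* same meaning: here, b */

    return r1 == a && WIFEXITED(st1) && WEXITSTATUS(st1) == 11
        && r2 == b && WIFEXITED(st2) && WEXITSTATUS(st2) == 22;
}
```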
$ ./startsys;./sysnum;./stopsys
where startsys sets the kernel flag so that system call numbers are displayed, stopsys resets it, and sysnum calls printf.
sysnum.c:
#include <stdio.h>
int main(void){
    printf("hi\n");
    return 0;
}
startsys.c:
#include <unistd.h>
int main(void){
    syscall(31); // start printing sysnum
    return 0;
}
stopsys.c:
#include <unistd.h>
int main(void){
    syscall(32); // stop printing sysnum
    return 0;
}
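syscall(31) and syscall(32) above only work on the modified kernel of this exercise. To see the same mechanism on an unmodified Linux system, a minimal sketch can invoke existing system calls by number through `syscall()`. The wrapper names `raw_getpid` and `raw_write` are made up for illustration; `SYS_getpid` and `SYS_write` are the constants from `<sys/syscall.h>`.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke getpid by system call number instead of through its libc wrapper. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}

/* Invoke write by system call number: this is what printf ends up doing. */
long raw_write(int fd, const char *buf, size_t n)
{
    assert(buf != NULL);
    return syscall(SYS_write, fd, buf, n);
}
```

With the kernel flag set by startsys, a call such as `raw_write(1, "hi\n", 3)` would be expected to show up under the number of write, just like the printf in sysnum.c.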
execve(argv[0], argv, 0);
How does Linux know the value of argv?
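One way to approach this question from user space: the caller builds the argv array in its own memory and passes a pointer to it as the second argument of execve, so the kernel receives it like any other system call argument. The sketch below assumes `/bin/sh` exists and uses the shell's `$#` only as a convenient way to report how many arguments arrived; `args_seen_by_new_program` is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the number of positional arguments the new program reports,
 * or -1 on failure. */
int args_seen_by_new_program(void)
{
    /* "name" becomes $0 of the script; "a", "b", "c" become $1..$3. */
    char *argv[] = { "/bin/sh", "-c", "exit $#", "name", "a", "b", "c", NULL };
    char *envp[] = { NULL };              /* empty environment */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(argv[0], argv, envp);      /* argv is passed by pointer */
        _exit(127);                       /* reached only if execve failed */
    }

    assert(pid > 0);
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);           /* the shell exits with $# */
}
```

The strings live in the old program's memory, which execve discards, so the kernel has to copy them into the new program's memory before the old body is removed.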
int x,y;
y=0;
x=20/y;
When run, this program prints:
Floating point exception
and dies. It dies because of a divide-by-zero exception. Modify the kernel so that, when this program runs, the system prints instead:
Divide-by-zero exception
Floating point exception
Tip) To call a new function from entry.S, you need to protect the registers as follows:
SAVE_ALL
call new_function
RESTORE_REGS
All programs end with the exit() system call. Even if the programmer did not put exit() in the code, the compiler provides it in crtso (the C run-time start-off function).
exit does the following:
- remove the body
- make the process a zombie
- send SIGCHLD to the parent
- hand its children over to the init process for adoption
- schedule the next process
exit -> sys_exit -> do_exit
kernel/exit.c:
do_exit() {
struct task_struct *tsk = current;
exit_mm(tsk); // remove body
exit_sem(tsk);
__exit_files(tsk);
__exit_fs(tsk); // release resources
exit_notify(tsk); // send SIGCHLD to the parent, ask init to adopt my children,
// and set tsk->exit_state = EXIT_ZOMBIE to make it a zombie
tsk->state = TASK_DEAD;
schedule(); // call a scheduler
}
The parent should call wait to collect the child; otherwise the child stays as a zombie, consuming 8192 bytes of memory.
wait does the following:
if the child has exited first (that is, if the child is a zombie)
let it die completely (remove its process descriptor)
else (child is not dead yet)
block parent
remove parent from the run queue
schedule next process
When the child exits later, the parent gets SIGCHLD, wakes up, and is inserted into the run queue again.
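The zombie state and its removal by wait can be observed from user space. The following is a sketch under a few assumptions: it relies on `waitid` with `WNOWAIT` to block until the child has exited without collecting it, and on `kill(pid, 0)` succeeding for a zombie, which is the behavior on Linux. The function name `zombie_then_reaped` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the exited child still exists until the parent waits for it. */
int zombie_then_reaped(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return 0;
    if (pid == 0)
        _exit(0);                          /* child exits first */

    assert(pid > 0);
    siginfo_t info;
    /* Block until the child has exited, but leave it uncollected (WNOWAIT). */
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) != 0)
        return 0;
    int exists_as_zombie = (kill(pid, 0) == 0);   /* descriptor still there */

    int status;
    if (waitpid(pid, &status, 0) != pid)   /* now let it die completely */
        return 0;
    int gone_after_wait = (kill(pid, 0) == -1 && errno == ESRCH);

    return exists_as_zombie && gone_after_wait;
}
```

Between the two checks the child has no body left, yet its PID is still valid; only the parent's wait removes the process descriptor.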
wait -> sys_wait4 -> do_wait
kernel/exit.c:
do_wait() {
struct task_struct *tsk;
DECLARE_WAITQUEUE(wait, current); // make a "wait" queue
add_wait_queue(&current->signal->wait_chldexit, &wait);
current->state = TASK_INTERRUPTIBLE; // block parent
tsk = current;
do{
list_for_each(_p, &tsk->children){
p = list_entry(...);
if (p->exit_state == EXIT_ZOMBIE){ // if the child has exited first
wait_task_zombie(p, ...); // reap it: remove its process descriptor
break;
}
// otherwise, it's still alive. wait here until it is dead
wait_task_continued(p,...);
break;
}
schedule();
}