Xalanz's Weblog (Ethan's blog)

Mud under the sunlight and rain

How the Linux kernel decides which process is the worst offender and should be killed


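[By Xalanz] In a world built on resources, even in the cyberspace inside a computer, choices have to be made when resources run short. Let's look at how the Linux kernel, when system resources (here, memory) are exhausted, decides which process is the worst one and then kills it to free up precious resources so that the system as a whole can survive.
The code lives in oom_kill.c, in the function badness().
First, let's see what counts as a good algorithm: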

This is the author's comment:
/*
 * Good in this context means that:
 * 1) we lose the minimum amount of work done
 * 2) we recover a large amount of memory
 * 3) we don't kill anything innocent of eating tons of memory
 * 4) we want to kill the minimum amount of processes (one)
 * 5) we try to kill the process the user expects us to kill, this
 *    algorithm has been meticulously tuned to meet the principle
 *    of least surprise ... (be careful when you change it)
 */
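Put another way, "good" here means roughly:
1) We throw away as little completed work as possible.
2) We get back a large amount of memory (the scarce resource).
3) We do not kill anything that is not itself guilty of consuming huge amounts of memory.
4) We want to kill as few processes as possible (ideally one).
5) We try to kill the process the user would expect to be killed; the algorithm has been carefully tuned to follow the principle of least surprise...
(so be careful when changing it)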
unsigned long badness(struct task_struct *p, unsigned long uptime)
{
    unsigned long points, cpu_time, run_time, s;
    struct list_head *tsk;

    /* Tasks without an mm (e.g. kernel threads) own no user memory: score 0. */
    if (!p->mm)
        return 0;

    /*
     * The memory size of the process is the basis for the badness.
     * (total_vm is the size of its address space, counted in pages.)
     */
    points = p->mm->total_vm;

    /*
     * Processes which fork a lot of child processes are likely
     * a good choice. We add half the vmsize of the children if they
     * have an own mm. This prevents forking servers to flood the
     * machine with an endless amount of children. In case a single
     * child is eating the vast majority of memory, adding only half
     * to the parents will make the child our kill candidate of choice.
     */
    list_for_each(tsk, &p->children) {
        struct task_struct *chld;
        chld = list_entry(tsk, struct task_struct, sibling);
        /* Skip children that share the parent's mm or have none. */
        if (chld->mm != p->mm && chld->mm)
            points += chld->mm->total_vm/2 + 1;
    }

    /*
     * CPU time is in tens of seconds and run time is in thousands
     * of seconds. There is no particular reason for this other than
     * that it turned out to work very well in practice.
     */
    cpu_time = (cputime_to_jiffies(p->utime) + cputime_to_jiffies(p->stime))
            >> (SHIFT_HZ + 3);

    if (uptime >= p->start_time.tv_sec)
        run_time = (uptime - p->start_time.tv_sec) >> 10;
    else
        run_time = 0;

    /* Divide by sqrt(cpu_time), then by the fourth root of run_time. */
    s = int_sqrt(cpu_time);
    if (s)
        points /= s;
    s = int_sqrt(int_sqrt(run_time));
    if (s)
        points /= s;

    /*
     * Niced processes are most likely less important, so double
     * their badness points.
     */
    if (task_nice(p) > 0)
        points *= 2;

    /*
     * Superuser processes are usually more important, so we make it
     * less likely that we kill those.
     */
    if (cap_t(p->cap_effective) & CAP_TO_MASK(CAP_SYS_ADMIN) ||
            p->uid == 0 || p->euid == 0)
        points /= 4;

    /*
     * We don't want to kill a process with direct hardware access.
     * Not only could that mess up the hardware, but usually users
     * tend to only have this flag set on applications they think
     * of as important.
     * (The flag in question is the CAP_SYS_RAWIO capability.)
     */
    if (cap_t(p->cap_effective) & CAP_TO_MASK(CAP_SYS_RAWIO))
        points /= 4;

    /*
     * Adjust the score by oomkilladj.
     * If the process has a nonzero oomkilladj, shift the score left
     * (positive values) or right (negative values) by that many bits.
     */
    if (p->oomkilladj) {
        if (p->oomkilladj > 0)
            points <<= p->oomkilladj;
        else
            points >>= -(p->oomkilladj);
    }

#ifdef DEBUG
    /* points is an unsigned long, so print it with %lu. */
    printk(KERN_DEBUG "OOMkill: task %d (%s) got %lu points\n",
            p->pid, p->comm, points);
#endif
    return points;
}
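
To get a feel for how these factors combine, here is a minimal userspace sketch of the same arithmetic. It is not kernel code: struct fake_task, model_badness() and isqrt() are hypothetical names made up for this illustration, CPU time is approximated as seconds / 8 in place of jiffies >> (SHIFT_HZ + 3), and the superuser checks are collapsed into a single flag.

```c
#include <stddef.h>

/* Hypothetical stand-in for the task_struct fields that badness() reads. */
struct fake_task {
    unsigned long total_vm;          /* own address-space size, in pages */
    const unsigned long *child_vm;   /* total_vm of each child with its own mm */
    size_t nr_children;              /* number of entries in child_vm */
    unsigned long cpu_seconds;       /* user + system CPU time consumed */
    unsigned long run_seconds;       /* wall-clock time since the task started */
    int nice;                        /* nice value; > 0 means lowered priority */
    int is_root;                     /* assumption: uid/euid 0 or CAP_SYS_ADMIN */
    int has_rawio;                   /* assumption: holds CAP_SYS_RAWIO */
    int oomkilladj;                  /* per-task shift adjustment */
};

/* Floor of the square root; a simple stand-in for the kernel's int_sqrt(). */
static unsigned long isqrt(unsigned long x)
{
    unsigned long r = 0;

    while ((r + 1) * (r + 1) <= x)
        r++;
    return r;
}

/* Userspace model of the scoring steps, in the same order as badness(). */
unsigned long model_badness(const struct fake_task *t)
{
    unsigned long points = t->total_vm;
    unsigned long s;
    size_t i;

    /* Each child with its own mm adds half of its size, plus one. */
    for (i = 0; i < t->nr_children; i++)
        points += t->child_vm[i] / 2 + 1;

    /* Approximation: seconds / 8 instead of jiffies >> (SHIFT_HZ + 3). */
    s = isqrt(t->cpu_seconds / 8);
    if (s)
        points /= s;

    /* Run time in units of 1024 seconds, reduced to its fourth root. */
    s = isqrt(isqrt(t->run_seconds >> 10));
    if (s)
        points /= s;

    if (t->nice > 0)
        points *= 2;
    if (t->is_root)
        points /= 4;
    if (t->has_rawio)
        points /= 4;

    if (t->oomkilladj > 0)
        points <<= t->oomkilladj;
    else if (t->oomkilladj < 0)
        points >>= -t->oomkilladj;

    return points;
}
```

In this model, a task with total_vm = 100000 pages that has used 800 seconds of CPU over 16384 seconds of run time scores 5000, while a freshly started task of the same size scores the full 100000. Long-lived processes that have done real work are therefore far less likely to be picked than a new process of equal size.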
