极湖

Going to every "extreme"


Posts tagged with "Unix"

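Handy tools for text manipulation on Unix

■ cut

The cut command extracts selected characters or fields from each line of text.

(Examples)
$ cut -c1 file1.txt
Print the 1st character of each line of file1.txt.

$ cut -c1-10 file1.txt
Print characters 1 through 10 of each line of file1.txt.

$ cut -c20- file1.txt
Print each line of file1.txt from the 20th character to the end of the line.

$ cut -d: -f1 file1.txt
Print the 1st colon-delimited ( : ) field of each line of file1.txt.

$ cut -d' ' -f1,2 file1.txt
Print the 1st and 2nd space-delimited fields of each line of file1.txt.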
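
To try cut without an existing file1.txt, the sketch below first builds a small sample file. The file contents, the temporary directory and the variable names are assumptions made only for this illustration.

```shell
# Build a throwaway sample file (contents are made up for this demo).
workdir=$(mktemp -d)
printf 'root:x:0\ndaemon:x:1\n' > "$workdir/file1.txt"

# Characters 1 through 4 of every line.
first_chars=$(cut -c1-4 "$workdir/file1.txt")
# Fields 1 and 3, using ':' as the delimiter.
fields=$(cut -d: -f1,3 "$workdir/file1.txt")

printf '%s\n' "$first_chars"
printf '%s\n' "$fields"
```

When several fields are selected, cut joins them again with the same delimiter, so the second command prints root:0 and daemon:1.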

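■ paste

The paste command joins text files side by side, line by line.

(Example)
Contents of file1.txt:
1
2
3

Contents of file2.txt:
One
Two
Three

$ paste file1.txt file2.txt
Output (the columns are separated by a tab, the default delimiter):
1 One
2 Two
3 Three

$ paste -d, file1.txt file2.txt
Output:
1,One
2,Two
3,Three

paste -s joins all the lines of one file into a single line.

$ paste -s file1.txt
Output (tab-separated):
1 2 3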
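
The paste examples can be reproduced with the following sketch, which recreates the two sample files in a temporary directory. Combining -s with -d is not shown in the text above; it is added here only as an illustration, and the variable names are arbitrary.

```shell
workdir=$(mktemp -d)
printf '1\n2\n3\n' > "$workdir/file1.txt"
printf 'One\nTwo\nThree\n' > "$workdir/file2.txt"

# Join the two files line by line, separated by a comma.
joined=$(paste -d, "$workdir/file1.txt" "$workdir/file2.txt")
# Serialize one file into a single line, again comma-separated.
serial=$(paste -s -d, "$workdir/file1.txt")

printf '%s\n' "$joined"
printf '%s\n' "$serial"
```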

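■ sed

sed is a stream editor.

(Examples)
$ sed 's/Unix/UNIX/' file1.txt
Replace the first occurrence of Unix on each line of file1.txt with UNIX and print the result.

$ sed -n '1,2p' file1.txt
Print the first two lines of file1.txt.

$ sed -n '/UNIX/p' file1.txt
Print the lines of file1.txt that contain UNIX.

$ sed '1,2d' file1.txt
Print file1.txt with its first two lines deleted.

$ sed '/UNIX/d' file1.txt
Print file1.txt with the lines that contain UNIX deleted.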
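
One detail worth seeing in action is that s/old/new/ changes only the first match on each line unless the g flag is given. The sample text and variable names below are made up for the demonstration.

```shell
workdir=$(mktemp -d)
printf 'Unix is old\nLinux is newer\nUnix and Unix\n' > "$workdir/file1.txt"

# Without the g flag only the first match on each line is replaced.
first_only=$(sed 's/Unix/UNIX/' "$workdir/file1.txt")
# With g every match on the line is replaced.
all_matches=$(sed 's/Unix/UNIX/g' "$workdir/file1.txt")
# -n together with p prints only the selected lines.
unix_lines=$(sed -n '/^Unix/p' "$workdir/file1.txt")

printf '%s\n' "$first_only"
```

None of these commands modifies file1.txt; sed writes the edited text to standard output.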

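■ tr

The tr command is a filter that translates or deletes characters.

(Examples)
$ tr e x < file1.txt
Replace every 'e' in file1.txt with 'x' and print the result.

$ tr '[a-z]' '[A-Z]' < file1.txt
Convert the lowercase letters in file1.txt to uppercase and print the result.

$ tr ' ' '\11' < file1.txt
Replace every space in file1.txt with a tab (octal 11) and print the result.

$ tr -s ' ' '\11' < file1.txt
Replace each run of consecutive spaces in file1.txt with a single tab and print the result.

$ tr -d ' ' < file1.txt
Delete all spaces from file1.txt and print the result.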
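
A small sketch of the same operations on a string instead of a file. It uses the POSIX character classes [:lower:] and [:upper:], which are an alternative to the [a-z] ranges shown above; the sample text is made up.

```shell
text='hello   big   world'

# Convert lowercase to uppercase with POSIX character classes.
upper=$(printf '%s\n' "$text" | tr '[:lower:]' '[:upper:]')
# Squeeze each run of spaces down to a single space.
squeezed=$(printf '%s\n' "$text" | tr -s ' ')
# Delete every space.
no_spaces=$(printf '%s\n' "$text" | tr -d ' ')

printf '%s\n%s\n%s\n' "$upper" "$squeezed" "$no_spaces"
```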

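■ grep

The grep command searches one or more files for lines that match a pattern.

(Examples)
$ grep '[A-Z]' file1.txt
Print the lines of file1.txt that contain an uppercase letter.

$ grep '[A-Z]...[0-9]' file1.txt
Print the lines of file1.txt that contain an uppercase letter followed by any three characters and then a digit.

$ grep -v 'UNIX' file1.txt
Print the lines of file1.txt that do not contain UNIX.

$ grep -l 'Unix' *.txt
Print the names of the *.txt files that contain Unix.

$ grep -n 'Unix' file1.txt
Print the lines of file1.txt that contain Unix, prefixed with their line numbers.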
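
The difference between a pattern that may match anywhere in a line and one anchored to the start and end of the line is easy to check on a small file. The sample lines are made up, and LC_ALL=C is set here only so that [A-Z] means exactly the ASCII uppercase letters.

```shell
workdir=$(mktemp -d)
printf 'Unix7 rocks\nUnix v7\nlinux 2.6\n' > "$workdir/file1.txt"

# Uppercase letter, any three characters, then a digit, anywhere in the line.
anywhere=$(LC_ALL=C grep -n '[A-Z]...[0-9]' "$workdir/file1.txt")
# Anchored: the line must start with an uppercase letter and end with a digit.
anchored=$(LC_ALL=C grep -n '^[A-Z].*[0-9]$' "$workdir/file1.txt")

printf '%s\n%s\n' "$anywhere" "$anchored"
```

The first pattern matches only line 1 and the second only line 2, even though both lines contain an uppercase letter and a digit.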

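■ sort

The sort command sorts its input lines and prints them.

(Examples)
$ sort < file1.txt
Print the lines of file1.txt in ascending order.

$ sort -u < file1.txt
Print the lines of file1.txt in ascending order, with duplicate lines printed only once.

$ sort -r < file1.txt
Print the lines of file1.txt in descending order.

$ sort file1.txt -o file2.txt
Sort the lines of file1.txt and write the result to file2.txt.

$ sort -n file1.txt
Sort the lines of file1.txt numerically and print them.

$ sort +1n file1.txt
Sort the lines of file1.txt numerically on the 2nd (whitespace-separated) field and print them. The +POS form is the old syntax; the POSIX equivalent is sort -k2n.

$ sort +2n -t: file1.txt
Sort the lines of file1.txt numerically on the 3rd (colon-separated) field and print them. The POSIX equivalent is sort -t: -k3n.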
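
Field-based sorting is usually written with the POSIX -k option today. The sketch below uses -k3,3 to restrict the key to the 3rd field alone (a plain -k3 would extend the key to the end of the line) and contrasts numeric with textual comparison. The sample data is made up.

```shell
workdir=$(mktemp -d)
printf 'b:x:10\na:x:9\nc:x:100\n' > "$workdir/file1.txt"

# -t: sets the delimiter; -k3,3n sorts on field 3 only, numerically.
numeric=$(LC_ALL=C sort -t: -k3,3n "$workdir/file1.txt")
# Without n, field 3 is compared as text, so "10" and "100" sort before "9".
textual=$(LC_ALL=C sort -t: -k3,3 "$workdir/file1.txt")

printf '%s\n\n%s\n' "$numeric" "$textual"
```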

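■ uniq

The uniq command filters adjacent duplicate lines in a file; by default it prints each run of identical lines once.

(Syntax) uniq in_file out_file

(Examples)
$ uniq file1.txt
Print file1.txt with each run of adjacent duplicate lines collapsed into one.

$ uniq -d file1.txt
Print only the lines of file1.txt that are repeated, once each.

$ uniq -c file1.txt
Print each line of file1.txt prefixed with the number of times it occurs consecutively.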
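
Because uniq compares only neighbouring lines, it is normally fed sorted input. A minimal demonstration with made-up data:

```shell
workdir=$(mktemp -d)
printf 'a\nb\na\na\n' > "$workdir/file1.txt"

# Only adjacent duplicates are collapsed: the first "a" stays separate.
unsorted=$(uniq "$workdir/file1.txt")
# Sorting first brings all duplicates together.
sorted_dups=$(sort "$workdir/file1.txt" | uniq -d)

printf '%s\n\n%s\n' "$unsorted" "$sorted_dups"
```

On the unsorted file uniq still prints a, b, a; after sorting, uniq -d reports a as the only repeated line.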

※ Note: this article was translated and adapted from 《UNIXツール

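The built-in grep of Vim7

Vim7 has grep functionality built in.

To get the equivalent of the Linux/Unix grep command from inside Vim7, use the :vimgrep command.

1. Search all PHP files in the current directory for lines containing a string (for example 'mb_convert')
:vimgrep /mb_convert/ *.php

2. With the j flag, search without jumping to the first match (only the Quickfix list is updated)
:vimgrep /mb_convert/j *.php

3. Search recursively (including subdirectories)
:vimgrep /mb_convert/j **/*.php

The results are shown in the Quickfix list.
The :copen command opens the Quickfix window. You can also append | cwin to the search to open the window right away:
:vimgrep /mb_convert/j **/*.php | cwin

The :ccl command closes the Quickfix window. For more, see :he quickfix-window

:grep can be used in place of :vimgrep with the following setting:
:set grepprg=internal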

Changing the maximum number of open files per process on Solaris

Run the following command on a Solaris system:

# ulimit -a
core file size (blocks) unlimited
data seg size (kbytes) unlimited
file size (blocks) unlimited
open files 256
pipe size (512 bytes) 10
stack size (kbytes) 8192
cpu time (seconds) unlimited
max user processes 15877
virtual memory (kbytes) unlimited

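As shown, the maximum number of open files per process is 256. Sometimes this number needs to be changed.

How to change it

Edit /etc/system and append the following line at the end:

set rlim_fd_max=1024

rlim_fd_max sets the hard limit; the soft limit that ulimit reports by default is set the same way with rlim_fd_cur. Choose the number according to your needs. The system must be rebooted for the change to take effect.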
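
To see the limits a running shell actually has, the soft and hard values can be queried separately. This is a sketch that assumes a shell whose ulimit built-in accepts -S, -H and -n (ksh, bash and similar); the value 64 is arbitrary.

```shell
# Soft limit: the value currently enforced for this shell and its children.
soft=$(ulimit -S -n)
# Hard limit: the ceiling up to which the soft limit may be raised.
hard=$(ulimit -H -n)
echo "open files: soft=$soft hard=$hard"

# Lower the soft limit inside a subshell only; the parent shell is unaffected.
lowered=$(ulimit -S -n 64 && ulimit -S -n)
echo "inside the subshell: $lowered"
```

A process can lower its soft limit, or raise it up to the hard limit, without a reboot; only changing the system-wide defaults requires editing /etc/system.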

The crontab time format

Examples:
min hour day month weekday  runs at
43 21 * * *                 21:43
15 05 * * *                 05:15
0 17 * * *                  17:00
0 17 * * 1                  17:00 every Monday
0,10 17 * * 0,2,3           17:00 and 17:10 every Sunday, Tuesday and Wednesday
0-10 17 1 * *               every minute from 17:00 to 17:10 on the 1st of each month
0 0 1,15 * 1                0:00 on the 1st and 15th of each month, and on every Monday
42 4 1 * *                  4:42 on the 1st of each month
0 21 * * 1-6                21:00, Monday through Saturday
0,10,20,30,40,50 * * * *    every 10 minutes
*/10 * * * *                every 10 minutes
* 1 * * *                   every minute from 1:00 to 1:59
0 1 * * *                   1:00
0 */1 * * *                 every hour, on the hour
0 * * * *                   every hour, on the hour
2 8-20/3 * * *              8:02, 11:02, 14:02, 17:02 and 20:02
30 5 1,15 * *               5:30 on the 1st and 15th of each month
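
To check how a single field such as 8-20/3 expands, the following rough sketch lists the values it matches. The function name expand_cron_field is hypothetical and this is not how cron itself is implemented; it handles only comma lists, ranges, * and /step, and assumes numbers are written without leading zeros.

```shell
# expand_cron_field FIELD MIN MAX
# Prints the values matched by one crontab field, separated by spaces.
expand_cron_field() {
    field=$1 min=$2 max=$3 out=
    old_ifs=$IFS
    IFS=,
    set -f                      # keep "*" from being expanded as a glob
    for part in $field; do
        step=1
        case $part in
            */*) step=${part#*/}; part=${part%/*} ;;
        esac
        case $part in
            '*') lo=$min; hi=$max ;;
            *-*) lo=${part%-*}; hi=${part#*-} ;;
            *)   lo=$part; hi=$part ;;
        esac
        v=$lo
        while [ "$v" -le "$hi" ]; do
            out="$out $v"
            v=$((v + step))
        done
    done
    set +f
    IFS=$old_ifs
    echo "${out# }"
}

expand_cron_field '8-20/3' 0 23    # prints: 8 11 14 17 20
expand_cron_field '*/10' 0 59      # prints: 0 10 20 30 40 50
```

MIN and MAX are the bounds of the field being expanded, for example 0 and 59 for minutes or 0 and 23 for hours.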

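Deleting files older than n minutes

(Example) Delete the files in a directory that were last accessed more than 10 minutes ago (-amin tests the access time; use -mmin to test the modification time instead):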

find /filepath -type f -amin +10 -exec rm {} \;
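
Before deleting anything it is worth listing what would be removed. The sketch below does a dry run with -print and then deletes, using -mmin (modification time). It assumes GNU find and GNU touch: -amin/-mmin and the relative date given to touch -d are GNU extensions, and the file names are made up.

```shell
workdir=$(mktemp -d)
touch "$workdir/new.tmp"
# GNU touch: back-date a file by 20 minutes.
touch -d '20 minutes ago' "$workdir/old.tmp"

# Dry run: list the candidates before adding -exec rm.
candidates=$(find "$workdir" -type f -mmin +10 -print)
printf 'would delete: %s\n' "$candidates"

# Delete for real; "-exec ... {} +" passes many file names to one rm call.
find "$workdir" -type f -mmin +10 -exec rm -f {} +
remaining=$(ls "$workdir")
printf 'left over: %s\n' "$remaining"
```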

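Telling Chinese from English with a regular expression

The command, including the regular expression, is:

iconv -f gbk -t utf-8 query_list |egrep -e "^[a-z0-9]*$"

It converts query_list from GBK to UTF-8 and keeps only the lines made up entirely of lowercase ASCII letters and digits, so lines containing Chinese characters are dropped. For a detailed explanation of this command, see the blog of 车东 (Che Dong).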
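
The filtering step can be tried on its own with a few made-up lines that are already UTF-8, so no iconv is needed. Here + is used instead of * so that empty lines do not count as matches, and -v selects the complementary set.

```shell
# Sample query list (made-up data): two ASCII-only lines and one Chinese line.
queries=$(printf 'linux2\n中文\nabc\n')

# Lines consisting only of lowercase ASCII letters and digits.
ascii_only=$(printf '%s\n' "$queries" | grep -E '^[a-z0-9]+$')
# Everything else, here the line with Chinese characters.
non_ascii=$(printf '%s\n' "$queries" | grep -E -v '^[a-z0-9]*$')

printf '%s\n' "$ascii_only"
printf '%s\n' "$non_ascii"
```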

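Using tar together with gzip on Solaris

Solaris commands are sometimes less convenient than their Linux counterparts. The tar command, for example, has no -z option, so creating a compressed archive takes more work: tar has to be combined with gzip to get the effect of -z.

Here is a summary of how to combine tar and gzip.

Archive and compress

$ tar cvf - test | gzip -c > test.tar.gz

Decompress and unpack the archive (any one of the following)

$ gzip -d -c test.tar.gz | tar xvf -
$ gunzip -c test.tar.gz | tar xvf -
$ zcat test.tar.gz | tar xovf -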
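
A full round trip through the pipe can be verified in a temporary directory. The directory layout and file contents below are made up; the tar t step, which lists an archive without unpacking it, is an addition to the commands above.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/src/test" "$workdir/dst"
echo 'hello' > "$workdir/src/test/a.txt"

# Archive and compress through a pipe (no -z needed).
(cd "$workdir/src" && tar cf - test | gzip -c > "$workdir/test.tar.gz")
# List the archive contents without extracting.
listing=$(gzip -dc "$workdir/test.tar.gz" | tar tf -)
# Decompress and unpack into another directory.
(cd "$workdir/dst" && gzip -dc "$workdir/test.tar.gz" | tar xf -)
restored=$(cat "$workdir/dst/test/a.txt")

printf '%s\n' "$listing"
printf '%s\n' "$restored"
```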

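A concise Solaris installation guide

Translated from: http://www.unix-power.jp/solaris/install.html

■ Setting up the development environment

Solaris does not install a development environment by default, so you have to install one yourself after installing the OS. Binary packages are available from Freeware for Solaris, where a complete development toolchain can be downloaded.

Below is the list of software I installed.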

autoconf-2.53-sol8-sparc-local.gz
automake-1.6-sol8-sparc-local.gz
bash-2.05-sol8-sparc-local.gz
bc-1.06-sol8-sparc-local.gz
binutils-2.11.2-sol8-sparc-local.gz
bison-1.34-sol8-sparc-local.gz
bzip2-1.0.1-sol8-sparc-local.gz
cpio-2.4.2-sol8-sparc-local.gz
fileutils-4.1-sol8-sparc-local.gz
findutils-4.1-sol8-sparc-local.gz
flex-2.5.4a-sol8-sparc-local.gz
gawk-3.1.0-sol8-sparc-local.gz
gcc-2.95.3-sol8-sparc-local.gz
gdb-5.0-sol8-sparc-local.gz
gzip-1.3.3
gdbm-1.8.0-sol8-sparc-local.gz
gettext-0.10.37-sol8-sparc-local.gz
libgcj-2.95.1-sol8-sparc-local.gz
libpcap-0.6.2-sol8-sparc-local.gz
libtool-1.4-sol8-sparc-local.gz
m4-1.4-sol8-sparc-local.gz
make-3.79.1-sol8-sparc-local.gz
md5-6142000-sol8-sparc-local.gz
ntp-4.1.1a-sol8-sparc-local.gz
proftpd-1.2.1-sol8-sparc-local.gz
readline-4.2-sol8-sparc-local.gz
sed-3.02-sol8-sparc-local.gz
tar-1.13.19-sol8-sparc-local.gz
texinfo-4.0-sol8-sparc-local.gz
zlib-1.1.4-sol8-sparc-local.gz


Install the packages with the following commands:

$ gunzip ******.gz
$ pkgadd -d ./******


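■ Log output

By default Solaris logs relatively little, which makes it hard to keep track of the system's state, so I recommend deliberately increasing the log output. To do so, append the following to /etc/syslog.conf: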
mail.debug               /var/log/maillog
auth.debug               /var/log/authlog
news.debug               /var/log/newslog
daemon.debug       /var/log/daemonlog
cron.debug               /var/log/cronlog

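Note that the whitespace in the entries above must be tabs.

Once logging is increased, the logs keep piling up, so they need to be rotated and cleaned up regularly. By default Solaris rotates its logs periodically with the shell script /usr/lib/newsyslog, but I prefer to rotate them with my own Perl scripts run from cron.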

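Log rotation script for /var/adm
#!/usr/bin/perl
# Rotate the logs under /var/adm, keeping generations 0 (newest) to 5 (oldest).

$log='/var/adm';
@log_file=('lastlog','messages','sulog','utmpx','wtmpx','vold.log');
$count1=0;      # suffix of the newest generation
$count2=5;      # suffix of the oldest generation that is kept

for $i (@log_file){
        # Discard the oldest generation.
        if(-f "$log/$i$count2.gz"){
                system("rm -rf $log/$i$count2.gz");
        }
        # Shift every remaining generation up by one, oldest first.
        for($j=4;$j>=0;$j--){
                if(-f "$log/$i$j.gz"){
                        $target=$j+1;
                        system("mv $log/$i$j.gz $log/$i$target.gz");
                }
        }

        # Pick up a compressed log left over from an earlier run, if any.
        if(-f "$log/$i.gz"){
                system("mv $log/$i.gz $log/$i$count1.gz");
        }

        # Compress the current log as generation 0, then recreate it empty.
        system("/usr/local/bin/gzip $log/$i");
        system("mv $log/$i.gz $log/$i$count1.gz");
        system("cp /dev/null $log/$i");
        system("chmod 644 $log/$i");
}

# Restart syslog so that it reopens the recreated log files.
system("/etc/init.d/syslog stop;/etc/init.d/syslog start");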


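Log rotation script for /var/log
#!/usr/bin/perl
# Rotate the logs under /var/log, keeping generations 0 (newest) to 7 (oldest).

$log='/var/log';
@log_file=('authlog','cronlog','daemonlog','sshd.log','syslog','tcpd.log');
$count1=0;      # suffix of the newest generation
$count2=7;      # suffix of the oldest generation that is kept

for $i (@log_file){
        # Discard the oldest generation.
        if(-f "$log/$i$count2.gz"){
                system("rm -rf $log/$i$count2.gz");
        }
        # Shift every remaining generation up by one, oldest first.
        for($j=7;$j>=0;$j--){
                if(-f "$log/$i$j.gz"){
                        $target=$j+1;
                        system("mv $log/$i$j.gz $log/$i$target.gz");
                }
        }
        # Pick up a compressed log left over from an earlier run, if any.
        if(-f "$log/$i.gz"){
                system("mv $log/$i.gz $log/$i$count1.gz");
        }
        # Compress the current log as generation 0, then recreate it empty.
        system("/usr/local/bin/gzip $log/$i");
        system("mv $log/$i.gz $log/$i$count1.gz");
        system("cp /dev/null $log/$i");
        system("chmod 644 $log/$i");
}

# Restart syslog so that it reopens the recreated log files.
system("/etc/init.d/syslog stop;/etc/init.d/syslog start");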


Save the scripts above under suitable names in a suitable location, make them executable, and add them to /var/spool/cron/crontabs/root so that they run periodically.
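
For comparison, the same numbering scheme can be sketched as a shell function. This is only an illustration and not part of the original guide: the name rotate_log is hypothetical, and unlike the Perl scripts it truncates the log in place instead of recreating it and leaves restarting syslog to the caller.

```shell
# rotate_log FILE KEEP
# Keeps FILE0.gz (newest) through FILE<KEEP>.gz (oldest).
rotate_log() {
    file=$1 keep=$2
    rm -f "$file$keep.gz"                 # drop the oldest generation
    n=$((keep - 1))
    while [ "$n" -ge 0 ]; do              # shift N.gz to N+1.gz, oldest first
        if [ -f "$file$n.gz" ]; then
            mv "$file$n.gz" "$file$((n + 1)).gz"
        fi
        n=$((n - 1))
    done
    gzip -c "$file" > "${file}0.gz"       # compress the live log as generation 0
    : > "$file"                           # truncate the live log in place
    chmod 644 "$file"
}

# Try it on a scratch file rather than on a real log.
demo=$(mktemp -d)
echo 'first entry' > "$demo/messages"
rotate_log "$demo/messages" 5
ls "$demo"
```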

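■ Network settings

The network configuration files are:

/etc/netmasks
/etc/resolv.conf
/etc/defaultrouter
/etc/nodename
/etc/hosts
/etc/hostname.xxx


/etc/netmasks
192.168.0.0 255.255.255.0

This is the network address and its subnet mask. It should already be set this way by default.

/etc/resolv.conf
domain yourdomain.com
nameserver 210.xxx.xxx.xxx
nameserver 203.xxx.xxx.xxx

Set the domain the host belongs to and the IP addresses of the name servers (DNS) as shown above. If there is no secondary name server, its line can simply be omitted. One small inconvenience: this file does not exist by default and has to be created.

/etc/defaultrouter
192.168.0.1

Set the IP address of the router (gateway) you use, as shown above. People on dial-up connections often get stuck here. Without this setting the local network (LAN) is still reachable, but external networks generally are not. This file does not exist by default and has to be created.

/etc/hosts
192.168.0.3 host2
192.168.0.4 host3


This file lists the machines on the local network. By default the local host's own name is given the alias loghost, which is nothing to worry about: it is only an alias, used when configuring syslog. If you do not need it, loghost can be deleted without harm.

/etc/hostname.xxx
hostname

xxx varies with the device and architecture: on SPARC it is typically hme0, while a 3Com card on Intel is elxl0, and so on. This file sets the host name. It is configured by default.

Restoring the default iptables settings

Source: http://www.Unlinux.com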

#
# reset the default policies in the filter table.
#
/usr/local/sbin/iptables -P INPUT ACCEPT
/usr/local/sbin/iptables -P FORWARD ACCEPT
/usr/local/sbin/iptables -P OUTPUT ACCEPT


#
# reset the default policies in the nat table.
#
/usr/local/sbin/iptables -t nat -P PREROUTING ACCEPT
/usr/local/sbin/iptables -t nat -P POSTROUTING ACCEPT
/usr/local/sbin/iptables -t nat -P OUTPUT ACCEPT


#
# flush all the rules in the filter and nat tables.
#
/usr/local/sbin/iptables -F
/usr/local/sbin/iptables -t nat -F


#
# erase all chains that are not default in the filter and nat tables.
#
/usr/local/sbin/iptables -X
/usr/local/sbin/iptables -t nat -X

[Repost] Using the xargs command (in English)

Original article: http://www.unixreview.com/documents/s=8274/sam0306g/
Author: Ed Schaefer

(July 2003 — see Web-exclusive update at end of article)

Many UNIX professionals think the xargs command, construct and execute argument lists, is only useful for processing long lists of files generated by the find command. While xargs dutifully serves this purpose, xargs has other uses. In this article, I describe xargs and the historical "Too many arguments" problem, and present eight xargs "one-liners":

  • Find the unique owners of all the files in a directory.
  • Echo each file to standard output as it deletes.
  • Duplicate the current directory structure to another directory.
  • Group the output of multiple UNIX commands on one line.
  • Display to standard output the contents of a file one word per line.
  • Prompt the user whether to remove each file individually.
  • Concatenate the contents of the files whose names are contained in file into another file.
  • Move all files from one directory to another directory, echoing each move to standard output as it happens.

Examining the "Too Many Arguments" Problem

In the early days of UNIX/xenix, it was easy to overflow the command-line buffer, causing a "Too many arguments" failure. Finding a large number of files and piping them to another command was enough to cause the failure. Executing the following command, from Unix Power Tools, first edition (O'Reilly & Associates):

pr -n `find . -type f -mtime -1 -print`|lpr

will potentially overflow the command line given enough files. This command provides a list of all the files edited today to pr, and pipes pr's output to the printer. We can solve this problem with xargs:

find . -type f -mtime -1 -print|xargs pr -n |lp

With no options, xargs reads standard input, but only writes enough arguments to standard output as to not overflow the command-line buffer. Thus, if needed, xargs forces multiple executions of pr -n|lp.

While xargs controls overflowing the command-line buffer, the command xargs services may overflow. I've witnessed the following mv command fail -- not the command-line buffer -- with an argument list too long error:

find ./ -type f -print | xargs -i mv -f {} ./newdir

Limit the number of files sent to mv at a time by using the xargs -l option. (The xargs -i {} syntax is explained later in the article.) The following command sets a limit of 56 files at a time, which mv receives:

find ./ -type f -print | xargs -l56 -i mv -f {} ./newdir

The modern UNIX OS seems to have solved the problem of the find command overflowing the command-line buffer. However, using the find -exec command is still troublesome. It's better to do this:

# remove all files with a txt extension

find . -type f -name "*.txt" -print|xargs rm

than this:

find . -type f -name "*.txt" -exec rm {} \; -print

Controlling the call to rm with xargs is more efficient than having the find command execute rm for each object found.

xargs One-Liners

The find-xargs command combination is a powerful tool. The following example finds the unique owners of all the files in the /bin directory:

# all on one line
find /bin -type f -follow | xargs ls -al | awk ' NF==9 { print $3 }'|sort -u

If /bin is a soft link, as it is with Solaris, the -follow option forces find to follow the link. The xargs command feeds the ls -al command, which pipes to awk. If the output of the ls -al command is 9 fields, print field 3 -- the file owner. Sorting the awk output with sort -u ensures unique owners.

You can use xargs options to build extremely powerful commands. Expanding the xargs/rm example, let's assume the requirement exists to echo each file to standard output as it deletes:

find . -type f -name "*.txt" | xargs -i ksh -c "echo deleting {}; rm {}"

The xargs -i option replaces instances of {} in a command (i.e., echo and rm are commands).

Conversely, instead of using the -i option with {}, the xargs -I option replaces instances of a string. The above command can be written as:

find . -type f -name "*.txt" | xargs -I {} ksh -c "echo deleting {}; rm {}"

The new, third edition of Unix Power Tools by Powers et al. provides an xargs "one-liner" that duplicates a directory tree. The following command creates in the /usr/project directory a copy of the current working directory structure:

find . -type d -print|sed 's@^@/usr/project/@'|xargs mkdir

The /usr/project directory must exist. When executing, note the error:

mkdir: Failed to make directory "/usr/project/"; File exists

which doesn't prevent the directory structure creation. Ignore it. To learn how the above command works, you can read more in Unix Power Tools, third edition, Chapter 9.17 (O'Reilly & Associates).

In addition to serving the find command, xargs can be a slave to other commands. Suppose the requirement is to group the output of UNIX commands on one line. Executing:

logname; date

displays the logname and date on two separate lines. Placing commands in parentheses and piping to xargs places the output of both commands on one line:

(logname; date)|xargs

Executing the following command places all the file names in the current directory on one line, and redirects to file "file.ls":

ls |xargs echo > file.ls

Use the xargs number of arguments option, -n, to display the contents of "file.ls" to standard output, one name per line:

cat file.ls|xargs -n1

In the current directory, use the xargs -p option to prompt the user to remove each file individually:

# from Unix in a Nutshell
ls|xargs -p -n1 rm

Without the -n option, the user is prompted to delete all the files in the current directory.

Concatenate the contents of all the files whose names are contained in file:

xargs cat < file > file.contents

into file.contents.

Move all files from directory $1 to directory $2, and use the xargs -t option to echo each move as it happens:

ls $1 | xargs -I {} -t mv $1/{} $2/{}

The xargs -I argument replaces each {} in the string with each object piped to xargs.

Conclusion

When should you use xargs? When the output of a command is the command-line options of another command, use xargs in conjunction with pipes. When the output of a command is the input of another command, use pipes.

References

Powers, Shelley, Peek, Jerry, et al. 2003. Unix Power Tools. Sebastopol, CA: O'Reilly & Associates.

Robbins, Arnold. 1999. Unix in a Nutshell. Sebastopol, CA: O'Reilly & Associates.

Ed Schaefer is a frequent contributor to Sys Admin. He is a software developer and DBA for Intel's Factory Integrated Information Systems, FIIS, in Aloha, Oregon. Ed also hosts the monthly Shell Corner column on UnixReview.com. He can be reached at: shellcorner@comcast.net.

July 2003 UPDATE from the author:

I've received very positive feedback on my xargs article. Other readers have shared constructive criticism concerning:

1. When using the duplicate directory tree "one-liner", reader Peter Ludemann suggests using the mkdir -p option:

find . -type d -print|sed 's@^@/usr/project/@'|xargs mkdir -p

instead of :

find . -type d -print|sed 's@^@/usr/project/@'|xargs mkdir

mkdir's "-p" option creates parent directories as needed, and doesn't error out if one exists. Additionally, /usr/project does not have to exist.

2. Ludemann, in addition to reader Christer Jansson, commented that spaces in directory names render the duplicate directory tree completely useless.

Although I'm unable to salvage the duplicate directory command, for those find and xargs versions that support -0 (probably GNU versions only), you might try experimenting with:

find ... -print0 | xargs -0 ...

Using Ludemann's email example, suppose your current directory structure contains:
foo
bar
foo bar

find . -type f -print | xargs -n 1 incorrectly produces:
foo
bar
foo
bar

while find . -type f -print0 | xargs -0 -n 1 delivers the correct results:
foo
bar
foo bar

According to the 7.1 Red Hat Linux man page for xargs and find, the -0 option uses the null character as the file name terminator, disabling the special meaning of white space.
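
The effect can be checked by counting the arguments xargs produces in each mode. This sketch is not from the original article; it assumes find and xargs versions that support -print0 and -0, and a temporary directory whose path contains no white space.

```shell
workdir=$(mktemp -d)
touch "$workdir/foo" "$workdir/bar" "$workdir/foo bar"

# Newline-separated names: xargs splits "foo bar" into two arguments.
split_count=$(find "$workdir" -type f -print | xargs -n 1 echo | wc -l)
# NUL-separated names: each file name arrives as exactly one argument.
safe_count=$(find "$workdir" -type f -print0 | xargs -0 -n 1 echo | wc -l)

echo "without -0: $split_count arguments, with -0: $safe_count arguments"
```

Three files yield four arguments without -0 and three with it.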

3. Reader Peter Simpkin asks the question, "Does the use of xargs only operate after the find command has completed?

find . -type f -name "*.txt" -print | xargs rm

If not, I was under the impression that the above was a bad idea as it is modifying the current directory that find is working from, or at least this is what people have told me, and, thus the results of find are then undefined."

My response is "no". Any Unix command that supports command-line arguments is an xargs candidate. The results of the find command are as valid as the output of the ls command:
# remove files ending with .txt in current directory

ls *.txt|xargs rm

If a command such as this is valid:

chmod 444 1.txt 2.txt 3.txt

then:

find . \( -name 1.txt -o -name 2.txt -o -name 3.txt \) -print|xargs chmod 444

is valid.

In closing, if I had the opportunity to rewrite "Using the Xargs Command", it would look somewhat different.


One application I came up with after reading this article:

ls -1 *200603*.log | xargs -i tar -zcvf {}.tar.gz {}

This compresses the March log files, one archive per daily log.