w3resource

Linux Basic Unix tools

Introduction

This session introduces commands to find and locate files and to compress files, together with other common tools that were not discussed before. While the tools discussed here are technically not considered filters, they can be used in pipes.

find

The find command can be very useful at the start of a pipe to search for files. You might want to add 2>/dev/null to the command lines to avoid cluttering your screen with error messages. Here are some examples.

Find all files in /etc and put the list in etcfiles.txt

find /etc > etcfiles.txt

The output is shown below.

datasoft @ datasoft-linux ~$ cat etcfiles.txt | more
/etc
/etc/pm
/etc/pm/sleep.d
/etc/pm/sleep.d/10_grub-common
/etc/pm/sleep.d/10_unattended-upgrades-hibernate
/etc/pm/sleep.d/novatel_3g_suspend
/etc/pm/power.d
/etc/pm/config.d
/etc/mtab.fuselock
/etc/hp
/etc/hp/hplip.conf
/etc/kernel
/etc/kernel/postrm.d
/etc/kernel/postrm.d/initramfs-tools
/etc/kernel/postrm.d/zz-update-grub
/etc/kernel/postinst.d
/etc/kernel/postinst.d/initramfs-tools
/etc/kernel/postinst.d/update-notifier
/etc/kernel/postinst.d/apt-auto-removal
/etc/kernel/postinst.d/zz-update-grub
/etc/kernel/postinst.d/pm-utils
/etc/insserv
/etc/insserv/overrides
--More--

Find all files of the entire system and put the list in allfiles.txt

find / > allfiles.txt

Find files that end in .conf in the current directory (and all subdirs).

find . -name "*.conf"

The output is shown below

datasoft @ datasoft-linux /$ find . -name "*.conf" | more
find: `./proc/37/map_files': Permission denied
find: `./proc/37/fdinfo': Permission denied
find: `./proc/37/ns': Permission denied
find: `./proc/49/task/49/fd': Permission denied
find: `./proc/49/task/49/fdinfo': Permission denied
find: `./proc/49/task/49/ns': Permission denied
find: `./proc/49/fd': Permission denied
find: `./proc/49/map_files': Permission denied
find: `./proc/49/fdinfo': Permission denied
find: `./proc/49/ns': Permission denied
find: `./proc/52/task/52/fd': Permission denied
find: `./proc/52/task/52/fdinfo': Permission denied
find: `./proc/52/task/52/ns': Permission denied
find: `./proc/52/fd': Permission denied
find: `./proc/52/map_files': Permission denied
find: `./proc/52/fdinfo': Permission denied
find: `./proc/52/ns': Permission denied
find: `./proc/53/task/53/fd': Permission denied
find: `./proc/53/task/53/fdinfo': Permission denied
find: `./proc/53/task/53/ns': Permission denied
find: `./proc/53/fd': Permission denied
find: `./proc/53/map_files': Permission denied
find: `./proc/53/fdinfo': Permission denied
find: `./proc/53/ns': Permission denied
...
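
The Permission denied messages in the output above are written to standard error, so they can be discarded with 2>/dev/null while the matches still reach the pipe. A minimal sketch, assuming a scratch directory created with mktemp; the file names and the deliberately missing start point are made up for illustration:

```shell
# Scratch directory so the example does not touch real system files (hypothetical names).
sandbox=$(mktemp -d)
mkdir -p "$sandbox/app"
touch "$sandbox/app/server.conf" "$sandbox/app/notes.txt"

# The second start point does not exist, so find reports an error on stderr.
# 2>/dev/null discards that message; matches on stdout are unaffected.
# find exits non-zero after an error, so "|| true" keeps "set -e" scripts running.
found=$(find "$sandbox/app" "$sandbox/missing" -name "*.conf" 2>/dev/null || true)
echo "$found"

rm -rf "$sandbox"
```

Only the path of server.conf is printed; the error about the missing directory never reaches the screen.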

Find regular files (not directories, pipes, etc.) whose names end in .conf.

find . -type f -name "*.conf"

The output is shown below

datasoft @ datasoft-linux /$ find . -type f -name "*.conf" | more
find: `./proc/37/map_files': Permission denied
find: `./proc/37/fdinfo': Permission denied
find: `./proc/37/ns': Permission denied
find: `./proc/49/task/49/fd': Permission denied
find: `./proc/49/task/49/fdinfo': Permission denied
find: `./proc/49/task/49/ns': Permission denied
find: `./proc/49/fd': Permission denied
find: `./proc/49/map_files': Permission denied
find: `./proc/49/fdinfo': Permission denied
find: `./proc/49/ns': Permission denied
...
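
The effect of -type f is easiest to see when a directory happens to have a matching name. The following is a small sketch in a scratch directory; old.conf and new.conf are hypothetical names chosen for the illustration:

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/old.conf"     # a directory whose name ends in .conf
touch "$sandbox/new.conf"     # a regular file whose name ends in .conf

# Without -type, both the directory and the file match the name pattern.
all=$(find "$sandbox" -name "*.conf" | wc -l)
# With -type f, only the regular file is reported.
files=$(find "$sandbox" -type f -name "*.conf" | wc -l)
echo "any type: $all, regular files only: $files"

rm -rf "$sandbox"
```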

Find directories whose names end in .bak.

find /data -type d -name "*.bak"

The output is shown below

datasoft @ datasoft-linux /$ find /data -type d -name "*.bak" | more
find: `/data': No such file or directory

Find files that are newer than file42.txt

find . -newer file42.txt

The output is shown below

 datasoft @ datasoft-linux /$ find . -newer file42.txt | more
find: `file42.txt': No such file or directory
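
The error above only means that file42.txt does not exist in the directory where the command was run. To see -newer at work, here is a sketch that creates three files with known modification times using touch -t; all file names and dates are assumptions made for the example:

```shell
sandbox=$(mktemp -d)
touch -t 201401010000 "$sandbox/older.txt"     # modified 1 January 2014
touch -t 201406010000 "$sandbox/file42.txt"    # the reference file
touch -t 201408010000 "$sandbox/newer.txt"     # modified 1 August 2014

# -type f leaves out the directory itself, whose modification time is "now".
newer=$(find "$sandbox" -type f -newer "$sandbox/file42.txt")
echo "$newer"

rm -rf "$sandbox"
```

Only newer.txt is listed, because it is the only regular file modified after file42.txt.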

Find can also execute another command on every file found. This example will look for *.odf files and copy them to /backup/.

find /data -name "*.odf" -exec cp {} /backup/ \;

The output is shown below.

datasoft @ datasoft-linux /$ find /data -name "*.odf" -exec cp {} /backup/ \; | more
find: `/data': No such file or directory

Find can also execute another command on every file found only after your confirmation: use -ok in place of -exec. For example, -ok rm {} \; removes each *.odf file found only if you approve it.
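
Because /data does not exist on the machine used above, the copy never ran. The same -exec cp command can be tried safely in a scratch directory; the data and backup subdirectories and the file names below are stand-ins invented for this sketch:

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/data/reports" "$sandbox/backup"
touch "$sandbox/data/a.odf" "$sandbox/data/reports/b.odf" "$sandbox/data/c.txt"

# find runs cp once per match; {} is replaced by the path of the file found.
find "$sandbox/data" -name "*.odf" -exec cp {} "$sandbox/backup/" \;

copied=$(ls "$sandbox/backup")
echo "$copied"

rm -rf "$sandbox"
```

Both .odf files end up in the backup directory, including the one found in a subdirectory, while c.txt is left alone.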

locate

The locate tool is very different from find in that it uses an index to locate files. This is a lot faster than traversing all the directories, but it also means that the results can be out of date: files created or removed since the index was last built are not reflected. If the index does not exist yet, you have to create it with the updatedb command (as root on Red Hat Enterprise Linux).

 datasoft @ datasoft-linux /$ locate samba | more
/etc/samba
/etc/apparmor.d/abstractions/samba
/etc/dhcp/dhclient-enter-hooks.d/samba
/etc/pam.d/samba
/etc/samba/gdbcommands
/etc/samba/smb.conf
/etc/samba/tls
/usr/bin/samba-regedit
/usr/bin/samba-tool
/usr/lib/samba
/usr/lib/2013.com.canonical.certification:checkbox/bin/samba_test
/usr/lib/i386-linux-gnu/libsamba-credentials.so.0
/usr/lib/i386-linux-gnu/libsamba-credentials.so.0.0.1
/usr/lib/i386-linux-gnu/libsamba-hostconfig.so.0
/usr/lib/i386-linux-gnu/libsamba-hostconfig.so.0.0.1
/usr/lib/i386-linux-gnu/libsamba-policy.so.0
/usr/lib/i386-linux-gnu/libsamba-policy.so.0.0.1
/usr/lib/i386-linux-gnu/libsamba-util.so.0
/usr/lib/i386-linux-gnu/libsamba-util.so.0.0.1
/usr/lib/i386-linux-gnu/samba
/usr/lib/i386-linux-gnu/samba/auth
/usr/lib/i386-linux-gnu/samba/bind9
/usr/lib/i386-linux-gnu/samba/gensec
--More--

Most Linux distributions schedule updatedb to run once every day.

sleep

The sleep command suspends execution for at least the number of seconds given as its operand. The following example shows a six-second sleep.

datasoft @ datasoft-linux /$ sleep 6
datasoft @ datasoft-linux /$ 
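
In a script, the "at least" behaviour of sleep can be checked by comparing timestamps taken before and after the pause. This is only a rough sketch; the one-second duration and the variable names are arbitrary choices:

```shell
start=$(date +%s)     # seconds since the epoch, before sleeping
sleep 1
end=$(date +%s)       # seconds since the epoch, after sleeping
elapsed=$((end - start))
echo "slept for about $elapsed second(s)"
```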

time

The time command can display how long it takes to execute a command. In the following example the date command takes only a little time to execute.

datasoft @ datasoft-linux /$ time date
Tue Aug  5 17:22:56 IST 2014

real	0m0.001s
user	0m0.000s
sys	0m0.000s

In the following example the sleep 5 command takes five real seconds to execute, but consumes little cpu time.

datasoft @ datasoft-linux /$ time sleep 5

real	0m5.001s
user	0m0.000s
sys	0m0.000s

This bzip2 command compresses a file. Compression is CPU-intensive on large files, but the file used here is so small that the measured times are negligible.

datasoft @ datasoft-linux /$ time bzip2 text.txt

real	0m0.021s
user	0m0.000s
sys	0m0.000s

gzip

The gzip command reduces the size of the named files using Lempel-Ziv coding (LZ77). A very small file, like the 22-byte one below, can actually grow because of the gzip header and trailer.

datasoft @ datasoft-linux ~$ ls -lh temp.txt
-rw-rw-r-- 1 datasoft datasoft 22 Aug  2 14:36 temp.txt
 datasoft @ datasoft-linux ~$ gzip temp.txt
 datasoft @ datasoft-linux ~$ ls -lh temp.txt.gz
-rw-rw-r-- 1 datasoft datasoft 49 Aug  2 14:36 temp.txt.gz
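
In the listing above the compressed file (49 bytes) is larger than the original (22 bytes), which is normal for tiny inputs. Compression pays off on larger, repetitive data. The sketch below builds such a file in a scratch directory; the file name big.txt and its contents are assumptions for the illustration:

```shell
sandbox=$(mktemp -d)

# Write 1000 identical lines: highly repetitive, so it should compress well.
i=0
while [ "$i" -lt 1000 ]; do
    echo "the same line again"
    i=$((i + 1))
done > "$sandbox/big.txt"

before=$(wc -c < "$sandbox/big.txt")
gzip "$sandbox/big.txt"                    # replaces big.txt with big.txt.gz
after=$(wc -c < "$sandbox/big.txt.gz")
echo "before: $before bytes, after: $after bytes"

rm -rf "$sandbox"
```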

gunzip

The gunzip command restores the original file that was compressed by the gzip command.

 datasoft @ datasoft-linux ~$ gunzip temp.txt.gz
 datasoft @ datasoft-linux ~$ ls -lh temp.txt
-rw-rw-r-- 1 datasoft datasoft 22 Aug  2 14:36 temp.txt
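
The listing shows that the size is back to 22 bytes. To confirm that the content is also unchanged after a gzip and gunzip round trip, a copy can be compared with cmp. A minimal sketch, with made-up file contents and a hypothetical original.txt kept as the reference copy:

```shell
sandbox=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$sandbox/temp.txt"
cp "$sandbox/temp.txt" "$sandbox/original.txt"   # reference copy for comparison

gzip "$sandbox/temp.txt"         # temp.txt is replaced by temp.txt.gz
gunzip "$sandbox/temp.txt.gz"    # temp.txt.gz is replaced by temp.txt

# cmp -s is silent and returns success only when the files are byte-identical.
if cmp -s "$sandbox/temp.txt" "$sandbox/original.txt"; then
    result="identical"
else
    result="different"
fi
echo "$result"

rm -rf "$sandbox"
```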

zcat - zmore

Text files that are compressed with gzip can be viewed with zcat and zmore.

 datasoft @ datasoft-linux ~$ head -4 temp.txt

four
three 
two
 datasoft @ datasoft-linux ~$ gzip temp.txt
 datasoft @ datasoft-linux ~$ zcat temp.txt.gz | head -4

four
three 
two
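
Unlike gunzip, zcat writes the uncompressed text to standard output and leaves the .gz file in place. The sketch below checks this in a scratch directory; the file contents are invented for the example:

```shell
sandbox=$(mktemp -d)
printf 'four\nthree\ntwo\none\n' > "$sandbox/temp.txt"
gzip "$sandbox/temp.txt"

# Read the first line without uncompressing the file on disk.
first=$(zcat "$sandbox/temp.txt.gz" | head -n 1)
echo "$first"

# The compressed file is still there and no temp.txt was created.
if [ -f "$sandbox/temp.txt.gz" ] && [ ! -e "$sandbox/temp.txt" ]; then
    state="still compressed"
else
    state="changed"
fi
echo "$state"

rm -rf "$sandbox"
```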

bzip2

The bzip2 command reduces the size of the named files using the Burrows-Wheeler block-sorting text compression algorithm and Huffman coding.

datasoft @ datasoft-linux ~$ bzip2 temp.txt
 datasoft @ datasoft-linux ~$ ls -lh temp.txt.bz2
-rw-rw-r-- 1 datasoft datasoft 59 Aug  2 14:36 temp.txt.bz2

bunzip2

Files can be uncompressed again with bunzip2.

datasoft @ datasoft-linux ~$ bunzip2 temp.txt.bz2
 datasoft @ datasoft-linux ~$ ls -lh temp.txt
-rw-rw-r-- 1 datasoft datasoft 22 Aug  2 14:36 temp.txt

bzcat - bzmore

And in the same way, bzcat and bzmore can display files compressed with bzip2.

datasoft @ datasoft-linux ~$ bzip2 temp.txt
 datasoft @ datasoft-linux ~$ bzcat temp.txt.bz2 | head -4
four
three 
two

Exercise, Practice and Solution:

1. Explain the difference between these two commands. This question is very important. If you don't know the answer, then look back at the shell chapter.

find /data -name "*.txt"

find /data -name *.txt

When *.txt is quoted, the shell will not touch it, and the find tool will look in /data for all files ending in .txt. When *.txt is not quoted, the shell might expand it (when one or more files ending in .txt exist in the current directory). find might then show a different result, or fail with a syntax error.
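
The difference can be reproduced in a scratch directory. In this sketch the directory layout and the names report.txt and notes.txt are assumptions; notes.txt sits in the current directory so that the shell has something to expand the unquoted pattern to:

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/data"
touch "$sandbox/data/report.txt"   # the file we want find to report
touch "$sandbox/notes.txt"         # a .txt file in the current directory

# Quoted: find receives the pattern *.txt and matches data/report.txt.
quoted=$(cd "$sandbox" && find data -name "*.txt")

# Unquoted: the shell first expands *.txt to notes.txt, so find actually
# runs as "find data -name notes.txt" and matches nothing.
unquoted=$(cd "$sandbox" && find data -name *.txt)

echo "quoted:   $quoted"
echo "unquoted: $unquoted"

rm -rf "$sandbox"
```

With two or more .txt files in the current directory, the unquoted form would expand to several words and find would stop with an error instead.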

2. Explain the difference between these two statements. Will they both work when there are 200 .odf files in /data ? How about when there are 2 million .odf files ?

find /data -name "*.odf" > data_odf.txt

find /data/*.odf > data_odf.txt

The first find will output all .odf filenames in /data and all subdirectories, and the shell will redirect this list to a file. In the second command the shell expands /data/*.odf before find runs, so find outputs every entry in /data whose name ends in .odf, plus everything inside directories named *.odf (in /data). Both work with 200 files. With two million files the expanded command line would exceed the maximum argument length the system accepts, and the second command would fail with an "Argument list too long" error without running find at all.
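
The size limit on a command line can be queried with getconf ARG_MAX. The sketch below prints it and then shows one way to list only the top-level .odf files without relying on shell expansion. It assumes a find that supports -maxdepth (a GNU and BSD extension, not part of POSIX), and the directory layout is made up:

```shell
# Upper bound, in bytes, on the arguments plus environment of a new process.
limit=$(getconf ARG_MAX)
echo "argument + environment limit: $limit bytes"

sandbox=$(mktemp -d)
mkdir -p "$sandbox/data/sub"
touch "$sandbox/data/top.odf" "$sandbox/data/sub/deep.odf"

# The pattern is quoted, so find does the matching itself and the command
# line stays short no matter how many files exist.
# -maxdepth 1 restricts the search to /data itself (non-POSIX extension).
top_only=$(find "$sandbox/data" -maxdepth 1 -name "*.odf")
echo "$top_only"

rm -rf "$sandbox"
```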

3. Write a find command that finds all files created after January 30th, 2010.

Code:

touch -t 201001302359 marker_date
find . -type f -newer marker_date

There is another solution:

find . -type f -newermt "20100130 23:59:59"

Both solutions compare modification times, which is the closest that -newer gets to a creation date.

4. Write a find command that finds all *.odf files created in September 2009.

Code:

touch -t 200908312359 marker_start
touch -t 200910010000 marker_end
find . -type f -name "*.odf" -newer marker_start ! -newer marker_end

The exclamation mark ! -newer can be read as not newer.
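
The two-marker technique can be verified with files whose modification times are set by hand. This sketch uses the same marker timestamps as the solution; the three .odf file names and their dates are hypothetical:

```shell
sandbox=$(mktemp -d)
touch -t 200908151200 "$sandbox/august.odf"
touch -t 200909151200 "$sandbox/september.odf"
touch -t 200910151200 "$sandbox/october.odf"

# Markers just before and just after September 2009.
touch -t 200908312359 "$sandbox/marker_start"
touch -t 200910010000 "$sandbox/marker_end"

# Newer than the start marker AND not newer than the end marker.
matches=$(find "$sandbox" -type f -name "*.odf" \
    -newer "$sandbox/marker_start" ! -newer "$sandbox/marker_end")
echo "$matches"

rm -rf "$sandbox"
```

Only september.odf falls between the two markers.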

5. Count the number of *.conf files in /etc and all its subdirs.

Code:

find /etc -type f -name '*.conf' | wc -l

6. Two commands that do the same thing: copy *.odf files to /backup/. What would be a reason to replace the first command with the second? Again, this is an important question.

cp -r /data/*.odf /backup/
find /data -name "*.odf" -exec cp {} /backup/ \;

The first might fail when there are too many files to fit on one command line.

7. Create a file called loctest.txt. Can you find this file with locate ? Why not ? How do you make locate find this file ?

You cannot find this file with locate because it is not yet in the index. Run updatedb (as root) to rebuild the index; after that, locate can find the file.

8. Use find and -exec to rename all .htm files to .html.

datasoft @ datasoft-linux ~$ find . -name '*.htm'
./one.htm
./two.htm
datasoft @ datasoft-linux ~$ find . -name '*.htm' -exec mv {} {}l \;
datasoft @ datasoft-linux ~$ find . -name '*.htm*'
./one.html
./two.html
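
The {}l trick above relies on find replacing {} inside a longer argument, which GNU find does but POSIX does not guarantee. A more portable sketch, under the assumption that a POSIX sh is available, passes each path to a small shell command instead; the scratch directory and file names are made up:

```shell
sandbox=$(mktemp -d)
touch "$sandbox/one.htm" "$sandbox/two.htm"

# For each match, sh receives the path as $1 and appends "l" to the name.
# The second "sh" only fills in $0 for the inline script.
find "$sandbox" -type f -name "*.htm" -exec sh -c 'mv "$1" "${1}l"' sh {} \;

renamed=$(ls "$sandbox")
echo "$renamed"

rm -rf "$sandbox"
```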

9. Issue the date command. Now display the date in YYYY/MM/DD format.

Code:

date +%Y/%m/%d

10. Issue the cal command. Display a calendar of 1582 and 1752. Notice anything special ?

cal 1582
cal 1752

The calendars are different depending on the country. Check http://linux-training.be/files/studentfiles/dates.txt
