CT320 Filesystem
Original slides from Dr. James Walden at Northern Kentucky University.
Topics
- Overview
- Pathnames
- Mounting
- Structure
- Organization
- File types
- Kernel Data Structures
Overview
- Basic purpose of the Linux filesystem:
To represent and organize the system’s storage resources, i.e.,
directories and files.
- Because the filesystem is well organized and easy to access,
system programmers use it to represent other resources:
- Processes
- Devices
- Mount Points
- Kernel Data
- Communication Channels
Filesystem Components
- Namespace:
A way of naming things and organizing them in a hierarchy
- Application Programmer Interface:
A set of system calls for navigating and manipulating objects
- Security model:
A scheme for protecting, hiding, and sharing things
- Implementation:
Software that ties the logical model to the actual hardware
Filesystem Types
- ext2, ext3, ext4: Linux
- FAT16, FAT32, NTFS: Windows
- HFS, HFS+: Apple
- XFS: Silicon Graphics
- NFSv2, NFSv3, NFSv4: Sun (distributed)
- ISO 9660, UDF: Optical Discs
https://en.wikipedia.org/wiki/Comparison_of_file_systems
Hierarchy
Files are located by traversing a directory tree:
- Root directory:
Represented by a single forward slash (/)
- Home directory:
Represented by the tilde symbol (~)
- Not really a filesystem thing—a shell thing.
- Working directory:
Processes have a current working directory (.)
- Parent directory:
All directories except root have a parent (..)
- Absolute path:
Pathname all the way from root (/usr/local/bin)
- Relative path:
Pathname from current location (../local/bin)
The Linux filesystem is a single unified hierarchy, unlike Windows.
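The difference between absolute and relative pathnames can be sketched in a few lines of Python. This is only a lexical illustration: the helper name resolve is made up for this example, and it collapses . and .. without consulting the filesystem, so symbolic links are not followed.

```python
import posixpath


def resolve(cwd: str, path: str) -> str:
    """Turn a pathname into a normalized absolute path, purely lexically.

    An absolute path (leading '/') ignores cwd; a relative path is
    interpreted starting from cwd. '.' and '..' components are collapsed
    without looking at the disk.
    """
    return posixpath.normpath(posixpath.join(cwd, path))


# Relative path: interpreted from the working directory.
print(resolve("/usr/share", "../local/bin"))
# Absolute path: the working directory is irrelevant.
print(resolve("/usr/share", "/usr/local/bin"))
# The root directory is its own parent.
print(resolve("/", ".."))
```

Both of the first two calls name the same directory, /usr/local/bin.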
Filenames
- File, filename, path, pathname.
- Files are represented by strings.
- Pathname includes directory structure.
- File and directory names use the same namespace.
- Each component can be up to 255 bytes.
- The entire path can be up to 4095 bytes.
- The Linux find command provides a recursive listing.
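As a rough sketch, a program can ask the kernel for the name and path limits of a particular filesystem, and can walk a tree recursively much as find does. The function name find below is chosen for the analogy only, and the limits reported depend on the filesystem holding /.

```python
import os
import tempfile

# Ask the filesystem that holds "/" for its limits (POSIX pathconf).
name_max = os.pathconf("/", "PC_NAME_MAX")  # longest single component
path_max = os.pathconf("/", "PC_PATH_MAX")  # longest whole pathname
print("component limit:", name_max, "path limit:", path_max)


def find(top):
    """Yield top and every file and directory below it, like `find top`."""
    yield top
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            yield os.path.join(dirpath, name)


# Build a tiny throwaway tree and list it recursively.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "usr", "local"))
    open(os.path.join(top, "usr", "local", "README"), "w").close()
    found = sorted(os.path.relpath(p, top) for p in find(top))

print(found)
```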
Characters in Filenames
- Spaces and other special characters can be used.
- Anything but slash or null is ok.
- If you can sneak your bizarre character past the shell.
- A period often precedes a file extension, but periods aren’t special.
- Except that ls, by default, ignores files starting with one.
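A program that bypasses the shell can test these rules directly. The sketch below uses a hypothetical helper, try_create, and assumes a typical Linux filesystem for the temporary directory. Note that the listing comes from os.listdir, which does not hide names that start with a period the way ls does.

```python
import os
import tempfile


def try_create(directory: str, name: str) -> bool:
    """Try to create an empty file called `name`; report whether it worked."""
    try:
        with open(os.path.join(directory, name), "w"):
            pass
        return True
    except (OSError, ValueError):
        # ValueError: Python refuses a name with an embedded NUL byte.
        # OSError: the kernel rejected the pathname.
        return False


with tempfile.TemporaryDirectory() as d:
    odd = "~!#$^&* ([{<>}])|;'`\":,.?"
    results = {
        "odd": try_create(d, odd),          # shell metacharacters are fine
        "nul": try_create(d, "a\0b"),       # NUL terminates the string
        "slash": try_create(d, "a/b"),      # '/' separates components, so
                                            # this looks for a directory "a"
        "hidden": try_create(d, ".hidden"), # a leading period is ordinary
    }
    listing = sorted(os.listdir(d))

print(results)
print(listing)
```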
Really, anything?
$ touch now then
$ date >n*
$ id >'~!#$^&*\ ([{<>}])|;'\''`":,.?½→☂ 🐟 ψάρι سمك मछली 鱼'
$ ls -lhog
total 8.0K
-rw------- 1 51 Mar 11 16:41 ~!#$^&*\ ([{<>}])|;'`":,.?½→☂ 🐟 ψάρι سمك मछली 鱼
-rw------- 1 29 Mar 11 16:41 now
-rw------- 1 0 Mar 11 16:41 then
Anything‽
$ date >-z
$ ls -l
total 4
-rw------- 1 ct320 class 29 Mar 11 16:41 -z
$ rm -z
rm: invalid option -- 'z'
Try 'rm ./-z' to remove the file '-z'.
Try 'rm --help' for more information.
$ rm '-z'
rm: invalid option -- 'z'
Try 'rm ./-z' to remove the file '-z'.
Try 'rm --help' for more information.
$ rm ./-z
$ rm -- -z
rm: cannot remove '-z': No such file or directory
Mounted Filesystems
Global filesystem contains mounted filesystems:
- Textbook uses “file tree” for global filesystem
- Mounted filesystems are attached to file tree
- Mount over an existing directory; none of this drive letter C: nonsense.
- Mounted filesystems are usually disk partitions
- Also can be network file servers
- Each must obey the proper API
- Attached to the tree with the mount command
Mounting
mount /dev/hd4 /MST3K
- Attaches the filesystem stored on the disk /dev/hd4 at the path /MST3K
- Files that were originally under /MST3K are no longer accessible (until it is unmounted)!
- Filesystem navigation occurs seamlessly
/etc/fstab
- List of filesystems customarily mounted on the system
- Filesystems can be checked using the fsck command
- Shows the layout of the system and allows mount -a (mount everything listed)
- Show mounted filesystems and their types with df -T
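The layout of /etc/fstab is simple enough to parse by hand: whitespace-separated fields, one filesystem per line. The following is a minimal sketch, not how mount itself reads the file. The class name FstabEntry, the function parse_fstab, and every device and path in SAMPLE are invented for illustration.

```python
from typing import List, NamedTuple


class FstabEntry(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: List[str]
    dump: int
    fsck_pass: int


def parse_fstab(text: str) -> List[FstabEntry]:
    """Parse fstab-style text into entries, skipping blanks and comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed line; a real tool would report it
        # The last two fields are optional and default to 0.
        dump = int(fields[4]) if len(fields) > 4 else 0
        fsck_pass = int(fields[5]) if len(fields) > 5 else 0
        entries.append(FstabEntry(fields[0], fields[1], fields[2],
                                  fields[3].split(","), dump, fsck_pass))
    return entries


# Hypothetical contents, for illustration only.
SAMPLE = """\
# device        mount point  type  options        dump  pass
/dev/sda1       /            ext4  defaults       0     1
/dev/sda2       /home        ext4  noatime,nodev  0     2
server:/export  /mnt/nfs     nfs   ro
"""

entries = parse_fstab(SAMPLE)
for e in entries:
    print(e.mount_point, e.fs_type, e.options)
```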
Unmounting
umount /MST3K
- Detaches a file system
- Cannot unmount a file system that is busy
- No open files
- No processes whose current working directory is in the file system
- No running programs whose executable is on the filesystem
umount -l /MST3K
- Lazy unmount, makes it unavailable
- “ratcheting”
- Waits for all of the above to be true
- Then physically unmounts
fuser
Code | Meaning |
f | file open for reading |
F | file open for writing |
c | current working directory |
e | executing a file |
r | root directory (chroot) |
m | mapped file or shared library |
$ PATH=/bin:/usr/bin:/sbin:/usr/sbin
$ pwd
/tmp/PmWiki.tmp
$ fuser -v .
USER PID ACCESS COMMAND
/tmp/PmWiki.tmp: ct320 3026902 ..c.. php-cgi
ct320 3026932 ..c.. bash
$ sleep 10 </etc/group &
$ fuser -v /etc/group
USER PID ACCESS COMMAND
/etc/group: ct320 3026940 f.... sleep
$ kill %%
$ echo "My PID is $$"
My PID is 3026932
$ fuser -v /bin/bash
USER PID ACCESS COMMAND
/usr/bin/bash: ct320 3026932 ...e. bash
Filesystem Structure
- Blocks that store metadata with structural info
- Superblock: filesystem type, size, status
- Blocks that store user data contained in files (really, in inodes)
- Objects in filesystem represented by inodes
- inode has file type, permissions, owner, group,
file size, number of links, access control,
file creation, access, modification times
- Directory is a list of [filename,inode number] pairs
Examples
$ ls -l /etc/passwd
-rw-r--r-- 1 root root 4368 Nov 30 2022 /etc/passwd
$ ls -i /etc/passwd
2624353 /etc/passwd
$ stat /etc/passwd
File: /etc/passwd
Size: 4368 Blocks: 16 IO Block: 4096 regular file
Device: 803h/2051d Inode: 2624353 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2025-03-10 18:05:01.364170908 -0600
Modify: 2022-11-30 17:03:02.874755985 -0700
Change: 2022-11-30 17:03:02.899755818 -0700
Birth: 2022-11-30 17:03:02.874755985 -0700
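The same inode fields that ls and stat print are available to programs through the stat system call. Here is a small sketch using Python's os.stat on a throwaway file. The file name is arbitrary, and the inode number and UID will differ from system to system.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "example")
    with open(path, "w") as f:
        f.write("hello\n")
    os.chmod(path, 0o644)

    st = os.stat(path)  # one system call returns the inode's metadata
    info = {
        "inode": st.st_ino,
        "size": st.st_size,                 # in bytes
        "links": st.st_nlink,               # number of directory entries
        "mode": stat.filemode(st.st_mode),  # type plus permission bits
        "uid": st.st_uid,
        "gid": st.st_gid,
    }

print(info)
```

Notice that the filename is not among the fields: it lives in the directory, not in the inode.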
inode
- An inode contains, at least:
- file type (regular file, symlink, directory, character, block, …)
- file size in bytes
- UID / GID
- protection bits (e.g., rwxr-x---)
- ctime: inode last changed (changes to any field, including times)
- mtime: modification time
- atime: access time
- link count (reference count)
- pointers to the disk blocks
that store the file's contents
times
- ctime: inode last changed (changes to any field, including
times, permissions, owner, etc.)
- mtime: data modification time
- atime: access time
- every time you run cat, the kernel updates the atime of /bin/cat. ☹
- disabled entirely with mount -o noatime
- which screws up some applications, e.g., has mailbox been read?
- improved performance with mount -o relatime, the new default
- or mount -o lazytime
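The difference between ctime and mtime can be observed directly. In this sketch, which assumes a POSIX system with sub-second timestamps, a chmod changes only the inode, while an append changes the data as well. The short sleeps are there only so the timestamps have room to differ.

```python
import os
import tempfile
import time

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "example")
    with open(path, "w") as f:
        f.write("data\n")
    before = os.stat(path)

    time.sleep(0.05)
    os.chmod(path, 0o640)        # metadata-only change
    after_chmod = os.stat(path)

    time.sleep(0.05)
    with open(path, "a") as f:   # data change
        f.write("more\n")
    after_write = os.stat(path)

# chmod touched the inode, not the data: ctime moves, mtime does not.
print("mtime unchanged by chmod:",
      after_chmod.st_mtime_ns == before.st_mtime_ns)
print("ctime advanced by chmod: ",
      after_chmod.st_ctime_ns > before.st_ctime_ns)
# Writing data updates mtime (and ctime, since the inode changed too).
print("mtime advanced by write: ",
      after_write.st_mtime_ns > after_chmod.st_mtime_ns)
```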
inode block pointers
A classic inode has thirteen pointers:
- ten direct blocks
- one indirect block
- one double-indirect block
- one triple-indirect block (not shown)
Some filesystems optimize the storage of tiny files by storing the data
itself in the inode.
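The thirteen-pointer scheme puts a hard ceiling on file size, which is easy to work out. The sketch below assumes 4096-byte blocks and 4-byte block pointers. Both numbers are assumptions for the sake of the arithmetic, not properties of any particular filesystem.

```python
def max_file_size(block_size: int = 4096,
                  pointer_size: int = 4,
                  direct: int = 10) -> int:
    """Largest file, in bytes, addressable by a classic inode."""
    per_block = block_size // pointer_size  # pointers held by one block
    blocks = (direct            # direct pointers in the inode
              + per_block       # via the single-indirect block
              + per_block ** 2  # via the double-indirect block
              + per_block ** 3) # via the triple-indirect block
    return blocks * block_size


size = max_file_size()
print(size, "bytes, about", round(size / 2 ** 40, 2), "TiB")
```

With these assumptions, each indirect block holds 1024 pointers, and the triple-indirect block dominates: it alone accounts for 1024 cubed data blocks, or 4 TiB.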
Filesystem Organization
- Organization has always been haphazard
- Several naming schemes used, some incompatible
- Evolution ain’t pretty, but it gets the job done.
Standard directories
Directory | Meaning | Directory | Meaning |
/boot | Boot directory | /usr | Most standard programs |
/dev | Device files | /var | Spool directories |
/etc | Critical system files | /home | Mount point for users |
/sbin | System utilities | /lib | Libs and parts of the C compiler |
/bin | Important utilities | /media | Removable media |
/tmp | Temp files | /opt | Optional applications |
Standard directories
/usr/bin | Most commands and executables |
/usr/include | Header files |
/usr/lib | Libraries, support files for standard programs |
/usr/local | Local software |
/usr/local/bin | Local executables |
/usr/local/ … | Other local (etc, lib, sbin, src) |
Standard directories
/usr/man | Man pages |
/usr/sbin | Less essential sysadmin commands |
/usr/share | Common to multiple systems |
/usr/share/man | Shared man pages |
/usr/src | Source code for nonlocal packages |
Standard directories
/var/adm | Logs, system setup records |
/var/log | System log files |
/var/spool | Spooling directories (mail, printers) |
/var/tmp | More temp space (preserved between boots) |
File types
$ ls -l /dev/console /dev/log /dev/sda
crw------- 1 root root 5, 1 Feb 24 17:22 /dev/console
lrwxrwxrwx 1 root root 28 Feb 24 17:21 /dev/log -> /run/systemd/journal/dev-log
brw-rw---- 1 root disk 8, 0 Feb 24 17:22 /dev/sda
File type encoding
File type | Symbol | Created by | Removed by |
Regular file | - | cp, mv, vi | rm |
Directory | d | mkdir, cp -r | rmdir, rm -r |
Character device file | c | mknod | rm |
Block device file | b | mknod | rm |
Local domain socket | s | socket(2) | rm |
Named pipe | p | mknod | rm |
Symbolic link | l | ln -s | rm |
Examples
$ ls -ldU /bin/cat /etc/magic /tmp /dev/random /dev/sda /dev/gpmctl /bin/cc
-rwxr-xr-x 1 root root 38440 Apr 1 2023 /bin/cat
-rw-r--r--. 1 root root 111 Apr 6 2024 /etc/magic
drwxrwxrwt 4 root root 253800 Mar 11 16:41 /tmp
crw-rw-rw- 1 root root 1, 8 Feb 24 17:22 /dev/random
brw-rw---- 1 root disk 8, 0 Feb 24 17:22 /dev/sda
srwxrwxrwx 1 root root 0 Feb 24 17:22 /dev/gpmctl
lrwxrwxrwx 1 root root 3 Feb 14 02:00 /bin/cc -> gcc
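The first character that ls prints comes from the file type bits in the inode's mode. As a sketch, the hypothetical function type_symbol below recovers that character using the tests in Python's stat module. It uses lstat so that a symbolic link is reported as a link rather than as whatever it points to.

```python
import os
import stat
import tempfile

# (test function, symbol) pairs, matching the type table above.
TYPE_TESTS = [
    (stat.S_ISREG, "-"),
    (stat.S_ISDIR, "d"),
    (stat.S_ISCHR, "c"),
    (stat.S_ISBLK, "b"),
    (stat.S_ISSOCK, "s"),
    (stat.S_ISFIFO, "p"),
    (stat.S_ISLNK, "l"),
]


def type_symbol(path: str) -> str:
    """Return the ls-style type character, without following symlinks."""
    mode = os.lstat(path).st_mode
    for test, symbol in TYPE_TESTS:
        if test(mode):
            return symbol
    return "?"


with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "regular"), "w").close()
    os.mkdir(os.path.join(d, "directory"))
    os.mkfifo(os.path.join(d, "fifo"))
    os.symlink("regular", os.path.join(d, "link"))
    symbols = {name: type_symbol(os.path.join(d, name))
               for name in sorted(os.listdir(d))}

print(symbols)
```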
Regular file
- It’s a sequence of bytes. Byte 0 is this, byte 1 is that, etc.
- Newline is just another byte, as far as the kernel is concerned.
- Linux imposes no structure.
- Sequential and random access allowed.
- Inserting a byte into the middle is non-trivial.
- The size is in the inode, so no end-of-file marker is required.
- There is no EOF character in the file.
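These points can be illustrated with a short sketch: seeking gives random access, end-of-file shows up as an empty read rather than as a special byte, and inserting in the middle means rewriting everything after the insertion point. The helper insert_bytes is written for this example only and reads the whole tail into memory, which would be a poor idea for a large file.

```python
import os
import tempfile


def insert_bytes(path: str, offset: int, data: bytes) -> None:
    """Insert data at offset by rewriting everything after it."""
    with open(path, "r+b") as f:
        f.seek(offset)
        tail = f.read()       # everything from offset to end of file
        f.seek(offset)
        f.write(data + tail)  # the file grows by len(data) bytes


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "demo")
    with open(path, "wb") as f:
        f.write(b"hello\nworld\n")

    with open(path, "rb") as f:
        f.seek(6)               # random access: jump straight to byte 6
        word = f.read(5)
        f.seek(0, os.SEEK_END)
        at_end = f.read(1)      # empty result: nothing left, no EOF byte

    insert_bytes(path, 5, b",")
    with open(path, "rb") as f:
        contents = f.read()
    size = os.stat(path).st_size  # the inode records the new length

print(word, at_end, contents, size)
```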
Directories
- Contains named references to other files
- . refers to the current directory
- .. refers to the parent directory
- File’s name stored within the parent’s directory
- Not in file itself.
- More than one directory can refer to a file (links)
- ln creates hard links, removed with rm
- Indistinguishable from first entry. They’re all just links.
Character and Block Device files
- Device files allow programs to communicate with hardware.
- Modules that know how to talk to the devices are
linked into the kernel when it is built.
- Device file not the same as a device driver.
- Device file is just a rendezvous point.
- Device files characterized by major, minor numbers.
- Major number says which driver (what kind of device).
- Minor number says which actual hardware device.
$ ls -l /dev/cdrom
/bin/ls: cannot access '/dev/cdrom': No such file or directory
$ ls -lH /dev/cdrom
/bin/ls: cannot access '/dev/cdrom': No such file or directory
Some Device Files
Name | Description |
/dev/sda2 | disk |
/dev/ttyS0 | RS-232 serial line |
/dev/pts/1 | Pseudo-tty (pty) |
/dev/tty | An alias for your terminal |
/dev/null | Read yields end-of-file, write succeeds (data discarded) |
/dev/full | Read yields zeroes, write fails (no space left) |
/dev/zero | Read yields zeroes |
/dev/random | Read yields random data |
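Device files are opened, read, and written with the same calls as regular files. The sketch below assumes a Linux system where /dev/null and /dev/zero exist. It also pulls the major and minor numbers out of the inode; on Linux, /dev/null is conventionally major 1, minor 3, but treat that as an expectation rather than a guarantee.

```python
import os
import stat

st = os.stat("/dev/null")
is_char_device = stat.S_ISCHR(st.st_mode)
# For device files, st_rdev encodes the (major, minor) pair.
major, minor = os.major(st.st_rdev), os.minor(st.st_rdev)
print("/dev/null: char device?", is_char_device, "numbers:", major, minor)

with open("/dev/zero", "rb") as f:
    zeros = f.read(8)        # as many zero bytes as requested

with open("/dev/null", "rb") as f:
    nothing = f.read(8)      # immediate end-of-file

with open("/dev/null", "wb") as f:
    written = f.write(b"discarded")  # accepted, then thrown away

print(zeros, nothing, written)
```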
Local domain sockets
- Sockets are communication mechanisms
- Allow communication between processes
- Or more importantly, sockets control network communication
- Local domain sockets
- Only accessible from the local host
- Filesystem object rather than a network port
- Visible to other processes
- Accessible only by connected processes
- Examples include X, NFS
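A local domain socket can be sketched in a few lines: the server binds to a pathname instead of a port, and the client finds it by that pathname. The file name demo.sock is arbitrary, and for brevity both ends live in one process here, which works because the kernel queues the connection until accept is called.

```python
import os
import socket
import stat
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "demo.sock")

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)      # bind() creates the socket file
    server.listen(1)
    is_socket = stat.S_ISSOCK(os.lstat(path).st_mode)

    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(path)   # rendezvous through the pathname
    conn, _ = server.accept()

    client.sendall(b"ping")
    received = conn.recv(4)

    for s in (conn, client, server):
        s.close()

print("socket file?", is_socket, "received:", received)
```

The data never touches the disk; the filesystem entry only provides the name under which the two processes meet.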
Named Pipes
Communication mechanisms
- Between processes running on the same host
- Pipes are also known as FIFOs.
- Similar to pipes on the command line, but the name persists in the filesystem (the data itself is not stored on disk).
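Here is a minimal sketch of a named pipe in use. A thread stands in for the second process, and the name demo.fifo is arbitrary. Opening a FIFO blocks until both a reader and a writer are present, which is why the writer runs concurrently.

```python
import os
import tempfile
import threading


def writer(path: str, data: bytes) -> None:
    with open(path, "wb") as f:  # blocks until a reader opens the FIFO
        f.write(data)


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "demo.fifo")
    os.mkfifo(path)              # creates the name; no data yet

    t = threading.Thread(target=writer, args=(path, b"through the pipe\n"))
    t.start()
    with open(path, "rb") as f:  # blocks until the writer opens it
        received = f.read()      # returns once the writer closes
    t.join()

    # The name outlives the conversation; the bytes do not.
    still_there = os.path.exists(path)
    size_after = os.stat(path).st_size

print(received, still_there, size_after)
```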
Symbolic links
- Also commonly referred to as a “soft link” or “symlink”.
- A special kind of file that contains another name.
- For example, if /alpha/beta were a symlink to gamma/delta,
then it’s the same as /alpha/gamma/delta
- However, if /alpha/beta were a symlink to /epsilon/iota,
then it’s the same as /epsilon/iota
- Deleting the target can leave a dangling link.
- Permissions of the symlink just don’t matter.
- Where is the target string stored?
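The relative-target rule can be checked with a sketch that rebuilds the alpha/beta example inside a temporary directory (standing in for /, since creating /alpha would need root). readlink returns the stored string verbatim, and lstat reports the link's size as the length of that string.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # The temporary directory itself may sit behind a symlink.
    root = os.path.realpath(root)
    os.makedirs(os.path.join(root, "alpha", "gamma"))
    target = os.path.join(root, "alpha", "gamma", "delta")
    with open(target, "w") as f:
        f.write("found me\n")

    link = os.path.join(root, "alpha", "beta")
    os.symlink("gamma/delta", link)     # relative target string

    stored = os.readlink(link)          # the string inside the link
    # A relative target is resolved from the directory holding the link.
    resolves_to_target = os.path.realpath(link) == target
    link_size = os.lstat(link).st_size  # length of the stored string

    os.remove(target)
    # The link still exists, but following it now fails.
    dangling = os.path.islink(link) and not os.path.exists(link)

print(stored, resolves_to_target, link_size, dangling)
```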
Symbolic link example
$ date >now
$ ln -s now foo
$ ls -l
total 4
lrwxrwxrwx 1 ct320 class 3 Mar 11 16:41 foo -> now
-rw------- 1 ct320 class 29 Mar 11 16:41 now
$ cat now foo
Tue Mar 11 16:41:26 MDT 2025
Tue Mar 11 16:41:26 MDT 2025
$ echo "I am $USER" >foo
$ ls -l
total 4
lrwxrwxrwx 1 ct320 class 3 Mar 11 16:41 foo -> now
-rw------- 1 ct320 class 11 Mar 11 16:41 now
$ cat now foo
I am ct320
I am ct320
$ rm now
$ cat foo
cat: foo: No such file or directory
Hard Links
ln old-name new-name (same order as cp)
- A hard link is just another directory entry that refers
to the same inode.
- Both links have equal standing—it’s not as if
the first is the “real” file, and the second is just a link.
- An inode has a reference count.
- When the reference count goes to zero, the inode (and associated data)
is deallocated.
- Do you now see why you need directory write permission,
and not file write permission, to delete a file?
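The reference count is easy to watch from a program. In this sketch, os.link plays the role of ln and os.remove the role of rm. The names old-name and new-name follow the slide above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    old = os.path.join(d, "old-name")
    new = os.path.join(d, "new-name")
    with open(old, "w") as f:
        f.write("shared data\n")

    os.link(old, new)  # a second directory entry for the same inode
    same_inode = os.stat(old).st_ino == os.stat(new).st_ino
    links_before = os.stat(old).st_nlink

    os.remove(old)     # removes a name, not the file
    links_after = os.stat(new).st_nlink
    with open(new) as f:
        survivor = f.read()

print(same_inode, links_before, links_after, repr(survivor))
```

Removing old-name edits the directory and decrements the count; the data survives because new-name still refers to the inode.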
Hard/Soft Link Differences
ln -s old-name new-name
- Soft links can cross mount points, hard links cannot.
- Soft links work on directories, hard links do not.
- Soft links require additional filesystem reads.
- OK, inline files make it not so bad.
Hard Link Analogy
- I share an office with Phil.
- Neither of us “owns” the office—we share it.
- We are equals.
- It’s not my office, and Phil is just borrowing some space.
- Neither is it Phil’s office, with me there temporarily.
- I was there first, but that’s irrelevant.
Soft Link Analogy
- I used to have an office in the University Services building,
north of Laurel, along with the rest of the CS Department.
- When the department moved to our current building, I left
a note on the door of my old office, saying “Jack is now in
CSB 246”.
- If I retire tomorrow, that note might still be there.
- The note would then be incorrect.