CT320 Email
📧
Thanks to:
- Dr. James Walden, NKU
- Russ Wakefield, CSU
- Dr. Indrajit Ray, CSU
for contents of these slides
Topics
- Terminology
- Anatomy of a Mail Message
- Components of an E-mail System
- SMTP
- IMAP & POP
- E-mail Addresses
- Mail Policies
Nomenclature
- Email
- email
- e-mail
- electronic mail
- emails
Terminology
Acronym | Expansion | Description |
MUA | Mail User Agent | Interacts with the end user |
MSA | Mail Submission Agent | Submits the mail to an MTA |
MTA | Mail Transfer Agent | Transfers mail between hosts |
MDA | Mail Delivery Agent | Puts the email in a mailbox |
MRA | Mail Retrieval Agent | Retrieves the email from an MTA |
MUA
- Mail User Agent
- Definition: Interacts with the end user
- Examples:
- Thunderbird
- Browser connected to mail.google.com
- Text-based programs: Pine, Mutt, elm
MTA
- Mail Transfer Agent
- Definition: Holds mail, transfers it to other MTAs
- Examples:
- Google’s mail servers
- Sendmail
- QMail
- Postfix
- Exim
MSA
- Mail Submission Agent
- Definition: Submits the mail to an MTA
MDA
- Mail Delivery Agent
- Definition: Puts the email in a mailbox.
Back in the old days, your mail lived in /var/mail/username,
and mail programs just read it from there.
Why might this be a separate program? What was the benefit?
MRA
- Mail Retrieval Agent
- Definition: Retrieves the email from an MTA
- Examples:
Overview
- Traditional path
- MUA → MTA → … → MTA → MUA
- Expanded path
- MUA → MSA → MTA → … → MTA → MDA ⇒ MRA ⇒ MUA
(where → is a push step and ⇒ is a pull step)
The difference between push and pull concerns
exactly who initiates the transfer.
- push: an offering
- pull: a demand
Multiple MTAs
- The traditional path: MUA → MTA → … → MTA → MUA
- Why have multiple MTAs?
- Because that’s how things worked back in UUCP days. Your computer
could only talk, via phone calls, to a few nearby machines.
- Even today, if you send mail to Joe@BigCompany.com,
you probably won’t send it directly there. A typical route:
- Your laptop
- Google, or your ISP’s mail server
- BigCompany’s corporate gateway mail server
- some mail server deep inside BigCompany
- Joe’s desktop computer
Internet E-mail System
┌──────────────┐ ┌───────────────┐
│ GMail server │ ◁····HTTPS····▷ │ Todd’s laptop │
└──────────────┘ │ using Chrome │
△ └───────────────┘
:
SMTP
:
▽
┌────────────────┐ ····IMAP····▷ ┌───────────────────┐
│ Comcast server │ │ Jack’s laptop │
└────────────────┘ ◁····SMTP···· │ using Thunderbird │
△ └───────────────────┘
:
SMTP
:
▽
┌───────────┐ ····POP3····▷ ┌───────────────────────┐
│ HP server │ │ Mary Jo’s workstation │
└───────────┘ ◁····SMTP···· │ using Outlook │
└───────────────────────┘
Message Store
- Communication
- Receives data from MDA (mail.local, procmail)
- Provides data to MRA (IMAP, POP, NFS, web)
- Types of stores
- Files (all messages for a user in one file)
- Directories (directory per user)
- Databases
Mail Access Agents
- Older systems directly accessed mail files.
- Modern systems use network protocols (POP3, IMAP).
POP3
- POP3: Post Office Protocol, version 3
- Simple download protocol for offline reading.
- It’s a protocol, so it has:
- a port number (110)
- a server (popd, listening on port 110)
- syntax & semantics (STAT, LIST, RETR, DELE, …)
IMAP
- IMAP: Internet Message Access Protocol
- Online and offline modes of reading.
- Partial message fetch (headers, attachments, etc.)
- Message state stored on server, not client.
- Multiple mailbox and multiple client support.
- It’s a protocol, so it has:
- a port number (143, 993)
- a server (imapd, listening on port 143 or 993)
- syntax & semantics (select, fetch, store, …)
Mail User Agents
- Text clients
- GUI clients
- Eudora
- Mozilla Thunderbird
- MS Outlook
- Web clients
Mail Addressing
- Relative Addresses
mcvax!uunet!ucbvax!hao!boulder!air!evi
- Absolute Addresses
- MX Records
- Sending mail servers look up MX records, not A records.
- Use A record if no MX record exists.
- Lowest preference # = highest priority.
- Permits failover if server down.
- First mentioned in RFC 974.
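The MX selection rules above can be sketched in a few lines of Python. This is only an illustration: the records are passed in as plain tuples instead of being looked up in DNS, and the function name and host names are hypothetical.

```python
def delivery_targets(domain, mx_records):
    """Return the hosts to try, in order, for mail addressed to `domain`.

    mx_records is a list of (preference, hostname) tuples, in the shape
    a DNS lookup of the domain's MX records might produce.
    """
    if not mx_records:
        # No MX record exists: fall back to the domain's own A record.
        return [domain]
    # Lowest preference number first; later entries are the failover hosts.
    return [host for _, host in sorted(mx_records)]


records = [(20, "backup.shire.example"), (10, "mail.shire.example")]
print(delivery_targets("shire.example", records))
print(delivery_targets("bree.example", []))
```

A sender works through the returned list in order and moves to the next host only if the current one cannot be reached.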
UUCP Routing
- How to go from Seattle to Dallas?
- There’s no direct route.
- You need several hops.
- How many paths exist?
Ports
- Port 25: Used to be the general-purpose SMTP port,
now used for MTA→MTA transfers.
- Port 587: For mail submission. Your MUA (or MSA) will connect to
port 587 on your ISP’s mail server to submit mail.
- Port 465: Formerly used for SMTPS, then deprecated. RFC 8314 has since
re-registered it for mail submission over implicit TLS, and many clients & servers use it.
Aliases
- Allow mail to be rerouted.
- Sysadmin: files (/etc/mail/aliases), local db, NIS, LDAP
- Personal: ~/.forward
$ cat ~/.forward
Applin@ColoState.Edu
- Alias destinations
- Local: address
- Remote: address@domain
- File: :include:pathname
- Program: |pathname
- Required aliases
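As a rough sketch of how the four destination forms can be told apart, the Python function below classifies a single alias destination by its prefix. The function name and the example paths are hypothetical, and a real MTA's alias parser handles quoting and other details that are ignored here.

```python
def classify_alias_destination(dest):
    """Classify one alias destination using the four forms listed above."""
    dest = dest.strip()
    if dest.startswith("|"):
        return ("program", dest[1:].strip())
    if dest.startswith(":include:"):
        return ("file", dest[len(":include:"):].strip())
    if "@" in dest:
        return ("remote", dest)
    return ("local", dest)


for d in ["applin", "Applin@ColoState.Edu",
          ":include:/hypothetical/list.txt", "|/hypothetical/filter"]:
    print(classify_alias_destination(d))
```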
Common Headers
Header | Purpose |
From: who | Who sent the message |
To: who, who … | Who receives the message |
Cc: who, who … | Who else receives the message |
Bcc: who, who … | Who else receives the message, hidden from the other recipients |
Reply-To: who | Who you should reply to |
Date: when | When the message was created |
Message-ID: id-string | A unique ID for the message |
Subject: whatever | The topic |
Received: info | Identify each way-station |
Body
- Separated from header by blank line.
- Contains 7-bit ASCII text by default.
- Any non-ASCII text must be encoded:
- uuencode (old school)
- MIME
MIME
Multipurpose Internet Mail Extensions
Content-Type: mime-type
Content-Type: text/plain
Content-Type: text/html
Content-Type: image/gif
Content-Transfer-Encoding:
Content-Transfer-Encoding: 7bit (please be obsolete)
- Seven-bit text (Don't panic!)
Content-Transfer-Encoding: 8bit
- Eight-bit text (Don’t panic!)
Content-Transfer-Encoding: quoted-printable
- Mostly printable text (Don=E2=80=99t panic!)
Content-Transfer-Encoding: base64
- Mostly unprintable bytes (RG9u4oCZdCBwYW5pYyEK)
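The quoted-printable and base64 samples above can be reproduced with Python's standard `quopri` and `base64` modules. This is a small sketch. The base64 sample on the slide decodes to the text plus a trailing newline, so the newline is added here too.

```python
import base64
import quopri

text = "Don\u2019t panic!"            # curly apostrophe, so not 7-bit ASCII
raw = text.encode("utf-8")           # the 8bit form: bytes above 0x7F sent as-is

qp = quopri.encodestring(raw)        # quoted-printable: =XX for each non-ASCII byte
b64 = base64.b64encode(raw + b"\n")  # base64 of the text plus a trailing newline

print(qp.decode("ascii"))
print(b64.decode("ascii"))

# Both encodings are reversible.
assert quopri.decodestring(qp) == raw
assert base64.b64decode(b64) == raw + b"\n"
```

Quoted-printable keeps the ASCII parts readable, which is why it suits mostly printable text. Base64 treats everything as opaque bytes.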
Multipart Message
Here’s how to have multiple representations of the same message.
This way, we have a plain text version for a primitive mail reader,
and an HTML representation for a mail reader that can handle HTML.
Content-Type: multipart/alternative; boundary=zot
This is a message with multiple parts in MIME format.
--zot
Content-Type: text/plain
I really like comic books.
--zot
Content-Type: text/html
I <em>really</em> like
<font color=red>comic books</font>.
--zot--
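A message like the one above does not have to be assembled by hand. As a minimal sketch, Python's standard `email.message.EmailMessage` can build the same two-part structure. The subject line is a made-up value, and the library chooses its own boundary string (instead of `zot`) when the message is serialized.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Comics"  # hypothetical subject, not from the slide
msg.set_content("I really like comic books.\n")
msg.add_alternative(
    "I <em>really</em> like <font color=red>comic books</font>.\n",
    subtype="html",
)

print(msg.get_content_type())
for part in msg.iter_parts():
    print(part.get_content_type())
```

The parts are ordered from plainest to richest, so a mail reader can show the last alternative it is able to display.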
Envelope
- Headers aren’t the full story
- Recipient isn’t necessarily on To: or Cc:
- Sender isn’t necessarily given on From: header.
- Envelope specifies sender/receiver
- Specified via SMTP commands.
- Envelope recipient used for Bcc:
- Envelope recipient used by mail lists.
- Envelope facilities used by spammers too.
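To illustrate the split between headers and envelope, here is a sketch of how a submitting program might derive the envelope from a message and then drop the Bcc: header. The function name and the Bcc address are hypothetical, and the sketch assumes the message has a From: header.

```python
from email.message import EmailMessage
from email.utils import getaddresses


def split_envelope(msg):
    """Return (envelope_sender, envelope_recipients) and drop the Bcc: header."""
    sender = getaddresses(msg.get_all("From", []))[0][1]
    fields = []
    for name in ("To", "Cc", "Bcc"):
        fields.extend(msg.get_all(name, []))
    recipients = [addr for _, addr in getaddresses(fields)]
    del msg["Bcc"]  # Bcc recipients live in the envelope only
    return sender, recipients


msg = EmailMessage()
msg["From"] = "alice@cs.colostate.edu"
msg["To"] = "bob@isse.gmu.edu"
msg["Bcc"] = "carol@shire.example"  # hypothetical hidden recipient
msg.set_content("Are we having the conference call today?\n")

sender, recipients = split_envelope(msg)
print(sender, recipients)
print("Bcc" in msg)
```

Python's `smtplib.SMTP.sendmail(from_addr, to_addrs, msg)` takes the envelope sender and recipients as arguments separate from the message text, which mirrors this distinction.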
MTAs
- Mail Transfer Agents
- Receive mail from MUAs.
- Route mail across internet.
- MTA Protocol: SMTP
- MTA Examples
An example
- Jack wants to send an email message to his niece Janelle,
who works at the University of Michigan.
- We could picture it like this:
Example
- However, Jack’s computer doesn’t really connect directly to
Janelle's computer.
- Actually, Jack uses a Comcast SMTP server as his MTA,
and Janelle uses a University of Michigan mail server MTA.
👴💻 Jack | → | 🏢 Comcast | → | 🏢 U of M | → | 👧💻 Janelle |
- There might be even more steps between Comcast and U of M,
but this will suffice for illustrative purposes.
Jack sends message to Janelle, part one
👴💻 Jack | → | 🏢 Comcast | → | 🏢 U of M | → | 👧💻 Janelle |
- Jack composes email message; provides Janelle’s email address to his MUA.
- Jack’s MUA (Thunderbird) creates a TCP SMTP connection to Jack’s mail server
at Comcast.
- Jack’s MUA pushes message to Comcast.
- Comcast queues up message for a suitable time to deliver.
Jack sends message to Janelle, part two
👴💻 Jack | → | 🏢 Comcast | → | 🏢 U of M | → | 👧💻 Janelle |
- Comcast creates a TCP SMTP connection to U of M.
- Comcast pushes the message to the U of M mail server.
- Janelle’s MUA uses a client POP3/IMAP/HTTP connection to
the U of M mail server.
- Janelle uses her MUA to retrieve the email message.
- To reply, reverse the process.
Email header
- Every received email message will have a header.
- Header lines are added by entities (email tools, user-agents, email
servers) as they store and forward email messages.
- The header lines are a series of text lines.
- Syntax: Header-Name: Header-Value
- If a line starts with a tab character or a space then that
line is a continuation of previous Header-Value.
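The syntax and continuation rule above are simple enough to sketch directly. The parser below is illustrative only (the function name is made up). It joins each continuation line to the previous header with a single space and stops at the blank line that ends the header section.

```python
def parse_headers(text):
    """Parse 'Header-Name: Header-Value' lines, joining continuation lines."""
    headers = []
    for line in text.splitlines():
        if not line:
            break  # a blank line ends the header section
        if line[0] in " \t" and headers:
            # Continuation: append to the previous header's value.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            headers.append((name, value.strip()))
    return headers


sample = (
    "Subject: Re: Registration to the 18th Annual IFIP WG\n"
    " 11.3 WC on Data and Application Security, 2004\n"
    "To: Dr. Indrajit Ray <indrajit@CS.ColoState.EDU>\n"
)
for name, value in parse_headers(sample):
    print(f"{name} = {value}")
```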
Email header (envelope)
Date: Wed, 16 Jun 2004 12:34:49 +0200
From: Marta Oliva <oliva@eps.udl.es>
To: Dr. Indrajit Ray <indrajit@CS.ColoState.EDU>
Subject: Re: Registration to the 18th Annual IFIP WG
11.3 WC on Data and Application Security, 2004
Email header (full)
Received: from mailr3.udl.es (mailr3.udl.es [193.144.10.36])
by chico.cs.colostate.edu (8.12.10/8.12.9) with ESMTP id i5GAYmvN008288
for <indrajit@CS.ColoState.EDU>; Wed, 16 Jun 2004 04:34:50 -0600 (MDT)
Received: from eps.udl.es (fermat.udl.net [10.50.54.28])
by mailr3.udl.es (8.11.6/8.11.6) with ESMTP id i5GAYga31371
for <indrajit@CS.ColoState.EDU>; Wed, 16 Jun 2004 12:34:42 +0200
Received: from eps.udl.es by eps.udl.es (8.8.8+Sun/SMI-SVR4)
id MAA22736; Wed, 16 Jun 2004 12:34:40 +0200 (MET DST)
Message-ID: <40D02249.6090105@eps.udl.es>
Date: Wed, 16 Jun 2004 12:34:49 +0200
From: Marta Oliva <oliva@eps.udl.es>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4)
Gecko/20030624 Netscape/7.1 (ax)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: "Dr. Indrajit Ray" <indrajit@CS.ColoState.EDU>
Subject: Re: Registration to the 18th Annual IFIP WG 11.3 WC on Data and
Application Security, 2004
References: <40CDD679.3060008@eps.udl.es>
<Pine.GSO.4.58.0406151344360.18975@salieri.cs.colostate.edu>
In-Reply-To: <Pine.GSO.4.58.0406151344360.18975@salieri.cs.colostate.edu>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Displaying email headers
- You can instruct most email programs to display the full header
- In Netscape: Select: View→Headers→All
- In Outlook: Select: View→Options
- In Pine: Type H. (Requires the enable-full-header-cmd feature.)
- In WebMail: Click the Options button,
then select "Show message headers in
body of message" and click OK.
- Thunderbird: control-U or More→View Source
- GMail: ▼→Show original
Generation of email headers (1)
Let’s consider email sent from Alice to Bob.
Here are the initial headers, as created by Alice’s MUA running
on salieri.cs.colostate.edu:
From: alice@cs.colostate.edu (Alice The Great)
To: bob@isse.gmu.edu
Date: Fri, 18 Jun 2004 10:22:55 -0600 (MDT)
X-Mailer: Pine v2.32
Subject: Conference call today?
Now, the message is handed off to an MTA on chico.cs.colostate.edu.
Generation of email headers (2)
The MTA on chico adds some headers:
Received: from salieri.cs.colostate.edu (salieri.cs.colostate.edu [129.82.45.76]) by
chico.cs.colostate.edu (8.12.10/8.12.9) id i5IGMtv0004345
From: alice@cs.colostate.edu (Alice The Great)
To: bob@isse.gmu.edu
Date: Fri, 18 Jun 2004 10:22:55 -0600 (MDT)
Message-ID: <Pine.GS0.4.58.0406181022460@salieri.cs.colostate.edu>
X-Mailer: Pine v2.32
Subject: Conference call today?
The message is then transmitted from chico.cs.colostate.edu
to the MTA at mailhost.isse.gmu.edu.
Generation of email headers (3)
More headers are added by mailhost.isse.gmu.edu:
Received: from chico.cs.colostate.edu (chico.cs.colostate.edu
[129.82.45.30]) by mailhost.isse.gmu.edu (8.8.5/8.7.2) with
ESMTP id LAA20869 for <bob@isse.gmu.edu>; Fri, 18 Jun 2004
12:24:24 -0400 (EDT)
Received: from salieri.cs.colostate.edu (salieri.cs.colostate.edu
[129.82.45.76]) by chico.cs.colostate.edu (8.12.10/8.12.9)
id i5IGMtv0004345
From: alice@cs.colostate.edu (Alice The Great)
To: bob@isse.gmu.edu
Date: Fri, 18 Jun 2004 10:22:55 -0600 (MDT)
Message-ID: <Pine.GS0.4.58.0406181022460@salieri.cs.colostate.edu>
X-Mailer: Pine v2.32
Subject: Conference call today?
Examining email headers
- The most important header field for email tracking purposes is the
Received header line(s)
- Syntax:
Received: from ? by ? via ? with ? id ? for ? ; date-time
- where from, by, via, with, id, and for are tokens,
each with its own value inside the single header value
- Not every token has a value every time
Examining ‘Received’ header
- Tip — Break a single Received line into multiple lines
Received: from chico.cs.colostate.edu (chico.cs.colostate.edu [129.82.45.30]) by mailhost.isse.gmu.edu (8.8.5/8.7.2) with ESMTP id LAA20869 for <bob@isse.gmu.edu>; Fri, 18 Jun 2004 12:24:24 -0400 (EDT)
Received:
from chico.cs.colostate.edu (chico.cs.colostate.edu [129.82.45.30])
by mailhost.isse.gmu.edu (8.8.5/8.7.2)
with ESMTP
id LAA20869
for <bob@isse.gmu.edu>;
Fri, 18 Jun 2004 12:24:24 -0400 (EDT)
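The same breakdown can be done mechanically. The sketch below splits a Received value at its token keywords and treats everything after the last semicolon as the date-time. The function name is hypothetical, and the keyword matching is deliberately naive: a keyword that appears inside a parenthesized comment would confuse it.

```python
import re

TOKEN = re.compile(r"\b(from|by|via|with|id|for)\s+", re.IGNORECASE)


def parse_received(value):
    """Split a Received header value into its tokens and date-time."""
    value = " ".join(value.split())  # unfold continuation lines
    body, sep, date = value.rpartition(";")
    if not sep:                      # no date-time part present
        body, date = value, ""
    fields = {"date-time": date.strip()}
    matches = list(TOKEN.finditer(body))
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(body)
        fields[m.group(1).lower()] = body[m.end():end].strip()
    return fields


line = ("from chico.cs.colostate.edu (chico.cs.colostate.edu [129.82.45.30]) "
        "by mailhost.isse.gmu.edu (8.8.5/8.7.2) with ESMTP id LAA20869 "
        "for <bob@isse.gmu.edu>; Fri, 18 Jun 2004 12:24:24 -0400 (EDT)")
for token, token_value in parse_received(line).items():
    print(f"{token}: {token_value}")
```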
Examining ‘Received’ header (2)
- For tracking purposes, we are interested in the from and
by tokens in the Received header field
- from name (dns-name [ip-address])
Received: from chico.cs.colostate.edu
(chico.cs.colostate.edu [129.82.45.30])
This piece of mail was received from a machine calling itself
chico.cs.colostate.edu
which is really named chico.cs.colostate.edu
and has the IP address 129.82.45.30.
Single most important piece of information for tracing email.
Examining ‘Received’ headers (3)
by receiving-host-name (software version number)
by mailhost.isse.gmu.edu (8.8.5/8.7.2)
The machine that received the email was
mailhost.isse.gmu.edu
It’s running software with version 8.8.5/8.7.2.
Examining ‘Received’ headers (4)
with (protocol) ID (server-assigned-id)
with ESMTP ID LAA20869
The machine that received the mail was running ESMTP
The machine assigned the identifier number LAA20869.
The system administrator needs to have this ID number to look up
the message in the machine’s log files — no other use for this ID
number.
Examining ‘Received’ headers (5)
for (<recipient’s email address>);
for <bob@isse.gmu.edu>;
The email was addressed to bob@isse.gmu.edu.
Note: this address comes from the envelope recipient, so it need not match the address in the To: header line.
date-time
Fri, 18 Jun 2004 12:24:24 -0400 (EDT)
This mail transfer occurred on
Friday, 18 June, 2004 at 12:24:24 Eastern Daylight Time which is 4 hours
behind Greenwich Mean Time.
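Time stamps with different offsets are awkward to compare by eye. As a small sketch, Python's standard `email.utils.parsedate_to_datetime` turns them into timezone-aware values that can be subtracted directly. The first value is the Date: header from the Alice and Bob example, the second is the date-time from this Received header.

```python
from email.utils import parsedate_to_datetime

composed = parsedate_to_datetime("Fri, 18 Jun 2004 10:22:55 -0600 (MDT)")
received = parsedate_to_datetime("Fri, 18 Jun 2004 12:24:24 -0400 (EDT)")

print(received.utcoffset())   # offset from UTC, taken from the "-0400" field
print(received - composed)    # elapsed time between composing and this hop
```

The subtraction accounts for the two offsets, so the result is the real elapsed time, not the difference between the two wall-clock readings.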
Examining Received headers (6)
- Every time email moves through a new mail transfer agent (a mail
server or a mail relay), a new Received header line is added to the
beginning of the headers list.
- As we read the Received headers in an email message from top to bottom,
we move closer to the machine/person that sent the email.
Received: from chico.cs.colostate.edu (chico.cs.colostate.edu
[129.82.45.30]) by mailhost.isse.gmu.edu (8.8.5/8.7.2)
with ESMTP id LAA20869 for <bob@isse.gmu.edu>;
Fri, 18 Jun 2004 12:24:24 -0400 (EDT)
Received: from salieri.cs.colostate.edu (salieri.cs.colostate.edu
[129.82.45.76]) by chico.cs.colostate.edu (8.12.10/8.12.9)
id i5IGMtv0004345
From: alice@cs.colostate.edu (Alice The Great)
To: bob@isse.gmu.edu
Date: Fri, 18 Jun 2004 10:22:55 -0600 (MDT)
Message-ID: <Pine.GS0.4.58.0406181022460@salieri.cs.colostate.edu>
X-Mailer: Pine v2.32
Subject: Conference call today?
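Reading the trail top to bottom can be automated. The sketch below uses Python's standard `email` parser to pull out every Received header in order and print the host named in each from token. The helper name is hypothetical, and the message body is borrowed from the SMTP transcript example.

```python
from email import message_from_string

raw = """\
Received: from chico.cs.colostate.edu (chico.cs.colostate.edu
 [129.82.45.30]) by mailhost.isse.gmu.edu (8.8.5/8.7.2)
 with ESMTP id LAA20869 for <bob@isse.gmu.edu>;
 Fri, 18 Jun 2004 12:24:24 -0400 (EDT)
Received: from salieri.cs.colostate.edu (salieri.cs.colostate.edu
 [129.82.45.76]) by chico.cs.colostate.edu (8.12.10/8.12.9)
 id i5IGMtv0004345
From: alice@cs.colostate.edu (Alice The Great)
To: bob@isse.gmu.edu
Subject: Conference call today?

Are we having the conference call today?
"""


def sending_host(received_value):
    """Return the name that follows the 'from' keyword."""
    return received_value.split()[1]


msg = message_from_string(raw)
# get_all() preserves header order: the newest hop comes first.
hops = [" ".join(v.split()) for v in msg.get_all("Received", [])]
for n, hop in enumerate(hops, start=1):
    print(n, sending_host(hop))
```

The last host printed is the one closest to the sender. The name after from is only what the sending machine claimed, so the bracketed IP address is the more trustworthy part.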
Examining other portions of email header
From: alice@cs.colostate.edu (Alice The Great)
- This mail was sent by alice@cs.colostate.edu, who gives her
real name as Alice The Great
To: bob@isse.gmu.edu
- The mail was addressed to bob@isse.gmu.edu
Date: Fri, 18 Jun 2004 10:22:55 -0600 (MDT)
- The email was composed on Friday 18 June 2004 at 10:22:55
Mountain Daylight Time which is 6 hours behind GMT
Addresses
Addresses have several forms:
Form | Example |
address | bbag@shire.example |
name <address> | Bilbo Baggins <bbag@shire.example> |
address (name) | bbag@shire.example (Bilbo Baggins) |
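All three forms carry the same address. As a quick sketch, Python's standard `email.utils.parseaddr` splits each form into a display name and an address.

```python
from email.utils import parseaddr

forms = (
    "bbag@shire.example",
    "Bilbo Baggins <bbag@shire.example>",
    "bbag@shire.example (Bilbo Baggins)",
)
for form in forms:
    name, address = parseaddr(form)
    print(repr(name), address)
```

The bare form yields an empty name. The other two yield the same name and address, even though the name sits in a different place.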
Examining other portions of email header
Message-ID: <Pine.GS0.4.58.0406181022460@salieri.cs.colostate.edu>
- The email was provided with this number by
chico.cs.colostate.edu to identify it.
- This ID is different from the ESMTP / SMTP
ID numbers in the Received: headers
- It is attached to the message for life
- Sometimes this ID may provide a valuable clue,
but most of the time it is unintelligible
- information about sender’s email address
- information about the machine on which the email was composed
- Email program used to compose email
Examining other portions of email header
X-Mailer: Pine v2.32
- The message was sent using a program called Pine, version 2.32
Subject: Conference Call Today?
- Subject matter for the email
There can be many other header fields in the email header,
like Bcc, Cc, etc. For the most part these do not contribute
to email tracing. For a complete list of header
fields, see RFC 2076.
Simple Mail Transfer Protocol
- RFC 2821 (since superseded by RFC 5321)
- Principal application layer protocol for
Internet electronic mail.
- Runs over TCP (port 25 (MTA→MTA) or 587 (mail submission))
- It is used to “push” email messages from one mail server to another or
from a user agent to a mail server.
Transcript of SMTP connection between Alice’s mail server and Bob’s
S: 220 mailhost.isse.gmu.edu ESMTP Sendmail 8.8.5/1.4/8.7.2/1.13; Fri, 18 Jun 2004 12:24:24 -0400 (EDT)
C: HELO chico.cs.colostate.edu
S: 250 Hello chico.cs.colostate.edu, pleased to meet you
C: MAIL FROM: <alice@cs.colostate.edu>
S: 250 alice@cs.colostate.edu … Sender ok
C: RCPT TO: <bob@isse.gmu.edu>
S: 250 bob@isse.gmu.edu … Recipient ok
C: DATA
S: 354 Enter mail, end with “.” on a line by itself
C: Received: from salieri.cs.colostate.edu (salieri.cs.colostate.edu [129.82.45.76] by ….
C: …
C: Subject: Conference Call Today?
C:
C: Are we having the conference call today?
C: .
S: 250 LAA20869 Message accepted for delivery
C: QUIT
S: 221 mailhost.isse.gmu.edu closing connection
- Client SMTP running on sending mail server host, establishes TCP connection
to SMTP server running on receiving email server host.
- TCP provides reliable, in-order delivery of the message bytes between the two hosts.
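The client side of a transcript like this follows a fixed order, which the sketch below writes out as a list of lines. It is only an illustration under simplifying assumptions: the function name is made up, and a real client waits for the server's reply code after each command and escapes body lines that begin with a period. Python's standard `smtplib` handles both.

```python
def smtp_client_commands(helo_host, sender, recipients, message):
    """Return the lines a client sends, in order, for one message."""
    lines = [f"HELO {helo_host}", f"MAIL FROM: <{sender}>"]
    # One RCPT TO per envelope recipient.
    lines += [f"RCPT TO: <{rcpt}>" for rcpt in recipients]
    lines.append("DATA")
    lines += message.splitlines()  # headers, blank line, body
    lines.append(".")              # a lone period ends the message
    lines.append("QUIT")
    return lines


message = ("Subject: Conference Call Today?\n"
           "\n"
           "Are we having the conference call today?\n")
cmds = smtp_client_commands("chico.cs.colostate.edu",
                            "alice@cs.colostate.edu",
                            ["bob@isse.gmu.edu"],
                            message)
for line in cmds:
    print("C:", line)
```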
SMTP Commands
- Most common SMTP commands:
HELO
hostname
EHLO
hostname
MAIL FROM:
addr
RCPT TO:
addr
VRFY
addr
EXPN
addr
DATA
QUIT
RSET
HELP
Understanding SMTP commands
HELO
hostname
- Identifies the sending machine
- The sender can lie
- Nothing, in principle, prevents
chico.cs.colostate.edu from saying
“HELO abc.invalid.com”
- Receiver can find out the sending
machine’s real identity, using reverse DNS lookup, for example
- Most modern email servers do this
Understanding SMTP commands
MAIL FROM:
addr
- Initiates email processing
- Address need not be the same as the sender’s own address
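- Becomes the envelope sender, typically recorded in the Return-Path: header at final delivery
(the from token in the Received header comes from HELO and the connecting IP address)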
Understanding SMTP commands
RCPT TO:
addr
- Counterpart of MAIL FROM
- Specifies the intended recipient (the one to which the email will
be delivered regardless of whatever is specified in the
To: line in the message)
- One mail can be sent to multiple recipients by
including multiple RCPT TO commands
- Turns into the for address in the Received header
Understanding SMTP commands
DATA
- Starts the actual mail entry. Everything
following it is considered the message
- No restrictions on its form
- Lines at the beginning of the message that start with a
single word followed by a colon are considered part of the message header
- Line consisting only of a period terminates the message
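A receiving server's handling of the DATA phase might look roughly like the sketch below. The function names are hypothetical. The leading-dot handling is not on the slide: a sender doubles any leading period on a body line (dot-stuffing) and the receiver removes it, so a body line may begin with a period without ending the message.

```python
def read_data(lines):
    """Collect message lines sent after DATA, stopping at the lone period."""
    message = []
    for line in lines:
        if line == ".":
            return message       # end of message
        if line.startswith("."):
            line = line[1:]      # undo the sender's dot-stuffing
        message.append(line)
    raise ValueError("message was not terminated by a lone '.'")


def split_message(message):
    """Split message lines into (header lines, body lines) at the first blank line."""
    if "" in message:
        i = message.index("")
        return message[:i], message[i + 1:]
    return message, []


sent = [
    "Subject: Conference Call Today?",
    "",
    "Are we having the conference call today?",
    "..and this line began with a period",
    ".",
    "QUIT",
]
headers, body = split_message(read_data(sent))
print(headers)
print(body)
```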
Understanding SMTP commands
QUIT
- Terminates the SMTP connection
POP3 / IMAP / HTTP Protocols
- Used by Email reader programs to “pull” stored email messages
from the mail server to the recipient’s machine.
- For the most part do not add anything
extra to the email header
- May format the email header
Email relays
- SMTP allows messages to be relayed to other SMTP servers towards a
destination.
- Historically, this was the way SMTP was meant to be.
- Remember, the internet wasn’t an “always-on” thing like it is now.
- Messages were transmitted via @£‡$☠ phone calls.
- Currently, only unethical spammers use SMTP relaying to conceal the source of
their messages.
- This way spammers hope to deflect complaints to the (innocent) relay site
rather than the spammers’ own ISP.
Things to be aware of
- Do not take any domain (host) name, user name, or email address in the
email header at face value.
- They can be easily forged by compromising the sending SMTP server.
- Pay attention to the trail of IP addresses in the from tokens.
- These are directly gathered by the receivers from IP packets.
- The topmost IP address in the email header is the IP address of the
computer that last forwarded the email.
Things to be aware of
- False header information
- Spammers may try to introduce fake Received:
header lines in the message
- Introduced as part of data
- Follow the trail through the Received: header fields
and use common sense
- False IP Address
- The IP address may have been that of a naïve relay,
not the actual sender
Things to be aware of
- Dynamic IP address
- Sender’s machine may not have a fixed IP address
- However, the mail server used by the sender almost invariably has one
- Solicit the help of the ISP who can trace back the
sender from DHCP logs
Mail Policies
- Privacy Policy
- Namespaces
- Reliability
- Scaling
- Security
Privacy Policy
- Personal Use Policy
- Personal v. commercial use.
- When may employee e-mail be read?
- By whom
- Under what circumstances
- Retention Policy
Namespaces
- Avoid first.last format addresses.
- There will be duplicates: John.Smith.
- Use middle initials?
- Append numbers?
- Allow users to choose?
- Create unique organization-wide namespace.
- Use directory to lookup addresses.
Reliability
- Customers expect email to be as reliable as electrical power.
- Failures generate many support calls.
- Reliability measures
- Redundant servers.
- Backup MX hosts.
- RAID arrays.
- Multiple NICs, power supplies, processors, etc.
Scalability
- Types of scalability
- To address growth in average messages/day.
- To address spikes in mail traffic.
- Number of messages grows
- faster than linearly with number of users.
- with time, even if user base is constant.
- due to spam, too.
- Size of messages grows
- due to technology: more + larger attachments.
- Back in the day, it would have been unthinkable to attach
a video file to an email message.
Security
- Mail server as a target
- Complexity of mail leads to vulnerabilities.
- Mail is an asset attackers want to take.
- E-mail as a conduit
- Brings viruses and trojans into organization.
- Leaks confidential information outward.
- Example (2005): Apple sued bloggers for releasing data about upcoming products.
- E-mail relaying
- If A relays email to C through an open relay B,
and this harms C, can B be sued?
Intercepting e-mail
How will you respond, when …
- Bill’s co-worker says,
“Show me Bill’s email—it contains technical data I need.”
- Bill’s boss says,
“Show me Bill’s email—it contains technical data we need.”
- Bill’s boss says,
“Show me Bill’s email—I want to see if he’s working hard enough.”
- the company president says,
“Show me Bill’s email—he’s selling secrets to our competitor.”
- the police say,
“Show us Bill’s email—he’s committed a crime.”
Policies!
- Don’t make decisions like this on the fly, based on how intimidated
you are at the moment.
- You need policies!
- Perhaps email is a company resource, for any employee to view.
- Perhaps email is a benefit, and employees have a reasonable
expectation of privacy.
- Perhaps your giant company already has a policy.
- Perhaps local, state, or federal laws have opinions.
- Work this out before you need an answer.