CT320 WWW
First web server, CERN
URL
- All web stuff is done with URLs (URLs, URIs, URNs—not going there).
- A URL is scheme:info. Examples:
https://www.colostate.edu/
http://example.com/foo/bar
magnet:?xt=urn:btih:c12fe1c06bba254a9dc9f519b335aa7c1367a88a&dn
ftp://ftp.cisco.com/pub/mibs/v1/
mailto:pinkie-pie@my-little-pony.example.net
tel:+1-970-555-1212
- There are hundreds of schemes registered:
https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml
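The scheme:info split can be sketched in a few lines of Python. This is only an illustration: the helper name scheme_and_info is made up for this example, and urlsplit comes from Python's standard library.

```python
from urllib.parse import urlsplit

# Some of the example URLs from the slide above.
EXAMPLES = [
    "https://www.colostate.edu/",
    "http://example.com/foo/bar",
    "ftp://ftp.cisco.com/pub/mibs/v1/",
    "mailto:pinkie-pie@my-little-pony.example.net",
    "tel:+1-970-555-1212",
]

def scheme_and_info(url):
    """Split a URL at its first colon into (scheme, info).

    Hypothetical helper for illustration only.
    """
    scheme, _, info = url.partition(":")
    return scheme, info

for url in EXAMPLES:
    scheme, info = scheme_and_info(url)
    # urlsplit knows more structure; hostname is None for mailto: and tel:.
    host = urlsplit(url).hostname
    print(f"{scheme:7} {info}  (host: {host})")
```

Everything after the first colon is interpreted according to the scheme, which is why only some of these URLs have a host at all.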
Web Browsing
- Web browsers use HTTP & HTTPS application-level protocols.
- These are TCP protocols on ports 80 & 443.
- Use HTTPS!
- HTTP is unencrypted. Anybody can see what you’re doing.
- HTTP is unsigned. Its contents can be replaced
or modified anywhere en route, and you wouldn’t know.
- Downloading software? I hope that nobody in the middle altered it!
How a Web Server Works
- A web browser opens a TCP connection to a web server and sends an HTTP request over it.
- Apache is the most common Linux web server.
- The web server looks at the request and decides what to do with it.
Fetching a web page
Let’s fetch this web page:
- Resolve cs.colostate.edu to its IP address via DNS.
- Make a TCP connection to that IP address at port 443 (HTTPS).
- Send an HTTP request through the socket, which looks like …
Fetching a web page
Send an HTTP request:
GET /~ct320/Fall19/Lecture/WWW HTTP/1.1
Accept: */*
Accept-Encoding: gzip, br, zstd, deflate
Host: cs.colostate.edu
User-Agent: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)
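The steps above (DNS lookup, TCP connection, request through the socket) could be sketched roughly as follows. The function names build_request and fetch are invented for this sketch, and the Connection: close header is an added assumption so that the server ends the reply by closing the socket; a real browser sends many more headers.

```python
import socket
import ssl

def build_request(host, path):
    """Build the bytes of a minimal HTTP/1.1 GET request (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Accept: */*",
        "Connection: close",  # assumption: ask the server to close when done
        "",                   # an empty line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def fetch(host, path, port=443):
    """Resolve host, connect over TCP, wrap in TLS, send the request, read the reply."""
    context = ssl.create_default_context()
    # create_connection does the DNS lookup and opens the TCP socket.
    with socket.create_connection((host, port), timeout=10) as tcp:
        # wrap_socket performs the TLS handshake that makes this HTTPS.
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            tls.sendall(build_request(host, path))
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Calling fetch("cs.colostate.edu", "/~ct320/Fall19/Lecture/WWW") would return the raw response bytes, headers and body together.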
Browsing Response
The response from the web server looks like this:
HTTP/1.1 200 OK
Date: Sat, 23 Nov 2024 11:55:59 GMT
Server: Apache/2.4.6 (Red Hat Enterprise Linux)
X-Powered-By: PHP/5.4.16
Vary: Accept-Encoding
Set-Cookie: ☠☠☠
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8

<!doctype html>
<meta charset='utf-8'>
<title>CT320 | Lecture / WWW</title>
…
HTTP Response Codes
The first line of an HTTP response looks like this: HTTP/1.1 200 OK
or, in general: HTTP/version numeric-code human-readable-code
The numeric codes are for programs, the words are for people.
Some popular HTTP response codes:
- 200 OK
- 301 Moved Permanently
- 302 Found (formerly Moved Temporarily)
- 404 Not Found
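Since the numeric codes are for programs, here is a minimal sketch of how a program might pick one apart. The names parse_status_line and status_class are made up for this example; the grouping by first digit is the standard HTTP convention.

```python
# Standard meaning of the first digit of an HTTP status code.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def parse_status_line(line):
    """Split 'HTTP/1.1 200 OK' into (version, code, reason)."""
    parts = line.strip().split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP status line: {line!r}")
    version = parts[0][len("HTTP/"):]
    code = int(parts[1])
    # The human-readable words are optional; programs should ignore them.
    reason = parts[2] if len(parts) == 3 else ""
    return version, code, reason

def status_class(code):
    """Classify a numeric code by its first digit."""
    return CLASSES.get(code // 100, "unknown")

for line in ["HTTP/1.1 200 OK", "HTTP/1.1 301 Moved Permanently", "HTTP/1.1 404 Not Found"]:
    version, code, reason = parse_status_line(line)
    print(code, status_class(code), "-", reason)
```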
HTML
- The response can be in many different formats.
- A popular format is HTML, which has tags, e.g.,
I <strong>love</strong> My Little Pony!
- It’s your browser’s job to translate that HTML to a good-looking
display on the screen.
- The network doesn’t care about any of this. It just delivers the
bits from the server to the browser.
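To see the tags and the text as separate things, the way a browser must before it can draw anything, here is a small sketch using Python's standard html.parser module. The class TagLister and the function tags_and_text are hypothetical names for this example, and real browsers do far more than this.

```python
from html.parser import HTMLParser

class TagLister(HTMLParser):
    """Collect the opening tags and the plain text of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)

def tags_and_text(html):
    """Return (list of opening tags, text with all tags removed)."""
    parser = TagLister()
    parser.feed(html)
    parser.close()  # flush any text still buffered at the end
    return parser.tags, "".join(parser.text)

tags, text = tags_and_text("I <strong>love</strong> My Little Pony!")
print(tags)
print(text)
```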
Browsing Security Considerations
- The HTTPS payload (request & response) is encrypted, going both ways.
- However, the source & destination IP addresses are in the IP packet, so everybody
knows that I’m talking to cs.colostate.edu.
- However, nobody knows that I’m asking for the CT320 WWW lecture.
- What will be revealed if you fetch https://male.personal.health.com/hair/restoration?
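One way to think about that question is to split the URL into the part an observer along the way can learn and the part that travels inside the encrypted payload. The sketch below is a simplification under the assumptions of these slides: the host is treated as visible (through the DNS lookup and the destination IP address), and the path is treated as hidden only for HTTPS. The function name exposure is invented for this example.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def exposure(url):
    """Return (visible, hidden) dictionaries for a URL.

    Simplified model: host and port are always visible to the network;
    the path is hidden only when the scheme is https.
    """
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    visible = {"host": parts.hostname, "port": port}
    hidden = {}
    if parts.scheme == "https":
        hidden["path"] = parts.path
    else:
        visible["path"] = parts.path
    return visible, hidden

visible, hidden = exposure("https://male.personal.health.com/hair/restoration")
print("visible:", visible)
print("hidden: ", hidden)
```

In this model the host name alone can already say a lot, even though the path stays private.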
Private Browsing
- Chrome: Incognito
- Firefox: Private
- Internet Explorer: InPrivate
- What does it actually do?
- It doesn’t affect what you send or receive from the Internet at all.
- It limits what evidence is kept in your browser.
- It saves you from your spouse, but not from the FBI,
or whoever owns any router along the way.
- Routers see all IP addresses, port numbers, and DNS requests.
www.
- There is nothing special about www.; it’s just a convention.
- Sometimes, only one of example.net and www.example.net exists, but not the other.
- Often, both versions exist.
- They might both resolve to the same IP address.
- One might be a DNS alias (CNAME).
- One might forward to the other via the HTTP Location: header.
- They could deliver wildly different content.
- This would really confuse your users. A goal?
www. examples
example.net and www.example.net simply have separate DNS entries
that resolve to the same IP address:
% host example.net
example.net has address 93.184.215.14
example.net has IPv6 address 2606:2800:21f:cb07:6820:80da:af6b:8b2c
example.net mail is handled by 0 .
% host www.example.net
www.example.net has address 93.184.215.14
www.example.net has IPv6 address 2606:2800:21f:cb07:6820:80da:af6b:8b2c
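The same comparison can be made from a program. This is a rough sketch: addresses and same_hosts are hypothetical helper names, socket.getaddrinfo is the standard library call that asks the system resolver, and the resolve parameter exists only so the comparison can be tried without a network.

```python
import socket

def addresses(name, port=443):
    """Return the set of IP addresses (IPv4 and IPv6) that name resolves to."""
    infos = socket.getaddrinfo(name, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address as a string.
    return {info[4][0] for info in infos}

def same_hosts(a, b, resolve=addresses):
    """True if both names resolve to exactly the same set of addresses."""
    return resolve(a) == resolve(b)
```

With a working network, same_hosts("example.net", "www.example.net") would be expected to agree with the host output above, though DNS answers can change over time.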
www. examples
The CS Department web server, www.cs.colostate.edu, is a DNS CNAME alias for another host:
% host www.cs.colostate.edu
www.cs.colostate.edu is an alias for beethoven.cs.colostate.edu.
beethoven.cs.colostate.edu has address 129.82.45.48
This way, our internal naming convention isn’t visible to the world.
www. examples
Fetching https://cs.colostate.edu/~ct320 with wget shows a lot of
web server forwarding:
$ wget -S https://cs.colostate.edu/~ct320 |& egrep '^[-CR]|^ (Location|HTTP)'
--2024-11-23 04:55:59-- https://cs.colostate.edu/~ct320
Resolving cs.colostate.edu (cs.colostate.edu)... 129.82.45.48
Connecting to cs.colostate.edu (cs.colostate.edu)|129.82.45.48|:443... connected.
HTTP/1.1 301 Moved Permanently
Location: https://www.cs.colostate.edu/~ct320/
--2024-11-23 04:55:59-- https://www.cs.colostate.edu/~ct320/
Resolving www.cs.colostate.edu (www.cs.colostate.edu)... 129.82.45.48
Connecting to www.cs.colostate.edu (www.cs.colostate.edu)|129.82.45.48|:443... connected.
HTTP/1.1 302 Found
Location: Fall19/
--2024-11-23 04:55:59-- https://www.cs.colostate.edu/~ct320/Fall19/
Reusing existing connection to www.cs.colostate.edu:443.
HTTP/1.1 200 OK
What’s the story with 301 vs. 302?
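Note that the second Location: header above is relative (Fall19/), so the client has to resolve it against the URL it just asked for. A minimal sketch of that bookkeeping, with follow as a made-up function name and the redirect chain supplied by hand rather than fetched, might look like this. As the code names say, 301 marks a permanent move and 302 a temporary one.

```python
from urllib.parse import urljoin

REDIRECTS = {301: "permanent", 302: "temporary"}

def follow(start, hops):
    """Resolve a chain of (status, location) redirects, starting from start.

    Returns the list of URLs visited, in order. Hypothetical helper:
    the hops are given as data, nothing is fetched.
    """
    url = start
    trail = [url]
    for status, location in hops:
        if status not in REDIRECTS:
            break
        # urljoin handles both absolute and relative Location values.
        url = urljoin(url, location)
        trail.append(url)
    return trail

# The chain from the wget output above.
trail = follow(
    "https://cs.colostate.edu/~ct320",
    [(301, "https://www.cs.colostate.edu/~ct320/"), (302, "Fall19/")],
)
for url in trail:
    print(url)
```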