servers

This week we’ll learn about servers. What are they? How do you make one? Is everything really just a file?

intro: media archaeology

What did everyone do? Find anything weird or interesting?

reading questions:

do you think companies should be held responsible for what gets hosted on their servers?
should we have systems that can censor things online, or should the the internet be free from external control?
after reading these articles, what mental image of a server are you left with?

lecture: the ‘back end’

What is a server?

servers are just other computers – this is not only kind of neat, but it implies, by proxy, that your computer can be a server. It can! We’ll make it into one later in this class.

When we talk about servers in the context of this class, we’re talking about web servers, aka HTTP servers. These are computers configured to handle HTTP requests, returning HTML and the other stuff that’s linked to webpages, like images and Javascript, to the client’s browser. They might also be able to generate new HTML pages dynamically using scripting languages like PHP, but we’ll get to that in a second.

Servers are the backbone of what we think of as the internet. Any webpage that you navigate to online is being stored on a server, somewhere (and increasingly, with cloud computing, stored on a bunch of different servers).

everything is a file

One of the web precursors that never really took off was a project at Bell Labs called plan 9. The central idea was that every computer in the building was part of a distributed operating system, where computation-intensive processes were handled on a central CPU, and everyone was part of the same distributed filesystem.

One of the more interesting server artworks is a piece by net artists Eva and Franco Mattes called life sharing, where they gave anyone who wanted it access to their computer, from 2000-2003.

ports

All servers operate through ports. We mentioned these last week, but they’re worth revisiting. Whenever you make a request to another computer, that request gets made through what’s called a port. Different ports are used for different kinds of requests. Each port has a number assigned to it. The first 1000 ports on your computer are reserved for specific processes, like email traffic or HTTP requests.

Some ports on your computer are physical: you connect to them using USBs, or serial devices. The ones we’ll be talking to today live on your network card.

If you’re using your computer to test out a web server, you can use a higher-numbered port for the server to listen on, and receive HTTP requests. So long as you configure it as an HTTP server, it’ll still be able to listen for them, even though it’s not listening on port 80. It doesn’t matter which one you use, so long as the port number is >1000 (as ports with smaller numbers than that are reserved.) So, if I wanted to serve 3 websites from my computer, I could serve one on port 3000, one on port 4000, and one on port 5000…. and if you went to my computer’s IP address and looked at the 3 different ports, you’d see a different thing every time.

firewalls 🔥

Firewalls are a safety measure used to prevent malicious web traffic from coming into a computer’s ports. ‘firewalls’ were originally walls used to confine fires between buildings, and in general they’re used to prevent the spread of things like viruses and malware.

The simplest kind of firewall is a packet filter, which blocks information from coming into a computer through ports unless it follows particular rules.

These have been around since the late 1980s, and are part of a long, exciting history of adversarial computing. One common firewall technique on large systems (like school computing systems) is something called ‘Network Address Translation’ or NAT. This modifies how IP addresses are formatted in a network, hiding lots of IP addresses behind a larger one, and is part of the reason why just tracing an ip address doesn’t always connect us to a website.

nodeJS

It used to be that Javascript only worked ‘client-side’: that is, you could only use it while linked in webpages. In 2009, NodeJS was released, which allowed people to execute Javascript code outside of a browser. Most excitingly, it allows you to write a server just using Javascript.

We’re going to use NodeJS today to make our servers.

the node package manager

One of the cool things about node is that, because of its support for modules, all sorts of people have written all kinds of cool plugins for it. What’s nice is that it means you can use a lot of other people’s code without having to write your own (neat) though it does mean that a lot of web projects now get kind of bulky, cause they use so much of other people’s Javascript. These modules, called ‘packages’ are added to projects using some software called npm. It’s worth installing it right when you first start using it – it takes a little longer but it’ll save a lot of hassle later on.

Becuase of the number of different packages people use, there’s now a manager for the node package manager (lol) called yarn. For ages I thought this was kind of silly, but it does make your life a lot easier for managing large projects.

middleware

Middleware in this context is the stuff that sits between a client and the server, and makes it easy to pass things between them. It’s also known as ‘software glue’, and is generally pretty useful for interoperability between different types of code.

The middleware used most commonly for NodeJS apps is called expressJS, and it’s pretty neat and easy to use! These handle HTTP requests and responses in a way that makes sense to Javascript code, and saves us having to write a lot of stuff by ourselves. We’ll take a look at it in a minute, but check out this tutorial to get started.

server-side scripting

Say you’re Mark Zuckerberg in 2006, and you’re in charge of a website where people can make personal accounts. Naively, the simplest way to handle someone getting a new account would be for them to email you with what they want on their page, and you make a new HTML page for them. This, for obvious reasons, doesn’t really scale (and is pretty cumbersome even for a few people). The way you solve this problem (and Facebook continues to to this day!) is by using a ‘server-side scripting language’ like PHP, which can dynamically generate new webpages from a template.

PHP stands for ‘PHP: Hypertext Processor’ (episode 100000 of computer people love dumb acronyms). Now, when someone makes an account on your website, a page gets generated for them. Using PHP, this page can be changed and updated by the user, without them having to edit a single line of code. Instead, they just fill in ‘forms’, which add information to a database.

If you don’t want to use PHP, there are actually a bunch of different ways to solve this problem. NodeJS didn’t exist when Facebook first got made, but has pretty elegant support for server-side scripting, and importantly lets you use Javascript all the way down.

hosting servers/CDNs

These days, if you want to set up a web server, chances are you’re doing it using a hosting service. The main ones are Heroku and Amazon Web Services, both of which use an ‘elastic’ model: you pay for how much bandwith you use up, and that can expand and contract.

Paul Ford, in the reading about tilde.club, was using AWS to host a ‘Virtual Machine’, which contains all the tilde.club accounts. The architecture is super simple: it’s just a bunch of HTML files on a computer in the cloud: whenever someone makes a request to the tilde.club server, it sends back the page you went to.

However, the things you ‘need’ a dedicated server for these days are getting fewer. In part, this is due to the proliferation of ‘serverless’ hosting services.

going serverless

Services like Netlify and Github Pages call themselves ‘serverless’ – giving users the ability to host static webpages without . Netlify actually has support for more dynamic use, including running Node servers, and React apps, both of which can do things like dynamic page generation, which before you’d have needed something like PHP to do. The main limitations of these services is that they don’t store data: so, if you’re running a website with lots of users updating content, you still probably want to use a database.

However, for sites with a smallish amount of data, you can still make things work. As an example of what you can do without a server I’ll go through the architecture of my studio’s website.

What’s important to remember with these ‘serverless’ solutions is that they still do, definitely, involve servers. You just don’t configure them. This is normally a bit of a blessing (servers are hard), but it does mean you have a lot less control over the service itself.

tutorial: making a server!

We’re going to all make simple HTTP servers together, from scratch! First up, we need to check we have node installed. To do this, type ‘node’ into an open terminal window. You should see something like:

Welcome to Node.js v13.6.0.
Type ".help" for more information.
>

If you don’t see that, log onto one of the classroom computers to complete the rest of the tutorial.

Now, in the terminal window, make a new folder, called server, and cd inside of it:

mkdir server
cd server/

Create a new file called ‘server.js’, and open it using a text editor.

touch server.js
open server.js

To create a very simple server, paste the following in server.js:

var http = require('http');
 
function requestHandler(req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('<h1>hello world!</h1>');
  console.log('requested main page');
}
 
http.createServer(requestHandler).listen(8080, function(){
    console.log("server listening on port 8080");
});

To run this server, go back to the terminal and type:

node server.js

After you do this, the following message should appear in your terminal:

server listening on port 8080

This:

imports a node module called HTTP, which lets us do HTTP stuff
creates a http server using http.createServer()
sets it up to listen for anything trying to connect to port 8080 on your computer
creates a ‘requestHandler’: every time the server recieves a request (‘req’) – that’s anything that goes to port 8080 of your computer, it needs to come out with a response.

To check this works, we can go to localhost:8080 and check it out! Try editing the HTML to create a different page!

more complex: handling http endpoints

In order to show different pages depending on the url in our address bar, we want to use something called ‘routing’. For this, we’ll want the ‘express’ middleware I mentioned earlier. To use express, you’ll want to download it first using the ‘node package manager’.

First we need to set up our node environment. To do this, type:

npm init

into the command line, and press enter through each question. This will set up the project Now, type:

npm install express

and you should see a neat little loading bar, and the package has installed.

If you type ls into your command line, you should now see two new things: a folder called node_modules, and a file called package.json (it might also have made a file called package-lock.json). These are used for managing the project, and we don’t need to worry about them right now.

Now we can update our server file with a new HTTP endpoint:

var express = require('express');
var app = express();
var path = require('path');

app.get('/', function(req, res) {
	console.log('requested main page');
    res.end('<h1>hello world!</h1>');
});

app.get('/about', function(req, res) {
	console.log('requested about page');
    res.end('<h1>About Page</h1><p>boo boo be doo</p>');
});

app.listen(8080);

Now, every time I go to the main webpage, I see the same thing as before. But if I go to localhost:8080/about, I get a different page! cool! That’s because the express middleware can act as a router, directing us to different webpages on the same site. We can add as many of these as we like!

looking at each other’s pages !!! the fun bit !!! For this part, we all need to do three things:

make sure your computer is connected to HunterNet. We all need to be on the same network for this to work properly.
find our IP addresses once connected to HunterNet
add all our IP addresses to this google sheet

Now, try typing in someone else’s ip address, followed by the port ‘8080’. What do you see?

How to find my IP address?
If you’re using a mac, copy and paste this line into your terminal:

ipconfig getifaddr en0

If you’re using windows, try this:

‘select the Start button , start typing View network connections, and then select it in the list’
Select the HunterNet network connection, and then, in the toolbar, select View status of this connection. (You might need to select the chevron icon to find this command.)
Select Details.
Your PC’s IP address appears in the Value column, next to IPv4 Address.

serving a HTML file What if we don’t want to write out the whole of our HTML file in the request handler, and instead want to serve it as a separate file? Make a new HTML file in the same repository with:

touch index.html

Then, in the handler where you’d like to display the file, replace res.end() with res.sendfile("index.html"), where ‘index.html’ here is the filepath. (if it was in a folder called HTML, you’d use res.sendfile("html/index.html"))

var express = require('express');
var app = express();
var path = require('path');

app.get('/', function(req, res) {
	console.log('requested main page');
  	res.sendfile("index.html");
});

app.get('/about', function(req, res) {
	console.log('requested about page');
    res.end('<h1>About Page</h1><p>boo boo be doo</p>');
});

app.listen(8080);

adding a post request wow, last bit! I doubt we’ll get to these in class but this is where it gets really interesting.

Add this to the code in your JS file.

app.use(express.json())

app.post('/login', function(req,res){
  var user=req.body.user;
  var password=req.body.password;
  console.log("User name = "+user+", password is "+password);
  res.end(JSON.stringify("got username and password"));
});

This will, when a POST request is made to the ‘/login’ endpoint, print the username and password submitted through this request to the command line. To test this out, we can use cURL:

curl --header "Content-Type: application/json" \
  --request POST \
  --data '{"user":"abc","password":"xyz"}' \
  "http://localhost:8080/login"

On the command line, you should see printed:

user is abc, password is xyz

Change your ‘index.html’ page to let the user make a POST request by submitting a form:

<html>
  <head>
    <title>login page</title>
    <script>

    function postData(){
          const data = { 
            user: document.getElementById("user").value,
            password: document.getElementById("password").value 
          };

          fetch('http://localhost:8080/login', {
            method: 'POST',
            headers: {
              'Content-Type': 'application/json',
            },
            body: JSON.stringify(data),
          })
          .then((response) => response.json())
          .then((data) => {
            console.log('Success:', data);
          })
        }
    </script>
  </head>
  <body>
    <h1>post requests!</h1>
    <input type="TEXT" id="user" size="40"><br>
    <input type="password" id="password" size="40"><br>
    <input type="button" onclick="postData()" value="Submit">
  </body>
</html>

We use fetch to handle the post request. So, now, we have a server that hosts a webpage: and when someone goes to the webpage, we can take in input data from them! If we wanted to do something cool, we could use that data to generate a whole new page, or change what the old page would look like.

obvious limitations to these simple servers, and their solutions:

this only works on local wifi! this can be solved temporarily using a service like ngrok, which configures an outward
when i shut my laptop, the server turns off! to have a dedicated web server, you want a machine that’s on and connected all the time. This is part of the reason why cloud services and CDNs are so popular, cause you don’t need to worry about someone at Google accidentally unplugging the machine hosting your site. To do this at home, though, is also pretty fine: get an old computer and install Linux on it, then plug it in in the corner of a room with a big ‘do not touch’ sign on it.
If you’re making a performance web server, chances are you’re not writing it yourself, as there’s a lot of really complex things it has to do to perform consistently under different conditions. Popular web servers are things like Nginx (pronounced ‘engine X’, not ‘n-ginks’), and Apache. These boast all kinds of fancy features that you’d struggle to write by hand, but are possibly overkill for smaller projects like the ones we’ll do in class.
8080 isn’t actually the HTTP port? If your server is just there to serve one site, servers like Nginx/Apache will serve it out of the default HTTP port, port 80. However, this requires elevated priveleges.

assignment:

due 03/08
Develop the server you made in class to handle both a GET request and a POST request.

reading

Karly Wildenhaus Towards a Library Without Walls
Daniel Romein (Bellingcat) Europol’s Asian City Child Abuse Photographs Geolocated