Creating a CGI form in Haskell

Posted on February 25, 2018

Because I wanted to avoid making my website more complex than it needed to be, I went with a static site, as I had very little need for dynamic content. However, the one place where a simple static site doesn’t work is my Contact Me page, where I wanted a form that lets a user type in a message and have it emailed to me. In the name of keeping things simple, I decided to use CGI.

CGI

For those unfamiliar with it, CGI stands for Common Gateway Interface, and was one of the first ways of serving dynamic content via HTTP. CGI is very simple, and a typical request is as follows:

  1. The browser issues either a POST or a GET request for a page that is a CGI program.

  2. The webserver starts up the CGI program and passes it the request’s arguments: through environment variables for a GET request (the query string arrives in QUERY_STRING), or through the program’s standard input for a POST request.

  3. The program executes, generating some dynamic content or handling some data from a form.

  4. The program places its data into standard output, and the webserver serves it to the client.
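The four steps above can be sketched with nothing but the base library. This is a minimal illustration, not the program described later in this post: the names buildResponse and handleRequest are made up for the example, while REQUEST_METHOD and QUERY_STRING are the standard CGI environment variables.

```haskell
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)

-- | Build a complete CGI response: a header, a blank line, then the body.
buildResponse :: String -> String -> String
buildResponse method params =
  "Content-Type: text/plain\n\n"
    ++ "Method: " ++ method ++ "\n"
    ++ "Parameters: " ++ params ++ "\n"

-- | Read the request the way the webserver hands it over, then answer on
-- standard output. A stricter program would read exactly CONTENT_LENGTH
-- bytes for a POST instead of reading standard input to the end.
handleRequest :: IO ()
handleRequest = do
  method <- fromMaybe "GET" <$> lookupEnv "REQUEST_METHOD"
  params <- if method == "POST"
              then getContents                            -- POST: data on stdin
              else fromMaybe "" <$> lookupEnv "QUERY_STRING" -- GET: data in the environment
  putStr (buildResponse method params)
```

Compiled with main = handleRequest and placed where the webserver looks for CGI programs, this would simply echo the request back as plain text.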

The standard does not place any requirements on what language is used, merely that it supports environment variables, reading from standard input and printing to standard output. This means that the CGI program can be written in anything from Assembly, to Brainf*ck, to Lisp. In my case I chose Haskell because I was comfortable programming in it, and I knew it was a very safe language.

My Problem

For my contact page, I need the CGI script to do a few things:

  1. Get the data in the name, email, subject, and message textboxes from the HTML page.
  2. Validate that each piece of data exists, except for subject, which is optional.
  3. Sanitize the subject and email fields to prevent possible email injection (not sure if this is necessary).
  4. Send the email through sendmail.

I’ve chosen to use two libraries: cgi for handling the CGI request, and smtp-mail for sending the emails.

My Program

Let’s first start with the data types.

The Message data type defines all the data that I take from the HTML form. Notice that subject is of type Maybe String, because I don’t require there to be a subject in the form, so if the subject is there, subject is Just "Whatever the subject is", but if it’s not there, subject will be Nothing. We’ll see more of this later.
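A record matching that description might look like the following sketch. The field names are taken from the form fields listed earlier; the exact spelling, the derived instances, and the example value are assumptions.

```haskell
-- | The data collected from the contact form.
data Message = Message
  { name    :: String
  , email   :: String
  , subject :: Maybe String  -- optional, so it may be Nothing
  , message :: String
  } deriving (Show, Eq)

-- | A hypothetical message whose subject was left out.
exampleMessage :: Message
exampleMessage = Message
  { name    = "Ada"
  , email   = "ada@example.com"
  , subject = Nothing
  , message = "Hello!"
  }
```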

The getMessage Function

The first and second requirements are handled in this function:

We can see from the type that getMessage runs in the CGI monad and returns a Maybe Message. Inside, it uses the MaybeT monad transformer to handle Maybe transparently. Basically, each <- and >>= checks whether its data is Nothing; if it is, the whole computation returns Nothing immediately, otherwise one layer of Maybe is removed (so Just a becomes a) and execution continues. This is great because I don’t have to manually check whether each item is Nothing: it happens automatically.

The first 3 lines of the body are very similar. name <- ... unwraps one Maybe as described above, and (MaybeT $ getInput "field name") gets a piece of data from the HTML form and lifts it into the MaybeT monad. Finally, >>= maybeEmpty again unwraps one Maybe from the data and passes it to the maybeEmpty function, which is defined below:

Basically, maybeEmpty returns Nothing if the list (or in this case String) is empty; otherwise it returns the list wrapped in Maybe. This ensures that the user actually entered data into the name, email, and message fields: if maybeEmpty returns Nothing for any of them, the entire function returns Nothing.
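A helper with that behaviour could be written as below. The Alternative constraint is an assumption on my part: it lets the same function be used both at plain Maybe and inside MaybeT, which is how the surrounding text uses it.

```haskell
import Control.Applicative (Alternative, empty)

-- | Fail on an empty list, otherwise hand the list back unchanged.
-- At type Maybe, 'empty' is Nothing and 'pure' is Just.
maybeEmpty :: Alternative f => [a] -> f [a]
maybeEmpty [] = empty
maybeEmpty xs = pure xs
```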

Line 4 of the getMessage function is a little different, because I want to allow an empty subject to be entered. This line does not include a maybeEmpty check like the other 3, so it will only fail if the form does not contain the subject field at all. The remainder of the function assembles the pieces of data into a Message. The only notable part here is subject=(maybeEmpty subject), which converts an empty subject to Nothing so it can be replaced with a default subject later on.
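Putting the description above together, here is a sketch of the same logic that can be tried without a webserver. getMessageWith is a hypothetical variant that takes the field lookup as an argument; with the cgi library one would pass getInput in its place. The Message record and the polymorphic maybeEmpty are assumed shapes, repeated here so the example stands alone.

```haskell
import Control.Applicative (Alternative, empty)
import Control.Monad.Trans.Maybe (MaybeT (..))

data Message = Message
  { name    :: String
  , email   :: String
  , subject :: Maybe String
  , message :: String
  } deriving (Show, Eq)

-- | Fail on an empty list, otherwise return it unchanged.
maybeEmpty :: Alternative f => [a] -> f [a]
maybeEmpty [] = empty
maybeEmpty xs = pure xs

-- | Collect the form fields, giving Nothing as soon as a required field
-- is missing or empty. The lookup function is a parameter (assumption).
getMessageWith :: Monad m => (String -> m (Maybe String)) -> m (Maybe Message)
getMessageWith getField = runMaybeT $ do
  n   <- MaybeT (getField "name")    >>= maybeEmpty
  e   <- MaybeT (getField "email")   >>= maybeEmpty
  msg <- MaybeT (getField "message") >>= maybeEmpty
  s   <- MaybeT (getField "subject")  -- may be empty, but must be present
  return Message { name = n, email = e, subject = maybeEmpty s, message = msg }
```

Passing a lookup over an association list, such as \k -> return (lookup k fields), is enough to exercise every branch by hand.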

The sanitizeMessage Function

The next function handles point 3, sanitizing the input data to try and prevent email injection:

We see from the type, Message -> Message, that this function merely transforms a Message. On the first line, the function unwraps a Message into the fields n, e, s, and m for easier handling. The next line begins a let block, where several local variables are defined. The first two, emailRegex and subjectRegex, use mkRegexWithOpts to build regular expressions, which in this case match newlines.

The next line is a little complicated. The first part of it, s >>= \sub -> ..., feeds s (which is a Maybe String) to >>=, which as we learned before unwraps a Maybe and feeds it to a function. The \sub -> ... defines a lambda, an unnamed function, that takes one argument named sub. In effect, this unwraps s and, if s is not Nothing, binds its contents to sub. The rest of the line, return $ subRegex subjectRegex sub "", finds every match of subjectRegex in sub and replaces it with "" (deleting it), and finally return wraps the result of subRegex in a Maybe again.
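The unwrap-and-rewrap pattern can be tried with only the base library. In this sketch the regular expressions are swapped for a plain filter that drops carriage returns and line feeds, so it is a stand-in for the technique rather than the code described above; stripNewlines is a made-up helper and the Message record is an assumed shape.

```haskell
data Message = Message
  { name    :: String
  , email   :: String
  , subject :: Maybe String
  , message :: String
  } deriving (Show, Eq)

-- | Remove the characters that would let a field start a new header line.
stripNewlines :: String -> String
stripNewlines = filter (`notElem` "\r\n")

-- | Clean the email and subject fields; name and message pass through.
sanitizeMessage :: Message -> Message
sanitizeMessage (Message n e s m) =
  let cleanEmail   = stripNewlines e
      -- Unwrap s, clean its contents, and wrap the result again.
      -- This is equivalent to: fmap stripNewlines s
      cleanSubject = s >>= \sub -> return (stripNewlines sub)
  in Message n cleanEmail cleanSubject m
```

A subject of Nothing simply stays Nothing, because >>= never calls the lambda in that case.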

The makeAddress and buildMail Functions

Next, I move on to point 4, sending an email. Before I can get to that, though, I need a couple of functions to assemble a Message into something that can be sent over email.

This function takes a Maybe String (name) and a String (email), and returns an Address (which is defined in the smtp-mail library). Inside, the function first defines some local variables, which convert name and email from String to Text. We once again see >>= and return used to unwrap and wrap a Maybe. Finally, the function assembles textname and textemail into an Address.
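As a rough sketch of that conversion, the block below defines a local stand-in for Address so that it runs without the mail libraries installed. The stand-in mirrors the record as I understand it (an optional display name plus an email, both Text); treat the field names as an assumption and use the library's own type in real code.

```haskell
import Data.Text (Text)
import qualified Data.Text as Text

-- | Local stand-in for the mail library's Address type (assumed shape).
data Address = Address
  { addressName  :: Maybe Text
  , addressEmail :: Text
  } deriving (Show, Eq)

-- | Convert an optional name and an email from String to Text and pair them.
makeAddress :: Maybe String -> String -> Address
makeAddress name email =
  let textname  = name >>= \n -> return (Text.pack n)  -- unwrap, pack, rewrap
      textemail = Text.pack email
  in Address textname textemail
```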

buildMail uses makeAddress to convert my Message type into Mail, which is the type used by smtp-mail to send an email. We see that this function uses makeAddress to create the to and from addresses, and that the from address is hard coded to my (redacted) email address. Next, since I don’t want anyone to receive CCs or BCCs, cc and bcc are set to empty lists. The next line uses fromMaybe to take the data out of subject if it has some (is Just x), or to substitute "Website Message" if it has none, and then passes the result to Text.pack to convert it to Text. The body is packed with LText.pack (where LText is Data.Text.Lazy), and finally all the variables are assembled into a Mail by simpleMail.
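The default-subject step is easy to try in isolation. In this small sketch, subjectText is a hypothetical helper name; the fallback string is the one mentioned above.

```haskell
import Data.Maybe (fromMaybe)
import Data.Text (Text)
import qualified Data.Text as Text

-- | Use the given subject if there is one, otherwise fall back to a default.
subjectText :: Maybe String -> Text
subjectText s = Text.pack (fromMaybe "Website Message" s)
```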

sendEmail and main

Finally, we can send our email:

This function is pretty straightforward: it uses sanitizeMessage to sanitize the message, builds a Mail with buildMail, sends it with sendMail (using liftIO to lift the IO action into the CGI monad), and finally redirects the user to my homepage.

Now that we have all of the building blocks, we can finally take a look at main.

We see that main doesn’t really do much other than call cgiMain; most of the important parts are in cgiMain.

We see that cgiMain uses getMessage to get the data from my HTML form. If the result is Nothing (the user forgot something), it uses emptyFields to display an error message to the user. Otherwise, if the user filled in all the data correctly, it feeds the message to sendEmail, where the email is sanitized, built, and sent.
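The branch described above comes down to a single use of maybe. The sketch below models it with the three steps passed in as arguments, so it can be run in plain IO without the cgi library; cgiMainWith and its parameter names are hypothetical and stand in for getMessage, emptyFields, and sendEmail.

```haskell
-- | Run the "get" action, then dispatch on its result: the fallback
-- handles Nothing, the handler receives the unwrapped value.
cgiMainWith :: Monad m => m (Maybe msg) -> m r -> (msg -> m r) -> m r
cgiMainWith getMsg onEmpty onMessage = getMsg >>= maybe onEmpty onMessage
```

For instance, cgiMainWith (return Nothing) showError send runs only showError, while a Just value is handed to send.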

And that’s it! We see that the cgi library makes it relatively straightforward to create a simple CGI program that does something useful in response to an action by the user.