Working with Forms

HTML forms are used to get data from the user. Forms can use both the GET and POST methods for transferring data to the server, but it is recommended to use HTTP POST method just because it doesn't highlight data in the URL and because of the many things we discussed in the chapter Web Programming Basics.

Forms are rendered inside templates, which we'll see in a later chapter. As of now we want to understand how to get data from the end user by using forms.

There are two parts of working with forms, the HTML part and the Go part. The HTML page gets the data and sends it to the server as a POST/GET and the Go part will parse the form to do some task like letting the user to login or inserting data in the database.

Each form element has a name which is referenced in the server side part of the form, below we have the file upload and a drop down list. Both of which have a unique name.

<form action="/add/" method="POST">
    <div class="form-group">
        <input type="text" name="title" class="form-control" id="add-note-title" placeholder="Title" 
        style="border:none;border-bottom:1px solid gray; box-shadow:none;">
    </div>
    <div class="form-group">
        <textarea class="form-control" name="content" id="add-note-content" placeholder="Content" 
        rows="10" style="border:none;border-bottom:1px solid gray; box-shadow:none;"></textarea>
    File: <input    type="file"    name="uploadfile"    /> <br>
    Priority: <select name="priority">
        <option>---</option>
        <option value="3">High</option>
        <option value="2">Medium</option>
        <option value="1">Low</option>
        </select>
    </div>
    </div>
    <div class="modal-footer">
        <button type="button" class="btn btn-default" data-dismiss="modal">Close</button>
        <input type="submit" text="submit" class="btn btn-default" />
    </div>
</form>

When we are populating this HTML page, we can also populate it dynamically, without getting bogged down by syntax, ignore templating for the while We have a variable called Navigation and one called Categories, we loop through Categories and if the Navigation is equal to that category then the checked value is true.

Category:
    <select name="category" class="dropdown">
        <option>---</option>
        {{$navigation := .Navigation}} {{$categories := .Categories}}
        {{range $cat := $categories}}
        <option value="{{$cat.Name}}" {{if eq $cat.Name $navigation }} checked="checked" {{end}}> {{$cat.Name}} </option>
        {{end}}
    </select>

Here we have used a drop down box, you can use radio buttons like below

<input type="radio" name="gender" value="1">Female
<input type="radio" name="gender" value="2">Male
<input type="radio" name="gender" value="3">Other

Or checkboxes

<input type="checkbox" name="gender" value="female">Female
<input type="checkbox" name="gender" value="male">Male
<input type="checkbox" name="gender" value="other">Other

We get the value of a drop down box/radio button/checkbox on the server side by using the name field like below:

value := r.Form.Get("gender")
value := r.FormValue("gender")

As we saw earlier, our webservers basically take a HTTP Request object and return an HTTP Response Object, below is an example of a sample HTTP Request object.

The host is the IP address sending the req, User Agent: fingerprint of the machine, the Accept- fields define various parts like the language, encoding Referer is which IP made the call, Cookie is the value of the cookie stored on the system and Connection is the type of connection.

In this request snapshot, we also have a file which we upload, for file upload, the content type is multipart/form-data

Request Header

Host: 127.0.0.1:8081
User-Agent: ...
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
DNT: 1
Referer: http://127.0.0.1:8081/
Cookie: csrftoken=abcd
Connection: keep-alive

Request Body

Content-Type: multipart/form-data; 
boundary=---------------------------6299264802312704731507948053
Content-Length: 15031

-----------------------------6299264802312704731507948053
Content-Disposition: form-data; name="title"

working with forms
-----------------------------6299264802312704731507948053
Content-Disposition: form-data; name="CSRFToken"

abcd
-----------------------------6299264802312704731507948053
Content-Disposition: form-data; name="content"

finish the chapter working with forms
-----------------------------6299264802312704731507948053
Content-Disposition: form-data; name="uploadfile"; filename="2.4workingwithform.md"
Content-Type: text/x-markdown

--file content--
-----------------------------6299264802312704731507948053
Content-Disposition: form-data; name="priority"

3
-----------------------------6299264802312704731507948053--

If you had wondered how Google's home page shows a pop up when you visit google.com on IE of Firefox, it checks your User-Agent. The thing with HTTP request is that they can be modified to any extent, the Chrome developer tools gives you quite sophisticated tools to modify your requests, even User Agent spoofing is a default feature avaiable, this feature is available so we can test our webapps in one window simulating many internet browsers at one go.

The basic part of working with forms is to identify which user that particular form belongs to, there are ways to attain that, we can either have a stateful or a stateless web server.

A stateless server doesn't store sessions, it requires an authentication key for each request while a stateful server stores sessions. For storing sessions a cookie is used, which is a file which is stored in the private memory of the web browser which we use. Only the website which created the cookie can access the cookie, no third party websites can access the cookies, but the OS user can read/edit/delete cookies using the web browser.

CSRF

CSRF stands for Cross Request Site Forgery. Any website can send a POST request to your web server, who sent the request can be found in the Referer field of your HTTP response. It is a form of confused deputy attack in which the deputy is your web browser. A malicious user doesn't have direct access to your website, so it makes use of your browser to send a malicious request. Typically cookies enable your browser to authenticate itself to a webserver, so what these malicious websites do is, they send in a HTTP request on behalf of your browser.

We can thwart this attack by restricting the referer to your own domain, but it is quite easy to manipulate the misspelt referer field of a HTTP request.

Another way is to use tokens. While rendering our form, we send in a hidden field with crypto generated string of 256 characters, so when we process the POST request, we first check if the token is valid or not and then decide if the data came from a genuine source or from a malicious source. It doesn't have to be malicious actually, even if a legitimate user tried to trick your webserver into accepting data, we shouldn't entertain it.

To check the csrf token, we serve the token to the form and store it in a cookie, when we get the POST request, we check if both are equal or not. This is because a malicious person might trick a user to click on a form but they can't set cookies for your webapplication.

A point to note here is that never trust user data. Always clean/sanitize data which you get from the end user.

Javascript

If you are serious about web development, you ought to learn Javascript in detail. While building a web app, there will be times when you would want to improve the UI of your application, which would mean a change in the html page. Using JS is inevitable while building beautiful webapps, while adding some new html feature, open the "web inspector" present in the developer tools and dynamically add the html code. The web inspector allows us to manipulate the CSS and HTML part. Now open the javascript console, that enables you to test the JS feature which you are willing to add. For instance, in the tasks application, there was no provision to expand/contract the size of the task, so I added a button in the web inspector,

<button class="toggle"></button>

In the JS console, to toggle the visibility of my .noteContent field, I did this:

$('.toggle').next().toggle()

This proved that it works, so now go to your template and actually add the code. Make sure the html is correct because while running, the html files are parsed once, so for any html change, you have to run the web app for each HTML change. So once the html is set up, if you change the JS/CSS then you just have to refresh the page, because the html page gets the JS/CSS each time the page is loaded.

As we saw in the above paragraph, for preventing CSRF, we need to generate a token and send as a hidden field in the form and store it in a cookie, when we get the POST request from the

Forms in Go

In the below function we set the cookie, we first generate a CSRF token, which'll be unique for each HTTP request which we get and store it in a cookie.

//ShowAllTasksFunc is used to handle the "/" URL which is the default ons
func ShowAllTasksFunc(w http.ResponseWriter, r *http.Request) {
    if r.Method == "GET" {
        context := db.GetTasks("pending") //true when you want non deleted notes
        if message != "" {
            context.Message = message
        }
        context.CSRFToken = "abcd"
        message = ""
        expiration := time.Now().Add(365 * 24 * time.Hour)
        cookie    :=    http.Cookie{Name: "csrftoken",Value:"abcd",Expires:expiration}
        http.SetCookie(w, &cookie)
        homeTemplate.Execute(w, context)
    } else {
        message = "Method not allowed"
        http.Redirect(w, r, "/", http.StatusFound)
    }
}

The below handler handles the POST request sent by our form, it fetches the value of the csrftoken cookie and gets the value of the hidden CSRFToken field of the add task form. If the value of the cookie is equal to the value fetched by the form, then we allow it to go to the database.

The call to ParseForm will parse the contents of the form into Gettable fields which we can fetch using the Get function. This call is compulsory.

    //AddTaskFunc is used to handle the addition of new task, "/add" URL
    func AddTaskFunc(w http.ResponseWriter, r *http.Request) {
        if r.Method == "POST" { 
            r.ParseForm()
            file, handler, err := r.FormFile("uploadfile")
            if err != nil {
                log.Println(err)
            }

            taskPriority, priorityErr := strconv.Atoi(r.FormValue("priority"))
            if priorityErr != nil {
                log.Print("unable to convert priority to integer")
            }
            priorityList := []int{1, 2, 3}
            for _, priority := range priorityList {
                if taskPriority != priority {
                    log.Println("incorrect priority sent")
                    //might want to log as security incident
                    taskPriority=1 //this defaults the priority to low
                }
            }
            title := template.HTMLEscapeString(r.Form.Get("title"))
            content := template.HTMLEscapeString(r.Form.Get("content"))
            formToken := template.HTMLEscapeString(r.Form.Get("CSRFToken"))

            cookie, _ := r.Cookie("csrftoken")
            if formToken == cookie.Value {
                if handler != nil {
                    r.ParseMultipartForm(32 << 20) //defined maximum size of file
                    defer file.Close()
                    f, err := os.OpenFile("./files/"+handler.Filename, os.O_WRONLY|os.O_CREATE, 0666)
                    if err != nil {
                        log.Println(err)
                        return
                    }
                    defer f.Close()
                    io.Copy(f, file)
                    filelink := 
                    "<br> <a href=/files/" + handler.Filename + ">" + handler.Filename + "</a>"
                    content = content + filelink
                }

                truth := db.AddTask(title, content, taskPriority)
                if truth != nil {
                    message = "Error adding task"
                    log.Println("error adding task to db")
                } else {
                    message = "Task added"
                    log.Println("added task to db")
                }
                http.Redirect(w, r, "/", http.StatusFound)
            } else {
                log.Fatal("CSRF mismatch")
            }

        } else {
            message = "Method not allowed"
            http.Redirect(w, r, "/", http.StatusFound)
        }
    }

Note Cookies

Cookie is a way to store data on the browser, HTTP is a stateless protocol, it wasn't built for sessions, basically the Internet itself wasn't built considering security in mind since initially it was just a way to share documents online, hence HTTP is stateless, when the web server receives requests, it can't distinguish between two consequitive requests, hence the concept of cookies were added, thus while starting a session, we generate a session ID and store it on the database in memory or on the database and we store the same ID on a cookie on the web browser and we validate both of them to authenticate them. We have to note that, if we set an expiry date for a cookie, then it is stored on the filesystem otherwise it is stored in memory in the browser. In the incognito mode, this is the case, all the cookies are stored in memory and not in the filesystem.

HTTP Request

Host: localhost:8080
User-Agent: .....
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
DNT: 1
Referer: http://localhost:8080/
Cookie: csrftoken=abcd
Connection: keep-alive
Cache-Control: max-age=0

The browser, while sending a response appends all the cookie data stored in its memory or file along with the other aspects of the HTTP request so we can access the cookie as r.Cookie, it'll contain every cookie for that particular domain, we'd then loop through it to fetch the data which we want.

This is important for security reasons, suppose I set a cookie on my imaginedomain.com and if it contains a csrf token and if that cookie is accessible to some other webapp then it is horrible since they can masquerade as any legitmate user. But this ain't possible, since a website can only access the cookie stored for its own domain, plus we have the feature to set HTTP only value of a cookie as true, which says that even javascript can't access the cookies.

HTTP Response

Content-Type: text/html; charset=utf-8
Date: Tue, 12 Jan 2016 16:43:53 GMT
Set-Cookie: csrftoken=abcd; Expires=Wed, 11 Jan 2017 16:43:53 GMT
Transfer-Encoding: chunked

When we set cookies, we write then to the HTTP response which we send to the client and the browser reads the Cookie information and stores them in the memory or the filesystem depending on the Expires field of the response.

From the go documentation:

type Cookie struct {
    Name  string
    Value string

    Path       string    // optional
    Domain     string    // optional
    Expires    time.Time // optional
    RawExpires string    // for reading cookies only

    // MaxAge=0 means no 'Max-Age' attribute specified.
    // MaxAge<0 means delete cookie now, equivalently 'Max-Age: 0'
    // MaxAge>0 means Max-Age attribute present and given in seconds
    MaxAge   int
    Secure   bool
    HttpOnly bool
    Raw      string
    Unparsed []string // Raw text of unparsed attribute-value pairs
}

Input Validation

The basic aspect of web applications is that no data can be trusted, even if the user isn't malicious herself, there are many ways to trick the browser into sending HTTP requests and fooling the web server to respond to what seems like a legitimate request. Hence we have to verify everything that comes from the user.

One might argue here that we do all sorts of validation using javascript, but there are ways to evade JS validations, the simplest way is to disable JS and more sophisticated ways are to manipulate the HTTP request before the browser sends it, it is literally trivial when we use just the web developer tools that these days browsers provide.

if formToken == cookie.Value and title != nil and content!=nil

The title and content of the task is mandatory, hence we added the not nil part.

We do input validation when we don't want junk data to go into our database, suppose the user hit the submit button twice, or some other scenario. But we also have to consider the case where the user wants to run some script on our website, which is dangerous, so we use the template.HTMLEscapeString method to escape what might run in memory of the current browser session. Even the data which comes from your drop down list should be validated.

Everything in a web form is sent via a Request as we saw in the above example, we use a drop down list when we have are expecting a particular set of inputs but for someone who knows what firefox dev tools are, can easily modify anything in the request, for example we have the priority as drop down list we might think that since we have only three entries in the drop down list, should we keep a check in our view handler to not accept anything but the three values which are present in our template. We ought to keep a check like below:

priorityList := []int{1, 2, 3}
for _, priority := range priorityList {
    if taskPriority != priority {
        log.Println("incorrect priority sent")
        //might want to log as security incident
    }
}

This is the priority field of our request

    Content-Disposition: form-data; name="priority"
        3

All a malicious user has to do is change this value and resend the request, they can literally insert any value here, just think what if they send rm -fr *.* and accidentally enough this command is executed on our server. This also brings in another aspect of security, never run your machine in root mode, always keep the root mode for admin tasks and use a non root mode. Even if that isn't the case, a less dangerous example will be sending a huge number with the request, assuming that we have used integer as our variable, the program might crash if it has to handle a number beyond its storage capacity. This might be termed as a denial of service attack.

Links

-Previous section -Next section

Form handling