Update: With Jekyll & GitHub Pages' ability to serve extensionless permalinks, I've updated the article to show you how to use your own domain on top of the (ever reliable) gopkg.in - turning example.com/repo.v1 into the canonical import URL, but with gopkg.in serving the latest tag for you.
Here’s a short tutorial on how to combine GitHub Pages and gopkg.in to both version and
serve your Go libraries and projects from a vanity import. Think yourdomain.com/pkgname.v1 instead
of gopkg directly, or github.com/you/yourproject. This allows you to vendor your libraries without
having to create multiple repositories, and therefore multiple sets of documentation, issues, and
distinct/confusing import URLs.
Importantly, it works today, is maintainable, and is compatible with those vendoring your library
downstream.
Note: the go-import meta tag will be able to specify a branch/revision when issue 10913 is resolved, but users on older versions of Go won't be able to pull your package down.
Domain Setup
I’ll assume you have your own domain and know enough to point a CNAME or A record
for your domain to a host or IP. If not, GitHub’s
documentation
on this is pretty good, so reach for that if you get stuck.
Once you’ve set up the CNAME
file
for your GitHub Pages branch—usually gh-pages or the master branch if you
have a user.github.io repository—you can get started with the rest.
Creating the Import URL
Assuming a new or existing Jekyll
installation, create a new layout under
_layouts in the root of your Jekyll project:
$ vim _layouts/imports.html
Put the following template into the layout you’ve just created:
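Something like the following should do the job. Note that go_import, go_source and redirect are front-matter variable names I've chosen for this sketch; name them whatever you like, as long as the layout and your package pages agree:

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <!-- Tells 'go get' where the repository actually lives. -->
  <meta name="go-import" content="{{ page.go_import }}">
  <!-- Tells GoDoc where to fetch & link to our source code. -->
  <meta name="go-source" content="{{ page.go_source }}">
  <!-- Humans hitting this page get re-directed to the API docs. -->
  <meta http-equiv="refresh" content="0; url=https://godoc.org/{{ page.redirect }}">
</head>
<body></body>
</html>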
We've also added a 'go-source' <meta> tag so that GoDoc knows where to fetch our API documentation from, and a re-direct so that anyone hitting that page in the browser is re-directed to GoDoc. You could alternatively have this re-direct to the domain itself, the GitHub repo or nowhere at all.
We’ll also need to configure our Jekyll installation to use extensionless
permalinks in _config.yml:
$ vim _config.yml
# Site settings
title: 'your project'
tagline: 'Useful X for solving Y'
baseurl: "/"
permalink: "/:title"
Once you've done this, create a pkgname.vN.html file in the root of your Jekyll project: e.g. pkgname.v1.html (future versions would be pkgname.v2.html, and so on). GitHub Pages + Jekyll 3.x can serve this as example.com/pkgname.v1 without the extension:

$ vim pkgname.v1.html
Now you can configure the template—this is the one you’ll re-use for future
versions or other packages you create.
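Here's a sketch of what that file might contain. The go-source value follows the prefix/home/directory/file format GoDoc expects, and youruser/pkgname is a placeholder for your own repository; pointing go-import at gopkg.in is what lets it serve the latest v1 tag while your domain stays the canonical import path:

---
layout: imports
go_import: "example.com/pkgname.v1 git https://gopkg.in/youruser/pkgname.v1"
go_source: "example.com/pkgname.v1 https://github.com/youruser/pkgname https://github.com/youruser/pkgname/tree/master{/dir} https://github.com/youruser/pkgname/blob/master{/dir}/{file}#L{line}"
redirect: "gopkg.in/youruser/pkgname.v1"
---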
Commit that and push it to your GitHub Pages repository, and users can now import
your package via go get -u example.com/pkgname.v1 (and eventually, a v2!).
Canonical Import Paths
Go 1.4 also introduced canonical import
paths, which ensure that the
package is imported using the canonical path (read: our new custom domain)
instead of the underlying repository, which can cause headaches later due to
duplicate imports and/or a lack of updates (if you change the underlying repo).
Thankfully, this is easy enough to fix—add the canonical path alongside
your package declarations:
package mypkg // import "example.com/pkgname.v1"
Users can’t ‘accidentally’ import github.com/you/mypkg now (it won’t compile).
Notes
As a final note, doing all of this is much easier with a new project or library. You can
move an existing repository over to a custom domain, but existing users on the
old import path may run into issues. Make sure to document it well!
Cross-Site Request Forgery (CSRF) is probably one of the most common browser-based attacks on the web. In short, an attacker’s site ‘tricks’ a user into performing an action on your site using the user’s existing session. Often this is disguised as an innocuous-looking link/button, and without any way to validate that the request is occurring “cross-site”, a user might end up adding an attacker’s email address to their account, or transferring currency to them.
If you’re coming from a large framework in another language—you might have CSRF protection enabled by default. Since Go is a language and not a web framework, there’s a little legwork we’ll need to do to secure our own applications in the same manner. I contributed the gorilla/csrf package to the Gorilla Project (a collection of useful HTTP libs for Go), so we’ll use that.
Adding CSRF Protection
The example below provides a minimal (but practical) example of how to add CSRF protection to a Go web application:
package main

import (
	"html/template"
	"log"
	"net/http"

	// Don't forget to `go get github.com/gorilla/csrf`
	"github.com/gorilla/csrf"
	"github.com/gorilla/mux"
)

var t = template.Must(template.ParseFiles("signup_form.tmpl"))

func main() {
	r := mux.NewRouter()
	r.HandleFunc("/signup", ShowSignupForm)
	// All POST requests without a valid token will return HTTP 403 Forbidden.
	r.HandleFunc("/signup/post", SubmitSignupForm)

	CSRF := csrf.Protect([]byte("32-byte-long-auth-key"))
	// PS: Don't forget to pass csrf.Secure(false) if you're developing locally
	// over plain HTTP (just don't leave it on in production).
	log.Fatal(http.ListenAndServe(":8000", CSRF(r)))
}

func ShowSignupForm(w http.ResponseWriter, r *http.Request) {
	// signup_form.tmpl just needs a {{ .csrfField }} template tag for
	// csrf.TemplateField to inject the CSRF token into. Easy!
	t.ExecuteTemplate(w, "signup_form.tmpl", map[string]interface{}{
		csrf.TemplateTag: csrf.TemplateField(r),
	})
}

func SubmitSignupForm(w http.ResponseWriter, r *http.Request) {
	// We can trust that requests making it this far have satisfied
	// our CSRF protection requirements.
}
With the above we get:
Automatic CSRF protection on all non-idempotent requests (effectively anything that’s not a GET, HEAD, OPTIONS or TRACE)
A token available in the request context for us to inject into our responses
A useful template helper via csrf.TemplateField that replaces a {{.csrfField}} template tag with a hidden input field containing the CSRF token for you
JavaScript Clients
Alternatively, if your Go application is the backend for a React, Ember or other client-side JavaScript application, you can render the token in a <meta> tag in the head of your index.html template (i.e. the entry point of your JS application) provided your Go application is rendering it. Your JavaScript client can then get the token from this tag and return it via the X-CSRF-Token header when making AJAX requests.
Here’s a quick demo, with the HTML template representing your index.html template first:
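A minimal sketch of that <head> section (the csrfToken template variable name here is my own choice; what matters is that your Go handler renders csrf.Token(r) into the tag's content attribute):

<head>
  ...
  <!-- The name must match what the JavaScript below looks up. -->
  <meta name="gorilla.csrf.Token" content="{{ .csrfToken }}">
</head>

... and the JavaScript client code: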
// Using the https://github.com/github/fetch polyfill
fetch('/auth/login', {
  method: 'post',
  headers: {
    // Vanilla, unadorned JavaScript
    'X-CSRF-Token': document.getElementsByTagName("meta")["gorilla.csrf.Token"].getAttribute("content")
  },
  body: new FormData(loginForm)
})
If you’re using jQuery, an AJAX prefilter is well-suited to this task—pass the header name & token to xhr.setRequestHeader inside your prefilter to automatically add the CSRF token header to every request.
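As a rough sketch, assuming you've already read the token out of the <meta> tag shown above:

// Register a prefilter that adds the CSRF header to every AJAX request.
var csrfToken = document.getElementsByTagName("meta")["gorilla.csrf.Token"].getAttribute("content");

$.ajaxPrefilter(function(options, originalOptions, jqXHR) {
  jqXHR.setRequestHeader("X-CSRF-Token", csrfToken);
});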
How Does It Work?
The CSRF prevention approach used here is kept simple, and uses the proven double-submitted cookie method. This is similar to the approach used by Django and Rails, and relies on comparing the cookie value with the submitted form value (or HTTP header value, in the case of AJAX).
gorilla/csrf also attempts to mitigate the BREACH attack (in short: detecting secrets through HTTP compression) by randomizing the CSRF token in the response.
This is done by XOR'ing the CSRF token against a randomly generated nonce (via Go's crypto/rand) and creating a 'masked' token that effectively changes on every request. The nonce is then appended to the XOR output - e.g. $maskedtoken$nonce - and used to XOR (reverse) the masked token on the next request. Since the underlying token used for comparison (stored in a signed cookie) doesn't change, this approach doesn't break the user experience across multiple tabs.
An XOR operation is used over a hash function or AES primarily for performance, but also because the mitigation is provided by making our secrets unpredictable across requests.
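To make that concrete, here's a simplified sketch of the masking scheme. This is not the exact gorilla/csrf code (read that on GitHub); it assumes crypto/rand is imported:

// mask XORs the real token against a fresh random nonce and returns
// nonce||masked, so the value sent to the client differs on every request.
func mask(realToken []byte) ([]byte, error) {
	nonce := make([]byte, len(realToken))
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	masked := make([]byte, len(realToken))
	for i := range realToken {
		masked[i] = realToken[i] ^ nonce[i]
	}
	return append(nonce, masked...), nil
}

// unmask splits nonce||masked and reverses the XOR to recover the real
// token for comparison against the (unchanging) cookie value.
func unmask(issued []byte) []byte {
	n := len(issued) / 2
	nonce, masked := issued[:n], issued[n:]
	realToken := make([]byte, n)
	for i := range masked {
		realToken[i] = masked[i] ^ nonce[i]
	}
	return realToken
}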
context.Context is quickly becoming the de-facto way to pass request-scoped values in Go HTTP applications and is likely to be incorporated into net/http with Go 1.7, but there’s little stopping you from adopting it now.
Buffers are extremely useful in Go and I’ve written a little about them before.
Part of that has been around rendering HTML templates: ExecuteTemplate returns an error, but if you’ve passed it your http.ResponseWriter it’s too late to do anything about the error. The response is gone and you end up with a malformed page. You might also use a buffer when creating a json.NewEncoder for encoding (marshalling) into before writing out bytes to the wire—another case where you want to catch the error before writing to the response.
Here’s a quick refresher:
buf := new(bytes.Buffer)
// Write to the buffer first so we can catch the error
err := template.ExecuteTemplate(buf, "forms/create.html", user)
// or err := json.NewEncoder(buf).Encode(value)
if err != nil {
	return err
}

buf.WriteTo(w)
In this case (and the JSON case) however, we’re creating and then implicitly throwing away a temporary buffer when the function exits. This is wasteful, and because we need a buffer on every request, we’re just adding an increasing amount of garbage collector (GC) pressure by generating garbage that we might be able to avoid.
So we use a buffer pool—otherwise known as a free list or leaky buffer—that maintains a pool of buffers that we get and put from as needed. The pool will attempt to issue an existing buffer (if one exists) else it will create one for us, and it will optionally discard any buffers after the pool reaches a certain size to keep it from growing unbounded. This has some clear benefits, including:
Trading some additional static memory usage (pre-allocation) in exchange for reduced pressure on the garbage collector (GC)
Reducing ad-hoc makeSlice calls (and some CPU hits as a result) from re-sizing fresh buffers on a regular basis—the buffers going back into the pool have already been grown
So a buffer pool definitely has its uses. We could implement this by creating a chan of bytes.Buffers that we Get() and Put() from/to. We also set the size on the channel, allowing us to discard excess buffers when our channel is full, avoiding repeated busy-periods from blowing up our memory usage. We’re also still free to issue additional buffers beyond the size of the pool (when business is good), knowing that they’ll be dropped when the pool is full. This is simple to implement and already nets us some clear benefits over throwing away a buffer on every request.
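A minimal sketch of that simple pool, assuming nothing beyond the standard library (this mirrors the basic pool in oxtoacart/bpool, but treat it as illustrative rather than the exact code):

package bpool

import "bytes"

// BufferPool is a bounded free list of buffers.
type BufferPool struct {
	c chan *bytes.Buffer
}

// NewBufferPool creates a pool that retains up to size buffers.
func NewBufferPool(size int) *BufferPool {
	return &BufferPool{c: make(chan *bytes.Buffer, size)}
}

// Get returns a pooled buffer, or a new one if the pool is empty.
func (bp *BufferPool) Get() *bytes.Buffer {
	select {
	case b := <-bp.c:
		return b
	default:
		return new(bytes.Buffer)
	}
}

// Put resets the buffer and returns it to the pool, discarding it
// if the pool is already full.
func (bp *BufferPool) Put(b *bytes.Buffer) {
	b.Reset()
	select {
	case bp.c <- b:
	default:
	}
}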
Enter SizedBufferPool
But there's a slight quirk: if we do have the odd "large" response, that (now large) buffer returns to the pool and the extra memory we allocated for it isn't released until that particular buffer is dropped. A buffer is only dropped when it's returned to an already-full pool, which may rarely happen if our concurrent demand for buffers never exceeds the pool size. Over enough requests, we're likely to end up with a number of buffers in the pool sized for our largest responses, consuming (wasting) additional memory as a result.
Further, all of our initial (“cold”) buffers might require a few rounds of makeSlice to resize (via copying) into a final buffer large enough to hold our content. It’d be nice if we could avoid this as well by setting the capacity of our buffers on creation, making the memory usage of our application over time more consistent. The typical response size across requests within a web service is unlikely to vary wildly in size either, so “pre-warming” our buffers is a useful trick.
Let’s see how we can address these concerns—which is thankfully pretty straightforward:
package bpool

import "bytes"

type SizedBufferPool struct {
	c chan *bytes.Buffer
	a int
}

// NewSizedBufferPool creates a new BufferPool bounded to the given size.
// size defines the number of buffers to be retained in the pool and alloc sets
// the initial capacity of new buffers to minimize calls to make().
func NewSizedBufferPool(size int, alloc int) (bp *SizedBufferPool) {
	return &SizedBufferPool{
		c: make(chan *bytes.Buffer, size),
		a: alloc,
	}
}

// Get gets a Buffer from the SizedBufferPool, or creates a new one if none are
// available in the pool. Buffers have a pre-allocated capacity.
func (bp *SizedBufferPool) Get() (b *bytes.Buffer) {
	select {
	case b = <-bp.c:
		// reuse existing buffer
	default:
		// create new buffer
		b = bytes.NewBuffer(make([]byte, 0, bp.a))
	}
	return
}

// Put returns the given Buffer to the SizedBufferPool.
func (bp *SizedBufferPool) Put(b *bytes.Buffer) {
	b.Reset()

	// Release buffers over our maximum capacity and re-create a pre-sized
	// buffer to replace it.
	if cap(b.Bytes()) > bp.a {
		b = bytes.NewBuffer(make([]byte, 0, bp.a))
	}

	select {
	case bp.c <- b:
	default:
		// Discard the buffer if the pool is full.
	}
}
This isn’t a significant deviation from the simple implementation and is the code I pushed to the (ever-useful) oxtoacart/bpool package on GitHub.
We create buffers as needed (providing one from the pool first), except we now pre-allocate buffer capacity based on the alloc param we provided when we created the pool.
When a buffer is returned via Put we reset it (discard the contents) and then check the capacity.
If the buffer capacity has grown beyond our defined maximum, we discard the buffer itself and re-create a new buffer in its place before returning that to the pool. If not, the reset buffer is recycled into the pool.
Note: dominikh pointed out a new buffer.Cap() method coming in Go 1.5, which is different from calling cap(b.Bytes()). The latter returns the capacity of the unread portion (see this CL) of the buffer's underlying slice, which may not be the total capacity if you've read from it during its lifetime. This doesn't affect our implementation, however, as we call b.Reset() (which resets the read offset) before we check the capacity, which means we get the "correct" (full) capacity of the underlying slice.
Setting the Right Buffer Size
What would be especially nice is if we could pre-set the size of our buffers to represent our real-world usage so we’re not just estimating it.
So: how do we determine what our usage is? If you have test data that's representative of your production data, a simple approach might be to collect the buffer sizes used throughout the application (i.e. the typical HTTP response body) and calculate an appropriate size.
Approaches to this would include:
Measuring the (e.g.) 80th percentile Content-Length header across your application. This solution can be automated by hitting your routes with a http.Client and analysing the results from resp.Header.Get("Content-Length").
Instrumenting your application and measuring the capacity of your buffers before returning them to the pool, as sketched below. Set your starting capacity to a low value, then call buf.Reset() and cap(buf.Bytes()) as we did in the example above. Write the output to a log file (simple) or aggregate the values into a structure safe for concurrent writes that can be analysed later.
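Here's a rough sketch of that instrumentation as a logging variant of Put from the pool above (assumes "log" is imported; the over-capacity discard is elided while measuring):

// Put resets the returned buffer and logs its full capacity before
// recycling it, so we can analyse the distribution offline.
func (bp *SizedBufferPool) Put(b *bytes.Buffer) {
	b.Reset()
	// Reset rewinds the read offset, so cap(b.Bytes()) reports the
	// full capacity of the underlying slice here.
	log.Printf("buffer cap: %d", cap(b.Bytes()))
	select {
	case bp.c <- b:
	default:
	}
}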
Determining whether to set the value as the average (influenced by outliers), median or an upper percentile will depend on the architecture of your application and the memory characteristics you’re after. Too low and you’ll increase GC pressure by discarding a greater number of buffers, but too high and you’ll increase static memory usage.
Postscript
We now have an approach to more consistent memory use that we can take home with us and use across our applications.
You can import and use the SizedBufferPool from the oxtoacart/bpool package (mentioned previously). Just go get -u github.com/oxtoacart/bpool and then call bufPool := bpool.NewSizedBufferPool(x, y) to create a new pool. Make sure to measure the size of the objects you’re storing into the pool to help guide the per-buffer capacity.
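In a handler, usage looks something like the below. The pool sizes are made-up numbers (measure your own), and tmpl and data stand in for your own template instance and view data:

// e.g. retain up to 64 buffers, each pre-allocated at 16KB.
var bufPool = bpool.NewSizedBufferPool(64, 16*1024)

func someHandler(w http.ResponseWriter, r *http.Request) {
	buf := bufPool.Get()
	defer bufPool.Put(buf)

	// Render into the buffer first so we can still catch errors.
	if err := tmpl.ExecuteTemplate(buf, "index.html", data); err != nil {
		http.Error(w, http.StatusText(http.StatusInternalServerError),
			http.StatusInternalServerError)
		return
	}

	buf.WriteTo(w)
}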
Worth reading is CloudFlare’s “Recycling Memory Buffers in Go” article that talks about an alternative approach to re-usable buffer pools.
It's also worth mentioning Go's own sync.Pool type that landed in Go 1.3, which is a building block for creating your own pool. The difference is that it handles dynamic resizing of the pool (rather than having you define a size) and discards objects between GC runs.
In contrast, the buffer pool in this article retains objects and functions as a free list and explicitly zeroes (resets) the contents of each buffer (meaning they are safe to use upon issue), as well as discarding those that have grown too large. There’s a solid discussion on the go-nuts list about sync.Pool that covers some of the implementation quirks.
I wrote an article a while back on implementing custom handler types to avoid a few common problems with the existing http.HandlerFunc—the func MyHandler(w http.ResponseWriter, r *http.Request) signature you often see. It’s a useful “general purpose” handler type that covers the basics, but—as with anything generic—there are a few shortcomings:
Having to remember to explicitly call a naked return when you want to stop processing in the handler. This is a common case when you want to raise a re-direct (301/302), not found (404) or internal server error (500) status. Failing to do so can be the cause of subtle bugs (the function will continue) and because the function signature doesn’t require a return value, the compiler won’t alert you.
You can’t easily pass in additional arguments (i.e. database pools, configuration values). You end up having to either use a bunch of globals (not terrible, but tracking them can scale poorly) or stash those things into a request context and then type assert each of them out. Can be clunky.
You end up repeating yourself. Want to log the error returned by your DB package? You can either call log.Printf in your database package (in each query func), or in every handler when an error is returned. It’d be great if your handlers could just return that to a function that centrally logs errors and raise a HTTP 500 on the ones that call for it.
My previous approach used the func(http.ResponseWriter, *http.Request) (int, error) signature. This has proven to be pretty neat, but a quirk is that returning “non error” status codes like 200, 302, 303 was often superfluous—you’re either setting it elsewhere or it’s effectively unused - e.g.
func SomeHandler(w http.ResponseWriter, r *http.Request) (int, error) {
	db, err := someDBcall()
	if err != nil {
		// This makes sense.
		return 500, err
	}

	if user.LoggedIn {
		http.Redirect(w, r, "/dashboard", 302)
		// Superfluous! Our http.Redirect function handles the 302, not
		// our return value (which is effectively ignored).
		return 302, nil
	}

	// ... and we still owe the compiler a final (equally superfluous) return.
	return 200, nil
}
It’s not terrible, but we can do better.
A Little Different
So how can we improve on this? Let’s lay out some code:
package handler

import (
	"database/sql"
	"log"
	"net/http"
)

// Error represents a handler error. It provides methods for a HTTP status
// code and embeds the built-in error interface.
type Error interface {
	error
	Status() int
}

// StatusError represents an error with an associated HTTP status code.
type StatusError struct {
	Code int
	Err  error
}

// Allows StatusError to satisfy the error interface.
func (se StatusError) Error() string {
	return se.Err.Error()
}

// Returns our HTTP status code.
func (se StatusError) Status() int {
	return se.Code
}

// A (simple) example of our application-wide configuration.
type Env struct {
	DB   *sql.DB
	Port string
	Host string
}

// The Handler struct that takes a configured Env and a function matching
// our useful signature.
type Handler struct {
	*Env
	H func(e *Env, w http.ResponseWriter, r *http.Request) error
}

// ServeHTTP allows our Handler type to satisfy http.Handler.
func (h Handler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	err := h.H(h.Env, w, r)
	if err != nil {
		switch e := err.(type) {
		case Error:
			// We can retrieve the status here and write out a specific
			// HTTP status code.
			log.Printf("HTTP %d - %s", e.Status(), e)
			http.Error(w, e.Error(), e.Status())
		default:
			// Any error types we don't specifically look out for default
			// to serving a HTTP 500
			http.Error(w, http.StatusText(http.StatusInternalServerError),
				http.StatusInternalServerError)
		}
	}
}
The code above should be self-explanatory, but to clarify any outstanding points:
We create a custom Error type (an interface) that embeds Go’s built-in error interface and also has a Status() int method.
We provide a simple StatusError type (a struct) that satisfies our handler.Error type. Our StatusError type accepts a HTTP status code (an int) and an error that allows us to wrap the root cause for logging/inspection.
Our ServeHTTP method contains a type switch—which is the e := err.(type) part that tests for the errors we care about and allows us to handle those specific cases. In our example that’s just the handler.Error type. Other error types—be they from other packages (e.g. net.Error) or additional error types we have defined—can also be inspected (if we care about their details).
If we don’t want to inspect them, our default case catches them. Remember that the ServeHTTP method allows our Handler type to satisfy the http.Handler interface and be used anywhere http.Handler is accepted: Go’s net/http package and all good third party frameworks. This is what makes custom handler types so useful: they’re flexible about where they can be used.
Note that the net package does something very similar. It has a net.Error interface that embeds the built-in error interface and then a handful of concrete types that implement it. Functions return the concrete type that suits the type of error they’re returning (a DNS error, a parsing error, etc). A good example would be defining a DBError type with a Query() string method in a ‘datastore’ package that we can use to log failed queries.
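That hypothetical datastore error might look like this (illustrative only; the interface and method names are my own):

// DBError lets callers retrieve the query that failed, mirroring how
// net.Error extends the built-in error interface with extra methods.
type DBError interface {
	error
	Query() string
}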
Full Example
What does the end result look like? And how would we split it up into packages (sensibly)?
package handler

import (
	"database/sql"
	"fmt"
	"log"
	"net/http"
)

// Error represents a handler error. It provides methods for a HTTP status
// code and embeds the built-in error interface.
type Error interface {
	error
	Status() int
}

// StatusError represents an error with an associated HTTP status code.
type StatusError struct {
	Code int
	Err  error
}

// Allows StatusError to satisfy the error interface.
func (se StatusError) Error() string {
	return se.Err.Error()
}

// Returns our HTTP status code.
func (se StatusError) Status() int {
	return se.Code
}

// A (simple) example of our application-wide configuration.
type Env struct {
	DB   *sql.DB
	Port string
	Host string
}

// The Handler struct that takes a configured Env and a function matching
// our useful signature.
type Handler struct {
	*Env
	H func(e *Env, w http.ResponseWriter, r *http.Request) error
}

// ServeHTTP allows our Handler type to satisfy http.Handler.
func (h Handler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	err := h.H(h.Env, w, r)
	if err != nil {
		switch e := err.(type) {
		case Error:
			// We can retrieve the status here and write out a specific
			// HTTP status code.
			log.Printf("HTTP %d - %s", e.Status(), e)
			http.Error(w, e.Error(), e.Status())
		default:
			// Any error types we don't specifically look out for default
			// to serving a HTTP 500
			http.Error(w, http.StatusText(http.StatusInternalServerError),
				http.StatusInternalServerError)
		}
	}
}

func GetIndex(env *Env, w http.ResponseWriter, r *http.Request) error {
	// GetAllUsers is a stand-in for your own query method/helper.
	users, err := env.DB.GetAllUsers()
	if err != nil {
		// We return a status error here, which conveniently wraps the error
		// returned from our DB queries. We can clearly define which errors
		// are worth raising a HTTP 500 over vs. which might just be a HTTP
		// 404, 403 or 401 (as appropriate). It's also clear where our
		// handler should stop processing by returning early.
		return StatusError{500, err}
	}

	fmt.Fprintf(w, "%+v", users)
	return nil
}
… and in our main package:
package main

import (
	"database/sql"
	"log"
	"net/http"
	"os"

	"github.com/you/somepkg/handler"
)

func main() {
	// sql.Open takes a driver name and a connection string; substitute
	// your own driver here.
	db, err := sql.Open("driver-name", "connectionstringhere")
	if err != nil {
		log.Fatal(err)
	}

	// Initialise our app-wide environment with the services/info we need.
	env := &handler.Env{
		DB:   db,
		Port: os.Getenv("PORT"),
		Host: os.Getenv("HOST"),
		// We might also have a custom log.Logger, our
		// template instance, and a config struct as fields
		// in our Env struct.
	}

	// Note that we're using http.Handle, not http.HandleFunc. The
	// latter only accepts the http.HandlerFunc type, which is not
	// what we have here.
	http.Handle("/", handler.Handler{env, handler.GetIndex})

	// Logs the error if ListenAndServe fails.
	log.Fatal(http.ListenAndServe(":8000", nil))
}
In the real world, you're likely to define your Handler and Env types in a separate file (of the same package) from your handler functions, but I've kept it simple here for the sake of brevity. So what did we end up getting from this?
A practical Handler type that satisfies http.Handler and can therefore be used with net/http, gorilla/mux, Goji and any other framework that sensibly accepts a http.Handler type.
Clear, centralised error handling. We inspect the errors we want to handle
specifically—our handler.Error type—and fall back to a default
for generic errors. If you’re interested in better error handling practices in Go,
read Dave Cheney’s blog post,
which dives into defining package-level Error interfaces.
A useful application-wide “environment” via our Env type. We don’t have to
scatter a bunch of globals across our applications: instead we define them in
one place and pass them explicitly to our handlers.
If you have questions about the post, drop me a line via @elithrar on Twitter,
or the Gopher community on Slack.
simple-scrypt is a convenience wrapper around Go's existing scrypt library. The existing library has a limited API and doesn't facilitate generating salts, comparing keys or retrieving the parameters used to generate a key. The last point is a limitation of the scrypt specification, which doesn't enforce this by default. Using Go's bcrypt library as inspiration, I pulled together a more complete scrypt package with some "sane defaults" to make it easier to get started. The library also lets you supply your own parameters via the scrypt.Params type, and the public API should be rock solid (I'm planning to tag a v1.0 very soon).
scrypt itself, for those that don’t know, is a memory-hard key derivation function (KDF) entirely suitable for deriving strong keys from ‘weak’ input (i.e. user passwords). It’s often described as a way to ‘hash’ passwords, but unlike traditional hashes (SHA-1, the SHA-2 family, etc.) that are designed to be fast, it’s designed to be “configurably” slow. This makes it ideal for storing user passwords in a way that makes it very hard to brute force or generate rainbow tables against.
Here’s an example of how to get started with it for deriving strong keys from user passwords (e.g. via a web form):
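A minimal example, adapted from the package README (the password value is obviously a stand-in for real form input):

package main

import (
	"fmt"
	"log"

	"github.com/elithrar/simple-scrypt"
)

func main() {
	// e.g. r.PostFormValue("password")
	passwordFromForm := "prew8fid9hick6c"

	// scrypt.DefaultParams provides sane defaults; you can supply your
	// own scrypt.Params if you've benchmarked your hardware.
	hash, err := scrypt.GenerateFromPassword([]byte(passwordFromForm), scrypt.DefaultParams)
	if err != nil {
		log.Fatal(err)
	}

	// The derived key, with its parameters and salt prepended, ready to store.
	fmt.Printf("%s\n", hash)
}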
The package also provides functions to compare a password with an existing key using scrypt.CompareHashAndPassword and to retrieve the parameters used in a previously generated key via scrypt.Cost. The latter is designed to make it easy to upgrade parameters as hardware improves.
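As a quick sketch of those two functions, using the hash from the previous example:

// Returns nil on success, or an error if the password doesn't match.
err := scrypt.CompareHashAndPassword(hash, []byte(passwordFromForm))

// Retrieve the parameters used to generate an existing key, e.g. to
// decide whether it should be re-generated with stronger parameters.
params, err := scrypt.Cost(hash)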
Pull requests are welcome, and I have a few things on the to-do list to make it configurable based on hardware performance.