Using databases in Go
Go doesn't support any database out of the box. Instead, it provides the database/sql interface, which database driver authors implement; because every driver satisfies the same interface, the database libraries stay compatible with one another.
We will use SQLite in this book.
Creating and configuring the database
Before we can build the front end, we need the back end ready. Below are the DDL statements that create our tables and the DML that fills them with dummy data.
Run the insert statements so that the ShowAllTasks function, which we wrote in the previous chapter, has data to read.
--user
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE user (
    id integer primary key autoincrement,
    username varchar(100),
    password varchar(1000),
    email varchar(100)
);
INSERT INTO "user" VALUES(1,'suraj','suraj','[email protected]');
--category
CREATE TABLE category(
    id integer primary key autoincrement,
    name varchar(1000) not null,
    user_id references user(id)
);
INSERT INTO "category" VALUES(1,'TaskApp',1);
--status
CREATE TABLE status (
    id integer primary key autoincrement,
    status varchar(50) not null
);
INSERT INTO "status" VALUES(1,'COMPLETE');
INSERT INTO "status" VALUES(2,'PENDING');
INSERT INTO "status" VALUES(3,'DELETED');
--task
CREATE TABLE task (
    id integer primary key autoincrement,
    title varchar(100),
    content text,
    created_date timestamp,
    last_modified_at timestamp,
    finish_date timestamp,
    priority integer,
    cat_id references category(id),
    task_status_id references status(id),
    due_date timestamp,
    user_id references user(id),
    hide int,
    is_deleted char(1) default 'N'
);
INSERT INTO "task" VALUES(1,'Publish on github','Publish the source of tasks and picsort on github','2015-11-12 15:30:59','2015-11-21 14:19:22','2015-11-17 17:02:18',3,1,1,NULL,1,0,'N');
INSERT INTO "task" VALUES(4,'gofmtall','The idea is to run gofmt -w file.go on every go file in the listing, *Edit turns out this is difficult to do in golang **Edit barely a 3 line bash script.','2015-11-12 16:58:31','2015-11-14 10:42:14','2015-11-13 13:16:48',3,1,1,NULL,1,0,'N');
--comments
CREATE TABLE comments(
    id integer primary key autoincrement,
    content text,
    taskID references task(id),
    created datetime,
    user_id references user(id)
);
--files
CREATE TABLE files(
    name varchar(1000) not null,
    autoName varchar(255) not null,
    user_id references user(id),
    created_date timestamp
);
COMMIT;
Installing the sqlite driver
We'll use the go-sqlite3 driver created by mattn, because it implements the database/sql interface. The advantage of using a library built on database/sql is that any such library is interchangeable with any other library implementing the same interface.
Type this in your terminal:
go get -u "github.com/mattn/go-sqlite3"
Accessing the database
To access databases in Go, you use a sql.DB. You use this type to create statements and transactions, execute queries, and fetch results.
The first thing you should know is that a sql.DB isn’t a database connection. It also doesn’t map to any particular database software’s notion of a “database” or “schema.” It’s an abstraction of the interface and existence of a database, which might be as varied as a local file, accessed through a network connection, or in-memory and in-process.
The sql.DB performs some important tasks for you behind the scenes:
- It opens and closes connections to the actual underlying database, via the driver.
- It manages a pool of connections as needed.
The sql.DB abstraction is designed to keep you from worrying about how to manage concurrent access to the underlying datastore. A connection is marked in-use when you use it to perform a task, and then returned to the available pool when it’s not in use anymore. One consequence of this is that if you fail to release connections back to the pool, you can cause database/sql to open a lot of connections, potentially running out of resources (too many connections, too many open file handles, lack of available network ports, etc.). We’ll discuss more about this later.
After creating a sql.DB, you can use it to query the database that it represents, as well as creating statements and transactions.
Importing the driver
To use database/sql you’ll need the package itself, as well as a driver for the specific database you want to use.
You generally shouldn’t use driver packages directly, although some drivers encourage you to do so. (In our opinion, it’s usually a bad idea.) Instead, your code should only refer to types defined in database/sql, if possible. This helps avoid making your code dependent on the driver, so that you can change the underlying driver (and thus the database you’re accessing) with minimal code changes. It also forces you to use the Go idioms instead of ad-hoc idioms that a particular driver author may have provided.
Since we use SQLite in this book, we'll import the go-sqlite3 driver. (The widely used MySQL drivers from @julienschmidt and @arnehormann are imported in exactly the same way.)
Add the following to the top of your Go source file:
import (
    "database/sql"
    _ "github.com/mattn/go-sqlite3"
)
Notice that we’re loading the driver anonymously, aliasing its package qualifier to _ so none of its exported names are visible to our code. Under the hood, the driver registers itself as being available to the database/sql package, but in general nothing else happens.
Now you’re ready to access a database.
Every database has its own connection mechanism: a file path for SQLite, an IP address and port for MySQL or PostgreSQL.
Retrieving Result Sets
There are several idiomatic operations to retrieve results from the datastore.
- Execute a query that returns rows.
- Prepare a statement for repeated use, execute it multiple times, and destroy it.
- Execute a statement in a once-off fashion, without preparing it for repeated use.
- Execute a query that returns a single row. There is a shortcut for this special case.
Go’s database/sql function names are significant. If a function name includes Query, it is designed to ask a question of the database and will return a set of rows, even if it’s empty. Statements that don’t return rows should not use Query functions; they should use Exec().
Fetching Data from the Database
Let’s take a look at an example of how to query the database and work with the results. We’ll query the task table for all pending tasks, and print each task’s id, title, content, and creation date. We will assign results to variables, a row at a time, with rows.Scan().
getTaskSQL := `select id, title, content, created_date from task
    where finish_date is null and is_deleted='N' order by created_date asc`
rows, err := database.Query(getTaskSQL)
if err != nil {
    log.Println(err)
}
defer rows.Close()
for rows.Next() {
    err := rows.Scan(&TaskID, &TaskTitle, &TaskContent, &TaskCreated)
    if err != nil {
        log.Println(err)
    }
    TaskContent = strings.Replace(TaskContent, "\n", "<br>", -1)
    fmt.Println(TaskID, TaskTitle, TaskContent, TaskCreated)
}
taskSQL := "delete from task"
stmt, err := database.Prepare(taskSQL)
if err != nil {
    log.Println(err)
}
tx, err := database.Begin()
if err != nil {
    log.Println(err)
}
_, err = tx.Stmt(stmt).Exec()
if err != nil {
    tx.Rollback()
} else {
    tx.Commit()
}
Defer keyword
We use defer inside a function to schedule cleanup work, like closing a file.
package main

import (
    "bufio"
    "fmt"
    "os"
)

func main() {
    file, err := os.Open("file.dat")
    if err != nil {
        fmt.Println("File doesn't exist or you don't have read permission")
        return
    }
    defer file.Close()
    inputReader := bufio.NewReader(file)
    _ = inputReader // do something with inputReader
}
The defer statement pushes the function call onto a stack, and the deferred calls run when the surrounding function returns. One has to be careful when using defer; it can cause difficult-to-find bugs.
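The ordering can be sketched like this: deferred calls run in last-in, first-out order when the surrounding function returns, which is why a defer file.Close() written right after a successful Open runs at the very end.

```go
package main

import "fmt"

// run records the order in which statements execute. Deferred calls
// are pushed onto a stack, so they run in reverse order of registration.
func run() (events []string) {
	defer func() { events = append(events, "first defer") }()  // registered first, runs last
	defer func() { events = append(events, "second defer") }() // registered second, runs first
	events = append(events, "body")
	return
}

func main() {
	fmt.Println(run()) // [body second defer first defer]
}
```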
file ~/main/main.go
Find and fix the bug:
package main

import (
    "database/sql"
    "fmt"
    "log"
    "strings"
    "time"

    _ "github.com/mattn/go-sqlite3"
)

var database *sql.DB

func init() {
    defer database.Close()
    var err error
    database, err = sql.Open("sqlite3", "./tasks.db")
    if err != nil {
        fmt.Println(err)
    }
}

//intentional bug exists, fix it
func main() {
    var TaskID int
    var TaskTitle, TaskContent string
    var TaskCreated time.Time
    getTaskSQL := `select id, title, content, created_date from task
        where finish_date is null and is_deleted='N' order by created_date asc`
    rows, err := database.Query(getTaskSQL)
    if err != nil {
        fmt.Println(err)
    }
    defer rows.Close()
    for rows.Next() {
        err := rows.Scan(&TaskID, &TaskTitle, &TaskContent, &TaskCreated)
        if err != nil {
            fmt.Println(err)
        }
        TaskContent = strings.Replace(TaskContent, "\n", "<br>", -1)
        fmt.Println(TaskID, TaskTitle, TaskContent, TaskCreated)
    }
    err = rows.Err()
    if err != nil {
        log.Fatal(err)
    }
}
Always defer rows.Close() to free the database connection back to the pool. As long as rows holds the result set, the underlying database connection is in use and not available in the connection pool. When rows.Next() reaches the end of the records, it encounters an internal EOF and calls rows.Close() for you; Close() can be called multiple times without side effects.
Here’s what’s happening in the above code:
- We’re using db.Query() to send the query to the database. We check the error, as usual.
- We defer rows.Close(). This is very important.
- We iterate over the rows with rows.Next().
- We read the columns in each row into variables with rows.Scan().
- We check for errors after we’re done iterating over the rows.
This is pretty much the only way to do it in Go. You can’t get a row as a map, for example. That’s because everything is strongly typed. You need to create variables of the correct type and pass pointers to them, as shown.
A couple parts of this are easy to get wrong, and can have bad consequences.
- You should always check for an error at the end of the for rows.Next() loop. If there’s an error during the loop, you need to know about it. Don’t just assume that the loop iterates until you’ve processed all the rows.
- Second, as long as there’s an open result set (represented by rows), the underlying connection is busy and can’t be used for any other query. That means it’s not available in the connection pool. If you iterate over all of the rows with rows.Next(), eventually you’ll read the last row, and rows.Next() will encounter an internal EOF error and call rows.Close() for you. But if for some reason you exit that loop – an early return, or so on – then the rows doesn’t get closed, and the connection remains open. (It is auto-closed if rows.Next() returns false due to an error, though). This is an easy way to run out of resources.
- rows.Close() is a harmless no-op if it’s already closed, so you can call it multiple times. Notice, however, that we check the error first, and only call rows.Close() if there isn’t an error, in order to avoid a runtime panic.
- You should always defer rows.Close(), even if you also call rows.Close() explicitly at the end of the loop, which isn’t a bad idea.
- Don’t defer within a loop. A deferred statement doesn’t get executed until the function exits, so a long-running function shouldn’t use it. If you do, you will slowly accumulate memory. If you are repeatedly querying and consuming result sets within a loop, you should explicitly call rows.Close() when you’re done with each result, and not use defer.
How Scan() Works
When you iterate over rows and scan them into destination variables, Go performs data type conversions for you behind the scenes, based on the type of the destination variable. Being aware of this can clean up your code and help avoid repetitive work.
For example, suppose you select some rows from a table that is defined with string columns, such as VARCHAR(45) or similar. You happen to know, however, that the table always contains numbers. If you pass a pointer to a string, Go will copy the bytes into the string. Now you can use strconv.ParseInt() or similar to convert the value to a number. You’ll have to check for errors in the SQL operations, as well as errors parsing the integer. This is messy and tedious.
Or, you can just pass Scan() a pointer to an integer. Go will detect that and call strconv.ParseInt() for you. If there’s an error in conversion, the call to Scan() will return it. Your code is neater and smaller now. This is the recommended way to use database/sql.
Preparing Queries
You should, in general, always prepare queries to be used multiple times. The result of preparing the query is a prepared statement, which can have placeholders (a.k.a. bind values) for parameters that you’ll provide when you execute the statement. This is much better than concatenating strings, for all the usual reasons (avoiding SQL injection attacks, for example).
In MySQL, the parameter placeholder is ?, and in PostgreSQL it is $N, where N is a number. SQLite accepts either of these. In Oracle, placeholders begin with a colon and are named, like :param1. We'll use ?, which both SQLite and MySQL accept.
stmt, err := db.Prepare(`select id, title, content, created_date from task
    where finish_date is null and is_deleted='N' and user_id=?`)
if err != nil {
log.Fatal(err)
}
defer stmt.Close()
rows, err := stmt.Query(1)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
for rows.Next() {
// ...
}
if err = rows.Err(); err != nil {
log.Fatal(err)
}
Under the hood, db.Query() actually prepares, executes, and closes a prepared statement. That’s three round-trips to the database. If you’re not careful, you can triple the number of database interactions your application makes! Some drivers can avoid this in specific cases, but not all drivers do. See prepared statements for more.
Single-Row Queries
If a query returns at most one row, you can use a shortcut around some of the lengthy boilerplate code:
var taskDescription string
query := "select content from task where id = ?"
err = db.QueryRow(query, 1).Scan(&taskDescription)
if err != nil {
    log.Fatal(err)
}
fmt.Println(taskDescription)
Errors from the query are deferred until Scan() is called, and then are returned from that. You can also call QueryRow() on a prepared statement:
query := "select content from task where id = ?"
stmt, err := db.Prepare(query)
if err != nil {
    log.Fatal(err)
}
defer stmt.Close()
var taskDescription string
err = stmt.QueryRow(1).Scan(&taskDescription)
if err != nil {
log.Fatal(err)
}
fmt.Println(taskDescription)
Modifying Data and Using Transactions
Now we’re ready to see how to modify data and work with transactions. The distinction might seem artificial if you’re used to programming languages that use a “statement” object for fetching rows as well as updating data, but in Go, there’s an important reason for the difference.
Statements that Modify Data
Use Exec(), preferably with a prepared statement, to accomplish an INSERT, UPDATE, DELETE, or other statement that doesn’t return rows. The following example shows how to insert a row and inspect metadata about the operation:
stmt, err := db.Prepare("INSERT INTO user(username, password, email) VALUES(?,?,?)")
if err != nil {
    log.Fatal(err)
}
defer stmt.Close()
res, err := stmt.Exec("Sherlock", "notaSmartPassword", "[email protected]")
if err != nil {
log.Fatal(err)
}
lastId, err := res.LastInsertId()
if err != nil {
log.Fatal(err)
}
rowCnt, err := res.RowsAffected()
if err != nil {
log.Fatal(err)
}
log.Printf("ID = %d, affected = %d\n", lastId, rowCnt)
Executing the statement produces a sql.Result that gives access to statement metadata: the last inserted ID and the number of rows affected.
What if you don’t care about the result? What if you just want to execute a statement and check if there were any errors, but ignore the result? Wouldn’t the following two statements do the same thing?
_, err := db.Exec("DELETE FROM users") // OK
_, err := db.Query("DELETE FROM users") // BAD
The answer is no. They do not do the same thing, and you should never use Query() like this. The Query() call will return a sql.Rows, which reserves a database connection until the sql.Rows is closed. Since there might be unread data (e.g. more data rows), the connection cannot be used. In the example above, the connection will never be released again. The garbage collector will eventually close the underlying net.Conn for you, but this might take a long time. Moreover, the database/sql package keeps track of the connection in its pool, hoping that you release it at some point so that the connection can be used again. This anti-pattern is therefore a good way to run out of resources (too many connections, for example).
Working with Transactions
In Go, a transaction is essentially an object that reserves a connection to the datastore. It lets you do all of the operations we’ve seen thus far, but guarantees that they’ll be executed on the same connection.
You begin a transaction with a call to db.Begin(), and close it with a Commit() or Rollback() method on the resulting Tx variable. Under the covers, the Tx gets a connection from the pool, and reserves it for use only with that transaction. The methods on the Tx map one-for-one to methods you can call on the database itself, such as Query() and so forth.
Prepared statements that are created in a transaction are bound exclusively to that transaction. See prepared statements for more.
You should not mingle the use of transaction-related functions such as Begin() and Commit() with SQL statements such as BEGIN and COMMIT in your SQL code. Bad things might result:
- The Tx objects could remain open, reserving a connection from the pool and not returning it.
- The state of the database could get out of sync with the state of the Go variables representing it.
- You could believe you’re executing queries on a single connection, inside of a transaction, when in reality Go has created several connections for you invisibly and some statements aren’t part of the transaction.
While you are working inside a transaction, you should be careful not to make calls to the db variable. Make all of your calls to the Tx variable that you created with db.Begin(). db is not in a transaction, only the Tx is. If you make further calls to db.Exec() or similar, those will happen outside the scope of your transaction, on other connections.
If you need to work with multiple statements that modify connection state, you need a Tx even if you don’t want a transaction per se. For example:
- Creating temporary tables, which are only visible to one connection.
- Setting variables, such as MySQL’s SET @var := somevalue syntax.
- Changing connection options, such as character sets or timeouts.
If you need to do any of these things, you need to bind your activity to a single connection, and the only way to do that in Go is to use a Tx.
Below is an example of using a transaction:
file db/db.go
//RestoreTask is used to restore tasks from the Trash
func RestoreTask(id int) error {
query := "update task set is_deleted='N',last_modified_at=datetime() where id=?"
restoreSQL, err := database.Prepare(query)
if err != nil {
    return err
}
tx, err := database.Begin()
if err != nil {
    return err
}
_, err = tx.Stmt(restoreSQL).Exec(id)
if err != nil {
fmt.Println("doing rollback")
tx.Rollback()
} else {
tx.Commit()
}
return err
}
Using Prepared Statements
Prepared statements have all the usual benefits in Go: security, efficiency, convenience. But the way they're implemented is a little different from what you might be used to, especially with regard to how they interact with some of the internals of database/sql.
Prepared Statements And Connections
At the database level, a prepared statement is bound to a single database connection. The typical flow is that the client sends a SQL statement with placeholders to the server for preparation, the server responds with a statement ID, and then the client executes the statement by sending its ID and parameters.
In Go, however, connections are not exposed directly to the user of the database/sql package. You don't prepare a statement on a connection; you prepare it on a DB or a Tx. And database/sql has some convenience behaviors, such as automatic retries. For these reasons, the underlying association between prepared statements and connections, which exists at the driver level, is hidden from your code.
Here's how it works:
- When you prepare a statement, it's prepared on a connection in the pool.
- The Stmt object remembers which connection was used.
- When you execute the Stmt, it tries to use that connection. If it's not available because it's closed or busy doing something else, it gets another connection from the pool and re-prepares the statement with the database on that connection.
Because statements will be re-prepared as needed when their original connection is busy, it's possible for high-concurrency usage of the database, which may keep a lot of connections busy, to create a large number of prepared statements. This can result in apparent leaks of statements, statements being prepared and re-prepared more often than you think, and even running into server-side limits on the number of statements.
Avoiding Prepared Statements
Go creates prepared statements for you under the covers. A simple db.Query(sql, param1, param2), for example, works by preparing the sql, then executing it with the parameters, and finally closing the statement.
Sometimes a prepared statement is not what you want, however. There might be several reasons for this:
- The database doesn't support prepared statements. When using the MySQL driver, for example, you can connect to MemSQL and Sphinx, because they support the MySQL wire protocol. But they don't support the "binary" protocol that includes prepared statements, so they can fail in confusing ways.
- The statements aren't reused enough to make them worthwhile, and security issues are handled in other ways, so performance overhead is undesired. An example of this can be seen at the VividCortex blog.
If you don't want to use a prepared statement, you need to use fmt.Sprint() or similar to assemble the SQL, and pass this as the only argument to db.Query() or db.QueryRow(). Your driver also needs to support plaintext query execution, which was added in Go 1.1 via the Execer and Queryer interfaces.
Prepared Statements in Transactions
Prepared statements that are created in a Tx are bound exclusively to it, so the earlier cautions about re-preparing do not apply. When you operate on a Tx object, your actions map directly to the one and only connection underlying it.

This also means that prepared statements created inside a Tx can't be used separately from it. Likewise, prepared statements created on a DB can't be used within a transaction, because they will be bound to a different connection.

To use a prepared statement prepared outside the transaction in a Tx, you can use Tx.Stmt(), which will create a new transaction-specific statement from the one prepared outside the transaction. It does this by taking an existing prepared statement, setting the connection to that of the transaction, and re-preparing the statement every time it is executed. This behavior and its implementation are undesirable; there's even a TODO in the database/sql source code to improve it. We advise against using this.
Caution must be exercised when working with prepared statements in transactions. Consider the following example:
tx, err := db.Begin()
if err != nil {
log.Fatal(err)
}
defer tx.Rollback()
stmt, err := tx.Prepare("INSERT INTO foo VALUES (?)")
if err != nil {
log.Fatal(err)
}
defer stmt.Close() // danger!
for i := 0; i < 10; i++ {
_, err = stmt.Exec(i)
if err != nil {
log.Fatal(err)
}
}
err = tx.Commit()
if err != nil {
log.Fatal(err)
}
// stmt.Close() runs here!
Before Go 1.4, closing a *sql.Tx released the connection associated with it back into the pool, but the deferred call to Close on the prepared statement was executed after that had happened, which could lead to concurrent access to the underlying connection, rendering the connection state inconsistent. If you use Go 1.3 or older, you should make sure the statement is always closed before the transaction is committed or rolled back. This issue was fixed in Go 1.4 by CR 131650043.
Parameter Placeholder Syntax
The syntax for placeholder parameters in prepared statements is database-specific. For example, comparing MySQL, PostgreSQL, and Oracle:
MySQL PostgreSQL Oracle
===== ========== ======
WHERE col = ? WHERE col = $1 WHERE col = :col
VALUES(?, ?, ?) VALUES($1, $2, $3) VALUES(:val1, :val2, :val3)
Handling Errors
Almost all operations with database/sql types return an error as the last value. You should always check these errors; never ignore them.
There are a few places where error behavior is special-case, or there's something additional you might need to know.
Errors From Iterating Resultsets
Consider the following code:
for rows.Next() {
// ...
}
if err = rows.Err(); err != nil {
// handle the error here
}
The error from rows.Err() could be the result of a variety of errors in the rows.Next() loop. The loop might exit for some reason other than finishing normally, so you always need to check whether the loop terminated normally or not. An abnormal termination automatically calls rows.Close(), although it's harmless to call it multiple times.
Errors From Closing Resultsets
You should always explicitly close a sql.Rows if you exit the loop prematurely, as previously mentioned. It's auto-closed if the loop exits normally or through an error, but you might mistakenly do this:
for rows.Next() {
// ...
break // whoops, rows is not closed! the connection stays reserved...
}
// do the usual "if err = rows.Err()" [omitted here]...
// it's always safe to [re?]close here:
if err = rows.Close(); err != nil {
// but what should we do if there's an error?
log.Println(err)
}
The error returned by rows.Close() is the only exception to the general rule that it's best to capture and check for errors in all database operations. If rows.Close() returns an error, it's unclear what you should do. Logging the error message or panicking might be the only sensible options, and if that's not sensible, then perhaps you should just ignore the error.
Errors From QueryRow()
Consider the following code to fetch a single row:
var name string
err = db.QueryRow("select name from users where id = ?", 1).Scan(&name)
if err != nil {
log.Fatal(err)
}
fmt.Println(name)
What if there was no user with id = 1? Then there would be no row in the result, and .Scan() would not scan a value into name. What happens then?
Go defines a special error constant, sql.ErrNoRows, which is returned from QueryRow() when the result is empty. This needs to be handled as a special case in most circumstances. An empty result is often not considered an error by application code, and if you don't check whether an error is this special constant, you'll cause application-code errors you didn't expect.
Errors from the query are deferred until Scan() is called, and then are returned from that. The above code is better written like this instead:
var name string
err = db.QueryRow("select name from users where id = ?", 1).Scan(&name)
if err != nil {
if err == sql.ErrNoRows {
// there were no rows, but otherwise no error occurred
} else {
log.Fatal(err)
}
}
fmt.Println(name)
One might ask why an empty result set is considered an error. There's nothing erroneous about an empty set. The reason is that the QueryRow() method needs to use this special case in order to let the caller distinguish whether QueryRow() in fact found a row; without it, Scan() wouldn't do anything and you might not realize that your variable didn't get any value from the database after all.

You should only run into this error when you're using QueryRow(). If you encounter this error elsewhere, you're doing something wrong.
Identifying Specific Database Errors
It can be tempting to write code like the following:
rows, err := db.Query("SELECT someval FROM sometable")
// err contains:
// ERROR 1045 (28000): Access denied for user 'foo'@'::1' (using password: NO)
if strings.Contains(err.Error(), "Access denied") {
// Handle the permission-denied error
}
This is not the best way to do it, though. For example, the string value might vary depending on what language the server uses to send error messages. It's much better to compare error numbers to identify what a specific error is.
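The idea can be sketched with an invented DriverError type standing in for a driver's error struct (real drivers export their own, such as *mysql.MySQLError; the field names here are ours): type-assert and compare a number instead of matching on message text.

```go
package main

import "fmt"

// DriverError is a stand-in for a driver-specific error type;
// the name and fields are invented for illustration.
type DriverError struct {
	Number  uint16
	Message string
}

func (e *DriverError) Error() string {
	return fmt.Sprintf("ERROR %d: %s", e.Number, e.Message)
}

func main() {
	var err error = &DriverError{Number: 1045, Message: "Access denied for user"}

	// Type-assert to reach the structured error number instead of
	// string-matching on err.Error(), which may be localized.
	if driverErr, ok := err.(*DriverError); ok && driverErr.Number == 1045 {
		fmt.Println("permission denied, ask for credentials again")
	}
}
```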
The mechanism to do this varies between drivers, however, because this isn't part of database/sql itself. In the MySQL driver, for example, you could write the following code:
if driverErr, ok := err.(*mysql.MySQLError); ok { // Now the error number is accessible directly
if driverErr.Number == 1045 {
// Handle the permission-denied error
}
}
Again, the MySQLError type here is provided by this specific driver, and the .Number field may differ between drivers. The value of the number, however, is taken from MySQL's error message, and is therefore database specific, not driver specific.
This code is still ugly. Comparing to 1045, a magic number, is a code smell. Some drivers (though not the MySQL one, for reasons that are off-topic here) provide a list of error identifiers. The Postgres pq driver does, for example, in error.go, and there's an external package of MySQL error numbers maintained by VividCortex. Using such a list, the above code is better written thus:
if driverErr, ok := err.(*mysql.MySQLError); ok {
if driverErr.Number == mysqlerr.ER_ACCESS_DENIED_ERROR {
// Handle the permission-denied error
}
}
Handling Connection Errors
What if your connection to the database is dropped, killed, or has an error? You don't need to implement any logic to retry failed statements when this happens. As part of the connection pooling in database/sql, handling failed connections is built-in. If you execute a query or other statement and the underlying connection has a failure, Go will reopen a new connection (or just get another from the connection pool) and retry, up to 10 times.
There can be some unintended consequences, however. Some types of errors may be retried when other error conditions happen. This might also be driver-specific. One example that has occurred with the MySQL driver is that using KILL to cancel an undesired statement (such as a long-running query) results in the statement being retried up to 10 times.
Working with NULLs
Nullable columns are annoying and lead to a lot of ugly code. If you can, avoid
them. If not, then you'll need to use special types from the database/sql
package to handle them, or define your own.
There are types for nullable booleans, strings, integers, and floats. Here's how you use them:
for rows.Next() {
var s sql.NullString
err := rows.Scan(&s)
// check err
if s.Valid {
// use s.String
} else {
// NULL value
}
}
Limitations of the nullable types, and reasons to avoid nullable columns in case you need more convincing:
- There's no sql.NullUint64 or sql.NullYourFavoriteType. You'd need to define your own for this.
- Nullability can be tricky, and not future-proof. If you think something won't be null, but you're wrong, your program will crash, perhaps rarely enough that you won't catch errors before you ship them.
- One of the nice things about Go is having a useful default zero-value for every variable. This isn't the way nullable things work.
If you need to define your own types to handle NULLs, you can copy the design of sql.NullString to achieve that.
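A sketch of such a type, modeled on sql.NullString (the name NullUint64 and its conversions are ours, not part of database/sql): it implements the sql.Scanner interface so rows.Scan can fill it, and driver.Valuer so it can be used as a query parameter.

```go
package main

import (
	"database/sql/driver"
	"fmt"
)

// NullUint64 mirrors the design of sql.NullString for a type the
// standard library doesn't cover. (The name is ours, for illustration.)
type NullUint64 struct {
	Uint64 uint64
	Valid  bool // Valid is true if Uint64 is not NULL
}

// Scan implements the sql.Scanner interface, so rows.Scan(&n) works.
func (n *NullUint64) Scan(value interface{}) error {
	if value == nil {
		n.Uint64, n.Valid = 0, false
		return nil
	}
	n.Valid = true
	switch v := value.(type) {
	case int64: // note: a negative value would wrap around here
		n.Uint64 = uint64(v)
		return nil
	case []byte: // some drivers deliver numbers as text
		_, err := fmt.Sscan(string(v), &n.Uint64)
		return err
	}
	return fmt.Errorf("NullUint64: unsupported type %T", value)
}

// Value implements the driver.Valuer interface, for use as a parameter.
func (n NullUint64) Value() (driver.Value, error) {
	if !n.Valid {
		return nil, nil
	}
	return int64(n.Uint64), nil
}

func main() {
	var n NullUint64
	if err := n.Scan(nil); err != nil { // a NULL column
		fmt.Println(err)
	}
	fmt.Println(n.Valid) // false
	if err := n.Scan(int64(7)); err != nil {
		fmt.Println(err)
	}
	fmt.Println(n.Valid, n.Uint64) // true 7
}
```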
If you can't avoid having NULL values in your database, there is another workaround that most database systems support, namely COALESCE(). Something like the following might be something you can use, without introducing a myriad of sql.Null* types:
rows, err := db.Query(`
    SELECT
        name,
        COALESCE(other_field, '') as other_field
    FROM sometable
    WHERE id = ?
`, 42)
for rows.Next() {
err := rows.Scan(&name, &otherField)
// ..
// If `other_field` was NULL, `otherField` is now an empty string.
// This works with other data types as well.
}
Working with Unknown Columns
The Scan() function requires you to pass exactly the right number of destination variables. What if you don't know what the query will return?

If you don't know how many columns the query will return, you can use Columns() to find a list of column names. You can examine the length of this list to see how many columns there are, and you can pass a slice into Scan() with the correct number of values. For example, some forks of MySQL return different columns for the SHOW PROCESSLIST command, so you have to be prepared for that or you'll cause an error. Here's one way to do it; there are others:
cols, err := rows.Columns()
if err != nil {
    // handle the error
} else {
    dest := []interface{}{ // Standard MySQL columns
        new(uint64), // id
        new(string), // host
        new(string), // user
        new(string), // db
        new(string), // command
        new(uint32), // time
        new(string), // state
        new(string), // info
    }
    if len(cols) == 11 {
        // Percona Server
    } else if len(cols) > 8 {
        // Handle this case
    }
    err = rows.Scan(dest...)
    // Work with the values in dest
}
If you don't know the columns or their types, you should use sql.RawBytes.
cols, err := rows.Columns() // Remember to check err afterwards
vals := make([]interface{}, len(cols))
for i := range cols {
    vals[i] = new(sql.RawBytes)
}
for rows.Next() {
    err = rows.Scan(vals...)
    // Now you can check each element of vals for nil-ness,
    // and you can use type introspection and type assertions
    // to fetch the column into a typed variable.
}
The connection pool
There is a basic connection pool in the database/sql package. There isn’t a lot of ability to control or inspect it, but here are some things you might find useful to know:
- Connection pooling means that executing two consecutive statements on a single database might open two connections and execute them separately. It is fairly common for programmers to be confused as to why their code misbehaves. For example, LOCK TABLES followed by an INSERT can block because the INSERT is on a connection that does not hold the table lock.
- Connections are created when needed and there isn’t a free connection in the pool.
- By default, there’s no limit on the number of connections. If you try to do a lot of things at once, you can create an arbitrary number of connections. This can cause the database to return an error such as “too many connections.”
- In Go 1.1 or newer, you can use db.SetMaxIdleConns(N) to limit the number of idle connections in the pool. This doesn’t limit the pool size, though.
- In Go 1.2.1 or newer, you can use db.SetMaxOpenConns(N) to limit the number of total open connections to the database. Unfortunately, a deadlock bug (since fixed) prevents db.SetMaxOpenConns(N) from being used safely in 1.2.
- Connections are recycled rather fast. Setting a high number of idle connections with db.SetMaxIdleConns(N) can reduce this churn, and help keep connections around for reuse.
- Keeping a connection idle for a long time can cause problems (as in one reported issue with MySQL on Microsoft Azure). Try db.SetMaxIdleConns(0) if you get connection timeouts because connections stay idle for too long.
Surprises, Antipatterns and Limitations
Although database/sql is simple once you're accustomed to it, you might be surprised by the subtlety of the use cases it supports. This is common to Go's core libraries.
Resource Exhaustion
As mentioned earlier, if you don't use database/sql as intended, you can certainly cause trouble for yourself, usually by consuming some resources or preventing them from being reused effectively:
- Opening and closing databases can cause exhaustion of resources.
- Failing to read all rows or to call rows.Close() reserves connections from the pool.
- Using Query() for a statement that doesn't return rows will reserve a connection from the pool.
- Failing to be aware of how prepared statements work can lead to a lot of extra database activity.
Large uint64 Values
Here's a surprising error. You can't pass big unsigned integers as parameters to statements if their high bit is set:
_, err := db.Exec("INSERT INTO users(id) VALUES (?)", math.MaxUint64) // Error
This will throw an error. Be careful if you use uint64 values, as they may start out small and work without error, but increment over time and start throwing errors.
Connection State Mismatch
Some things can change connection state, and that can cause problems for two reasons:
- Some connection state, such as whether you're in a transaction, should be handled through the Go types instead.
- You might be assuming that your queries run on a single connection when they don't.
For example, setting the current database with a USE statement is a typical thing for many people to do. But in Go, it will affect only the connection that you run it in. Unless you are in a transaction, other statements that you think are executed on that connection may actually run on different connections gotten from the pool, so they won't see the effects of such changes.
Additionally, after you've changed the connection, it'll return to the pool and potentially pollute the state for some other code. This is one of the reasons why you should never issue BEGIN or COMMIT statements as SQL commands directly, too.
Database-Specific Syntax
The database/sql API provides an abstraction of a row-oriented database, but specific databases and drivers can differ in behavior and/or syntax, such as prepared statement placeholders.
Multiple Result Sets
The Go driver doesn't support multiple result sets from a single query in any way, and there doesn't seem to be any plan to do that, although there is a feature request for supporting bulk operations such as bulk copy.
This means, among other things, that a stored procedure that returns multiple result sets will not work correctly.
Invoking Stored Procedures
Invoking stored procedures is driver-specific, but in the MySQL driver it can't be done at present. It might seem that you'd be able to call a simple procedure that returns a single result set, by executing something like this:
err := db.QueryRow("CALL mydb.myprocedure").Scan(&result) // Error
In fact, this won't work. You'll get the following error: Error 1312: PROCEDURE mydb.myprocedure can't return a result set in the given context. This is because MySQL expects the connection to be set into multi-statement mode, even for a single result, and the driver doesn't currently do that (though see this issue).
Multiple Statement Support
The database/sql package doesn't explicitly have multiple statement support, which means that the behavior is backend dependent:
_, err := db.Exec("DELETE FROM tbl1; DELETE FROM tbl2")
// Error/unpredictable result
The server is allowed to interpret this however it wants, which can include returning an error, executing only the first statement, or executing both.
Similarly, there is no way to batch statements in a transaction. Each statement in a transaction must be executed serially, and the resources in the results, such as a Row or Rows, must be scanned or closed so the underlying connection is free for the next statement to use. This differs from the usual behavior when you're not working with a transaction. In that scenario, it is perfectly possible to execute a query, loop over the rows, and within the loop make a query to the database (which will happen on a new connection):
rows, err := db.Query("select * from tbl1") // Uses connection 1
for rows.Next() {
    err = rows.Scan(&myvariable)
    // The following line will NOT use connection 1, which is already in-use
    db.Query("select * from tbl2 where id = ?", myvariable)
}
But transactions are bound to just one connection, so this isn't possible with a transaction:
tx, err := db.Begin()
rows, err := tx.Query("select * from tbl1") // Uses tx's connection
for rows.Next() {
    err = rows.Scan(&myvariable)
    // ERROR! tx's connection is already busy!
    tx.Query("select * from tbl2 where id = ?", myvariable)
}
Go doesn't stop you from trying, though. For that reason, you may wind up with a corrupted connection if you attempt to perform another statement before the first has released its resources and cleaned up after itself. This also means that each statement in a transaction results in a separate set of network round-trips to the database.
Database Encapsulation
We encapsulate our *sql.DB object inside a struct, and we also encapsulate the database actions, as shown below:
var database Database

//Database encapsulates database
type Database struct {
    db *sql.DB
}

func (db Database) begin() (tx *sql.Tx) {
    tx, err := db.db.Begin()
    if err != nil {
        log.Println(err)
        return nil
    }
    return tx
}

func (db Database) prepare(q string) (stmt *sql.Stmt) {
    stmt, err := db.db.Prepare(q)
    if err != nil {
        log.Println(err)
        return nil
    }
    return stmt
}

func (db Database) query(q string, args ...interface{}) (rows *sql.Rows) {
    rows, err := db.db.Query(q, args...)
    if err != nil {
        log.Println(err)
        return nil
    }
    return rows
}

func init() {
    var err error
    database.db, err = sql.Open("sqlite3", "./newtask.db")
    if err != nil {
        log.Fatal(err)
    }
}

//Close database connection
func Close() {
    database.db.Close()
}

//taskQuery encapsulates Exec()
func taskQuery(sql string, args ...interface{}) error {
    SQL := database.prepare(sql)
    tx := database.begin()
    _, err := tx.Stmt(SQL).Exec(args...)
    if err != nil {
        log.Println("taskQuery: ", err)
        tx.Rollback()
    } else {
        tx.Commit()
    }
    return err
}
Note: init()
The init function is the first function to run when the package is imported or executed. This is why we do the initialization in it.
The fault in our code:
The version of init below contains an intentional bug, can you spot it? The defer database.Close() fires as soon as init returns, so the connection is closed before any query ever runs. The fix is simply to drop that line and let our package-level Close function close the database when the application shuts down.

func init() {
    defer database.Close() // BUG: runs when init returns, closing the db immediately
    var err error
    database.db, err = sql.Open("sqlite3", "./tasks.db")
    if err != nil {
        fmt.Println(err)
    }
}
Homework
See the /code/chapter-4/4.5database in our code repository and modify the file to insert data from the 4.3formsupload folder.
We have two working code sets: one that prints form values to the console, and one that fetches db values and renders a template.
What you have to do, based on this chapter, is write methods to insert values from the form into the database.