Using PostgreSQL Enums in Clojure
Our biggest Open Source project at bevuta is Pepa, a web-based document management & archival system. Pepa is written in Clojure and ClojureScript. It uses PostgreSQL to store all data, including uploaded files and rendered images.
PostgreSQL ENUM
PostgreSQL was chosen for its Open Source nature and previous experience with it. Support in Clojure (via clojure.java.jdbc) is excellent and lightweight.
One very common type of data to store are enumerated values.
Examples are presence information (available
, busy
, away
),
weekdays (monday
, tuesday
, …), or status information
(success
, failure
, pending
).
PostgreSQL supports this via the ENUM
data type:
CREATE TYPE PROCESSING_STATUS AS ENUM ('pending', 'failed', 'processed');
This piece of DDL creates a new type called PROCESSING_STATUS
which can assume the values pending
, failed
, or processed
.
While ENUM
is a useful data type, usage from Clojure isn't
trivial:
CREATE TABLE files (
name TEXT,
status PROCESSING_STATUS
);
This defines a new table called files
with two columns: name
(a
string), and status
(our previously defined enumeration). We can
insert data via raw SQL easily:
INSERT INTO files VALUES ('my-file.txt', 'pending');
-- INSERT 0 1
Trying to insert invalid values will fail, as expected:
INSERT INTO files VALUES ('my-other-file.txt', 'invalid_status')
-- ERROR: invalid input value for enum processing_status: "invalid_status"
Unfortunately, inserting rows from Clojure doesn't work like that:
(insert! pg-db :files {:name "my-file.txt", :status "pending"})
;; PSQLException ERROR: column "status" is of type processing_status but expression is of type character varying
This is caused by the rather strict type handling of JDBC: We try to insert a string into a field which is an enum. Raw SQL will automatically cast it, JDBC won't.
One common workaround is just dropping the use of ENUM
and go with
string values but this doesn't offer the same consistency guarantees
and is thus a frequent cause of errors.
Another dubiuos workaround is passing stringtype=unspecified
in the
JDBC connection URL. This causes strings to always bind to to the
'unknown' type, allowing PostgreSQL to cast it into whatever it
wants.
Yet another possibility is creating a PGobject
with the correct type
and pass it to insert!
:
(let [status (doto (PGobject.)
(.setType "processing_status")
(.setValue "pending"))]
(insert! pg-db :files {:name "my-file.txt", :status status}))
While this is type-safe and correct, it's pretty verbose. There's a better way.
Extending clojure.java.jdbc
clojure.java.jdbc
provides a protocol to handle type conversion:
ISQLValue
. This protocol contains one function sql-value
which
is used to convert any value to a PGobject
.
In Clojure, we usually use keywords or namespace-prefixed keywords
to handle enumerations. Using the latter, our processing_status
pending
could look like :processing-status/pending
.
Getting from the keyword to an equivalent PGobject
is pretty
easy:
(defn kw->pgenum [kw]
(let [type (-> (namespace kw)
(s/replace "-" "_"))
value (name kw)]
(doto (PGobject.)
(.setType type)
(.setValue value))))
Now we just need to implement ISQLValue
on clojure.lang.Keyword
:
(extend-type clojure.lang.Keyword
jdbc/ISQLValue
(sql-value [kw]
(kw->pgenum kw)))
This small piece of code allows us write our insert!
like:
(insert! pg-db :files {:name "my-file.txt", :status :processing-status/pending})
Now we just have to implement the other direction. Querying a
PROCESSING_STATUS
still gives us a string and not our
namespace-prefixed keyword. That means a queried value will be of a
totally different type than the value we inserted. Unthinkable!
Fortunately, clojure.java.jdbc
provides a protocol to handle this
case too: IResultSetReadColumn
.
It too contains a single function, result-set-read-column
, which
will get called once for every column of every row fetched from the
database. Arguments are the value of the column, an object
containing information about the current row, and the index of the
current column.
With that, and a list of defined ENUM
types, we can write the
following:
(def +schema-enums+
"A set of all PostgreSQL enums in schema.sql. Used to convert
enum-values back into Clojure keywords."
#{"processing_status"})
(extend-type String
jdbc/IResultSetReadColumn
(result-set-read-column [val rsmeta idx]
(let [type (.getColumnTypeName rsmeta idx)]
(if (contains? +schema-enums+ type)
(keyword (s/replace type "_" "-") val)
val))))
This piece of code will check the name of the type of the column,
and if it is in our set of enumerations, convert the value to a
namespace prefixed keyword. With this in place, fetching a
processing_status
from our table will work as expected:
(query (:db user/system) ["SELECT * FROM files"])
;=> ({:status :processing-status/pending, :name "my-file.txt"})
PostgreSQL Enums in Clojure - Closing Thoughts
Once again, Clojure has shown its flexibility when working with
different kinds of data. A few lines of Clojure code allowed us to
extend the capabilities of clojure.java.jdbc
to seamlessly handle
a new data type.
Of course, this isn't limited to ENUM
: PostgreSQL has many nice
data types, including types for spatial data (POINT
, LINE
,
POLYGON
, …), monetary units (MONEY
), and network addresses
(INET
, MACADDR
). One can easily imagine how the method presented
here can be adapted to convert those to native Clojure types, too.
One drawback with this implementation is that we have to manage the
set of ENUM
types by hand. Ideally it would initially fetch all
known enumeration types from the database and store them in an
atom
.