Yesql - SQL in SQL in Clojure

Fair warning. This was initially a transcript for a yesql screencast I intended to do but time is dragging on and I really can't see myself getting to that any time soon.
Why does this matter?
Well I wouldn't be surprised if this reads pretty much like the yesql README.md with some minor embellishments here and there (and perhaps some opinion). But I'm putting it out there just, well, just because. Yesql is lovely and consider this a tribute of sorts.

Yesql is a Clojure library for accessing SQL capable databases using SQL. Unlike its contemporaries yesql, as the name alludes to, embraces SQL. The argument here is why create a DSL, no matter how beautifully crafted, when we already have a perfect DSL already in the form of SQL.

To go a bit further lets look at the rationale in yesql's own README.

You could write your data access like this, as embedded SQL strings,

(query "SELECT * FROM users WHERE country_code = ?" "GB")

But this doesn't scale very well,

commands can get scattered about the code base,
its hard to see syntax errors, and,
longer statements can be difficult to read without any sort of syntax highlighting.

The next logical step taken to solve this problem is by adding an additional layer of abstraction. Language specific DSLs are created to dynamically build SQL.

(select :*
        (from :users)
        (where (= :country_code "GB")))

The benefit these DSLs have is that they remove the need to know the specific dialect of SQL you're using. As each database has a slightly different SQL dialect you simply specify an adapter, such as PostgresAdapater) and the translated SQL will magically be Postgres compatible. Moving to a different database engine is a matter of specifying a new adapter with no code changes.

Sounds good right? Well not quite - as with almost all abstractions they start to break down when you tread off the happy path.

Every database system will likely have additional, non-core features, that you'll probably want to exploit, in fact these features are usually factored into selection criteria when you were deciding what to use. Postgres array and JSON data types are a good example of this. These are features that your clever DSL abstraction can't support without extra wrangling because they address the generality of problems. Suddenly you're back to writing raw sql for some of the queries and we are back where we started.

Yesql tackles these problems by putting SQL in SQL files and allows you to reference these blobs of SQL as clojure functions. So you'll lose the ability to execute the same code across different databases, which if my experience is anything to go by isn't the worst thing that can happen, but you'll gain better editor support, correct syntax highlighting and one less abstraction to ruin your mood.

Seeing it in action is the best way to demonstrate its simple but powerful api. We'll start by cloning a ready rolled project template

git clone -b 00-starting-point [email protected]:yobriefcasts/007-yesql.git

This is nothing more than a project generated by lein new with 2 dependencies added,

One is the Yesql library dependency itself, currently at version 0.4.0, and
The other is the h2 database driver which we will use as our database in these examples

Finally we have a file based h2 database instance which is pre-populated with a people table containing a few rows of dummy data.

1	James	Hughes	[email protected]
2	Emma	Hughes	[email protected]
3	Ollie	Hughes	[email protected]
4	Nate	Hughes	[email protected]

The First Query

So lets get this data out of the database.

As is common across most SQL based libraries in Clojure Yesql derives its connections from a db-spec. This is the same db-spec map you'd pass Clojure's JDBC library or even korma. I'll provide a link to the relevant documentation if you want more information.

The db-spec required to connect to the current h2 database is already defined in the core.clj file.

(def db-spec {:classname    "org.h2.Driver"
              :subprotocol  "h2"
              :subname      "./resources/demo.db"
              :user         "sa"
              :password     "" })

So now if we want to create a SQL file get-all-people.sql under the src/sql directory we can write a simple select * statement from the people table.

select * from people

Its worth noting that the sql folder name isn't necessary I simply used it for neatness. By default when calling Yesql - paths will be relative to the src directory so you can organise it however you want.

Now we need to call this from our code. You'll notice in core.clj that I have already included a single require for defquery from the yesql.core namespace.

defquery is one of the 2 macros that gives you an entry point into Yesql, we will get to the other one shortly. defquery takes a name and a path to the query file. This allows defquery to parse the SQL file, convert it into a callable function and define a var based on the name you've given it.

We can define our get-all-people query like so,

(defquery get-all-people "sql/get-all-people.sql")

Although I've kept them the same the name of the file and the query name don't need to match. If we evaluate this line now we get back our def'd function. Subsequent evaluations, for example if you've changed the contents of the SQL, will result in the def being redefined.

If we call our new function as-is we'll get an error.

(get-all-people)

This is because we also need to supply the db-spec that was already defined for us. If we call the function correctly we should see our results.

(get-all-people db-spec)

Lets give our output a bit more structure. We can import clojure.pprints pprint function

(ns yesql-tinkering.core
  (:require [yesql.core :refer [defquery]]
            [clojure.pprint :refer [pprint]]))

And run our query again.

(pprint (get-all-people db-spec))

Passing data to queries

Executing static SQL isn't going to get you very far, if it does and you are only writing read-only systems with ready rolled queries - can I please come work with you? At the very least your queries often require WHERE clauses that need to take values based on some user input from outside the database.

To demonstrate how we do this in Yesql lets add a new query that filters by surname. We can start by creating a new SQL file src/sql/get-people-by-surname.sql and write our query. As with most frameworks that make use of parameterised queries we put token in place of the value we want to pass in, in the case of Yesql this can be a keyword that will associate a descriptive name against the argument..

select * from people where last_name = :last_name

Next we can use the defquery macro to "realise" this query.

(defquery get-people-by-surname "sql/get-people-by-surname.sql")

We can call this as we previous did except we need to pass it an argument for last_name.

(pprint (get-people-by-surname db-spec "Hughes"))

Documentation

Lets diverge a little from our path. Imagine a more complex SQL statement that accepts more than one argument, lets say it has a very domain specific name like isolate_aggregated_upstream_resources - how would you 1. find out what this statement does and 2. how you would call it correctly? Well most obviously you could simply look at the SQL and work it out but that's not really an effective way to go about things.

In out last example SQL statement we used a parameter name of last_name, well this wasn't just for fun, if well look at the documentation for our query we can see that the parameter name is used as an argument name to our function

(clojure.repl/doc get-people-by-surname)

This is handy but we can go one step further and supply doc compatible documentation in our SQL in the form of comments. By prepending a comment on top of our SQL we can dive into supplying additional commentary around what this SQL is actually doing.

-- Find all people for a given surname

If we evaluate our defquery again and look at the function signature we should see our comment available as a docstring.

(clojure.repl/doc get-people-by-surname)

yesql-tinkering.core/get-people-by-surname
([db last_name])
  Find all people for a given surname

Grouping queries

The one file per query thing is all well and good until it isn't. You could rather quickly end up with a raft of folders with 3 or 4 queries each which is a bit of a pants way of arranging things.

Yesql provides a way to group similar queries in a single file, well they don't have to be similar but lets assume we have some logical grouping here.

So lets add a few accessor methods for getting people out of our database. If we create a SQL file src/sql/get-persons.sql and add 2 queries, one for getting a user by ID and another for getting a user by email address. When you use defquery you specify the name of the function in the declaration and the entire file is used as a basis for the query however when we are defining multiple queries at once we need both logically separate the queries in the file and associate a name with each one. We do this by adding a structured comment above each query starting with name: and followed by the name you want to refer to the query as in your code.

-- name: get-person-by-id
select * from people where id = :id

-- name: get-person-by-email
select * from people where email_address = :email_address

To define these queries in our code we need to import the other macro in yesqls arsenal defqueries. We can then use this to import our queries,

(defqueries "sql/get-persons.sql")

You'll notice that defqueries returns a vector of the defined functions and we can see our two function handles. We can call each of these in the usual way.

(get-person-by-id db-spec 1)
(get-person-by-email db-spec "[email protected]")

Non-Queries

We've still only dealt with querying our database but most systems usually need to manipulate the contained data by either writing new rows or updating and deleting existing rows.

Rather than add a new set of macros - yesql uses naming conventions to signal to the macros that the operation is a non-query operation.

For example if we create a SQL file containing a statement for inserting a new user that expects first_name, last_name and email_address, the database will auto generate our ID value.

insert into people(first_name, last_name, email_address)
values(:first_name, :last_name, :email_address)

We can use defquery again to create our function except this time we prepend the name with an exclamation mark. This is the marker to tell yesql that this is a statement rather than a query.

(defquery insert-person! "sql/insert-person.sql")

The function expects the usual db-spec object as well as our arguments

(clojure.repl/doc insert-person!)

And we can call it in the normal way.

(insert-person! db-spec "Bill" "Thompson" "[email protected]")

This style can be used with any non-query statement such as create, update, delete, alter and all those other statements your database engine supports.

Statements will return the number of rows affected by default however in the case of insert statements against tables that have auto-generated ids you may want the id of the newly inserted row returned instead. You can achieve this in yesql with the less than exclamation (<!) instead. If we we redefine our insert-person! function to use this we can see that it will return a different set of information

(defquery insert-person<! "sql/insert-person.sql")

(insert-person<! db-spec "Emma" "Thompson" "[email protected]")

What is actually returned depends on the database engine, h2 returns a map containing the id but some engines return the full row.

Wrap Up

Thats just about covers the main features of yesql which as you can see is a neat little solution to bridging the gap between the database and your code. It avoids all the normal pain of wrangling with other peoples conventions and leaky abstractions and allows you to use SQL directly which is a much better medium to express queries against relational data stores.