--- layout: post.njk title: Using Awk with Kakoune to Generate SQL Types for OCaml tags: post date: 2024-03-03 ---
Lately I have been using the Kakoune text editor for my editing. What blows my mind is how interoperable the editor is with Unix scripts and how easy it is to create extensions.
So I am working on a Web server in OCaml. In my work, I levarage a library called ppx_rapper, which allows you to write SQL queries which are then converted to OCaml types. So a select query looks something like this:
let my_query =
[%rapper
get_opt
{sql|
SELECT @int{id}, @string{username}, @bool{following}, @string?{bio}
FROM usersstring{wrong_user} AND id > %int{min_id}
WHERE username <> % |sql}]
my_query
becomes a function that takes string
,
int
and sql_connection
and returns an n-tuple,
(int,string,bool,string)
. Tuples can be hard to work with, especially a larger query can return many values of the same type, it becomes easy to mix them up. If you have a record type defined
that looks exactly the same as the return type of the SQL statement, you
can therefore add a record_out
statement, which ppx_rapper will handle and
automatically return a record that matches its structure instead of a tuple:
type my_query_result =
int
{ id : string
; username : bool
; following : string option
; bio :
}let my_query =
[%rapper
get_opt
{sql|
SELECT @int{id}, @string{username}, @bool{following}, @string?{bio}
FROM usersstring{wrong_user} AND id > %int{min_id}
WHERE username <> %(* notice record out *) |sql} record_out]
Now my_query
will return a typed record
my_query_result
instead of a tuple. Records are easier to work with; they
help you avoid mixing up the values and work as self-documentation.
Of course, writing these types out by hand quickly becomes tedious, so yesterday I conjured up an Awk script that can be used to generate the types for you.
#!/usr/bin/env awk -f
BEGIN {
= 0
inside_rapper = ""
statement_name = 0
add_bracket }
# Extract function name so that we can use that
# for the type name
/let [a-zA-Z0-9_]+ =/ && inside_rapper == 0 {
= $2
statement_name }
# If [%rapper, begin extraction
/\[%rapper/ && inside_rapper == 0 {
= 1
inside_rapper = 1
add_bracket printf "type " statement_name "_result = \n"
}
# Match patterns like @int{string}
== 1 && /@[a-z?]+{[a-z_0-9]+}/g {
inside_rapper = 0
from # Remove everything before "@" and after "}"
= match ($0, /@[a-z?]+{[a-z_0-9]+}/, val)
pos while (0 < pos) {
# Split the line based on curly braces
split(val[0], parts, /{/)
# Extract @type, located in parts[1]
# Use substr to remove the @
= substr(parts[1], 2)
type
# Remove the last character (}) of key
= substr(parts[2], 1, length(parts[2])-1)
key
# ? = optional, so convert that to option
if (substr(type, length(type)) == "?") {
= substr(type, 1, length(type) - 1) " option"
type }
if (add_bracket == 1) {
print " { " key " : " type
= 0
add_bracket }
else {
# Print the key
print " ; " key " : " type
}
# In case there is more than 1 key per line, make sure we get all of them.
# Awk matches on the first occurance though, so we need to remove the
# previous match from the string and run again.
+= pos + val[0, "length"]
from = match( substr( $0, from ), /@[a-z?]+{[a-z_0-9]+}/, val )
pos }
}
# If we see |sql], that means we are done.
/\|sql}/ && inside_rapper == 1 {
printf " }\n"
= 0
inside_rapper }
The my_query_result
type you see above is generated with this script. I will not go into detail how the script work, but I have added comments explaining the steps. I chose Awk because it is specifically built for simple line
manipulation, a perfect fit for this problem.
You can run the script above with
chmod +x my-awk-script
./my-awk-script file_with_my_query.ml
I used the Awk in 20 Minutes tutorial to learn how to write Awk scripts, and levaraged that to write the code above. The syntax feels foreign at first but the language is super easy to use.
Already we have a CLI tool, but ideally I would like to use it within my editor. This is where Kakoune makes it so easy.
Kakoune allows you to run any POSIX script, pipe the selected input, and have the output printed in the editor. So I do not even have to write a plugin to use it with the editor (!).
Allow me to demonstrate: https://asciinema.org/a/644993.
In this video, I show how, without even creating any plugin, I could use my script in my editing flow to generate the OCaml types. All I had to do was:
You can easily levarage this with many other tools. For example, you can use Chatgpt CLI or Ollama to have an LLM autocomplete or explain code for you, similar to codepilot but free.
I hope this demonstrates why I am so in love with this editor. It comes with a bunch of sane standards OOB, a great LSP integration, but it is also so easy to extend and add your own functionality. Even if it does not have many plugins on its own due to a smaller community, it interops so well with other UNIX tools that it makes up for it. Had I wanted to write some more complex text manipulation, I could have used a language like Rust, Go or OCaml that compiles down to a single binary and use that instead.