Earlier this year I participated in a hackathon in the company where I work. The idea was to transfer data from a Mongodb to a MySQL db to fulfill some business requirements. I needed to apply some transformations on the JSONs and validation. Each record was to be decomposed and normalized into multiple tables. I needed to Update or Insert a record depending on some conditions.
It became apparent to me that alot of the code I wrote could be abstracted into a library that handles most of the common logic flow. The idea kept brewing in the back of my mind until I recently realized it into a npm module I call Json-Transqlify (Json Transform Sqlify)
In this blog I hope to demonstrate some of the functionality I’ve already baked into Json-Transqlify and some of the other ideas I’m working on.
Example 1
The backbone of Json-Transqlify is a yaml defininiton file that defines how the JSON (entity) should be validated, transformed, and loaded (insert/update) into the tables.
I’ll start with a simple example where I have some User objects that have the following schemaJavaScript
{ "name": "FIRST_NAME LAST_NAME", "age": "NUMBER", "address":{ "country": "STRING", "city": "CITY" } }
I want to insert those objects into the following users tableSQL
CREATE TABLE `users` ( `id` int(11) NOT NULL AUTO_INCREMENT, `fname` varchar(45) NOT NULL, `lname` varchar(45) NOT NULL, `age` int(11) NOT NULL, `country` varchar(45) NOT NULL, `city` varchar(45) NOT NULL, PRIMARY KEY (`id`) )
The json-transqlify definition file might look something like this
version: 1.0 validator: schema: # validate user object schema default: user-schema.json loaders: # notice loaders is an array - insert: label: InsertUser # name of this operation (can be anything) tableName: users # table to which the json will be inserted transform: columns: # map each column to appropreiate field on json - column: fname # insert into a column name fname value: $entity.name.split(' ')[0] # $entity refers to the user object we are inserting. - column: lname value: $entity.name.split(' ')[1] # grap last name - column: country value: $entity.address.country - column: city value: $entity.address.city - column: age value: $entity.age
I’ll break down the previous definition file and go through each section.
1. Validator
json-transqlify allows you to define json-schema rules to check if each record you try to process fulfills a certain schema (behind the scenes i’m using ajv to handle validation). I need to create a user-schema.json file that describes the schema of user objects and place it next to my json-transqlify yaml definition file.JavaScript
{ "type": "object", "properties": { "name": { "type": "string" }, "age": { "type": "number" }, "address": { "type": "object", "properties": { "country": { "type": "string" }, "city": { "type": "string" } }, "required": [ "country", "city" ] }, "required": [ "name", "age" ] }
If your json-schema is a large file you can actually break it down into separate files and instruct json-transqlify definition file to reference them .
Json-transqlify also allows you to define a custom validator function that is required to return a promise that resolves to true or rejects.
2. Loaders
Loaders perform the Insert/Update on db. You can define multiple loaders (insert and update multiple tables) and json-transqlify will run them inside one transaction. In the example above I’m using Insert loader which requires tableName, label and transformer.
Each loader you define needs a label attached to it. In cases where you have multiple loaders, it’s possible to access the result of loaderA when running loaderB perhaps to use the insertedId to fulfill a foreignkey constraint (more on this later).
3. Transformers
Transformers map the user object (entity) to table columns. There are multiple types of transformers, the one i’m using in this example is Columns Transformer.
Columns Transformer allows you to define the value of each column by defining an expression that will be evaluated at runtime. In this example $entity is a reserved variable that refers to the json we are processing (the user object). The expression you define can be anything as long as it’s a valid JS expression that returns a value. It’s also worth to note that by default you can use lodash in the expression.
I’m making the assumption that a user name is always “<First Name> <Last Name>” however it’s possible to have spaces in between. a improvement would be to split the user name by space, the first element in the array is the firstName, the rest would be the lastName (I know this isn’t always correct but for the sake of the example bear with me). We could implement this by using lodash’s head and tail methods.
transformer: columns: - column: fname value: _.head($entity.split(' ')) - columns: lname value: _.tail($entity.split(' ')).join(' ')
json-transqlify allows you to define your own custom transformer function. The function will be invoked with the $entity, $history ( more on this guy later), and $conn (db connection).. The function most return a promise that resolves to a JSON where the keys represet table columns .
Now that I’ve covered the json-transqlify definition file, We need to consume and parse this guy somehow.
first add json-transqlify to your project
npm install json-transqlify
Then instantiate a json-transqlifer factory and construct a transqlifer object as such
JavaScript
const createFactory = require('json-transqlify').createFactory; const db = { host: 'localhost', user: 'root', password: '', database: 'json_transqlify_demos', connectionLimit: 2 } const factory = createFactory(db) const transqlifier = factory.createTransqlifier('./insert-user.yaml'); const obj = { name: "Harry Potter", age: 10, address: { city: 'UK', country: 'Little Whinging' } }; // transqlifier will return a promise transqlifier(obj);
Example 2
Let’s try a more complicated example this time. Given a Lesson object such asJavaScript
{ "name": "Lesson1", "course": { "name": "course1 } }
And the following two tables tablesSQL
CREATE TABLE `lessons` ( `id` int(11) NOT NULL AUTO_INCREMENT, `title` varchar(45) NOT NULL, `course_id` int NOT NULL FOREIGN KEY REFERENCES courses(course_id), PRIMARY KEY (`id`) ) CREATE TABLE `courses` ( `id` int(11) NOT NULL AUTO_INCREMENT, `title` varchar(45) NOT NULL, `difficulty` int NOT NULL, PRIMARY KEY (`id`) ) ALTER TABLE `courses` ADD UNIQUE INDEX `course_title` (`title`);
The following transqlify definition makes sure that each lesson object also creates a course. and enforces course name to be unique.
version: 1.0 validator: schema: default: lesson-schema.json loaders: - insert: label: InsertCourse tableName: courses transform: columns: - column: title value: $entity.course.title - column: difficulty value: $entity.course.difficulty on: # precondition, if it evaluates to false the loader does not run - db: # db precondition runs a query and allows custom evaluation of result query: SELECT 1 from courses where title = ? # check if a course with same name exists params: - $entity.course.title expect: $rows.length === 0 # $rows refers to the result return from query - insert: label: InsertLesson tableName: lessons transform: func: transform-lesson.js
Validator:
Similar to the previous example, the validator is a simple json-schema defiitnion JavaScript
{ "type": "object", "properties": { "title": { "type": "string" }, "course": { "type": "object", "properties": { "title": { "type": "string" }, "difficulty": { "type": "number" } }, "required": [ "title", "difficulty" ] } }, "required": [ "title", "course" ] }
Preconditions
The “on” field allows you to define a list of preconditions. If any of them evaluates to false the loader is not executed. I use a Db Precondition which runs a custom query and allows custom evaluation of the returned rows ($rows). I check whether a course with same title already exists, and if so the InsertCourse loader does not run
json-transqlify currently supports 2 extra preconditions; Expression Precondition and Custom Function Precondition.
Custom Function Transformer
On the second loader, I’m using a custom function to do the transformation, because I want to grab the course_id . The function looks like thisJavaScript
const transformLesson = async ({ $entity, $history, $conn }) => { if ($history.InsertCourse) { return { title: $entity.title, course_id: $history.InsertCourse.$insertedId } } const result = await $conn.query('SELECT id from courses where title = ?', [$entity.course.title]) return { title: $entity.title, course_id: result[0].id } } module.exports = transformLesson
- $entity is the object we are transforming (lesson object)
- $conn is db connection.
- $history contains the result of all previous loaders. The result of each loader is accessed via its label. In this case $history.InsertCourse which contains the result of the transformer in addition to a special field $insertedId.
Because InsertCourse loader has a precondition, $history.InsertCourse might not always be defined (when a course already exists in db with the same name), in that case I query the db for a course with same title and grab its id
Finally, parse the yaml file and construct a transqlifer JavaScript
const createFactory = require('json-transqlify').createFactory; const db = { host: 'localhost', user: 'root', password: '', database: 'json_transqlify_demos', connectionLimit: 2 } const factory = createFactory(db) const transqlifier = factory.createTransqlifier('./insert-lesson.yaml'); const main = async () => { const lesson1 = { title: 'Lesson1 Title', course: { title: 'Course Title', difficulty: 3 } } const lesson2 = { title: 'Lesson2 Title', course: { title: 'Course Title', difficulty: 3 } } await transqlifier(lesson1) await transqlifier(lesson2) factory.closePool() } main()
What’s Next
There are a couple of improvements I’d like to add to Json-Transqlify before I drop the alpha label.
- Bulk Insert Loader
- Bulk Update Loader
- Upsert Loader
- Reuse loaders by defining them in separate files and importing them.
- Custom function loader
- Decouple MySQL from Json-Transqlify tosupport other kinds of SQL based DBs.
- Improve logging and error reporting.
The package is available on NPM and Github.
The code for those examples can be found on the Github repository as well