Stored procedures are not very commonly used. I worked for a while before i had to write my first stored procedure. They weren’t exactly hard but they weren’t easy too.

So what are stored procedures? The technical definition is

A procedure (often called a stored procedure) is a subroutine like a subprogram in a regular computing language, stored in database. A procedure has a name, a parameter list, and SQL statement(s)

Source https://www.w3resource.com/mysql/mysql-procedure.php

Think of them as stored MySQL queries but not to be confused with MySQL views. Views are mostly used to generate commonly called queries and they mostly only display data. Procedures are a bit complex as they can be used to do simple to complex processing of data and later do an insert or update into another table.

So what exactly are the advantages or benefits of using stored procedures? Why not just run normal SQL queries?

  • Stored procedures are fast. MySQL server takes some advantage of caching, just as prepared statements do. The main speed gain comes from reduction of network traffic. If you have a repetitive task that requires checking, looping, multiple statements, and no user interaction, do it with a single call to a procedure that’s stored on the server.
  • Stored procedures are portable. When you write your stored procedure in SQL, you know that it will run on every platform that MySQL runs on, without obliging you to install an additional runtime-environment package, or set permissions for program execution in the operating system, or deploy different packages if you have different computer types. That’s the advantage of writing in SQL rather than in an external language like Java or C or PHP.
  • Stored procedures are always available as ‘source code’ in the database itself. And it makes sense to link the data with the processes that operate on the data.
  • Stored procedures are migratory! MySQL adheres fairly closely to the SQL:2003 standard. Others (DB2, Mimer) also adhere.

Source https://www.w3resource.com/mysql/mysql-procedure.php#SP

Alright. Enough of talking, lets code

Imagine a scenario where you are storing a person’s data i.e their DOB(date of birth), Height and weight. With this data you are required to generate the age and bmi of the person for reporting purposes.

So what do you do? You could write an SQL query that calculates that data and shows the required data. Something like

select
id as person_id,
ROUND(((weight)/POWER(2,height/100)),0) as bmi,
TIMESTAMPDIFF(YEAR,DOB,CURDATE()) as age
from
person_data);

But think if you had to do this for thousands of people and even need to order the list based on something like age, bmi and even make assessments whether the person is overweight, underweight or just normal. Now imagine if you were still using the same tables for recording data as well as reading data for thousands of patients. Chances are your database will begin to get overwhelmed with all the connection and queries

That’s where things like ETL come in. ET what?

What is ETL? ETL stands for Extract, Transform and Load, which is a process used to collect data from various sources, transform the data depending on business rules/needs and load the data into a destination database

Essentially what we do is get data from the source table, process it and store it into a destination table from which we can be able to read easily. Lets do an example

We’ll use MySQL database. You can install it or if you are familiar with Docker you can spin a MySQL container from an image. One installed, we’ll create two tables. Lets call one person_data and the other flat_data . person_data will hold the raw person data while flat_data will hold the processed data

We’ll create 2 databases. One for holding the raw data. We’ll call it db and the other for warehousing our processed data. We’ll call it etl

CREATE SCHEMA etl ;

CREATE SCHEMA db ;

CREATE TABLE db.person_data (
id INT NOT NULL AUTO_INCREMENT,
DOB DATETIME NOT NULL,
height INT NOT NULL,
weight INT NOT NULL,
PRIMARY KEY (id));

CREATE TABLE etl.flat_data (
person_id INT NULL,
bmi INT NULL,
age INT NULL);

Add any number of records to the person_data table. In our case we’ll add 2 rows

INSERT INTO `db`.`person_data` (`DOB`, `height`, `weight`) VALUES ('1990-01-01', '164', '73'), (‘1990-01-01’, ‘184’, ’74’);

Nice! Now we should have 2 rows in our table. You can confirm this by running the query

SELECT * FROM db.person_data;

Now we write the stored procedure

DELIMITER $$
CREATE PROCEDURE generate_flat_data()
BEGIN
insert into etl.flat_data(
select
id as person_id,
ROUND(((weight)/POWER(2,height/100)),0) as bmi,
TIMESTAMPDIFF(YEAR,DOB,CURDATE()) as age
from
db.person_data);
END$$
DELIMITER ;

You might be wondering what that “DELIMITER $$” stands for. Usually a delimiter shows the end of the code which is usually a “;” for SQL. By stating “DELIMITER $$” we are telling MySQL to instead look at “$$” as indication where the code ends. This is because a procedure tends to have SQL statements inside which will have a “;”. That will cause confusion as MySQL will assume that’s the end of the query. By changing the delimiter we can now freely use semicolon in the queries and dollar sign to indicate the end of the stored procedure

We are creating a stored procedure called generate_flat_data which doesn’t take any parameters otherwise it would have been something like generate_flat_data(parameter). . This procedure can now be found under the stored procedures folder as shown below

So what is happening inside the stored procedure?

We are doing the following

  1. Fetching data from person_data table
  2. Processing it i.e converting the data to age and bmi which are calculated values
  3. Storing the processed values into another table called flat_data

So now that we have created the stored procedure, how do we call it?

Simple just run CALL etl.generate_flat_data();

If it was successful and no errors were returned then you can confirm if the data was actually successfully transformed and saved

Wala! The data should be stored after having been transformed. Congratulations, you have successfully done your first ETL using MySQL stored procedure . Hooray!

There are plenty of tools used to do ETL and that is up to you. So don’t be limited to stored procedures

Written By

Maiko

Comments :

Leave a Reply

Your email address will not be published. Required fields are marked *