intro
As a data analyst, a client called John approaches you and tells you that he is starting an online ecommerce business. John then asks you to design and build a relational database that will enable him to store important information that his ecommerce business generates.
The data the client wants to be stored in the database is data related to items on sale, orders made, and customers who make the orders. After that, he wants you to analyze the data to come up with analytics represented visually so that he can monitor the business performance.
In this article, I will show you how to design and build a relational database from scratch in Postgres for John’s ecommerce business. We are using Postgres because it is an advanced open-source relational database system that helps data analysts to manage data regardless of the size of the dataset.
After that, I will show you how to run SQL queries on data stored in the database to come up with analytics and visualize the analytics using charts on DbVisualizer.
Prerequisites
To follow through with this article, you need a database designing tool, a database management system, and a SQL client. In this case, we will use the Quick Database Diagrams tool for design, Postgres as the database management system, and DbVisualizer as the database SQL client.
To install PostgreSQL, navigate to the PostgreSQL download page and download it for your operating system. You can follow this guide to install PostgreSQL on Windows, this guide to install it on Linux, and this guide to install it on macOS.
To install DbVisualizer, navigate to the DbVisualizer download page. Then download the recommended installer for your operating system. After that, execute the installer you have downloaded and follow the instructions that follow.
When you start DbVisualizer, you are prompted to choose the DbVisualizer plan to proceed. You can proceed with the DbVisualizer free version with limited features or get a 21-day free trial on the pro version.
Designing the Database Using Quick Database Diagrams
First of all we need to design the database. There are many tools you can use for this, and for this example we will be using Quick Database Diagrams, also known as QuickDBD. QuickDBD is a simple web-based tool that lets you quickly draw database diagrams by typing. QuickDBD enables you to design database tables as diagrams and then allows you to download a SQL file containing the SQL queries needed to create the tables in a database.
Since the client wants to collect data on items, customers, and orders, we can say that the data classes were already specified for each category. For items, he wants data on the following data classes:
For customers, he wants data on the following:
For orders, he wants data on the following:
Looking at the client’s data requirements, we can assume that the database needs three tables to store data from each category.
To design the database and the tables, we need to specify all the fields needed for each table. Then we will do data normalization and define table relationships. Normalizing data is the process of organizing data to reduce redundancy and make it more flexible and efficient.
Let’s now spec out all the fields needed for the items table as shown below,
We can now use the Quick Database Diagrams tool to see how the items table will look inside the database.
You specify tables, their fields, and field data types on the left sidebar, as shown below.
On the right are diagrams produced by tables and their fields, as shown below.
Below are the steps you can follow to create the items table.
Step 1: On the left sidebar, write the table's name, jump to the next line and add a dash.
You can now see the items table has been added to the diagrams.
Step 2: Add the items table fields
by specifying the field names and their data types.
Item_id
field is a unique identifier for each item data stored in the items table
hence it is the primary key (pk)
. The field has an int
data type with the item_price
field because the fields will hold data in number form.
Item_name
and item_category
fields have varchar
data types because they will store data in string form with a maximum length of 50 characters. You can now see how the items table will look on the diagram.
Let us now define the fields needed for the customers table
, as shown below.
We can now use the Quick Database Diagrams tool to design the Customers Table
as shown below.
Let us now define all the fields needed for the orders table
, as shown below.
We can now use the Quick Database Diagrams tool to design the orders table
as shown below.
Looking at the three tables we designed, you will realize that the orders table
has fields similar to the items table
and customers table
. This is where we can use normalization by breaking a larger table, like orders table
, into smaller tables and defining the relationships between the tables. This will enable us to eliminate repetitive data fields and ensure data is stored logically.
The first thing we will do is remove all fields on the orders table that look similar to fields on the items table
. Then we will add the field that is a unique identifier of the items table
to the orders table
and connect the fields on both tables to define the two tables' relationship, as shown below.
Let us now do the same for the customers table
and define its relationship with the orders table
, as shown below.
After defining the tables' relationships, let us now download the SQL file that contains SQL queries required to create the tables in a database. You can download the SQL file by clicking the EXPORT
button and selecting PostgreSQL
, as shown below.
Tools like QDBD help you to define database structure by typing and diagrams where they also provide you with SQL queries that you can run on DbVisualizer to create tables for a database in PostgreSQL.
Connecting Postgres to DbVisualizer
Step 1: Start DbVisualizer and click create a connection
button as shown below.
Step 2: Search and select the Postgres driver from the popup menu on the left side of your screen, as shown below.
An object view tab for the Postgres connection is opened.
Step 3: Fill in the empty fields as shown below:
Name: JohnEcommerceDatabase
Database: postgres
Database Userid: postgres
Database Password: Enter Your Postgres password
Once you have filled in all the empty fields, click the connect button at the bottom. If the connection is successful, your screen should look as shown below.
Creating a database on Postgres using DbVisualizer
Let us now create a database that will house the tables we designed.
Step 1: Open the database connection JohnEcommerceDatabase
tab tree as shown below.
Inside the connection tab tree, open the Databases
tab tree.
Step 2: Right-click on the Databases
tab tree and select Create Database
option.
Step 3: Fill in the fields as shown below and then click the Execute
button to create the database.
New Database Name: “Name for your Database.”
Owner: “Owner of the database in PostgreSQL.”
Template: “Boilerplate files containing SQL scripts that help you create objects in a database.”
Encoding: “Encoding converts data into a standard format.”
Collation: “Collation specifies how data is sorted and compared in a database”
If you look at the JohnEcommerceDatabase
connection, you will see the database ecommercedb
has been created.
Creating Tables Designed On Quick Database Diagrams
Step 1: Navigate to the folder where the SQL file containing SQL queries for creating the tables we designed was saved.
Step 2: Double-click the SQL file, which should open on DbVisualizer, as shown below.
The SQL file has queries defining our database structure, where related data is grouped into tables consisting of rows and columns.
Step 3: Open [ choose connection ]
drop-down menu and select JohnEcommerceDatabase
as the connection containing the database where you want to create the tables.
Step 4: Select ecommercedb
as the database you want to create the tables.
Step 5: Press the (Ctrl+Enter)
keys to run the SQL queries. Right-click on the table tab and refresh to see the tables created.
You should now see that all the designed tables have been created on the database we created.
Importing Data Table Into Postgres Using DbVisualizer
To run SQL queries, we need to add data to the tables we created. I have created three files containing data we can import into the tables and run SQL queries. You can download the data files by navigating to this link.
Once the download is complete, navigate to the folder where the data files have been saved and you should see the three data files for each table as shown below.
To import data into the customers
table, right-click on the customers
table and select the Import Table Data
option.
Navigate to the directory containing the data files, select the customers data file and click open.
Click the Next
button and keep clicking it until you reach the window shown below, and then click the import button. The window provides an option to import all rows or a just 50 rows from the data file. Batch import option significantly improves the data import speed.
Once the data has been imported, right-click on the customers
table and refresh to see the table data.
Open the table and navigate to the Data
tab. You should be able to see the imported data, as shown below.
Repeat the same process and import data into the items table
and orders table
.
Running SQL queries on DbVisualizer
If we look at the orders table
, we will see that it has data coming from the customers table
and items table
represented by customer_id and item_id columns, respectively. For that reason, we can’t get much information from the orders table such as which items are selling well, as shown below.
To get more information about the orders, we can create a view which is a virtual table based on the result of an SQL statement. The SQL statement will use SQL joins where a JOIN clause is used to combine rows from two or more tables based on a related column between them.
The SQL query for creating the view table is as shown below.
1
CREATE VIEW TotalOrders AS
2
SELECT orders.order_id, customers.customer_firstname,
3
customers.customer_lastname, items.item_name,
4
items.item_category, items.item_price, orders.items_bought,
5
customers.customer_address, items.item_price *
6
orders.items_bought AS spend
7
FROM orders
8
INNER JOIN customers ON
9
orders.customer_id=customers.customer_id
10
INNER JOIN items ON orders.item_id=items.item_id
The SQL query first creates a view table called TotalOrders
. Then all columns needed for the TotalOrders
view table are selected from the orders table. After that, the SQL query uses the INNER JOIN
clause to join the customers table
and the items table
to the orders table using related columns. In simple terms, the SQL query gets all the columns with their data from the items table
and customers table
and adds them to the TotalOrders view table
.
To run the SQL query in DbVisualizer, click the SQL Commander
tab at the top of your screen and select New SQL Commander
, as shown below. Doing so will create a new tab to run SQL queries in.
Once the SQL Commander opens, type the SQL query and press the (Ctrl+Enter)
keys to run the query.
If you open the views
tab tree, you should see a view called totalorders has been created.
Open the view, and you should see more information about the orders.
Since the client also wants to get some analytics from his ecommerce business to monitor its performance, we can create a SQL query that shows top-selling products by total sales
, as shown below.
1
2
SELECT totalorders.item_name, SUM(spend) AS sales
3
FROM totalorders, items
4
WHERE totalorders.item_name=items.item_name
5
GROUP BY totalorders.item_name
6
ORDER BY SUM(spend) DESC
From the SQL query above, we are selecting the item_name
column from the totalorders
view table, summing up customer spending on each product as sales
where values in the item_name
column in the totalorders
view table are equal to values in item_name
column in items
table, and then, the results are grouped by the item_name
column in the totalorders
view table and ordered by sales in a descending order, as shown below.
When using DbVisualizer, you can present data from any table in a configurable chart displayed in a line, bar, area, or pie chart. To activate the chart view, click the rightmost button in the result toolbar, as shown below.
When switching to the chart view, DbVisualizer automatically picks the first date or text column as the X-axis
for the chart and the first numeric column as the Y-axis
. In the following example, the item_number
is the X-axis, and Sales
as Y-axis, as shown below.
Changing Chart Type
To change the chart type, right-click on the chart and select the chart type option. The screenshot below shows our chart is in line form.
We can convert the line chart into a bar chart by selecting the Bar option as shown below.
The chart shows Airpods as the top-selling product with 2400 in sales and Water Filter as lowest selling product with 184 in sales.
Conclusion
In this article, you have learned how to design, build, and normalize a PostgreSQL database from scratch. You have also learned how to connect Postgres to a SQL client, create a database, run SQL queries and convert table data into charts. To learn more about database development and follow the newest trends in the database space, make sure to follow the DbVisualizer blog, and we will see you in the next one.