This tutorial is about db modelling with a small example. Of course this is just an overwiew and there is much more. I assume you know how to create tables and Co.
Databases
A db is a collection of related data and it contains tables. Mostly it's called database system like (SQlite or MySQL, etc.) as it brings a lot of methods, functions and services with it like access methods like the db language to do queries or to create new tables.
Tables
A table can be called an object or class. It contains all data related to this object. Objects are "things" as they exist in reality like a customer, an item, a car, an order, etc. When you know this, you are almost there when you create a new db. For every object there should/must be an own table. So when you think about an app handling customers, orders and items you know you need 3 tables. So all the data related to a customer (like name and address) will be stored here. A table has columns and rows like you know it from Excel.
Relations
This is a very important point. Every object (table) interacts with each other as they are related to each other. A customer places an order so there's a relation between the customer and his/her order. Easy but important. Each order has items which is another relation. No order without items and no order without a customer placing it. Like views in B4x this relations can be seen as parent/child relations. In this case the cutomer is the parent and the orders are "children" of the customer object. Every "child" has a tag (foreign key) to identify it's parent.
Indexes
In the early days we had phone books on paper. To find someones phone number there was a register/index with the first letter of the surname. So we "stored" the person John Smith under "S". In every db system you can add an index to any column you want. The goal is to get a very fast access to the data you need at that time. If you need to get all birthdays of all persons you would need to scan all of the persons in you paper phone book which takes a lot of time. Using db's solves this problem by just adding a new index to your table. This can be done at any time of you need it. This of course depends on the needs of your app.
Unique/Primary Keys/Autoincrement
If you handle customers in a table you need to identify the exact one you mean. The name isn't a good idea because there are probably thousands of John Smith's you have as you customers. Even if you combine it with the birthday there are several Johns with the same birthdate. To uniquely identitify ONE person/customer there is always a Unique Key like customer number, social number, tax number, etc. You could create this unique ID on your own (if a specific format is needed) but all db systems help you here. You can add a column (integer) and define it as autoincrement. When you insert a new row (= new customer here) the db system cares automatically and inserts a new number. If the last one was 1111 the next will be 1112 and wil be unique. For every table I use this method so I can identify each entry by it's unique key (like customer 1112 has placed an order with order id 4455 with items with id's 6665, 9993 and 4444).
Normalization/Denormalization
Wikipedia: "Normalization or normalisation refers to a process that makes something more normal or regular"
Think about the customer you want to handle. A customer has an address and some other properties. He/she can have more than one address and more than one payment methods. The idea here is to store the addresses and payment methods to another table which is a good idea. Here we talk about db modeling. You can bring this idea to the max and "disassemble" a customer to the atoms like a customer is human and has children and is male and so one like a class with methods and entities, etc. This is sometimes too mathematic and leads to speed problems. On the other side it is very powerful when you have to create a CRM system (customer relation system).
The db model is one thing. To have a more realistic scenery you can "denormalizate" this model. Just like you meet your friends. Sometimes it's ok to have some drinks and food. You don't need to know everything at this moment. Just the things you need at that point. DB: You only create one table for the customer with one address and all properties. Enough for your app. In the model it's maybe 12 objects a customer is "made of".
Size doesn't matter & most common mistakes
I sometimes read threads with "My app is slow because the db is slow" or "I want to load 100.000 rows in a ListView". My answer here: Either a design or a programming mistake and the people wonder.
With a good design, size doesn't matter. If you have a db with 1 bilion entries, the selection of data is as fast as if you have with 100 entries. A good query takes 0.02 secs or less. If it's 2.4 secs (which is over a 100 times slower) there IS an error in your db design, query or even a problem in your code flow.
Never retrieve more data than a user can handle at one time! Loading 100.000 rows in a ListView does NOT make sense as one user can't handle so many data. Good practice is to load 50 or so and load another 50 if needed. Of course there are exceptions but then we are talking about a batch system which processes data and not a single user.
So let's start with a simple example
I want to build an app where customers can place orders with items. So I have 4 tables:
customers
orders
orderitems
items
Why orderitems AND items?
Items is the catalog of the items you CAN order. Orderitems are the items the customer actually orders. Imagine you change a price later. With this completed orders are "safe" as they represent the finished order as it was at the time of ordering.
In customers we have 3 customers:
CID Name
1 Bill Miller
2 Sam Smith
3 Caroline James
As you see every customer has it's uniquey ID. If we address "Sam Smith" we will use ID 2 in our app. The name is not important anymore (except to display it). Benefit, too: We save a lot of space as integers mostly need less space than a full name.
In items:
IID Itemname Price
1 Cake 10
2 Teddybear 15
3 Milk 1.5
Same here. Every item has an unique ID, too. Like in customers we only use the ID from here.
And what if I need to change the price from a specific date on?
Two thoughts here:
- the price for previous orders must be kept
- the new price is valid in the future
Just add to columns like validfrom and voidfrom. Insert a new item (same name) and by setting the dates in both you can easily set when the old is void and the new one is valid
IID Itemname Price validfrom voidfrom
1 Cake 10 2018-01-01 2018-04-04
4 Cake 12 2018-01-04 9999-12-31
2 Teddybear 15 2018-01-01 9999-12-31
3 Milk 1.5 2018-01-01 9999-12-31
To display the item list use
"Select * FROM items WHERE validfrom <= 'today' and voidfrom > 'today'"
which only displays ID 4 for new orders from April 4th and 1 before. Benefit: You can change a price for the future (like from January 1st, 2019).
So if Sam orders milk:
orders
OID CID
2233 = New Order 2 = Sam
By inserting a new row in orders the OrderId (OID) will be increased and set automatically. Customer is Sam with his ID 2 (see the customer's table)
orderitems
IOID OID IID Price
1 2233 = which order 3 = Milk 1.5
In words: Customer 2 has placed an order with an id 2233 and this order has one item with itemorderid 3 (= milk). If you add a second item:
IOID OID IID Price
1 2233 = which order 3 = Milk 1.5
2 2233 = same order 4 = Cake 12
Note: All data (like prices) in orderitems are frozen as it represents the order at the time the order was placed. This may not be important for today but you need to keep the data for years and the prices here may not change because it is a historic view (finished order).
Accessing orders and it's items
As you know the customer (here e.g. Sam Smith with ID 2) you can easily access his orders by
"Select * FROM orders WHERE CID=2"
As a customer can have more than one order, use a cursor here to process all of them. Assuming you display the orders in a view and clicking on it will show the order details (orderitems) you know the ID or the order and can access the items by
"Select * FROM orderitems WHERE OID=2233"
Add more columns you need like orderdate, orderstate, quantity, etc.
Databases
A db is a collection of related data and it contains tables. Mostly it's called database system like (SQlite or MySQL, etc.) as it brings a lot of methods, functions and services with it like access methods like the db language to do queries or to create new tables.
Tables
A table can be called an object or class. It contains all data related to this object. Objects are "things" as they exist in reality like a customer, an item, a car, an order, etc. When you know this, you are almost there when you create a new db. For every object there should/must be an own table. So when you think about an app handling customers, orders and items you know you need 3 tables. So all the data related to a customer (like name and address) will be stored here. A table has columns and rows like you know it from Excel.
Relations
This is a very important point. Every object (table) interacts with each other as they are related to each other. A customer places an order so there's a relation between the customer and his/her order. Easy but important. Each order has items which is another relation. No order without items and no order without a customer placing it. Like views in B4x this relations can be seen as parent/child relations. In this case the cutomer is the parent and the orders are "children" of the customer object. Every "child" has a tag (foreign key) to identify it's parent.
Indexes
In the early days we had phone books on paper. To find someones phone number there was a register/index with the first letter of the surname. So we "stored" the person John Smith under "S". In every db system you can add an index to any column you want. The goal is to get a very fast access to the data you need at that time. If you need to get all birthdays of all persons you would need to scan all of the persons in you paper phone book which takes a lot of time. Using db's solves this problem by just adding a new index to your table. This can be done at any time of you need it. This of course depends on the needs of your app.
Unique/Primary Keys/Autoincrement
If you handle customers in a table you need to identify the exact one you mean. The name isn't a good idea because there are probably thousands of John Smith's you have as you customers. Even if you combine it with the birthday there are several Johns with the same birthdate. To uniquely identitify ONE person/customer there is always a Unique Key like customer number, social number, tax number, etc. You could create this unique ID on your own (if a specific format is needed) but all db systems help you here. You can add a column (integer) and define it as autoincrement. When you insert a new row (= new customer here) the db system cares automatically and inserts a new number. If the last one was 1111 the next will be 1112 and wil be unique. For every table I use this method so I can identify each entry by it's unique key (like customer 1112 has placed an order with order id 4455 with items with id's 6665, 9993 and 4444).
Normalization/Denormalization
Wikipedia: "Normalization or normalisation refers to a process that makes something more normal or regular"
Think about the customer you want to handle. A customer has an address and some other properties. He/she can have more than one address and more than one payment methods. The idea here is to store the addresses and payment methods to another table which is a good idea. Here we talk about db modeling. You can bring this idea to the max and "disassemble" a customer to the atoms like a customer is human and has children and is male and so one like a class with methods and entities, etc. This is sometimes too mathematic and leads to speed problems. On the other side it is very powerful when you have to create a CRM system (customer relation system).
The db model is one thing. To have a more realistic scenery you can "denormalizate" this model. Just like you meet your friends. Sometimes it's ok to have some drinks and food. You don't need to know everything at this moment. Just the things you need at that point. DB: You only create one table for the customer with one address and all properties. Enough for your app. In the model it's maybe 12 objects a customer is "made of".
Size doesn't matter & most common mistakes
I sometimes read threads with "My app is slow because the db is slow" or "I want to load 100.000 rows in a ListView". My answer here: Either a design or a programming mistake and the people wonder.
With a good design, size doesn't matter. If you have a db with 1 bilion entries, the selection of data is as fast as if you have with 100 entries. A good query takes 0.02 secs or less. If it's 2.4 secs (which is over a 100 times slower) there IS an error in your db design, query or even a problem in your code flow.
Never retrieve more data than a user can handle at one time! Loading 100.000 rows in a ListView does NOT make sense as one user can't handle so many data. Good practice is to load 50 or so and load another 50 if needed. Of course there are exceptions but then we are talking about a batch system which processes data and not a single user.
So let's start with a simple example
I want to build an app where customers can place orders with items. So I have 4 tables:
customers
orders
orderitems
items
Why orderitems AND items?
Items is the catalog of the items you CAN order. Orderitems are the items the customer actually orders. Imagine you change a price later. With this completed orders are "safe" as they represent the finished order as it was at the time of ordering.
In customers we have 3 customers:
CID Name
1 Bill Miller
2 Sam Smith
3 Caroline James
As you see every customer has it's uniquey ID. If we address "Sam Smith" we will use ID 2 in our app. The name is not important anymore (except to display it). Benefit, too: We save a lot of space as integers mostly need less space than a full name.
In items:
IID Itemname Price
1 Cake 10
2 Teddybear 15
3 Milk 1.5
Same here. Every item has an unique ID, too. Like in customers we only use the ID from here.
And what if I need to change the price from a specific date on?
Two thoughts here:
- the price for previous orders must be kept
- the new price is valid in the future
Just add to columns like validfrom and voidfrom. Insert a new item (same name) and by setting the dates in both you can easily set when the old is void and the new one is valid
IID Itemname Price validfrom voidfrom
1 Cake 10 2018-01-01 2018-04-04
4 Cake 12 2018-01-04 9999-12-31
2 Teddybear 15 2018-01-01 9999-12-31
3 Milk 1.5 2018-01-01 9999-12-31
To display the item list use
"Select * FROM items WHERE validfrom <= 'today' and voidfrom > 'today'"
which only displays ID 4 for new orders from April 4th and 1 before. Benefit: You can change a price for the future (like from January 1st, 2019).
So if Sam orders milk:
orders
OID CID
2233 = New Order 2 = Sam
By inserting a new row in orders the OrderId (OID) will be increased and set automatically. Customer is Sam with his ID 2 (see the customer's table)
orderitems
IOID OID IID Price
1 2233 = which order 3 = Milk 1.5
In words: Customer 2 has placed an order with an id 2233 and this order has one item with itemorderid 3 (= milk). If you add a second item:
IOID OID IID Price
1 2233 = which order 3 = Milk 1.5
2 2233 = same order 4 = Cake 12
Note: All data (like prices) in orderitems are frozen as it represents the order at the time the order was placed. This may not be important for today but you need to keep the data for years and the prices here may not change because it is a historic view (finished order).
Accessing orders and it's items
As you know the customer (here e.g. Sam Smith with ID 2) you can easily access his orders by
"Select * FROM orders WHERE CID=2"
As a customer can have more than one order, use a cursor here to process all of them. Assuming you display the orders in a view and clicking on it will show the order details (orderitems) you know the ID or the order and can access the items by
"Select * FROM orderitems WHERE OID=2233"
Add more columns you need like orderdate, orderstate, quantity, etc.