An invoice can contain one or more orders. How can I achieve this?
Example of Invoice:
OrderID | Order Date | Amount
31 10/02/2011 £1.50
43 12/02/2011 £1.50
74 13/02/2011 £5.00
=======
Total £8.00
If the Total is negative (e.g. -8.00), it means the client owes me money.
If it is positive, I pay the client.
Here is what I came up with:
Orders Table
CREATE TABLE IF NOT EXISTS `orders` (
`OrderID` int(11) NOT NULL AUTO_INCREMENT,
`Total` decimal(6,2) NOT NULL,
`OrderDate` datetime NOT NULL,
`Status` int(11) NOT NULL,
`userID` int(11) NOT NULL,
`InvoiceID` int(11) NOT NULL,
PRIMARY KEY (`OrderID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=4 ;
Invoice Table
CREATE TABLE IF NOT EXISTS `invoice` (
`InvoiceID` int(11) NOT NULL DEFAULT '0',
`InvoiceDate` datetime NOT NULL,
`Amount` decimal(6,2) NOT NULL,
`Status` int(11) NOT NULL,
PRIMARY KEY (`InvoiceID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
invoice.Status (0 Processing, 1 Invoice Sent, 2 Cancelled, 3 Completed)
Or are there better statuses to use?
Payment Table
CREATE TABLE IF NOT EXISTS `payment` (
`PaymentID` int(11) NOT NULL AUTO_INCREMENT,
`InvoiceID` int(11) NOT NULL,
`Amount` decimal(6,2) NOT NULL,
`DatePayment` datetime NOT NULL,
`PaymentType` int(11) NOT NULL,
PRIMARY KEY (`PaymentID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
payment.PaymentType = (1: Payment Received From Customer (Owes Money), 2: Payment Sent To Customer)
Database Result:
mysql> select * from orders;
+---------+-------+---------------------+--------+--------+-----------+
| OrderID | Total | OrderDate | Status | userID | InvoiceID |
+---------+-------+---------------------+--------+--------+-----------+
| 1 | 20.00 | 2011-06-18 15:51:51 | 1 | 123 | 1 |
| 2 | 10.00 | 2011-06-19 15:51:57 | 1 | 123 | 1 |
| 3 | 5.00 | 2011-06-20 15:52:00 | 1 | 123 | 1 |
+---------+-------+---------------------+--------+--------+-----------+
mysql> select * from invoice;
+-----------+---------------------+--------+--------+
| InvoiceID | InvoiceDate | Amount | Status |
+-----------+---------------------+--------+--------+
| 1 | 2011-06-30 15:55:21 | 35.00 | 1 |
+-----------+---------------------+--------+--------+
mysql> select * from payment;
+-----------+-----------+--------+---------------------+-------------+
| PaymentID | InvoiceID | Amount | DatePayment | PaymentType |
+-----------+-----------+--------+---------------------+-------------+
| 1 | 1 | 35.00 | 2011-06-29 15:56:16 | 1 |
+-----------+-----------+--------+---------------------+-------------+
Am I on the right path? What can be improved or changed? Any suggestions?
Thanks.
OK, you have some serious issues here. Orders have multiple items, invoices have multiple orders, and payments may apply to multiple orders and invoices. Orders may appear on multiple invoices (if the customer doesn't pay right away, which is common).
So what you need are linking tables. You should start with an ORDERINVOICE table which has both the order ID and the invoice ID, then an ORDERPAYMENT table with the payment ID and the order ID.
You also need to consider that in an ordering situation, you must record the details of the order as they were at the time. That means that while you should have the user_id to link to the current user, you should also record the user's name, billing address and shipping address as they were at the time of the order. You will need this information later to deal with any questions about the order. Further, you need to store the details of the order in a separate table called ORDERDETAILS, which stores the individual line items, the price at the time of the order and the name of the item ordered. You will need this for accounting reasons. You do not under any circumstances want to rely on a join to a product table to figure out the price of a past order. This will make your financial records inaccurate.
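To make the snapshot idea concrete, here is a minimal sketch using Python's built-in sqlite3 module so it can run anywhere; the answer itself is about MySQL, and every table and column name below (products, order_details, order_invoice, price_pence and so on) is an assumption made for illustration, not part of the original schema. Amounts are stored as integer pence to keep the arithmetic exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT, price_pence INTEGER);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER,
                     billing_name TEXT, billing_address TEXT);
-- Line items copy the item name and price as they were at order time.
CREATE TABLE order_details (order_id INTEGER, product_id INTEGER,
                            item_name TEXT, unit_price_pence INTEGER, quantity INTEGER);
-- Linking table: one order may appear on several invoices.
CREATE TABLE order_invoice (order_id INTEGER, invoice_id INTEGER,
                            PRIMARY KEY (order_id, invoice_id));

INSERT INTO products VALUES (1, 'Widget', 150);
INSERT INTO orders VALUES (31, 123, 'A. Client', '1 Example Street');
INSERT INTO order_details VALUES (31, 1, 'Widget', 150, 2);
INSERT INTO order_invoice VALUES (31, 1), (31, 2);  -- re-billed on a later invoice
""")

# The catalogue price changes after the order was placed.
conn.execute("UPDATE products SET price_pence = 200 WHERE product_id = 1")

# Total from the snapshot stored with the order (stays at the original price).
snapshot_total = conn.execute(
    "SELECT SUM(unit_price_pence * quantity) FROM order_details WHERE order_id = 31"
).fetchone()[0]

# Total recomputed through a join to the product table (silently drifts).
joined_total = conn.execute(
    "SELECT SUM(p.price_pence * d.quantity) FROM order_details d "
    "JOIN products p ON p.product_id = d.product_id WHERE d.order_id = 31"
).fetchone()[0]

invoice_count = conn.execute(
    "SELECT COUNT(*) FROM order_invoice WHERE order_id = 31"
).fetchone()[0]
```

The two totals disagree as soon as the catalogue price changes, which is the accounting problem described above; the linking table lets the same order sit on two invoices without duplicating the order row.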
Looks good.
The only thing I would add are some details like transaction id / check number to the payment table. This way you keep all the payment details together.
It looks all right to me; this is what I would have done as well.
(I would think a payment is linked with an order, but if you intended to link it to an invoice this is fine)
Regards,
MN
Without knowing more about your requirements, so far so good.
Be sure to store your Invoice Status and Payment Type decodes in a lookup table so that they can be enforced in the database and don't have to rely on programmers coding it correctly.
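A rough sketch of what such a lookup table could look like, written against SQLite through Python's sqlite3 module so it is self-contained. The table name invoice_status and its columns are assumptions. In MySQL the same idea needs InnoDB tables, since MyISAM (used in the question) does not enforce foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
CREATE TABLE invoice_status (
    status_id INTEGER PRIMARY KEY,
    label     TEXT NOT NULL UNIQUE
);
INSERT INTO invoice_status VALUES
    (0, 'Processing'), (1, 'Invoice Sent'), (2, 'Cancelled'), (3, 'Completed');

CREATE TABLE invoice (
    invoice_id INTEGER PRIMARY KEY,
    status_id  INTEGER NOT NULL REFERENCES invoice_status(status_id)
);
""")

# A known status code is accepted.
conn.execute("INSERT INTO invoice VALUES (1, 1)")

# An unknown status code is rejected by the database itself.
try:
    conn.execute("INSERT INTO invoice VALUES (2, 9)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT COUNT(*) FROM invoice").fetchone()[0]
```

The same pattern would apply to payment.PaymentType: the decode lives in one place and a bad code cannot be stored, whatever the application does.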
I'm looking for some help with a SQL/MySQL problem.
I have three source tables:
CREATE TABLE `customers` (
`cid` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`customer_name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`cid`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8
CREATE TABLE `standards` (
`sid` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`standard_name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`sid`)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8
CREATE TABLE `partial_standard_compliance` (
`customer` bigint(20) unsigned NOT NULL,
`standard` bigint(20) unsigned NOT NULL,
`standard_compliance` bigint(20) unsigned DEFAULT NULL,
`created_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8
The idea is a customer gives themselves a rating using the standard_compliance column in the partial_standard_compliance table.
Customers can rate the same standard multiple times.
Result example:
+----------+----------+---------------------+---------------------+
| customer | standard | standard_compliance | created_time |
+----------+----------+---------------------+---------------------+
| 1 | 1 | 50 | 2023-01-28 16:19:34 |
| 1 | 1 | 60 | 2023-01-28 16:19:40 |
| 1 | 1 | 70 | 2023-01-28 16:19:48 |
| 2 | 10 | 30 | 2023-01-28 16:58:21 |
| 2 | 8 | 60 | 2023-01-28 16:58:32 |
| 2 | 9 | 60 | 2023-01-28 16:58:39 |
| 2 | 9 | 80 | 2023-01-28 16:58:43 |
+----------+----------+---------------------+---------------------+
I need to create a 4th table that has customer name, standard name and the most recent rating they have given themselves.
I have been trying with JOINS and CREATE AS SELECT, but haven't been able to solve it.
Any point in the right direction would be great. Thanks.
I have been trying with JOINS and CREATE AS SELECT
I need to create a 4th table that has customer name, standard name and
the most recent rating they have given themselves
It would be better to create a view instead.
create view fourth_table as
select customer_name ,
standard_name ,
standard_compliance,
created_time
from (select c.customer_name,
s.standard_name,
psc.standard_compliance,
psc.created_time,
row_number() over(partition by psc.customer, psc.standard order by psc.created_time desc ) as rn -- latest row per customer and standard
from customers c
inner join partial_standard_compliance psc on psc.customer=c.cid
inner join standards s on s.sid=psc.standard
) x
where rn=1;
https://dbfiddle.uk/ZiK-k8jN
MySQL View
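For servers without window functions, the same "latest rating per customer and standard" result can be had by joining back to a MAX(created_time) subquery. The sketch below runs that query against SQLite via Python's sqlite3 module using the sample rows from the question; the customer and standard names are made up, since the question does not show them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (cid INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE standards (sid INTEGER PRIMARY KEY, standard_name TEXT);
CREATE TABLE partial_standard_compliance (
    customer INTEGER, standard INTEGER,
    standard_compliance INTEGER, created_time TEXT);

-- Names are hypothetical placeholders.
INSERT INTO customers VALUES (1, 'Customer 1'), (2, 'Customer 2');
INSERT INTO standards VALUES (1, 'Std 1'), (8, 'Std 8'), (9, 'Std 9'), (10, 'Std 10');
INSERT INTO partial_standard_compliance VALUES
    (1, 1, 50, '2023-01-28 16:19:34'),
    (1, 1, 60, '2023-01-28 16:19:40'),
    (1, 1, 70, '2023-01-28 16:19:48'),
    (2, 10, 30, '2023-01-28 16:58:21'),
    (2, 8, 60, '2023-01-28 16:58:32'),
    (2, 9, 60, '2023-01-28 16:58:39'),
    (2, 9, 80, '2023-01-28 16:58:43');
""")

latest = conn.execute("""
SELECT c.customer_name, s.standard_name, p.standard_compliance
FROM partial_standard_compliance p
JOIN (SELECT customer, standard, MAX(created_time) AS latest_time
      FROM partial_standard_compliance
      GROUP BY customer, standard) m
  ON m.customer = p.customer
 AND m.standard = p.standard
 AND m.latest_time = p.created_time
JOIN customers c ON c.cid = p.customer
JOIN standards s ON s.sid = p.standard
ORDER BY c.cid, s.sid
""").fetchall()
```

One caveat of this form: if a customer rates the same standard twice within the same second, both rows tie for the latest time and both are returned.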
I'm trying to figure out which is the best way to optimize my current selection query on a MySQL database.
I have 2 MySQL tables with a one-to-many relationship. One is the user table, which contains the unique list of users and has around 22k rows. The other is the linedata table, which contains all the possible coordinates for each user and has around 490k rows.
In this case we can assume the foreign key between the 2 tables is the id value. In the user table the id is also the auto-increment primary key, while in the linedata table it is not the primary key because there can be several rows for the same user.
The CREATE STMT structure
CREATE TABLE `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`isActive` tinyint(4) NOT NULL,
`userId` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`gender` varchar(45) COLLATE utf8_unicode_ci NOT NULL,
`age` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=21938 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `linedata` (
`id` int(11) NOT NULL,
`userId` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`timestamp` datetime NOT NULL,
`x` float NOT NULL,
`y` float NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
The selection query
SELECT
u.id,
u.isActive,
u.userId,
u.name,
u.gender,
u.age,
GROUP_CONCAT(CONCAT_WS(', ',timestamp,x, y)
ORDER BY timestamp ASC SEPARATOR '; '
) as linedata_0
FROM user u
JOIN linedata l
ON u.id=l.id
WHERE DATEDIFF(l.timestamp, '2018-02-28T20:00:00.000Z') >= 0
AND DATEDIFF(l.timestamp, '2018-11-20T09:20:08.218Z') <= 0
GROUP BY userId;
The EXPLAIN output
+-------+---------------+-----------+-----------+-------------------+-----------+---------------+-----------+-----------+------------------------------------------------------------+
| ID | SELECT_TYPE | TABLE | TYPE | POSSIBLE_KEYS | KEY | KEY_LEN | REF | ROWS | EXTRA |
+-------+---------------+-----------+-----------+-------------------+-----------+---------------+-----------+-----------+------------------------------------------------------------+
| 1 | SIMPLE | l | ALL | NULL | NULL | NULL | NULL | 491157 | "Using where; Using temporary; Using filesort" |
+-------+---------------+-----------+-----------+-------------------+-----------+---------------+-----------+-----------+------------------------------------------------------------+
| 1 | SIMPLE | u | eq_ref | PRIMARY | PRIMARY | 4 | l.id | 1 | NULL |
+-------+---------------+-----------+-----------+-------------------+-----------+---------------+-----------+-----------+------------------------------------------------------------+
The selection query works if, for example, I add another WHERE condition to filter for specific users. If I select just 200 users, execution takes around 14 seconds; with just the first 100 users it takes around 7 seconds. But with only the datetime range condition, the query seems to run without ever finishing. Any suggestions?
UPDATE
After following Rick's suggestions, the query now runs in around 14 seconds. Here is the EXPLAIN EXTENDED output:
id,select_type,table,type,possible_keys,key,key_len,ref,rows,filtered,Extra
1,PRIMARY,u,index,PRIMARY,PRIMARY,4,NULL,21959,100.00,NULL
1,PRIMARY,l,ref,id_timestamp_index,id_timestamp_index,4,u.id,14,100.00,"Using index condition"
2,"DEPENDENT SUBQUERY",NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,"No tables used"
I have changed some of the table values:
The id in the user table can now be joined with userId in the linedata table, and both are integers. Only the userId value in the user table stays a string, because it is a long string identifier like 0000309ab2912b2fd34350d7e6c079846bb6c5e1f97d3ccb053d15061433e77a_0.
So, as a quick example, the user and linedata tables will contain:
+-------+-----------+-----------+-------------------+--------+---+
| id | isActive | userId | name | gender |age|
+-------+-----------+-----------+-------------------+--------+---+
| 1 | 1 | x4by4d | john | m | 22|
| 2 | 1 | 3ub3ub | bob | m | 50|
+-------+-----------+-----------+-------------------+--------+---+
+-------+-----------+-----------+------+----+
| id | userId | timestamp | x | y |
+-------+-----------+-----------+------+----+
| 1 | 1 | somedate | 30 | 10 |
| 2 | 1 | somedate | 45 | 15 |
| 3 | 1 | somedate | 50 | 20 |
| 4 | 2 | somedate | 20 | 5 |
| 5 | 2 | somedate | 25 | 10 |
+-------+-----------+-----------+------+----+
I have added a compound index on the userId and timestamp columns of the linedata table.
Would it improve performance if, instead of an auto-increment id as the primary key of the linedata table, I used a composite primary key made of userId + timestamp?
I need to help you fix several bugs before discussing performance.
First of all, '2018-02-28T20:00:00.000Z' won't work in MySQL. It needs to be '2018-02-28 20:00:00.000' and something needs to be done about the timezone.
Then, don't "hide a column in a function". That is, DATEDIFF(l.timestamp ...) cannot use any index on timestamp.
So, instead of
WHERE DATEDIFF(l.timestamp, '2018-02-28T20:00:00.000Z') >= 0
AND DATEDIFF(l.timestamp, '2018-11-20T09:20:08.218Z') <= 0
do something like
WHERE l.timestamp >= '2018-02-28 20:00:00.000'
AND l.timestamp < '2018-11-20 09:20:08.218'
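The effect of wrapping the column in a function can be seen in a query plan. The sketch below uses SQLite through Python's sqlite3 module rather than MySQL, so it is only an analogy: SQLite has no DATEDIFF, so julianday() stands in for it, the index name idx_ts is an assumption, and the exact plan wording differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE linedata (id INTEGER PRIMARY KEY, userId INTEGER,
                       timestamp TEXT, x REAL, y REAL);
CREATE INDEX idx_ts ON linedata(timestamp);
""")

def plan(where_clause):
    """Return the planner's description for a SELECT with the given WHERE clause."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM linedata WHERE " + where_clause
    ).fetchall()
    return " ".join(str(row[-1]) for row in rows)  # last column holds the detail text

# Column hidden inside a function: the index on timestamp cannot be used.
wrapped_plan = plan(
    "julianday(timestamp) - julianday('2018-02-28 20:00:00') >= 0"
)

# Bare column compared against constants: a range search on the index is possible.
bare_plan = plan(
    "timestamp >= '2018-02-28 20:00:00' AND timestamp < '2018-11-20 09:20:08'"
)

print(wrapped_plan)
print(bare_plan)
```

In MySQL the equivalent check is to run EXPLAIN on both forms of the WHERE clause and compare the type and key columns.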
I'm confused about the two tables. Both have id and userid, yet you join on id. Perhaps instead of
CREATE TABLE `linedata` (
`id` int(11) NOT NULL,
`userId` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
...
you meant
CREATE TABLE `linedata` (
`id` int(11) NOT NULL AUTO_INCREMENT, -- (the id for `linedata`)
`userId` int NOT NULL, -- to link to the other table
...
PRIMARY KEY(id)
...
Then there could be several linedata rows for each user.
At that point, this
JOIN linedata l ON u.id=l.id
becomes
JOIN linedata l ON u.id=l.userid
Now, for performance: linedata needs INDEX(userid, timestamp) - in that order.
Now, think about the output. You are asking for up to 22K rows, with possibly hundreds of "ts,x,y" strung together in one of the columns. What will receive this much data? Will it choke on it?
And GROUP_CONCAT has a default limit of 1024 bytes. That will allow for about 50 points. If a 'user' can be in more than 50 spots in about 9 months (the range in your query), consider increasing group_concat_max_len before running the query.
To make it work even faster, reformulate it this way:
SELECT u.id, u.isActive, u.userId, u.name, u.gender, u.age,
( SELECT GROUP_CONCAT(CONCAT_WS(', ', l.timestamp, l.x, l.y)
ORDER BY l.timestamp ASC
SEPARATOR '; ')
FROM linedata l
WHERE l.userid = u.id -- correlate the subquery with the outer user row
AND l.timestamp >= '2018-02-28 20:00:00.000'
AND l.timestamp < '2018-11-20 09:20:08.218'
) as linedata_0 -- NULL for users with no points in the range
FROM user u;
Another thing. You probably want to be able to look up a user by name; so add INDEX(name)
Oh, what the heck is the VARCHAR(255) for userID?? Ids are normally integers.
I'm new to MySQL and I'm trying to insert data into a table from a different table. I've searched and found that I need to do something like this:
INSERT INTO Customers (CustomerName, City, Country)
SELECT SupplierName, City, Country FROM Suppliers;
These are the tables' composition:
Credit_request:
Fieldname DataType
| request_id | int(10) unsigned
| customer_id | int(11)
| total_credit_value | int(11)
| Credit_request_checked | set('yes','no')
| does_apply | enum('yes','no')
| creation_time | datetime
Customers:
Fieldname DataType
| Customer_id | int(11) unsigned
| name | varchar(70)
| lastname | varchar(70)
| sex | enum('M','F')
| personal_id | varchar(16)
| phone_number | varchar(20)
| email | varchar(70)
| birthdate | date
| address | varchar(70)
| city | varchar(70)
| job | varchar(60)
| salary | int(11)
| registration_date | datetime
+-------------------+------------------
but when I try, I get a syntax error. This is my code:
INSERT INTO Credit_request(null,'Carlos',custormer_id
,default,default,
default,default,default,default)
SELECT Customer_id FROM Customers WHERE Customer_name ='Carlos';
These default values are supposed to be there; I've set them. However, I've noticed that to do this kind of insert I have to reference each field name, but in this case I just want one piece of information from the other table. Any help would be appreciated.
If you want to insert only specific columns, you can name just those columns and the rest will be assigned their default values. Your script would be:
INSERT INTO Credit_request(customer_id)
SELECT Customer_id FROM Customers WHERE name ='Carlos';
This will add one row to the Credit_request table with Carlos's Customer_id in the customer_id column and the default value (NULL where no default is defined) in the other columns.
You've got your syntax a little mixed up. The INSERT statement should declare the columns you're inserting into, the SELECT statement declares the values. For columns that have a default, you can omit them. Nullable columns always default to NULL unless otherwise stated.
INSERT INTO Credit_request(
`customer_id`,
`creation_time`
) SELECT
`Customer_id`,
NOW()
FROM `Customers`
WHERE `name` = 'Carlos';
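As a quick check of how omitted columns behave, here is a small sketch run against SQLite via Python's sqlite3 module. The column types are simplified and the default values (0, 'no', the current timestamp) are assumptions, since the question says defaults exist but does not show them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (Customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Credit_request (
    request_id         INTEGER PRIMARY KEY AUTOINCREMENT,
    customer_id        INTEGER,
    total_credit_value INTEGER DEFAULT 0,               -- assumed default
    does_apply         TEXT    DEFAULT 'no',            -- assumed default
    creation_time      TEXT    DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO Customers (Customer_id, name) VALUES (7, 'Carlos'), (8, 'Ana');
""")

# Only customer_id is listed, so every other column takes its default.
conn.execute(
    "INSERT INTO Credit_request (customer_id) "
    "SELECT Customer_id FROM Customers WHERE name = ?",
    ("Carlos",),
)

rows = conn.execute(
    "SELECT request_id, customer_id, total_credit_value, does_apply "
    "FROM Credit_request"
).fetchall()
has_timestamp = conn.execute(
    "SELECT creation_time IS NOT NULL FROM Credit_request"
).fetchone()[0]
```

The column list after the table name and the SELECT list must have the same number of entries; that mismatch is what the original statement ran into.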
In a scheduling application I am working on, I am dealing with a fairly complex database schema that describes a series of kids assigned to groups on timeslots on certain dates. In this schema, I want to query the number of kids scheduled in a certain group for a certain timeslot over a certain range of dates.
DB Schema
Timeslot: A timeslot has a certain start and end time (e.g. 13:00 - 18:00). Time can vary in 15-minute steps. In our application we want to schedule a kid on a group for the duration of this timeslot.
Time slice: For every 15 minutes in a 24-hour period there is a time slice record (96 in total). 15 minutes is the smallest possible planning unit. A timeslot is assigned to each slice covered between its start and end time (for example, timeslot 13:00-18:00 will have a record pointing to each of the time slices [13:00, 13:15, 13:30...17:45]). This makes it possible to count how many kids are 'occupying' the same time slice at any given time and date.
Kid: A kid is simply the entity being scheduled
Group: A group is a representation of a physical location with a specific capacity
GroupAssignment: A group assignment is bound in time. Between date 1 and 2 it could be group A, between date 2 and 3 it could be group B.
Occupancy: The main scheduling record. This has a timeslot_id, kid_id, start and end date. note: a kid is scheduled on the start day and every subsequent 7 days up to the end date.
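As a small illustration of the time slice definition above, the following Python sketch lists the 15-minute slices a timeslot covers. The function name slices_for is made up for this example and is not part of the application.

```python
from datetime import datetime, timedelta

def slices_for(start, end, step_minutes=15):
    """Return the start time ("HH:MM") of every slice in the half-open range [start, end)."""
    current = datetime.strptime(start, "%H:%M")
    stop = datetime.strptime(end, "%H:%M")
    result = []
    while current < stop:
        result.append(current.strftime("%H:%M"))
        current += timedelta(minutes=step_minutes)
    return result

covered = slices_for("13:00", "18:00")
print(len(covered), covered[0], covered[-1])
```

Each entry corresponds to one timeslot_slices row for that timeslot, which is why a longer timeslot multiplies the number of rows the query has to join.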
DB Schema SQL
The number of records can be roughly derived from the auto_increment value. If not present, I mentioned them manually.
CREATE TABLE `group_assignment_caches` (
`group_id` int(11) DEFAULT NULL,
`occupancy_id` int(11) DEFAULT NULL,
`start` date DEFAULT NULL,
`end` date DEFAULT NULL,
KEY `index_group_assignment_caches_on_occupancy_id` (`occupancy_id`),
KEY `index_group_assignment_caches_on_group_id` (`group_id`),
KEY `index_group_assignment_caches_on_start_and_end` (`start`,`end`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
/* (~1500 records) */
CREATE TABLE `kids` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`archived` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=592 DEFAULT CHARSET=utf8;
CREATE TABLE `occupancies` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`kid_id` int(11) DEFAULT NULL,
`timeslot_id` int(11) DEFAULT NULL,
`start` date DEFAULT NULL,
`end` date DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_occupancies_on_kid_id` (`kid_id`),
KEY `index_occupancies_on_timeslot_id` (`timeslot_id`),
KEY `index_occupancies_on_start_and_end` (`start`,`end`)
) ENGINE=InnoDB AUTO_INCREMENT=2675 DEFAULT CHARSET=utf8;
CREATE TABLE `time_slices` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`start` time DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_time_slices_on_start` (`start`)
) ENGINE=InnoDB AUTO_INCREMENT=97 DEFAULT CHARSET=latin1;
CREATE TABLE `timeslot_slices` (
`timeslot_id` int(11) DEFAULT NULL,
`time_slice_id` int(11) DEFAULT NULL,
KEY `index_timeslot_slices_on_timeslot_id` (`timeslot_id`),
KEY `index_timeslot_slices_on_time_slice_id` (`time_slice_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
/* (~1500 records) */
CREATE TABLE `timeslots` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`start` time DEFAULT NULL,
`end` time DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=91 DEFAULT CHARSET=utf8;
Current solution
So far, I have designed the following query to tie it all together. While it does work, it scales very poorly. Running the query with 1 date, 1 timeslot and 1 group takes about 50ms. However, with 100 dates this becomes 1000ms, and when you start adding groups and timeslots the runtime quickly rises exponentially into multiple seconds. I've noticed that the runtime is highly dependent on the size of the timeslot: when a specific timeslot covers more time slices, the runtime escalates rapidly!
SELECT subq.date, subq.group_id, subq.timeslot_id, MAX(subq.spots) AS max_spots
FROM (
SELECT di.date,
ts.start,
gac.group_id AS group_id,
tss2.timeslot_id AS timeslot_id,
COUNT(*) AS spots
FROM date_intervals di,
timeslot_slices tss2,
occupancies o
JOIN timeslots t ON o.timeslot_id = t.id
JOIN group_assignment_caches gac ON o.id = gac.occupancy_id
JOIN timeslot_slices tss1 ON t.id = tss1.timeslot_id
JOIN time_slices ts ON tss1.time_slice_id = ts.id
JOIN kids k ON o.kid_id = k.id
WHERE di.date BETWEEN gac.start AND gac.end
AND di.date BETWEEN o.start AND o.end
AND MOD(DATEDIFF(di.date, o.start),7)=0
AND k.archived = 0
AND tss1.time_slice_id = tss2.time_slice_id
AND gac.group_id IN (3) AND tss2.timeslot_id IN (5)
GROUP BY ts.start, di.date, group_id, timeslot_id
) subq
GROUP BY subq.date, subq.group_id, subq.timeslot_id
Note that running the derived subquery separately takes the same amount of time. This yields 1 record with the number of occupancies for each time slice (15 min) for the given group in the given timeslot. This is great for debugging. Obviously I am only interested in the max number of occupancies for the entire timeslot.
Date_intervals is not described in the schema. This is a temporary table I fill using a REPEAT statement at the beginning of this procedure call. Its only column is 'date' and it's filled with 10-300 dates generally in most situations. The query should be able to handle this.
If I EXPLAIN this query, I get the following results. I am not really sure how to go further from here. The first row about the derived table can be ignored, since executing the subquery takes the same amount of time. The only other table not using an index is date_intervals di which is a small temporary table with 122 records.
+----+-------------+------------+--------+----------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------+---------+----------------------------+------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+----------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------+---------+----------------------------+------+------------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 5124 | Using temporary; Using filesort |
| 2 | DERIVED | tss2 | ref | index_timeslot_slices_on_timeslot_id,index_timeslot_slices_on_time_slice_id | index_timeslot_slices_on_timeslot_id | 5 | | 42 | Using where; Using temporary; Using filesort |
| 2 | DERIVED | ts | eq_ref | PRIMARY | PRIMARY | 4 | ookidoo.tss2.time_slice_id | 1 | |
| 2 | DERIVED | tss1 | ref | index_timeslot_slices_on_timeslot_id,index_timeslot_slices_on_time_slice_id | index_timeslot_slices_on_time_slice_id | 5 | ookidoo.tss2.time_slice_id | 6 | Using where |
| 2 | DERIVED | o | ref | PRIMARY,index_occupancies_on_timeslot_id,index_occupancies_on_kid_id,index_occupancies_on_start_and_end | index_occupancies_on_timeslot_id | 5 | ookidoo.tss1.timeslot_id | 6 | Using where |
| 2 | DERIVED | k | eq_ref | PRIMARY | PRIMARY | 4 | ookidoo.o.kid_id | 1 | Using where |
| 2 | DERIVED | gac | ref | index_group_assignment_caches_on_occupancy_id,index_group_assignment_caches_on_start_and_end,index_group_assignment_caches_on_group_id | index_group_assignment_caches_on_occupancy_id | 5 | ookidoo.o.id | 1 | Using where |
| 2 | DERIVED | di | range | PRIMARY | PRIMARY | 3 | NULL | 1 | Range checked for each record (index map: 0x1) |
| 2 | DERIVED | t | eq_ref | PRIMARY | PRIMARY | 4 | ookidoo.o.timeslot_id | 1 | Using where; Using index |
+----+-------------+------------+--------+----------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------+---------+----------------------------+------+------------------------------------------------+
Current results
The above query yields the following results (122 records, abbreviated)
+------------+----------+-------------+-----------+
| date | group_id | timeslot_id | max_spots |
+------------+----------+-------------+-----------+
| 2012-08-20 | 3 | 5 | 12 |
| 2012-08-27 | 3 | 5 | 12 |
| 2012-09-03 | 3 | 5 | 12 |
| 2012-09-10 | 3 | 5 | 12 |
+------------+----------+-------------+-----------+
| 2014-11-24 | 3 | 5 | 15 |
| 2014-12-01 | 3 | 5 | 15 |
| 2014-12-08 | 3 | 5 | 15 |
| 2014-12-15 | 3 | 5 | 15 |
+------------+----------+-------------+-----------+
Wrapping up
I would like to know a way to restructure either my query or my database schema to make querying this information less time consuming. I can't imagine this being impossible, considering how few records there are in this database (tens to thousands for most tables).
Any sufficiently complex problem can bring a computer to its knees. Actually, it's easy to create a complex problem, and difficult to make a complex problem easy.
Your single query is very complex. It goes over the entire database. Is that necessary? What happens if, for instance, you restrict it to one date? Does it scale better?
Using just a single query to do a complex task is often very efficient, but not always, as you've found out. I often find that the only way to break the exponential time needed to execute the task, is to split it up in multiple steps. One date at a time, for instance. Perhaps you don't always need them all?
In some of those cases I use an intermediate SQLite database that resides in memory. Operations on a small (!) temporary database in memory are very fast. It works like this:
$SQLiteDB = new PDO("sqlite::memory:");
$SQLiteDB->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$SQL = "<any valid sqlite query>";
$SQLiteDB->query($SQL);
First check that you have the sqlite PHP module installed. Read the manual:
http://www.sqlite.org
When using this you first create tables in your new database and then you populate them with the needed data. You can use prepared statements if you have to copy multiple rows.
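A rough Python equivalent of that create-then-populate step, using the standard sqlite3 module, might look like the sketch below. The slice_counts table and the sample rows are hypothetical; in practice the rows would come from a narrower query against the main database.

```python
import sqlite3

# Hypothetical per-slice counts already fetched from the main database:
# (date, group_id, timeslot_id, slice_start, spots)
fetched = [
    ("2012-08-20", 3, 5, "13:00", 10),
    ("2012-08-20", 3, 5, "13:15", 12),
    ("2012-08-20", 3, 5, "13:30", 11),
    ("2012-08-27", 3, 5, "13:00", 9),
]

mem = sqlite3.connect(":memory:")
mem.execute(
    "CREATE TABLE slice_counts ("
    "date TEXT, group_id INTEGER, timeslot_id INTEGER, "
    "slice_start TEXT, spots INTEGER)"
)

# One parameterized statement, executed once per row.
mem.executemany("INSERT INTO slice_counts VALUES (?, ?, ?, ?, ?)", fetched)

# Second, much simpler step: the busiest slice per date, group and timeslot.
max_spots = mem.execute(
    "SELECT date, group_id, timeslot_id, MAX(spots) "
    "FROM slice_counts "
    "GROUP BY date, group_id, timeslot_id "
    "ORDER BY date"
).fetchall()
```

Splitting the work this way keeps each query small enough to reason about, at the cost of moving rows between the two databases.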
The tricky bit is taking apart your single complex query. How you would do that depends on the exact question you want to answer. The art is to limit the amount of data you have to work with. Don't copy the whole database, but make an informed selection.
A big advantage of taking multiple smaller steps is that your code may become much more readable, and understandable. I wouldn't want to be the guy who has to change your SQL query ten years from now because you went on to other things.
I have found a solution which is acceptable for my particular use case.
I have created an intermediate or 'cache' table with the following structure:
CREATE TABLE `occupancy_caches` (
`occupancy_id` int(11) DEFAULT NULL,
`kid_id` int(11) DEFAULT NULL,
`group_id` int(11) DEFAULT NULL,
`client_id` int(11) DEFAULT NULL,
`date` date DEFAULT NULL,
`timeslot_id` int(11) DEFAULT NULL,
`start` int(11) DEFAULT NULL,
`end` int(11) DEFAULT NULL,
KEY `index_occupancy_caches_on_date_and_client_id` (`date`,`client_id`),
KEY `index_occupancy_caches_on_date_and_group_id` (`date`,`group_id`),
KEY `index_occupancy_caches_on_occupancy_id` (`occupancy_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
This allowed me to completely eliminate the group_assignment_caches table and no longer did I have to search for dates using calculated columns (MOD(DATEDIFF...)). Also, I only needed a single join on the time slices instead of 2.
The downside, however, is that I now have to create an occupancy_caches record for every week covered by the original occupancies record. In most cases these occupancies describe a 4 year period. This means that for every occupancies record I now have to create 400 (!) records... Since the number of records will only grow linearly, correct use of indexes should keep this from spinning out of control as the system grows.
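The expansion into one cache row per week could be sketched in Python as follows. The helper weekly_dates is a made-up name, and the real cache rows would also carry kid_id, group_id and the other columns shown above.

```python
from datetime import date, timedelta

def weekly_dates(start, end):
    """Return the start date and every 7th day after it, up to and including end."""
    current = start
    result = []
    while current <= end:
        result.append(current)
        current += timedelta(days=7)
    return result

# One occupancy running from 2012-08-20 to 2012-09-10 becomes four cache rows.
cache_dates = weekly_dates(date(2012, 8, 20), date(2012, 9, 10))
for d in cache_dates:
    print(d.isoformat())
```

With the dates materialized like this, a lookup becomes a plain equality or range test on the indexed date column instead of the MOD(DATEDIFF(...), 7) = 0 calculation.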
Time will tell, though...
Each company will receive an invoice on the 1st and 16th of every month. (A cron job will run every two weeks, scan the orders table and add rows to the 'invoice' table. Is there an alternative?)
The orders table lists customers' orders and indicates which company each order belongs to (orders.company_id).
The invoice table holds the total cost of the orders from the orders table.
I am trying to figure out how to design reasonable invoice tracking. Sometimes the company will have to send me the fees and sometimes I send them the fees (invoice.amount).
I need to track the invoices with the following:
when the company sent me the amount
when I sent the amount to the company
how much has been received from the company
how much I have sent to the company
whether I received the full amount (if not, what do I need to update in the DB?)
invoice status (Invoice Sent, Cancelled, Amount Received, Amount Sent)
Here is the database design I have come up with:
company table
mysql> select * from company;
+----+-----------+
| id | name |
+----+-----------+
| 1 | Company A |
| 2 | Company B |
+----+-----------+
Customers can select a company from my website.
orders table
mysql> select * from orders;
+----+---------+------------+------------+---------------------+-----------+
| id | user_id | company_id | total_cost | order_date | status_id |
+----+---------+------------+------------+---------------------+-----------+
| 1 | 5 | 2 | 25.00 | 2012-02-03 23:30:24 | 1 |
| 2 | 7 | 2 | 30.00 | 2012-02-13 18:06:12 | 1 |
+----+---------+------------+------------+---------------------+-----------+
Two customers have ordered products from Company B (orders.company_id = 2). I know the orders table needs more fields; I have simplified it here.
orders_products table
mysql> select * from orders_products;
+----+----------+------------+--------------+-------+
| id | order_id | product_id | product_name | cost |
+----+----------+------------+--------------+-------+
| 1 | 1 | 34 | Chair | 10.00 |
| 2 | 1 | 25 | TV | 10.00 |
| 3 | 1 | 27 | Desk | 2.50 |
| 4 | 1 | 36 | Laptop | 2.50 |
| 5 | 2 | 75 | PHP Book | 25.00 |
| 6 | 2 | 74 | MySQL Book | 5.00 |
+----+----------+------------+--------------+-------+
List of products that customers have ordered.
invoice table
mysql> select * from invoice;
+----+------------+------------+---------------------+--------+-----------+
| id | company_id | invoice_no | invoice_date | amount | status_id |
+----+------------+------------+---------------------+--------+-----------+
| 7 | 2 | 123 | 2012-02-16 23:59:59 | 55.00 | 1 |
+----+------------+------------+---------------------+--------+-----------+
This is where I am stuck on the invoice table design. I am not sure how it should be done. Invoices will be generated every two weeks. In the example, invoice.amount is 55.00 because it was calculated from the orders where orders.company_id = 2.
If invoice.amount is negative (e.g. -50.00), it means the company needs to send me the fees.
If invoice.amount is positive (e.g. 50.00), it means I need to send the company the fees.
The status_id could be: (1) Invoice Sent, (2) Cancelled, (3) Completed.
Do I need to add an invoice_id field to the orders table, and update orders.invoice_id when a row is inserted into the 'invoice' table?
invoice_payment table
mysql> select * from invoice_payment;
+----+------------+-----------------+-------------+---------------------+---------------------+
| id | invoice_id | amount_received | amount_sent | date_received | date_sent |
+----+------------+-----------------+-------------+---------------------+---------------------+
| 1 | 1 | 0.00 | 55.00 | 0000-00-00 00:00:00 | 2012-02-18 22:20:53 |
+----+------------+-----------------+-------------+---------------------+---------------------+
This is where I can track and update transactions. Payments will be made via BACS.
Is this a good table design, or what do I need to improve? What fields and tables should I add?
If the invoice has been generated and I later need to make changes to the orders_products or orders tables, should invoice.amount be recalculated? (I will be using PHP / MySQL.)
SQL Dump:
CREATE TABLE IF NOT EXISTS `company` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(25) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=3 ;
INSERT INTO `company` (`id`, `name`) VALUES
(1, 'Company A'),
(2, 'Company B');
CREATE TABLE IF NOT EXISTS `invoice` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`company_id` int(11) NOT NULL,
`invoice_no` int(11) NOT NULL,
`invoice_date` datetime NOT NULL,
`amount` decimal(6,2) NOT NULL,
`status_id` tinyint(1) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=8 ;
INSERT INTO `invoice` (`id`, `company_id`, `invoice_no`, `invoice_date`, `amount`, `status_id`) VALUES
(7, 2, 123, '2012-02-16 23:59:59', '55.00', 1);
CREATE TABLE IF NOT EXISTS `invoice_payment` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`invoice_id` int(11) NOT NULL,
`amount_received` decimal(6,2) NOT NULL,
`amount_sent` decimal(6,2) NOT NULL,
`date_received` datetime NOT NULL,
`date_sent` datetime NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=2 ;
INSERT INTO `invoice_payment` (`id`, `invoice_id`, `amount_received`, `amount_sent`, `date_received`, `date_sent`) VALUES
(1, 1, '0.00', '55.00', '0000-00-00 00:00:00', '2012-02-18 22:20:53');
CREATE TABLE IF NOT EXISTS `orders` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`company_id` int(11) NOT NULL,
`total_cost` decimal(6,2) NOT NULL,
`order_date` datetime NOT NULL,
`status_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=3 ;
INSERT INTO `orders` (`id`, `user_id`, `company_id`, `total_cost`, `order_date`, `status_id`) VALUES
(1, 5, 2, '25.00', '2012-02-03 23:30:24', 1),
(2, 7, 2, '30.00', '2012-02-13 18:06:12', 1);
CREATE TABLE IF NOT EXISTS `orders_products` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`order_id` int(11) NOT NULL,
`product_id` int(11) NOT NULL,
`product_name` varchar(100) NOT NULL,
`cost` decimal(6,2) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=7 ;
INSERT INTO `orders_products` (`id`, `order_id`, `product_id`, `product_name`, `cost`) VALUES
(1, 1, 34, 'Chair', '10.00'),
(2, 1, 25, 'TV', '10.00'),
(3, 1, 27, 'Desk', '2.50'),
(4, 1, 36, 'Laptop', '2.50'),
(5, 2, 75, 'PHP Book', '25.00'),
(6, 2, 74, 'MySQL Book', '5.00');
Feel free to update or add tables in your answer.
Thanks
Have a look at my add-on for Gemini - SimplyFi. It will allow you to brand your invoices accordingly, can auto email them to customers when they are generated, can log payments and send reminders for payments not received (statements), and has a full REST based API you can use to integrate into your system. You may also be able to benefit from the recurring billing it features.
Where you mention negative invoice amounts, those are effectively "Credit Notes" (from what I've understood from your post). Generally, you should not change invoices themselves after they have been issued to a client. If you need to amend an amount (i.e. add to it or subtract from it), you should issue a new invoice for the added amount, or a credit note for the subtracted amount.
Also, I would suggest you don't send the customer money back if they are going to receive a new invoice in a few weeks' time. Simply keep track of their account balance, and only issue invoices or credit notes when necessary. Moving money around costs money, and you don't need to do it if it's not necessary. Just my 2 cents.
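One way to picture the running account balance is a single ledger table where invoices, credit notes and payments are signed entries. The sketch below uses SQLite via Python's sqlite3 module; the account_entries table, its columns and the sign convention (positive means the company owes more, negative means it owes less) are all assumptions for illustration, and amounts are kept in integer pence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account_entries (
    company_id   INTEGER,
    entry_type   TEXT,     -- 'invoice', 'credit_note' or 'payment_received'
    amount_pence INTEGER,  -- positive: company owes more; negative: owes less
    entry_date   TEXT
);
INSERT INTO account_entries VALUES
    (2, 'invoice',          5500, '2012-02-16'),
    (2, 'credit_note',      -500, '2012-02-20'),
    (2, 'payment_received', -3000, '2012-02-25');
""")

# Issued documents are never edited; a correction is just another entry.
balance_pence = conn.execute(
    "SELECT SUM(amount_pence) FROM account_entries WHERE company_id = 2"
).fetchone()[0]

print("Company 2 balance: %.2f" % (balance_pence / 100))
```

Because nothing is updated in place, the ledger also answers the question's tracking needs (when and how much was received or sent) by filtering on entry_type and entry_date.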