MySQL query not acting as expected - mysql

I am trying to get all bills that have NOT been FULLY paid.
I have three tables that are needed for this.
Table 1 - billInvoiceMain
biId - unique ID
userId - users ID
type - bill or invoice
userItemId - unique ID that user chooses for their records
Table 2 - billInvoiceDetail
biId - references unique ID in billInvoiceMain
quantity
price
Table 3 - transaction
transactionId - unique ID
userId - users ID
biId - references id in billInvoiceMain
paymentAmount
So a user enters bills, and then once they make a payment (multiple smaller payments could be made on a bill until it reaches the full amount or they could make a single payment for the whole amount) they enter it and it gets saved in the transaction table.
Here is a SQL Fiddle that has abbreviated versions of test data.
CREATE TABLE IF NOT EXISTS `billInvoiceDetail` (
`biId` int(15) NOT NULL,
`productId` int(15) DEFAULT NULL,
`accountId` int(15) DEFAULT NULL,
`description` varchar(2000) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
`quantity` decimal(20,3) NOT NULL,
`price` decimal(20,2) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `billInvoiceDetail` (`biId`, `productId`, `accountId`, `description`, `quantity`, `price`) VALUES
(51, NULL, 7, 'Pylaisiella steerei Ando & Higuchi', 4.000, 19.65),
(51, NULL, 11, 'Rubus insons L.H. Bailey', 1.000, 10.17),
(99, NULL, 11, 'Leontodon hispidus L.', 3.000, 11.99),
(99, NULL, 7, 'Peltophorum (T. Vogel) Benth.', 5.000, 33.76),
(100, NULL, 8, 'Scleria P.J. Bergius', 1.000, 10.55),
(100, NULL, 12, 'Gilia ochroleuca M.E. Jones ssp. exilis (A. Gray) A.D. Grant & V.E. Grant', 2.000, 42.54);
CREATE TABLE IF NOT EXISTS `billInvoiceMain` (
`biId` int(15) NOT NULL,
`userId` int(15) NOT NULL,
`type` varchar(7) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`cvId` int(15) NOT NULL,
`startDate` date DEFAULT NULL,
`dueDate` date DEFAULT NULL,
`userItemId` varchar(25) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `billInvoiceMain` (`biId`, `userId`, `type`, `cvId`, `startDate`, `dueDate`, `userItemId`) VALUES
(51, 1, 'bill', 17, '2021-01-01', '2021-01-31', '53396841'),
(99, 1, 'bill', 28, '2021-01-01', '2021-01-31', '16269083'),
(100, 1, 'bill', 28, '2021-01-07', '2021-01-17', '03200283');
CREATE TABLE IF NOT EXISTS `transaction` (
`transactionId` int(15) NOT NULL,
`userId` int(15) NOT NULL,
`biId` int(15) NOT NULL,
`paymentDate` date NOT NULL,
`paymentMethod` varchar(20) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
`accountId` int(15) NOT NULL,
`paymentAmount` decimal(20,2) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `transaction` (`transactionId`, `userId`, `biId`, `paymentDate`, `paymentMethod`, `accountId`, `paymentAmount`) VALUES
(51, 1, 51, '2021-01-04', 'Check', 1, 78.60),
(52, 1, 51, '2021-01-19', 'Credit Card', 3, 10.17),
(53, 1, 99, '2021-01-14', 'Check', 1, 10.00);
SELECT billInvoiceMain.biId, SUM(transaction.paymentAmount), billInvoiceMain.useritemid
FROM billInvoiceMain
INNER JOIN transaction ON billInvoiceMain.biId = transaction.biId
WHERE billInvoiceMain.userId = 1 AND billInvoiceMain.type = 'bill'
GROUP BY billInvoiceMain.biId;
SELECT ROUND(ABS(SUM(billInvoiceDetail.price *billInvoiceDetail.quantity)),2)
FROM billInvoiceDetail
INNER JOIN billInvoiceMain ON billInvoiceDetail.biId = billInvoiceMain.biId
WHERE billInvoiceMain.userId=1 AND billInvoiceMain.type = 'bill'
GROUP BY billInvoiceMain.biId;
SELECT billInvoiceMain.biId, billInvoiceMain.useritemid
FROM billInvoiceMain
INNER JOIN transaction ON billInvoiceMain.biId = transaction.biId
INNER JOIN billInvoiceDetail ON billInvoiceDetail.biId = transaction.biId
WHERE billInvoiceMain.userId = 1 AND billInvoiceMain.type = 'bill'
HAVING SUM(transaction.paymentAmount) != ROUND(ABS(SUM(billInvoiceDetail.price *billInvoiceDetail.quantity)),2);
The first query sums all the payments in transaction, grouped by bill ID.
The second query sums the line items of each bill.
In the third query I tried combining the two. However, when I add a GROUP BY, it gives an error. So I removed it, and now the query returns only the first bill, even though that bill has been paid.
Desired Results (retrieves the biId and userItemId of all bills that have not been fully paid based on the transaction table):
biId userItemId
99 16269083
100 03200283
I have spent a lot of time trying to figure this out but am lost.

The following query returns every bill whose biId is not in the set of fully paid bills. That set is built by joining the per-bill payment totals from transaction with the per-bill totals from billInvoiceDetail and keeping only the bills where the two totals are equal.
SELECT biId, useritemid FROM billInvoiceMain
WHERE userId = 1 AND type = 'bill'
AND biId NOT IN(
SELECT t.biId FROM
(SELECT biId,SUM(paymentAmount) pay FROM transaction GROUP BY biId) t
INNER JOIN
(SELECT biId,ROUND(ABS(SUM(price*quantity)),2) bill FROM billInvoiceDetail GROUP BY biId) d
ON t.biId=d.biId AND t.pay=d.bill
)
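As a side note on why the third query in the question misbehaves: joining transaction and billInvoiceDetail directly pairs every payment row with every detail row of the same bill, so both sums are inflated, and without a GROUP BY the HAVING clause is applied to one aggregate over all rows. Below is a minimal sketch of an alternative that aggregates each table first and then joins the totals. It runs the SQL through Python's sqlite3 module purely so it can be executed anywhere; SQLite standing in for MySQL, the trimmed-down column lists, and the use of `<` (so an overpaid bill counts as paid) are my assumptions, not part of the original answer.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE billInvoiceMain (biId INTEGER, userId INTEGER, type TEXT, userItemId TEXT);
CREATE TABLE billInvoiceDetail (biId INTEGER, quantity REAL, price REAL);
-- "transaction" is quoted because it is a keyword in SQLite.
CREATE TABLE "transaction" (transactionId INTEGER, userId INTEGER, biId INTEGER, paymentAmount REAL);
INSERT INTO billInvoiceMain VALUES
  (51, 1, 'bill', '53396841'), (99, 1, 'bill', '16269083'), (100, 1, 'bill', '03200283');
INSERT INTO billInvoiceDetail VALUES
  (51, 4, 19.65), (51, 1, 10.17), (99, 3, 11.99), (99, 5, 33.76), (100, 1, 10.55), (100, 2, 42.54);
INSERT INTO "transaction" VALUES
  (51, 1, 51, 78.60), (52, 1, 51, 10.17), (53, 1, 99, 10.00);
""")

# Aggregate each child table on its own, then join the per-bill totals.
# The LEFT JOIN keeps bills that have no payments at all (pay is NULL -> 0).
UNPAID_SQL = """
SELECT m.biId, m.userItemId
FROM billInvoiceMain AS m
JOIN (SELECT biId, ROUND(SUM(price * quantity), 2) AS bill
      FROM billInvoiceDetail GROUP BY biId) AS d ON d.biId = m.biId
LEFT JOIN (SELECT biId, ROUND(SUM(paymentAmount), 2) AS pay
           FROM "transaction" GROUP BY biId) AS t ON t.biId = m.biId
WHERE m.userId = 1 AND m.type = 'bill'
  AND COALESCE(t.pay, 0) < d.bill
ORDER BY m.biId
"""
unpaid = conn.execute(UNPAID_SQL).fetchall()
print(unpaid)
```

Both totals are rounded to two decimals before the comparison because SQLite stores these amounts as floating point; with MySQL's DECIMAL columns the rounding should matter less.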

Related

Simple SQL query but wrong results returned

I have a simple click tracking system that consists of three tables "tracking" (which holds unique views), "views" (which holds raw views) and "products" (which holds products).
Here's how it works: each time a user clicks on a tracking link, if the hash present in the link does not exist in the database, it is saved in the "tracking" table as a unique view and also in the "views" table as a raw view. If the hash does exist in the database, it is saved only in the "views" table. So the number of "raw views" cannot be smaller than the number of "unique views", because each "unique view" also counts as a "raw view".
I wrote a query to create reports based on products, but the number of "raw views" returned is not correct.
I've also created a fiddle, which I hope gives a better overview of my problem.
Here's the table structure:
CREATE TABLE `products` (
`id` int(10) UNSIGNED NOT NULL,
`name` varchar(128) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `products` (`id`, `name`) VALUES
(1, 'Test product');
CREATE TABLE `tracking` (
`id` int(10) UNSIGNED NOT NULL,
`product_id` int(11) NOT NULL,
`hash` varchar(32) NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `tracking` (`id`, `product_id`, `hash`, `created`) VALUES
(1, 1, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:50:19'),
(2, 1, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:55:34');
CREATE TABLE `views` (
`id` int(10) UNSIGNED NOT NULL,
`hash` varchar(32) NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `views` (`id`, `hash`, `created`) VALUES
(1, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:30'),
(2, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:30'),
(3, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:35'),
(4, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:42'),
(5, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:56:31'),
(6, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:57:01');
And here's the query I wrote so far:
SELECT products.name AS `param`,
SUM(IF(tracking.product_id<>24, 1, 0)) AS `uniques`,
IF(SUM(IF(tracking.product_id<>24, 1, 0))=0, 0,
(SELECT COUNT(`hash`)
FROM `views` WHERE tracking.hash = views.hash)) AS `views`
FROM tracking
LEFT JOIN products ON products.id = tracking.product_id
WHERE tracking.created BETWEEN '2019-01-01 00:00:00' AND '2020-02-10 00:00:00'
GROUP BY products.name
As you can see I have 2 unique views and 6 raw views (4 for one hash and 2 for the other hash).
I expected the query to return 2 uniques and 6 raw views for this product, but instead I'm getting 2 uniques and 4 raw views, as if it were counting the views only for the first hash.
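The following query should return the expected counts. It joins views to tracking on hash, so every raw view becomes one row before the aggregation: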
SELECT
products.name,
COUNT(DISTINCT `tracking`.`hash`) AS `uniques`, -- count unique hashes
COUNT(*) AS `views` -- count total
FROM `tracking`
JOIN `views` ON `views`.hash = tracking.hash
LEFT JOIN products ON products.id = tracking.product_id
WHERE tracking.created BETWEEN '2019-01-01 00:00:00' AND '2020-02-10 00:00:00'
GROUP BY products.name;
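The original query returns 4 because its correlated subquery refers to tracking.hash, which is neither grouped nor aggregated, so only one hash per product group is used for the count. Here is a small sketch that checks the join-based approach against the sample data. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), and it swaps the inner join for a LEFT JOIN with COUNT(v.id), my own variation, so that a tracked hash with no rows in views would still be counted as a unique.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER, name TEXT);
CREATE TABLE tracking (id INTEGER, product_id INTEGER, hash TEXT, created TEXT);
CREATE TABLE views (id INTEGER, hash TEXT, created TEXT);
INSERT INTO products VALUES (1, 'Test product');
INSERT INTO tracking VALUES
  (1, 1, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:50:19'),
  (2, 1, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:55:34');
INSERT INTO views VALUES
  (1, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:30'),
  (2, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:30'),
  (3, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:35'),
  (4, '7ddf32e17a6ac5ce04a8ecbf782ca509', '2020-02-09 18:46:42'),
  (5, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:56:31'),
  (6, '00bb28eaf259ba0c932d67f649d90783', '2020-02-09 18:57:01');
""")

# One row per (tracking row, raw view); distinct hashes give the uniques,
# non-NULL view ids give the raw views.
REPORT_SQL = """
SELECT p.name,
       COUNT(DISTINCT t.hash) AS uniques,
       COUNT(v.id) AS views
FROM tracking AS t
LEFT JOIN views AS v ON v.hash = t.hash
LEFT JOIN products AS p ON p.id = t.product_id
WHERE t.created BETWEEN '2019-01-01 00:00:00' AND '2020-02-10 00:00:00'
GROUP BY p.name
"""
report = conn.execute(REPORT_SQL).fetchall()
print(report)
```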

select value from another table - mysql

I have 2 tables "srot_data" and "vada"
CREATE TABLE `vada`
(
`id` INT(10) NOT NULL,
`cislo` INT(10) DEFAULT NULL,
`popis` VARCHAR(50) DEFAULT NULL
)
engine=innodb
DEFAULT charset=utf8;
CREATE TABLE `srot_data`
(
`id` INT(10) NOT NULL,
`vada` INT(10) DEFAULT NULL
);
INSERT INTO `srot_data` (`ID`, `Datum`, `ID_obsluha`, `Linka`, `Kontejner`, `Vada`, `m_srot`, `m_pres`, `blok`) VALUES (1, '2018-04-16 11:23:44', 21, 'EXMET2', 'ELDY-', 18, '27.500', '12.500', 1),(2, '2018-04-16 12:18:06', 21, 'EXMET2', 'ELDY-', 5, '1.000', '0.000', NULL);
INSERT INTO `vada` (`ID`, `Cislo`, `Popis`) VALUES(1, 1, 'Najíždění výroby(resp. nové elektrody)'),(2, 2, 'Expander - poškozená mřížka'),(3, 3, 'Olověný pás - koroze '),(4, 4, 'Olověný pás - potrhaná mřížka'),(5, 5, 'Pastovačka - nedopastované elektrody');
Vada from srot_data = ID from vada. And I need to get Popis from table vada.
My sql is:
$sql = "SELECT count(blok) AS Total ,
Vada AS vada
FROM srot_data
LEFT JOIN vada ON vada.Popis = srot_data.Vada
WHERE Linka = 'EXMET1'
GROUP BY vada
ORDER BY Total DESC limit 1";
but it doesn't work. I'm getting a number (Vada from table srot_data) instead of the text value of Popis (from table vada):
{"Total":"37", "vada":"5"}
And I need:
{"Total":"37", "vada":"Pastovačka - nedopastované elektrody"}
Linka and blok are not important for this problem, so I left them out of the table definition here.
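One likely cause, going only by the description above: the join compares vada.Popis (text) with srot_data.Vada (a number), while the stated relationship is srot_data.Vada = vada.ID, and the SELECT list returns the numeric column rather than Popis. A minimal sketch of the corrected join follows. It runs through Python's sqlite3 module as a stand-in for MySQL, and the rows in srot_data are hypothetical (the real data behind the total of 37 is not shown in the question), so only the shape of the result is meaningful.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vada (ID INTEGER, Cislo INTEGER, Popis TEXT);
CREATE TABLE srot_data (ID INTEGER, Linka TEXT, Vada INTEGER, blok INTEGER);
INSERT INTO vada VALUES
  (2, 2, 'Expander - poškozená mřížka'),
  (5, 5, 'Pastovačka - nedopastované elektrody');
-- Hypothetical rows, not the asker's data.
INSERT INTO srot_data VALUES
  (1, 'EXMET1', 5, 1), (2, 'EXMET1', 5, 1), (3, 'EXMET1', 2, 1), (4, 'EXMET2', 2, 1);
""")

# Join on the ID relationship and return the description instead of the number.
TOP_DEFECT_SQL = """
SELECT COUNT(s.blok) AS Total, v.Popis AS vada
FROM srot_data AS s
LEFT JOIN vada AS v ON v.ID = s.Vada
WHERE s.Linka = 'EXMET1'
GROUP BY s.Vada, v.Popis
ORDER BY Total DESC
LIMIT 1
"""
top = conn.execute(TOP_DEFECT_SQL).fetchone()
print(top)
```

Qualifying every column with a table alias also avoids the ambiguity between the table vada, the column Vada, and the output alias vada.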

mysql GROUP BY idea

I have the following scenario: there is one table with books and two pairs of header/item (HD/IT) tables holding Sales Order and Purchase Order transactions, which are connected through the Sales Order ID.
The table structure follows:
CREATE TABLE `books` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`isbn` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`it_id` int(11) NOT NULL,
`kind` tinyint(4) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `books` (`id`, `isbn`, `it_id`, `kind`) VALUES
(1, '12345', 1, 1),
(2, '12345', 1, 2),
(3, '67890', 2, 1),
(4, '1111111', 2, 2);
CREATE TABLE `porders_hd` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`dt` date NOT NULL,
`so_id` int(11) DEFAULT NULL,
`customer` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `porders_hd` (`id`, `dt`, `so_id`, `customer`) VALUES
(1, '2017-07-02', 1, 1),
(2, '2017-08-03', NULL, 3);
CREATE TABLE `porders_it` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`hd_id` int(11) NOT NULL,
`isbn` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`dscr` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`qty` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `porders_it` (`id`, `hd_id`, `isbn`, `dscr`, `qty`) VALUES
(1, 1, '12345', 'Book 1', 1),
(2, 2, '1111111', 'Book 2', 1);
CREATE TABLE `sorders_hd` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`dt` date NOT NULL,
`customer` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `sorders_hd` (`id`, `dt`, `customer`) VALUES
(1, '2017-07-01', 1),
(2, '2017-08-01', 2);
CREATE TABLE `sorders_it` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`hd_id` int(11) NOT NULL,
`isbn` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`dscr` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`qty` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `sorders_it` (`id`, `hd_id`, `isbn`, `dscr`, `qty`) VALUES
(1, 1, '12345', 'Book 1', 1),
(2, 2, '67890', 'Book 2', 1);
In summary there are:
* 1 Sales Order (#1) also existing in the Purchase Order (#1)
* 1 Sales Order (#2) still pending
* 1 Purchase Order (#2) created without a Sales Order
I want to fetch all Sales and Purchase Orders per book ISBN, with a connected SO and PO on the same row. The output should look like this:
so_id so_date po_id po_date isbn dscr
NULL NULL 2 2017-08-03 1111111 Book 2
1 2017-07-01 1 2017-07-02 12345 Book 1
2 2017-08-01 NULL NULL 67890 Book 3
I tried to grab the rows using a query like the one below:
SELECT
GROUP_CONCAT(so_id) so_id,
GROUP_CONCAT(so_date) so_date,
GROUP_CONCAT(po_id) po_id,
GROUP_CONCAT(po_date) po_date,
isbn,
dscr
FROM (
SELECT
hd.so_id so_id,
NULL so_date,
hd.id po_id,
hd.dt po_date,
bk.isbn,
it.dscr
FROM porders_hd hd,
porders_it it,
books bk
WHERE it.hd_id = hd.id
AND bk.isbn = it.isbn
AND kind = 2
UNION
SELECT
hd.id so_id,
hd.dt so_date,
NULL po_id,
NULL po_date,
bk.isbn,
it.dscr
FROM sorders_hd hd,
sorders_it it,
books bk
WHERE it.hd_id = hd.id
AND bk.isbn = it.isbn
AND kind = 1
) as table1
GROUP BY isbn, so_id, po_id
but since there is info missing I get the following result:
so_id so_date po_id po_date isbn dscr
NULL NULL 2 2017-08-03 1111111 Book 2
1 2017-07-01 NULL NULL 12345 Book 1
1 NULL 1 2017-07-02 12345 Book 1
2 2017-08-01 NULL NULL 67890 Book 3
Any ideas how I can achieve this?
I think this is what you're after, although I can't figure out the role of kind from your code. Here is a query that, for each book, gets the associated PO line item, finds the corresponding SO line item, and joins the header rows so the dates are available. Note my assumption that a sales order can't exist without a corresponding PO.
SELECT books.isbn, porders_it.dscr, sorders_hd.id, sorders_hd.dt, porders_hd.id, porders_hd.dt
FROM books
join porders_it on porders_it.isbn = books.isbn
join porders_hd on porders_hd.id = porders_it.hd_id
left outer join sorders_it on sorders_it.hd_id=porders_hd.so_id and sorders_it.isbn = porders_it.isbn
left outer join sorders_hd on sorders_hd.id = sorders_it.hd_id
You could normalize your tables so that dscr need not be repeated, and also use books.id in the other tables rather than isbn.
I'm adding a new answer because the previous one and the comments are illustrative. Based on that discussion, this requires a FULL OUTER JOIN, which must be emulated with UNION ALL in MySQL (which may be what the OP was attempting originally).
Here is my new code, taking that into account:
SELECT sorders_hd.id as so_id, sorders_hd.dt as so_dt,
porders_hd.id as po_id, porders_hd.dt as po_dt,
books.isbn, porders_it.dscr
from books
left outer join porders_it on porders_it.isbn=books.isbn
join porders_hd on porders_hd.id=porders_it.hd_id
left outer join sorders_it on sorders_it.isbn=books.isbn and sorders_it.hd_id=porders_hd.so_id
left outer join sorders_hd on sorders_hd.id=sorders_it.hd_id
where books.kind=2
UNION ALL
SELECT sorders_hd.id as so_id, sorders_hd.dt as so_dt,
porders_hd.id as po_id, porders_hd.dt as po_dt,
books.isbn, sorders_it.dscr
from books
left outer join sorders_it on sorders_it.isbn=books.isbn
join sorders_hd on sorders_hd.id=sorders_it.hd_id
left outer join porders_it on porders_it.isbn=books.isbn
left outer join porders_hd on porders_hd.id=porders_it.hd_id and porders_hd.so_id=sorders_hd.id
where porders_hd.id is null and books.kind=1;
The output result is:
so_id so_dt po_id po_dt isbn dscr
1 2017-07-01 1 2017-07-02 12345 Book 1
(null) (null) 2 2017-08-03 1111111 Book 2
2 2017-08-01 (null) (null) 67890 Book 2
The "trick" is to use union all with one of the two queries excluding records that linked both sides (to get the 'right' side of the FULL OUTER JOIN)
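To isolate the pattern from the rest of the schema, here is a stripped-down sketch of the same emulation using only two header tables. The table names so and po and their columns are simplified, hypothetical versions of sorders_hd and porders_hd (no items, no books), and the SQL is run through Python's sqlite3 module as a stand-in for MySQL. The first SELECT keeps every sales order with its purchase order if there is one; the second adds only the purchase orders that matched no sales order.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical versions of sorders_hd and porders_hd.
CREATE TABLE so (id INTEGER, dt TEXT);
CREATE TABLE po (id INTEGER, dt TEXT, so_id INTEGER);
INSERT INTO so VALUES (1, '2017-07-01'), (2, '2017-08-01');
INSERT INTO po VALUES (1, '2017-07-02', 1), (2, '2017-08-03', NULL);
""")

# Left side plus matches, then the unmatched right side (anti-join).
FULL_OUTER_SQL = """
SELECT so.id AS so_id, so.dt AS so_dt, po.id AS po_id, po.dt AS po_dt
FROM so
LEFT JOIN po ON po.so_id = so.id
UNION ALL
SELECT NULL, NULL, po.id, po.dt
FROM po
LEFT JOIN so ON so.id = po.so_id
WHERE so.id IS NULL
"""
rows = conn.execute(FULL_OUTER_SQL).fetchall()
for row in rows:
    print(row)
```

Because the second branch filters on so.id IS NULL, a matched pair is emitted once, which is why UNION ALL is safe here and no de-duplicating UNION is needed.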
+1 to OP for providing the DDL and sample data!
I agree that the data model could be reworked, and could be normalized. The existing model still has at least the problem of a duplicate book record when a sales order and purchase order match (one of them is ignored). It seems to me that one improvement would be to have a master book list and include the id (or isbn if that is the primary key) from that table in porders_it and sorders_it, and eliminate the current books table.

specify conditions from outer query on a materialized subquery

I have the query below, which references two views, 'goldedRunQueries' and 'currentGoldMarkings'. My issue seems to come from the view referenced in the subquery, currentGoldMarkings. During execution, MySQL first materializes this subquery and only then applies the WHERE conditions on 'queryCode' and 'runId', which results in an execution time of more than an hour because the view refers to tables that hold millions of rows. My question is: how do I enforce those two WHERE conditions on the subquery before it is materialized?
SELECT goldedRunQueries.queryCode, goldedRunQueries.runId
FROM goldedRunQueries
LEFT OUTER JOIN
( SELECT measuredRunId, queryCode, COUNT(resultId) as c
FROM currentGoldMarkings
GROUP BY measuredRunId, queryCode
) AS accuracy ON accuracy.measuredRunId = goldedRunQueries.runId
AND accuracy.queryCode = goldedRunQueries.queryCode
WHERE goldedRunQueries.queryCode IN ('CH001', 'CH002', 'CH003')
and goldedRunQueries.runid = 5000
ORDER BY goldedRunQueries.runId DESC, goldedRunQueries.queryCode;
Here are the two views. Both of these also get used in a standalone mode and so integrating any clauses into them is not possible.
CREATE VIEW currentGoldMarkings
AS
SELECT result.resultId, result.runId AS measuredRunId, result.documentId,
result.queryCode, result.queryValue AS measuredValue,
gold.queryValue AS goldValue,
CASE result.queryValue WHEN gold.queryValue THEN 1 ELSE 0 END AS correct
FROM results AS result
INNER JOIN gold ON gold.documentId = result.documentId
AND gold.queryCode = result.queryCode
WHERE gold.isCurrent = 1;
CREATE VIEW goldedRunQueries
AS
SELECT runId, queryCode
FROM runQueries
WHERE EXISTS
( SELECT 1 AS Expr1
FROM runs
WHERE (runId = runQueries.runId)
AND (isManual = 0)
)
AND EXISTS
( SELECT 1 AS Expr1
FROM results
WHERE (runId = runQueries.runId)
AND (queryCode = runQueries.queryCode)
AND EXISTS
( SELECT 1 AS Expr1
FROM gold
WHERE (documentId = results.documentId)
AND (queryCode = results.queryCode)
)
);
Note: The above query reflects only a part of my actual query. There are 3 other left outer joins, similar in nature to the above subquery, which make the problem far worse.
EDIT: As suggested, here is the structure and some sample data for the tables
CREATE TABLE `results`(
`resultId` int auto_increment NOT NULL,
`runId` int NOT NULL,
`documentId` int NOT NULL,
`queryCode` char(5) NOT NULL,
`queryValue` char(1) NOT NULL,
`comment` varchar(255) NULL,
CONSTRAINT `PK_results` PRIMARY KEY
(
`resultId`
)
);
insert into results (runId, documentId, queryCode, queryValue, comment) values
(100, 242300, 'AC001', 'I', NULL),
(100, 242300, 'AC001', 'S', NULL),
(150, 242301, 'AC005', 'I', 'abc'),
(100, 242300, 'AC001', 'I', NULL),
(109, 242301, 'PQ001', 'S', 'zzz'),
(400, 242400, 'DD006', 'I', NULL);
CREATE TABLE `gold`(
`goldId` int auto_increment NOT NULL,
`runDate` datetime NOT NULL,
`documentId` int NOT NULL,
`queryCode` char(5) NOT NULL,
`queryValue` char(1) NOT NULL,
`comment` varchar(255) NULL,
`isCurrent` tinyint(1) NOT NULL DEFAULT 0,
CONSTRAINT `PK_gold` PRIMARY KEY
(
`goldId`
)
);
insert into gold (runDate, documentId, queryCode, queryValue, comment, isCurrent) values
('2015-02-20 00:00:00', 138904, 'CH001', 'N', NULL, 1),
('2015-05-20 00:00:00', 138904, 'CH001', 'N', 'aaa', 1),
('2016-02-20 00:00:00', 138905, 'CH002', 'N', NULL, 0),
('2015-12-12 00:00:00', 138804, 'CH001', 'N', 'zzzz', 1);
CREATE TABLE `runQueries`(
`runId` int NOT NULL,
`queryCode` char(5) NOT NULL,
CONSTRAINT `PK_runQueries` PRIMARY KEY
(
`runId`,
`queryCode`
)
);
insert into runQueries values (100, 'AC001');
insert into runQueries values (109, 'PQ001');
insert into runQueries values (400, 'DD006');
CREATE TABLE `runs`(
`runId` int auto_increment NOT NULL,
`runName` varchar(63) NOT NULL,
`isManual` tinyint(1) NOT NULL,
`runDate` datetime NOT NULL,
`comment` varchar(1023) NULL,
`folderName` varchar(63) NULL,
`documentSetId` int NOT NULL,
`pipelineVersion` varchar(50) NULL,
`isArchived` tinyint(1) NOT NULL DEFAULT 0,
`pipeline` varchar(50) NULL,
CONSTRAINT `PK_runs` PRIMARY KEY
(
`runId`
)
);
insert into runs (runName, isManual, runDate, comment, folderName, documentSetId, pipelineVersion, isArchived, pipeline) values
('test1', 0, '2015-08-04 06:30:46.000000', 'zzzz', '2015-08-04_103046', 2, '2015-08-03', 0, NULL),
('test2', 1, '2015-12-04 12:30:46.000000', 'zzzz', '2015-08-04_103046', 2, '2015-08-03', 0, NULL),
('test3', 1, '2015-06-24 10:56:46.000000', 'zzzz', '2015-08-04_103046', 2, '2015-08-03', 0, NULL),
('test4', 1, '2016-05-04 11:30:46.000000', 'zzzz', '2015-08-04_103046', 2, '2015-08-03', 0, NULL);
First, let's try to improve the performance via indexes:
results: INDEX(runId, queryCode) -- in either order
gold: INDEX(documentId, queryCode, isCurrent) -- in that order
After that, update the CREATE TABLEs in the question and add the output of:
EXPLAIN EXTENDED SELECT ...;
SHOW WARNINGS;
What version are you running? You effectively have FROM ( SELECT ... ) JOIN ( SELECT ... ). Before 5.6, neither subquery had an index; with 5.6, an index is generated on the fly.
It is a shame that the query is built that way, since you already know which run to use: AND goldedRunQueries.runid = 5000.
Bottom Line: add the indexes; upgrade to 5.6 or 5.7; if that is not enough, then rethink the use of VIEWs.
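If rewriting the outer query is an option, one thing that may be worth trying is to repeat the known filters inside the derived table, so that the GROUP BY only sees rows for the run and query codes of interest. The views themselves stay untouched. The sketch below only demonstrates that the two forms return the same rows; it cannot show MySQL's materialization behaviour, because it runs on Python's sqlite3 module as a stand-in. The data is made up, the tables are trimmed to the columns the query needs, and the plain runQueries table stands in for the goldedRunQueries view. Whether MySQL then filters before materializing should be confirmed with EXPLAIN on the real server.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE results (resultId INTEGER PRIMARY KEY, runId INTEGER, documentId INTEGER,
                      queryCode TEXT, queryValue TEXT);
CREATE TABLE gold (goldId INTEGER PRIMARY KEY, documentId INTEGER, queryCode TEXT,
                   queryValue TEXT, isCurrent INTEGER);
-- Plain table standing in for the goldedRunQueries view.
CREATE TABLE runQueries (runId INTEGER, queryCode TEXT);
-- Made-up rows, chosen only so that run 5000 has matches.
INSERT INTO results (runId, documentId, queryCode, queryValue) VALUES
  (5000, 1, 'CH001', 'N'), (5000, 2, 'CH001', 'Y'),
  (5000, 1, 'CH002', 'N'), (4000, 1, 'CH001', 'N');
INSERT INTO gold (documentId, queryCode, queryValue, isCurrent) VALUES
  (1, 'CH001', 'N', 1), (2, 'CH001', 'N', 1), (1, 'CH002', 'N', 1);
INSERT INTO runQueries VALUES
  (5000, 'CH001'), (5000, 'CH002'), (5000, 'CH003'), (4000, 'CH001');
CREATE VIEW currentGoldMarkings AS
SELECT r.resultId, r.runId AS measuredRunId, r.queryCode
FROM results AS r
JOIN gold AS g ON g.documentId = r.documentId AND g.queryCode = r.queryCode
WHERE g.isCurrent = 1;
""")

# Same query twice; {inner_where} is either empty or repeats the outer filters.
TEMPLATE = """
SELECT q.queryCode, q.runId, accuracy.c
FROM runQueries AS q
LEFT JOIN (SELECT measuredRunId, queryCode, COUNT(resultId) AS c
           FROM currentGoldMarkings
           {inner_where}
           GROUP BY measuredRunId, queryCode) AS accuracy
       ON accuracy.measuredRunId = q.runId AND accuracy.queryCode = q.queryCode
WHERE q.queryCode IN ('CH001', 'CH002', 'CH003') AND q.runId = 5000
ORDER BY q.queryCode
"""
outer_only = conn.execute(TEMPLATE.format(inner_where="")).fetchall()
pushed = conn.execute(TEMPLATE.format(
    inner_where="WHERE measuredRunId = 5000 AND queryCode IN ('CH001', 'CH002', 'CH003')"
)).fetchall()
print(outer_only)
print(pushed)
```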

mysql select record containing highest value, joining on range of columns containing nulls

Here's what I'm working with:
CREATE TABLE IF NOT EXISTS `rate` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`client_company` int(11) DEFAULT NULL,
`client_group` int(11) DEFAULT NULL,
`client_contact` int(11) DEFAULT NULL,
`role` int(11) DEFAULT NULL,
`date_from` datetime DEFAULT NULL,
`hourly_rate` decimal(18,2) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `rate` (`id`, `client_company`, `client_group`,
`client_contact`, `role`, `date_from`, `hourly_rate`)
VALUES
(4, NULL, NULL, NULL, 3, '2012-07-30 14:48:16', 115.00),
(5, 3, NULL, NULL, 3, '2012-07-30 14:51:38', 110.00),
(6, 3, NULL, NULL, 3, '2012-07-30 14:59:20', 112.00);
This table stores chargeout rates for clients; the idea being that, when looking for the correct rate for a job role, we'd first look for a rate matching the given role and client contact, then if no rate was found, would try to match the role and the client group (or 'department'), then the client company, and finally looking for a global rate for just the role itself. Fine.
Rates can change over time, so the table may contain multiple entries matching any given combination of role, company, group and client contact: I want a query that will only return me the latest one for each distinct combination.
Given that I asked a near-identical question only days ago, and that this topic seems fairly frequent in various guises, I can only apologise for my slow-wittedness and ask once again for someone to explain why the query below is returning all three of the records above and not, as I want it to, only the records with IDs 4 and 6.
Is it something to do with my trying to join based on columns containing NULL?
SELECT
rate.*,
newest.id
FROM rate
LEFT JOIN rate AS newest ON(
rate.client_company = newest.client_company
AND rate.client_contact = newest.client_contact
AND rate.client_group = newest.client_group
AND rate.role= newest.role
AND newest.date_from > rate.date_from
)
WHERE newest.id IS NULL
FWIW, the problem WAS joining NULL columns. The vital missing ingredient was COALESCE:
SELECT
rate.*,
newest.id
FROM rate
LEFT JOIN rate AS newest ON(
COALESCE(rate.client_company,1) = COALESCE(newest.client_company,1)
AND COALESCE(rate.client_contact,1) = COALESCE(newest.client_contact,1)
AND COALESCE(rate.client_group,1) = COALESCE(newest.client_group,1)
AND COALESCE(rate.role,1) = COALESCE(newest.role,1)
AND newest.date_from > rate.date_from
)
WHERE newest.id IS NULL
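The underlying reason is that NULL = NULL evaluates to NULL rather than true, so rows whose key columns are NULL never satisfy the join. One caveat with COALESCE(col, 1) is that it treats NULL and a real ID of 1 as the same value. MySQL's NULL-safe equality operator <=> avoids picking a sentinel. The sketch below compares plain equality with a NULL-safe comparison on the sample rows; it runs on Python's sqlite3 module as a stand-in for MySQL, and SQLite's IS operator plays the role that <=> would play in MySQL (that substitution is my assumption for the sake of a runnable example).

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rate (id INTEGER PRIMARY KEY, client_company INTEGER, client_group INTEGER,
                   client_contact INTEGER, role INTEGER, date_from TEXT, hourly_rate REAL);
INSERT INTO rate VALUES
  (4, NULL, NULL, NULL, 3, '2012-07-30 14:48:16', 115.00),
  (5, 3, NULL, NULL, 3, '2012-07-30 14:51:38', 110.00),
  (6, 3, NULL, NULL, 3, '2012-07-30 14:59:20', 112.00);
""")


def latest_ids(op):
    """Return ids of rows with no newer row for the same key, comparing keys with `op`."""
    sql = f"""
    SELECT rate.id
    FROM rate
    LEFT JOIN rate AS newest
      ON rate.client_company {op} newest.client_company
     AND rate.client_contact {op} newest.client_contact
     AND rate.client_group {op} newest.client_group
     AND rate.role {op} newest.role
     AND newest.date_from > rate.date_from
    WHERE newest.id IS NULL
    ORDER BY rate.id
    """
    return [row[0] for row in conn.execute(sql)]


plain = latest_ids("=")       # NULL = NULL is never true, so nothing is filtered out
null_safe = latest_ids("IS")  # SQLite's NULL-safe comparison; use <=> in MySQL
print(plain, null_safe)
```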