Select from an explicit table in MySQL

I am trying to do a join on data that does not exist in my database, and never changes.
I want to do:
SELECT val, campaign FROM values
LEFT JOIN (SELECT campaign, start, end FROM (
('Spring 2104', '2014-05-01', '2014-08-01'),
('Winter 2014', '2014-08-01', '2014-12-31')
) as campaign_table ON (
values.date > campaign_table.start AND
values.date < campaign_table.end
)
Is that possible? I could create a temporary table, but for what I am trying to do that does not actually work.

You could use union all to create the dummy set. This is a viable solution considering there are only a handful of rows in your dummy dataset.
SELECT val
,campaign
FROM
`values`
LEFT JOIN (
SELECT 'Spring 2104' campaign
,'2014-05-01' `start`
,'2014-08-01' `end`
UNION ALL
SELECT 'Winter 2014'
,'2014-08-01'
,'2014-12-31'
) AS campaign_table ON
`values`.`date` > campaign_table.`start`
AND
`values`.`date` < campaign_table.`end`
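As a rough sketch of the same idea, the snippet below runs a UNION ALL derived table against an in-memory SQLite database from Python, since that needs no server. The table name `readings`, its columns, the aliases `start_date` and `end_date`, and the sample rows are all assumptions made for illustration; the asker's real table is called `values`.

```python
import sqlite3

def campaigns_for_readings(conn):
    """Left-join each reading against a constant, inline set of campaigns."""
    sql = """
        SELECT r.val, c.campaign
        FROM readings AS r
        LEFT JOIN (
            SELECT 'Spring 2104' AS campaign,
                   '2014-05-01'  AS start_date,
                   '2014-08-01'  AS end_date
            UNION ALL
            SELECT 'Winter 2014', '2014-08-01', '2014-12-31'
        ) AS c
          ON r.date > c.start_date AND r.date < c.end_date
        ORDER BY r.val
    """
    return conn.execute(sql).fetchall()

# Hypothetical sample data: one reading per campaign, one outside both.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (val INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, "2014-06-15"), (2, "2014-10-01"), (3, "2015-01-01")],
)
result = campaigns_for_readings(conn)
print(result)
```

Because it is a LEFT JOIN, the reading that falls outside every campaign is still returned, with a NULL campaign.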

If you need to run everything in one go, you could create the table, query it and drop it again:
CREATE TABLE IF NOT EXISTS `tempo`( `campaign_name` VARCHAR(100), `start` DATE, `end` DATE );
INSERT INTO tempo(campaign_name, `start`, `end`) VALUES ('Spring 2104', '2014-05-01', '2014-08-01'),('Winter 2014', '2014-08-01', '2014-12-31');
SELECT t1.val, t2.campaign_name FROM `values` t1, `tempo` t2 WHERE t1.date BETWEEN t2.start AND t2.end;
DROP TABLE `tempo`;
You can also use CREATE TEMPORARY TABLE, which is dropped automatically when the session ends.

Related

MySQL - optimize query with a sub-query

I am currently a bit stuck on a query that needs some optimization. I am looking for a way to optimize (if possible) the following query (I have no idea what to do here at the moment):
SELECT count(distinct(pj.id)) as qty
FROM `project_jobs` `pj`
JOIN `projects` `p` ON pj.project_id = p.id AND p.status NOT IN ("CANCELED","DELETED","ARCHIVED")
WHERE
(
(
(pj.job_type_service_id IN (SELECT id FROM job_type_services WHERE job_type_id IN (4,2,3)))
AND
(pj.new_status_id IN ("wip","completed","delivered"))
)
AND (pj.status<>'DELETED' AND pj.status<>'CANCELED')
)
AND
(pj.due_date >= '2010-04-01 00:00:00' AND pj.due_date <= '2018-05-09 23:59:59')
and exists
(SELECT * FROM project_job_parents pjp
WHERE pjp.project_job_id IN
(SELECT id FROM project_jobs WHERE job_type_id IN (1,24,7,8,32,34,33))
and
pjp.parent_id = pj.id
)
EXPLAIN gives following info:
Is there anything that can be done here to optimize and speed up the query?

EXISTS subqueries often perform considerably better than IN subqueries in MySQL, especially in older versions, so I have rewritten the query below to use them.
You didn't provide the table structure, so it is hard to tell which indexes already exist and which columns they contain. I'll just list the indexes you should have.
Indexes to add:
ALTER TABLE `job_type_services` ADD INDEX `job_type_services_idx_job_type_id_id` (`job_type_id`,`id`);
ALTER TABLE `project_job_parents` ADD INDEX `project_job_parents_idx_parent_id` (`parent_id`);
ALTER TABLE `project_job_parents` ADD INDEX `project_job_parents_idx_project_job_id` (`project_job_id`);
ALTER TABLE `project_jobs` ADD INDEX `project_jobs_idx_id_id_status_id` (`new_status_id`,`project_id`,`status`,`id`);
ALTER TABLE `project_jobs` ADD INDEX `project_jobs_idx_job_type_service_id` (`job_type_service_id`);
ALTER TABLE `project_jobs` ADD INDEX `project_jobs_idx_id` (`id`);
ALTER TABLE `project_jobs` ADD INDEX `project_jobs_idx_job_type_id_id` (`job_type_id`,`id`);
ALTER TABLE `projects` ADD INDEX `projects_idx_status_id` (`status`,`id`);
Transformed query:
SELECT
count(DISTINCT (`pj`.id)) AS qty
FROM
`project_jobs` `pj`
JOIN
`projects` `p`
ON `pj`.project_id = `p`.id
AND `p`.status NOT IN (
'CANCELED',
'DELETED',
'ARCHIVED')
WHERE
(
(
(
EXISTS (
SELECT
1
FROM
job_type_services
WHERE
job_type_services.job_type_id IN (
4, 2, 3
)
AND `pj`.job_type_service_id = job_type_services.id
)
)
AND (
`pj`.new_status_id IN (
'wip', 'completed', 'delivered'
)
)
)
AND (
`pj`.status <> 'DELETED'
AND `pj`.status <> 'CANCELED'
)
)
AND (
`pj`.due_date >= '2010-04-01 00:00:00'
AND `pj`.due_date <= '2018-05-09 23:59:59'
)
AND EXISTS (
SELECT
*
FROM
project_job_parents pjp
WHERE
EXISTS (
SELECT
1
FROM
project_jobs
WHERE
project_jobs.job_type_id IN (
1, 24, 7, 8, 32, 34, 33
)
AND pjp.project_job_id = project_jobs.id
)
AND pjp.parent_id = `pj`.id
)
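To check that the IN form and the EXISTS form select the same rows, here is a small sketch using SQLite from Python. The two-table schema is a cut-down assumption, not the asker's real structure, and SQLite's planner differs from MySQL's, so this only illustrates that the results match, not any speed difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, heavily reduced versions of the tables in the question.
conn.executescript("""
    CREATE TABLE job_type_services (id INTEGER PRIMARY KEY, job_type_id INTEGER);
    CREATE TABLE project_jobs (id INTEGER PRIMARY KEY, job_type_service_id INTEGER);
    INSERT INTO job_type_services VALUES (10, 4), (11, 2), (12, 9);
    INSERT INTO project_jobs VALUES (1, 10), (2, 11), (3, 12), (4, NULL);
""")

in_form = """
    SELECT COUNT(DISTINCT pj.id) FROM project_jobs pj
    WHERE pj.job_type_service_id IN
          (SELECT id FROM job_type_services WHERE job_type_id IN (4, 2, 3))
"""
exists_form = """
    SELECT COUNT(DISTINCT pj.id) FROM project_jobs pj
    WHERE EXISTS (SELECT 1 FROM job_type_services s
                  WHERE s.job_type_id IN (4, 2, 3)
                    AND pj.job_type_service_id = s.id)
"""
in_count = conn.execute(in_form).fetchone()[0]
exists_count = conn.execute(exists_form).fetchone()[0]
print(in_count, exists_count)
```

On a real MySQL server, running EXPLAIN on both forms is the way to see whether the rewrite changes the plan at all.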

MySql CASE Execute Query Returns Operand should contain 1 columns

The following code returns Operand should contain 1 columns.
SELECT
CASE WHEN
(SELECT COUNT(1) FROM `student` WHERE `join_date` > '2017-03-21 09:00:00') > 0
THEN
(SELECT * FROM `student` WHERE `join_date` >= CAST(CAST('2017-03-21 09:00:00' AS DATE) AS DATETIME))
END
but the following works. Why?
SELECT
CASE WHEN
(SELECT COUNT(1) FROM `student` WHERE `join_date` > '2017-03-21 00:00:00') > 0
THEN
(SELECT `foo`)
ELSE
(SELECT `bar`)
END
What if I want to perform a check and execute one of two different queries depending on the result?
I want to achieve the following (works fine in SQL):
IF (SELECT COUNT(*) FROM table WHERE term LIKE "term") > 4000
EXECUTE (SELECT * FROM table1)
ELSE
EXECUTE (SELECT * FROM table2)

A subquery used as a CASE result must be scalar: it may return only one column and one row. SELECT * returns every column, which is what triggers the error. If you force each subselect to return a single column and a single row, the first form works as well:
SELECT
CASE WHEN
(SELECT COUNT(1) FROM `student` WHERE `join_date` > '2017-03-21 00:00:00') > 0
THEN
(SELECT your_column FROM `student` ORDER BY your_column LIMIT 1)
ELSE
(SELECT your_column FROM `teacher` ORDER BY your_column LIMIT 1)
END
You should also order by the column you need (named your_column in the sample) so that the first row is well defined.
You can select from both tables using UNION ALL and excluding conditions.
SELECT * FROM `student`
WHERE EXISTS (SELECT * FROM `student` WHERE `join_date` > '2017-03-21 00:00:00')
UNION ALL
SELECT * FROM `teacher`
WHERE NOT EXISTS (SELECT * FROM `student` WHERE `join_date` > '2017-03-21 00:00:00')
Note that both tables must return the same number of columns, with compatible types.
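A minimal sketch of this UNION ALL pattern, run against SQLite from Python. The `name` column, the sample rows and the helper `pick_rows` are assumptions for illustration; only one branch of the UNION ALL can produce rows for a given cutoff, because the two conditions are mutually exclusive.

```python
import sqlite3

QUERY = """
    SELECT name FROM student
    WHERE EXISTS (SELECT 1 FROM student WHERE join_date > :cutoff)
    UNION ALL
    SELECT name FROM teacher
    WHERE NOT EXISTS (SELECT 1 FROM student WHERE join_date > :cutoff)
"""

def pick_rows(conn, cutoff):
    """Return student names if any student joined after cutoff, else teacher names."""
    return [row[0] for row in conn.execute(QUERY, {"cutoff": cutoff})]

conn = sqlite3.connect(":memory:")
# Hypothetical single-column projections of the two tables.
conn.executescript("""
    CREATE TABLE student (name TEXT, join_date TEXT);
    CREATE TABLE teacher (name TEXT);
    INSERT INTO student VALUES ('alice', '2017-03-22 10:00:00'),
                               ('bob',   '2017-01-05 08:00:00');
    INSERT INTO teacher VALUES ('carol');
""")
recent = pick_rows(conn, "2017-03-21 00:00:00")       # a student joined after this
none_recent = pick_rows(conn, "2018-01-01 00:00:00")  # nobody joined after this
print(recent, none_recent)
```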

How many different ways are there to get the second row in a SQL search?

Let's say I was looking for the second-highest record.
Sample Table:
CREATE TABLE `my_table` (
`id` int(2) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`value` int(10),
PRIMARY KEY (`id`)
);
INSERT INTO `my_table` (`id`, `name`, `value`) VALUES (NULL, 'foo', '200'), (NULL, 'bar', '100'), (NULL, 'baz', '0'), (NULL, 'quux', '300');
The row with the second-highest value is foo. How many ways can you get this result?
The obvious example is:
SELECT name FROM my_table ORDER BY value DESC LIMIT 1 OFFSET 1;
Can you think of other examples?
I was trying this one, but LIMIT & IN/ALL/ANY/SOME subquery is not supported.
SELECT name FROM my_table WHERE value IN (
SELECT MIN(value) FROM my_table ORDER BY value DESC LIMIT 1
) LIMIT 1;
Eduardo's solution in standard SQL
select *
from (
select id,
name,
value,
row_number() over (order by value desc) as rn
from my_table t
) t
where rn = 2 -- change the constant to pick any other row
This works on any modern DBMS; MySQL only supports window functions from version 8.0 onwards. This solution is usually faster than solutions using sub-selects. It can also easily return the 2nd, 3rd, ... row (again, this is achievable with Eduardo's solution as well).
It can also be adjusted to count by groups (adding a partition by) so the "greatest-n-per-group" problem can be solved with the same pattern.
Here is a SQLFiddle to play around with: http://sqlfiddle.com/#!12/286d0/1
This only works for exactly the second highest:
SELECT * FROM my_table two
WHERE EXISTS (
SELECT * FROM my_table one
WHERE one.value > two.value
AND NOT EXISTS (
SELECT * FROM my_table zero
WHERE zero.value > one.value
)
)
LIMIT 1
;
This one emulates a window function rank() for platforms that don't have them. It can also be adapted for ranks <> 2 by altering one constant:
SELECT one.*
-- , 1+COALESCE(agg.rnk,0) AS rnk
FROM my_table one
LEFT JOIN (
SELECT one.id , COUNT(*) AS rnk
FROM my_table one
JOIN my_table cnt ON cnt.value > one.value
GROUP BY one.id
) agg ON agg.id = one.id
WHERE agg.rnk=1 -- the aggregate starts counting at zero
;
Both solutions need functional self-joins (I don't know if mysql allows them, IIRC it only disallows them if the table is the target for updates or deletes)
The below one does not need window functions, but uses a recursive query to enumerate the rankings:
WITH RECURSIVE agg AS (
SELECT one.id
, one.value
, 1 AS rnk
FROM my_table one
WHERE NOT EXISTS (
SELECT * FROM my_table zero
WHERE zero.value > one.value
)
UNION ALL
SELECT two.id
, two.value
, agg.rnk+1 AS rnk
FROM my_table two
JOIN agg ON two.value < agg.value
WHERE NOT EXISTS (
SELECT * FROM my_table nx
WHERE nx.value > two.value
AND nx.value < agg.value
)
)
SELECT * FROM agg
WHERE rnk = 2
;
(the recursive query will not work in MySQL versions before 8.0, which do not support WITH RECURSIVE)
You can use inline initialization of a user variable like this:
select * from (
select id,
name,
value,
@curRank := @curRank + 1 AS `rank`
from my_table t, (SELECT @curRank := 0) r
order by value desc
) tb
where tb.`rank` = 2
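
Another option is to take the highest value that is below the maximum:
SELECT name
FROM my_table
WHERE value < (SELECT max(value) FROM my_table)
ORDER BY value DESC
LIMIT 1

Or take the two highest rows in a derived table and pick the smaller of them:
SELECT name
FROM my_table
WHERE value = (
SELECT min(r.value)
FROM (
SELECT name, value
FROM my_table
ORDER BY value DESC
LIMIT 2
) r
)
LIMIT 1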
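To compare a few of these approaches side by side, here is a small sketch that loads the sample rows into SQLite from Python and runs three of the queries. The `count_higher` query is a correlated-subquery variation on the rank emulation above, written for this example and not taken from any of the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL,
        value INTEGER
    );
    INSERT INTO my_table (name, value)
    VALUES ('foo', 200), ('bar', 100), ('baz', 0), ('quux', 300);
""")

approaches = {
    # Skip the top row, take the next one.
    "limit_offset": "SELECT name FROM my_table ORDER BY value DESC LIMIT 1 OFFSET 1",
    # Highest value strictly below the maximum.
    "below_max": """SELECT name FROM my_table
                    WHERE value < (SELECT MAX(value) FROM my_table)
                    ORDER BY value DESC LIMIT 1""",
    # Rows that have exactly one row with a higher value.
    "count_higher": """SELECT one.name FROM my_table one
                       WHERE (SELECT COUNT(*) FROM my_table cnt
                              WHERE cnt.value > one.value) = 1""",
}
results = {label: conn.execute(sql).fetchone()[0] for label, sql in approaches.items()}
print(results)
```

The approaches differ once values are tied: LIMIT/OFFSET counts rows, while the other two count distinct positions relative to higher values.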

How to insert conditionally

I create a temporary table #tbl(account, last_update). I have the following two inserts from different sources (possibly tables in different databases) that insert accounts with their last update date. For example:
create table #tbl ([account] numeric(18, 0), [last_update] datetime)
insert into #tbl(account , last_update)
select table1.account, max(table1.last_update)
from table1 join…
group by table1.account
insert into #tbl(account , last_update)
select table2.account, max(table2.last_update)
from table2 join…
group by table2.account
The problem is that this could cause duplicate accounts in #tbl. I have to either avoid them during each insert or remove the duplicates after both inserts. Also, if an account has two different last_update values, I want #tbl to keep the latest one. How do I achieve this conditional insert, and which approach will perform better?
Do you think you could rewrite your query to something like:
create table #tbl ([account] numeric(18, 0), [last_update] datetime)
insert into #tbl(account , last_update)
select theaccount, MAX(theupdate) from
(
select table1.account AS theaccount, table1.last_update AS theupdate
from table1 join…
UNION ALL
select table2.account AS theaccount, table2.last_update AS theupdate
from table2 join…
) AS tmp GROUP BY theaccount
The UNION ALL builds a single derived table combining the rows of table1 and table2. From there you can treat it like a regular table, which means you can find the latest last_update for each account with a GROUP BY.
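The following sketch shows the UNION ALL plus GROUP BY pattern on a small data set, using SQLite from Python. The elided joins from the question are left out, a plain table `tbl` stands in for the temporary table #tbl, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (account INTEGER, last_update TEXT);
    CREATE TABLE table2 (account INTEGER, last_update TEXT);
    CREATE TABLE tbl (account INTEGER, last_update TEXT);
    -- Account 1 appears in both sources with different dates.
    INSERT INTO table1 VALUES (1, '2020-01-01'), (1, '2020-03-01'), (2, '2020-02-01');
    INSERT INTO table2 VALUES (1, '2020-05-01'), (3, '2020-04-01');
""")

# Combine both sources first, then keep one row per account with the latest date.
conn.execute("""
    INSERT INTO tbl (account, last_update)
    SELECT theaccount, MAX(theupdate)
    FROM (
        SELECT account AS theaccount, last_update AS theupdate FROM table1
        UNION ALL
        SELECT account AS theaccount, last_update AS theupdate FROM table2
    ) AS tmp
    GROUP BY theaccount
""")
rows = conn.execute("SELECT account, last_update FROM tbl ORDER BY account").fetchall()
print(rows)
```

Because the aggregation happens after the two sources are combined, no duplicate account can reach the target table and no cleanup pass is needed.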

Alternatively, take the latest row per account from each source and then keep the latest of those:
insert into #tbl(account , last_update)
select account, max(last_update)
from
(
select a.* from #table1 a where
last_update in( select top 1 last_update from #table1 b
where
a.account = b.account
order by last_update desc)
UNION
select a.* from #table2 a where
last_update in( select top 1 last_update from #table2 b
where
a.account = b.account
order by last_update desc)
) AS tmp
group by account

Time interval calculation in time series using SQL

I have a MySQL table like this
CREATE TABLE IF NOT EXISTS `vals` (
`DT` datetime NOT NULL,
`value` INT(11) NOT NULL,
PRIMARY KEY (`DT`)
);
DT is a unique date and time.
data sample:
INSERT INTO `vals` (`DT`,`value`) VALUES
('2011-02-05 06:05:00', 300),
('2011-02-05 11:05:00', 250),
('2011-02-05 14:35:00', 145),
('2011-02-05 16:45:00', 100),
('2011-02-05 18:50:00', 125),
('2011-02-05 19:25:00', 100),
('2011-02-05 21:10:00', 125),
('2011-02-06 00:30:00', 150);
I need to get something like this:
start|end|value
NULL,'2011-02-05 06:05:00',300
'2011-02-05 06:05:00','2011-02-05 11:05:00',250
'2011-02-05 11:05:00','2011-02-05 14:35:00',145
'2011-02-05 14:35:00','2011-02-05 16:45:00',100
'2011-02-05 16:45:00','2011-02-05 18:50:00',125
'2011-02-05 18:50:00','2011-02-05 19:25:00',100
'2011-02-05 19:25:00','2011-02-05 21:10:00',125
'2011-02-05 21:10:00','2011-02-06 00:30:00',150
'2011-02-06 00:30:00',NULL,NULL
I tried the following query:
SELECT T1.DT AS `start`,T2.DT AS `stop`, T2.value AS value FROM (
SELECT DT FROM vals
) T1
LEFT JOIN (
SELECT DT,value FROM vals
) T2
ON T2.DT > T1.DT ORDER BY T1.DT ASC
but it returns too many rows (29 instead of 9) and I could not find any way to limit this using SQL. Is it possible in MySQL?

Use a subquery:
SELECT
(
select max(T1.DT)
from vals T1
where T1.DT < T2.DT
) AS `start`,
T2.DT AS `stop`,
T2.value AS value
FROM vals T2
ORDER BY T2.DT ASC
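As a quick check of the correlated subquery, here is a sketch that runs it against SQLite from Python with the first three sample rows. The aliases `start_dt` and `stop_dt` are chosen for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vals (DT TEXT PRIMARY KEY, value INTEGER NOT NULL)")
conn.executemany("INSERT INTO vals VALUES (?, ?)", [
    ("2011-02-05 06:05:00", 300),
    ("2011-02-05 11:05:00", 250),
    ("2011-02-05 14:35:00", 145),
])

# For each row, look up the latest timestamp strictly before it.
intervals = conn.execute("""
    SELECT (SELECT MAX(T1.DT) FROM vals T1 WHERE T1.DT < T2.DT) AS start_dt,
           T2.DT AS stop_dt,
           T2.value
    FROM vals T2
    ORDER BY T2.DT ASC
""").fetchall()
for row in intervals:
    print(row)
```

The first row gets a NULL start because no earlier timestamp exists. The trailing row with a NULL end from the desired output is not produced by this query.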
You can also use a MySQL-specific solution employing user variables:
SELECT CAST( @dt AS DATETIME ) AS `start` , @dt := DT AS `stop` , `value`
FROM (SELECT @dt := NULL) dt, vals
ORDER BY DT ASC
But you need to do it carefully:
the ORDER BY must be present, otherwise the variables don't roll over properly;
the variable needs to be reset to NULL within the query, using a subquery to set it, otherwise if you run it twice in a row the second run will not start with NULL.
You can use a server-side variable to simulate it:
select @myvar as `start`, DT as `end`, value, @myvar := DT as next_rows_start
from vals
order by DT
In practice MySQL evaluates the select list from left to right, so the two references to @myvar (start and next_rows_start) output two different values. The manual does not guarantee this evaluation order, though.
Just remember to reset @myvar to null before and/or after the query, otherwise the second and subsequent runs will have a wrong first row:
select @myvar := null
This would be easier if the table had a running ID column that follows the same order as the times in DT. If you don't want to change the table, you can use a temporary copy:
drop table if exists temp;
CREATE TABLE temp (
`id` INT(11) AUTO_INCREMENT,
`DT` datetime NOT NULL,
`value` INT(11) NOT NULL,
PRIMARY KEY (`id`)
);
insert into temp (DT,value) select * from vals order by DT asc;
select t1.DT as `start`, t2.DT as `end`, t2.value
from temp t2
left join temp t1 ON t2.id = t1.id + 1;
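The running-ID approach can be sketched the same way in SQLite from Python. The copy is named `ordered_vals` here, a hypothetical name chosen because TEMP is a keyword in SQLite, and the sample rows are deliberately inserted out of order to show that the ORDER BY in the copy step decides the numbering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vals (DT TEXT PRIMARY KEY, value INTEGER NOT NULL);
    INSERT INTO vals VALUES ('2011-02-05 11:05:00', 250),
                            ('2011-02-05 06:05:00', 300),
                            ('2011-02-05 14:35:00', 145);
    CREATE TABLE ordered_vals (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        DT TEXT NOT NULL,
        value INTEGER NOT NULL
    );
    -- Copy in time order so consecutive ids mean consecutive timestamps.
    INSERT INTO ordered_vals (DT, value)
    SELECT DT, value FROM vals ORDER BY DT ASC;
""")

# Pair every row with the row whose id is one lower.
pairs = conn.execute("""
    SELECT t1.DT, t2.DT, t2.value
    FROM ordered_vals t2
    LEFT JOIN ordered_vals t1 ON t2.id = t1.id + 1
    ORDER BY t2.id
""").fetchall()
for row in pairs:
    print(row)
```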