Creating summary VIEW from fields from multiple tables - mysql

I am trying to write a select query for creating a view in MySQL. Each row in the view should display weekly summary (sum, avg) for user values collected from multiple tables. The tables are similar to each-other but not identical. The view should include rows also in case other table doesn't have a values for that week. Something like this:
| week_year | sum1 | avg1 | sum2 | user_id |
| --------- | ---- | ---- | ---- | ------- |
| 201840 | | | 3 | 1 |
| 201844 | 45 | 55 | | 1 |
| 201845 | 55 | 65 | | 1 |
| 201849 | 65 | 75 | | 1 |
| 201849 | 75 | 85 | 3 | 2 |
The tables (simplified) are as follows:
CREATE TABLE IF NOT EXISTS `t1` (
`user_id` INT NOT NULL AUTO_INCREMENT,
`date` DATE NOT NULL,
`value1` int(3) NOT NULL,
`value2` int(3) NOT NULL,
PRIMARY KEY (`user_id`,`date`)
) DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `t2` (
`id` INT NOT NULL AUTO_INCREMENT,
`date` DATE NOT NULL,
`value3` int(3) NOT NULL,
PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `t3` (
`t3_id` INT NOT NULL,
`user_id` INT NOT NULL
) DEFAULT CHARSET=utf8;
My current solution doesn't seem reasonable and I am not sure how it would perform in case of thousands of rows:
select ifnull(yearweek(q1.date1), yearweek(q1.date2)) as week_year,
sum(value1) as sum1,
avg(value2) as avg1,
sum(value3) as sum2,
q1.user_id
from (select t2.date as date2,
t1.date as date1,
ifnull(t3.user_id, t1.user_id) as user_id,
t1.value1,
t1.value2,
t2.value3
from t2
join t3 on t3.t3_id=t2.id
left join t1 on yearweek(t1.date) = yearweek(t2.date) and t1.user_id = t3.user_id
union
select t2.date as date2,
t1.date as date1,
ifnull(t3.user_id, t1.user_id) as user_id,
t1.value1,
t1.value2,
t2.value3
from t2
join t3 on t3.t3_id=t2.id
right join t1 on yearweek(t1.date) = yearweek(t2.date) and t1.user_id = t3.user_id) as q1
group by week_year, user_id;
DB Fiddle
Is the current solution okay performance wise or are there better options? In case of in the future third (or fourth) table is added, how would I manage the query? Should I consider creating a separate table, that is updated with triggers?
Thanks in advance.

Another way you can do it is to union all the data and then group it. You'll have to perf test to see which is better:
SELECT
yearweek(date),
SUM(value1) as sum1,
AVG(value2) as avg1,
SUM(value3) as sum2
FROM
(
SELECT user_id, date, value1, value2, CAST(null as INT) as value3 FROM t1
UNION ALL
SELECT user_id, date, null, null, value3 FROM t2 INNER JOIN t3 ON t2.id = t3.t3_id
)
GROUP BY
user_id,
yearweek(date)
Hopefully mysql won't take issue with casting null to an int..

Related

MySQL Update Table from Another Table Recursively

I have two tables, table1 and table2, where table1 is updated to fill in missing (null) values based on matching fields in table2 to create a more complete table1.
I have tried numerous queries such as
UPDATE table1 INNER JOIN table2...SET...
and
UPDATE table1 SET ... (SELECT)...
However my results are always incomplete. Note that I have a much larger dataset in both tables in terms of both columns and rows). I just used this as simpler example.
Rules:
1) The `keyword` from table2 looks for a match in `keyword` in table1 and must accept partial matches.
2) No values can be overwritten in table1 (update null values only)
3) The lookup order is per run_order in table2.
Specific example:
Table1:
|-----|-------------------------------|----------|-----|---------|-------|
|t1_id|keyword |category |month|age |skill |
|-----|-------------------------------|----------|-----|---------|-------|
| 1 |childrens-crafts-christmas |kids | | | |
| 2 |kids-costumes | | |tween | |
| 3 |diy-color-printable |printable | | |easy |
| 4 |toddler-halloween-costume-page | | | | |
|-----|-------------------------------|----------|-----|---------|-------|
Table2:
|-----|---------|---------|----------|-----|----------|-------|
|t2_id|run_order|keyword |category |month|age |skill |
|-----|---------|---------|----------|-----|----------|-------|
| 1 | 1 |children | | |4-11 yrs | |
| 2 | 2 |printable| | | |easy |
| 3 | 3 |costume |halloween | 10 |0-12 years| |
| 4 | 4 |toddler | | |1-3 years | |
| 5 | 5 |halloween|holiday | 10 | | |
|-----|---------|---------|----------|-----|----------|-------|
Result:
|-----|-------------------------------|----------|-----|---------|-------|
|t1_id|keyword |category |month|age |skill |
|-----|-------------------------------|----------|-----|---------|-------|
| 1 |childrens-crafts-christmas |kids | |4-11 yrs | |
| 2 |kids-costumes |halloween | 10 |tween | |
| 3 |diy-color-printable printable |printable | | |easy |
| 4 |toddler-halloween-costume-page |holiday | 10 |1-3 years| |
|-----|-------------------------------|----------|-----|---------|-------|
MySQL for schema and table data:
DROP TABLE IF EXISTS table1;
DROP TABLE IF EXISTS table2;
CREATE TABLE `table1` (
`t1_id` INT NOT NULL AUTO_INCREMENT,
`keyword` VARCHAR(200) NULL,
`category` VARCHAR(45) NULL,
`month` VARCHAR(45) NULL,
`age` VARCHAR(45) NULL,
`skill` VARCHAR(45) NULL,
PRIMARY KEY (`t1_id`));
CREATE TABLE `table2` (
`t2_id` INT NOT NULL AUTO_INCREMENT,
`run_order` INT NULL,
`keyword` VARCHAR(200) NULL,
`category` VARCHAR(45) NULL,
`month` INT NULL,
`age` VARCHAR(45) NULL,
`skill` VARCHAR(45) NULL,
PRIMARY KEY (`t2_id`));
INSERT INTO `table1` (`keyword`, `category`) VALUES ('childrens-crafts-christmas', 'kids');
INSERT INTO `table1` (`keyword`, `age`) VALUES ('kids-costumes', 'tween');
INSERT INTO `table1` (`keyword`, `category`, `skill`) VALUES ('diy-color-printable', 'printable', 'easy');
INSERT INTO `table1` (`keyword`) VALUES ('toddler-halloween-costume-page');
INSERT INTO `table2` (`run_order`, `keyword`, `age`) VALUES (1, 'children', '4-11 yrs');
INSERT INTO `table2` (`run_order`, `keyword`, `skill`) VALUES (2, 'printable', 'easy');
INSERT INTO `table2` (`run_order`, `keyword`, `category`, `month`, `age`) VALUES (3, 'costume', 'halloween', 10, '0-12 years');
INSERT INTO `table2` (`run_order`, `keyword`, `age`) VALUES (4, 'toddler', '1-3 years');
INSERT INTO `table2` (`run_order`, `keyword`, `category`, `month`) VALUES (5, 'halloween', 'holiday', 10);
You want to update empty values in table1 with the value in the corresponding column of the first matching record in table2, run_order wise. A typical solution is to use a combination of correlated subqueries to find the matching records in table2 and keep only the one with lowest run_order.
Here is a query that will update null categories with this logic:
update table1 t1
set category = (
select category
from table2 t2
where t2.run_order = (
select min(t22.run_order)
from table2 t22
where
t1.keyword like concat('%', t22.keyword, '%')
and t22.category is not null
)
)
where t1.category is null
This assumes that run_order is unique in table2 (which seems relevant in your use case).
You can extend the logic for more columns with coalesce(). Here is the solution for columns category and month:
update table1 t1
set
category = coalesce(
t1.category,
(
select category
from table2 t2
where t2.run_order = (
select min(t22.run_order)
from table2 t22
where
t1.keyword like concat('%', t22.keyword, '%')
and t22.category is not null
)
)
),
month = coalesce(
t1.month,
(
select month
from table2 t2
where t2.run_order = (
select min(t22.run_order)
from table2 t22
where
t1.keyword like concat('%', t22.keyword, '%')
and t22.month is not null
)
)
)
where t1.category is null or t1.month is null
Demo on DB Fiddle
You can use a join between the 2 tables using a like expression and the if nul() function to ensure you don't overwrite non null values.
UPDATE table1 t1
INNER JOIN table2 t2 ON t1.keyword like concat("%",t2.keyword,"%")
SET
t1.category = ifnull(t1.category,t2.category),
t1.age = ifnull(t1.age,t2.age),
t1.skill = ifnull(t1.skill,t2.skill);

MySql: JOIN and SUM from different tables

I need to sum-up the amount in 2 tables (c1, c2) linked n:1 to table a. The problem is: It would be greate if I could do it in just 1 query, because the real situation is a bit more complicated ;-) I brought it down to this testcase:
create table a (
`id` int(10) unsigned NOT NULL, KEY(id)
) ENGINE=InnoDB;
create table c1 (
`id` int(10) unsigned NOT NULL, KEY(id),
`a` int(10),
`amount` decimal(15,2) NOT NULL
) ENGINE=InnoDB;
create table c2 (
`id` int(10) unsigned NOT NULL, KEY(id),
`a` int(10),
`amount` decimal(15,2) NOT NULL
) ENGINE=InnoDB;
INSERT INTO a SET id=1;
INSERT INTO c1 SET a=1, amount = 2;
INSERT INTO c1 SET a=1, amount = 3;
INSERT INTO c2 SET a=1, amount = 1;
SELECT SUM(c1.amount), SUM(c2.amount)
FROM a
LEFT JOIN c1 ON c1.a = a.id
LEFT JOIN c2 ON c2.a = a.id
WHERE a.id = 1;
The result of course is:
+----------------+----------------+
| SUM(c1.amount) | SUM(c2.amount) |
+----------------+----------------+
| 5.00 | 2.00 |
+----------------+----------------+
because c1 is joined twice and doubles the amound in c2. But I need to get:
+----------------+----------------+
| SUM(c1.amount) | SUM(c2.amount) |
+----------------+----------------+
| 5.00 | 1.00 |
+----------------+----------------+
Any idea how to get to this?
One possible answer is:
SELECT (select SUM(c1.amount) from c1 where c1.a = a.id) as c1_amount,
(select SUM(c2.amount) from c2 where c2.a = a.id) as c2_amount
FROM a
WHERE a.id = 1;
Link to SQL Fiddle
BTW - Thanks for putting in the data and create scripts. That helped a lot.
SELECT a.*
, SUM(CASE WHEN b.source = 'c1' THEN amount END) c1_ttl
, SUM(CASE WHEN b.source = 'c2' THEN amount END) c2_ttl
FROM a
JOIN
(
SELECT *,'c1' source FROM c1 UNION SELECT *,'c2' FROM c2
) b
ON b.a = a.id;
+----+--------+--------+
| id | c1_ttl | c2_ttl |
+----+--------+--------+
| 1 | 5.00 | 1.00 |
+----+--------+--------+
Another solution.
SELECT *
FROM
(SELECT SUM(c1.amount) FROM c1 WHERE c1.a = 1) C1
INNER JOIN
(SELECT SUM(c2.amount) FROM c2 WHERE c2.a = 1) C2;

Mysql Query to delete duplicates

I have duplicate results like below where some column may have data and may not
| contact_info | icon | id | title | lastmodified_by |
+--------------+------+-----+---------------+------------------+
| 169 | 305 | 123 | Whakarewarewa | 2011100400305262 |
| NULL | NULL | 850 | Whakarewarewa | NULL |
+--------------+------+-----+---------------+----------------
| contact_info | icon | id | title | lastmodified_by |
+--------------+------+-----+---------------+------------------+
| NULL | NULL | 123 | Paris | NULL |
| NULL | NULL | 850 | Paris | NULL |
+--------------+------+-----+---------------+----------------
I want to delete record having less Data and if the all the Field values are exact same then delete any row.
There are thousand records like this.
Try this two-step solution:
Run this query to vew all duplicates - record having less Data -
SELECT t1.* FROM table t1
JOIN (
SELECT
title,
MIN(IF(contact_info IS NULL, 0, 1) + IF(contact_info IS NULL, 0, 1) + IF(lastmodified_by IS NULL, 0, 1)) min_value_data,
MAX(IF(contact_info IS NULL, 0, 1) + IF(contact_info IS NULL, 0, 1) + IF(lastmodified_by IS NULL, 0, 1)) max_value_data
FROM table GROUP BY title HAVING min_value_data <> max_value_data
) t2
ON t1.title = t2.title AND IF(t1.contact_info IS NULL, 0, 1) + IF(t1.contact_info IS NULL, 0, 1) + IF(t1.lastmodified_by IS NULL, 0, 1) <> t2.max_value_data
Rewrite it to DELETE statement and execute.
Then run this query to remove all duplicates except min ID:
DELETE t1 FROM table t1
JOIN (SELECT MIN(id) id, title FROM table GROUP BY title) t2
ON t1.id <> t2.id AND t1.title = t2.title;
Use this to select duplicates, feel free to alter this to a delete statement:
SELECT * FROM `test`,
(SELECT title, count( title ) AS ttl
FROM `test`
GROUP BY title
HAVING ttl >1) AS sub
WHERE test.title = sub.title
AND contact_info IS NULL AND lastmodified_by IS NULL
Main table = tes1
Create temp
CREATE TEMPORARY TABLE my_temp ( id INT(20) NOT NULL ) ENGINE=MEMORY;
Fill with id's to remove
INSERT INTO my_temp (id) SELECT id FROM tes1 AS main, ( SELECT title, count( title ) AS ttl FROM tes1 GROUP BY
title HAVING ttl >1 ) AS sub WHERE main.title = sub.title AND
main.contact_info IS NULL AND main.lastmodified_by IS NULL GROUP BY
main.contact_info, main.icon, main.title, main.lastmodified_by;
Delete!
DELETE FROM tes1 WHERE id IN (select id from my_temp);
Cleanup, note: do we really need this?
DROP TABLE my_temp;

Join two tables where table A has a date value and needs to find the next date in B below the date in A

I got this table "A":
| id | date |
===================
| 1 | 2010-01-13 |
| 2 | 2011-04-19 |
| 3 | 2011-05-07 |
| .. | ... |
and this table "B":
| date | value |
======================
| 2009-03-29 | 0.5 |
| 2010-01-30 | 0.55 |
| 2011-08-12 | 0.67 |
Now I am looking for a way to JOIN those two tables having the "value" column in "B" mapped to the dates in "A". The tricky part for me here is that table "B" only stores the change date and the new value. Now when I need this value in table "A" the SQL needs to look back what date is the next below the date it is asking the value for.
So in the end the JOIN of those tables should look like this:
| id | date | value |
===========================
| 1 | 2010-01-13 | 0.5 |
| 2 | 2011-04-19 | 0.55 |
| 3 | 2011-05-07 | 0.55 |
| .. | ... | ... |
How can I do this?
-- Create and fill first table
CREATE TABLE `id_date` (
`id` int(11) NOT NULL auto_increment,
`iddate` date NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
INSERT INTO `id_date` VALUES(1, '2010-01-13');
INSERT INTO `id_date` VALUES(2, '2011-04-19');
INSERT INTO `id_date` VALUES(3, '2011-05-07');
-- Create and fill second table
CREATE TABLE `date_val` (
`mydate` date NOT NULL,
`myval` varchar(4) collate utf8_bin NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
INSERT INTO `date_val` VALUES('2009-03-29', '0.5');
INSERT INTO `date_val` VALUES('2010-01-30', '0.55');
INSERT INTO `date_val` VALUES('2011-08-12', '0.67');
-- Get the result table as asked in question
SELECT iddate, t2.mydate, t2.myval
FROM `id_date` t1
JOIN date_val t2 ON t2.mydate <= t1.iddate
AND t2.mydate = (
SELECT MAX( t3.mydate )
FROM `date_val` t3
WHERE t3.mydate <= t1.iddate )
What we're doing:
for each date in the id_date table (your table A),
we find the date in the date_val table (your table B)
which is the highest date in the date_val table (but still smaller than the id_date.date)
You could use a subquery with limit 1 to look up the latest value in table B:
select id
, date
, (
select value
from B
where B.date < A.date
order by
B.date desc
limit 1
) as value
from A
I have been inspired by the other answers but ended with my own solution using common table expressions:
WITH datecombination (id, adate, bdate) AS
(
SELECT id, A.date, MAX(B.Date) as Bdate
FROM tableA A
LEFT JOIN tableB B
ON B.date <= A.date
GROUP BY A.id, A.date
)
SELECT DC.id, DC.adate, B.value FROM datecombination DC
LEFT JOIN tableB B
ON DC.bdate = B.bdate
The INNER JOIN return rows when there is at least one match in both tables. Try this.
Select A.id,A.date,b.value
from A inner join B
on A.date=b.date

MySQL join, empty rows in junction table

I have three tables I'd like to join in a way that produces all records from one table and any matching records or NULL from another table. It is imperative that all records from the first table be returned. I thought I had done this before but I can't remember when or where and MySQL just isn't playing ball.
SELECT VERSION();
5.0.51a-3ubuntu5.7
/* Some table that may or may not be needed, included to give context */
CREATE TABLE `t1` (
`a` int(4) NOT NULL,
`a_name` varchar(10) NOT NULL,
PRIMARY KEY (`a`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE `t2` ( /* "One table" */
`b` int(2) NOT NULL,
`b_name` varchar(10) NOT NULL,
PRIMARY KEY (`b`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE `t3` ( /* "Another table" */
`a` int(4) NOT NULL,
`b` int(2) NOT NULL,
PRIMARY KEY (`a`,`b`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
INSERT INTO t1 VALUES (1, '1-one'),(2, '1-two'),(3, '1-three');
INSERT INTO t2 VALUES (1, '2-one'),(2, '2-two'),(3, '2-three'),
(4, '2-four'),(5, '2-five');
INSERT INTO t3 VALUES (1,1),(1,2),(1,3),(1,4),(2,2),(2,5);
t3 is a junction table for t1 and t2. The result set I'm looking for should look like this for any a=n:
n=1
b | b_name | a
-------------------
1 | 2-one | 1
2 | 2-two | 1
3 | 2-three | 1
4 | 2-four | 1
5 | 2-five | NULL
n=2
b | b_name | a
-------------------
1 | 2-one | NULL
2 | 2-two | 2
3 | 2-three | NULL
4 | 2-four | NULL
5 | 2-five | 2
n=7
b | b_name | a
-------------------
1 | 2-one | NULL
2 | 2-two | NULL
3 | 2-three | NULL
4 | 2-four | NULL
5 | 2-five | NULL
The value of a in the result set actually isn't important as long as it unambiguously reflects the presence or absence of records in t3 (does that make sense?).
The query
SELECT t2.b, t2.b_name, a
FROM t2
LEFT /* OUTER */ JOIN t3 ON t3.b = t2.b
WHERE (
a = 2
OR
a IS NULL
);
returns
b | b_name | a
-------------------
2 | 2-two | 2
5 | 2-five | 2
Can this be done?
SELECT t2.b, t2.b_name, MAX(IF(a=2, a, NULL)) AS a
FROM t2
LEFT OUTER JOIN t3
ON t3.b = t2.b
GROUP by t2.b
ORDER BY b ASC;
Something like this? If not, can you give an example of the output you'd like to get.
select * from t1
left join t3 using(a)
left join t2 using(b)