I am trying to update a column (oid) in one table with the OID values from a column in another table, subject to a condition.
Example :
Customer Table :
------------------
CID name oid
-------------------
1 abc null
2 abc null
3 abc null
4 xyz null
--------------------
Order Table
--------------
OID name
--------------
10 abc
11 abc
12 abc
13 xyz
--------------
Output should be:
Customer Table :
------------------
CID name oid
-------------------
1 abc 10
2 abc 11
3 abc 12
4 xyz 13
--------------------
I have tried the following:
UPDATE customer as c, order as o
SET c.oid = o.OID
WHERE c.name = o.name;
-----------------------------
update customer INNER JOIN order on customer.name=Order.name
SET customer.oid=Order.OID
where customer.oid IS null;
But the customer table is being updated as follows:
Customer Table :
------------------
CID name oid
-------------------
1 abc 10
2 abc 10
3 abc 10
4 xyz 13
--------------------
The idea is to assign a row number to each entry in the Customer table and in the Order table.
The inner join between the two tables then has two conditions instead of one (previously it was only the name).
One condition is the name and the other is the row number.
You can go with this query:
UPDATE Customer CT
INNER JOIN (
    SELECT
        customerTable.CID,
        orderTable.OID
    FROM (
        -- number the customers in CID order
        SELECT
            *,
            @rn1 := @rn1 + 1 AS `row_number`
        FROM
            Customer C
            CROSS JOIN (SELECT @rn1 := 0) var
        ORDER BY CID
    ) AS customerTable
    INNER JOIN (
        -- number the orders in OID order
        SELECT
            *,
            @rn2 := @rn2 + 1 AS `row_number`
        FROM
            `Order` O
            CROSS JOIN (SELECT @rn2 := 0) var
        ORDER BY OID
    ) AS orderTable ON customerTable.name = orderTable.name
        -- row_number is quoted because ROW_NUMBER is a reserved word in MySQL 8.0
        AND customerTable.`row_number` = orderTable.`row_number`
) AS combinedTable ON CT.CID = combinedTable.CID
SET CT.oid = combinedTable.OID;
Note: Joining the two tables on matching name alone is not sufficient for what you are looking for. That is why, besides matching on name, a row number is assigned to each row in both the Customer and the Order table. The inner join is then made on matching name and row number, which prevents one entry from being joined multiple times with entries from the other table.
TEST SCHEMA & DATA:
Couldn't add an sql fiddle
DROP TABLE IF EXISTS `customer`;
CREATE TABLE `customer` (
`CID` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(100) NOT NULL,
`oid` int(11) DEFAULT NULL,
PRIMARY KEY (`CID`)
);
INSERT INTO `customer` VALUES ('1', 'abc', null);
INSERT INTO `customer` VALUES ('2', 'abc', null);
INSERT INTO `customer` VALUES ('3', 'abc', null);
INSERT INTO `customer` VALUES ('4', 'xyz', null);
DROP TABLE IF EXISTS `order`;
CREATE TABLE `order` (
`OID` int(11) NOT NULL,
`name` varchar(100) NOT NULL
);
INSERT INTO `order` VALUES ('10', 'abc');
INSERT INTO `order` VALUES ('11', 'abc');
INSERT INTO `order` VALUES ('12', 'abc');
INSERT INTO `order` VALUES ('13', 'xyz');
Now see what the Customer table looks like after the update:
SELECT
*
FROM Customer;
Output:
CID name oid
1 abc 10
2 abc 11
3 abc 12
4 xyz 13
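If MySQL 8.0 or later is available, the same pairing could probably be written with ROW_NUMBER() instead of user variables. This is only a sketch against the test schema above. The aliases ct, c, o, pairs and rn are made up for illustration, and numbering the rows per name (PARTITION BY name) is my assumption about the pairing you want:

```sql
-- Sketch only: assumes MySQL 8.0+ (window functions) and the test schema above.
-- Aliases ct, c, o, pairs and rn are illustrative, not from the original query.
UPDATE customer AS ct
JOIN (
    SELECT c.CID, o.OID
    FROM (
        -- number the customers within each name, in CID order
        SELECT CID, name,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY CID) AS rn
        FROM customer
    ) AS c
    JOIN (
        -- number the orders within each name, in OID order
        SELECT OID, name,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY OID) AS rn
        FROM `order`
    ) AS o
      ON o.name = c.name AND o.rn = c.rn
) AS pairs ON pairs.CID = ct.CID
SET ct.oid = pairs.OID;
```

Because the numbering restarts for every name, the n-th customer called abc is paired with the n-th order called abc regardless of how rows with other names are interleaved.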
This is very complicated to do. You need to assign a counter variable to each value -- which is a bit painful in an update statement. But something like this should work:
update customer c join
     (select c.*,
             (@rn := if(@n = name, @rn + 1,
                        if(@n := name, 1, 1)
                       )
             ) as rn
      from customer c cross join
           (select @n := '', @rn := 0) params
      order by name, cid
     ) cc
     on c.cid = cc.cid join
     (select o.*,
             (@rno := if(@no = name, @rno + 1,
                         if(@no := name, 1, 1)
                        )
             ) as rn
      -- the question's table is named Order, which must be quoted
      from `order` o cross join
           (select @no := '', @rno := 0) params
      order by name, oid
     ) o
     -- the per-name counter lives in cc, not in c
     on cc.name = o.name and cc.rn = o.rn
set c.oid = o.oid;
The issue that we are trying to tackle is best shown with the following illustrative example:
CREATE TABLE table_1
(
id INT UNSIGNED AUTO_INCREMENT,
colA INT,
colB VARCHAR(10),
PRIMARY KEY(id)
);
CREATE TABLE table_2
(
id INT UNSIGNED AUTO_INCREMENT,
colY INT,
colZ VARCHAR(10),
PRIMARY KEY(id)
);
INSERT INTO table_1(colA, colB) VALUES(1, 'NPD5A6V9EI'), (2, 'ISO4IK42YQ'), (4, 'J12QAN4O42'), (6,'V8YTZFHCU4');
INSERT INTO table_2(colY, colZ) VALUES(3, 'RBUNWLO753'), (4, 'X2BCEY7O8B'), (5, 'BNUS7R4225'), (6, '72NOWCTH5G');
We would like to select our result based on the value of colA in table_1, but if that does not return a result, we would like to return our result based on the value of colY in table_2. In other words, SELECTing from table_2 is the backup for SELECTing from table_1. The query returns NULL only if neither table satisfies the condition.
A pseudo SQL query could be:
SELECT colB FROM table_1 where colA = 3 OR SELECT colZ FROM table_2 where colY = 3;
The query should return output based on the following I/O table:
I O
= =
1 NPD5A6V9EI -- From table_1
2 ISO4IK42YQ -- From table_1
3 RBUNWLO753 -- From table_2
4 J12QAN4O42 -- From table_1 (has precedence over table_2 entry)
5 BNUS7R4225 -- From table_2
6 V8YTZFHCU4 -- From table_1 (has precedence over table_2 entry)
9 NULL
Kindly suggest solutions that:
make use of the latest DB features (for posterity)
work with MySQL version 5.6.51 (for our application)
Write a subquery that generates all the I rows that you want.
Then left join this with the two tables, and use IFNULL to take the matching value from table_1 in preference to table_2.
SELECT ids.id AS I, IFNULL(t1.colB, t2.colZ) AS O
FROM (SELECT 1 AS id UNION ALL SELECT 2 UNION ALL SELECT 3 ... UNION ALL SELECT 9) AS ids
LEFT JOIN table_1 AS t1 ON t1.colA = ids.id
LEFT JOIN table_2 AS t2 ON t2.colY = ids.id
ORDER BY ids.id
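For looking up a single input value, a shorter form might be enough. This is a sketch, assuming the lookup value (3 here) is passed in as a literal or parameter and that each table has at most one row per value (the LIMIT 1 guards against more). It uses only scalar subqueries and COALESCE, so it should also run on MySQL 5.6:

```sql
-- Sketch only: try table_1 first, fall back to table_2, NULL if neither matches.
SELECT COALESCE(
    (SELECT colB FROM table_1 WHERE colA = 3 LIMIT 1),
    (SELECT colZ FROM table_2 WHERE colY = 3 LIMIT 1)
) AS O;
-- With the sample data: 3 gives RBUNWLO753, 4 gives J12QAN4O42, 9 gives NULL.
```

One caveat: if table_1 has a matching row whose colB is itself NULL, this falls through to table_2, just like the IFNULL version above.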
I simply don't know where you get your last row from.
Also, with MySQL 8 you can use the window function ROW_NUMBER.
The rest is self-explanatory: the sorting comes from colA and colY (aliased orderby1), and when the numbers are the same, the second column orderby2 decides, so the row from the first table sorts first.
SELECT @i := @i + 1 AS I,
       colB AS O
FROM
    (SELECT colA AS orderby1, colB, 1 AS orderby2 FROM table_1
     UNION
     SELECT colY, colZ, 2 FROM table_2) t1, (SELECT @i := 0) t2
ORDER BY orderby1, orderby2
I | O
-: | :---------
1 | NPD5A6V9EI
2 | ISO4IK42YQ
3 | RBUNWLO753
4 | J12QAN4O42
5 | X2BCEY7O8B
6 | BNUS7R4225
7 | V8YTZFHCU4
8 | 72NOWCTH5G
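The ROW_NUMBER variant mentioned for MySQL 8 could look roughly like this. It is a sketch, not tested against the fiddle. The alias t is made up, and UNION ALL is used because the two tables share no identical rows in the sample data:

```sql
-- Sketch only: assumes MySQL 8.0+; numbers the combined rows without user variables.
SELECT ROW_NUMBER() OVER (ORDER BY orderby1, orderby2) AS I,
       colB AS O
FROM (
    SELECT colA AS orderby1, colB, 1 AS orderby2 FROM table_1
    UNION ALL
    SELECT colY, colZ, 2 FROM table_2
) AS t
ORDER BY I;
```

The window function fixes the numbering order explicitly, so it does not depend on when MySQL evaluates a user variable relative to the sort.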
I'm trying to find the count of posts grouped by branch and category. I'm not getting the categories with count 0.
CREATE TABLE branches
(`id` serial primary key, `name` varchar(7) unique)
;
INSERT INTO branches
(`id`, `name`)
VALUES
(1, 'branch1'),
(2, 'branch2'),
(3, 'branch3')
;
CREATE TABLE categories
(`id` serial primary key, `category` varchar(4) unique)
;
INSERT INTO categories
(`id`, `category`)
VALUES
(1, 'cat1'),
(2, 'cat2')
;
CREATE TABLE posts
(`id` serial primary key, `branch_id` int, `category_id` int, `title` varchar(6), `created_at` varchar(10))
;
INSERT INTO posts
(`id`, `branch_id`, `category_id`, `title`, `created_at`)
VALUES
(1, 1, 1, 'Title1', '2017-12-14'),
(2, 1, 2, 'Title2', '2018-01-05'),
(3, 2, 1, 'Title3', '2018-01-10')
;
Expected Output:
+---------+----------+----+----+
| branch | category | c1 | c2 |
+---------+----------+----+----+
| branch1 | cat1 | 1 | 0 |
| branch1 | cat2 | 0 | 1 |
| branch2 | cat1 | 0 | 1 |
| branch2 | cat2 | 0 | 0 |
+---------+----------+----+----+
Query tried:
SELECT b.name, x.c1, y.c2 FROM branches b
LEFT JOIN (
SELECT COUNT(id) c1 FROM posts WHERE created_at < '2018-01-01'
GROUP BY posts.branch_id, posts.category_id
) x ON x.branch_id = b.id
LEFT JOIN (
SELECT COUNT(id) c2 FROM posts WHERE created_at BETWEEN '2018-01-01' AND '2018-01-31'
GROUP BY posts.branch_id, posts.category_id
) y ON y.branch_id = b.id
GROUP BY b.id
You need to CROSS JOIN branches and categories first; then LEFT JOIN to posts and do conditional counts based on your WHERE criteria.
Generic format:
SELECT x.data, y.data
, COUNT(CASE WHEN conditionN THEN 1 ELSE NULL END) AS cN
FROM x CROSS JOIN y
LEFT JOIN z ON x.id = z.x_id AND y.id = z.y_id
GROUP BY x.data, y.data
;
Note: COUNT (and pretty much all aggregate functions) ignore NULL values.
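Applied to the tables from the question, the generic format might look like the following. Treat it as a sketch: the aliases b, c and p are mine, and the date ranges are copied from the query you tried.

```sql
-- Sketch only: every branch/category pair first, then conditional counts over posts.
SELECT b.name AS branch,
       c.category,
       -- CASE without ELSE yields NULL, which COUNT ignores
       COUNT(CASE WHEN p.created_at < '2018-01-01' THEN 1 END) AS c1,
       COUNT(CASE WHEN p.created_at BETWEEN '2018-01-01' AND '2018-01-31' THEN 1 END) AS c2
FROM branches AS b
CROSS JOIN categories AS c
LEFT JOIN posts AS p
       ON p.branch_id = b.id AND p.category_id = c.id
GROUP BY b.name, c.category
ORDER BY b.name, c.category;
```

Note that this also returns branch3 with zero counts for both categories, which your expected output does not show. If branches without any posts should be left out, an extra filter on branches would be needed.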
It looks like this might do what you want.
Explanation: Get each possible combination of branch/category for branches which exists in posts. Do a conditional sum to get the counts by date range and branch/category. Then join back to branch.
SELECT b.b_id branch,
b.category,
COALESCE(Range_Sum.C1,0) C1,
COALESCE(Range_Sum.C2,0) C2
FROM ( SELECT b.id b_id,
c.id c_id,
c.category
FROM branches b,
categories c
WHERE EXISTS
( SELECT 1
FROM posts
WHERE b.id = posts.branch_id
)
) b
LEFT
JOIN (SELECT p.branch_id,
c.id c_id,
c.category,
SUM
( CASE WHEN p.created_at < '2018-01-01' THEN 1
ELSE 0
END
) C1,
SUM
( CASE WHEN p.created_at BETWEEN '2018-01-01' AND '2018-01-31' THEN 1
ELSE 0
END
) C2
FROM posts p
INNER
JOIN categories c
ON p.category_id = c.id
GROUP
BY p.branch_id,
c.category,
c.id
) Range_Sum
ON b.b_id = Range_Sum.branch_id
AND b.c_id = Range_Sum.c_id;
Also, just a thing for writing easily readable queries - NEVER use x and y as aliases. Choose anything else that could possibly be more informative.
Maybe a little contrived...
SELECT DISTINCT x.branch_id
, y.category_id
, COALESCE(z.created_at < '2018-01-01',0) c1
, COALESCE(z.created_at BETWEEN '2018-01-01' AND '2018-01-31',0) c2
FROM posts x
JOIN posts y
LEFT
JOIN posts z
ON z.branch_id = x.branch_id
AND z.category_id = y.category_id;
http://sqlfiddle.com/#!9/8aabf2/31
I have this query:
select a.*,
GROUP_CONCAT(b.`filename`) as `filesnames`
from `school_classes` a
join classes_data.`classes_albums` b on a.`school_key` = b.`school_key`
and a.`class_key` = b.`class_key`
group by a.`ID`
The result of this query is
I want to add to it
ORDER BY b.added_date DESC LIMIT 2
so that the filesnames column shows only the latest 2 files. How can I do that?
It is not clear from your question what your tables look like and how they relate but this might be what you are after.
drop table if exists school_classes;
create table school_classes (id int,school_key int, class_key int);
drop table if exists classes_albums;
create table classes_albums(id int,school_key int, class_key int, filename varchar(3),dateadded date);
insert into school_classes values
(1,1,1), (2,1,2),(3,1,3);
insert into classes_albums values
(1,1,1,'a','2017-01-01'),(1,1,1,'b','2017-02-01'),(1,1,1,'c','2017-03-01');
select a.*, b.filenames
from school_classes a
join
(
select c.school_key,c.class_key,group_concat(c.filename order by c.rn desc) filenames
from
(
select c.*,
if(concat(c.school_key,c.class_key) <> @p, @rn:=1, @rn:=@rn+1) rn,
@p:=concat(c.school_key,c.class_key) p
from classes_albums c, (select @rn:=0, @ck:=0,@sk:=0) rn
order by c.school_key,c.class_key, c.dateadded desc
) c
where c.rn < 3
group by c.school_key,c.class_key
) b on b.school_key = a.school_key and b.class_key = a.class_key
+------+------------+-----------+-----------+
| id | school_key | class_key | filenames |
+------+------------+-----------+-----------+
| 1 | 1 | 1 | b,c |
+------+------------+-----------+-----------+
1 row in set (0.02 sec)
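If the file names never contain a comma, a shorter alternative might be to sort inside GROUP_CONCAT and cut the list with SUBSTRING_INDEX. This is a sketch against the test tables above (so the date column is dateadded, not added_date), and it assumes the concatenated list fits within group_concat_max_len:

```sql
-- Sketch only: keep the first 2 comma-separated entries of a newest-first list.
SELECT a.id, a.school_key, a.class_key,
       SUBSTRING_INDEX(
           GROUP_CONCAT(b.filename ORDER BY b.dateadded DESC),
           ',', 2) AS filenames
FROM school_classes AS a
JOIN classes_albums AS b
  ON b.school_key = a.school_key AND b.class_key = a.class_key
GROUP BY a.id, a.school_key, a.class_key;
```

With the sample rows this lists the two newest files newest first (c,b), whereas the variable-based query above lists them oldest first (b,c).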
Table 1:mydata_table
Multiple Columns. But I want to fetch 3 columns' distinct records.
Email_Office | Email_Personal1 | Email_Personal2
Table 2:unique_emails
Only 3 Columns. Email_Office | Email_Personal1 | Email_Personal2
I tried this code for all three columns of table 2 to insert distinct records from each column.
insert into unique_emails
(email_personal1)
select distinct email_personal1
from mydata_table
where Email_Personal1!=""
It inserts the records, but the problem is this: if there are 100 rows in the email_office column and 300 rows in email_personal1, the first 100 rows hold email_office values with the other two columns empty, and from the 101st row onward the rows hold email_personal1 values with the other two columns empty. I want the values inserted together in the same rows. How can I do that?
Try creating a unique index on the columns that need to be unique, and use the first SQL proposed by @Sergey with IGNORE:
INSERT IGNORE INTO
unique_emails
(email_office, email_personal1, email_personal2)
SELECT DISTINCT
email_office,
email_personal1,
email_personal2
FROM
mydata_table
WHERE
email_office!=""
OR Email_Personal1!=""
OR Email_Personal2!=""
If you want to insert unique triples of e-mails, you should use
insert into unique_emails
(email_office, email_personal1, email_personal2)
select distinct email_office, email_personal1, email_personal2
from mydata_table
where email_office!="" OR Email_Personal1!="" OR Email_Personal2!=""
But in that case some emails may occur multiple times in table unique_emails.
Suppose Table 1 consists of these records:
Email_Office | Email_Personal1 | Email_Personal2
------------------------------------------------
a@a.com      | b@b.com         | c@c.com
a@a.com      | b@b.com         | c@c.com
a@a.com      | b@b.com         | e@e.com
a@a.com      | c@c.com         | NULL
a@a.com      | d@d.com         | e@e.com
Then the resulting Table 2 is:
Email_Office | Email_Personal1 | Email_Personal2
------------------------------------------------
a@a.com      | b@b.com         | c@c.com
             | c@c.com         | e@e.com
             | d@d.com         |
UPD.
Try this: (SQL Fiddle)
insert into unique_emails
(email_office, email_personal1, email_personal2)
select b.email_office, c.email_personal1, d.email_personal2
from (
    select @i := @i + 1 AS pos from mydata_table, (select @i := 0) r) a
left join (
    select t.*, @j := @j + 1 AS pos
    from (select distinct email_office from mydata_table where email_office!="") t,
         (select @j := 0) r) b on b.pos = a.pos
left join (
    select t.*, @k := @k + 1 AS pos
    from (select distinct email_personal1 from mydata_table where email_personal1!="") t,
         (select @k := 0) r) c on c.pos = a.pos
left join (
    select t.*, @l := @l + 1 AS pos
    from (select distinct email_personal2 from mydata_table where email_personal2!="") t,
         (select @l := 0) r) d on d.pos = a.pos
where b.email_office is not null or c.email_personal1 is not null or d.email_personal2 is not null
Not sure how to describe what I'm trying to get from this question, but here goes...
I have a table of customer purchases, 't1', with information about the purchase: customer id, date, boolean if the customer was alone, and the purchase amount. In a second table t2 is another list of the same customer ids with a date and a boolean saying whether they were alone.
I want to update the second table with the AVERAGE of values of the previous x purchases they did before that date with whether they were alone.
I set up the tables with:
DROP TABLE IF EXISTS t1;
DROP TABLE IF EXISTS t2;
CREATE TABLE t1 (cid INT, d DATE, i INT, v FLOAT);
INSERT INTO t1 (cid, d,i,v) VALUES (1,'2001-01-01', 0, 10);
INSERT INTO t1 (cid, d,i,v) VALUES (1,'2001-01-02', 1, 20);
INSERT INTO t1 (cid, d,i,v) VALUES (1,'2001-01-03', 1, 30);
INSERT INTO t1 (cid, d,i,v) VALUES (1,'2001-01-04', 1, 40);
INSERT INTO t1 (cid, d,i,v) VALUES (1,'2001-01-05', 0, 50);
INSERT INTO t1 (cid, d,i,v) VALUES (1,'2001-01-06', 0, 60);
INSERT INTO t1 (cid, d,i,v) VALUES (1,'2001-01-07', 0, 70);
INSERT INTO t1 (cid, d,i,v) VALUES (1,'2001-01-08', 1, 80);
INSERT INTO t1 (cid, d,i,v) VALUES (1,'2001-01-09', 0, 90);
INSERT INTO t1 (cid, d,i,v) VALUES (2,'2001-01-04', 1, 35);
CREATE TABLE t2 (cid INT, d DATE, i INT, av2 FLOAT, av3 FLOAT);
INSERT INTO t2 (cid, d,i) VALUES (1,'2001-01-07', 0);
INSERT INTO t2 (cid, d,i) VALUES (1,'2001-01-08', 1);
INSERT INTO t2 (cid, d,i) VALUES (2,'2001-01-08', 0);
INSERT INTO t2 (cid, d,i) VALUES (2,'2001-01-09', 1);
av2 and av3 are the columns where I want the average of the last 2 or 3 transactions. So I need an update statement (two statements really, one for av2 and one for av3) that says "when this customer came in on this date, and they came in alone or not, what was the average of their last x purchases?"
So the resulting data should be:
cid d i av2 av3
1 2001-01-07 0 55 40
1 2001-01-08 1 35 40
2 2001-01-08 0 null null
2 2001-01-09 1 35 35
The closest I got was this:
UPDATE t2 SET av=(
SELECT AVG(tcol)
FROM (
SELECT v AS tcol FROM t1 LIMIT 2
) AS tt);
which seems to be going in the right direction (the LIMIT 2 is the 2 or 3 from the av columns). But that just averages x purchases regardless of customer or the boolean. As soon as I put in the WHERE clause to link the data, it chokes:
UPDATE t2 SET av=(
SELECT AVG(tcol)
FROM (
SELECT v AS tcol FROM t1 WHERE t1.d<t2.d and t1.i=t2.i LIMIT 2
) AS tt);
Any ideas? Is there a name for what I'm trying to do? Do I need to describe this differently? Any suggestions?
Thanks,
Philip
Try this one -
SET @r = 0;
SET @cid = NULL;
SET @i = NULL;
UPDATE t2 JOIN (
  SELECT t.cid, t.d, t.i, AVG(IF(t.r < 3, t.v, NULL)) av2, AVG(IF(t.r < 4, t.v, NULL)) av3 FROM (
    SELECT t.*, IF(@cid = t.cid AND @i = t.i, @r := @r + 1, @r := 1) AS r, @cid := t.cid, @i := t.i FROM (
      SELECT t2.*, t1.v FROM t2
      JOIN t1
        ON t1.cid = t2.cid AND t1.d < t2.d AND t1.i = t2.i
      ORDER BY t2.cid, t2.i, t1.d DESC
    ) t
  ) t
  GROUP BY t.cid, t.i, t.d
) t
ON t2.cid = t.cid AND t2.i = t.i AND t2.d = t.d
SET t2.av2 = t.av2, t2.av3 = t.av3;
SELECT * FROM t2;
+------+------------+------+------+------+
| cid | d | i | av2 | av3 |
+------+------------+------+------+------+
| 1 | 2001-01-07 | 0 | 55 | 40 |
| 1 | 2001-01-08 | 1 | 35 | 30 |
| 2 | 2001-01-08 | 0 | NULL | NULL |
| 2 | 2001-01-09 | 1 | 35 | 35 |
+------+------------+------+------+------+
Note:
The value av3 for cid=1, d=2001-01-08, i=1 should be 30, right? ...not 40.
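On MySQL 8.0 or later the ranking could probably be done with ROW_NUMBER() instead of session variables. This is a sketch against the t1 and t2 tables from the question. The aliases ranked, averages and rn are made up, and partitioning by the t2 date as well as by cid and i is my assumption, so that each t2 row gets its own "last x purchases" even when a customer has several t2 rows with the same i:

```sql
-- Sketch only: assumes MySQL 8.0+ and the t1/t2 tables from the question.
UPDATE t2
JOIN (
    SELECT cid, d, i,
           AVG(CASE WHEN rn <= 2 THEN v END) AS av2,  -- last 2 prior purchases
           AVG(CASE WHEN rn <= 3 THEN v END) AS av3   -- last 3 prior purchases
    FROM (
        -- rank each t2 row's earlier matching purchases, newest first
        SELECT t2.cid, t2.d, t2.i, t1.v,
               ROW_NUMBER() OVER (PARTITION BY t2.cid, t2.d, t2.i
                                  ORDER BY t1.d DESC) AS rn
        FROM t2
        JOIN t1
          ON t1.cid = t2.cid AND t1.i = t2.i AND t1.d < t2.d
    ) AS ranked
    GROUP BY cid, d, i
) AS averages
  ON averages.cid = t2.cid AND averages.d = t2.d AND averages.i = t2.i
SET t2.av2 = averages.av2, t2.av3 = averages.av3;
```

Rows in t2 with no earlier matching purchase (cid=2, i=0 in the sample data) find no partner in the join and keep NULL in both columns.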