I'm trying to join distinct IDs from a subquery in a FROM clause onto a table that has the same IDs, but non-distinct, as they are repeated to make up a whole entity. How can one do this? All of my attempts end up returning a single row per ID from the non-distinct-ID table.
For example:
Table 1
ID  val_string  val_int  val_datetime
1   null        3435     null
1   bla         null     null
1   null        null     2013-08-27
2   null        428      null
2   blob        null     null
2   null        null     2013-08-30
etc. etc. etc.
Virtual "v_table" from SubQuery
ID
1
2
Now, if I create the query along the lines of:
SELECT t.ID, t.val_string, t.val_int, t.val_datetime
FROM table1 AS t
JOIN (subquery) AS v_table
ON t.ID = v_table.ID
I get the result:
Result Table:
ID  val_string  val_int  val_datetime
1   null        3435     null
2   null        428      null
What I'd like is to see the whole of Table 1 based on this example. (The actual query has some more parameters, but this is the issue I'm stuck on.)
How would I go about making sure that I get everything from Table 1 where the IDs match the IDs from a virtual table?
SELECT t.ID, t.val_string, t.val_int, t.val_datetime
FROM table1 AS t
LEFT JOIN (subquery) AS v_table
ON t.ID = v_table.ID
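To check what a join against a distinct-ID derived table returns for the sample data, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption). The asker's real subquery is not shown, so `SELECT DISTINCT ID FROM table1` is a hypothetical placeholder for it. With one row per ID in the derived table, the join keeps every matching row of table1, which hints that a collapse to one row per ID comes from some other part of the real query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (ID INTEGER, val_string TEXT, val_int INTEGER, val_datetime TEXT)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        (1, None, 3435, None),
        (1, "bla", None, None),
        (1, None, None, "2013-08-27"),
        (2, None, 428, None),
        (2, "blob", None, None),
        (2, None, None, "2013-08-30"),
    ],
)

# Hypothetical stand-in for the asker's subquery: one row per distinct ID.
rows = conn.execute(
    """
    SELECT t.ID, t.val_string, t.val_int, t.val_datetime
    FROM table1 AS t
    JOIN (SELECT DISTINCT ID FROM table1) AS v_table
      ON t.ID = v_table.ID
    ORDER BY t.ID
    """
).fetchall()

# Every row of table1 whose ID appears in v_table is kept.
print(len(rows))
```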
Why is the output different if I self-join the table on a.b instead of a.c? Moreover, why is the count for Y equal to 1 and not 2?
CREATE TABLE A (
B INT,
C CHAR(20)
);
INSERT INTO A VALUES (1,'X'),(2,'X'),(3,'Y'),(1,'T'),(2,'T');
SELECT
*
FROM
A;
SELECT
a.c, COUNT(a.c) AS c1
FROM
A a
JOIN
A a1 ON a.c = a1.c
GROUP BY a.c;
You get duplicate rows when there is no GROUP BY or DISTINCT clause in the query to reduce the results to distinct rows based on the column value. Without one, the query returns a count for every row of the table, which is normal.
Your results differ because you are joining on a different column: if you join on the number column, you are counting matches on the numbers, not the letters.
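To make the counting concrete, here is a small sketch that runs the question's self-join next to a plain GROUP BY, using Python's sqlite3 module in place of MySQL (an assumption). A self-join pairs each row with every row that shares the join value, so a letter that occurs n times yields n × n joined rows. Y occurs once, so its count stays at 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (B INT, C CHAR(20))")
conn.executemany(
    "INSERT INTO A VALUES (?, ?)",
    [(1, "X"), (2, "X"), (3, "Y"), (1, "T"), (2, "T")],
)

# The self-join from the question: counts joined pairs per letter.
join_on_c = dict(
    conn.execute(
        "SELECT a.c, COUNT(a.c) FROM A a JOIN A a1 ON a.c = a1.c GROUP BY a.c"
    ).fetchall()
)

# No join: counts the rows per letter.
plain = dict(conn.execute("SELECT c, COUNT(*) FROM A GROUP BY c").fetchall())

print(join_on_c)
print(plain)
```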
TableA
If your raw table values are as follows:
SELECT * FROM tableA;
id | letter
-----------
1 | X
1 | T
2 | X
2 | T
3 | Y
Example 1
You can manually write a query in your select statement as follows. Effectively, a separate query is performed for every row returned.
EXPLAIN -- show query breakdown
SELECT
DISTINCT -- get distinct letter, no duplicate rows.
a1.letter
, (SELECT count(*) FROM tableA a2 WHERE a1.letter = a2.letter) letter_cnt
FROM
tableA a1
;
This query is less efficient: it needs a second, dependent subquery on top of the outer query.
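id | select_type        | table | type | possible_keys | key  | ref  | rows | filtered | Extra
-----------------------------------------------------------------------------------------------------
1  | PRIMARY            | a1    | ALL  | NULL          | NULL | NULL | 5    | 100      | NULL
2  | DEPENDENT SUBQUERY | a2    | ALL  | NULL          | NULL | NULL | 5    | 20       | Using where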
letter | letter_cnt
-------------------
X | 2
Y | 1
T | 2
Example 2
EXPLAIN -- show query breakdown
SELECT
a.letter
, COUNT(a.letter) AS cnt
FROM
tableA a
GROUP BY
a.letter;
This method is more efficient: it uses a single query and groups by the first column, giving you one row per distinct letter.
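id | select_type | table | type | possible_keys | key  | ref  | rows | filtered | Extra
-------------------------------------------------------------------------------------------------------------
1  | SIMPLE      | a     | ALL  | NULL          | NULL | NULL | 5    | 100      | Using temporary; Using filesort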
letter | cnt
------------
T | 2
X | 2
Y | 1
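As a quick check that Example 1 and Example 2 produce the same counts, the sketch below runs both against the sample data using Python's sqlite3 module (an assumption, standing in for MySQL). EXPLAIN is left out because its output differs between engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (id INT, letter CHAR(1))")
conn.executemany(
    "INSERT INTO tableA VALUES (?, ?)",
    [(1, "X"), (1, "T"), (2, "X"), (2, "T"), (3, "Y")],
)

# Example 1: correlated subquery in the SELECT list.
correlated = dict(
    conn.execute(
        """
        SELECT DISTINCT a1.letter,
               (SELECT COUNT(*) FROM tableA a2 WHERE a1.letter = a2.letter) AS letter_cnt
        FROM tableA a1
        """
    ).fetchall()
)

# Example 2: a single GROUP BY.
grouped = dict(
    conn.execute(
        "SELECT a.letter, COUNT(a.letter) AS cnt FROM tableA a GROUP BY a.letter"
    ).fetchall()
)

print(correlated == grouped)
```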
I have a log table in MySQL (5.7.14) with the following schema:
CREATE TABLE logs
(
id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
entry_date DATE NOT NULL,
original_date DATE NOT NULL,
ref_no VARCHAR(30) NOT NULL
) Engine=InnoDB;
INSERT INTO logs VALUES
(1,'2020-01-01','2020-01-01','XYZ'),
(2,'2020-01-01','2020-01-01','ABC'),
(3,'2020-01-02','2020-01-01','XYZ'),
(4,'2020-01-02','2020-01-01','ABC'),
(5,'2020-01-03','2020-01-02','XYZ'),
(6,'2020-01-03','2020-01-01','ABC');
I want to return the first row for each unique (original_date, ref_no) pairing, where 'first' is defined as 'lowest id'.
For example, if I had the following data:
id|entry_date|original_date|ref_no
--+----------+-------------+------
1 |2020-01-01|2020-01-01 |XYZ
2 |2020-01-01|2020-01-01 |ABC
3 |2020-01-02|2020-01-01 |XYZ
4 |2020-01-02|2020-01-01 |ABC
5 |2020-01-03|2020-01-02 |XYZ
6 |2020-01-03|2020-01-01 |ABC
I would want the query to return:
id|entry_date|original_date|ref_no
--+----------+-------------+------
1 |2020-01-01|2020-01-01 |XYZ
2 |2020-01-01|2020-01-01 |ABC
5 |2020-01-03|2020-01-02 |XYZ
In other words:
Row 1 is returned because we haven't seen 2020-01-01,XYZ before.
Row 2 is returned because we haven't seen 2020-01-01,ABC before.
Row 3 is not returned because we have seen 2020-01-01,XYZ before (row 1).
Row 4 is not returned because we have seen 2020-01-01,ABC before (row 2).
Row 5 is returned because we haven't seen 2020-01-02,XYZ before.
Row 6 is not returned because we have seen 2020-01-01,ABC before (row 2).
Is there a way to do this directly in SQL? I've considered DISTINCT but I think that only returns the distinct columns, whereas I want the full row.
To avoid a correlated subquery you can do:
select l.*
from logs l
join (
select original_date, ref_no, min(id) as min_id
from logs
group by original_date, ref_no
) x on l.id = x.min_id
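A runnable sketch of this derived-table join against the question's sample data, using Python's sqlite3 module in place of MySQL 5.7 (an assumption; the column types are simplified to SQLite's INTEGER and TEXT). The ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE logs (
        id INTEGER PRIMARY KEY,
        entry_date TEXT NOT NULL,
        original_date TEXT NOT NULL,
        ref_no TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?, ?)",
    [
        (1, "2020-01-01", "2020-01-01", "XYZ"),
        (2, "2020-01-01", "2020-01-01", "ABC"),
        (3, "2020-01-02", "2020-01-01", "XYZ"),
        (4, "2020-01-02", "2020-01-01", "ABC"),
        (5, "2020-01-03", "2020-01-02", "XYZ"),
        (6, "2020-01-03", "2020-01-01", "ABC"),
    ],
)

# Find the lowest id per (original_date, ref_no), then join back for the full row.
first_rows = conn.execute(
    """
    SELECT l.*
    FROM logs l
    JOIN (
        SELECT original_date, ref_no, MIN(id) AS min_id
        FROM logs
        GROUP BY original_date, ref_no
    ) x ON l.id = x.min_id
    ORDER BY l.id
    """
).fetchall()

for row in first_rows:
    print(row)
```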
You can use a correlated subquery:
select l.*
from logs l
where l.id = (select min(l2.id)
from logs l2
where l2.original_date = l.original_date and
l2.ref_no = l.ref_no
);
For performance, you want an index on logs(original_date, ref_no, id).
Try this:
select t1.*
from logs AS t1
left join logs AS t2 on
(
t2.original_date = t1.original_date and
t2.ref_no = t1.ref_no and
t2.id < t1.id
)
where
t2.original_date is null and
t2.ref_no is null
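This is an anti-join: a row survives only if no row with the same (original_date, ref_no) and a lower id exists. A minimal sketch against the question's data, using Python's sqlite3 module rather than MySQL (an assumption), with only the columns the query needs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (id INTEGER PRIMARY KEY, entry_date TEXT, original_date TEXT, ref_no TEXT)"
)
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?, ?)",
    [
        (1, "2020-01-01", "2020-01-01", "XYZ"),
        (2, "2020-01-01", "2020-01-01", "ABC"),
        (3, "2020-01-02", "2020-01-01", "XYZ"),
        (4, "2020-01-02", "2020-01-01", "ABC"),
        (5, "2020-01-03", "2020-01-02", "XYZ"),
        (6, "2020-01-03", "2020-01-01", "ABC"),
    ],
)

# LEFT JOIN each row to any earlier row of the same pair; keep rows with no match.
kept_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT t1.id
        FROM logs AS t1
        LEFT JOIN logs AS t2
          ON t2.original_date = t1.original_date
         AND t2.ref_no = t1.ref_no
         AND t2.id < t1.id
        WHERE t2.id IS NULL
        ORDER BY t1.id
        """
    )
]

print(kept_ids)
```

Testing `t2.id IS NULL` is equivalent here to the two NULL checks in the answer, because every column of t2 is NULL when the left join finds no earlier row.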
For example, I have table 1:
ID USER Password
1 name1 pass1
2 name2 pass2
Table 2 is empty but has these columns:
ListID FNAME LNAME
null null null
null null null
I want to copy the records from table 1 to table 2, or join the tables together, so that every record inserted into table 1 is inserted into table 2 as well.
How about adding a relational ID to both tables and truncating it?
I have a table like this:
userid | trackid | path
123 70000 ad
123 NULL abc.com
123 NULL Apply
345 70001 Apply
345 70001 Apply
345 NULL Direct
345 NULL abc.com
345 NULL cdf.com
I want a query that returns a result like this: each row with path='abc.com' adds 1 to num_website, and each row with path='Apply' adds 1 to num_Apply.
userid | num_website | num_Apply | num_website/num_Apply
123 1 1 1
345 1 2 0.5
My query so far looks like this:
select * from
(select userid,count(path) as is_CWS
from TABLE
where path='abc.com'
group by userid
having count(path)>1) a1
JOIN
(select userid,count(userid) as Apply_num from TABLE
where trackid is not NULL
group by userid) a2
on a1.userid=a2.userid
My questions are:
1. How can I get the num_website/num_apply field using my query above?
2. Is there an easier way to get the result I want?
Any pointers would be appreciated.
The simplest way to do it would be to change the select line:
SELECT a1.userid, a1.is_CWS, a2.Apply_num, a1.is_CWS/a2.Apply_num FROM
(select userid,count(path) as is_CWS
from TABLE
where path='abc.com'
group by userid
having count(path)>1) a1
JOIN
(select userid,count(userid) as Apply_num
from TABLE
where trackid is not NULL
group by userid) a2
on a1.userid=a2.userid
and then continue with the rest of your query as you have it. The star means "select everything." If you want to select only a few things, list them in place of the star, and if you want to select other values computed from those things, put those expressions in place of the star as well. In this case a1.is_CWS/a2.Apply_num is an expression, and MySQL knows how to evaluate it from the values of a1.is_CWS and a2.Apply_num.
In the same vein, you can do a lot of what those subqueries are doing in a single expression instead of a subquery. objectNotFound has the right idea. Instead of using a subquery to retrieve the number of rows with a certain attribute, you can select SUM(path='abc.com') as is_CWS, and you no longer need that subquery or its join. Making that change gives us:
SELECT a1.userid,
SUM(a1.path='abc.com') as is_CWS,
a2.Apply_num,
SUM(a1.path='abc.com')/a2.Apply_num as ratio FROM
TABLE a1
JOIN
(select userid,count(userid) as Apply_num
FROM TABLE
where trackid is not NULL
group by userid) a2
on a1.userid=a2.userid
GROUP BY a1.userid, a2.Apply_num
Notice I moved the GROUP BY to the end of the query and gave the outer table the alias a1 so that a1.userid still resolves. Also notice that is_CWS is no longer a column of a subquery: MySQL does not allow one expression in the SELECT list to reference another's alias, so the ratio repeats SUM(a1.path='abc.com') instead of writing is_CWS/a2.Apply_num.
You can do the same thing to the other subquery; then the two sums can share the GROUP BY clause and you won't need the join anymore.
To get you started (you can build on top of this):
select
userid,
SUM(CASE WHEN path='abc.com' then 1 else 0 end) as num_website,
SUM(CASE WHEN path='Apply' and trackid is not NULL then 1 else 0 end ) as Apply_Num
from TABLE
WHERE path='abc.com' or path='Apply' -- may not need this ... play with it
group by userid
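Building on that, here is a sketch that also computes the ratio column from the question, run with Python's sqlite3 module instead of MySQL (an assumption). The table name `visits` is hypothetical, since the question only says TABLE. This version counts every path='Apply' row, following the rule stated in the question, so it drops the trackid condition used above. The `1.0 *` forces non-integer division in SQLite, and NULLIF avoids dividing by zero for users with no Apply rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (userid INTEGER, trackid INTEGER, path TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (123, 70000, "ad"),
        (123, None, "abc.com"),
        (123, None, "Apply"),
        (345, 70001, "Apply"),
        (345, 70001, "Apply"),
        (345, None, "Direct"),
        (345, None, "abc.com"),
        (345, None, "cdf.com"),
    ],
)

# One pass over the table: conditional sums replace both subqueries and the join.
rows = conn.execute(
    """
    SELECT userid,
           SUM(CASE WHEN path = 'abc.com' THEN 1 ELSE 0 END) AS num_website,
           SUM(CASE WHEN path = 'Apply' THEN 1 ELSE 0 END) AS num_apply,
           1.0 * SUM(CASE WHEN path = 'abc.com' THEN 1 ELSE 0 END)
               / NULLIF(SUM(CASE WHEN path = 'Apply' THEN 1 ELSE 0 END), 0) AS ratio
    FROM visits
    GROUP BY userid
    ORDER BY userid
    """
).fetchall()

for row in rows:
    print(row)
```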
I'm looking to create a sql statement that will update a large set of data.
What I have is a table like
id  transid  amount  narative1  narative2  total  active
1   1234     23.2    NULL       NULL       NULL   1
2   1234     120.33  NULL       NULL       NULL   1
3   1235     98.00   NULL       NULL       NULL   1
When there are two rows with the same transid, I need to total them, put the result in the total column of the first row with that transid, put the second amount in narative2 of that first row, and make the second row inactive. Single rows for a transid should be ignored.
The result of what I want to do should be:
id  transid  amount  narative1  narative2  total   active
1   1234     23.2    NULL       120.33     143.53  1
2   1234     120.33  NULL       NULL       NULL    0
3   1235     98.00   NULL       NULL       NULL    1
I know it's a bit of a tongue twister, but...
Ideally I'd like to do this using only MySQL statements. I don't mind running multiple SQL statements, but I want to avoid involving PHP etc. It's a very large set of data.
This will update only those transactions that have exactly 2 rows (not 1 and not 3 or more).
UPDATE mytable mtu
JOIN (
SELECT minid, maxid, mtmin.amount AS minamt, mtmax.amount AS maxamt
FROM (
SELECT MIN(id) AS minid, MAX(id) AS maxid
FROM mytable mti
GROUP BY
transid
HAVING COUNT(*) = 2
) mt
JOIN mytable mtmin
ON mtmin.id = minid
JOIN mytable mtmax
ON mtmax.id = maxid
) mts
ON id IN (minid, maxid)
SET narative2 = CASE id WHEN minid THEN minamt ELSE NULL END,
total = CASE id WHEN minid THEN minamt + maxamt ELSE NULL END,
active = (id = minid)
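To check the end state the question asks for, here is a sketch run with Python's sqlite3 module. SQLite does not accept MySQL's multi-table UPDATE ... JOIN syntax, so this swaps in a different technique: two separate UPDATE statements with subqueries. It is not meant as a drop-in for MySQL, which restricts subqueries on the table being updated. Treat it only as a way to confirm the expected rows on the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE mytable (
        id INTEGER PRIMARY KEY, transid INTEGER, amount REAL,
        narative1 TEXT, narative2 REAL, total REAL, active INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO mytable (id, transid, amount, active) VALUES (?, ?, ?, 1)",
    [(1, 1234, 23.2), (2, 1234, 120.33), (3, 1235, 98.00)],
)

# Step 1: on the first row of each two-row transid, store the partner's amount and the sum.
conn.execute(
    """
    UPDATE mytable
    SET narative2 = (SELECT m2.amount FROM mytable AS m2
                     WHERE m2.transid = mytable.transid AND m2.id <> mytable.id),
        total = amount + (SELECT m2.amount FROM mytable AS m2
                          WHERE m2.transid = mytable.transid AND m2.id <> mytable.id)
    WHERE id IN (SELECT MIN(id) FROM mytable GROUP BY transid HAVING COUNT(*) = 2)
    """
)

# Step 2: deactivate the second row of each two-row transid.
conn.execute(
    """
    UPDATE mytable SET active = 0
    WHERE id IN (SELECT MAX(id) FROM mytable GROUP BY transid HAVING COUNT(*) = 2)
    """
)

rows = conn.execute(
    "SELECT id, narative2, total, active FROM mytable ORDER BY id"
).fetchall()
for row in rows:
    print(row)
```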