SQL Select max value by grouping two columns - mysql

Here is my sql fiddle.
http://sqlfiddle.com/#!2/7f0780/1/0
I have a problem: when I group by two columns to get the max() value, the query returns the wrong associated data.
You will see that the ids are incorrect.
Could someone please help me?
create table table1 (id int,id1 int, id2 int, version int);
insert into table1 values
(1,7,9,1),
(2,7,9,2),
(3,7,9,3),
(4,7,9,4),
(5,9,7,5),
(6,9,7,6);
SELECT max(version),id
FROM table1
group BY
id1,id2
Result:
MAX(VERSION) | ID
4 | 1
6 | 5

Your SQL Query is:
SELECT max(version), id
FROM table1
group BY id1, id2
Note that you are grouping by two columns but selecting neither of them. Instead, you select id, which is neither in the GROUP BY nor inside an aggregate function, so its value comes from an arbitrary row in each group, as explained in the MySQL documentation. My advice is to never use this extension unless you really, really understand what you are doing.
If you want the id associated with the maximum value, you can do it using not exists:
select *
from table1 t
where not exists (select 1
from table1 t1
where t1.id1 = t.id1 and
t1.id2 = t.id2 and
t1.version > t.version
);
That is, select every row from table1 for which no other row with the same id1/id2 pair has a larger version.
EDIT:
I should add that for performance reasons, an index on table1(id1, id2, version) will help this query a lot.
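As a quick check of the not exists approach, here is a minimal sketch that runs it against the sample data from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the index name idx_pair_version is made up for the illustration.

```python
import sqlite3


def latest_rows():
    """Return the row holding the highest version for each id1/id2 pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table table1 (id int, id1 int, id2 int, version int)")
    conn.executemany(
        "insert into table1 values (?, ?, ?, ?)",
        [(1, 7, 9, 1), (2, 7, 9, 2), (3, 7, 9, 3),
         (4, 7, 9, 4), (5, 9, 7, 5), (6, 9, 7, 6)],
    )
    # Index suggested in the answer (the name is hypothetical).
    conn.execute("create index idx_pair_version on table1 (id1, id2, version)")
    rows = conn.execute(
        """
        select t.id, t.id1, t.id2, t.version
        from table1 t
        where not exists (select 1
                          from table1 t1
                          where t1.id1 = t.id1
                            and t1.id2 = t.id2
                            and t1.version > t.version)
        order by t.id
        """
    ).fetchall()
    conn.close()
    return rows


print(latest_rows())  # [(4, 7, 9, 4), (6, 9, 7, 6)]
```

The ids 4 and 6 are the ones that really belong to versions 4 and 6, unlike the 1 and 5 shown in the question's output.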

Related

Select column from selected column subquery [duplicate]

I am running this query on MySQL
SELECT ID FROM (
SELECT ID, msisdn
FROM (
SELECT * FROM TT2
)
);
and it is giving this error:
Every derived table must have its own alias.
What's causing this error?
Every derived table (AKA sub-query) must indeed have an alias, i.e. each query in brackets must be given an alias (AS whatever), which can then be used to refer to it in the rest of the outer query.
SELECT ID FROM (
SELECT ID, msisdn FROM (
SELECT * FROM TT2
) AS T
) AS T
In your case, of course, the entire query could be replaced with:
SELECT ID FROM TT2
I think it's asking you to do this:
SELECT ID
FROM (SELECT ID,
msisdn
FROM (SELECT * FROM TT2) as myalias
) as anotheralias;
But why would you write this query in the first place?
Here's a different example that can't be rewritten without aliases (you can't GROUP BY DISTINCT).
Imagine a table called purchases that records purchases made by customers at stores, i.e. it's a many-to-many table, and the software needs to know which customers have made purchases at more than one store:
SELECT DISTINCT customer_id, SUM(1)
FROM ( SELECT DISTINCT customer_id, store_id FROM purchases)
GROUP BY customer_id HAVING 1 < SUM(1);
...this will break with the error Every derived table must have its own alias. To fix:
SELECT DISTINCT customer_id, SUM(1)
FROM ( SELECT DISTINCT customer_id, store_id FROM purchases) AS custom
GROUP BY customer_id HAVING 1 < SUM(1);
(Note the AS custom alias.)
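To see what the aliased query returns, here is a small sketch with made-up purchase rows, run through Python's built-in sqlite3 module. Two caveats: SQLite does not enforce MySQL's alias rule, so this only shows that the aliased form produces the expected result, and the outer DISTINCT is dropped because GROUP BY already yields one row per customer.

```python
import sqlite3


def multi_store_customers():
    """Return (customer_id, number_of_distinct_stores) for customers
    who bought at more than one store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (customer_id INT, store_id INT)")
    # Hypothetical data: customer 2 only ever shops at store 10.
    conn.executemany(
        "INSERT INTO purchases VALUES (?, ?)",
        [(1, 10), (1, 10), (1, 20),
         (2, 10), (2, 10),
         (3, 20), (3, 30), (3, 40)],
    )
    rows = conn.execute(
        """
        SELECT customer_id, SUM(1) AS store_count
        FROM (SELECT DISTINCT customer_id, store_id FROM purchases) AS custom
        GROUP BY customer_id
        HAVING 1 < SUM(1)
        ORDER BY customer_id
        """
    ).fetchall()
    conn.close()
    return rows


print(multi_store_customers())  # [(1, 2), (3, 3)]
```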
I arrived here after a syntax mistake gave me this error, and I wanted to check whether SO already had adequate answers or whether I could post one myself.
The answers here already explain what this error is, so there is not much more to say, but I will give my 2 cents in my own words:
The error occurs because a subquery in the FROM clause generates a new table.
That's what a derived table is, and as such it needs an alias (that is, a name by which to refer to it).
Given the following hypothetical query:
SELECT id, key1
FROM (
SELECT t1.ID id, t2.key1 key1, t2.key2 key2, t2.key3 key3
FROM table1 t1
LEFT JOIN table2 t2 ON t1.id = t2.id
WHERE t2.key3 = 'some-value'
) AS tt
In the end, the whole subquery inside the FROM clause produces a table that is aliased as tt and has the columns id, key1, key2 and key3.
Then, with the outer SELECT, we select id and key1 from that generated table (tt).
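The two steps can be made visible with a short sketch: run the inner query on its own to see the columns of the derived table, then wrap it and alias it as tt. The table contents are invented for the illustration, and Python's built-in sqlite3 module stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INT)")
conn.execute("CREATE TABLE table2 (id INT, key1 TEXT, key2 TEXT, key3 TEXT)")
# Hypothetical rows: only id 1 has key3 = 'some-value'.
conn.executemany("INSERT INTO table1 VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, ?, ?)",
    [(1, "a", "b", "some-value"), (2, "c", "d", "other")],
)

inner = """
    SELECT t1.id AS id, t2.key1 AS key1, t2.key2 AS key2, t2.key3 AS key3
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.id = t2.id
    WHERE t2.key3 = 'some-value'
"""

# Step 1: the subquery alone defines the columns of the derived table.
derived_columns = [col[0] for col in conn.execute(inner).description]

# Step 2: the outer SELECT reads from the derived table through its alias tt.
outer_rows = conn.execute(f"SELECT tt.id, tt.key1 FROM ({inner}) AS tt").fetchall()

print(derived_columns)  # ['id', 'key1', 'key2', 'key3']
print(outer_rows)       # [(1, 'a')]
conn.close()
```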

Create a new variable in SQL by groupby

I have two SQL tables as follows:
First table t1:
Second table t2:
I need to calculate the count of the "Number" column grouped by the "Name" column from t1 and merge it with t2.
I wrote the following code, but it does not seem to work:
select *
from (
select Name, count(Number) as count
from t1
group by Name ) as a
join ( select *
from t2 ) as b
on a.Name = b.Name;
Can anyone figure out what is wrong? Thank you very much.
I think you want to use SUM() instead of COUNT(), because SUM() adds up the integer values, while COUNT() counts the number of occurrences.
As also stated in the comments, multiple columns with the same name will create conflicts, so you have to select the wanted columns explicitly (that is usually a good idea anyway).
You could get the result you want with this query:
select
SUM(Number),
t1.Name,
(select val1 FROM t2 WHERE t2.Name = t1.Name LIMIT 1) as val1
FROM t1
GROUP BY t1.Name
Example in sqlfiddle: http://sqlfiddle.com/#!9/04dddf/7
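Since the contents of t1 and t2 are not shown in the question, the following sketch uses invented rows to show how COUNT() and SUM() differ on the same group, while selecting columns explicitly so the two Name columns do not clash. It runs through Python's built-in sqlite3 module; the column val1 is taken from the answer's query.

```python
import sqlite3


def count_vs_sum():
    """Return (Name, COUNT(Number), SUM(Number), val1) per name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (Name TEXT, Number INT)")
    conn.execute("CREATE TABLE t2 (Name TEXT, val1 TEXT)")
    # Hypothetical data: 'a' has two rows (2 and 3), 'b' has one row (5).
    conn.executemany("INSERT INTO t1 VALUES (?, ?)", [("a", 2), ("a", 3), ("b", 5)])
    conn.executemany("INSERT INTO t2 VALUES (?, ?)", [("a", "x"), ("b", "y")])
    rows = conn.execute(
        """
        SELECT a.Name, a.number_count, a.number_sum, b.val1
        FROM (SELECT Name,
                     COUNT(Number) AS number_count,
                     SUM(Number)   AS number_sum
              FROM t1
              GROUP BY Name) AS a
        JOIN t2 AS b ON a.Name = b.Name
        ORDER BY a.Name
        """
    ).fetchall()
    conn.close()
    return rows


print(count_vs_sum())  # [('a', 2, 5, 'x'), ('b', 1, 5, 'y')]
```

Both names sum to 5, but the counts differ (2 rows versus 1), which is the distinction the answer is pointing at.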

MySQL: selecting rows by two of three primary keys

I am searching for exactly the opposite of what Jonathan was searching in this example:
How to select multiple rows by multi-column primary key in MySQL?
Having 3 columns as a primary key (the 3rd is a date), I want to select all of them except the most recent one. And if there is no second entry for a combination of the first two primary values, I don't want to select it at all. Think of it as a kind of versioning. The table structure contains more columns than those three, and I want to select the whole rows.
Looks something like that:
{ID1 | ID2 | DATE} | more columns ...
Pseudocode:
SELECT * FROM table WHERE (first and second primary value are the same and exist more than once) AND NOT MAX(date)
:D
I want to output the data of all previous versions of the row, not including the most recent one.
Thanks in advance for any suggestions!
Break down the problem into steps:
Pseudo logic:
Get a data set with the records we want to exclude
now exclude that data set from the entire set
Step 1: Get a data set of only those records having the max date for each ID1, ID2 pair
SELECT ID1, ID2, Max(date) date
FROM Table
GROUP BY ID1, ID2
Step 2: Now use that data set to identify and eliminate the records you don't want. A NOT EXISTS is likely the fastest.
Faster option:
SELECT A.*
FROM TABLE A
WHERE NOT EXISTS
(SELECT 1
FROM (SELECT ID1, ID2, Max(date) date
FROM Table
GROUP BY ID1, ID2) B
WHERE A.ID1 = B.ID1
and A.ID2 = B.ID2
and A.Date = B.Date)
Or as a self outer join on a subset: slower, but it gives you access to additional details from the subset if needed (not much use in this example, but could be useful in other circumstances).
The left join matches only the rows that have the max date, so for all other rows the joined columns are null, and those rows are the data set you're after...
SELECT A.*
FROM TABLE A
LEFT JOIN (SELECT ID1, ID2, Max(date) date
FROM Table
GROUP BY ID1, ID2) B
on A.ID1 = B.ID1
and A.ID2 = B.ID2
and A.Date = B.Date
WHERE B.ID1 is null
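Both variants can be compared side by side with a short sketch. The table name versions, the column names version_date and payload, and the rows are all made up for the illustration, and Python's built-in sqlite3 module stands in for MySQL.

```python
import sqlite3

MAX_PER_PAIR = """
    SELECT id1, id2, MAX(version_date) AS max_date
    FROM versions
    GROUP BY id1, id2
"""

NOT_EXISTS_SQL = f"""
    SELECT a.id1, a.id2, a.version_date, a.payload
    FROM versions a
    WHERE NOT EXISTS (SELECT 1
                      FROM ({MAX_PER_PAIR}) b
                      WHERE a.id1 = b.id1
                        AND a.id2 = b.id2
                        AND a.version_date = b.max_date)
    ORDER BY a.id1, a.id2, a.version_date
"""

LEFT_JOIN_SQL = f"""
    SELECT a.id1, a.id2, a.version_date, a.payload
    FROM versions a
    LEFT JOIN ({MAX_PER_PAIR}) b
           ON a.id1 = b.id1
          AND a.id2 = b.id2
          AND a.version_date = b.max_date
    WHERE b.id1 IS NULL
    ORDER BY a.id1, a.id2, a.version_date
"""


def previous_versions(sql):
    """Run one of the two queries against a small hypothetical data set."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE versions (id1 INT, id2 INT, version_date TEXT, payload TEXT, "
        "PRIMARY KEY (id1, id2, version_date))"
    )
    conn.executemany(
        "INSERT INTO versions VALUES (?, ?, ?, ?)",
        [(1, 1, "2020-01-01", "v1"), (1, 1, "2020-02-01", "v2"),
         (1, 1, "2020-03-01", "v3"),
         (2, 5, "2020-01-15", "only"),  # single entry: must not appear
         (3, 4, "2019-12-01", "old"), (3, 4, "2020-06-01", "new")],
    )
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


print(previous_versions(NOT_EXISTS_SQL))
print(previous_versions(LEFT_JOIN_SQL))
```

The pair (2, 5) has a single row, which is also its most recent one, so it drops out without any extra condition.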

Copy rows if value exists x amount of times

I have two tables Board1 and Board2 with the identical structure. They both have a primary index column of id. I have a THIRD table called Table1, which has a non-indexed column board_id, where the same board_id occurs multiple times. board_id always corresponds to an id in Board1. Board2 is currently empty, and I want to add rows from Board1, but only where the same board_id occurs at least six times in Table1. Table1 will be changing periodically, so I'll be needing to do the query in the future, but without doubling id rows which are already in Board2.
So to recap:
There are three tables: Board1, Board2, and Table1. I want to copy rows from Board1 to Board2, but only where the id in Board1 occurs (at least) six times in Table1 as board_id.
I'd appreciate any help!
EDIT: I'm dreadfully sorry, but I realized I made a huge mistake in my question. I've rewritten it to reflect what I actually needed. I'm truly sorry.
You can do it like this
INSERT INTO Table2
SELECT
id,
board_id
FROM (SELECT
b.id,
b.board_id,
bl.Count
FROM board as b
LEFT JOIN (SELECT
board_id,
COUNT(board_id) as `Count`
FROM board
GROUP BY board_id) as bl
on bl.board_id = b.board_id
group by b.id
having bl.Count >= 6) as L
If you need more columns you can select them in inner and outer queries.
Fiddle Demo for Select
Here is what you asked for, with fiddle
INSERT Table2
SELECT
*
FROM
Table1
JOIN
(
SELECT
Board_Id,
count(*) cnt
FROM
Table1
GROUP BY
Board_Id
) BoardIds
ON BoardIds.Board_Id = Table1.Board_Id
WHERE
BoardIds.cnt > 5
AND
NOT EXISTS (SELECT id FROM Table2 WHERE Table2.id = Table1.id)
Try something like the below.
Add your column names where specified. I'm assuming each row of Table1 has a unique ID, so you won't be able to GROUP and COUNT by doing SELECT * FROM Table1; group by board_id alone instead. The id column has to be among the copied columns, otherwise the NOT IN check cannot recognise rows that are already in Board2.
You may need to test / validate this:
INSERT INTO Board2 (Your Column Names)
SELECT Your Column Names
FROM Board1
WHERE id IN (SELECT board_id
FROM Table1
GROUP BY board_id
HAVING COUNT(*) >= 6)
AND id NOT IN (SELECT id FROM Board2)
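A rough way to validate this kind of copy is to run it several times and check that Board2 never receives a row twice. The sketch below does that with Python's built-in sqlite3 module as a stand-in for MySQL. The name column and all rows are invented, and the duplicate check compares id values on the assumption that Board2 has the same id column as Board1, as the question states.

```python
import sqlite3

# Copy boards referenced at least six times that are not yet in Board2.
COPY_SQL = """
    INSERT INTO Board2 (id, name)
    SELECT id, name
    FROM Board1
    WHERE id IN (SELECT board_id
                 FROM Table1
                 GROUP BY board_id
                 HAVING COUNT(*) >= 6)
      AND id NOT IN (SELECT id FROM Board2)
"""


def run_copy_demo():
    """Return Board2 after two runs, and again after Table1 has changed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Board1 (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE Board2 (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE Table1 (board_id INT)")
    conn.executemany(
        "INSERT INTO Board1 VALUES (?, ?)",
        [(1, "alpha"), (2, "beta"), (3, "gamma")],
    )
    # Hypothetical counts: board 1 appears 6 times, board 2 five, board 3 seven.
    refs = [(1,)] * 6 + [(2,)] * 5 + [(3,)] * 7
    conn.executemany("INSERT INTO Table1 VALUES (?)", refs)

    conn.execute(COPY_SQL)
    conn.execute(COPY_SQL)  # second run must not insert anything new
    after_two_runs = conn.execute("SELECT id, name FROM Board2 ORDER BY id").fetchall()

    conn.execute("INSERT INTO Table1 VALUES (2)")  # board 2 reaches six rows
    conn.execute(COPY_SQL)
    after_change = conn.execute("SELECT id, name FROM Board2 ORDER BY id").fetchall()
    conn.close()
    return after_two_runs, after_change


first, second = run_copy_demo()
print(first)   # [(1, 'alpha'), (3, 'gamma')]
print(second)  # [(1, 'alpha'), (2, 'beta'), (3, 'gamma')]
```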