I am having trouble converting a MySQL query to a Google BigQuery query. This is my MySQL query:
SELECT id
FROM office_details
GROUP BY address
HAVING max(value)
ORDER BY id
This query runs perfectly in phpMyAdmin and with my PHP script. But when I convert it to BigQuery:
SELECT id
FROM Office_db.office_details
GROUP BY address
HAVING max(value)
ORDER BY id
It says that column id is neither in the GROUP BY nor aggregated.
What I need is the id for each unique address where value is maximum, e.g.
+----+---------+-------+
| id | address | value |
+----+---------+-------+
|  1 | a       |     4 |
|  2 | a       |     3 |
|  3 | b       |     2 |
|  4 | b       |     2 |
+----+---------+-------+
I need
+----+
| id |
+----+
| 1 |
| 3 |
+----+
#standardSQL
SELECT id FROM (
SELECT
id, address,
ROW_NUMBER() OVER(PARTITION BY address ORDER BY value DESC, id) AS flag
FROM office_details
)
WHERE flag = 1
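To try the ROW_NUMBER() approach without a BigQuery project, here is a minimal sketch that runs the same query shape against the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in engine, which is an assumption of this example (and it needs SQLite 3.25 or newer for window functions); BigQuery itself is not involved.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (assumption of this sketch;
# window functions need SQLite 3.25+).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE office_details (id INTEGER, address TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO office_details VALUES (?, ?, ?)",
    [(1, "a", 4), (2, "a", 3), (3, "b", 2), (4, "b", 2)],
)

# Number the rows of each address from highest value to lowest, breaking ties
# by the smaller id, then keep only the first row of each address.
query = """
SELECT id FROM (
  SELECT id,
         ROW_NUMBER() OVER (PARTITION BY address ORDER BY value DESC, id) AS flag
  FROM office_details
)
WHERE flag = 1
ORDER BY id
"""
top_ids = [row[0] for row in conn.execute(query)]
print(top_ids)  # [1, 3]
```

The `id` tie-breaker in the window's ORDER BY is what makes address b resolve to 3 rather than 4.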
Try this:
#standardSQL
SELECT ARRAY_AGG(id ORDER BY value DESC, id LIMIT 1)[OFFSET(0)] AS id
FROM office_details
GROUP BY address;
It's less prone to running out of memory than a solution using RANK will be (and may be faster), since it doesn't need to buffer all of the rows while computing ranks within a partition. As a working example:
#standardSQL
WITH office_details AS (
SELECT 1 AS id, 'a' AS address, 4 AS value UNION ALL
SELECT 2, 'a', 3 UNION ALL
SELECT 3, 'b', 2 UNION ALL
SELECT 4, 'b', 2
)
SELECT
address,
ARRAY_AGG(id ORDER BY value DESC, id LIMIT 1)[OFFSET(0)] AS id
FROM office_details
GROUP BY address
ORDER BY address;
This gives the result:
address | id
------------
a | 1
b | 3
A valid query might look as follows:
SELECT MIN(x.id) id
FROM office_details x
JOIN
( SELECT address
, MAX(value) value
FROM office_details
GROUP
BY address
) y
ON y.address = x.address
AND y.value = x.value
GROUP
BY x.address
, x.value
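As a quick check of the join-on-maximum idea, the sketch below runs it on the question's sample rows. The sqlite3 harness is an assumption made only so the example is runnable; the SQL uses qualified column names in the GROUP BY so that it is unambiguous in any engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE office_details (id INTEGER, address TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO office_details VALUES (?, ?, ?)",
    [(1, "a", 4), (2, "a", 3), (3, "b", 2), (4, "b", 2)],
)

# y holds the maximum value per address; joining back on (address, value)
# keeps only the rows that reach that maximum, and MIN(x.id) resolves ties.
query = """
SELECT MIN(x.id) AS id
FROM office_details x
JOIN (SELECT address, MAX(value) AS value
      FROM office_details
      GROUP BY address) y
  ON y.address = x.address
 AND y.value = x.value
GROUP BY x.address, x.value
"""
join_ids = sorted(row[0] for row in conn.execute(query))
print(join_ids)  # [1, 3]
```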
Let's say we have a table that looks like this:
+----+---------------+---------------------+
| ID | random_string | time                |
+----+---------------+---------------------+
|  2 | K2K3KD9AJ     | 2022-07-21 20:41:15 |
|  1 | SJQJ8JD0W     | 2022-07-17 23:46:13 |
|  1 | JSDOAJD8      | 2022-07-11 02:52:21 |
|  3 | KPWJOFPSS     | 2022-07-11 02:51:57 |
|  1 | DA8HWD8HHD    | 2022-07-11 02:51:49 |
+----+---------------+---------------------+
I want to select the last 3 entries in the table, but they must all have distinct IDs.
Expected Result:
+----+---------------+---------------------+
| ID | random_string | time                |
+----+---------------+---------------------+
|  2 | K2K3KD9AJ     | 2022-07-21 20:41:15 |
|  1 | SJQJ8JD0W     | 2022-07-17 23:46:13 |
|  3 | KPWJOFPSS     | 2022-07-11 02:51:57 |
+----+---------------+---------------------+
I have already tried:
SELECT DISTINCT id FROM table ORDER BY time DESC LIMIT 3;
And:
SELECT MIN(id) as id FROM table GROUP BY time DESC LIMIT 3;
If you're not on MySQL 8, then I have two suggestions.
Using EXISTS:
SELECT m1.ID,
m1.random_string,
m1.time
FROM mytable m1
WHERE EXISTS
(SELECT ID
FROM mytable AS m2
GROUP BY ID
HAVING m1.ID=m2.ID
AND m1.time= MAX(time)
)
Using JOIN:
SELECT m1.ID,
m1.random_string,
m1.time
FROM mytable m1
JOIN
(SELECT ID, MAX(time) AS mxtime
FROM mytable
GROUP BY ID) AS m2
ON m1.ID=m2.ID
AND m1.time=m2.mxtime
I haven't tested these on large data, so I don't know which will perform better, but both should return the same result. To keep only the three most recent rows, append ORDER BY m1.time DESC LIMIT 3 to either query.
Here's a fiddle
Of course, this assumes that no two rows share the exact same ID and time value, which seems very unlikely but is still possible.
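The caveat about duplicate ID and time pairs can be seen in a small experiment. This is only a sketch: it runs the JOIN variant through Python's sqlite3 module (an assumption, chosen so the example is self-contained), adds ORDER BY and LIMIT 3 to match the question, and uses a made-up row labelled HYPOTHETICAL to create the tie.

```python
import sqlite3

QUERY = """
SELECT m1.ID, m1.random_string, m1.time
FROM mytable m1
JOIN (SELECT ID, MAX(time) AS mxtime
      FROM mytable
      GROUP BY ID) AS m2
  ON m1.ID = m2.ID
 AND m1.time = m2.mxtime
ORDER BY m1.time DESC, m1.random_string
LIMIT 3
"""

def latest_per_id(rows):
    """Load rows into a fresh in-memory table and run the JOIN query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (ID INTEGER, random_string TEXT, time TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    return conn.execute(QUERY).fetchall()

sample = [
    (2, "K2K3KD9AJ", "2022-07-21 20:41:15"),
    (1, "SJQJ8JD0W", "2022-07-17 23:46:13"),
    (1, "JSDOAJD8", "2022-07-11 02:52:21"),
    (3, "KPWJOFPSS", "2022-07-11 02:51:57"),
    (1, "DA8HWD8HHD", "2022-07-11 02:51:49"),
]

result = latest_per_id(sample)
print([row[0] for row in result])  # [2, 1, 3]

# A second row for ID 2 with the same timestamp (hypothetical data) makes
# ID 2 appear twice and pushes ID 3 out of the top three.
tied = latest_per_id(sample + [(2, "HYPOTHETICAL", "2022-07-21 20:41:15")])
print([row[0] for row in tied])  # [2, 2, 1]
```

If such ties can occur in real data, a window function with an extra tie-breaking column in its ORDER BY avoids the duplicate.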
Using MySQL 8, an easy solution is to assign a row number using a window function:
select Id, random_string, time
from (
select *, Row_Number() over(partition by id order by time desc) rn
from t
)t
where rn = 1
order by time desc
limit 3;
See Demo
I am interested in writing a MySQL query that looks through a list consisting of IDs and locations. Each ID represents a unique person who can be tied to multiple locations.
I have the following table to simplify things:
+----+----------+
| ID | Location |
+----+----------+
| 1 | Bldg#1 |
| 1 | Bldg#2 |
| 2 | Bldg#3 |
+----+----------+
I am looking to deduplicate the above table to end up with only ONE row per ID, but I would also like to add a condition that prefers Bldg#1 for any given ID. In other words, given a table of multiple rows with potentially the same ID and multiple locations, I would like to write a query that outputs 1 row per ID, and if any of the rows associated with that ID has a location of Bldg#1, I want to keep that row and drop the rest. Otherwise, I just want to keep one arbitrary location row for that ID.
For the above table, I would like the following as output:
+----+----------+
| ID | Location |
+----+----------+
| 1 | Bldg#1 |
| 2 | Bldg#3 |
+----+----------+
You can group by id and use conditional aggregation:
select id,
case
when max(location = 'Bldg#1') then 'Bldg#1'
else any_value(location)
end location
from tablename
group by id
See the demo.
Results:
| id | location |
| --- | -------- |
| 1 | Bldg#1 |
| 2 | Bldg#3 |
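The conditional aggregation can be exercised outside MySQL as well. The sketch below is an approximation under two assumptions: it runs in SQLite through Python's sqlite3 module, and because SQLite has no ANY_VALUE(), it substitutes MIN(location), which is one valid way of picking "any" location for an ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER, location TEXT)")
# Bldg#2 is inserted first on purpose, to show the result does not depend
# on insertion order.
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [(1, "Bldg#2"), (1, "Bldg#1"), (2, "Bldg#3")],
)

# (location = 'Bldg#1') evaluates to 1 or 0, so its MAX is 1 exactly when
# at least one row of the group is Bldg#1.
query = """
SELECT id,
       CASE
         WHEN MAX(location = 'Bldg#1') THEN 'Bldg#1'
         ELSE MIN(location)  -- stand-in for MySQL's ANY_VALUE(location)
       END AS location
FROM tablename
GROUP BY id
ORDER BY id
"""
deduped = conn.execute(query).fetchall()
print(deduped)  # [(1, 'Bldg#1'), (2, 'Bldg#3')]
```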
You can use row_number() with a case expression:
select id, location
from (select t.*,
row_number() over (partition by id
order by (case location when 'Bldg#1' then 1 when 'Bldg#2' then 2 when 'Bldg#3' then 3 else 4 end)
) as seqnum
from t
) t
where seqnum = 1;
This does not assume any particular ordering -- such as alphabetical ordering.
Is this what you are looking for?
Example:
Query:
DROP TABLE IF EXISTS #TEST
CREATE TABLE #TEST (ID INT, Location NVARCHAR(10))
INSERT INTO #TEST
SELECT 1,'Bldg#1'
UNION
SELECT 1,'Bldg#2'
UNION
SELECT 2,'Bldg#3'
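-- Rank Bldg#1 first within each ID; ordering by ID alone would leave the pick arbitrary
SELECT ID,Location FROM (
SELECT ID,Location, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY CASE WHEN Location = 'Bldg#1' THEN 0 ELSE 1 END) AS RNM
FROM #TEST) T
WHERE RNM = 1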
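The inner query orders the rows so that Bldg#1 comes first for each id, and the outer GROUP BY is then expected to pick the first record. Note that this relies on MySQL with ONLY_FULL_GROUP_BY disabled, where the server is free to return any row from each group, so the result is not guaranteed: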
SELECT * FROM
(
SELECT id, location
FROM location
ORDER BY id, location ASC
)a
GROUP BY id
I have a table with 2 columns: the first is called ID and the second is called TRACKING. The ID column has duplicates. I want to take all of those duplicates and consolidate them into one row, where each TRACKING value from the duplicate rows is placed into a new column within the same row, so that I no longer have duplicates.
I have tried a few suggested things where all of the values would be concatenated into one column but I want these TRACKING values for the duplicate IDs to be in separate columns. The code below did not do what I intended it to.
SELECT ID, TRACKING =
STUFF((SELECT DISTINCT ', ' + TRACKING
FROM #t b
WHERE b.ID = a.ID
FOR XML PATH('')), 1, 2, '')
FROM #t a
GROUP BY ID
I am looking to take this:
| ID | TRACKING |
-----------------
| 5 | 13t3in3i |
| 5 | g13g13gg |
| 3 | egqegqgq |
| 2 | 14y2y24y |
| 2 | 42yy44yy |
| 5 | 8i535i35 |
And turn it into this:
| ID | TRACKING | TRACKING1 | TRACKING2 |
-----------------
| 5 | 13t3in3i | g13g13gg | 8i535i35 |
| 3 | egqegqgq | | |
| 2 | 14y2y24y | 42yy44yy | |
One (relatively) painful way to do this in MySQL is to use correlated subqueries:
-- limit offset, count: skip `offset` rows, then return one row
select i.id,
(select t.tracking
from t
where t.id = i.id
order by t.tracking
limit 0, 1
) as tracking_1,
(select t.tracking
from t
where t.id = i.id
order by t.tracking
limit 1, 1
) as tracking_2,
(select t.tracking
from t
where t.id = i.id
order by t.tracking
limit 2, 1
) as tracking_3
from (select distinct id from t
) i;
As bad as this looks, it will probably have surprisingly decent performance with an index on (id, tracking).
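A runnable sketch of the correlated-subquery pivot follows, using the sample rows from the question. It assumes SQLite via Python's sqlite3 module and writes the paging as LIMIT 1 OFFSET n, which is the explicit spelling of "skip n rows, return one". The helper `nth_tracking` is a hypothetical convenience for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, tracking TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(5, "13t3in3i"), (5, "g13g13gg"), (3, "egqegqgq"),
     (2, "14y2y24y"), (2, "42yy44yy"), (5, "8i535i35")],
)

def nth_tracking(n):
    """Build the correlated subquery for the n-th tracking value (0-based)."""
    return (
        "(SELECT t.tracking FROM t WHERE t.id = i.id "
        f"ORDER BY t.tracking LIMIT 1 OFFSET {n})"
    )

# One scalar subquery per output column; a missing n-th value yields NULL.
query = f"""
SELECT i.id,
       {nth_tracking(0)} AS tracking_1,
       {nth_tracking(1)} AS tracking_2,
       {nth_tracking(2)} AS tracking_3
FROM (SELECT DISTINCT id FROM t) i
ORDER BY i.id
"""
pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
# (2, '14y2y24y', '42yy44yy', None)
# (3, 'egqegqgq', None, None)
# (5, '13t3in3i', '8i535i35', 'g13g13gg')
```

The values within a row come out in alphabetical order of tracking, not in insertion order, because that is the only ordering the subqueries have to work with.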
By the way, the MySQL equivalent of your original stuff() code, which puts everything into one column, is group_concat():
select id, group_concat(tracking)
from t
group by id;
with test_tbl as
(
select 5 id, 'goog' tracking,'goog' tracking1
union all
select 5 id, 'goog1','goo'
union all
select 2 , 'yahoo','yah'
union all
select 2, 'yahoo1','ya'
union all
select 3,'azure','azu'
), modified_tbl as
(
select id,array_agg(concat(tracking)) Tracking,array_agg(concat(tracking1)) Tracking1 from test_tbl group by 1
)
select id, tracking[safe_offset(0)] Tracking_1,tracking1[safe_offset(0)] Tracking_2, tracking[safe_offset(1)] Tracking_3,tracking1[safe_offset(1)] Tracking_4 from modified_tbl where array_length(Tracking) > 1
I have a table like this:
+----+---------+------------+
| id | conn_id | read_date |
+----+---------+------------+
| 1 | 1 | 2010-02-21 |
| 2 | 1 | 2011-02-21 |
| 3 | 2 | 2011-02-21 |
| 4 | 2 | 2013-02-21 |
| 5 | 2 | 2014-02-21 |
+----+---------+------------+
I want the second highest read_date for particular 'conn_id's i.e. I want a group by on conn_id. Please help me figure this out.
Here's a solution for a particular conn_id :
select max(read_date) from my_table
where conn_id=1
and read_date<(
select max(read_date) from my_table
where conn_id=1
)
If you want to get it for all conn_id using group by, do this:
select t.conn_id, (select max(i.read_date) from my_table i
where i.conn_id=t.conn_id and i.read_date<max(t.read_date))
from my_table t group by conn_id;
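For experimenting with the "largest date below the maximum" idea per group, here is a sketch on the sample table. It is a variant rather than the exact query above: the correlated subquery sits in the WHERE clause so that the statement also runs in SQLite, which is the engine assumed here through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, conn_id INTEGER, read_date TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, 1, "2010-02-21"), (2, 1, "2011-02-21"), (3, 2, "2011-02-21"),
     (4, 2, "2013-02-21"), (5, 2, "2014-02-21")],
)

# Drop each conn_id's latest date, then take the maximum of what is left.
# ISO-formatted dates compare correctly as text.
query = """
SELECT t.conn_id, MAX(t.read_date) AS second_date
FROM my_table t
WHERE t.read_date < (SELECT MAX(i.read_date)
                     FROM my_table i
                     WHERE i.conn_id = t.conn_id)
GROUP BY t.conn_id
ORDER BY t.conn_id
"""
second_dates = conn.execute(query).fetchall()
print(second_dates)  # [(1, '2010-02-21'), (2, '2013-02-21')]
```

One difference to be aware of: a conn_id with only a single distinct date has no rows left after the filter, so this variant omits it instead of returning NULL for it.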
The following should work in MSSQL (note that the derived table needs an alias):
select id,conn_id,read_date from (
select *,ROW_NUMBER() over(Partition by conn_id order by read_date desc) as RN
from my_table
) t
where RN =2
There is an interesting article on the use of rank functions in MySQL here: ROW_NUMBER() in MySQL
If your table is designed so that id and date increase together (i.e. a larger id always has a later date), you can group by id; otherwise, do the following:
$sql_max = '(select conn_id, max(read_date) max_date from tab group by 1) as tab_max';
$sql_max2 = "(select tab.conn_id,max(tab.read_date) max_date2 from tab, $sql_max
where tab.conn_id = tab_max.conn_id and tab.read_date < tab_max.max_date
group by 1) as tab_max2";
$sql = "select tab.* from tab, $sql_max2
where tab.conn_id = tab_max2.conn_id and tab.read_date = tab_max2.max_date2";
I am running the following query:
SELECT id, name, keyt
FROM table
WHERE id = (SELECT t2.id FROM table t2 WHERE t2.keyt=21 ORDER BY RAND() LIMIT 1)
Supposing table is like this:
| id | name | keyt |
+ ------------------------- +
| 1 | Hello | 21 |
| 3 | Katzet | 1 |
| 1 | Welcome | 1 |
| 2 | Two | 21 |
| 2 | Other | 1 |
It should return one of these pairs:
Hello | Welcome (id 1 in common)
Two | Other (id 2 in common)
So, the idea is:
Get one id, which has the keyt value set to 21
Then, get all the rows with this selected id (independently of all the other keyt values)
If I do as you suggested... I would get mixed id values, and all result rows must have the same id.
SELECT x.*
FROM my_table x
JOIN
( SELECT id
FROM my_table
WHERE keyt = 21
ORDER
BY RAND() LIMIT 1
) y
ON y.id = x.id;
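To see that the derived-table join always returns rows sharing a single id, the query can be run on the sample data. This sketch assumes SQLite via Python's sqlite3 module, so RAND() is replaced by SQLite's RANDOM(); the table name `my_table` follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, name TEXT, keyt INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, "Hello", 21), (3, "Katzet", 1), (1, "Welcome", 1),
     (2, "Two", 21), (2, "Other", 1)],
)

# The derived table y is evaluated once and yields exactly one random id
# among those having keyt = 21; the join then pulls every row with that id.
query = """
SELECT x.*
FROM my_table x
JOIN (SELECT id
      FROM my_table
      WHERE keyt = 21
      ORDER BY RANDOM()  -- SQLite's counterpart to MySQL's RAND()
      LIMIT 1) y
  ON y.id = x.id
"""
picked = conn.execute(query).fetchall()
names = sorted(row[1] for row in picked)
print(names)  # either ['Hello', 'Welcome'] or ['Other', 'Two']
```

Because the random pick happens once inside the derived table rather than once per outer row, the result cannot mix ids.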
The subquery in this query
SELECT id, name, keyt
FROM table
WHERE id = (SELECT t2.id FROM table t2 WHERE t2.keyt=21 ORDER BY RAND() LIMIT 1)
would return only one record as it has LIMIT 1 added at the end.
Also, in your question, the table contains only 1 record for which
value of keyt = 21, due to which you're getting only one record.
If you want more records, you should remove the LIMIT. In that case you may rephrase your query as:
SELECT id, name, keyt
FROM table
WHERE id IN (SELECT t2.id FROM table t2 WHERE t2.keyt=21 ORDER BY RAND())
Hope this is what you expected, as your actual goal is not very clear from the question.
Your table has two rows with keyt = 21, so the subquery in the WHERE clause returns two id values, 1 and 2. So instead of the equals operator "=", use the IN operator in the WHERE clause.
SELECT id, name, keyt FROM table WHERE id IN (SELECT t2.id FROM table t2 WHERE t2.keyt=21 ORDER BY RAND())