MySQL GROUP BY order

Please consider the following table structure and data:
+--------------------+-------------+
| venue_name | listed_by |
+--------------------+-------------+
| My Venue Name | 1 |
| Another Venue | 2 |
| My Venue Name | 5 |
+--------------------+-------------+
I am currently using MySQL's GROUP BY clause to select only unique venue names. However, this returns only the first occurrence of My Venue Name, whereas I would like to choose which row is returned based on a condition (in this case, where the listed_by field has a value > 2).
Essentially here's some pseudo-code of what I'd like to achieve:
Select all records
Group by name
if grouped, return the occurrence with the higher value in listed_by
Is there an SQL statement that will allow this functionality?
Edit: I should have mentioned that there are other fields involved in the query, and the listed_by field needs to be used elsewhere in the query, too. Here is the original query that we're using:
SELECT l1.field_value AS venue_name,
base.ID AS listing_id,
base.user_ID AS user_id,
IF(base.user_ID > 1, 'b', 'a') AS flag,
COUNT(img.ID) AS img_num
FROM ( listingsDBElements l1, listingsDB base )
LEFT JOIN listingsImages img ON (base.ID = img.listing_id AND base.user_ID = img.user_id and img.active = 'yes')
WHERE l1.field_name = 'venue_name'
AND l1.field_value LIKE '%name%'
AND base.ID = l1.listing_id
AND base.user_ID = l1.user_id
AND base.ID = l1.listing_id
AND base.user_ID = l1.user_id
AND base.active = 'yes'
GROUP BY base.Title ORDER BY flag desc,img_num desc

As long as you didn't mention other fields - here is the simplest solution:
SELECT venue_name,
MAX(listed_by)
FROM tblname
WHERE listed_by > 2
GROUP BY venue_name
With other fields it could look like this (assuming there are no duplicate venue_name + listed_by pairs):
SELECT *
FROM tblname t1
INNER JOIN (SELECT venue_name,
MAX(listed_by) max_listed_by
FROM tblname
WHERE listed_by > 2
GROUP BY venue_name) t2 ON t1.venue_name = t2.venue_name
AND t1.listed_by = t2.max_listed_by
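To try the join-back approach on the sample data from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained); the table name tblname is the placeholder from the answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblname (venue_name TEXT, listed_by INTEGER);
    INSERT INTO tblname VALUES
        ('My Venue Name', 1),
        ('Another Venue', 2),
        ('My Venue Name', 5);
""")

# Join every row to the per-venue maximum of listed_by.
# Only rows with listed_by > 2 take part in the maximum.
rows = conn.execute("""
    SELECT t1.venue_name, t1.listed_by
    FROM tblname t1
    INNER JOIN (SELECT venue_name, MAX(listed_by) AS max_listed_by
                FROM tblname
                WHERE listed_by > 2
                GROUP BY venue_name) t2
        ON t1.venue_name = t2.venue_name
       AND t1.listed_by = t2.max_listed_by
""").fetchall()

print(rows)  # [('My Venue Name', 5)]
```

Note that 'Another Venue' drops out entirely here, because its only row has listed_by = 2 and is removed by the WHERE listed_by > 2 filter before grouping.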

Related

Writing more better SQL

I've got a query here that's painfully slow. Part of the problem may be that tableA in the sub-query has a quite substantial size in comparison to the other tables.
TABLES STRUCTURE
*-------------------*------------------*-------------------*
| ID_TABLE | DATA_TABLE | DATA_TABLE_EXT |
*-------------------*------------------*-------------------*
| id n<|>1 id 1<|>n owner_id |
| foreign_id | owner_id | information |
| foreign_id_source | date_field | ... |
| ... | ... | |
*-------------------*------------------*-------------------*
QUERY
SELECT ID_TABLE.foreign_id_source, count(ID_TABLE.id) as count
FROM DATA_TABLE
LEFT JOIN ID_TABLE ON DATA_TABLE.id = ID_TABLE.id
WHERE DATA_TABLE.owner_id = 'some_id'
AND DATA_TABLE.date_field > 'some_date'
AND DATA_TABLE.id IN (
SELECT DATA_TABLE_EXT.owner_id FROM DATA_TABLE_EXT
JOIN DATA_TABLE ON DATA_TABLE_EXT.owner_id = DATA_TABLE.id
WHERE DATA_TABLE.owner_id = 'some_id'
GROUP BY DATA_TABLE.id
HAVING SUM(ABS(DATA_TABLE_EXT.information)) <> 0
)
GROUP BY ID_TABLE.foreign_id_source
ORDER BY count ASC
REQUIRED RESULT
*-------------------*-------------*
| foreign_id_source | count |
*-------------------*-------------*
| source1 | 45 |
| source2 | 10 |
| ... | |
*-------------------*-------------*
Each id in DATA_TABLE may have multiple records in ID_TABLE.
Many records in DATA_TABLE may have the same owner_id.
I'm looking for the number of records in DATA_TABLE with a foreign_id_source, grouped by that foreign_id_source, where the record is after 'some_date' and its DATA_TABLE_EXT records do not all have a value of 0 in the information field.
Short of creating indexes or other database manipulation is there a way to improve this query in terms of performance?
Any other suggestions are also welcome.
The point is: SUM(ABS(DATA_TABLE_EXT.information)) <> 0 can only be true if at least one DATA_TABLE_EXT.information is non-zero. So we don't have to SUM() them; we only need to check whether a non-zero one exists.
[I don't know whether MySQL is smart enough to optimize the EXISTS(), but in theory it is cheaper and can be faster.]
SELECT it.foreign_id_source, count(it.id) as count
FROM DATA_TABLE dt
LEFT JOIN ID_TABLE it ON dt.id = it.id
WHERE dt.owner_id = 'some_id'
AND dt.date_field > 'some_date'
AND EXISTS (
SELECT *
FROM DATA_TABLE_EXT x
JOIN DATA_TABLE dt2 ON x.owner_id = dt2.id
WHERE x.owner_id = dt.id
AND dt2.owner_id = 'some_id'
AND x.information <> 0
)
GROUP BY it.foreign_id_source
ORDER BY count ASC
;
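The equivalence claimed above (the HAVING SUM(ABS(...)) <> 0 filter selects the same ids as an EXISTS check for a non-zero row) can be checked on a small data set. This is only a sketch: it runs on SQLite through Python's sqlite3 module rather than MySQL, the sample rows are invented, and it assumes DATA_TABLE_EXT.owner_id refers to DATA_TABLE.id as in the table diagram.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DATA_TABLE (id INTEGER, owner_id TEXT);
    CREATE TABLE DATA_TABLE_EXT (owner_id INTEGER, information INTEGER);
    INSERT INTO DATA_TABLE VALUES (1, 'some_id'), (2, 'some_id'), (3, 'some_id');
    -- id 1: all zeros; id 2: one non-zero; id 3: values that cancel without ABS
    INSERT INTO DATA_TABLE_EXT VALUES (1, 0), (1, 0), (2, 0), (2, -3), (3, 5), (3, -5);
""")

# Original style: aggregate every group, then filter on the sum.
having_ids = [r[0] for r in conn.execute("""
    SELECT dt.id
    FROM DATA_TABLE dt
    WHERE dt.id IN (SELECT x.owner_id
                    FROM DATA_TABLE_EXT x
                    GROUP BY x.owner_id
                    HAVING SUM(ABS(x.information)) <> 0)
    ORDER BY dt.id
""")]

# Rewritten style: stop at the first non-zero row for each id.
exists_ids = [r[0] for r in conn.execute("""
    SELECT dt.id
    FROM DATA_TABLE dt
    WHERE EXISTS (SELECT 1
                  FROM DATA_TABLE_EXT x
                  WHERE x.owner_id = dt.id
                    AND x.information <> 0)
    ORDER BY dt.id
""")]

print(having_ids, exists_ids)  # [2, 3] [2, 3]
```

Whether the EXISTS form is actually faster depends on the optimizer and the available indexes, so it is worth comparing the EXPLAIN output of both on the real tables.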
Often moving the subquery to the FROM will help:
SELECT ID_TABLE.foreign_id_source, count(DATA_TABLE.id) as count
FROM ID_TABLE LEFT JOIN
DATA_TABLE
ON DATA_TABLE.id = ID_TABLE.id JOIN
(SELECT DATA_TABLE.id
FROM DATA_TABLE_EXT JOIN
DATA_TABLE
ON DATA_TABLE_EXT.owner_id = DATA_TABLE.id
WHERE DATA_TABLE.owner_id = 'some_value'
GROUP BY DATA_TABLE.id
HAVING SUM(ABS(DATA_TABLE_EXT.information)) <> 0
) xx
ON DATA_TABLE.id = xx.id
WHERE DATA_TABLE.owner_id = 'some_value' AND
DATA_TABLE.date_field > 'some_date'
GROUP BY ID_TABLE.foreign_id_source
ORDER BY count ASC;
Then, you can think about indexes. These would be tableX(field2, fieldZ, field1, fieldX), tableI(field1), tableX(field2, field1, fieldB), and tableA(field1).

MySQL - select rows under an ID, group by column value that has the latest timestamp

Table:
----------------------------------------------------
ID | field_name | field_value | timestamp
----------------------------------------------------
2 | postcode | LS1 | 2016-11-09 16:45:15
2 | age | 34 | 2016-11-09 16:45:22
2 | job | Scientist | 2016-11-09 16:45:27
2 | age | 38 | 2016-11-09 16:46:40
7 | postcode | LS5 | 2016-11-09 16:47:05
7 | age | 24 | 2016-11-09 16:47:44
I wonder if anyone could give me a few pointers. Based on the above data, I would like to query by ID 2 and return a row for each unique field_name (if more than one row exists under the same ID with the same field_name, then return just the row with the latest timestamp).
I have managed to almost achieve this by grouping the field_name, which will return a list of unique rows but not necessarily the latest row.
SELECT * FROM fragment WHERE (id = :id) GROUP BY field_name
I would really be grateful for any pointers on what exactly I should do here, and how I could fit something along the lines of MAX(timestamp) into this query.
Many thanks!
First, you need a set of rows giving the maximum timestamp for each ID and field_name; generate that set as an inline view (B below). Then join B back to the base table, letting the inner join eliminate the unwanted rows.
SELECT A.ID, A.field_name, A.field_value, A.timestamp
FROM fragment A
INNER JOIN (SELECT ID, field_name, MAX(timestamp) TS
FROM fragment
GROUP BY ID, field_name) B
on A.ID = B.ID
and A.field_name = B.field_name
and A.timestamp = B.TS
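The inline-view join can be tried against the sample rows from the question. The sketch below uses Python's sqlite3 module in place of MySQL (an assumption for the sake of a self-contained example) and the table name fragment from the question's own query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fragment (ID INTEGER, field_name TEXT, field_value TEXT, timestamp TEXT);
    INSERT INTO fragment VALUES
        (2, 'postcode', 'LS1',       '2016-11-09 16:45:15'),
        (2, 'age',      '34',        '2016-11-09 16:45:22'),
        (2, 'job',      'Scientist', '2016-11-09 16:45:27'),
        (2, 'age',      '38',        '2016-11-09 16:46:40'),
        (7, 'postcode', 'LS5',       '2016-11-09 16:47:05'),
        (7, 'age',      '24',        '2016-11-09 16:47:44');
""")

# B holds the latest timestamp per (ID, field_name); the inner join keeps
# only the rows of A that carry exactly that timestamp.
latest = conn.execute("""
    SELECT A.field_name, A.field_value
    FROM fragment A
    INNER JOIN (SELECT ID, field_name, MAX(timestamp) AS TS
                FROM fragment
                GROUP BY ID, field_name) B
        ON A.ID = B.ID
       AND A.field_name = B.field_name
       AND A.timestamp = B.TS
    WHERE A.ID = ?
    ORDER BY A.field_name
""", (2,)).fetchall()

print(latest)  # [('age', '38'), ('job', 'Scientist'), ('postcode', 'LS1')]
```

If two rows share the same ID, field_name and timestamp, both are returned, so this relies on timestamps being unique within each group.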
In databases that support window (analytic) functions, which MySQL does only from version 8.0 onward, you could instead assign a row number to each record within its ID and field_name group and eliminate those > 1, something like:
SELECT B.*
FROM (SELECT A.ID
, A.field_name
, A.field_value
, A.timestamp
, ROW_NUMBER() OVER (PARTITION BY A.ID, A.field_name ORDER BY A.timestamp DESC) RN
FROM fragment A ) B
WHERE B.RN = 1
or using a cross apply with a limit or top.
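A sketch of the row-number variant follows, again on SQLite through Python's sqlite3 module as a stand-in. It assumes the underlying SQLite library is version 3.25 or newer, which is when window functions were added, and it partitions by ID and field_name so that numbering restarts for each group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fragment (ID INTEGER, field_name TEXT, field_value TEXT, timestamp TEXT);
    INSERT INTO fragment VALUES
        (2, 'postcode', 'LS1',       '2016-11-09 16:45:15'),
        (2, 'age',      '34',        '2016-11-09 16:45:22'),
        (2, 'job',      'Scientist', '2016-11-09 16:45:27'),
        (2, 'age',      '38',        '2016-11-09 16:46:40'),
        (7, 'postcode', 'LS5',       '2016-11-09 16:47:05'),
        (7, 'age',      '24',        '2016-11-09 16:47:44');
""")

# Number the rows of each (ID, field_name) group from newest to oldest,
# then keep only the newest one (RN = 1).
newest = conn.execute("""
    SELECT B.ID, B.field_name, B.field_value
    FROM (SELECT A.ID, A.field_name, A.field_value,
                 ROW_NUMBER() OVER (PARTITION BY A.ID, A.field_name
                                    ORDER BY A.timestamp DESC) AS RN
          FROM fragment A) B
    WHERE B.RN = 1
    ORDER BY B.ID, B.field_name
""").fetchall()

for row in newest:
    print(row)
```

Unlike the join on MAX(timestamp), this returns exactly one row per group even when two rows tie on the timestamp, although which of the tied rows wins is then arbitrary.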
The simplest way to do it:
SELECT *
FROM fragment fra1
WHERE (id = :id)
and timestamp = (select max(timestamp)
from fragment fra2
where fra2.id = fra1.id
and fra2.field_name = fra1.field_name)
GROUP BY field_name

select by priority

Table Structure :
Registration :
uuid | name | total
Rate :
uuid | type | rate
Registration_Rate :
registration | rate
Initial Request is :
select * from registration r
join registration_rate rr on rr.registration = r.uuid
join rate rt on rt.uuid = rr.rate
group by r.name, rt.type
My SQL result from two table (registration & rate ) is :
uuid | name | rate | type
1 | AAA | 15 | U
2 | BBB | 20 | U
3 | CCC | 300 | F
4 | AAA | 250 | F
I would like to have something like this (if a rate's type 'F' exists then display instead)
uuid | name | rate | type
2 | BBB | 20 | U
3 | CCC | 300 | F
4 | AAA | 250 | F
Thanks
Edited :
I have tried another solution which works
select uuid, name, rate, (case rt.type when 2 then 2 else 1 end ) as type
from registration r
join registration_rate rr on rr.registration = r.uuid
join rate rt on rt.uuid = rr.rate
group by r.name, rt.type
If it's an F row return it. Or, use NOT EXISTS to verify no other row with same name has an F.
select t1.*
from tablename t1
where type = 'F'
or not exists (select * from tablename t2
where t2.name = t1.name
and t2.type = 'F')
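As a quick check of the NOT EXISTS approach, the sketch below loads the four joined rows from the question into a single table and runs the query. It uses Python's sqlite3 module in place of MySQL, and tablename is the placeholder name from the answer, standing in for the result of the registration/rate join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablename (uuid INTEGER, name TEXT, rate INTEGER, type TEXT);
    INSERT INTO tablename VALUES
        (1, 'AAA', 15,  'U'),
        (2, 'BBB', 20,  'U'),
        (3, 'CCC', 300, 'F'),
        (4, 'AAA', 250, 'F');
""")

# Keep a row if it is an 'F' row, or if no 'F' row exists for the same name.
preferred = conn.execute("""
    SELECT t1.uuid, t1.name, t1.rate, t1.type
    FROM tablename t1
    WHERE t1.type = 'F'
       OR NOT EXISTS (SELECT 1
                      FROM tablename t2
                      WHERE t2.name = t1.name
                        AND t2.type = 'F')
    ORDER BY t1.uuid
""").fetchall()

print(preferred)  # [(2, 'BBB', 20, 'U'), (3, 'CCC', 300, 'F'), (4, 'AAA', 250, 'F')]
```

The 'U' row for AAA is dropped because an 'F' row exists for the same name, which matches the output requested in the question.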
Alternative solution:
select t1.*
from tablename t1
join (select name, min(type) type
from tablename
group by name) t2
ON t1.name = t2.name and t1.type = t2.type
Try this (it shows the main idea):
SELECT t.uuid,
t.name,
IFNULL(MAX(t.F_type), MAX(t.not_F_type)) AS "type",
IFNULL(MAX(t.F_rate), MAX(t.not_F_rate)) AS "rate"
FROM
(
SELECT r.uuid,
r.name,
CASE rt.type WHEN 'F' THEN rt.type END AS F_type,
CASE WHEN rt.type <> 'F' THEN rt.type END AS not_F_type,
CASE rt.type WHEN 'F' THEN rt.rate END AS F_rate,
CASE WHEN rt.type <> 'F' THEN rt.rate END AS not_F_rate
FROM registration AS r
JOIN registration_rate AS rr ON rr.registration = r.uuid
JOIN rate AS rt ON rt.uuid = rr.rate
) as t
GROUP BY t.uuid, t.name;
So, you split the relevant columns ("rate" and "type") into two new columns each, according to your rule (if a rate of type 'F' exists, display it instead of the others), using CASE expressions: one column holds the value for type F and the other holds the value for the other types. I did this for both the "type" and "rate" columns. Then the columns (and rows) are merged back together with GROUP BY, aggregate functions, and IFNULL (other constructs such as CASE or IF would work here too).
As I understand the question, this is what you need.

MySQL select rows where given date lies between the dates stored in table

Suppose I have some data like:
id status activity_date
--- ------ -------------
101 R 2014-01-12
101 Mt 2014-04-27
101 R 2014-05-18
102 R 2014-02-19
Note that for rows with id = 101 we have activity between 2014-01-12 to 2014-04-26 and 2014-05-18 to current date.
Now I need to select that data where status = 'R' and the date is the most current date as of a given date, e.g. if I search for 2014-02-02, I would find the status row created on 2014-01-12, because that was the status that was still valid at the time for entity ID 101.
If I understand correctly:
Step 1: Convert the start and end date rows into columns. To do this, join the table with itself using the following criteria:
SELECT
dates_fr.id,
dates_fr.activity_date AS date_fr,
MIN(dates_to.activity_date) AS date_to
FROM test AS dates_fr
LEFT JOIN test AS dates_to ON
dates_to.id = dates_fr.id AND
dates_to.status = 'Mt' AND
dates_to.activity_date > dates_fr.activity_date
WHERE dates_fr.status = 'R'
GROUP BY dates_fr.id, dates_fr.activity_date
+------+------------+------------+
| id | date_fr | date_to |
+------+------------+------------+
| 101 | 2014-01-12 | 2014-04-27 |
| 101 | 2014-05-18 | NULL |
| 102 | 2014-02-19 | NULL |
+------+------------+------------+
Step 2: The rest is simple. Wrap the query inside another query and use appropriate where clause:
SELECT * FROM (
SELECT
dates_fr.id,
dates_fr.activity_date AS date_fr,
MIN(dates_to.activity_date) AS date_to
FROM test AS dates_fr
LEFT JOIN test AS dates_to ON
dates_to.id = dates_fr.id AND
dates_to.status = 'Mt' AND
dates_to.activity_date > dates_fr.activity_date
WHERE dates_fr.status = 'R'
GROUP BY dates_fr.id, dates_fr.activity_date
) AS temp WHERE '2014-02-02' >= temp.date_fr and ('2014-02-02' < temp.date_to OR temp.date_to IS NULL)
+------+------------+------------+
| id | date_fr | date_to |
+------+------------+------------+
| 101 | 2014-01-12 | 2014-04-27 |
+------+------------+------------+
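The two steps can be reproduced end to end on the sample data. This is a sketch on SQLite through Python's sqlite3 module rather than MySQL; dates are stored as ISO-8601 text, which compares correctly as strings, and the search date is passed as a bound parameter instead of being written into the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (id INTEGER, status TEXT, activity_date TEXT);
    INSERT INTO test VALUES
        (101, 'R',  '2014-01-12'),
        (101, 'Mt', '2014-04-27'),
        (101, 'R',  '2014-05-18'),
        (102, 'R',  '2014-02-19');
""")

# Step 1 (inner query): pair each 'R' row with the next 'Mt' date, if any.
# Step 2 (outer query): keep the intervals that contain the search date.
active = conn.execute("""
    SELECT * FROM (
        SELECT dates_fr.id,
               dates_fr.activity_date AS date_fr,
               MIN(dates_to.activity_date) AS date_to
        FROM test AS dates_fr
        LEFT JOIN test AS dates_to
               ON dates_to.id = dates_fr.id
              AND dates_to.status = 'Mt'
              AND dates_to.activity_date > dates_fr.activity_date
        WHERE dates_fr.status = 'R'
        GROUP BY dates_fr.id, dates_fr.activity_date
    ) AS temp
    WHERE :day >= temp.date_fr
      AND (:day < temp.date_to OR temp.date_to IS NULL)
""", {"day": "2014-02-02"}).fetchall()

print(active)  # [(101, '2014-01-12', '2014-04-27')]
```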
You can try
select id, status, activity_date
from TABLE
where status = "R" and activity_date = "2014-02-02"
where TABLE is name of your table
I think you need the following:
SELECT id, MAX(CAST(activity_date AS date)), MIN(CAST(activity_date AS date))
FROM Table_Name
WHERE Status = 'R'
GROUP BY id
HAVING CAST('2014-02-02' AS date)
BETWEEN MIN(CAST(activity_date AS date)) AND MAX(CAST(activity_date AS date))
Try this:
select * from yourtable
where status='R' and activity_date= '2014-02-02'
You can write a query that effectively gives you the most recent status as of a date, e.g.
SELECT
id,
substr(max(concat(activity_date, status)),11) as status,
max(activity_date) as activity_date
FROM table
WHERE activity_date <= '2014-02-02'
GROUP by id;
Then, similar to Salman's answer, you can use this result inside another query and look for all those results with a status of 'R'
SELECT * from (
SELECT
id,
substr(max(concat(activity_date, status)),11) as status,
max(activity_date) as activity_date
FROM table
WHERE activity_date <= '2014-02-02'
GROUP by id
) AS temp WHERE temp.status = 'R';
Edit: Rather than use the questionable method of sorting the statuses, you could identify the relevant maximum record with a sub-query, so the original query would become
SELECT join1.* FROM table AS join1
INNER JOIN (
SELECT id, max(activity_date) as max_activity_date
FROM table
WHERE activity_date <= '2014-02-02'
GROUP BY id
) AS join2
ON join1.id = join2.id AND join1.activity_date = join2.max_activity_date;
and the full query
SELECT * from (
SELECT join1.* FROM table AS join1
INNER JOIN (
SELECT id, max(activity_date) as max_activity_date
FROM table
WHERE activity_date <= '2014-02-02'
GROUP BY id
) AS join2
ON join1.id = join2.id AND join1.activity_date = join2.max_activity_date
) AS temp WHERE temp.status = 'R';
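Here is a small sketch of the sub-query version, wrapped in a helper so that different search dates can be tried. It runs on SQLite through Python's sqlite3 module rather than MySQL. The table name activity and the function name status_r_as_of are hypothetical names chosen for this example, and the comparison uses <= so that a record dated exactly on the search date counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (id INTEGER, status TEXT, activity_date TEXT);
    INSERT INTO activity VALUES
        (101, 'R',  '2014-01-12'),
        (101, 'Mt', '2014-04-27'),
        (101, 'R',  '2014-05-18'),
        (102, 'R',  '2014-02-19');
""")

def status_r_as_of(day):
    """Return ids whose latest record on or before `day` has status 'R'."""
    return conn.execute("""
        SELECT cur.id, cur.status, cur.activity_date
        FROM activity AS cur
        INNER JOIN (SELECT id, MAX(activity_date) AS max_activity_date
                    FROM activity
                    WHERE activity_date <= ?
                    GROUP BY id) AS latest
            ON cur.id = latest.id
           AND cur.activity_date = latest.max_activity_date
        WHERE cur.status = 'R'
        ORDER BY cur.id
    """, (day,)).fetchall()

print(status_r_as_of("2014-02-02"))  # [(101, 'R', '2014-01-12')]
print(status_r_as_of("2014-05-01"))  # [(102, 'R', '2014-02-19')]
```

On 2014-05-01 the latest record for id 101 is the 'Mt' row from 2014-04-27, so 101 is excluded, while 102 has become active in the meantime.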
try the following
SELECT *
FROM your_relation
WHERE status='R'
AND activity_date = '2014-02-02'
I completely agree with Salman's response: the table could be designed in a way that allows for greater query accuracy and extensibility. However, the question as asked, about a query that selects rows based on status and date range, can be answered as follows:
SELECT * FROM Table_1
WHERE ((status = 'R')
AND ((activity_date BETWEEN '2014-01-12' AND '2014-04-26')
OR activity_date > CONVERT('2014-05-17', DATETIME)))
This will select all data with a status of 'R' and will use the BETWEEN operator for the range desired; moreover, the conversion of the final operator is because the expression is evaluated as a mathematical expression and requires explicit conversion.

MySQL SELECT combining 3 SELECTs INTO 1

Consider following tables in MySQL database:
entries:
creator_id INT
entry TEXT
is_expired BOOL
other:
creator_id INT
entry TEXT
userdata:
creator_id INT
name VARCHAR
etc...
In entries and other, there can be multiple entries by 1 creator. userdata table is read only for me (placed in other database).
I'd like to achieve a following SELECT result:
+------------+---------+---------+-------+
| creator_id | entries | expired | other |
+------------+---------+---------+-------+
| 10951 | 59 | 55 | 39 |
| 70887 | 41 | 34 | 108 |
| 88309 | 38 | 20 | 102 |
| 94732 | 0 | 0 | 86 |
... where entries is equal to SELECT COUNT(entry) FROM entries GROUP BY creator_id,
expired is equal to SELECT COUNT(entry) FROM entries WHERE is_expired = 0 GROUP BY creator_id and
other is equal to SELECT COUNT(entry) FROM other GROUP BY creator_id.
I need this structure because after doing this SELECT, I need to look for user data in the "userdata" table, which I planned to do with INNER JOIN and select desired columns.
I solved this problem with selecting "NULL" into column which does not apply for given SELECT:
SELECT
creator_id,
COUNT(any_entry) as entries,
COUNT(expired_entry) as expired,
COUNT(other_entry) as other
FROM (
SELECT
creator_id,
entry AS any_entry,
NULL AS expired_entry,
NULL AS other_entry
FROM entries
UNION
SELECT
creator_id,
NULL AS any_entry,
entry AS expired_entry,
NULL AS other_entry
FROM entries
WHERE is_expired = 1
UNION
SELECT
creator_id,
NULL AS any_entry,
NULL AS expired_entry,
entry AS other_entry
FROM other
) AS tTemp
GROUP BY creator_id
ORDER BY
entries DESC,
expired DESC,
other DESC
;
I've left out the INNER JOIN and selecting other columns from userdata table on purpose (my question being about combining 3 SELECTs into 1).
Is my idea valid? = Am I trying to use the right "construction" for this?
Are these kind of SELECTs possible without creating an "empty" column? (some kind of JOIN)
Should I do it "outside the DB": make 3 SELECTs, make some order in it (let's say python lists/dicts) and then do the additional SELECTs for userdata?
Solution for a similar question does not return rows where entries and expired are 0.
Thank you for your time.
This should work (assuming all creator_ids appear in the userdata table):
SELECT userdata.creator_id, COALESCE(entries_count_,0) AS entries_count, COALESCE(expired_count_,0) AS expired_count, COALESCE(other_count_,0) AS other_count
FROM userdata
LEFT OUTER JOIN
(SELECT creator_id, COUNT(entry) AS entries_count_
FROM entries
GROUP BY creator_id) AS entries_q
ON userdata.creator_id=entries_q.creator_id
LEFT OUTER JOIN
(SELECT creator_id, COUNT(entry) AS expired_count_
FROM entries
WHERE is_expired=0
GROUP BY creator_id) AS expired_q
ON userdata.creator_id=expired_q.creator_id
LEFT OUTER JOIN
(SELECT creator_id, COUNT(entry) AS other_count_
FROM other
GROUP BY creator_id) AS other_q
ON userdata.creator_id=other_q.creator_id;
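The three-subquery LEFT JOIN can be exercised on a toy data set to confirm that creators with no rows in entries or other still appear with zero counts. This is a sketch on SQLite through Python's sqlite3 module rather than MySQL, with invented sample rows. It treats is_expired = 1 as "expired", which is an assumption: the question's own snippets use both 0 and 1 for that filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE userdata (creator_id INTEGER, name TEXT);
    CREATE TABLE entries (creator_id INTEGER, entry TEXT, is_expired INTEGER);
    CREATE TABLE other (creator_id INTEGER, entry TEXT);
    INSERT INTO userdata VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO entries VALUES (1, 'a', 1), (1, 'b', 0);
    INSERT INTO other VALUES (1, 'x'), (2, 'y'), (2, 'z'), (2, 'w');
""")

# Each subquery aggregates one count per creator; LEFT JOINs keep creators
# that are missing from a subquery, and COALESCE turns their NULLs into 0.
counts = conn.execute("""
    SELECT u.creator_id,
           COALESCE(e.entries_count, 0) AS entries,
           COALESCE(x.expired_count, 0) AS expired,
           COALESCE(o.other_count, 0)   AS other
    FROM userdata u
    LEFT JOIN (SELECT creator_id, COUNT(entry) AS entries_count
               FROM entries GROUP BY creator_id) e
           ON u.creator_id = e.creator_id
    LEFT JOIN (SELECT creator_id, COUNT(entry) AS expired_count
               FROM entries WHERE is_expired = 1 GROUP BY creator_id) x
           ON u.creator_id = x.creator_id
    LEFT JOIN (SELECT creator_id, COUNT(entry) AS other_count
               FROM other GROUP BY creator_id) o
           ON u.creator_id = o.creator_id
    ORDER BY u.creator_id
""").fetchall()

print(counts)  # [(1, 2, 1, 1), (2, 0, 0, 3), (3, 0, 0, 0)]
```

Because each table is aggregated before it is joined, the counts do not multiply each other the way they would if entries and other were joined row by row first.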
Basically, what you are doing looks correct to me.
I would rewrite it as follows, though:
SELECT entries.creator_id
, any_entry
, expired_entry
, other_entry
FROM (
SELECT creator_id, COUNT(entry) AS any_entry
FROM entries
GROUP BY creator_id
) entries
LEFT OUTER JOIN (
SELECT creator_id, COUNT(entry) AS expired_entry
FROM entries
WHERE is_expired = 1
GROUP BY creator_id
) expired ON expired.creator_id = entries.creator_id
LEFT OUTER JOIN (
SELECT creator_id, COUNT(entry) AS other_entry
FROM other
GROUP BY creator_id
) other ON other.creator_id = entries.creator_id
How about
SELECT creator_id,
(SELECT COUNT(*)
FROM entries e
WHERE e.creator_id = main.creator_id AND
e.is_expired = 0) AS entries,
(SELECT COUNT(*)
FROM entries e
WHERE e.creator_id = main.creator_id AND
e.is_expired = 1) as expired,
(SELECT COUNT(*)
FROM other
WHERE other.creator_id = main.creator_id) AS other
FROM entries main
GROUP BY main.creator_id;