MySQL group by all columns except one - mysql

I'm looking for a (cleaner?) way to do the following:
Let's say I have a table, main, with ~15 columns that looks something like this, with one row per id:
main:
id start end col4 ... col15
666 2014-01-01 2014-06-30 ... ... ...
1234 2015-03-05 2015-05-02 ... ... ...
9876 2014-09-01 2015-01-01 ... ... ...
...(etc)
Then I have another table, events, which may have 0, 1, or many rows per id:
events:
id date code
666 2014-01-20 "code_a"
1234 2015-05-01 "code_b"
666 2014-01-25 "code_c"
666 2014-02-09 "code_z"
... (etc)
and finally I have a table, codes, which has one row per code, giving a description for the code as well as a type (0,1, or 2):
codes:
code desc type
"code_a" "something" 0
"code_b" "somethn else" 1
"code_c" "another thing" 0
"code_d" "one more" 2
(no code z)
and what I want as a result is main's 15 columns plus three additional columns which contain comma separated lists of event codes which happened between the start and end dates for that id by type (first column is type 0, second type 1, third type 2), so:
id start end ... col15 type_0 type_1 type_2
666 2014-01-01 2014-06-30 ... ... "code_a,code_c"
1234 2015-03-05 2015-05-02 ... ... "code_b"
...(etc)
my solution is
select m.*
, group_concat(c0.code) as type_0
, group_concat(c1.code) as type_1
, group_concat(c2.code) as type_2
from main m
left join events e on m.id = e.id and e.date between m.start and m.end
left join codes c0 on c0.code = e.code and c0.type = 0
left join codes c1 on c1.code = e.code and c1.type = 1
left join codes c2 on c2.code = e.code and c2.type = 2
group by m.id
, m.start
, m.end
, m.col4
, m.col5
, m.col6
, m.col7
, m.col8
, m.col9
, m.col10
, m.col11
, m.col12
, m.col13
, m.col14
, m.col15
But to me that's pretty nasty looking. Is there a more elegant way to do this (especially avoiding the 15 columns listed in the group by)?

In MySQL, you can just use GROUP BY m.id. Unless you enable the ONLY_FULL_GROUP_BY option, it allows you to use non-aggregate columns that aren't in the GROUP BY clause. This could produce unpredictable results if you selected columns that were not uniquely identified by the grouping column, but that's not the case here: you're grouping by a column that's the unique ID for the m table, and all the non-aggregate columns are from that same table.
In strict SQL, you would have to do it by doing the GROUP_CONCATs in a subquery, which you then join with the main table.
SELECT *
FROM (SELECT m.id
, group_concat(c0.code) as type_0
, group_concat(c1.code) as type_1
, group_concat(c2.code) as type_2
FROM main m
left join events e on m.id = e.id and e.date between m.start and m.end
left join codes c0 on c0.code = e.code and c0.type = 0
left join codes c1 on c1.code = e.code and c1.type = 1
left join codes c2 on c2.code = e.code and c2.type = 2
GROUP BY m.id
) t1
JOIN main m ON t1.id = m.id

With the "one row per id" specification, you can take advantage of the MySQL extension to the GROUP BY which allows you to include non-aggregates in the SELECT list. The only change required to your query would be to just
GROUP BY m.id
Other databases would throw an error with that. We can get MySQL to throw an error too, if we include ONLY_FULL_GROUP_BY in the sql_mode for the session.
Another alternative would be to avoid a GROUP BY operation on m, using an inline view. You still need to do a GROUP BY, but you can do that in the inline view, where the other columns from main aren't returned, we only return the unique id value. We need that for the join in the outer query.
Also seems like you only need one join to the codes table; you could use a conditional test inside the GROUP_CONCAT to conditionally return the value of the code.
For example:
SELECT m.*
, g.type_0
, g.type_1
, g.type_2
FROM main m
LEFT
JOIN ( SELECT a.id
, GROUP_CONCAT(IF(c.type=0,c.code,NULL)) AS type_0
, GROUP_CONCAT(IF(c.type=1,c.code,NULL)) AS type_1
, GROUP_CONCAT(IF(c.type=2,c.code,NULL)) AS type_2
FROM main a
LEFT
JOIN events e
ON e.id = a.id
AND e.date BETWEEN a.start AND a.end
LEFT
JOIN codes c
ON c.code = e.code
AND c.type IN (0,1,2)
GROUP BY a.id
) g
ON g.id = m.id
I'm not sure that either of those qualifies as "a more elegant way" or not. (Both of these depend on the id column being UNIQUE in main. The second query also relies on id being non-NULL.)
You might want to consider adding an ORDER BY inside the GROUP_CONCAT, for a more deterministic result. It's also possible to include the DISTINCT keyword inside the GROUP_CONCAT, if there's no reason to return "duplicate" values of code in the list, e.g.
GROUP_CONCAT(DISTINCT IF(c.type=0,c.code,NULL) ORDER BY 1)
Also be aware that the maximum length of the value returned from GROUP_CONCAT is limited to group_concat_max_len.
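For anyone who wants to try the inline-view idea without a MySQL server, here is a minimal sketch that runs it against the sample rows from the question. It assumes SQLite (through Python's sqlite3) as a stand-in for MySQL, so IF() is written as CASE, the desc column is renamed to descr, and "end" is quoted; only one extra column (col4) is kept from main.

```python
import sqlite3

# SQLite stands in for MySQL here; this is a sketch, not the original setup.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
CREATE TABLE main   (id INTEGER PRIMARY KEY, start TEXT, "end" TEXT, col4 TEXT);
CREATE TABLE events (id INTEGER, date TEXT, code TEXT);
CREATE TABLE codes  (code TEXT PRIMARY KEY, descr TEXT, type INTEGER);
INSERT INTO main VALUES (666,  '2014-01-01', '2014-06-30', 'x'),
                        (1234, '2015-03-05', '2015-05-02', 'y'),
                        (9876, '2014-09-01', '2015-01-01', 'z');
INSERT INTO events VALUES (666,  '2014-01-20', 'code_a'),
                          (1234, '2015-05-01', 'code_b'),
                          (666,  '2014-01-25', 'code_c'),
                          (666,  '2014-02-09', 'code_z');
INSERT INTO codes VALUES ('code_a', 'something',     0),
                         ('code_b', 'somethn else',  1),
                         ('code_c', 'another thing', 0),
                         ('code_d', 'one more',      2);
""")

# The GROUP BY lives in the derived table g, so the outer query can use m.*
# without listing every column of main.
query = """
SELECT m.*, g.type_0, g.type_1, g.type_2
FROM main m
LEFT JOIN (
    SELECT a.id,
           GROUP_CONCAT(CASE WHEN c.type = 0 THEN c.code END) AS type_0,
           GROUP_CONCAT(CASE WHEN c.type = 1 THEN c.code END) AS type_1,
           GROUP_CONCAT(CASE WHEN c.type = 2 THEN c.code END) AS type_2
    FROM main a
    LEFT JOIN events e ON e.id = a.id
                      AND e.date BETWEEN a.start AND a."end"
    LEFT JOIN codes c  ON c.code = e.code
    GROUP BY a.id
) g ON g.id = m.id
ORDER BY m.id
"""

# Map each id to its three comma separated lists (None when a list is empty).
result = {row["id"]: (row["type_0"], row["type_1"], row["type_2"])
          for row in conn.execute(query)}
for id_, lists in result.items():
    print(id_, lists)
```

The order of the codes inside each list is not guaranteed without an ORDER BY inside GROUP_CONCAT, which is why that suggestion above is worth following in MySQL.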

Another, shorter version would be to do the grouping first and then join against the result, like below.
select m.*
, XX.type_0
, XX.type_1
, XX.type_2
from main m
left join events e on m.id = e.id and e.date between m.start and m.end
left join (
select code, GROUP_CONCAT(case when type = 0 then code else null end SEPARATOR ', ') AS type_0,
GROUP_CONCAT(case when type = 1 then code else null end SEPARATOR ', ') AS type_1,
GROUP_CONCAT(case when type = 2 then code else null end SEPARATOR ', ') AS type_2
from codes
group by <some_column> )XX ON XX.code = e.code;

Related

Conditional WHERE with two joins on same table

I'm trying to create an effective query but can't get it working.
Tables:
- one table containing types of objects
- one table containing objects
Conditions:
- there can be single objects of a type
- there can be child objects of a type
- parent and child objects don't need to be of the same type
- objects can be published
- types can be published
- the results should only get pulled from a specific pool of object IDs, so I need to add AND o.id IN (1,2,3,4)
I want a simple result list that shows how many types are published and the number of objects assigned to these types.
types
id | title | published
---------------------
1 type1 1
2 type2 1
3 type3 1
4 type4 1
5 type5 1
6 type6 0
7 type7 1
objects
id |title | type | parent | published
---------------------------------------
1 a 1 0 1
2 b 1 0 1
3 c 3 2 1
4 d 2 0 1
5 e 2 2 1
6 f 4 0 0
7 g 5 6 1
8 h 6 0 1
9 i 3 8 1
10 j 3 8 0
11 k 7 8 1
Results should be:
type1 (#2) (two singles)
type2 (#2) (one single + one child of id 2)
type3 (#3) (one child of id 2 + one published child of id 8)
type4 (#0) (one single not published)
type5 (#0) (because its parent id 6 is not published)
type6 (#0) (because type6 is not published)
I tried this one (type publishing not included):
SELECT o.type, t.title, COUNT(t.id) AS cnt
FROM types AS t
LEFT JOIN objects AS o ON o.type = t.id
LEFT JOIN objects AS o2 ON o.id = o2.parent
WHERE o.published = 1 AND o2.published = 1
GROUP BY o.type
The conditions in the WHERE clause negate the "outerness" of the left joins.
Move those conditions to the ON clauses. The WHERE clause can be dropped.
Also, reference columns from t, the driving table, and count non-NULL expressions from the outer joined tables.
That will allow the query to return zero counts.
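To see the WHERE versus ON difference in isolation, here is a small sketch using the sample data from the question. It assumes SQLite (through Python's sqlite3) as a stand-in for MySQL, and it only counts published objects per type; it deliberately ignores the parent/child and type-publishing rules, so the counts are not the final answer to the question.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE types   (id INTEGER PRIMARY KEY, title TEXT, published INTEGER);
CREATE TABLE objects (id INTEGER PRIMARY KEY, title TEXT, type INTEGER,
                      parent INTEGER, published INTEGER);
INSERT INTO types VALUES (1,'type1',1),(2,'type2',1),(3,'type3',1),(4,'type4',1),
                         (5,'type5',1),(6,'type6',0),(7,'type7',1);
INSERT INTO objects VALUES (1,'a',1,0,1),(2,'b',1,0,1),(3,'c',3,2,1),(4,'d',2,0,1),
                           (5,'e',2,2,1),(6,'f',4,0,0),(7,'g',5,6,1),(8,'h',6,0,1),
                           (9,'i',3,8,1),(10,'j',3,8,0),(11,'k',7,8,1);
""")

# Condition in WHERE: rows where o is NULL or unpublished are discarded after
# the join, so a type with no published objects disappears from the result.
where_rows = conn.execute("""
    SELECT t.title, COUNT(o.id) AS cnt
    FROM types t
    LEFT JOIN objects o ON o.type = t.id
    WHERE o.published = 1
    GROUP BY t.id, t.title
    ORDER BY t.id
""").fetchall()

# Condition in ON: every type survives, and COUNT(o.id) is 0 when nothing matched.
on_rows = conn.execute("""
    SELECT t.title, COUNT(o.id) AS cnt
    FROM types t
    LEFT JOIN objects o ON o.type = t.id AND o.published = 1
    GROUP BY t.id, t.title
    ORDER BY t.id
""").fetchall()

print("WHERE:", where_rows)
print("ON:   ", on_rows)
```

With this data type4 has only an unpublished object, so it is missing from the WHERE version and shows up with a zero count in the ON version.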
I didn't fully delve into the specification, but it looks like we want to count matching rows from o and o2.
I think something like this will get a resultset consistent with one interpretation of the specification... child o2 rows get counted under parent o type, regardless of the type on the child o2 row.
This is not tested, and I'm not fully understanding the specification...
SELECT t.id AS `type`
, t.title AS `title`
, COUNT(DISTINCT o.id)
+ COUNT(DISTINCT o2.id) AS `cnt`
-- , COUNT(DISTINCT o.id) AS `cnt_o`
-- , COUNT(DISTINCT o2.id) AS `cnt_o2`
FROM types t
LEFT
JOIN objects o
ON o.type = t.id
AND o.published = 1
AND o.parent = 0
AND t.published = 1
LEFT
JOIN objects o2
ON o2.parent = o.id
AND o2.published = 1
GROUP
BY t.id
, t.title
Not clear in the spec...
Do child rows (from o2) get omitted from the count if the type on the o2 row matches a row in types that is published=0 ?
If we are "grouping" by type on the o2 rows, then we'd need to do something different.
EDIT
we could get the count from the parent and the child separately, in two separate SELECT, and then combine the two resultsets with a UNION ALL set operator, and then total up the counts.
something along these lines:
SELECT c.type
, c.title
, SUM(c.cnt) AS cnt
FROM (
SELECT t.id AS `type`
, t.title AS `title`
, COUNT(o.id) AS `cnt`
FROM types t
LEFT
JOIN objects o
ON o.type = t.id
AND o.published = 1
AND o.parent = 0
AND t.published = 1
GROUP
BY t.id
, t.title
UNION ALL
SELECT tc.id AS `type`
, tc.title AS `title`
, COUNT(oc.id) AS `cnt`
FROM types tc
JOIN objects oc
ON oc.type = tc.id
AND oc.published = 1
AND tc.published = 1
JOIN objects op
ON op.id = oc.parent
AND op.published = 1
JOIN types pt
ON pt.id = op.type
AND pt.published = 1
GROUP
BY tc.id
, tc.title
) c
GROUP
BY c.type
, c.title
again, untested, and without a full understanding of the spec.
the count of the parent o is straightforward. we use an outer join, with t as the driving table, so we get all types, and can get zero counts.
the count of the child oc, we can do inner joins. since the previous SELECT is getting us all the types, missing rows in the second SELECT won't cause a problem.
note that we join the child oc rows by type, and then we join to parent (to make sure parent is published), and join to parent type (to check that type is published) ...
How do we distinguish "parent" rows, do we check parent=0 ?
Is this a hierarchy, can a "child" also be the "parent" of another row ?
FOLLOWUP
Another way to think about it (maybe this was the approach of the OP query) ... we are counting rows from o, parents and children. What's important is that the type is published type, and that o is published.
Additionally, either
o is not a child (i.e. there isn't a row in objects op that has an id value equal to o.parent)
or
if o does have a parent row (a row in objects op with an id value equal to o.parent), the parent op is published and the parent type is published.
We could approach it like this:
SELECT t.id AS `type`
, t.title AS `title`
, COUNT(o.id) AS `cnt`
FROM types t
LEFT
JOIN objects o
ON o.type = t.id
AND o.published = 1
AND t.published = 1
LEFT
JOIN objects op
ON op.id = o.parent
LEFT
JOIN types pt
ON pt.id = op.type
WHERE -- this not a child (there is no parent)
op.id IS NULL
OR -- parent is published and parent type is published
( op.published = 1 AND pt.published = 1 )
GROUP
BY t.id
, t.title

How to make query

review table has store_idx, user_idx etc...
I want to create a query sentence that gets information about the store to which the user has bookmarked with the user_id value entered.
The query sentence I made is
select A.store_name
, A.store_img
, count(B.store_idx) as review_cnt
from board.store A
Left
Join board.review B
On A.store_idx is B.store_idx
where store_idx is (select A.store_idx from bookmark where user_id = ?)
However, nothing came out as a result.
Help me..
Please use below Query:
SELECT store_name
, store_img
, SUM(review_cnt) AS review_cnt
FROM
( SELECT A.store_name
, A.store_img
, CASE WHEN B.store_idx IS NULL THEN 0 ELSE 1 END AS review_cnt
FROM bookmark br
JOIN board.store A
ON A.store_idx = br.store_idx
LEFT
JOIN board.review B
ON A.store_idx = B.store_idx
WHERE br.user_id = ?
) T
GROUP
BY store_name
, store_img
The WHERE clause is obviously filtering out all rows. We can't do much about that. But your query is also lacking a GROUP BY, the table aliases can be improved, and the join condition is not correct.
So, try this version:
select s.store_name, s.store_img, count(r.store_idx) as review_cnt
from board.store s left join
board.review r
on s.store_idx = r.store_idx
where s.store_idx in (select b.store_idx
from bookmark b
where b.user_id = ?
)
group by s.store_name, s.store_img;
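A quick way to check the shape of this query is to run it on a few made-up rows. The sketch below assumes SQLite (through Python's sqlite3) instead of MySQL, drops the board. schema prefix, and uses hypothetical column lists and sample data, since the question does not show the table definitions.

```python
import sqlite3

# Hypothetical tables and rows; only store_idx, store_name, store_img and
# user_id are names taken from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store    (store_idx INTEGER PRIMARY KEY, store_name TEXT, store_img TEXT);
CREATE TABLE review   (review_idx INTEGER PRIMARY KEY, store_idx INTEGER, user_idx INTEGER);
CREATE TABLE bookmark (user_id TEXT, store_idx INTEGER);
INSERT INTO store    VALUES (1, 'A', 'a.png'), (2, 'B', 'b.png'), (3, 'C', 'c.png');
INSERT INTO review   VALUES (1, 1, 10), (2, 1, 11), (3, 3, 10);
INSERT INTO bookmark VALUES ('u1', 1), ('u1', 2), ('u2', 3);
""")

def bookmarked_stores(user_id):
    """Return (store_name, store_img, review_cnt) for each store the user bookmarked."""
    return conn.execute("""
        SELECT s.store_name, s.store_img, COUNT(r.store_idx) AS review_cnt
        FROM store s
        LEFT JOIN review r ON s.store_idx = r.store_idx
        WHERE s.store_idx IN (SELECT b.store_idx
                              FROM bookmark b
                              WHERE b.user_id = ?)
        GROUP BY s.store_name, s.store_img
        ORDER BY s.store_name
    """, (user_id,)).fetchall()

print(bookmarked_stores("u1"))
```

Store B has no reviews, and because of the LEFT JOIN it still appears, with a count of 0.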

SELECT group by twice

I'm not strong in DB at all and I need your help.
I need SQL request with GROUP by twice.
Example of my data in table
<table border="1" style="border-collapse:collapse">
<tr><th>id</th><th>market_id</th><th>price</th><th>low</th><th>high</th><th>symbol</th><th>created_at</th></tr>
<tr><td>1</td><td>1</td><td>5773.8</td><td>5685</td><td>6020</td><td>btcusd</td><td>2017-10-27 16:46:10</td></tr>
<tr><td>2</td><td>1</td><td>0.4274</td><td>0.39</td><td>0.43983</td><td>iotusd</td><td>2017-10-27 16:46:11</td></tr>
<tr><td>3</td><td>1</td><td>0.20026</td><td>0.1986</td><td>0.20352</td><td>xrpusd</td><td>2017-10-27 16:46:12</td></tr>
<tr><td>4</td><td>2</td><td>5771</td><td>5685</td><td>6020</td><td>btcusd</td><td>2017-10-27 16:46:18</td></tr>
<tr><td>5</td><td>2</td><td>0.4274</td><td>0.39</td><td>0.43983</td><td>iotusd</td><td>2017-10-27 16:46:18</td></tr>
<tr><td>6</td><td>2</td><td>0.20026</td><td>0.1986</td><td>0.20352</td><td>xrpusd</td><td>2017-10-27 16:46:19</td></tr>
<tr><td>7</td><td>1</td><td>5773.1</td><td>5685</td><td>6020</td><td>btcusd</td><td>2017-10-27 16:46:25</td></tr>
<tr><td>8</td><td>1</td><td>0.4274</td><td>0.39</td><td>0.43983</td><td>iotusd</td><td>2017-10-27 16:46:25</td></tr>
<tr><td>9</td><td>1</td><td>0.20026</td><td>0.1986</td><td>0.20352</td><td>xrpusd</td><td>2017-10-27 16:46:26</td></tr>
<tr><td>10</td><td>2</td><td>5773.1</td><td>5685</td><td>6020</td><td>btcusd</td><td>2017-10-27 16:46:32</td></tr>
<tr><td>11</td><td>2</td><td>0.42741</td><td>0.39</td><td>0.43983</td><td>iotusd</td><td>2017-10-27 16:46:32</td></tr>
<tr><td>12</td><td>2</td><td>0.20026</td><td>0.1986</td><td>0.20352</td><td>xrpusd</td><td>2017-10-27 16:46:33</td></tr></table>
I would like to get latest data for every market_id and symbol
That means I need something like this in the end:
- id market_id symbol
- 7 1 btcusd
- 8 1 iotusd
- 9 1 xrpusd
- 10 2 btcusd
- 11 2 iotusd
- 12 2 xrpusd
Really need help, a little bit blocked.
You are almost there. Try this
SELECT c.*
FROM CRYPTO AS c
JOIN (
SELECT market_id, symbol, MAX(id) AS maxid
FROM CRYPTO
GROUP BY market_id, symbol
) AS c2
ON c2.maxid = c.id AND c.market_id = c2.market_id AND c.symbol = c2.symbol
Along these lines...
SELECT MAX(id), market_id, symbol
FROM crypto
GROUP BY market_id, symbol
Here's my comment stated as SQL.
SELECT A.ID, A.Market_ID, A.Symbol, A.Price, A.Low, A.High
FROM CRYPTO A
INNER JOIN (SELECT max(Created_at) MCA, Market_ID, Symbol
FROM crypto
GROUP BY Market_ID, Symbol) B
on A.Created_At = B.MCA
and A.market_ID = B.Market_ID
and A.Symbol = B.Symbol
What this does:
The derived table (aliased B) generates one row for each market_ID and symbol, holding the max created_at time. It is then joined back to the base table (aliased A) to limit the data to just the rows having the max created_at. This lets us show the whole record from A for each unique market_ID and symbol, but only for records having the max created_at.
Other engines would allow you to use a cross apply or an analytic to obtain the desired results.
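The join-back approach described above can be checked against the twelve sample rows from the question. This is a sketch that assumes SQLite (through Python's sqlite3) as a stand-in for MySQL and a table named crypto; the low and high columns are left out to keep it short.

```python
import sqlite3

# Sample rows from the question: (id, market_id, price, symbol, created_at).
rows = [
    (1,  1, 5773.8,  "btcusd", "2017-10-27 16:46:10"),
    (2,  1, 0.4274,  "iotusd", "2017-10-27 16:46:11"),
    (3,  1, 0.20026, "xrpusd", "2017-10-27 16:46:12"),
    (4,  2, 5771.0,  "btcusd", "2017-10-27 16:46:18"),
    (5,  2, 0.4274,  "iotusd", "2017-10-27 16:46:18"),
    (6,  2, 0.20026, "xrpusd", "2017-10-27 16:46:19"),
    (7,  1, 5773.1,  "btcusd", "2017-10-27 16:46:25"),
    (8,  1, 0.4274,  "iotusd", "2017-10-27 16:46:25"),
    (9,  1, 0.20026, "xrpusd", "2017-10-27 16:46:26"),
    (10, 2, 5773.1,  "btcusd", "2017-10-27 16:46:32"),
    (11, 2, 0.42741, "iotusd", "2017-10-27 16:46:32"),
    (12, 2, 0.20026, "xrpusd", "2017-10-27 16:46:33"),
]

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE crypto (id INTEGER PRIMARY KEY, market_id INTEGER,
                                     price REAL, symbol TEXT, created_at TEXT)""")
conn.executemany("INSERT INTO crypto VALUES (?, ?, ?, ?, ?)", rows)

# Derived table b: one row per (market_id, symbol) with the latest created_at.
# Joining back to crypto recovers the full row that carries that timestamp.
latest = conn.execute("""
    SELECT a.id, a.market_id, a.symbol, a.price
    FROM crypto a
    JOIN (SELECT market_id, symbol, MAX(created_at) AS mca
          FROM crypto
          GROUP BY market_id, symbol) b
      ON a.created_at = b.mca
     AND a.market_id  = b.market_id
     AND a.symbol     = b.symbol
    ORDER BY a.id
""").fetchall()

latest_ids = [row[0] for row in latest]
print(latest_ids)
```

If two rows for the same market and symbol ever shared the same created_at, this version would return both; joining on MAX(id) avoids that as long as ids increase over time.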
I tried these requests
SELECT * FROM CRYPTO as C3
JOIN (
SELECT MAX(id) as max
FROM CRYPTO as C1
GROUP BY symbol
) AS C2
ON C2.max = C3.id
SELECT M.id, M.name, R.symbol FROM MARKET AS M
JOIN (
SELECT DISTINCT C.symbol, C.market_id
FROM CRYPTO as C
) as R
ON M.id = R.market_id
But finally I did not find the good combination.

MySQL UNION DISTINCT - exclude

I have query like this:
SELECT cs_event.*, cs_file.name, cs_file.extension, cs_user.first_name, cs_user.last_name
FROM cs_event
LEFT JOIN cs_file ON cs_event.idfile = cs_file.idfile
LEFT JOIN cs_user ON cs_event.iduser = cs_user.iduser
WHERE type != 51
AND idportal = 1
UNION DISTINCT
SELECT cs_event.*, cs_file.name, cs_file.extension, cs_user.first_name, cs_user.last_name
FROM cs_event
LEFT JOIN cs_file ON cs_event.idfile = cs_file.idfile
LEFT JOIN cs_user ON cs_event.iduser = cs_user.iduser
WHERE shared_with_users LIKE '%i:2;%'
AND idportal = 1
ORDER BY add_date DESC
LIMIT 6
The problem is following:
A regular user can't see certain types of events (for now it is type 51), and he can see only things which are shared with him.
The shared_with_users column can be null or have a value. It has a value only for one type of event (type = 50); for other events it is null.
I need to perform the following:
The user can access all events except events with type 51, and if the event is of type 50, I need to check whether the event is shared with him (shared_with_users column) and collect that also. Is it possible to make this kind of query?
Try this
SELECT cs_event.*, cs_file.name, cs_file.extension, cs_user.first_name, cs_user.last_name
FROM cs_event
LEFT JOIN cs_file ON cs_event.idfile = cs_file.idfile
LEFT JOIN cs_user ON cs_event.iduser = cs_user.iduser
WHERE (type != 51 OR (type = 50 AND shared_with_users LIKE '%i:2;%'))
AND idportal = 1
ORDER BY add_date DESC
LIMIT 6
I think you can do this as a single query, with logic in the WHERE clause:
SELECT e.*, f.name, f.extension, u.first_name, u.last_name
FROM cs_event e LEFT JOIN
cs_file f
ON e.idfile = f.idfile LEFT JOIN
cs_user u
ON e.iduser = u.iduser
WHERE idportal = 1 AND
(type <> 51 OR shared_with_users LIKE '%i:2;%');
Some notes:
I don't think the LEFT JOINs are necessary. The WHERE clause may be turning them into inner joins anyway, but it is hard to tell without qualified column names.
I added table aliases so the query is easier to write and to read.
The logic for shared_with_users suggests that you have stored a list of values in a string. That is a bad choice.
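Here is a small sketch of why the parentheses around the OR matter when collapsing the UNION into one WHERE clause. It assumes SQLite (through Python's sqlite3) as a stand-in for MySQL, a cut-down cs_event table with a hypothetical idevent key, and made-up rows; SQLite's plain UNION removes duplicates like MySQL's UNION DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cs_event (idevent INTEGER PRIMARY KEY, type INTEGER,
                       idportal INTEGER, shared_with_users TEXT);
INSERT INTO cs_event VALUES
    (1, 10, 1, NULL),               -- ordinary event, portal 1
    (2, 51, 1, NULL),               -- hidden type
    (3, 50, 1, 'a:1:{i:0;i:2;}'),   -- shared with user 2
    (4, 50, 1, 'a:1:{i:0;i:3;}'),   -- shared with user 3
    (5, 10, 2, NULL);               -- ordinary event, other portal
""")

def ids(sql):
    """Run a query that returns idevent values and give them back sorted."""
    return sorted(row[0] for row in conn.execute(sql))

# The original two-branch query.
union_ids = ids("""
    SELECT idevent FROM cs_event WHERE type != 51 AND idportal = 1
    UNION
    SELECT idevent FROM cs_event
    WHERE shared_with_users LIKE '%i:2;%' AND idportal = 1
""")

# Single query, OR wrapped in parentheses: same rows as the UNION.
or_ids = ids("""
    SELECT idevent FROM cs_event
    WHERE idportal = 1
      AND (type <> 51 OR shared_with_users LIKE '%i:2;%')
""")

# Without parentheses AND binds tighter than OR, so the portal filter only
# applies to the LIKE branch and rows from other portals leak in.
unparenthesized_ids = ids("""
    SELECT idevent FROM cs_event
    WHERE type <> 51 OR shared_with_users LIKE '%i:2;%' AND idportal = 1
""")

print(union_ids, or_ids, unparenthesized_ids)
```

Event 5 belongs to portal 2 and only appears in the last, unparenthesized result.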

Select field based on value in same record

I want to SELECT a field based on a ID value.
Products
PRODUCT_ID Name
19 Chair
20 Table
Product_fields
ID PRODUCT_ID TYPE DESCRIPTION
1 19 C White
2 19 S Modern
3 20 C Black
4 20 S Classic
I need a result like:
Product Type_C Type_S
Chair White Modern
Table Black Classic
I am able to produce this using two LEFT JOINs on the product_fields table but this slows down the query too much. Is there a better way?
Slows down the query how much? What is acceptable?
If you really don't want to use joins (you must have one join), then use views or nested queries. But I don't think they will be any faster, though you can give it a try.
See views at sqlfiddle
select p.PRODUCT_ID, p.Name, f.CDescription, f.SDescription
from Products p
join(
SELECT PRODUCT_ID, Max( CDescription ) as CDescription,
Max( SDescription ) as SDescription
FROM(
select PRODUCT_ID,
case Type when 'C' then Description end as CDescription,
case Type when 'S' then Description end as SDescription
from Product_fields
) x
group by PRODUCT_ID
) f
on f.PRODUCT_ID = p.PRODUCT_ID;
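The same pivot can also be written with a single join and conditional aggregation directly in the outer query. This is a sketch on the sample rows from the question, assuming SQLite (through Python's sqlite3) as a stand-in for MySQL; whether it is faster than two LEFT JOINs on the real data is something only a test against that data can tell.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products       (PRODUCT_ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Product_fields (ID INTEGER PRIMARY KEY, PRODUCT_ID INTEGER,
                             TYPE TEXT, DESCRIPTION TEXT);
INSERT INTO Products VALUES (19, 'Chair'), (20, 'Table');
INSERT INTO Product_fields VALUES (1, 19, 'C', 'White'),
                                  (2, 19, 'S', 'Modern'),
                                  (3, 20, 'C', 'Black'),
                                  (4, 20, 'S', 'Classic');
""")

# One join to Product_fields; each MAX(CASE ...) picks the description for one
# TYPE and ignores the NULLs produced by rows of the other type.
pivot = conn.execute("""
    SELECT p.Name AS Product,
           MAX(CASE WHEN f.TYPE = 'C' THEN f.DESCRIPTION END) AS Type_C,
           MAX(CASE WHEN f.TYPE = 'S' THEN f.DESCRIPTION END) AS Type_S
    FROM Products p
    LEFT JOIN Product_fields f ON f.PRODUCT_ID = p.PRODUCT_ID
    GROUP BY p.PRODUCT_ID, p.Name
    ORDER BY p.PRODUCT_ID
""").fetchall()

print(pivot)
```

A product with no row for one of the types gets NULL in that column rather than being dropped, because of the LEFT JOIN.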
The complete statement is:
SELECT
NL.product_name,
PRD.product_sku AS product_sku,
CF.virtuemart_product_id AS virtuemart_product_id,
GROUP_CONCAT(distinct CFA.customsforall_value_name
ORDER BY CFA.customsforall_value_name ASC
separator ' | ' ) AS Name_exp_3,
ROUND((((prices.product_price * CALC.calc_value) / 100) + prices.product_price),
2) AS Prijs,
VMCF_L.custom_value AS latijn,
VMCF_T.custom_value AS THT,
VMCF_B.custom_value AS Batch
from j25_virtuemart_products AS PRD
LEFT join j25_virtuemart_product_custom_plg_customsforall AS CF ON CF.virtuemart_product_id = PRD.virtuemart_product_id
join j25_virtuemart_product_prices AS prices ON PRD.virtuemart_product_id = prices.virtuemart_product_id
join j25_virtuemart_calcs AS CALC ON prices.product_tax_id = CALC.virtuemart_calc_id
join j25_virtuemart_products_nl_nl AS NL ON NL.virtuemart_product_id = PRD.virtuemart_product_id
LEFT join j25_virtuemart_product_customfields AS VMCF ON VMCF.virtuemart_product_id = PRD.virtuemart_product_id
LEFT join j25_virtuemart_custom_plg_customsforall_values AS CFA ON CFA.customsforall_value_id = CF.customsforall_value_id
LEFT JOIN j25_virtuemart_product_customfields AS VMCF_L ON VMCF.virtuemart_product_id = VMCF_L.virtuemart_product_id AND VMCF_L.virtuemart_custom_id = 16
LEFT JOIN j25_virtuemart_product_customfields AS VMCF_T ON VMCF.virtuemart_product_id = VMCF_T.virtuemart_product_id AND VMCF_T.virtuemart_custom_id = 3
LEFT JOIN j25_virtuemart_product_customfields AS VMCF_B ON VMCF.virtuemart_product_id = VMCF_B.virtuemart_product_id AND VMCF_B.virtuemart_custom_id = 18
WHERE
PRD.product_sku like '02.%'
group by PRD.virtuemart_product_id
order by NL.product_name;
Where the three SELECT results named 'Latijn', 'THT', and 'Batch' are the ones which I compared earlier as the black/white and classic/modern values.
Hope this makes any sense.
As you can see this involves a Virtuemart installation, so I cannot fiddle about too much with the schema.
When I exclude the bottom 3 JOINs and their related fields, the query takes approx. 0.5 seconds. With the JOINs and fields included, the query takes almost 19 seconds.
I have created a view from this complete query which I query from my labeling application.
Thanks everyone! With your input I created:
select
NL.product_name AS product_name,
PRD.product_sku AS product_sku,
CF.virtuemart_product_id AS virtuemart_product_id,
group_concat(distinct CFA.customsforall_value_name
order by CFA.customsforall_value_name ASC
separator ' | ') AS Name_exp_3,
round((((prices.product_price * CALC.calc_value) / 100) + prices.product_price),
2) AS Prijs,
f.Latijn AS Latijn,
f.THT AS THT,
f.Batch AS Batch
from
(((((((j25_virtuemart_products PRD
left join j25_virtuemart_product_custom_plg_customsforall CF ON ((CF.virtuemart_product_id = PRD.virtuemart_product_id)))
join j25_virtuemart_product_prices prices ON ((PRD.virtuemart_product_id = prices.virtuemart_product_id)))
join j25_virtuemart_calcs CALC ON ((prices.product_tax_id = CALC.virtuemart_calc_id)))
join j25_virtuemart_products_nl_nl NL ON ((NL.virtuemart_product_id = PRD.virtuemart_product_id)))
left join j25_virtuemart_product_customfields VMCF ON ((VMCF.virtuemart_product_id = PRD.virtuemart_product_id)))
left join j25_virtuemart_custom_plg_customsforall_values CFA ON ((CFA.customsforall_value_id = CF.customsforall_value_id)))
left join vw_batch_Latijn_THT_grouped f ON ((f.virtuemart_product_id = PRD.virtuemart_product_id)))
where
(PRD.product_sku like '02.%')
group by PRD.virtuemart_product_id
order by NL.product_name
Which takes 1.4 seconds to execute, a whole lot faster than the 19 seconds I started with.