I need to aggregate and concatenate data from multiple rows into a single output row. I understand the tables are a problem in that there is no unique index, but I need to do this at the query level rather than the scripting level, and I can't touch the database structure.
Here we go:
table: characteristics
id code_a code_b
-------------------------------
2201 CHAU AIRS
2201 CHAU PELC
2201 PROX AUTO
2201 PROX HOP
table: characteristics_types
code description
-------------------------------
CHAU Heating System
PROX Nearby
table: characteristics_sub_types
code_a code description
-------------------------------
CHAU AIRS Forced Air
CHAU PELC Baseboard
PROX AUTO Highway
PROX HOP Hospital
Result required:
id Heating System Nearby
--------------------------------------------------------
2201 Forced Air, Baseboard Highway, Hospital
Not working:
SELECT id,
(case when C.code_a='CHAU' THEN GROUP_CONCAT(STC.description) ELSE NULL END) AS `Heating System`,
(case when C.code_a='PROX' THEN GROUP_CONCAT(STC.description) ELSE NULL END) AS Nearby
from characteristics C
inner join characteristics_types TC on C.code_a=TC.`code`
inner join characteristics_sub_types STC on C.code_a=STC.code_a and C.code_b=STC.`code`
GROUP BY C.id,C.code_a
I am getting the following results:
id Heating System Nearby
--------------------------------------------------------
2201 Forced Air, Baseboard NULL
2201 NULL Highway, Hospital
Any direction would be greatly appreciated!
GROUP_CONCAT accepts a DISTINCT keyword, which is useful in queries where the joins fan out rows. You can also use ORDER BY within GROUP_CONCAT if you need the results to be ordered; see the documentation. Using this, we can write your intended query as below:
SELECT
c.id,
GROUP_CONCAT(DISTINCT CASE WHEN c.code_a = 'CHAU' THEN cst.description END) AS `Heating System`,
GROUP_CONCAT(DISTINCT CASE WHEN c.code_a = 'PROX' THEN cst.description END) AS `Nearby`
FROM characteristics c
LEFT JOIN characteristics_sub_types cst
ON cst.code_a = c.code_a AND cst.code = c.code_b
GROUP BY c.id
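For anyone who wants to try the one-row-per-id result without a MySQL server, here is a minimal sketch using the sample rows from the question. It runs the SQL through Python's sqlite3 module, which is only a stand-in for MySQL (an assumption on my part; both provide GROUP_CONCAT, but separators and ordering options differ). The column aliases heating_system and nearby are my own names.

```python
import sqlite3

# SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE characteristics (id INTEGER, code_a TEXT, code_b TEXT);
CREATE TABLE characteristics_sub_types (code_a TEXT, code TEXT, description TEXT);
INSERT INTO characteristics VALUES
  (2201, 'CHAU', 'AIRS'), (2201, 'CHAU', 'PELC'),
  (2201, 'PROX', 'AUTO'), (2201, 'PROX', 'HOP');
INSERT INTO characteristics_sub_types VALUES
  ('CHAU', 'AIRS', 'Forced Air'), ('CHAU', 'PELC', 'Baseboard'),
  ('PROX', 'AUTO', 'Highway'),    ('PROX', 'HOP',  'Hospital');
""")

# The CASE sits inside GROUP_CONCAT and the grouping is by id only.
# Rows of the other type evaluate to NULL, which GROUP_CONCAT skips.
pivot_sql = """
SELECT c.id,
       GROUP_CONCAT(CASE WHEN c.code_a = 'CHAU' THEN cst.description END) AS heating_system,
       GROUP_CONCAT(CASE WHEN c.code_a = 'PROX' THEN cst.description END) AS nearby
FROM characteristics c
JOIN characteristics_sub_types cst
  ON cst.code_a = c.code_a AND cst.code = c.code_b
GROUP BY c.id
"""
rows = conn.execute(pivot_sql).fetchall()
for row in rows:
    print(row)
```

The order of the values inside each concatenated string is not guaranteed unless you ask for it, which is what ORDER BY inside GROUP_CONCAT is for in MySQL.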
I would write separate subqueries, each one performing the GROUP_CONCAT that you need, and join them together. Individually, I wrote them like this:
SELECT c.id, GROUP_CONCAT(cst.description) AS 'Heating System'
FROM characteristics c
JOIN characteristics_sub_types cst ON cst.code_a = c.code_a AND cst.code = c.code_b AND c.code_a = 'CHAU'
GROUP BY c.id;
And the join like this:
SELECT t1.id, t1.`Heating System`, t2.`Nearby`
FROM(
SELECT c.id, GROUP_CONCAT(cst.description) AS 'Heating System'
FROM characteristics c
JOIN characteristics_sub_types cst ON cst.code_a = c.code_a AND cst.code = c.code_b AND c.code_a = 'CHAU'
GROUP BY c.id) t1
JOIN(
SELECT c.id, GROUP_CONCAT(cst.description) AS 'Nearby'
FROM characteristics c
JOIN characteristics_sub_types cst ON cst.code_a = c.code_a AND cst.code = c.code_b AND c.code_a = 'PROX'
GROUP BY c.id) t2 ON t1.id = t2.id;
Here is a working SQL Fiddle.
You can try this way:
SELECT C.id,
GROUP_CONCAT(STC.description_heating) AS `Heating System`,
GROUP_CONCAT(STC.description_nearby) AS Nearby
from characteristics C
inner join characteristics_types TC on C.code_a=TC.`code`
inner join (
SELECT code_a, code,
CASE WHEN code_a = 'CHAU' THEN description ELSE null END as description_heating,
CASE WHEN code_a = 'PROX' THEN description ELSE null END as description_nearby
FROM characteristics_sub_types
) as STC on C.code_a=STC.code_a and C.code_b=STC.`code`
GROUP BY C.id
Yes, this is an assignment. So the task was to output two columns of 'first name' and 'last name' with conditions:
-A ∪ (B ∩ -C ∩ -(A ∩ -(B ∪ D)))
A: All consumers that didn't shop on Monday and Friday
(time_by_day.the_day)
B: All consumers who bought 'Non-Consumable'
(product_class.product_family)
C: All consumers who bought more than 10 items
(sales_fact_1997.unit_sales) at one time (sales_fact_1997.time_id)
D: Female consumers from Canada (consumer.gender, consumer.country)
This is what I got so far
SELECT
c.fname,
c.lname
FROM
customer AS c
INNER JOIN sales_fact_1997 AS s ON c.customer_id = s.customer_id
INNER JOIN time_by_day AS t ON s.time_id = t.time_id
INNER JOIN product AS p ON s.product_id = p.product_id
INNER JOIN product_class AS pc ON p.product_class_id = pc.product_class_id
Where
NOT t.the_day in ('Monday', 'Friday') OR
(
pc.product_family = 'Non-Consumable' AND
NOT SUM(s.unit_sales) > 10 AND
NOT (
t.the_day in ('Monday', 'Friday') AND
NOT (
pc.product_family = 'Non-Consumable' OR
(c.country = 'Canada' AND c.gender = 'F')
)
)
)
GROUP BY concat(c.customer_id, s.time_id)
That ended up with an error
#1111 - Invalid use of group function
But I don't know which part of the code is wrong. I'm pretty sure that it's probably the WHERE part. But I don't know what I did wrong.
Condition C is where I'm really struggling. I manage just fine making a query of C
SELECT
t.time_id,
c.customer_id,
c.fullname,
round(SUM(s.unit_sales),0) as tot
FROM
customer as c
INNER JOIN sales_fact_1997 as s ON c.customer_id = s.customer_id
INNER JOIN time_by_day as t on s.time_id=t.time_id
GROUP BY concat(c.customer_id, s.time_id)
ORDER BY c.customer_id, t.time_id
But trying to incorporate it into the main code is hard for me.
Reading online I assume that I should probably use HAVING instead of WHERE.
I would really appreciate it if someone can point me in the right direction.
This is the database that I used.
C: All consumers who bought more than 10 items
(sales_fact_1997.unit_sales) at one time (sales_fact_1997.time_id)
You should use COUNT not SUM.
SELECT time_id,
count(*)
FROM sales_fact_1997
GROUP BY time_id
HAVING COUNT(*) > 10;
The count(*) column in the SELECT list is not needed; I left it in just to show the results.
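On the #1111 error itself: an aggregate such as SUM() cannot appear in WHERE, because WHERE is evaluated before the rows are grouped; HAVING is evaluated after grouping, so that is where the aggregate test belongs. Below is a small sketch of that, run through Python's sqlite3 module as a stand-in for MySQL. The four sample rows are invented for illustration, and the output shows COUNT(*) and SUM(unit_sales) side by side, since which one matches "more than 10 items" depends on whether unit_sales holds quantities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_fact_1997 (customer_id INTEGER, time_id INTEGER, unit_sales REAL);
-- Hypothetical rows, not taken from the real database.
INSERT INTO sales_fact_1997 VALUES
  (1, 100, 6.0), (1, 100, 7.0),   -- customer 1: 2 rows, 13 units at time 100
  (2, 100, 3.0), (2, 101, 4.0);   -- customer 2: never above 10 units at one time
""")

# The aggregate condition goes in HAVING, after GROUP BY, not in WHERE.
having_sql = """
SELECT customer_id, time_id, COUNT(*) AS line_count, SUM(unit_sales) AS units
FROM sales_fact_1997
GROUP BY customer_id, time_id
HAVING SUM(unit_sales) > 10
"""
big_baskets = conn.execute(having_sql).fetchall()
print(big_baskets)
```

Here the only surviving group has 2 rows but 13 units, so the COUNT and SUM versions of condition C would select different customers on data like this.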
Can you check whether this helps:
SELECT c.lname,
c.fname
FROM customer c
INNER JOIN
(
SELECT time_id,customer_id,product_id
FROM sales_fact_1997
GROUP BY time_id,customer_id,product_id
HAVING COUNT(*) > 10
) as s on c.customer_id=s.customer_id
INNER JOIN
(
SELECT time_id,the_day
FROM time_by_day
WHERE the_day
NOT IN ('Monday','Friday')
) as t on s.time_id=t.time_id
INNER JOIN
(
SELECT product_family,product_id
FROM product_class
INNER JOIN product
on product_class.product_class_id=product.product_class_id
WHERE product_family='Non-Consumable'
) pc on s.product_id=pc.product_id
where c.country='Canada' and c.gender ='F' ;
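Separately from the query above, the set expression in the assignment can be reduced with ordinary set algebra before any SQL is written. -(A ∩ -(B ∪ D)) is -A ∪ B ∪ D, and intersecting that with B just gives B, so the whole expression collapses to -A ∪ (B ∩ -C) and condition D drops out. The sketch below checks that claim by brute force over all 16 membership combinations; the function names are my own.

```python
from itertools import product


def original(a, b, c, d):
    # -A u (B n -C n -(A n -(B u D))), with each set as a membership flag
    return (not a) or (b and (not c) and not (a and not (b or d)))


def simplified(a, b, c, d):
    # -A u (B n -C); D no longer appears
    return (not a) or (b and not c)


# Compare the two forms for every combination of memberships in A, B, C, D.
equivalent = all(
    original(*bits) == simplified(*bits)
    for bits in product([False, True], repeat=4)
)
print(equivalent)
```

If the two forms agree, the query only needs "customers not in A" combined with "customers in B but not in C", which is easier to express with subqueries than the full nested form.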
I am pretty new to MySQL and to writing queries in general.
The most important things in this query are cycle_count_id and location; all information should be based on these two factors.
Now, how can I improve this query:
SELECT c2.code_cycle_count, c2.location, c2.last_cyclecount, c2.second_recent_cyclecount ,i.uid,c2.po_number, i.cost, i.uid
, (SELECT
CASE WHEN MAX(cc.updated_at) = MAX(ccc.updated_at) THEN u.username ELSE NULL END
from oms_live_ir.wms_cycle_count ccc
INNER JOIN oms_live_ir.ims_user u ON u.id_user=ccc.fk_user
WHERE ccc.fk_location = cc.fk_location AND ccc.id_cycle_count=cc.id_cycle_count) AS cycle_count_user
, (SELECT COUNT(cci.fk_cycle_count_item_status)
from oms_live_ir.wms_cycle_count_item cci
WHERE cci.fk_cycle_count_item_status = 1 AND cc.id_cycle_count=cci.fk_cycle_count) AS location_updated
, (SELECT COUNT(cci.fk_cycle_count_item_status)
from oms_live_ir.wms_cycle_count_item cci
WHERE cci.fk_cycle_count_item_status = 2 AND cc.id_cycle_count=cci.fk_cycle_count) AS status_updated
, (SELECT COUNT(cci.fk_cycle_count_item_status)
from oms_live_ir.wms_cycle_count_item cci
WHERE cci.fk_cycle_count_item_status = 4 AND cc.id_cycle_count=cci.fk_cycle_count) AS found
, (SELECT COUNT(cci.fk_cycle_count_item_status)
from oms_live_ir.wms_cycle_count_item cci
WHERE cci.fk_cycle_count_item_status = 6 AND cc.id_cycle_count=cci.fk_cycle_count) AS lost
, c5.id_pick, c5.date_pick
FROM oms_live_ir.wms_cycle_count cc
INNER JOIN oms_live_ir.wms_inventory i on i.fk_location=cc.fk_location
INNER join
(SELECT l.description AS location
,MAX(cccc.id_cycle_count) AS code_cycle_count
,MAX(cccc.updated_at) AS last_cyclecount, i.po_number
,(SELECT MAX(c1.updated_at) FROM oms_live_ir.wms_cycle_count c1 WHERE c1.updated_at<>MAX(cccc.updated_at)
AND c1.fk_location=cccc.fk_location) as second_recent_cyclecount
FROM oms_live_ir.wms_cycle_count cccc
INNER JOIN oms_live_ir.wms_inventory i ON i.fk_location=cccc.fk_location
INNER JOIN oms_live_ir.ims_location l ON l.id_location=cccc.fk_location
WHERE year(cccc.updated_at)>=2018 AND month(cccc.updated_at)>=1
AND LEFT(i.po_number,2) LIKE 'M1%' or LEFT(i.po_number,2) LIKE 'S1%'
GROUP by cccc.fk_location
) c2 on c2.code_cycle_count= cc.id_cycle_count
INNER JOIN
(SELECT cc5.fk_location,
(SELECT CASE WHEN ih2.updated_at=MIN(ih2.updated_at) THEN ih2.sales_order_item_id ELSE NULL END
FROM oms_live_ir.wms_cycle_count cc2
INNER JOIN oms_live_ir.wms_inventory_history ih2 ON ih2.fk_location=cc2.fk_location
WHERE ih2.sales_order_item_id=ih5.sales_order_item_id AND ih2.fk_location=cc5.fk_location AND cc2.id_cycle_count=cc5.id_cycle_count
AND year(ih2.updated_at)>=2018 AND MONTH(ih2.updated_at)>=1 AND ih2.sales_order_item_id IS NOT NULL
GROUP BY ih2.id_inventory) id_pick
,(SELECT CASE WHEN ih2.updated_at=MIN(ih2.updated_at) THEN ih5.updated_at ELSE NULL END
FROM oms_live_ir.wms_cycle_count cc2
INNER JOIN oms_live_ir.wms_inventory_history ih2 ON ih2.fk_location=cc2.fk_location
WHERE ih2.sales_order_item_id=ih5.sales_order_item_id AND ih2.fk_location=cc5.fk_location AND cc2.id_cycle_count=cc5.id_cycle_count
AND year(ih2.updated_at)>=2018 AND MONTH(ih2.updated_at)>=1 AND ih2.sales_order_item_id
GROUP BY ih2.id_inventory) date_pick
FROM oms_live_ir.wms_cycle_count cc5
INNER JOIN oms_live_ir.wms_inventory_history ih5 ON ih5.fk_location=cc5.fk_location
WHERE year(ih5.updated_at)>=2018 AND MONTH(ih5.updated_at)>=1 AND ih5.sales_order_item_id IS NOT NULL) c5 ON c5.fk_location=cc.fk_location
WHERE year(cc.updated_at)>=2017 AND month(cc.updated_at)>=1
AND LEFT(i.po_number,2) LIKE 'M1%' or LEFT(i.po_number,2) LIKE 'S1%'
group by c2.code_cycle_count, c2.location;
(Note: I do not have access to create temporary tables or create index for tables.)
Thanks in advance.
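One possible direction for the four status counts (location_updated, status_updated, found, lost): they all read the same table with the same correlation, so they could be computed in a single grouped pass with conditional aggregation and then joined once on the cycle count id. The sketch below shows only that piece, run through Python's sqlite3 module as a stand-in for MySQL, with invented rows. Whether it is actually faster on the real data is an assumption that would need checking with EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wms_cycle_count_item (fk_cycle_count INTEGER, fk_cycle_count_item_status INTEGER);
-- Hypothetical rows for two cycle counts.
INSERT INTO wms_cycle_count_item VALUES
  (10, 1), (10, 1), (10, 2), (10, 4), (10, 6), (10, 6),
  (11, 4);
""")

# One scan of the item table yields all four counters per cycle count.
status_sql = """
SELECT fk_cycle_count,
       SUM(CASE WHEN fk_cycle_count_item_status = 1 THEN 1 ELSE 0 END) AS location_updated,
       SUM(CASE WHEN fk_cycle_count_item_status = 2 THEN 1 ELSE 0 END) AS status_updated,
       SUM(CASE WHEN fk_cycle_count_item_status = 4 THEN 1 ELSE 0 END) AS found,
       SUM(CASE WHEN fk_cycle_count_item_status = 6 THEN 1 ELSE 0 END) AS lost
FROM wms_cycle_count_item
GROUP BY fk_cycle_count
ORDER BY fk_cycle_count
"""
status_counts = conn.execute(status_sql).fetchall()
print(status_counts)
```

In the full query this would become a derived table joined on cc.id_cycle_count, using a LEFT JOIN if cycle counts without any items should still appear.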
I have following tables
contacts
contact_id | contact_slug | contact_first_name | contact_email | contact_date_added | company_id | contact_is_active | contact_subscribed | contact_last_name | contact_company | contact_twitter
contact_campaigns
contact_campaign_id | contact_id | contact_campaign_created | company_id | contact_campaign_sent
bundle_feedback
bundle_feedback_id | bundle_id | contact_id | company_id | bundle_feedback_rating | bundle_feedback_favorite_track_id | bundle_feedback_supporting | campaign_id
bundles
bundle_id | bundle_name | bundle_created | company_id | bundle_is_active
tracks
track_id | company_id | track_title
I wrote this query, but it runs slowly. How can I optimize it to make it faster?
SELECT SQL_CALC_FOUND_ROWS c.contact_id,
c.contact_first_name,
c.contact_last_name,
c.contact_email,
c.contact_date_added,
c.contact_company,
c.contact_twitter,
concat(c.contact_first_name," ", c.contact_last_name) AS fullname,
c.contact_subscribed,
ifnull(icc.sendCampaignsCount, 0) AS sendCampaignsCount,
ifnull(round((ibf.countfeedbacks/sendCampaignsCount * 100),2), 0) AS percentFeedback,
ifnull(ibf.bundle_feedback_supporting, 0) AS feedbackSupporting
FROM contacts AS c
LEFT JOIN
(SELECT c.contact_id,
count(cc.contact_campaign_id) AS sendCampaignsCount
FROM contacts AS c
LEFT JOIN contact_campaigns AS cc ON cc.contact_id = c.contact_id
WHERE c.company_id = '876'
AND c.contact_is_active = '1'
AND cc.contact_campaign_sent = '1'
GROUP BY c.contact_id) AS icc ON icc.contact_id = c.contact_id
LEFT JOIN
(SELECT bf.contact_id,
count(*) AS countfeedbacks,
bf.bundle_feedback_supporting
FROM bundle_feedback bf
JOIN bundles b
JOIN contacts c
LEFT JOIN tracks t ON bf.bundle_feedback_favorite_track_id = t.track_id
WHERE bf.bundle_id = b.bundle_id
AND bf.contact_id = c.contact_id
AND bf.company_id='876'
GROUP BY bf.contact_id) AS ibf ON ibf.contact_id = c.contact_id
WHERE c.company_id = '876'
AND contact_is_active = '1'
ORDER BY percentFeedback DESC LIMIT 0, 25;
I have made two improvements:
1) Removed the contacts table from the subqueries, where it was being joined unnecessarily twice, and moved its conditions to the final WHERE clause.
2) Removed SQL_CALC_FOUND_ROWS, as per:
Which is fastest? SELECT SQL_CALC_FOUND_ROWS FROM `table`, or SELECT COUNT(*)
SELECT c.contact_id,
c.contact_first_name,
c.contact_last_name,
c.contact_email,
c.contact_date_added,
c.contact_company,
c.contact_twitter,
concat(c.contact_first_name," ", c.contact_last_name) AS fullname,
c.contact_subscribed,
ifnull(icc.sendCampaignsCount, 0) AS sendCampaignsCount,
ifnull(round((ibf.countfeedbacks/sendCampaignsCount * 100),2), 0) AS percentFeedback,
ifnull(ibf.bundle_feedback_supporting, 0) AS feedbackSupporting
FROM contacts AS c
LEFT JOIN
(SELECT cc.contact_id,
count(cc.contact_campaign_id) AS sendCampaignsCount
FROM contact_campaigns AS cc
WHERE cc.contact_campaign_sent = '1'
GROUP BY cc.contact_id) AS icc ON icc.contact_id = c.contact_id
LEFT JOIN
(SELECT bf.contact_id,
count(*) AS countfeedbacks,
bf.bundle_feedback_supporting
FROM bundle_feedback bf
JOIN bundles b
LEFT JOIN tracks t ON bf.bundle_feedback_favorite_track_id = t.track_id
WHERE bf.bundle_id = b.bundle_id
GROUP BY bf.contact_id) AS ibf ON ibf.contact_id = c.contact_id
WHERE c.company_id = '876' and c.contact_is_active = '1'
First, you have not said which indexes you have, so it is hard to tell what is already optimized. That said, I would ensure you have at least the following composite / covering indexes.
table index
contacts ( company_id, contact_is_active )
contact_campaigns ( contact_id, contact_campaign_sent )
bundle_feedback ( contact_id, bundle_feedback_supporting )
Next, as noted in the other answer, unless you really need to know how many rows qualified, remove the "SQL_CALC_FOUND_ROWS".
In your first left-join (icc), you do a left-join on contact_campaigns (cc), but then put "AND cc.contact_campaign_sent = '1'" into your WHERE clause, which turns that into an INNER JOIN. At the outer query level, contacts without a sent campaign would have no matching record and thus NULL for your percentage calculations.
In your second left-join (ibf), you are joining to the tracks table but not using anything from it. You are also joining to the bundles table without using anything from there either, and if there are multiple matching rows in the bundles and tracks tables, that would produce a Cartesian result and possibly overstate your "CountFeedbacks" value. You also do not need the contacts table, as you are not doing anything else with it and the feedback table already has the contact ID you are querying for. Since that subquery is only grouped by contact_id, your "bf.bundle_feedback_supporting" is otherwise wasted. If you want counts of feedback, just count from that table per contact ID and remove the rest. (Also, the join conditions should be in "ON" clauses instead of the WHERE clause, for consistency.)
Also, for your supporting feedback, the data type and value are unclear, so I assumed a Yes or No flag and used a SUM() based on how many are supporting. So, a given contact may have 100 records of which only 37 are supporting. This gives you 1 record for the contact with BOTH values, 100 and 37 respectively, rather than a single value taken from whichever entry the GROUP BY happens to find first for the contact.
I would try to summarize your query as below:
SELECT
c.contact_id,
c.contact_first_name,
c.contact_last_name,
c.contact_email,
c.contact_date_added,
c.contact_company,
c.contact_twitter,
concat(c.contact_first_name," ", c.contact_last_name) AS fullname,
c.contact_subscribed,
ifnull(icc.sendCampaignsCount, 0) AS sendCampaignsCount,
ifnull(round((ibf.countfeedbacks / icc.sendCampaignsCount * 100),2), 0) AS percentFeedback,
ifnull(ibf.SupportCount, 0) AS feedbackSupporting
FROM
contacts AS c
LEFT JOIN
( SELECT
c.contact_id,
count(*) AS sendCampaignsCount
FROM
contacts AS c
JOIN contact_campaigns AS cc
ON c.contact_id = cc.contact_id
AND cc.contact_campaign_sent = '1'
WHERE
c.company_id = '876'
AND c.contact_is_active = '1'
GROUP BY
c.contact_id) AS icc
ON c.contact_id = icc.contact_id
LEFT JOIN
( SELECT
bf.contact_id,
count(*) AS countfeedbacks,
SUM( case when bf.bundle_feedback_supporting = 'Y'
then 1 else 0 end ) as SupportCount
FROM
contacts AS c
JOIN bundle_feedback bf
ON c.contact_id = bf.contact_id
WHERE
c.company_id = '876'
AND c.contact_is_active = '1'
GROUP BY
bf.contact_id) AS ibf
ON c.contact_id = ibf.contact_id
WHERE
c.company_id = '876'
AND c.contact_is_active = '1'
ORDER BY
percentFeedback DESC LIMIT 0, 25;
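To make the LEFT JOIN point concrete, here is a small sketch, run through Python's sqlite3 module as a stand-in for MySQL, with invented rows. It compares filtering the right-hand table in WHERE against filtering it in the ON clause. Contact 2 has no sent campaign, so it disappears in the first form and survives with a count of 0 in the second.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (contact_id INTEGER);
CREATE TABLE contact_campaigns (contact_id INTEGER, contact_campaign_sent INTEGER);
-- Hypothetical rows: contact 2 has a campaign, but it was never sent.
INSERT INTO contacts VALUES (1), (2);
INSERT INTO contact_campaigns VALUES (1, 1), (1, 1), (2, 0);
""")

# Filter on the right-hand table in WHERE: behaves like an INNER JOIN.
filter_in_where = conn.execute("""
SELECT c.contact_id, COUNT(cc.contact_id)
FROM contacts c
LEFT JOIN contact_campaigns cc ON cc.contact_id = c.contact_id
WHERE cc.contact_campaign_sent = 1
GROUP BY c.contact_id
ORDER BY c.contact_id
""").fetchall()

# Same filter in the ON clause: unmatched contacts are kept with a count of 0.
filter_in_on = conn.execute("""
SELECT c.contact_id, COUNT(cc.contact_id)
FROM contacts c
LEFT JOIN contact_campaigns cc
  ON cc.contact_id = c.contact_id AND cc.contact_campaign_sent = 1
GROUP BY c.contact_id
ORDER BY c.contact_id
""").fetchall()

print(filter_in_where)
print(filter_in_on)
```

Either form can be right; the point is to choose deliberately whether contacts with no sent campaigns should appear in the subquery result at all.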
I have 3 tables: one with company details, one with officer details, and one, Company_Officer, that connects the two by ID so I can tell which officer works for which company. An officer can work for multiple companies, and a company can have multiple officers.
I am trying to create a query that gives me the ID of the company an officer works for (company_Id), the officer's name, and his role. The company must have company_index set to FTSE 100, the officer's officer_resigned status must be 0, and he must work for more than 1 company.
Something like:
Company_ID|Company_Name|Officer_Name|Officer_Role
--------------------------------------------------
1 | Apple PLC |Millis, John|Director
1 | Apple PLC |DLAMINI, Bob|Secretary
2 | Google PLC |Millis, John|Secretary
Company_Details:
Officer_Details:
Company_Officer:
I have started fiddling with SQL, but it does not make much sense to me when it comes to relational databases. I understand that I need to use a join. Is it at all possible to achieve this with one query?
Another SQL query, with the extra constraint of getting only those officers who work for more than 1 company:
SELECT cd.company_id,
cd.company_name,
od.officer_name,
co.officer_role
FROM COMPANY_DETAILS cd
inner join COMPANY_OFFICER co
ON cd.company_id = co.company_id
inner join OFFICER_DETAILS od
ON co.officer_id = od.officer_id
WHERE cd.company_index = 'FTSE 100' AND
od.officer_resigned = '0' AND
co.officer_id IN
( SELECT officer_id
FROM COMPANY_OFFICER
GROUP BY officer_id
HAVING Count( DISTINCT company_id ) > 1
);
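The inner subquery is the part doing the "more than 1 company" work, so here it is in isolation as a small sketch, run through Python's sqlite3 module as a stand-in for MySQL. The rows are invented: officer 7 sits on two companies, while officer 8 holds two roles at a single company, which is why the count uses DISTINCT company_id rather than COUNT(*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company_officer (company_id INTEGER, officer_id INTEGER, officer_role TEXT);
-- Hypothetical rows.
INSERT INTO company_officer VALUES
  (1, 7, 'Director'), (2, 7, 'Secretary'),   -- officer 7: two companies
  (1, 8, 'Secretary'), (1, 8, 'Director');   -- officer 8: two roles, one company
""")

# Keep only officers linked to more than one distinct company.
multi_company = conn.execute("""
SELECT officer_id
FROM company_officer
GROUP BY officer_id
HAVING COUNT(DISTINCT company_id) > 1
""").fetchall()
print(multi_company)
```

Note that this counts every company an officer is linked to. If "more than 1 company" should only count FTSE 100 companies, the subquery would also need a join to the company details table with the same filter.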
SELECT
CD.company_id,
CD.company_name,
OD.officer_name,
CO.officer_role
FROM
company_details CD
INNER JOIN company_officer CO
ON CD.company_id = CO.company_id
INNER JOIN officer_details OD
ON CO.officer_id = OD.officer_id
WHERE CD.company_index='FTSE 100' AND
OD.officer_resigned='0';
Do you even need to join?
SELECT DISTINCT c.Company_ID, c.Company_Name, o.Officer_Name, co.Officer_Role
FROM Company_Details c, Officer_Details o, Company_Officer co
WHERE Company_Index = 'FTSE 100' AND Officer_Resigned = 0 AND co.Officer_ID = o.Officer_ID AND co.Company_ID = c.Company_ID
Simply use inner joins between the 3 tables:
select
Company_Details.Company_ID
, Company_Details.Company_Name
, Officer_Details.Officer_Name
, Company_Officer.Officer_Role
from Company_Details
INNER JOIN Company_Officer on Company_Officer.Company_ID = Company_Details.Company_ID
INNER JOIN Officer_Details on Officer_Details.Officer_ID = Company_Officer.Officer_ID;
I am retrieving the result from the above 4 tables using the following query:
SELECT
(SELECT SUM(CASE when c.Training_Id=1 then 1 else 0 end)
FROM courses c
INNER JOIN enrolled_students es
ON c.Course_Id = es.Course_Id
) STEM,
(SELECT SUM(CASE when c.Training_Id=2 then 1 else 0 end)
FROM courses c
INNER JOIN enrolled_students es ON c.Course_Id = es.Course_Id
) MA,
c.* FROM campus c;
The problem with this query is that two (2) students are in STEM and one (1) student is in MA for Campus_Id 3, but the query repeats those counts for every campus. If a campus has no students, it should show '0' (zero).
You need to filter your subselects by Campus_Id. But first you have to use distinct table aliases. Change your last line to ca.* FROM campus ca. Then you can use a where clause in your subselects (WHERE c.Campus_Id = ca.Campus_Id).
SELECT
(SELECT SUM(CASE when c.Training_Id=1 then 1 else 0 end)
FROM courses c
INNER JOIN enrolled_students es
ON c.Course_Id = es.Course_Id
WHERE c.Campus_Id = ca.Campus_Id -- line added
) STEM,
(SELECT SUM(CASE when c.Training_Id=2 then 1 else 0 end)
FROM courses c
INNER JOIN enrolled_students es ON c.Course_Id = es.Course_Id
WHERE c.Campus_Id = ca.Campus_Id -- line added
) MA,
ca.* FROM campus ca; -- line changed
This should solve your problem.
To improve the performance you can also filter your subselects by Training_Id. In the first subselect you only need the rows with Training_Id=1. So you can change your where clause to:
WHERE c.Campus_Id = ca.Campus_Id
AND c.Training_Id = 1
Doing that you can also use COUNT instead of SUM. So your subselect would look like:
SELECT COUNT(1)
FROM courses c
INNER JOIN enrolled_students es ON c.Course_Id = es.Course_Id
WHERE c.Campus_Id = ca.Campus_Id AND c.Training_Id = 1
To prevent code duplication (your subselects are almost equal) you can join all needed tables and group by Campus_Id:
select
-- count enrolled rows only, so campuses or courses without students give 0
COUNT(CASE WHEN co.Training_Id = 1 THEN es.Course_Id END) STEM,
COUNT(CASE WHEN co.Training_Id = 2 THEN es.Course_Id END) MA,
ca.Campus_Id
from campus ca
left join courses co on co.Campus_Id = ca.Campus_Id
left join enrolled_students es on es.Course_Id = co.Course_Id
group by ca.Campus_Id
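A quick way to check the grouped version against the requirement that a campus without students shows 0 is the sketch below, run through Python's sqlite3 module as a stand-in for MySQL. The table contents are invented, and the Student_Id column is my assumption, since the real table definitions are not shown. Campus 3 has two STEM students and one MA student; campus 1 has a course but nobody enrolled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campus (Campus_Id INTEGER);
CREATE TABLE courses (Course_Id INTEGER, Campus_Id INTEGER, Training_Id INTEGER);
CREATE TABLE enrolled_students (Student_Id INTEGER, Course_Id INTEGER);  -- Student_Id is assumed
INSERT INTO campus VALUES (1), (3);
INSERT INTO courses VALUES (10, 3, 1), (11, 3, 2), (12, 1, 1);
INSERT INTO enrolled_students VALUES (100, 10), (101, 10), (102, 11);    -- course 12 is empty
""")

# COUNT ignores NULLs, so courses without enrolments and campuses
# without courses both contribute 0 instead of dropping the campus.
campus_counts = conn.execute("""
SELECT ca.Campus_Id,
       COUNT(CASE WHEN co.Training_Id = 1 THEN es.Course_Id END) AS STEM,
       COUNT(CASE WHEN co.Training_Id = 2 THEN es.Course_Id END) AS MA
FROM campus ca
LEFT JOIN courses co ON co.Campus_Id = ca.Campus_Id
LEFT JOIN enrolled_students es ON es.Course_Id = co.Course_Id
GROUP BY ca.Campus_Id
ORDER BY ca.Campus_Id
""").fetchall()
print(campus_counts)
```

The counted column comes from enrolled_students on purpose: counting a courses column instead would report 1 for a course that exists but has no students.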