I have two tables that I would like to join in MySQL, and I'm looking for the most efficient way to do this. Here's the problem:
I want to count the number of records per customer in each table, then join the results (on customer) to produce a summary table. Note that all customers must be returned, even if a customer appears in only one of the tables.
TABLE A
Customer
--------
1
1
4
4
5
6
and
TABLE B
Customer
--------
4
5
5
5
6
6
7
7
7
into a summary table
SUMMARY
Customer CountA CountB
-----------------------
1 2 0
4 2 1
5 1 3
6 1 2
7 0 3
Any ideas on how to do something like this?
SELECT customer,SUM(source = 'a') cnta, SUM(source = 'b') cntb FROM
(
SELECT 'a' source,customer FROM customer_a
UNION ALL
SELECT 'b',customer FROM customer_b
) n
GROUP BY customer;
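As a quick way to check the UNION ALL plus conditional SUM approach against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite instead of MySQL (both evaluate a comparison such as source = 'a' to 1 or 0), and the helper name summarize is made up for illustration.

```python
import sqlite3

# Sample rows copied from TABLE A and TABLE B in the question.
TABLE_A = [1, 1, 4, 4, 5, 6]
TABLE_B = [4, 5, 5, 5, 6, 6, 7, 7, 7]


def summarize(rows_a, rows_b):
    """Return (customer, cnta, cntb) for every customer found in either table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer_a (customer INTEGER)")
    conn.execute("CREATE TABLE customer_b (customer INTEGER)")
    conn.executemany("INSERT INTO customer_a VALUES (?)", [(c,) for c in rows_a])
    conn.executemany("INSERT INTO customer_b VALUES (?)", [(c,) for c in rows_b])
    result = conn.execute(
        """
        SELECT customer,
               SUM(source = 'a') AS cnta,  -- comparison yields 1 or 0
               SUM(source = 'b') AS cntb
        FROM (
            SELECT 'a' AS source, customer FROM customer_a
            UNION ALL
            SELECT 'b', customer FROM customer_b
        ) AS n
        GROUP BY customer
        ORDER BY customer
        """
    ).fetchall()
    conn.close()
    return result


summary = summarize(TABLE_A, TABLE_B)
for row in summary:
    print(row)
```

Because the two tables are stacked before grouping, a customer that appears in only one table still gets a row, with 0 in the other count.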
Related
Imagine I have the following tables:
Numbers PK
1
2
3
4
5
6
7
8
9
10
Numbers FK 1  Numbers FK 2
1             1
2             1
3             1
4             1
5             1
6             1
7             1
8             1
9             1
10            1
8             8
10            4
7             3
4             9
1             6
3             9
4             6
5             6
"Numbers PK" holds the primary key, and "Numbers FK 1" and "Numbers FK 2" are related to each other and are foreign keys to "Numbers PK".
I am trying to write a query that selects the number(s) from "Numbers FK 2" that are related to all the numbers in "Numbers PK".
As you can see, in this example the solution would be 1, as 1 is related to 1-10 in "Numbers FK 1" and "Numbers FK 2".
I have been trying to solve this for some days and I don't know how to do it. I appreciate the help. Thanks
We use dense_rank() to count the Numbers_PK in case they're not consecutive. Then we left join, group by Numbers_FK_2, and keep the groups whose count(distinct Numbers_PK) equals the highest rank in the whole table.
with t3 as (
    select Numbers_PK
          ,dense_rank() over(order by Numbers_PK) as dns_rnk
    from t
)
select Numbers_FK_2
from t3 left join t2 on t2.Numbers_FK_1 = t3.Numbers_PK
group by Numbers_FK_2
having count(distinct Numbers_PK) = (select max(dns_rnk) from t3)
Numbers_FK_2
1
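The same "related to all of them" test (often called relational division) can also be written with plain counts. Below is a minimal sketch, assuming SQLite via Python's sqlite3 rather than the asker's database; the function name related_to_all and the extra pair (1, 99) are hypothetical, added only to show that a value related to just the lowest key is rejected.

```python
import sqlite3

# (Numbers_FK_1, Numbers_FK_2) pairs from the question, plus one hypothetical
# pair (1, 99): a value that is related only to the lowest primary key.
PAIRS = [(n, 1) for n in range(1, 11)] + [
    (8, 8), (10, 4), (7, 3), (4, 9), (1, 6), (3, 9), (4, 6), (5, 6),
    (1, 99),
]


def related_to_all(pairs, pk_values):
    """Return the Numbers_FK_2 values that are paired with every Numbers_PK."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (Numbers_PK INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE t2 (Numbers_FK_1 INTEGER, Numbers_FK_2 INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(n,) for n in pk_values])
    conn.executemany("INSERT INTO t2 VALUES (?, ?)", pairs)
    rows = conn.execute(
        """
        SELECT t2.Numbers_FK_2
        FROM t
        JOIN t2 ON t2.Numbers_FK_1 = t.Numbers_PK
        GROUP BY t2.Numbers_FK_2
        -- keep a group only if it covers as many keys as the table holds
        HAVING COUNT(DISTINCT t.Numbers_PK) = (SELECT COUNT(*) FROM t)
        ORDER BY t2.Numbers_FK_2
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


matches = related_to_all(PAIRS, range(1, 11))
print(matches)
```

Comparing against the row count of the key table works whether or not the key values are consecutive.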
I have a database with a table called BOOKINGS containing the following values
main-id place-id start-date end-date
1 1 2018-8-1 2018-8-8
2 2 2018-6-6 2018-6-9
3 3 2018-5-5 2018-5-8
4 4 2018-4-4 2018-4-5
5 5 2018-3-3 2018-3-10
5 1 2018-1-1 2018-1-6
4 2 2018-2-1 2018-2-10
3 3 2018-3-1 2018-3-28
2 4 2018-4-1 2018-4-6
1 5 2018-5-1 2018-5-15
1 3 2018-6-1 2018-8-8
1 4 2018-7-1 2018-7-6
1 1 2018-8-1 2018-8-18
1 2 2018-9-1 2018-9-3
1 5 2018-10-1 2018-10-6
2 5 2018-11-1 2018-11-5
2 3 2018-12-1 2018-12-25
2 2 2018-2-2 2018-2-19
2 4 2018-4-4 2018-4-9
2 1 2018-5-5 2018-5-23
What I need to do, for each main-id, is find the place-id with the largest total number of days. Basically, I need to determine where each main-id has spent the most time.
This information must then be put into a view, so unfortunately I can't use temporary tables.
The query that gets me the closest is
CREATE VIEW `MOSTTIME` (`main-id`,`place-id`,`total`) AS
SELECT `BOOKINGS`.`main-id`, `BOOKINGS`.`place-id`, SUM(DATEDIFF(`end-date`, `start-date`)) AS `total`
FROM `BOOKINGS`
GROUP BY `BOOKINGS`.`main-id`,`BOOKINGS`.`place-id`
Which yields:
main-id place-id total
1 1 24
1 2 18
1 5 5
2 1 2
2 2 20
2 4 9
3 1 68
3 2 24
3 3 30
4 1 5
4 2 10
4 4 1
5 1 19
5 2 4
5 5 7
What I need is then the max total for each distinct main-id:
main-id place-id total
1 1 24
2 2 20
3 1 68
4 2 10
5 1 19
I've dug through a large amount of similar posts that recommend things like self joins; however, due to the fact that I have to create the new field total using an aggregate function (SUM) and another function (DATEDIFF) rather than just querying an existing field, my attempts at implementing those solutions have been unsuccessful.
I am hoping that my query that got me close will only require a small modification to get the correct solution.
Having a hyphen (-) in a column name is a really bad idea, because it is also the minus operator. Do consider replacing it with an underscore (_).
One possible way is to use derived tables. One derived table computes the total for each combination of main id and place id. Another derived table takes the maximum of those totals for each main id. We can then join the two to keep only the row corresponding to the maximum value.
CREATE VIEW `MOSTTIME` (`main-id`,`place-id`,`total`) AS
SELECT b1.main_id, b1.place_id, b1.total
FROM
(
    SELECT `main-id` AS main_id,
           `place-id` AS place_id,
           SUM(DATEDIFF(`end-date`, `start-date`)) AS total
    FROM BOOKINGS
    GROUP BY main_id, place_id
) AS b1
JOIN
(
    SELECT dt.main_id, MAX(dt.total) AS max_total
    FROM
    (
        SELECT `main-id` AS main_id,
               `place-id` AS place_id,
               SUM(DATEDIFF(`end-date`, `start-date`)) AS total
        FROM BOOKINGS
        GROUP BY main_id, place_id
    ) AS dt
    GROUP BY dt.main_id
) AS b2
ON b1.main_id = b2.main_id AND
   b1.total = b2.max_total
In MySQL 8+, you can instead use the ROW_NUMBER() window function:
CREATE VIEW `MOSTTIME` (`main-id`,`place-id`,`total`) AS
SELECT b.main_id, b.place_id, b.total
FROM
(
    SELECT dt.main_id,
           dt.place_id,
           dt.total,
           ROW_NUMBER() OVER (PARTITION BY dt.main_id
                              ORDER BY dt.total DESC) AS row_num
    FROM
    (
        SELECT `main-id` AS main_id,
               `place-id` AS place_id,
               SUM(DATEDIFF(`end-date`, `start-date`)) AS total
        FROM BOOKINGS
        GROUP BY main_id, place_id
    ) AS dt
) AS b
WHERE b.row_num = 1
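To try the "total per pair, then keep the maximum per main id" idea without a MySQL server, here is a minimal sketch using Python's sqlite3. Several things are assumptions for illustration: SQLite has no DATEDIFF, so a julianday() difference stands in for it; the dates are zero-padded so SQLite can parse them; the columns use underscores; and the helper view place_totals is a made-up name. Totals are computed from the rows exactly as listed under the main-id and place-id headers.

```python
import sqlite3

# Rows of BOOKINGS from the question: (main_id, place_id, start_date, end_date),
# with the dates zero-padded to ISO format.
BOOKINGS = [
    (1, 1, "2018-08-01", "2018-08-08"), (2, 2, "2018-06-06", "2018-06-09"),
    (3, 3, "2018-05-05", "2018-05-08"), (4, 4, "2018-04-04", "2018-04-05"),
    (5, 5, "2018-03-03", "2018-03-10"), (5, 1, "2018-01-01", "2018-01-06"),
    (4, 2, "2018-02-01", "2018-02-10"), (3, 3, "2018-03-01", "2018-03-28"),
    (2, 4, "2018-04-01", "2018-04-06"), (1, 5, "2018-05-01", "2018-05-15"),
    (1, 3, "2018-06-01", "2018-08-08"), (1, 4, "2018-07-01", "2018-07-06"),
    (1, 1, "2018-08-01", "2018-08-18"), (1, 2, "2018-09-01", "2018-09-03"),
    (1, 5, "2018-10-01", "2018-10-06"), (2, 5, "2018-11-01", "2018-11-05"),
    (2, 3, "2018-12-01", "2018-12-25"), (2, 2, "2018-02-02", "2018-02-19"),
    (2, 4, "2018-04-04", "2018-04-09"), (2, 1, "2018-05-05", "2018-05-23"),
]


def most_time(bookings):
    """Return (main_id, place_id, total) rows holding each main_id's largest total."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bookings (main_id INTEGER, place_id INTEGER, "
        "start_date TEXT, end_date TEXT)"
    )
    conn.executemany("INSERT INTO bookings VALUES (?, ?, ?, ?)", bookings)
    # Total days per (main_id, place_id); julianday() replaces DATEDIFF here.
    conn.execute(
        """
        CREATE VIEW place_totals AS
        SELECT main_id, place_id,
               SUM(CAST(julianday(end_date) - julianday(start_date) AS INTEGER)) AS total
        FROM bookings
        GROUP BY main_id, place_id
        """
    )
    # Join each total back to the per-main_id maximum (ties would return several rows).
    conn.execute(
        """
        CREATE VIEW mosttime AS
        SELECT p.main_id, p.place_id, p.total
        FROM place_totals AS p
        JOIN (
            SELECT main_id, MAX(total) AS max_total
            FROM place_totals
            GROUP BY main_id
        ) AS m
          ON m.main_id = p.main_id AND m.max_total = p.total
        """
    )
    rows = conn.execute(
        "SELECT main_id, place_id, total FROM mosttime ORDER BY main_id, place_id"
    ).fetchall()
    conn.close()
    return rows


result = most_time(BOOKINGS)
for row in result:
    print(row)
```

Putting the per-pair totals in their own view avoids repeating the aggregate subquery, at the cost of one extra object in the schema.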
I'm trying to get the query below to show, for each store, the quantity of each of four items we carry.
It works, and I created the temporary table to try to increase speed, but my problem is that if a store has no rows for a certain product, that product does not show up at all for that store.
I'd like to show all four products (ProdNo) for every store, regardless of whether there are actually any rows for that product in that store.
I searched this site and could not find anything similar enough to help me figure it out.
CREATE TEMPORARY TABLE IF NOT EXISTS temp_invoice_dates AS
(
SELECT Invoice_detail.del_date, invoice_Detail.StoreNo, mast_stores.SDesc,
       invoice_Detail.ProdNo, sold_qty, retn_price, retn_qty, sold_price
FROM Invoice_detail
LEFT JOIN mast_stores on invoice_detail.StoreNO=mast_stores.Snum
LEFT JOIN invoice on invoice_detail.Del_Date=invoice.Del_Date and invoice_detail.Invoice_No=invoice.Invoice_No
WHERE Cnum IN ('200','210') AND invoice_detail.Del_Date >= "2016-03-01" AND invoice_detail.Del_Date < "2016-04-01"
);
SELECT
temp_invoice_dates.StoreNo,
temp_invoice_dates.SDesc,
DATE_FORMAT(temp_invoice_dates.Del_Date,'%Y') as Year,
DATE_FORMAT(temp_invoice_dates.Del_Date,'%M') as Month,
temp_invoice_dates.ProdNo,
mast_items.IDesc,
SUM(sold_qty) as TotalIn,
SUM(retn_qty) as TotalOut,
ROUND(SUM((sold_qty*sold_price)-(retn_qty*retn_price)),2) as NetSales,
CONCAT(ROUND(SUM(retn_qty)/SUM(sold_qty),2)*100,'%') as StalePerc
FROM mast_Items
LEFT JOIN temp_invoice_dates on temp_invoice_dates.ProdNo=mast_items.Inum
WHERE mast_items.Inum in ('3502','3512','4162','4182')
GROUP BY temp_invoice_dates.StoreNo, ProdNo
ORDER BY temp_invoice_dates.StoreNo, ProdNo;
Drop table temp_invoice_dates;
Results are similar to:
StoreNo Product Count....
1 1 1
1 2 5
1 3 2
1 4 1
2 1 14
2 2 1
2 4 4
3 2 33
3 3 3
Whereas I'd like it to be:
StoreNo Product Count ....
1 1 1
1 2 5
1 3 2
1 4 1
2 1 14
2 2 1
2 3 0
2 4 4
3 1 0
3 2 33
3 3 3
3 4 0
Something like this should work.
SELECT sp.StoreNo, sp.ProdNo
, ...stuff...
, sp.IDesc, sp.SDesc
, ...more stuff...
FROM (
SELECT i.Inum AS ProdNo, s.Snum AS StoreNo
, i.IDesc, s.SDesc
FROM mast_Items AS i, mast_stores AS s
WHERE i.Inum IN ('3502','3512','4162','4182')
) AS sp
LEFT JOIN temp_invoice_dates AS tid
ON sp.ProdNo = tid.ProdNo
AND sp.StoreNo = tid.StoreNo
GROUP BY sp.StoreNo, sp.ProdNo
ORDER BY sp.StoreNo, sp.ProdNo
;
Normally I recommend against cross joins (as seen in the subquery) but in this case it is exactly what is needed. If the query is slow, you can instead insert the subquery results into a temp table beforehand, index that, and then use the temp table in place of the subquery.
(Edit: should use sp fields when available for grouping and results)
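To see the cross join plus LEFT JOIN pattern fill in the missing combinations, here is a minimal sketch with Python's sqlite3. The miniature schema (stores, items, sales with a single qty column) is hypothetical and only mirrors the simplified result tables in the question, not the real invoice tables.

```python
import sqlite3

# Hypothetical stand-in data: (store_no, prod_no, qty), mirroring the
# "Results are similar to" table in the question.
SALES = [
    (1, 1, 1), (1, 2, 5), (1, 3, 2), (1, 4, 1),
    (2, 1, 14), (2, 2, 1), (2, 4, 4),
    (3, 2, 33), (3, 3, 3),
]


def counts_for_every_pair(sales, store_nos, prod_nos):
    """Return one (store_no, prod_no, total) row per store/product combination."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stores (store_no INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE items (prod_no INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE sales (store_no INTEGER, prod_no INTEGER, qty INTEGER)")
    conn.executemany("INSERT INTO stores VALUES (?)", [(s,) for s in store_nos])
    conn.executemany("INSERT INTO items VALUES (?)", [(p,) for p in prod_nos])
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", sales)
    rows = conn.execute(
        """
        SELECT sp.store_no, sp.prod_no,
               COALESCE(SUM(s.qty), 0) AS total  -- 0 when no sales rows match
        FROM (
            -- every store paired with every product
            SELECT st.store_no, i.prod_no
            FROM stores AS st
            CROSS JOIN items AS i
        ) AS sp
        LEFT JOIN sales AS s
          ON s.store_no = sp.store_no AND s.prod_no = sp.prod_no
        GROUP BY sp.store_no, sp.prod_no
        ORDER BY sp.store_no, sp.prod_no
        """
    ).fetchall()
    conn.close()
    return rows


all_pairs = counts_for_every_pair(SALES, [1, 2, 3], [1, 2, 3, 4])
for row in all_pairs:
    print(row)
```

Grouping by the sp columns instead of the sales columns is what keeps the combinations that have no matching rows.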
I have a table of data like this:
id user_id A B C
=====================
1 15 1 2 3
2 15 1 2 5
3 20 1 3 9
4 20 1 3 7
I need to remove duplicate user ids and keep the record that sorts lowest when sorting by A then B then C. So using the above table, I set up a temp query (qry_temp) that simply does the sort--first on user_id, then on A, then on B, then on C. It returns the following:
id user_id A B C
====================
1 15 1 2 3
2 15 1 2 5
4 20 1 3 7
3 20 1 3 9
Then I wrote a Totals Query based on qry_temp that just had user_id (Group By) and then id (First), and I assumed this would return the following:
user_id id
===========
15 1
20 4
But it doesn't seem to do that--instead it appears to be just returning the lowest id in a group of duplicate user ids (so I get 1 and 3 instead of 1 and 4). Shouldn't the Totals query use the order of the query it's based upon? Is there a property setting in the query that might impact this or another way to get what I need? If it helps, here is the SQL:
SELECT qry_temp.user_id, First(qry_temp.ID) AS FirstOfID
FROM qry_temp
GROUP BY qry_temp.user_id;
You need a different type of query, for example:
SELECT tmp.id,
       tmp.user_id,
       tmp.a,
       tmp.b,
       tmp.c
FROM tmp
WHERE tmp.id IN (SELECT TOP 1 id
                 FROM tmp t
                 WHERE t.user_id = tmp.user_id
                 ORDER BY t.a, t.b, t.c, t.id);
Where tmp is the name of your table. First, Last, Min and Max are not dependent on a sort order. In relational databases, sort orders are quite ephemeral.
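The correlated "first row per user under an explicit ORDER BY" idea can be tried outside Access as well. This is a minimal sketch with Python's sqlite3, which is an assumption on my part: SQLite has no TOP, so LIMIT 1 is swapped in for TOP 1, and the function name lowest_per_user is made up.

```python
import sqlite3

# Rows from the question: (id, user_id, A, B, C).
ROWS = [
    (1, 15, 1, 2, 3),
    (2, 15, 1, 2, 5),
    (3, 20, 1, 3, 9),
    (4, 20, 1, 3, 7),
]


def lowest_per_user(rows):
    """Keep, for each user_id, the row that sorts first by a, b, c (then id)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tmp (id INTEGER PRIMARY KEY, user_id INTEGER, "
        "a INTEGER, b INTEGER, c INTEGER)"
    )
    conn.executemany("INSERT INTO tmp VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT tmp.id, tmp.user_id, tmp.a, tmp.b, tmp.c
        FROM tmp
        WHERE tmp.id = (
            -- LIMIT 1 plays the role of Access's TOP 1
            SELECT t.id
            FROM tmp AS t
            WHERE t.user_id = tmp.user_id
            ORDER BY t.a, t.b, t.c, t.id
            LIMIT 1
        )
        ORDER BY tmp.user_id
        """
    ).fetchall()
    conn.close()
    return result


kept = lowest_per_user(ROWS)
for row in kept:
    print(row)
```

The ordering lives inside the subquery, so the outcome does not depend on how any intermediate query happens to be sorted.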
I have the following 2 tables:
1) Companies
ID CompanyName Abbreviation Notes
1 CompanyA CA ...
2 CompanyB CB ...
3 CompanyC CC ...
2) PlannedDeployments
ID CompanyID TypeID DepDate NumDeployed
1 1 2 09/2010 5
2 1 2 10/2010 5
3 1 3 09/2010 3
4 1 3 10/2010 3
5 1 4 10/2010 4
6 2 2 12/2010 10
7 2 4 10/2010 1
8 3 2 11/2010 6
Note that TypeID is a number between 1 and 5 describing what type of person is being deployed. For the purposes of this query, I'm interested in Type2 employees for each company and then the sum of Types 3 & 4 for each date. What I eventually want to end up with is a crosstab that looks like the following:
Crosstab
Date/Company  CompanyA  CompanyB  CompanyC  SumOfTypes3and4
09/2010       5                             3
10/2010       5                             8
11/2010                           6
12/2010                 10
The problem is that final column - the sum of Type 3 and Type 4 employees. The current crosstab that I have includes everything except that sum column and looks like the following:
TRANSFORM Sum(PlannedDeployments.NumDeployed) AS ["NumDeployed"]
SELECT PlannedDeployments.DepDate
FROM PlannedDeployments LEFT JOIN Companies ON Companies.ID=PlannedDeployments.CompanyID
WHERE PlannedDeployments.TypeID=2 AND (PlannedDeployments.DepDate Between FormFieldValue("Form", "Control") AND FormFieldValue("Form", "Control"))
GROUP BY PlannedDeployments.DepDate
PIVOT Companies.CompanyName;
The second part of that WHERE clause is just limiting the data by some form controls. Anyway - I'm having a lot of trouble getting that final column. Anyone have any ideas?
Edit: Building on the solution provided by Remou below, here's what the final query ended up looking like:
TRANSFORM Sum(PlannedDeployments.NumDeployed) AS ["NumDeployed"]
SELECT PlannedDeployments.DepDate, q.SumOfNumDeployed
FROM (SELECT PlannedDeployments.DepDate, Sum(PlannedDeployments.NumDeployed) AS SumOfNumDeployed
FROM PlannedDeployments
WHERE (((PlannedDeployments.[TypeID]) In (3,4)))
GROUP BY PlannedDeployments.DepDate) AS q
RIGHT JOIN (PlannedDeployments
INNER JOIN Companies ON PlannedDeployments.CompanyID = Companies.ID)
ON q.DepDate = PlannedDeployments.DepDate
WHERE PlannedDeployments.TypeID=2
AND (PlannedDeployments.DepDate Between FormFieldValue("Form", "Control")
AND FormFieldValue("Form", "Control"))
GROUP BY PlannedDeployments.DepDate, q.SumOfNumDeployed
PIVOT Companies.CompanyName;
You can use a subquery:
TRANSFORM Sum(PlannedDeployments.NumDeployed) AS ["NumDeployed"]
SELECT PlannedDeployments.DepDate, Sum(q.SumOfNumDeployed) AS SumOfSumOfNumDeployed
FROM (SELECT PlannedDeployments.DepDate, Sum(PlannedDeployments.NumDeployed) AS SumOfNumDeployed
FROM PlannedDeployments
WHERE (((PlannedDeployments.[TypeID]) In (3,4)))
GROUP BY PlannedDeployments.DepDate) AS q
RIGHT JOIN (PlannedDeployments
INNER JOIN Companies ON PlannedDeployments.CompanyID = Companies.ID)
ON q.DepDate = PlannedDeployments.DepDate
WHERE PlannedDeployments.TypeID=2
AND (PlannedDeployments.DepDate Between FormFieldValue("Form", "Control")
AND FormFieldValue("Form", "Control"))
GROUP BY PlannedDeployments.DepDate
PIVOT Companies.CompanyName;
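For checking the shape of the result outside Access, here is a minimal sketch using Python's sqlite3. SQLite has no TRANSFORM/PIVOT, so as an assumption the pivot is spelled out with one CASE expression per company (fixed column list), and the form-control date filter is left out. The subquery joined on DepDate is the same idea as in the queries above; the function name crosstab is made up.

```python
import sqlite3

COMPANIES = [(1, "CompanyA"), (2, "CompanyB"), (3, "CompanyC")]
# (ID, CompanyID, TypeID, DepDate, NumDeployed) rows from the question.
DEPLOYMENTS = [
    (1, 1, 2, "09/2010", 5), (2, 1, 2, "10/2010", 5),
    (3, 1, 3, "09/2010", 3), (4, 1, 3, "10/2010", 3),
    (5, 1, 4, "10/2010", 4), (6, 2, 2, "12/2010", 10),
    (7, 2, 4, "10/2010", 1), (8, 3, 2, "11/2010", 6),
]


def crosstab(companies, deployments):
    """Type 2 counts per company and date, plus the Type 3+4 total per date."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Companies (ID INTEGER PRIMARY KEY, CompanyName TEXT)")
    conn.execute(
        "CREATE TABLE PlannedDeployments (ID INTEGER PRIMARY KEY, CompanyID INTEGER, "
        "TypeID INTEGER, DepDate TEXT, NumDeployed INTEGER)"
    )
    conn.executemany("INSERT INTO Companies VALUES (?, ?)", companies)
    conn.executemany("INSERT INTO PlannedDeployments VALUES (?, ?, ?, ?, ?)", deployments)
    rows = conn.execute(
        """
        SELECT d.DepDate,
               SUM(CASE WHEN c.CompanyName = 'CompanyA' THEN d.NumDeployed END) AS CompanyA,
               SUM(CASE WHEN c.CompanyName = 'CompanyB' THEN d.NumDeployed END) AS CompanyB,
               SUM(CASE WHEN c.CompanyName = 'CompanyC' THEN d.NumDeployed END) AS CompanyC,
               q.SumOfNumDeployed AS SumOfTypes3and4
        FROM PlannedDeployments AS d
        JOIN Companies AS c ON c.ID = d.CompanyID
        LEFT JOIN (
            -- one row per date with the combined Type 3 and Type 4 total
            SELECT DepDate, SUM(NumDeployed) AS SumOfNumDeployed
            FROM PlannedDeployments
            WHERE TypeID IN (3, 4)
            GROUP BY DepDate
        ) AS q ON q.DepDate = d.DepDate
        WHERE d.TypeID = 2
        GROUP BY d.DepDate, q.SumOfNumDeployed
        ORDER BY d.DepDate
        """
    ).fetchall()
    conn.close()
    return rows


table = crosstab(COMPANIES, DEPLOYMENTS)
for row in table:
    print(row)
```

Empty crosstab cells come back as None (NULL) here, which corresponds to the blank cells in the desired output.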