I have got two tables:
booking(advertiser_id, name, bookings, on_date)
click(advertiser_id, name, clicks, on_date)
I need to find bookings/clicks for each name, advertiser_id-wise and date-wise. I was doing the following to achieve this:
select b.name, b.advertiser_id, max(b.bookings)/max(c.clicks), b.on_date from click c inner join booking b on
c.advertiser_id = b.advertiser_id and
c.name = b.name and
c.on_date = b.on_date
group by
b.name,
b.advertiser_id,
b.on_date
I need to return 0, if there are no bookings (no entry in booking table) for that specific click. How can I achieve this?
Example:
click table:
name clicks on_date advertiser_id
uk 123 2018-05-01 12
us 123 2018-05-02 12
us 123 2018-05-01 12
booking table:
advertiser_id name bookings on_date
12 uk 1200 2018-05-07
12 us 123 2018-05-07
12 uk 123 2018-05-01
12 us 123 2018-05-01
Result:
name advertiser_id max(b.bookings)/max(c.clicks) on_date
uk 12 1.0000 2018-05-01
us 12 1.0000 2018-05-01
Expected:
name advertiser_id max(b.bookings)/max(c.clicks) on_date
uk 12 1.0000 2018-05-01
us 12 1.0000 2018-05-01
us 12 0 2018-05-02
Note
I used max so that I could select columns that are not in the group by.
Something to think about...
DROP TABLE IF EXISTS click;
CREATE TABLE click
(name CHAR(2) NOT NULL
,clicks INT NOT NULL
,on_date DATE NOT NULL
,advertiser_id INT NOT NULL
,PRIMARY KEY(name,on_date,advertiser_id)
);
INSERT INTO click VALUES
('uk',123,'2018-05-01',12),
('us',123,'2018-05-02',12),
('us',123,'2018-05-01',12);
DROP TABLE IF EXISTS booking;
CREATE TABLE booking
(advertiser_id INT NOT NULL
,name CHAR(2) NOT NULL
,bookings INT NOT NULL
,on_date DATE NOT NULL
,PRIMARY KEY(advertiser_id,name,on_date)
);
INSERT INTO booking VALUES
(12,'uk',1200,'2018-05-07'),
(12,'us', 123,'2018-05-07'),
(12,'uk', 123,'2018-05-01'),
(12,'us', 123,'2018-05-01');
select c.name
, c.advertiser_id
, COALESCE(b.bookings,0)
, c.clicks
, c.on_date
from click c
left
join booking b
on c.advertiser_id = b.advertiser_id
and c.name = b.name
and c.on_date = b.on_date;
+------+---------------+------------------------+--------+------------+
| name | advertiser_id | COALESCE(b.bookings,0) | clicks | on_date |
+------+---------------+------------------------+--------+------------+
| uk | 12 | 123 | 123 | 2018-05-01 |
| us | 12 | 123 | 123 | 2018-05-01 |
| us | 12 | 0 | 123 | 2018-05-02 |
+------+---------------+------------------------+--------+------------+
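You can do this with a LEFT JOIN from click to booking. Group by the click columns, because the booking columns are NULL for every click without a booking and would collapse all such clicks into one group. Your query should look like this:
select c.name, c.advertiser_id, IFNULL(max(b.bookings)/max(c.clicks), 0), c.on_date
from click c
left join booking b on
c.advertiser_id = b.advertiser_id and
c.name = b.name and
c.on_date = b.on_date
group by
c.name,
c.advertiser_id,
c.on_date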
And it's working perfectly.
Open LINK to see the output.
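To check the LEFT JOIN idea without a MySQL server, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database. SQLite is only a stand-in here (an assumption, not what the question uses), so COALESCE replaces IFNULL and the `* 1.0` forces non-integer division; the join and the grouping on the click columns are the part that carries over.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE click (name TEXT, clicks INT, on_date TEXT, advertiser_id INT);
INSERT INTO click VALUES
  ('uk',123,'2018-05-01',12),
  ('us',123,'2018-05-02',12),
  ('us',123,'2018-05-01',12);
CREATE TABLE booking (advertiser_id INT, name TEXT, bookings INT, on_date TEXT);
INSERT INTO booking VALUES
  (12,'uk',1200,'2018-05-07'),
  (12,'us', 123,'2018-05-07'),
  (12,'uk', 123,'2018-05-01'),
  (12,'us', 123,'2018-05-01');
""")

# Keep every click row; a click without a booking gets a ratio of 0.
rows = conn.execute("""
SELECT c.name,
       c.advertiser_id,
       COALESCE(MAX(b.bookings) * 1.0 / MAX(c.clicks), 0) AS ratio,
       c.on_date
FROM click c
LEFT JOIN booking b
  ON c.advertiser_id = b.advertiser_id
 AND c.name = b.name
 AND c.on_date = b.on_date
GROUP BY c.name, c.advertiser_id, c.on_date
ORDER BY c.on_date, c.name
""").fetchall()

for row in rows:
    print(row)
```

The 2018-05-02 click for us has no booking, so it comes back with a ratio of 0 instead of disappearing as it does with the inner join.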
Related
I am working on a tool to administer customer and payment data.
I use MySQL and have the following tables: customers and payments:
customers:
ID | invoiceID | supreme_invoiceID
1 123 a123
2 124 a123
3 103 a103
4 110 a110
payments:
ID | supreme_invoiceID | amount | date
1 a123 10 10.10.2010
2 a103 105 10.11.2017
3 a123 5 11.10.2010
And my result should look like this:
view_complete:
ID | supreme_invoiceID | number_invoices | GROUP_CONCAT(invoiceID) | SUM(payments.amount) | GROUP_CONCAT(payments.amount)
1 a123 2 123;124 15 10;5
Unfortunately, I cannot get it directly into one table. Instead I create 2 views and query the payments table separately for aggregate data on payments.
First, I create an auxiliary view:
CREATE VIEW precomplete as
SELECT *, COUNT(supreme_invoiceID) as number_invoices FROM customers
GROUP BY supreme_invoiceID;
Then, a second one:
CREATE VIEW complete AS
SELECT precomplete.*, SUM(p.amount)
FROM precomplete
LEFT JOIN payments p ON precomplete.supreme_invoiceID = p.supreme_invoiceID
GROUP BY precomplete.supreme_invoiceID;
I fetch the concatenated values in an additional query. I would like to get all of this data in one query and, ideally, without such a view hierarchy. phpMyAdmin is already pretty slow at loading my views, even with only a few entries.
Any help is greatly appreciated.
Thanks!
The db design forces an approach that builds the aggregates separately, to avoid duplicates, before joining them on a common field. For example:
drop table if exists c,p;
create table c(ID int, invoiceID int, supreme_invoiceID varchar(4));
insert into c values
(1 , 123 , 'a123'),
(2 , 124 , 'a123'),
(3, 103 , 'a103'),
(4 , 110 , 'a110');
create table p(ID int, supreme_invoiceID varchar(4), amount int, date varchar(10));
insert into p values
(1 , 'a123' , 10 , '10.10.2010'),
(2 , 'a103' , 105 , '10.11.2017'),
(3 , 'a123' , 5 , '11.10.2010');
select c.*,p.*
from
(select min(c.id) minid,count(*) nofinvoices,group_concat(c.invoiceid) gciid, max(supreme_invoiceid) maxsid
from c
group by supreme_invoiceid
) c
join
(select group_concat(supreme_invoiceid) gcsid, sum(amount),group_concat(amount),max(supreme_invoiceid) maxsid
from p
group by supreme_invoiceid
) p
on p.maxsid = c.maxsid
order by minid
;
+-------+-------------+---------+--------+-----------+-------------+----------------------+--------+
| minid | nofinvoices | gciid | maxsid | gcsid | sum(amount) | group_concat(amount) | maxsid |
+-------+-------------+---------+--------+-----------+-------------+----------------------+--------+
| 1 | 2 | 123,124 | a123 | a123,a123 | 15 | 10,5 | a123 |
| 3 | 1 | 103 | a103 | a103 | 105 | 105 | a103 |
+-------+-------------+---------+--------+-----------+-------------+----------------------+--------+
2 rows in set (0.15 sec)
Much like your view approach. Note there doesn't appear to be a customer in the customer table
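To see why the aggregates have to be built separately, here is a small sketch that runs both variants against the sample data. It uses an in-memory SQLite database as a stand-in for MySQL (an assumption), and the aliases ca, pa and total are made up for the illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE c (ID INT, invoiceID INT, supreme_invoiceID TEXT);
INSERT INTO c VALUES (1,123,'a123'), (2,124,'a123'), (3,103,'a103'), (4,110,'a110');
CREATE TABLE p (ID INT, supreme_invoiceID TEXT, amount INT, date TEXT);
INSERT INTO p VALUES
  (1,'a123',10,'10.10.2010'),
  (2,'a103',105,'10.11.2017'),
  (3,'a123',5,'11.10.2010');
""")

# Join first, aggregate afterwards: each of the two a123 invoices pairs
# with each of the two a123 payments, so every amount is summed twice.
naive = dict(conn.execute("""
SELECT c.supreme_invoiceID, SUM(p.amount)
FROM c
JOIN p ON p.supreme_invoiceID = c.supreme_invoiceID
GROUP BY c.supreme_invoiceID
""").fetchall())

# Aggregate each table on its own, then join the one-row-per-key results.
separate = dict(conn.execute("""
SELECT ca.supreme_invoiceID, pa.total
FROM (SELECT supreme_invoiceID, COUNT(*) AS number_invoices
      FROM c GROUP BY supreme_invoiceID) ca
JOIN (SELECT supreme_invoiceID, SUM(amount) AS total
      FROM p GROUP BY supreme_invoiceID) pa
  ON pa.supreme_invoiceID = ca.supreme_invoiceID
""").fetchall())

print("join then aggregate:", naive["a123"])
print("aggregate then join:", separate["a123"])
```

The first variant reports 30 for a123 and the second reports the intended 15, which is the duplication the subquery approach avoids.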
I would like to find out the average number of days between orders grouping by account_id in the database.
Let's say I have the following table named 'orders' with this data.
id account_id account_name order_date
1 555 Acme Fireworks 2015-06-15
2 342 Kent Brewery 2015-09-12
3 555 Acme Fireworks 2015-09-15
4 342 Kent Brewery 2015-10-12
5 342 Kent Brewery 2015-11-12
6 342 Kent Brewery 2015-12-12
7 555 Acme Fireworks 2015-12-15
8 900 Plastic Inc. 2015-12-20
I would like a query to produce the following results
account_id account_name average_days_between_orders
342 Kent Brewery 30.333
555 Acme Fireworks 91.5
900 Plastic Inc. (unsure of what value would go here since there's 1 order only)
I checked the following questions to get an idea, but still couldn't figure out the problem:
Average difference between two dates, grouped by a third field?
Thanks!
You need a query that, for each order, computes the difference to the previous purchase of the same account (null if there is no previous purchase) and then takes the average of these values.
I would self-join the above table to get for each order the maximum order date of any previous order in a subquery. In the avg() function calculate the difference between the calculated date and the current order date:
SELECT o3.account_id, o3.account_name, avg(diff) as average_days_between_orders
FROM
(select o1.id,
o1.account_id,
o1.account_name,
datediff(o1.order_date, max(o2.order_date)) as diff
from orders o1
left join orders o2 on o1.account_id=o2.account_id and o1.id>o2.id
group by o1.id, o1.account_id, o1.account_name, o1.order_date) o3
GROUP BY o3.account_id, o3.account_name
As an alternative to joins, you can use a user defined variable in the subquery or a correlated subquery in the select list to calculate the differences. You can check mysql running total solutions to get a hang of this solution, such as this SO topic. Specifically, check out the solution provided by Andomar.
If your orders table is huge, then the alternative approaches described in that topic may be better from a performance point of view.
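If you want to try the self-join query on the sample data, here is a rough sketch using an in-memory SQLite database. SQLite has no datediff(), so the day difference is computed with julianday() instead; that substitution is mine and not part of the MySQL query. Like the query above, it assumes that a higher id means a later order.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INT, account_id INT, account_name TEXT, order_date TEXT);
INSERT INTO orders VALUES
  (1,555,'Acme Fireworks','2015-06-15'),
  (2,342,'Kent Brewery','2015-09-12'),
  (3,555,'Acme Fireworks','2015-09-15'),
  (4,342,'Kent Brewery','2015-10-12'),
  (5,342,'Kent Brewery','2015-11-12'),
  (6,342,'Kent Brewery','2015-12-12'),
  (7,555,'Acme Fireworks','2015-12-15'),
  (8,900,'Plastic Inc.','2015-12-20');
""")

# Inner query: days since the latest earlier order of the same account
# (NULL for the first order). Outer query: average per account; AVG skips NULLs.
rows = conn.execute("""
SELECT o3.account_id, o3.account_name, AVG(diff) AS average_days_between_orders
FROM (SELECT o1.id, o1.account_id, o1.account_name,
             julianday(o1.order_date) - julianday(MAX(o2.order_date)) AS diff
      FROM orders o1
      LEFT JOIN orders o2
        ON o1.account_id = o2.account_id AND o1.id > o2.id
      GROUP BY o1.id, o1.account_id, o1.account_name, o1.order_date) o3
GROUP BY o3.account_id, o3.account_name
ORDER BY o3.account_id
""").fetchall()

for row in rows:
    print(row)
```

An account with a single order, such as Plastic Inc., ends up with a NULL average because it has no previous order to compare against.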
Note: Please test it carefully and use it as you wish. I couldn't find an easy query for it, and I don't guarantee that it works for all cases :) If you just want the answer, the complete query is shown at the end.
The goal is to build a table with the start and end dates in one row, and then simply calculate the average difference between the two dates. Something like this:
id | account_id | account_name | start_date | end_date
------------------------------------------------------------
1 | 342 | Kent Brewery | 2015-09-12 | 2015-10-12
2 | 342 | Kent Brewery | 2015-10-12 | 2015-11-12
3 | 342 | Kent Brewery | 2015-11-12 | 2015-12-12
4 | 555 | Acme Fireworks | 2015-06-15 | 2015-09-15
5 | 555 | Acme Fireworks | 2015-09-15 | 2015-12-15
I'll create few temporary tables to make it a bit more clear. First query for start_date:
QUERY:
create temporary table uniq_start_dates
select (@sid := @sid + 1) id, tmp_uniq_start_dates.*
from
(select distinct o1.account_id, o1.account_name, o1.order_date start_date
from orders o1
join orders o2 on o1.account_id=o2.account_id and o1.order_date < o2.order_date
order by o1.account_id, o1.order_date) tmp_uniq_start_dates
join (select @sid := 0) AS sid_generator
OUTPUT: temporary table - uniq_start_dates
id | account_id | account_name | start_date
-----------------------------------------------
1 | 342 | Kent Brewery | 2015-09-12
2 | 342 | Kent Brewery | 2015-10-12
3 | 342 | Kent Brewery | 2015-11-12
4 | 555 | Acme Fireworks | 2015-06-15
5 | 555 | Acme Fireworks | 2015-09-15
Do the same thing for end_date:
QUERY:
create temporary table uniq_end_dates
select (@eid := @eid + 1) id, tmp_uniq_end_dates.*
from
(select distinct o2.account_id, o2.account_name, o2.order_date end_date
from orders o1
join orders o2 on o1.account_id=o2.account_id and o1.order_date < o2.order_date
order by o2.account_id, o2.order_date) tmp_uniq_end_dates
join (select @eid := 0) AS eid_generator
OUTPUT: temporary table - uniq_end_dates
id | account_id | account_name | end_date
-----------------------------------------------
1 | 342 | Kent Brewery | 2015-10-12
2 | 342 | Kent Brewery | 2015-11-12
3 | 342 | Kent Brewery | 2015-12-12
4 | 555 | Acme Fireworks | 2015-09-15
5 | 555 | Acme Fireworks | 2015-12-15
Notice that I generated a new sequential id for each temporary table so that I can join them back into one table (like the very first table). Let's join uniq_start_dates and uniq_end_dates.
QUERY:
create temporary table uniq_start_end_dates
select uniq_start_dates.*, uniq_end_dates.end_date
from uniq_start_dates
join uniq_end_dates using (id)
OUTPUT: temporary table - uniq_start_end_dates
(the same one as the first table)
Now comes the easy part. Just aggregate and take the average difference in days.
QUERY:
select account_id, account_name, avg(timestampdiff(day, start_date, end_date)) average_days
from uniq_start_end_dates
group by account_id, account_name
OUTPUT:
account_id | account_name | average_days
--------------------------------------------
342 | Kent Brewery | 30.3333
555 | Acme Fireworks | 91.5000
As you may notice, Plastic Inc. is not in the result. If you want accounts with a null average_days to show up as well, here it is:
QUERY:
select all_accounts.account_id, all_accounts.account_name, accounts_with_average_days.average_days
from
(select distinct account_id, account_name from orders) all_accounts
left join
(select account_id, account_name, avg(timestampdiff(day, start_date, end_date)) average_days
from uniq_start_end_dates
group by account_id, account_name) accounts_with_average_days
using (account_id, account_name)
OUTPUT:
account_id | account_name | average_days
--------------------------------------------
342 | Kent Brewery | 30.3333
555 | Acme Fireworks | 91.5000
900 | Plastic Inc. | null
Here is a complete messy query:
select all_accounts.account_id, all_accounts.account_name, accounts_with_average_days.average_days
from
(select distinct account_id, account_name from orders) all_accounts
left join
(select uniq_start_dates.account_id, uniq_start_dates.account_name, avg(timestampdiff(day, start_date, end_date)) average_days
from
(select (@sid := @sid + 1) id, tmp_uniq_start_dates.*
from
(select distinct o1.account_id, o1.account_name, o1.order_date start_date
from orders o1
join orders o2 on o1.account_id=o2.account_id and o1.order_date < o2.order_date
order by o1.account_id, o1.order_date) tmp_uniq_start_dates
join (select @sid := 0) AS sid_generator
) uniq_start_dates
join
(select (@eid := @eid + 1) id, tmp_uniq_end_dates.*
from
(select distinct o2.account_id, o2.account_name, o2.order_date end_date
from orders o1
join orders o2 on o1.account_id=o2.account_id and o1.order_date < o2.order_date
order by o2.account_id, o2.order_date) tmp_uniq_end_dates
join (select @eid := 0) AS eid_generator
) uniq_end_dates
using (id)
group by uniq_start_dates.account_id, uniq_start_dates.account_name) accounts_with_average_days
using (account_id, account_name)
I have 2 tables, ord_tbl and pay_tbl with these data:
ord_tbl
invoice | emp_id | prod_id | amount
123 | 101 | 1 | 1000
123 | 101 | 2 | 500
123 | 101 | 3 | 500
124 | 101 | 2 | 300
125 | 102 | 3 | 200
pay_tbl
invoice | new_invoice | amount
123 | 321 | 300
123 | 322 | 200
124 | 323 | 300
125 | 324 | 100
I would like the selection statement to give me this result
invoice | emp_id | orig_amt | balance | status
123 | 101 | 2000 | 1500 | unsettled
The invoice that has 0 balance will not be included anymore. This is what I tried so far...
;WITH CTE as
(SELECT ot.invoice, MAX(ot.emp_id) as emp_id, SUM(ot.amount) as origAmt FROM ord_tbl ot GROUP BY ot.invoice),
CTE2 as
(SELECT pt.invoice, SUM(pt.amount) as payAmt FROM pay_tbl pt GROUP BY pt.invoice)
SELECT CTE.invoice, CTE.emp_id, CTE.origAmt, CTE.origAmt-CTE2.payAmt as bal, 'UNSETTLED' as status
FROM
CTE LEFT JOIN CTE2 ON CTE.invoice=CTE2.invoice
WHERE CTE.emp_id='101' AND CTE.origAmt-CTE2.payAmt>0 OR CTE2.payAmt IS NULL
This was taught to me here and it works in SQL Server. What I need now is to run it in MS Access. I tried this code, but MS Access gives me an error saying "Invalid SQL statement; expected 'DELETE','INSERT', 'SELECT', or 'UPDATE'."
Can you help? Thanks.
MS Access SQL is limited, and Access doesn't know the WITH clause. I created the tables (all fields of type int) and rewrote the query with subqueries in the FROM clause. This query works:
SELECT CTE.invoiceCTE,
CTE.emp_idCTE,
CTE.origAmtCTE,
CTE.origAmtCTE-CTE2.payAmtCTE2 as bal,
'UNSETTLED' as status
FROM
(SELECT invoice as invoiceCTE,
MAX(emp_id) as emp_idCTE,
SUM(amount) as origAmtCTE
FROM ord_tbl
GROUP BY invoice) as CTE
LEFT JOIN
( SELECT invoice as invoiceCTE2,
SUM(amount) as payAmtCTE2
FROM pay_tbl
GROUP BY invoice) as CTE2
ON CTE.invoiceCTE=CTE2.invoiceCTE2
WHERE CTE.emp_idCTE=101
AND (CTE.origAmtCTE-CTE2.payAmtCTE2>0 OR CTE2.payAmtCTE2 IS NULL)
I don't know what emp_id is. If it is some kind of customer id, you'd have only one per invoice and you'd need this SQL:
SELECT
ord_tbl.invoice,
First(ord_tbl.emp_id) AS ErsterWertvonemp_id,
Sum(ord_tbl.amount) AS origAmt,
Sum([ord_tbl].[amount])-Sum([pay_tbl].[amount]) AS bal,
"unsettled" AS status
FROM
ord_tbl LEFT JOIN pay_tbl
ON ord_tbl.invoice = pay_tbl.invoice
GROUP BY ord_tbl.invoice
HAVING (((Sum([ord_tbl].[amount])-Sum([pay_tbl].[amount]))>0));
If you want to select only the ones with emp_id=101 you'd need this:
SELECT
ord_tbl.invoice,
ord_tbl.emp_id,
Sum(ord_tbl.amount) AS origAmt,
Sum([ord_tbl].[amount])-Sum([pay_tbl].[amount]) AS bal,
"unsettled" AS status
FROM
ord_tbl LEFT JOIN pay_tbl
ON ord_tbl.invoice = pay_tbl.invoice
GROUP BY
ord_tbl.invoice,
ord_tbl.emp_id
HAVING (
((ord_tbl.emp_id)=101)
AND
((Sum([ord_tbl].[amount])-Sum([pay_tbl].[amount]))>0)
);
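One thing worth checking before using the two queries above: they join ord_tbl to pay_tbl row by row and only then sum, so an invoice with several order lines and several payments may have its amounts counted more than once. The sketch below compares that shape with the aggregate-first shape on the sample data. It runs on an in-memory SQLite database rather than Access (an assumption), so the Access-specific syntax is replaced by plain SQL, and COALESCE (roughly Nz in Access) is my addition to treat missing payments as 0.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MS Access (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ord_tbl (invoice INT, emp_id INT, prod_id INT, amount INT);
INSERT INTO ord_tbl VALUES
  (123,101,1,1000), (123,101,2,500), (123,101,3,500),
  (124,101,2,300), (125,102,3,200);
CREATE TABLE pay_tbl (invoice INT, new_invoice INT, amount INT);
INSERT INTO pay_tbl VALUES (123,321,300), (123,322,200), (124,323,300), (125,324,100);
""")

# Join first, then sum: invoice 123 yields 3 order rows x 2 payment rows.
joined_first = conn.execute("""
SELECT o.invoice, SUM(o.amount) AS origAmt, SUM(o.amount) - SUM(p.amount) AS bal
FROM ord_tbl o
LEFT JOIN pay_tbl p ON o.invoice = p.invoice
WHERE o.invoice = 123
GROUP BY o.invoice
""").fetchone()

# Sum each table per invoice first, then join the totals.
aggregated_first = conn.execute("""
SELECT o.invoice, o.origAmt, o.origAmt - COALESCE(p.payAmt, 0) AS bal
FROM (SELECT invoice, SUM(amount) AS origAmt FROM ord_tbl GROUP BY invoice) o
LEFT JOIN (SELECT invoice, SUM(amount) AS payAmt FROM pay_tbl GROUP BY invoice) p
  ON o.invoice = p.invoice
WHERE o.invoice = 123
""").fetchone()

print("join then sum:", joined_first)
print("sum then join:", aggregated_first)
```

Only the aggregate-first version reproduces the 2000 / 1500 from the expected result in the question, so the subquery form is probably the safer one to port to Access.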
I have created the following query to use in a view
SELECT
*
FROM
customers c
JOIN
customer_business cb
ON
c.customer_id = cb.customer_id
union
SELECT
*
FROM
customers c
LEFT JOIN
customer_business
ON
business_id=NULL;
It does its job perfectly. It shows all customers with their associated businesses and, at the end, all customers again with the business info set to null.
customer_id | business_id
--------------------------------
1 | 1
2 | 1
2 | 2
1 | NULL
2 | NULL
3 | NULL
But the problem is that the UNION gives the view very poor performance.
I tried to do it with a LEFT JOIN alone, but that doesn't show all the customers with a null business, just the ones without any associated businesses.
I know that the way to speed up my view is to remove that UNION, but I can't figure out how.
Can anyone help me?
Thanks
EDIT
Here's an example
Customer Table
customer_id | name
--------------------------------
1 | test1
2 | test2
3 | test3
Customer_business Table
customer_business_id | customer_id | business_id
----------------------------------------------------------
1 | 1 | 1
2 | 1 | 2
3 | 1 | 3
4 | 2 | 1
5 | 2 | 2
Expected query result:
name | customer_id | business_id
----------------------------------------------------------
test1 | 1 | 1
test1 | 1 | 2
test1 | 1 | 3
test2 | 2 | 1
test2 | 2 | 2
test1 | 1 | NULL
test2 | 2 | NULL
test3 | 3 | NULL
Updating it based on the comments below and the output you want.
Note that I have used UNION ALL, which is faster than UNION because UNION applies DISTINCT to remove duplicate rows, which isn't needed in your case. Also, make sure customer_id is the primary key of the Customer table, and try adding a non-unique index on customer_id in the Customer_Business table; that should help with performance.
SELECT name,
C.customer_id,
business_id
FROM Customer C
INNER JOIN Customer_Business CB
ON C.customer_id = CB.customer_id
UNION ALL
SELECT name,
C.customer_id,
NULL
FROM Customer C
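Here is a quick sketch of that UNION ALL query on the example data, including the suggested primary key and index. It uses an in-memory SQLite database as a stand-in for your actual server (an assumption), and the index name idx_cb_customer is made up.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (customer_id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO Customer VALUES (1,'test1'), (2,'test2'), (3,'test3');
CREATE TABLE Customer_Business (
  customer_business_id INTEGER PRIMARY KEY,
  customer_id INT,
  business_id INT
);
INSERT INTO Customer_Business VALUES (1,1,1), (2,1,2), (3,1,3), (4,2,1), (5,2,2);
-- Hypothetical index name; a non-unique index on the join column.
CREATE INDEX idx_cb_customer ON Customer_Business (customer_id);
""")

# First branch: one row per customer/business pair.
# Second branch: one extra row per customer with a NULL business.
rows = conn.execute("""
SELECT name, C.customer_id, business_id
FROM Customer C
INNER JOIN Customer_Business CB ON C.customer_id = CB.customer_id
UNION ALL
SELECT name, C.customer_id, NULL
FROM Customer C
""").fetchall()

for row in rows:
    print(row)
```

The two branches can never produce the same row (one always has a business_id, the other never does), which is why skipping the duplicate removal does not change the result.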
Apart from the UNION, which we know is not performant, the other thing that slows down your query is the join condition in the second SELECT, ON idbusiness = NULL.
I propose to edit your query like this and check how it performs as a view:
SELECT c.customer_id, idbusiness
FROM customers c
JOIN customer_business cb ON c.customer_id = cb.customer_id
UNION
SELECT customer_id, NULL
FROM customers c
EDIT:
Looking for an alternative, you could try this. It should return the same output (I've replaced the null values with 0), but I don't think it's faster:
SELECT c.customer_id, idbusiness
FROM customers c
INNER JOIN (
SELECT customer_id, idbusiness
FROM customer_business
UNION
SELECT 0 , 0
)b ON ( c.customer_id = b.customer_id )
OR (
b.idbusiness =0
)
Alternatively, you could put only the subquery b into a view, or get rid of the UNION by inserting the values 0,0 as a record in the customer_business table.
I have a table say :
id| AccID | Subject | Date
1 | 103 | Open HOuse 1 | 11/24/2011 9:00:00 AM
2 | 103 | Open HOuse 2 | 11/25/2011 10:00:00 AM
3 | 72 | Open House 3 | 11/26/2011 1:10:28 AM
4 | 82 | OPen House 4 | 11/27/2011 5:00:29 PM
5 | 82 | OPen House 5 | 11/22/2011 5:00:29 PM
From the above table, I need all the unique values of AccID. But if there are two or more rows with the same AccID, then I need the one with the earlier date (among the rows that have the same AccID).
So, from the above table, the output should be the ids:
1
3
5
Can anyone please help me with this? Thanks
SELECT t1.*
FROM [MyTable] t1
INNER JOIN
(
SELECT AccID, MIN(Date) Date
FROM [MyTable]
GROUP BY AccID
) t2 ON t1.AccID = t2.AccID AND t1.Date = t2.Date
More than just the AccID but...
WITH SEL
AS
(
SELECT AccID, MIN(Date) AS MinDate
FROM [MyTable]
GROUP BY AccID
)
SELECT t.*
FROM [MyTable] t
JOIN SEL ON SEL.AccID = t.AccID AND SEL.MinDate = t.Date
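For a quick check of the earliest-row-per-AccID join, here is a minimal sketch on the sample rows. It runs on an in-memory SQLite database instead of SQL Server, and it stores the dates as ISO strings so that MIN() compares them chronologically; both are assumptions made for the illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (id INT, AccID INT, Subject TEXT, Date TEXT);
INSERT INTO MyTable VALUES
  (1,103,'Open HOuse 1','2011-11-24 09:00:00'),
  (2,103,'Open HOuse 2','2011-11-25 10:00:00'),
  (3, 72,'Open House 3','2011-11-26 01:10:28'),
  (4, 82,'OPen House 4','2011-11-27 17:00:29'),
  (5, 82,'OPen House 5','2011-11-22 17:00:29');
""")

# Find the earliest date per AccID, then join back to fetch the full row.
ids = [row[0] for row in conn.execute("""
SELECT t1.id
FROM MyTable t1
INNER JOIN (SELECT AccID, MIN(Date) AS Date
            FROM MyTable
            GROUP BY AccID) t2
  ON t1.AccID = t2.AccID AND t1.Date = t2.Date
ORDER BY t1.id
""")]

print(ids)
```

If two rows of the same AccID share the exact same earliest date, this join returns both of them, so a tie-breaker would be needed in that case.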