Calculating acceptance_ratio with LEFT JOIN and SELF JOIN and aggregate function - mysql

Trying to calculate daily acceptance ratios from the 'connecting' table which has 4 fields with sample values:
date action sender_id recipient_id
'2017-01-05', 'request_link', 'frank', 'joe'
'2017-01-06', 'request_link', 'sally', 'ann'
'2017-01-07', 'request_link', 'bill', 'ted'
'2017-01-07', 'accept_link', 'joe', 'frank'
'2017-01-06', 'accept_link', 'ann', 'sally'
'2017-01-06', 'accept_link', 'ted', 'bill'
Because there are 0 accepts and 1 request on 01-05, its daily acceptance ratio should be 0/1 = 0. Similarly, the ratio for 01-06 should be 2/1, and it should be 1/1 for 01-07.
It is important however that each accept_link has a corresponding request_link where the sender_id of the request_link = the recipient_id of the accept_link (and vice versa). So here a self-join is required I believe to ensure that Joe accepts Frank's request, regardless of the date.
How can the below query be corrected so that the aggregation works correctly while retaining the required join conditions? Will the query calculate correctly as is if the two WHERE conditions are removed, or are they necessary?
SELECT f1.date,
SUM(CASE WHEN f2.action = 'accept_link' THEN 1 ELSE 0 END) /
SUM(CASE WHEN f2.action = 'request_link' THEN 1 ELSE 0 END) AS acceptance_ratio
FROM connecting f1
LEFT JOIN connecting f2
ON f1.sender_id = f2.recipient_id
LEFT JOIN connecting f2
ON f1.recipient_id = f2.sender_id
WHERE f1.action = 'request_link'
AND f2.action = 'accept_link'
GROUP BY f1.date
ORDER BY f1.date ASC
Expected output should look something like:
date acceptance_ratio
'2017-01-05' 0.0000
'2017-01-06' 2.0000
'2017-01-07' 1.0000
Thanks in advance.

I don't think you need two self joins here. Instead, use conditional aggregation over the entire table to count the requests and accepts which happened on each day, with a single LEFT JOIN back to the same table only to confirm that each accept has a matching request:
SELECT t.date,
CASE WHEN t.num_requests = 0
THEN 'No requests available'
ELSE CAST(t.num_accepts / t.num_requests AS CHAR(50))
END AS acceptance_ratio
FROM
(
SELECT c1.date,
SUM(CASE WHEN c1.action = 'accept_link' AND c2.action IS NOT NULL
THEN 1 ELSE 0 END) AS num_accepts,
SUM(CASE WHEN c1.action = 'request_link' THEN 1 ELSE 0 END) AS num_requests
FROM connecting c1
LEFT JOIN connecting c2
ON c1.action = 'accept_link' AND
c2.action = 'request_link' AND
c1.sender_id = c2.recipient_id AND
c1.recipient_id = c2.sender_id
GROUP BY c1.date
) t
ORDER BY t.date
Note here that I use a CASE expression to handle divide by zero, which could occur should a certain day have no requests. I also assume here that the same invitation will not be sent out more than once; otherwise an accept would join to several requests and be counted more than once.
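To check the idea against the sample rows from the question, here is a minimal sketch that runs the same kind of query through Python's built-in sqlite3 module. SQLite is only a stand-in for MySQL here, and the join pairs each accept with a request going the opposite way (sender matched to recipient in both directions). Returning NULL instead of a message string for days without requests is my own simplification.

```python
import sqlite3

# SQLite is used as a stand-in for MySQL; the query avoids MySQL-only syntax.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE connecting (date TEXT, action TEXT, sender_id TEXT, recipient_id TEXT)"
)
conn.executemany(
    "INSERT INTO connecting VALUES (?, ?, ?, ?)",
    [
        ("2017-01-05", "request_link", "frank", "joe"),
        ("2017-01-06", "request_link", "sally", "ann"),
        ("2017-01-07", "request_link", "bill", "ted"),
        ("2017-01-07", "accept_link", "joe", "frank"),
        ("2017-01-06", "accept_link", "ann", "sally"),
        ("2017-01-06", "accept_link", "ted", "bill"),
    ],
)

query = """
SELECT t.date,
       CASE WHEN t.num_requests = 0 THEN NULL          -- avoid divide by zero
            ELSE 1.0 * t.num_accepts / t.num_requests  -- force non-integer division
       END AS acceptance_ratio
FROM (
    SELECT c1.date,
           SUM(CASE WHEN c1.action = 'accept_link' AND c2.action IS NOT NULL
                    THEN 1 ELSE 0 END) AS num_accepts,
           SUM(CASE WHEN c1.action = 'request_link' THEN 1 ELSE 0 END) AS num_requests
    FROM connecting c1
    LEFT JOIN connecting c2
           ON c1.action = 'accept_link'
          AND c2.action = 'request_link'
          AND c1.sender_id = c2.recipient_id
          AND c1.recipient_id = c2.sender_id
    GROUP BY c1.date
) t
ORDER BY t.date
"""

ratios = conn.execute(query).fetchall()
for day, ratio in ratios:
    print(day, ratio)
```

With the six sample rows this prints 0.0, 2.0 and 1.0 for the three days, which matches the expected output in the question.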

Related

MySQL "AND ONLY HAS" type operator/function?

MySQL here. I have the following data model:
[applications]
===
id : PK
status : VARCHAR
...lots of other fields
[invoices]
===
id : PK
application_id : FK to applications.id
status : VARCHAR
... lot of other fields
It is possible for the same application to have 0+ invoices associated with it, each with a different status. I am trying to write a query that looks for applications that:
have a status of "Pending"; and
have only invoices whose status is "Accepted"
My best attempt at such a query is:
SELECT a.id,
i.id,
a.status,
i.status
FROM application a
INNER JOIN invoice i ON a.id = i.application_id
WHERE a.status = "Pending"
AND i.status = "Accepted"
The problem here is that this query does not exclude applications that are associated with non-Accepted invoices. Hence it might return a row of, say:
+--------+--------+-----------+-----------+
| id | id | status | status |
+--------+--------+-----------+-----------+
| 123 | 456 | Pending | Accepted |
+--------+--------+-----------+-----------+
However, when I query the invoice table for any invoices tied to application id = 123, there are many non-Accepted invoices that come back in the results. It's almost as if I wished SQL supported some type of "AND ONLY HAS" so I could make my clause: "AND ONLY HAS i.status = 'Accepted'"
So I'm missing the clause that excludes results for applications with 1+ non-Accepted invoices. Any ideas where I'm going awry?
You can use the following logic:
SELECT *
FROM application
WHERE status = 'pending'
AND EXISTS (
SELECT 1
FROM invoice
WHERE invoice.application_id = application.id
HAVING SUM(CASE WHEN invoice.status = 'accepted' THEN 1 ELSE 0 END) > 0 -- count of accepted invoices > 0
AND SUM(CASE WHEN invoice.status = 'accepted' THEN 0 ELSE 1 END) = 0 -- count of any other invoices = 0
)
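If the HAVING-inside-EXISTS form feels hard to read, the same "only has" rule can also be phrased as EXISTS plus NOT EXISTS. This is a different phrasing from the query above, not a copy of it, and it assumes invoice.status is never NULL. The sketch below runs it with Python's sqlite3 as a stand-in for MySQL; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE application (id INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE invoice (id INTEGER PRIMARY KEY, application_id INTEGER, status TEXT);

-- Hypothetical sample data
INSERT INTO application VALUES (1, 'Pending'), (2, 'Pending'), (3, 'Pending'), (4, 'Approved');
INSERT INTO invoice VALUES
    (10, 1, 'Accepted'), (11, 1, 'Accepted'),   -- app 1: only Accepted invoices
    (12, 2, 'Accepted'), (13, 2, 'Rejected'),   -- app 2: mixed
    (14, 4, 'Accepted');                        -- app 4: not Pending; app 3 has no invoices
""")

query = """
SELECT a.id
FROM application a
WHERE a.status = 'Pending'
  AND EXISTS (                      -- at least one Accepted invoice
        SELECT 1 FROM invoice i
        WHERE i.application_id = a.id AND i.status = 'Accepted')
  AND NOT EXISTS (                  -- and no invoice with any other status
        SELECT 1 FROM invoice i
        WHERE i.application_id = a.id AND i.status <> 'Accepted')
ORDER BY a.id
"""

matching_ids = [row[0] for row in conn.execute(query)]
print(matching_ids)
```

Only application 1 comes back: application 2 has a Rejected invoice, application 3 has no invoices at all, and application 4 is not Pending.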

How to display mySQL rows even if there are not SUM results?

I am trying to display a report from mySQL. Here is my current query:
SELECT *,
Sum(CASE
WHEN alerts_data_status = 'goal' THEN 1
ELSE 0
END) AS goal,
Sum(CASE
WHEN alerts_data_status = 'delivered' THEN 1
ELSE 0
END) AS delivered,
Sum(CASE
WHEN alerts_data_status = 'closed' THEN 1
ELSE 0
END) AS closed
FROM alerts_data
WHERE alerts_data.company_id = 1
GROUP BY alerts_data.alerts_data_id
The thing is that if an alerts_data.id has 0 goal, 0 delivered and 0 closed, it won't be shown in the results. The query shows only the alerts_data.id values with at least 1 goal, 1 delivered or 1 closed.
How can I achieve this?
Example output
company ---- id --- goal --- delivered --- closed
1 ---- 32 --- 1 ------ 4 ----- 10
1 ---- 11 --- 0 ------ 1 ----- 1
Thank you
I think the issue that you are having is that there are no rows in the table for the company. Use an aggregation query with no GROUP BY:
SELECT 1 as company_id,
COALESCE(SUM(alerts_data_status = 'goal'), 0) AS goal,
COALESCE(SUM(alerts_data_status = 'delivered'), 0) AS delivered,
COALESCE(SUM(alerts_data_status = 'closed'), 0) AS closed
FROM alerts_data ad
WHERE ad.company_id = 1;
With no GROUP BY, this is guaranteed to return one row -- even if the WHERE clause filters out all rows. A GROUP BY returns one row per group, so if all rows are filtered out, then there are no groups and no rows in the result set.
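The difference is easy to see by running both forms against a company that has no rows. A small sketch with Python's sqlite3 (standing in for MySQL; the three sample rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE alerts_data (alerts_data_id INTEGER, company_id INTEGER, alerts_data_status TEXT);
-- Hypothetical rows: only company 1 has any data
INSERT INTO alerts_data VALUES (32, 1, 'goal'), (32, 1, 'closed'), (11, 1, 'delivered');
""")

# Aggregate without GROUP BY: always exactly one row, even when nothing matches.
no_group_rows = conn.execute("""
    SELECT COALESCE(SUM(alerts_data_status = 'goal'), 0)      AS goal,
           COALESCE(SUM(alerts_data_status = 'delivered'), 0) AS delivered,
           COALESCE(SUM(alerts_data_status = 'closed'), 0)    AS closed
    FROM alerts_data
    WHERE company_id = 2
""").fetchall()

# Same aggregate with GROUP BY: no matching rows means no groups, so no result rows.
grouped_rows = conn.execute("""
    SELECT company_id,
           COALESCE(SUM(alerts_data_status = 'goal'), 0) AS goal
    FROM alerts_data
    WHERE company_id = 2
    GROUP BY company_id
""").fetchall()

print(no_group_rows)   # one row of zeros
print(grouped_rows)    # empty list
```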
If you wanted to support multiple company ids, you could use a LEFT JOIN:
SELECT company_id,
COALESCE(SUM(alerts_data_status = 'goal'), 0) AS goal,
COALESCE(SUM(alerts_data_status = 'delivered'), 0) AS delivered,
COALESCE(SUM(alerts_data_status = 'closed'), 0) AS closed
FROM (SELECT 1 as company_id UNION ALL
SELECT 2 as company_id
) c LEFT JOIN
alerts_data ad
USING (company_id)
GROUP BY company_id;
The LEFT JOIN guarantees that there are rows for each company, so each will be in the result set.
You can also phrase this as:
SELECT c.company_id,
COALESCE(SUM(ad.alerts_data_status = 'goal'), 0) AS goal,
COALESCE(SUM(ad.alerts_data_status = 'delivered'), 0) AS delivered,
COALESCE(SUM(ad.alerts_data_status = 'closed'), 0) AS closed
FROM companies c LEFT JOIN
alerts_data ad
on c.company_id = ad.company_id
WHERE c.company_id IN (1) -- or a longer list
GROUP BY c.company_id;
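Here is a quick sketch of that LEFT JOIN from a companies table, run through Python's sqlite3 as a stand-in for MySQL. The companies table and its company_id column come from the query above; the sample rows are assumptions made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (company_id INTEGER PRIMARY KEY);
CREATE TABLE alerts_data (alerts_data_id INTEGER, company_id INTEGER, alerts_data_status TEXT);
-- Hypothetical rows: company 2 exists but has no alerts at all
INSERT INTO companies VALUES (1), (2);
INSERT INTO alerts_data VALUES (32, 1, 'goal'), (32, 1, 'closed'), (11, 1, 'closed');
""")

query = """
SELECT c.company_id,
       COALESCE(SUM(ad.alerts_data_status = 'goal'), 0)      AS goal,
       COALESCE(SUM(ad.alerts_data_status = 'delivered'), 0) AS delivered,
       COALESCE(SUM(ad.alerts_data_status = 'closed'), 0)    AS closed
FROM companies c
LEFT JOIN alerts_data ad ON c.company_id = ad.company_id
WHERE c.company_id IN (1, 2)
GROUP BY c.company_id
ORDER BY c.company_id
"""

report = conn.execute(query).fetchall()
for row in report:
    print(row)
```

Company 2 still appears, with zeros, because the LEFT JOIN keeps its row from companies and COALESCE turns the NULL sums into 0.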
You need a LEFT JOIN of alerts_list to alerts_data:
SELECT t.alerts_id,
COALESCE(SUM(a.alerts_data_status = 'goal'), 0) AS goal,
COALESCE(SUM(a.alerts_data_status = 'delivered'), 0) AS delivered,
COALESCE(SUM(a.alerts_data_status = 'closed'), 0) AS closed
FROM alerts_list AS t LEFT JOIN alerts_data AS a
ON a.alerts_data_id = t.alerts_id AND a.company_id = t.company_id
WHERE t.company_id = 1
GROUP BY t.alerts_id

MySQL Select User id based on multiple AND conditions on same table

I want to select a user id from a single table based on multiple AND conditions.
UserID FieldID Value
-----------------------------------
1 51 Yes
1 6 Dog
2 6 Cat
1 68 TX
1 69 78701
2 68 LA
What I'm trying to get in simple words:
if user search for texas or 78701,
Select userId where (68 = TX OR 69=78701) AND (51=yes) AND (6=Dog)
This should return user id 1.
This is what I tried, but returns null.
SELECT user_id FROM `metadata`
WHERE ( (`field_id` = '68' AND value LIKE '%TX%')
OR (`field_id` = '69' AND value LIKE '%78701%') )
AND `field_id` = '51' AND value = 'Yes'
AND `field_id` = '6' AND value = 'Dog'
You can use GROUP BY with a HAVING clause that makes use of multiple conditional aggregates:
SELECT user_id
FROM metadata
GROUP BY user_id
HAVING SUM(field_id = '68' AND value LIKE '%TX%' OR
field_id = '69' AND value LIKE '%78701%') >= 1
AND
SUM(field_id = '51' AND value = 'Yes') >= 1
AND
SUM(field_id = '6' AND value = 'Dog') >= 1
Explanation: In MySQL a boolean expression, like
field_id = '51' AND value = 'Yes'
returns 1 when true, 0 when false.
Also, each predicate of HAVING clause is applied to the whole group of records, as defined by GROUP BY.
Hence, predicate:
SUM(field_id = '51' AND value = 'Yes') >= 1
is like saying: return only those UserID groups having at least one (>=1) record with
field_id = '51' AND value = 'Yes' -> true
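To see the HAVING approach work on the rows from the question, here is a minimal sketch using Python's sqlite3 as a stand-in for MySQL (SQLite also evaluates a boolean expression to 1 or 0). The column names user_id and field_id follow the asker's query, and user 3 is an extra row I added to show a user who matches TX and Dog but has no field 51.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (user_id INTEGER, field_id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO metadata VALUES (?, ?, ?)",
    [
        (1, 51, "Yes"), (1, 6, "Dog"), (2, 6, "Cat"),
        (1, 68, "TX"), (1, 69, "78701"), (2, 68, "LA"),
        (3, 68, "TX"), (3, 6, "Dog"),  # hypothetical user with no field 51 row
    ],
)

query = """
SELECT user_id
FROM metadata
GROUP BY user_id
HAVING SUM((field_id = 68 AND value LIKE '%TX%') OR
           (field_id = 69 AND value LIKE '%78701%')) >= 1   -- state or zip matches
   AND SUM(field_id = 51 AND value = 'Yes') >= 1            -- field 51 is Yes
   AND SUM(field_id = 6  AND value = 'Dog') >= 1            -- field 6 is Dog
ORDER BY user_id
"""

user_ids = [row[0] for row in conn.execute(query)]
print(user_ids)
```

Each SUM counts, per user, how many rows satisfy one condition, so a user qualifies only when every condition is met by at least one of their rows.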
Your table structure resembles attribute+value modelling, which essentially splits up the columns of a row into individual pairs, and has the side effect of very weak typing.
As you've noted, this can also make things tricky to query, since you have to reason over multiple rows in order to make sense of the original data model.
One approach could be to pick one criterion as the 'primary' one, and then apply the additional criteria by reasoning over the shredded data, joined back by user id:
SELECT DISTINCT m.user_id
FROM `metadata` m
WHERE ((`field_id` = '68' AND value LIKE '%TX%')
OR (`field_id` = '69' AND value LIKE '%78701%'))
AND EXISTS
(SELECT 1
FROM `metadata` m2
WHERE m2.user_id = m.user_id AND m2.field_id = '51' AND m2.value = 'Yes')
AND EXISTS
(SELECT 1
FROM `metadata` m3
WHERE m3.user_id = m.user_id AND m3.field_id = '6' AND m3.value = 'Dog');
However, IMO, it may be better to attempt to remodel the table like so (and ideally choose better descriptions for the attributes as columns):
UserID Field51 Field6 Field68 Field69
----------------------------------------
1 Yes Dog TX 78701
2 No Cat LA NULL
This will make things much easier to query.
This approach is typically slower than simply LEFT JOINing that table on each criterion, but it can make the problem simpler to comprehend...
SELECT userid
, MAX(CASE WHEN fieldid = 51 THEN value END) smoker
, MAX(CASE WHEN fieldid = 6 THEN value END) favourite_pet
, MAX(CASE WHEN fieldid = 68 THEN value END) state
, MAX(CASE WHEN fieldid = 69 THEN value END) zip
FROM eav
GROUP
BY userid;
You can use HAVING, or bundle this into a subquery to get the desired results.
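As a rough sketch of the subquery variant, the pivot can be wrapped in a derived table and filtered like an ordinary row. This uses Python's sqlite3 as a stand-in for MySQL, the eav table and column aliases from the query above, and the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eav (userid INTEGER, fieldid INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO eav VALUES (?, ?, ?)",
    [
        (1, 51, "Yes"), (1, 6, "Dog"), (2, 6, "Cat"),
        (1, 68, "TX"), (1, 69, "78701"), (2, 68, "LA"),
    ],
)

pivot_sql = """
SELECT userid,
       MAX(CASE WHEN fieldid = 51 THEN value END) AS smoker,
       MAX(CASE WHEN fieldid = 6  THEN value END) AS favourite_pet,
       MAX(CASE WHEN fieldid = 68 THEN value END) AS state,
       MAX(CASE WHEN fieldid = 69 THEN value END) AS zip
FROM eav
GROUP BY userid
"""

# One row per user, with missing attributes as NULL (None in Python).
pivot_rows = conn.execute(pivot_sql + " ORDER BY userid").fetchall()

# Filter the pivoted rows as if they were columns of a normal table.
filtered_ids = [
    row[0]
    for row in conn.execute(
        f"""
        SELECT userid
        FROM ({pivot_sql}) p
        WHERE (state LIKE '%TX%' OR zip LIKE '%78701%')
          AND smoker = 'Yes'
          AND favourite_pet = 'Dog'
        ORDER BY userid
        """
    )
]

print(pivot_rows)
print(filtered_ids)
```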
SELECT user_id FROM metadata
WHERE
(field_id = '68' AND value LIKE '%TX%')
OR (field_id = '69' AND value LIKE '%78701%')
AND (field_id = '51' AND value = 'Yes')
AND (field_id = '6' AND value = 'Dog');
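I changed your query a little and tried it; it returns user_id 1. Note, however, that AND binds more tightly than OR, so as written it returns every user with a matching field 68 row and never actually checks fields 51 and 6.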

mySQL Select where results from column "b" have column "a" in common

This one is kinda hard to explain, I'll give it a shot.
I have this table where one of the columns is the type column. The salesperson will insert records that will contain a b_id and also an action_id.
with the following code I retrieve some info,
SELECT entry_type, COUNT(DISTINCT(b_name)) AS '# of prospects',
SUM(case when entries.out_id = '1' then 1 else 0 end) 'No Interest',
SUM(case when entries.out_id = '2' then 1 else 0 end) 'Needs Follow Up',
SUM(case when entries.out_id = '3' then 1 else 0 end) 'Appointment Booked'
FROM entries
LEFT JOIN outcomes on outcomes.out_id = entries.out_id
LEFT JOIN type on type.type_id = entries.type_id
LEFT JOIN business on entries.b_id = business.b_id
LEFT JOIN users on users.user_id = entries.user_id
WHERE b_name LIKE 'July%' AND (entries.type_id = 1 OR entries.type_id = 2 OR entries.type_id = 14)
GROUP BY entry_type;
The result is the following
ACTION # OF PROSPECTS NO INTEREST NEEDS FOLLOW UP APP. BOOKED
Call 4 1 2 1
Follow Up Contact 2 0 0 2
Walk In 1 1 0 0
The thing is, there are 2 possible initial actions, "Call" or "Walk In". "Follow Up Contact" is used if necessary after an initial call or walk in. As you can see, I have 2 appointments booked that originated from this follow up. Here is the question: how do I know if this follow up contact is related to an initial call or an initial walk in?
I need to be able to generate a report specifying how many appointments were originated from each type of approach ( call or walk in ).
Thanks in advance
Use a self-join:
SELECT e1.entry_type AS original_type, COUNT(e2.b_id) AS count
FROM entries AS e1
LEFT JOIN entries AS e2 ON e2.b_id = e1.b_id AND e2.entry_type = 'Follow Up Contact'
WHERE e1.entry_type IN ('Call', 'Walk In')
GROUP BY original_type
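A small sketch of how that self-join behaves, run with Python's sqlite3 as a stand-in for MySQL. It assumes entry_type is available as a column on entries (in the question it may come from the joined type table) and that each business has a single initial entry; the rows are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (b_id INTEGER, entry_type TEXT)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [
        # Hypothetical data: business 1 was called and followed up twice,
        # business 2 walked in and was followed up once, business 3 was only called.
        (1, "Call"), (1, "Follow Up Contact"), (1, "Follow Up Contact"),
        (2, "Walk In"), (2, "Follow Up Contact"),
        (3, "Call"),
    ],
)

query = """
SELECT e1.entry_type AS original_type,
       COUNT(e2.b_id) AS follow_ups      -- COUNT skips the NULLs from unmatched rows
FROM entries AS e1
LEFT JOIN entries AS e2
       ON e2.b_id = e1.b_id
      AND e2.entry_type = 'Follow Up Contact'
WHERE e1.entry_type IN ('Call', 'Walk In')
GROUP BY e1.entry_type
ORDER BY original_type
"""

follow_ups_by_origin = conn.execute(query).fetchall()
print(follow_ups_by_origin)
```

To count only follow-ups that ended in a booked appointment, an extra condition on the outcome column (out_id in the question) could be added to the ON clause.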

mysql conditional field update

I have 3 columns in CATEGORY TABLE for storing pre-calculated counts of records for it in another table PRODUCTS.
CATEGORY(c_id,name,c30,c31,c32)
c30=count for New Products (value 30)
c31 count for used products (value 31)
c32 count for Damaged products (value 32)
PRODUCT(p_id,c_id,name,condition)
condition can be 30,31 or 32.
I am thinking of writing a single UPDATE statement so that it will update the respective category count.
Although the statement below is syntactically wrong, I am looking for a similar type of solution.
select case product.condition
when 30 then update category set category.c30=category.c30+1 where category.c_id=product.category3
when 31 then update category set category.c31=category.c31+1 where category.c_id=product.category3
when 32 then update category set category.c32=category.c32+1 where category.c_id=product.category3
end case
from product
where product.c_id=12
Any suggestion!
You can do this:
UPDATE CATEGORY c
INNER JOIN
(
SELECT
c_id,
SUM(CASE WHEN `condition` = 30 THEN 1 ELSE 0 END) c30,
SUM(CASE WHEN `condition` = 31 THEN 1 ELSE 0 END) c31,
SUM(CASE WHEN `condition` = 32 THEN 1 ELSE 0 END) c32
FROM product
GROUP BY c_id
) p ON c.c_id = p.c_id
SET c.c30 = p.c30,
c.c31 = p.c31,
c.c32 = p.c32;
You can join both the tables and then update the value in same join query.
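For comparison, here is a sketch of a variant that uses correlated subqueries instead of UPDATE with a JOIN. This is a swapped-in technique, not the query above: it recomputes all three counts for every category and also resets categories that have no products to 0, which the INNER JOIN version leaves untouched. It runs on Python's sqlite3 as a stand-in for MySQL, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (c_id INTEGER PRIMARY KEY, name TEXT,
                       c30 INTEGER, c31 INTEGER, c32 INTEGER);
CREATE TABLE product (p_id INTEGER PRIMARY KEY, c_id INTEGER, name TEXT,
                      `condition` INTEGER);   -- backticks: CONDITION is reserved in MySQL

-- Hypothetical data: category 13 has stale counts and no products
INSERT INTO category VALUES (12, 'phones', 0, 0, 0), (13, 'laptops', 5, 5, 5);
INSERT INTO product VALUES (1, 12, 'a', 30), (2, 12, 'b', 30),
                           (3, 12, 'c', 31), (4, 12, 'd', 32);
""")

# Recompute each counter from the product table, one correlated subquery per column.
conn.execute("""
UPDATE category
SET c30 = (SELECT COUNT(*) FROM product p
           WHERE p.c_id = category.c_id AND p.`condition` = 30),
    c31 = (SELECT COUNT(*) FROM product p
           WHERE p.c_id = category.c_id AND p.`condition` = 31),
    c32 = (SELECT COUNT(*) FROM product p
           WHERE p.c_id = category.c_id AND p.`condition` = 32)
""")

counts = conn.execute(
    "SELECT c_id, c30, c31, c32 FROM category ORDER BY c_id"
).fetchall()
print(counts)
```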