I have written a query that should, I believe, return every email address from the first table (contacts.sid208), whether or not it has matches in the joined tables.
If I run SELECT COUNT(email), COUNT(DISTINCT email) FROM contacts.sid208 I get 200,000 and 175,000.
With this in mind, and since I'm using left joins, shouldn't the email counts from the following query be the same?
SELECT
COUNT(email), COUNT(DISTINCT email)
FROM
(SELECT
co.email,
env.env_medium,
CAST(MIN(co.created) AS DATE) AS first_contact,
MIN(CASE
WHEN my.my_id = 581 THEN my.data
END) AS Created,
MIN(CASE
WHEN my.my_id = 3347 THEN my.data
END) AS Upgraded
FROM
contacts.sid208 co
LEFT JOIN contacts.my208 my ON co.id = my.eid
LEFT JOIN contacts.env208 env ON env.eid = co.id
WHERE
my_id = 581 OR my_id = 3347
GROUP BY email) b1
But the results here, if I keep things proportionate, are 150,000 and 150,000.
I expected the distinct count to be 175,000.
My understanding of LEFT JOIN was that all records from contacts.sid208 would be maintained, regardless of whether or not they appear in my208 or env208.
Is my understanding flawed here? Hope my query makes sense to folk, if there's any more info I can add to make my question clearer let me know.
For a left join, move the conditions to the join as well:
SELECT
COUNT(email), COUNT(DISTINCT email)
FROM
(SELECT
co.email,
env.env_medium,
CAST(MIN(co.created) AS DATE) AS first_contact,
MIN(CASE
WHEN my.my_id = 581 THEN my.data
END) AS Created,
MIN(CASE
WHEN my.my_id = 3347 THEN my.data
END) AS Upgraded
FROM
contacts.sid208 co
LEFT JOIN contacts.my208 my
ON co.id = my.eid
AND (my_id = 581 OR my_id = 3347)
LEFT JOIN contacts.env208 env ON env.eid = co.id
GROUP BY email) b1
If you don't, the join is performed first, and it does return every row from sid208, with NULLs in the my208 columns wherever there is no match. The WHERE clause is applied afterwards, though, and because NULL never equals 581 or 3347, those unmatched rows are removed again.
When you move the conditions into the join, you keep all rows from sid208, and a my208 row is only attached when its eid matches the contact id and its my_id is either 581 or 3347.
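To see the difference on a small scale, here is a minimal sketch using Python's built-in sqlite3 module. The tables sid and my and their three contacts are made-up stand-ins for contacts.sid208 and contacts.my208, not the asker's real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for contacts.sid208 and contacts.my208
    CREATE TABLE sid (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE my (eid INTEGER, my_id INTEGER, data TEXT);
    INSERT INTO sid VALUES (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
    -- contact 1 has a wanted attribute, contact 2 only an unwanted one, contact 3 none
    INSERT INTO my VALUES (1, 581, '2014-01-01'), (2, 999, 'x');
""")

# Filter in WHERE: rows whose my_id is NULL or unwanted are dropped after the join
where_count = conn.execute("""
    SELECT COUNT(DISTINCT co.email)
    FROM sid co
    LEFT JOIN my ON co.id = my.eid
    WHERE my.my_id = 581 OR my.my_id = 3347
""").fetchone()[0]

# Filter in ON: every contact survives, unmatched ones just carry NULLs
on_count = conn.execute("""
    SELECT COUNT(DISTINCT co.email)
    FROM sid co
    LEFT JOIN my ON co.id = my.eid AND (my.my_id = 581 OR my.my_id = 3347)
""").fetchone()[0]

print(where_count, on_count)
```

With the filter in WHERE only one contact is counted; with the filter in ON all three are.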
I have got the result from the complex query below:
SELECT o_items.sku,
o_items.name AS 'title',
o_items.qty_ordered AS 'quantity',
s_orders.base_amount_paid AS 'paid/unpaid'
FROM sales_order_payment s_orders
INNER JOIN (SELECT s.sku, s.name, s.qty_ordered, s.order_id
FROM sales_order_item s
INNER JOIN (SELECT p.entity_id
FROM catalog_product_entity AS p
INNER JOIN catalog_product_entity_int AS a
ON p.row_id = a.row_id
WHERE VALUE >= 0
AND
a.attribute_id =
(SELECT attribute_id
FROM eav_attribute
WHERE attribute_code = 'is_darkhorse')) as q
ON s.product_id = q.entity_id
WHERE s.created_at BETWEEN '2019-01-14' AND '2019-01-16') o_items
ON
s_orders.parent_id = o_items.order_id
This is the order data, covering items that have been paid for and items that have not been paid for yet. An amount represents paid status and NULL represents unpaid status.
I am trying to produce a result that shows, for each product, how much quantity has been paid for and how much has not been paid for yet, based on the data fetched above, but I couldn't get it to work and need help.
Please guide me on how I can proceed to achieve this result.
Use this, where the dots (....) stand for your existing code:
select .... , sum(case when s_orders.base_amount_paid is not null
then o_items.qty_ordered
else 0
end) as paid,
sum(case when s_orders.base_amount_paid is null
then o_items.qty_ordered
else 0
end) as unpaid
From .......
You can use the IF function together with a GROUP BY expression (presuming you're using MySQL as your DBMS):
SELECT c.sku, c.name,
sum(if(c.base_amount_paid is null, 0, c.qty_ordered)) as paid,
sum(if(c.base_amount_paid is null, c.qty_ordered, 0)) as unpaid
FROM catalog_prod_ent_derived c
GROUP BY c.sku, c.name
where catalog_prod_ent_derived represents your whole query as a subquery, with its columns left un-aliased (sku, name, qty_ordered, base_amount_paid).
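As a rough illustration of the conditional-sum idea in both answers, the sketch below runs it against a made-up order_lines table (a hypothetical stand-in for the asker's joined result), using Python's sqlite3 module so it is self-contained. NULL in base_amount_paid is taken to mean unpaid, as described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical flattened result of the asker's query
    CREATE TABLE order_lines (sku TEXT, name TEXT, qty_ordered INTEGER, base_amount_paid REAL);
    INSERT INTO order_lines VALUES
        ('A1', 'Widget', 2, 20.0),
        ('A1', 'Widget', 3, NULL),
        ('B2', 'Gadget', 1, NULL),
        ('B2', 'Gadget', 4, 40.0),
        ('B2', 'Gadget', 5, 50.0);
""")

# Each row contributes its quantity to exactly one of the two sums
rows = conn.execute("""
    SELECT sku, name,
           SUM(CASE WHEN base_amount_paid IS NOT NULL THEN qty_ordered ELSE 0 END) AS paid,
           SUM(CASE WHEN base_amount_paid IS NULL THEN qty_ordered ELSE 0 END) AS unpaid
    FROM order_lines
    GROUP BY sku, name
    ORDER BY sku
""").fetchall()

for row in rows:
    print(row)
```

Summing qty_ordered (rather than 1) gives quantities per status; summing 1 would give the number of order lines instead.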
In the query below there are 13 Mentors, but the result shows 26, and there are 5 School Supervisors, but the result shows 10, which is wrong. This happens because the training has 2 evidence files: with 2 evidence rows, the Mentor and SchoolSupervisor counts are doubled.
Please help me out.
Query:
select t.c_id,t.province,t.district,t.cohort,t.duration,t.venue,t.v_date,t.review_level, t.activity,
SUM(CASE WHEN pr.p_association = "Mentor" THEN 1 ELSE 0 END) as Mentor,
SUM(CASE WHEN pr.p_association = "School Supervisor" THEN 1 ELSE 0 END) as SchoolSupervisor,
(CASE WHEN count(file_id) > 0 THEN "Yes" ELSE "No" END) as evidence
FROM review_m t , review_attndnce ra
LEFT JOIN participant_registration AS pr ON pr.p_id = ra.p_id
LEFT JOIN review_files AS rf ON rf.training_id = ra.c_id
WHERE 1=1 AND t.c_id = ra.c_id
group by t.c_id, ra.c_id order by t.c_id desc
You may perform the aggregations in a separate subquery, and then join to it:
SELECT
t.c_id,
t.province,
t.district,
t.cohort,
t.duration,
t.venue,
t.v_date,
t.review_level,
t.activity,
pr.Mentor,
pr.SchoolSupervisor,
rf.evidence
FROM review_m t
INNER JOIN review_attndnce ra
ON t.c_id = ra.c_id
LEFT JOIN
(
SELECT
p_id,
COUNT(CASE WHEN p_association = 'Mentor' THEN 1 END) AS Mentor,
COUNT(CASE WHEN p_association = 'School Supervisor' THEN 1 END) AS SchoolSupervisor
FROM participant_registration
GROUP BY p_id
) pr
ON pr.p_id = ra.p_id
LEFT JOIN
(
SELECT
training_id,
CASE WHEN COUNT(file_id) > 0 THEN 'Yes' ELSE 'No' END AS evidence
FROM review_files
GROUP BY training_id
) rf
ON rf.training_id = ra.c_id
ORDER BY
t.c_id DESC;
Note that this also fixes another problem your query had, which was that you were selecting many columns which did not appear in the GROUP BY clause. Under this refactor, there is nothing wrong with your current select list, because the aggregation takes place in separate subqueries.
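The doubling can be reproduced with a few rows. The sketch below is a simplified variation on the approach above, not the exact query: it pre-aggregates only the one-to-many files table before joining, and the table names attendance, participant and files are hypothetical stand-ins for review_attndnce, participant_registration and review_files. It runs on Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attendance (c_id INTEGER, p_id INTEGER);
    CREATE TABLE participant (p_id INTEGER, p_association TEXT);
    CREATE TABLE files (training_id INTEGER, file_id INTEGER);
    INSERT INTO attendance VALUES (1, 10), (1, 11), (1, 12), (2, 13);
    INSERT INTO participant VALUES
        (10, 'Mentor'), (11, 'Mentor'), (12, 'School Supervisor'), (13, 'Mentor');
    -- training 1 has two evidence files, training 2 has none
    INSERT INTO files VALUES (1, 100), (1, 101);
""")

counts = """
    SUM(CASE WHEN pr.p_association = 'Mentor' THEN 1 ELSE 0 END),
    SUM(CASE WHEN pr.p_association = 'School Supervisor' THEN 1 ELSE 0 END)
"""

# Joining the raw files table repeats every attendee once per file
naive = conn.execute(f"""
    SELECT ra.c_id, {counts}
    FROM attendance ra
    LEFT JOIN participant pr ON pr.p_id = ra.p_id
    LEFT JOIN files rf ON rf.training_id = ra.c_id
    GROUP BY ra.c_id ORDER BY ra.c_id
""").fetchall()

# Collapsing files to one row per training first keeps the counts intact
fixed = conn.execute(f"""
    SELECT ra.c_id, {counts}, COALESCE(MAX(rf.evidence), 'No')
    FROM attendance ra
    LEFT JOIN participant pr ON pr.p_id = ra.p_id
    LEFT JOIN (SELECT training_id, 'Yes' AS evidence
               FROM files GROUP BY training_id) rf
           ON rf.training_id = ra.c_id
    GROUP BY ra.c_id ORDER BY ra.c_id
""").fetchall()

print(naive)
print(fixed)
```

Training 1 has two files, so the naive version doubles its counts, while the pre-aggregated version does not.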
Try adding this to the WHERE part of your query:
AND pr.p_id IS NOT NULL AND rf.training_id IS NOT NULL
You can add pr.p_id to the GROUP BY to remove the duplicate records there. Since pr is not part of the grouping at the moment, there may be multiple records with the same p_id for the same ra row:
group by t.c_id, ra.c_id, pr.p_id order by t.c_id desc
I'm using a CASE statement in a SELECT statement that arguably should be broken up into smaller parts but I'm one field off completing this task and have passed the point of no return.
Without dropping the whole query in here, the gist of the issue is this:
SELECT
Dogs,
Cats,
COUNT(DISTINCT CASE WHEN my.my_id = 2765 THEN my.data END) AS CountAccounts,
COUNT(DISTINCT CASE WHEN my.my_id = 3347 THEN my.data END) AS CountUpgradedAccounts
my refers to a table that is a lookup table of name value pairs. When my.my_id = 3347 that signifies an "upgraded" account and the data point is the date that the account was upgraded. When my.my_id = 2765 that signifies account creation and the corresponding data point is the accountID.
The my table looks like this:
UserID | my_id | Data
374 | 2765 | 8826
487 | 3347 | 2013-09-01
662 | 2765 | 8826
321 | 2765 | 9213
722 | 3347 | 2014-10-14
852 | 2765 | 8826
487 | 2765 | 9213
When my_id = 2765, I'd like the distinct number of accountIDs that it relates to. In the table above that is 2: Accounts 8826 and 9213.
I know this would be really simple if I was pulling data from my only. But my is woven into my query in a way that complicates things.
In fact, here is the query; perhaps the problem will be easier to see. Note that the last field in the SELECT list is the problem. I don't want to count distinct dates, I want to count distinct accountIDs that have upgraded:
SELECT
sub.name AS ARName,
sub.desc AS ARDescription,
m.name AS MessageName,
m.subj AS MessageDescription,
clk.type AS EventType,
COUNT(DISTINCT clk.eid) AS CountAdmins,
COUNT(DISTINCT CASE WHEN my.my_id = 3347 THEN clk.eid END) AS CountUpgradeAdmins,
COUNT(DISTINCT CASE WHEN my.my_id = 2765 THEN my.data END) AS CountAccounts,
COUNT(DISTINCT CASE WHEN my.my_id = 3347 THEN my.data END) AS CountUpgradedAccounts # <-- THIS LINE IS THE PROBLEM
FROM
bata.sseq seq
INNER JOIN bata.messages m
ON m.id = seq.mid
INNER JOIN bm_arc.clicks208 clk
ON clk.camp = seq.camp
INNER JOIN bemails.cid cid
ON cid.id = clk.eid
INNER JOIN bonfig.sub
ON sub.id = seq.sid
LEFT JOIN bemails.my208 my
ON cid.id = my.eid AND (my_id = 3347 OR my_id = 2765) # only return people who upgraded and accountIDs
WHERE
seq.cid = 208
AND
sub.desc REGEXP '^Home pg free trail (A|B)'
GROUP BY
ARName,
ARDescription,
MessageName,
MessageDescription,
EventType
I've found trying to word this question challenging so sorry if what I'm asking is not clear. If there's any more info I can add let me know.
Following discussion, what I'm asking for in other words:
"For each instance of 3347 get the corresponding instances of UserID
and with those UserIDs the count of distinct corresponding datapoints
in my.data WHERE my.my_id = 2765"
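Maybe try changing the problem line so that it counts account data reached through the UserID rather than the upgrade row's own data, by joining the lookup table a second time: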
SELECT
sub.name AS ARName,
sub.desc AS ARDescription,
m.name AS MessageName,
m.subj AS MessageDescription,
clk.type AS EventType,
COUNT(DISTINCT clk.eid) AS CountAdmins,
COUNT(DISTINCT CASE WHEN my.my_id = 3347 THEN clk.eid END) AS CountUpgradeAdmins,
COUNT(DISTINCT CASE WHEN my.my_id = 2765 THEN my.data END) AS CountAccounts,
COUNT(DISTINCT my2.data ) AS CountUpgradedAccounts
FROM
bata.sseq seq
INNER JOIN bata.messages m
ON m.id = seq.mid
INNER JOIN bm_arc.clicks208 clk
ON clk.camp = seq.camp
INNER JOIN bemails.cid cid
ON cid.id = clk.eid
INNER JOIN bonfig.sub
ON sub.id = seq.sid
LEFT JOIN bemails.my208 my
ON cid.id = my.eid AND (my.my_id = 3347 OR my.my_id = 2765) # only return people who upgraded and
LEFT JOIN bemails.my208 my2
ON my2.my_id = 2765 and my2.userID = my.userID and my.my_id=3347 #get the accounts that the user belongs to
WHERE
seq.cid = 208
AND
sub.desc REGEXP '^Home pg free trail (A|B)'
GROUP BY
ARName,
ARDescription,
MessageName,
MessageDescription,
EventType
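To check the self-join idea in isolation, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite table via Python and joins the lookup table to itself on UserID. The aliases acct and up are made up for the illustration, and the rest of the asker's joins are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Sample rows copied from the question
    CREATE TABLE my (UserID INTEGER, my_id INTEGER, data TEXT);
    INSERT INTO my VALUES
        (374, 2765, '8826'),
        (487, 3347, '2013-09-01'),
        (662, 2765, '8826'),
        (321, 2765, '9213'),
        (722, 3347, '2014-10-14'),
        (852, 2765, '8826'),
        (487, 2765, '9213');
""")

# acct = account rows (2765); up = the same user's upgrade row (3347), if any
count_accounts, count_upgraded = conn.execute("""
    SELECT COUNT(DISTINCT acct.data),
           COUNT(DISTINCT CASE WHEN up.UserID IS NOT NULL THEN acct.data END)
    FROM my acct
    LEFT JOIN my up
           ON up.UserID = acct.UserID AND up.my_id = 3347
    WHERE acct.my_id = 2765
""").fetchone()

print(count_accounts, count_upgraded)
```

On this sample only user 487 has both an upgrade row and an account row, so one of the two distinct accounts counts as upgraded.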
The following query returns lots of records and the field DT_Cancelled has all values populated with a timestamp. This is expected (DT_Created and Upgraded are all null right now as expected too):
SELECT
co.email,
co.id,
CASE WHEN my.my_id = 581 THEN my.data END AS DT_Created,
CASE WHEN my.my_id = 3347 THEN my.data END AS DT_Upgraded,
CASE WHEN my.my_id = 3014 THEN my.lastmod END AS DT_Cancelled
FROM
ex_emails.cid208 co
INNER JOIN ex_emails.my208 my ON my.eid = co.id
WHERE my.my_id = 3014
GROUP BY email
However, the following query produces a results set where DT_Cancelled has all NULL values.
SELECT
co.email,
co.id,
CASE WHEN my.my_id = 581 THEN my.data END AS DT_Created,
CASE WHEN my.my_id = 3347 THEN my.data END AS DT_Upgraded,
CASE WHEN my.my_id = 3014 THEN my.lastmod END AS DT_Cancelled
FROM
ex_emails.cid208 co
INNER JOIN ex_emails.my208 my ON my.eid = co.id
WHERE (my.my_id = 581 OR my.my_id = 3347 OR my.my_id = 3014) # Here is the altered line
GROUP BY email
I then tried this but got the same result: DT_Cancelled was all NULL values.
SELECT
co.email,
co.id,
CASE WHEN my.my_id = 581 THEN my.data END AS DT_Created,
CASE WHEN my.my_id = 3347 THEN my.data END AS DT_Upgraded,
CASE WHEN my.my_id = 3014 THEN my.lastmod END AS DT_Cancelled
FROM
ex_emails.cid208 co
INNER JOIN ex_emails.my208 my ON my.eid = co.id AND (my.my_id = 581 OR my.my_id = 3347 OR my.my_id = 3014)
GROUP BY email
I had expected to see all the records I saw from the above query. Why have I lost them all with this modification?
Your GROUP BY is the problem.
When you GROUP BY, every selected field that is not wrapped in an aggregate such as SUM, MAX or AVG needs to appear in the GROUP BY. MySQL allows this invalid query to run (which is infuriating) and simply guesses at what you mean. Fix your GROUP BY by grouping on every selected column, e.g. GROUP BY 1,2,3,4,5, and run each query again. If that doesn't solve it, we'll probably need some example data.
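If the goal is one row per email rather than one row per attribute, an alternative to grouping by every column is to wrap each CASE in an aggregate, since MAX ignores NULLs. This is only a sketch of that alternative under assumed data: the tables cid and my and their rows are invented, and it runs on SQLite through Python rather than MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for ex_emails.cid208 and ex_emails.my208
    CREATE TABLE cid (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE my (eid INTEGER, my_id INTEGER, data TEXT, lastmod TEXT);
    INSERT INTO cid VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO my VALUES
        (1, 581,  '2014-01-01', '2014-01-01'),
        (1, 3347, '2014-06-01', '2014-06-01'),
        (1, 3014, '1',          '2015-02-03'),
        (2, 581,  '2014-03-05', '2014-03-05');
""")

# Every non-grouped column is aggregated, so the result no longer depends
# on which of the group's rows the engine happens to pick.
rows = conn.execute("""
    SELECT co.email,
           MAX(CASE WHEN my.my_id = 581  THEN my.data    END) AS DT_Created,
           MAX(CASE WHEN my.my_id = 3347 THEN my.data    END) AS DT_Upgraded,
           MAX(CASE WHEN my.my_id = 3014 THEN my.lastmod END) AS DT_Cancelled
    FROM cid co
    INNER JOIN my ON my.eid = co.id
    WHERE my.my_id IN (581, 3347, 3014)
    GROUP BY co.email
    ORDER BY co.email
""").fetchall()

for row in rows:
    print(row)
```

Without the aggregates, each group shows the CASE results of a single arbitrary row, and for any row whose my_id is not 3014 the DT_Cancelled expression is NULL.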
I know I am having a simple issue, but I cannot for the life of me solve it. Here is what I am trying to do. I have 3 tables and some sample data:
customer_entity_varchar
entity_id attribute_id value
'2' '5' 'John'
'2' '7' 'Smith'
'2' '336' 'ADELANTO'
'3' '5' 'Jane'
'3' '7' 'Doe'
'3' '336' 'ADELANTO'
'4' '5' 'Peter'
'4' '7' 'Griffin'
'4' '336' 'Not ADELANTO'
customer_entity
entity_id email
'2' 'jsmith#whatever.com'
'3' 'janed#thisthat.com'
'4' 'peterg#notanemail.com'
What I am trying to come up with is the first name, last name and email of everyone who matches a certain district, which is stored under attribute_id = '336'. What I am trying is this:
SELECT CE.email as email,
max(case when CEV.attribute_id = '5' then CEV.value end) as FirstName,
max(case when CEV.attribute_id = '7' then CEV.value end) as LastName
FROM customer_entity_varchar CEV
LEFT JOIN customer_entity CE
ON ( CE.entity_id = CEV.entity_id)
WHERE CEV.value ='ADELANTO'
AND CEV.attribute_id='336'
My hopes for a result are:
email FirstName LastName
jsmith#whatever.com John Smith
janed#thisthat.com Jane Doe
However what I am getting back is a SINGLE row -- where email has a value, however both FirstName and LastName are blank. Is my logic flawed?
I would probably solve this like this. It's a solution that favours readability.
WITH firstNames AS
(SELECT entity_id, Value FROM customer_entity_varchar WHERE attribute_id = '5'),
lastNames AS
(SELECT entity_id, Value FROM customer_entity_varchar WHERE attribute_id = '7'),
districts AS
(SELECT entity_id, Value FROM customer_entity_varchar WHERE attribute_id = '336')
SELECT ce.email, fn.Value AS FirstName, ln.Value AS LastName, d.Value AS District
FROM firstNames fn
INNER JOIN lastNames ln ON ln.entity_id = fn.entity_id
INNER JOIN districts d ON d.entity_id = ln.entity_id
INNER JOIN customer_entity ce ON ce.entity_id = d.entity_id
WHERE d.Value = 'ADELANTO';
The WHERE condition in your query excludes the rows with attribute_id 5 and 7, so the CASE expressions never see the first and last name values.
Try this
SELECT CE.email as email,
max(case when CEV.attribute_id = '5' then CEV.value end) as FirstName,
max(case when CEV.attribute_id = '7' then CEV.value end) as LastName
FROM
(
SELECT entity_id,attribute_id,value
FROM customer_entity_varchar
WHERE entity_id IN (
SELECT entity_id
FROM customer_entity_varchar
WHERE value ='ADELANTO'
AND attribute_id='336'
)
AND attribute_id IN ('5','7')
)As CEV
INNER JOIN customer_entity CE
ON CE.entity_id = CEV.entity_id
GROUP BY CEV.entity_id
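The sample data from the question is small enough to check this directly. The sketch below loads it into an in-memory SQLite database through Python and runs both the original query and the two-step version; attribute_id is stored as an integer here for simplicity, which is an assumption about the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_entity_varchar (entity_id INTEGER, attribute_id INTEGER, value TEXT);
    CREATE TABLE customer_entity (entity_id INTEGER PRIMARY KEY, email TEXT);
    INSERT INTO customer_entity_varchar VALUES
        (2, 5, 'John'),  (2, 7, 'Smith'),   (2, 336, 'ADELANTO'),
        (3, 5, 'Jane'),  (3, 7, 'Doe'),     (3, 336, 'ADELANTO'),
        (4, 5, 'Peter'), (4, 7, 'Griffin'), (4, 336, 'Not ADELANTO');
    INSERT INTO customer_entity VALUES
        (2, 'jsmith#whatever.com'), (3, 'janed#thisthat.com'), (4, 'peterg#notanemail.com');
""")

pivot = """
    MAX(CASE WHEN CEV.attribute_id = 5 THEN CEV.value END) AS FirstName,
    MAX(CASE WHEN CEV.attribute_id = 7 THEN CEV.value END) AS LastName
"""

# Original shape: the WHERE keeps only attribute 336 rows, so the name rows are gone
original = conn.execute(f"""
    SELECT CE.email, {pivot}
    FROM customer_entity_varchar CEV
    LEFT JOIN customer_entity CE ON CE.entity_id = CEV.entity_id
    WHERE CEV.value = 'ADELANTO' AND CEV.attribute_id = 336
""").fetchall()

# Two steps: find the matching entities first, then pivot their name rows
fixed = conn.execute(f"""
    SELECT CE.email, {pivot}
    FROM customer_entity_varchar CEV
    INNER JOIN customer_entity CE ON CE.entity_id = CEV.entity_id
    WHERE CEV.attribute_id IN (5, 7)
      AND CEV.entity_id IN (SELECT entity_id
                            FROM customer_entity_varchar
                            WHERE attribute_id = 336 AND value = 'ADELANTO')
    GROUP BY CEV.entity_id, CE.email
    ORDER BY CE.email
""").fetchall()

print(original)
print(fixed)
```

The original shape returns a single row with empty names, as described in the question, while the two-step version returns one row each for John Smith and Jane Doe.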
You're dealing with an entity-attribute-value table. That means you're going to suffer the joys of an endless number of outer joins in every query. This is why EAV sucks and you're told not to do it unless you have to.
The single query fix:
SELECT DISTINCT
-- ^^^^You need this because you're cross joining the crap out of the EAV table.
CEV.entity_id,
CE.email,
CEV5.value "FirstName",
CEV7.value "LastName"
FROM customer_entity_varchar CEV
-- ^^^^ This is the base table that determines what entities exist. It makes sure you always have your entity_id even if attributes are missing.
LEFT JOIN customer_entity_varchar CEV5
ON CEV5.entity_id = CEV.entity_id AND CEV5.attribute_id = 5
-- ^^^^ This is the join that lets you access attribute 5, FirstName
LEFT JOIN customer_entity_varchar CEV7
ON CEV7.entity_id = CEV.entity_id AND CEV7.attribute_id = 7
-- ^^^^ This is the join that lets you access attribute 7, LastName
LEFT JOIN customer_entity_varchar CEV336
ON CEV336.entity_id = CEV.entity_id AND CEV336.attribute_id = 336
-- ^^^^ This is the join that lets you access attribute 336, City?
LEFT JOIN customer_entity CE
ON CE.entity_id = CEV.entity_id
-- ^^^^ Here's the other table, joined to the base table so we're sure it joins when partial data exists.
WHERE CEV336.value = 'ADELANTO'
Here's how to do it more easily.
First, make a view for your EAV table:
CREATE VIEW vw_customer_entity_varchar AS
SELECT DISTINCT
CEV.entity_id,
CEV5.value "FirstName",
CEV7.value "LastName",
CEV336.value "City?"
FROM customer_entity_varchar CEV
LEFT JOIN customer_entity_varchar CEV5
ON CEV5.entity_id = CEV.entity_id AND CEV5.attribute_id = 5
LEFT JOIN customer_entity_varchar CEV7
ON CEV7.entity_id = CEV.entity_id AND CEV7.attribute_id = 7
LEFT JOIN customer_entity_varchar CEV336
ON CEV336.entity_id = CEV.entity_id AND CEV336.attribute_id = 336
Add additional joins for every field in your EAV table.
Then you can treat this view as a normal table, except that it has terrible performance and you can't use it for INSERT, UPDATE, or DELETE.
Welcome to EAV hell.
First, do you really have those tick marks around all your values? In other words, are the quotations stored in the db?
As a couple of folk have pointed out, this database design is really a pain to query. Basically, you'll have to take two passes through your customer_entity_varchar table. One pass is to get the entity_id of whoever you're interested in (aliased as t1 in my query), and one is to get their attributes (CEV in the query). You filter for the 'ADELANTO' value against the t1 query, which gives you the entity_ids. Then you can join that back to the CEV alias to get the actual information you want. (Hopefully all that made sense.)
I think this is what you are looking for:
SELECT CE.email as email,
max(case when CEV.attribute_id = '5' then CEV.value end) as FirstName,
max(case when CEV.attribute_id = '7' then CEV.value end) as LastName
FROM customer_entity_varchar CEV
inner join (select distinct entity_id, value from customer_entity_varchar where attribute_id = 336) t1
on t1.entity_id = CEV.entity_id
LEFT JOIN customer_entity CE
ON ( CE.entity_id = CEV.entity_id)
WHERE t1.value = 'ADELANTO'
group by
CE.entity_id
EDIT: As MPelletier also pointed out, you need to include a group by. I can't stand that MySQL will allow you to submit that query without it.