grouping records conditionally - mysql

My Google-fu is coming up short on this one. I've got a table of transactions, like this:
id  email               source   amount  timestamp
1   daniel#example.com  vendor   10      2014-03-10 23:34:40
2   john#example.com    website  15      2014-03-11 13:30:00
3   mary#example.com    website  50      2014-03-11 17:30:00
4   daniel#example.com  website  65      2014-03-13 20:06:30
5   mary#example.com    vendor   10      2014-03-14 16:20:30
I want to be able to group these by email, but only for users who:
A) came in through the 'vendor' source initially, and
B) also made a transaction through the 'website' source.
So for the above sample data, I would want this:
email               total_amount  transactions
daniel#example.com  75            2
Mary would not be included because her first transaction was through 'website', and not 'vendor'. John would not be included because he did not have a transaction through the vendor at all.
EDIT:
Less ideal, but still useful, would be this result set:
email               total_amount  transactions
daniel#example.com  75            2
mary#example.com    60            2
Where Mary and Daniel are both included because they both came in through the 'vendor' source in at least one transaction.

SELECT A.Email, sum(B.Amount) as Total_Amount, count(B.time) as Transactions
FROM tableName A
INNER join tableName B
on A.Email=B.Email
AND A.source='vendor'
Group By A.Email
The requirements are a bit unclear: you initially say users must first come in through 'vendor', but the edit then relaxes that by including Mary.
http://sqlfiddle.com/#!2/bb4f9/1/0
If date/timestamps are important, add an AND clause for A.Time <= B.Time, aggregate A.Amount and A.Time as well, and add those in, like this:
SELECT A.Email, sum(B.Amount)+ sum(A.Amount) as Total_Amount, count(B.time)+count(A.Time) as Transactions
FROM tableName A
INNER join tableName B
on A.Email=B.Email
AND A.source='vendor'
and A.Time<=B.Time
Group By A.Email
But this assumes a vendor entry occurs only once for each email.
So this solution first finds a vendor entry (if there is more than one for an email address, it will not return accurate counts). It then finds any entries for the same email address with a source of 'website' that occur after that vendor entry, and aggregates the totals for that email, adding in the vendor entry's totals. While it works for the sample data provided, it may not work as desired if multiple vendor entries exist for the same email. Without knowing how the totals should be computed in that case, or why you need this information from this data, I can't think of a better option without making a lot of assumptions.
SELECT A.Email, sum(B.Amount)+sum(A.Amount) as Total_Amount,
count(B.time)+count(A.Time) as Transactions
FROM tableName A
INNER join tableName B
on A.Email=B.Email
AND A.source='vendor'
AND A.Time < B.Time and B.Source='website'
Group By A.Email
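To make the multiple-vendor caveat concrete, here is a minimal sketch that runs the last query against one made-up user with two vendor rows and one later website row. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption of this sketch), and the email and rows are hypothetical.

```python
import sqlite3

# Hypothetical data: one user with TWO vendor rows, then one website row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tableName (Email TEXT, Source TEXT, Amount INTEGER, Time TEXT)"
)
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?, ?, ?)",
    [
        ("pat#example.com", "vendor", 10, "2014-03-10 10:00:00"),
        ("pat#example.com", "vendor", 10, "2014-03-11 10:00:00"),
        ("pat#example.com", "website", 65, "2014-03-12 10:00:00"),
    ],
)

# The self-join from the answer: every vendor row (A) pairs with every
# later website row (B), so the website row is counted once per vendor row.
joined = conn.execute(
    """
    SELECT A.Email,
           SUM(B.Amount) + SUM(A.Amount) AS Total_Amount,
           COUNT(B.Time) + COUNT(A.Time) AS Transactions
    FROM tableName A
    INNER JOIN tableName B
        ON A.Email = B.Email
       AND A.Source = 'vendor'
       AND A.Time < B.Time AND B.Source = 'website'
    GROUP BY A.Email
    """
).fetchall()

# Plain aggregation over the same three rows, for comparison.
actual = conn.execute(
    "SELECT Email, SUM(Amount), COUNT(*) FROM tableName GROUP BY Email"
).fetchall()

print(joined)  # [('pat#example.com', 150, 4)]
print(actual)  # [('pat#example.com', 85, 3)]
```

The join reports 150 over 4 transactions where the table really holds 85 over 3, which is the inaccuracy the answer warns about.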

This query should give you the desired result by using a subquery to find the persons that have an initial 'vendor' record followed by a 'website' record, before collecting the summary information from the records for these persons.
If you remove the lines marked with -- *, persons whose 'vendor' record is not their first one are also included.
SELECT email, SUM(amount) AS total_amount, COUNT(*) AS transactions
FROM transactions
WHERE email IN
(SELECT t1.email FROM transactions t1
LEFT JOIN transactions t0 -- *
ON t0.email = t1.email AND t0.timestamp < t1.timestamp -- *
LEFT JOIN transactions t2
ON t2.email = t1.email
WHERE t1.source = 'vendor' AND t2.source = 'website'
AND t0.email IS NULL -- *
)
GROUP BY email;
See http://www.sqlfiddle.com/#!2/864898/8/0
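If the fiddle is unavailable, the query can be checked against the question's five sample rows locally. The sketch below is only that, a check: it assumes Python's built-in sqlite3 module in place of MySQL, and the variable names (strict_sql, relaxed_sql) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions "
    "(id INTEGER, email TEXT, source TEXT, amount INTEGER, timestamp TEXT)"
)
# The sample rows from the question.
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?)",
    [
        (1, "daniel#example.com", "vendor", 10, "2014-03-10 23:34:40"),
        (2, "john#example.com", "website", 15, "2014-03-11 13:30:00"),
        (3, "mary#example.com", "website", 50, "2014-03-11 17:30:00"),
        (4, "daniel#example.com", "website", 65, "2014-03-13 20:06:30"),
        (5, "mary#example.com", "vendor", 10, "2014-03-14 16:20:30"),
    ],
)

# Strict version: the vendor row must be the user's first row (t0 finds
# any earlier row; "t0.email IS NULL" keeps only users without one).
strict_sql = """
SELECT email, SUM(amount) AS total_amount, COUNT(*) AS transactions
FROM transactions
WHERE email IN
  (SELECT t1.email FROM transactions t1
   LEFT JOIN transactions t0
     ON t0.email = t1.email AND t0.timestamp < t1.timestamp
   LEFT JOIN transactions t2
     ON t2.email = t1.email
   WHERE t1.source = 'vendor' AND t2.source = 'website'
     AND t0.email IS NULL)
GROUP BY email
ORDER BY email
"""

# Relaxed version: the same query with the t0 lines removed.
relaxed_sql = """
SELECT email, SUM(amount) AS total_amount, COUNT(*) AS transactions
FROM transactions
WHERE email IN
  (SELECT t1.email FROM transactions t1
   LEFT JOIN transactions t2
     ON t2.email = t1.email
   WHERE t1.source = 'vendor' AND t2.source = 'website')
GROUP BY email
ORDER BY email
"""

strict_rows = conn.execute(strict_sql).fetchall()
relaxed_rows = conn.execute(relaxed_sql).fetchall()
print(strict_rows)   # [('daniel#example.com', 75, 2)]
print(relaxed_rows)  # [('daniel#example.com', 75, 2), ('mary#example.com', 60, 2)]
```

The strict result matches the first table in the question and the relaxed result matches the one in its EDIT. The ORDER BY is added here only to make the output order predictable.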

Your query should look like this :
select email, sum(amount) ,count(*)
from tbl
where email='daniel#example.com'
group by email;
Or, to count all emails:
select email, sum(amount) ,count(*)
from tbl
group by email;
All by vendor
select email, sum(amount) ,count(*)
from tbl
where source ='vendor'
group by email;
Also demo here:
http://sqlfiddle.com/#!2/de36ed/2

Try this :-
select x1.email_id,(x1.tot + x2.tot)as total_amount,(x1.cnt + x2.cnt)as transactions from
(select t1.email_id,count(t1.email_id)as cnt,sum(t1.totalamt)as tot from testdata t1 where t1.sourcee='web' group by t1.email_id)x1
inner join (select t2.email_id,count(t2.email_id)as cnt,sum(t2.totalamt)as tot from testdata t2 where t2.sourcee='vendor' group by t2.email_id)x2
on x1.email_id=x2.email_id group by x1.email_id;
It works fine. If required, change the field names to match your table structure.
Hope it helps.

Related

Join 2 tables to find out how many open incidents a customer has open

I'm doing a piece of work that has highlighted a number of accounts in our internal system that have been duplicated. To identify the duplicates, I've created the following script:
select SAMAccountName, COUNT(*)
from dbo.customer
group by SAMAccountName
having COUNT(*) > 1
order by SAMAccountName asc
The NULL accounts need to be ignored as they are related to PowerShell scripts being used currently.
On the back of this, I need to find out how many open incidents these duplicates have on our system. This is where I have to dip into the Incident table. I'd like to bring back the following columns from Incident...
select customerdisplayname, customeremail, status
from dbo.incident
The Status of the incident CANNOT be Resolved or Closed.
The CustomerDisplayName field in the Incident table is the same as FullName in the Customer table. Not sure if this will be needed in the script.
Any help you have on this would be much appreciated.
From your description, and assuming that duplicate SAMAccountName values could have differing fullname values, then the following seems appropriate:
select t.SAMAccountName, i.customerdisplayname, i.customeremail, i.status
from
(
select SAMAccountName
from dbo.customer
group by SAMAccountName
having count(*) > 1
) t
inner join dbo.customer c on t.SAMAccountName = c.SAMAccountName
inner join dbo.incident i on c.fullname = i.customerdisplayname
where i.status <> 'Resolved' and i.status <> 'Closed'
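Since the question asks how many open incidents the duplicates have, the same join can also feed a COUNT. The sketch below is a rough illustration under several assumptions: it runs on Python's built-in sqlite3 rather than SQL Server, and every row, name, and status value ('Open', 'In Progress') is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (SAMAccountName TEXT, fullname TEXT)")
conn.execute(
    "CREATE TABLE incident (customerdisplayname TEXT, customeremail TEXT, status TEXT)"
)
# Hypothetical rows: 'jdoe' is duplicated, and so is the NULL account name.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [
        ("jdoe", "John Doe"),
        ("jdoe", "Jon Doe"),
        ("asmith", "Anna Smith"),
        (None, "Script Account"),
        (None, "Other Script"),
    ],
)
conn.executemany(
    "INSERT INTO incident VALUES (?, ?, ?)",
    [
        ("John Doe", "jdoe@example.com", "Open"),
        ("Jon Doe", "jdoe@example.com", "Resolved"),
        ("Jon Doe", "jdoe@example.com", "In Progress"),
        ("Anna Smith", "asmith@example.com", "Open"),
        ("Script Account", "script@example.com", "Open"),
    ],
)

# Same joins as the answer, aggregated to one count per duplicated account.
open_counts = conn.execute(
    """
    SELECT t.SAMAccountName, COUNT(*) AS open_incidents
    FROM (
        SELECT SAMAccountName
        FROM customer
        GROUP BY SAMAccountName
        HAVING COUNT(*) > 1
    ) t
    INNER JOIN customer c ON t.SAMAccountName = c.SAMAccountName
    INNER JOIN incident i ON c.fullname = i.customerdisplayname
    WHERE i.status <> 'Resolved' AND i.status <> 'Closed'
    GROUP BY t.SAMAccountName
    ORDER BY t.SAMAccountName
    """
).fetchall()

print(open_counts)  # [('jdoe', 2)]
```

The NULL accounts drop out without an explicit filter, because NULL never compares equal in the join on SAMAccountName. One caveat: if two duplicated customers share the same fullname, each of their incidents is matched twice, so the count would need COUNT(DISTINCT ...) on an incident key.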

Ms Access Query To Get Sum

I have a Ms Access table with four columns; PledgeID, Ref, Paid, and Balance. One PledgeID may have several records. I need a query that will help me get the sum of the Balance for ONLY the last record of each PledgeID. The last PledgeID will be the one with the highest ref. I have attached a photo of the table for easy reference.
You could use this SQL:
SELECT T2.PledgeID
, T2.REF
, T2.Balance
FROM (
SELECT PledgeID
, MAX(Ref) AS REF_RETURN
FROM MyTable
GROUP BY PledgeID
) T1 INNER JOIN MyTable T2 ON T1.REF_RETURN = T2.REF
It would probably be easier to limit the Ref field to just numeric - 1, 2 rather than PID/2018/00007-1 & PID/2018/00007-2.
You could use a subquery to find the latest balance per PledgeID and sum these balances:
SELECT Sum(Balance)
FROM Balances
WHERE Ref = (SELECT Max(Ref) FROM Balances AS b WHERE PledgeID = Balances.PledgeID);
Looking at your sample table, the result seems to be 0.00.
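Since the original image is not reproduced here, the following sketch runs the correlated-subquery approach on invented rows. It assumes Python's built-in sqlite3 instead of Access, a numeric Ref (as suggested above), and explicit aliases o and b that are not in the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Balances (PledgeID INTEGER, Ref INTEGER, Paid INTEGER, Balance INTEGER)"
)
# Hypothetical rows: the highest Ref per PledgeID is that pledge's last record.
conn.executemany(
    "INSERT INTO Balances VALUES (?, ?, ?, ?)",
    [
        (1, 1, 50, 150),
        (1, 2, 50, 100),  # last record for pledge 1
        (2, 1, 20, 80),
        (2, 2, 80, 0),    # last record for pledge 2
        (3, 1, 10, 40),   # only (and last) record for pledge 3
    ],
)

# For each outer row, keep it only if its Ref is the maximum for its pledge,
# then sum the balances of the rows that remain.
total = conn.execute(
    """
    SELECT SUM(o.Balance)
    FROM Balances AS o
    WHERE o.Ref = (SELECT MAX(b.Ref) FROM Balances AS b
                   WHERE b.PledgeID = o.PledgeID)
    """
).fetchone()[0]

print(total)  # 140
```

With text references such as PID/2018/00007-10, MAX would compare strings, so a suffix of -10 would sort before -2; that is one reason a numeric Ref is easier to work with.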

MySQL group and sum with joined tables

I've got a pretty tricky problem with MySQL.
I have two tables with a one-to-many relation (only the relevant columns are shown below):
Table A (campaigns):
id | channel_type | date
Table B (budgets):
id | campaign_id | budget
I need a single query to fetch the following result:
Campaign count by channel_type
Sum of all budgets that are related to found campaigns.
I need to filter results by columns in campaigns table (e.g. WHERE campaigns.date > '2014-05-01')
I have tried following approach:
SELECT channel_type, COUNT(*) cnt,
(SELECT SUM(budget) FROM budgets WHERE budgets.campaign_id = campaigns.id)
as budget
FROM campaigns
WHERE campaigns.date >= 'some-value'
AND [more conditions]
GROUP BY campaigns.channel_type
But this of course fails miserably: because of the GROUP BY, I only get the first campaigns.id for each channel_type.
Any tips (or a solution) would be really appreciated!
TIA
1. Get the total budget per campaign from the budgets table using GROUP BY campaign_id. This will be a subquery; give it a name, for example A.
2. Get the campaign counts from campaigns using GROUP BY channel_type and WHERE date >= 'some-value'.
3. Combine steps 2 and 1 in the final query (the subquery acts as a table) and you will get the results.
If you post the schema, I can check.
I think this should work :
SELECT channel_type, COUNT(*) cnt,
(SELECT SUM(t2.budget) FROM budgets t2 WHERE t2.campaign_id IN (
SELECT t3.id FROM campaigns t3 WHERE t3.channel_type = t1.channel_type))
AS budget
FROM campaigns t1
WHERE t1.date >= 'some-value'
AND [more conditions]
GROUP BY t1.channel_type
see this fiddle
I've found a working solution.
Here's the working query:
SELECT SUM(budget) as budget, COUNT(*) as count FROM
(SELECT * FROM campaigns WHERE [conditions]) AS found_campaigns
LEFT JOIN budgets ON budgets.campaign_id = found_campaigns.id
GROUP BY channel_type
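One thing to watch with a join like this: because the relation is one-to-many, COUNT(*) counts joined rows (one per budget), not campaigns. A possible adjustment is COUNT(DISTINCT id) on the campaign side. The sketch below shows both counts side by side; it assumes Python's built-in sqlite3 as a stand-in for MySQL, and all rows and the aliases c and b are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE campaigns (id INTEGER, channel_type TEXT, date TEXT)")
conn.execute("CREATE TABLE budgets (id INTEGER, campaign_id INTEGER, budget INTEGER)")
conn.executemany(
    "INSERT INTO campaigns VALUES (?, ?, ?)",
    [
        (1, "email", "2014-05-10"),
        (2, "email", "2014-05-20"),
        (3, "social", "2014-05-15"),
        (4, "social", "2014-04-01"),  # filtered out by the date condition
    ],
)
conn.executemany(
    "INSERT INTO budgets VALUES (?, ?, ?)",
    [
        (1, 1, 100),
        (2, 1, 50),   # campaign 1 has two budgets
        (3, 2, 25),
        (4, 4, 999),  # belongs to the filtered-out campaign
    ],
)

rows = conn.execute(
    """
    SELECT c.channel_type,
           COUNT(DISTINCT c.id)       AS campaign_count,  -- campaigns
           COUNT(*)                   AS joined_rows,     -- campaign/budget pairs
           COALESCE(SUM(b.budget), 0) AS budget
    FROM campaigns c
    LEFT JOIN budgets b ON b.campaign_id = c.id
    WHERE c.date > '2014-05-01'
    GROUP BY c.channel_type
    ORDER BY c.channel_type
    """
).fetchall()

print(rows)  # [('email', 2, 3, 175), ('social', 1, 1, 0)]
```

For 'email' there are 2 campaigns but 3 joined rows, so COUNT(*) would overstate the campaign count as soon as any campaign has more than one budget. COALESCE turns the NULL sum for a channel without budgets into 0.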

generate index columns for "ORDER BY x, y"

I use this query to summarize the contents of the table export_blocks, aggregated by user and date, and save it as a new table:
CREATE TABLE export_days
SELECT user_id, DATE(submitted) AS date_str
FROM export_blocks
GROUP BY user_id, DATE(submitted)
ORDER BY user_id, submitted
How can I get, for each user_id, an incremental index for the dates of that user's records? The indices should start at 1 for each user, following the ORDER BY. That is, I'd like to generate the date_index column of the output below using SQL:
user_id date_str date_index
brian 2014-06-10 1
brian 2014-06-12 2
brian 2014-06-15 3
louis 2014-06-08 1
louis 2014-06-16 2
lucy 2013-11-15 1
(etc...)
I've been trying https://stackoverflow.com/a/5493480/1297830 but I cannot get it to work. It stops the counters prematurely, giving too low numbers for id_no and date_no.
Basing it on your sample query, you can do simple (dependent) subqueries to get the result;
SELECT id, date_str,
(SELECT COUNT(DISTINCT id)+1 FROM mytable WHERE id < a.id) id_no,
(SELECT COUNT(id)+1 FROM mytable WHERE id = a.id AND date_str < a.date_str) date_no
FROM mytable a
ORDER BY id;
...or you could do a couple of self joins;
SELECT a.id, a.date_str,
COUNT(DISTINCT b.id)+1 id_no,
COUNT(DISTINCT c.date_str)+1 date_no
FROM mytable a
LEFT JOIN mytable b ON a.id > b.id
LEFT JOIN mytable c ON a.id = c.id AND a.date_str > c.date_str
GROUP BY a.id, a.date_str
ORDER BY a.id, a.date_str;
An SQLfiddle showing both in action.
Sadly, neither is really a very performant solution, but since MySQL (before version 8.0) lacks analytical (i.e. ranking) functions, the options are limited. Using user variables to do the ranking is also an option; however, they're notoriously tricky to use and aren't portable, so I'd go there only if performance demands it.
Based on Joachim's excellent answer, I worked out the solution. It also works when there are multiple rows per day for each user.
CREATE TABLE export_days
SELECT
user_id, DATE(submitted) AS date_str,
(SELECT COUNT(DISTINCT DATE(submitted))+1 FROM export_blocks WHERE user_id = a.user_id AND submitted < a.submitted) date_no
FROM export_blocks a
GROUP BY user_id, DATE(submitted)
ORDER BY user_id, submitted
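A possible variation, in case it helps: since submitted is not part of the GROUP BY, which row's timestamp the subquery sees is not guaranteed, and a later row of the same day could make the day count itself. Comparing on the date instead of the timestamp sidesteps that. This is a sketch only, assuming Python's built-in sqlite3 as a stand-in for MySQL, made-up rows, and a derived table d that is not in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE export_blocks (user_id TEXT, submitted TEXT)")
# Hypothetical rows; brian has two blocks on 2014-06-10.
conn.executemany(
    "INSERT INTO export_blocks VALUES (?, ?)",
    [
        ("brian", "2014-06-10 08:00:00"),
        ("brian", "2014-06-10 17:30:00"),
        ("brian", "2014-06-12 09:00:00"),
        ("brian", "2014-06-15 11:15:00"),
        ("louis", "2014-06-08 10:00:00"),
        ("louis", "2014-06-16 10:00:00"),
    ],
)

# d holds one row per (user, day); the subquery counts that user's
# distinct earlier days and adds 1 to get a per-user running index.
rows = conn.execute(
    """
    SELECT d.user_id, d.date_str,
           (SELECT COUNT(DISTINCT DATE(b.submitted)) + 1
            FROM export_blocks b
            WHERE b.user_id = d.user_id
              AND DATE(b.submitted) < d.date_str) AS date_index
    FROM (SELECT DISTINCT user_id, DATE(submitted) AS date_str
          FROM export_blocks) d
    ORDER BY d.user_id, d.date_str
    """
).fetchall()

for row in rows:
    print(row)
```

With these rows, brian's three days get indexes 1 to 3 and louis's two days get 1 and 2, regardless of how many blocks fall on a single day.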

mysql Get missing values

I know this is a common question, but I need something more. I'm having trouble getting the values that have not been inserted into a table.
Ok here are my tables:
name: importantDates; cols: id, date
name: inserts; cols: id, date, employe_id
My question: how do I get the missing values for each employe? Say I need the missing inserts for the employe with id=213.
So far I have written this, but it doesn't work yet: if one worker has an insert on a given day, that day is eliminated for all workers.
code:
SELECT i.date
FROM importantDates i
LEFT OUTER JOIN inserts s
ON i.date = DATE(s.date)
WHERE i.date BETWEEN '2013-1-1'
AND '2013-2-23'
AND s.date IS NULL;
Now how can I add checking for employe_id?
Thanks guys, if you need anything more I'm always available.
EDIT:
Here is sample:
Employe:
1. sam
2. mike
3. joe
importantDate:
1. 2013-01-01
2. 2013-01-02
3. ...
40. 2013-02-23
inserts:
1. 2013-02-01, 1
2. 2013-02-01, 2
3. 2013-02-01, 3
4. 2013-02-02, 3
5. 2013-02-03, 1
6. 2013-02-03, 2
7. 2013-01-12, 1
So, when I run query, I should get all "missing" inserts. For each employe I should get date and ID of employee when insert is missing. A lot of data but it is important to know which are not inserted and which are.
Assuming you have an employee table, try:
select sq.* from
(select e.employe_id, i.date
FROM importantDates i
CROSS JOIN employee e
WHERE i.date BETWEEN '2013-1-1' AND '2013-2-23') sq
LEFT OUTER JOIN inserts s
ON sq.date = DATE(s.date) and sq.employe_id = s.employe_id
WHERE s.date IS NULL;
If you don't have a separate employee table, you can simulate one by changing employee in the above query to be:
(select distinct employe_id from inserts) as e
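Here is a small runnable sketch of the cross-join idea: build every (employee, date) pair first, then keep the pairs with no matching insert. It assumes Python's built-in sqlite3 in place of MySQL, an employee table with an employe_id column, and a handful of invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employe_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE importantDates (id INTEGER, date TEXT)")
conn.execute("CREATE TABLE inserts (id INTEGER, date TEXT, employe_id INTEGER)")

# Hypothetical data: two employees, two important dates, three inserts.
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "sam"), (2, "mike")])
conn.executemany(
    "INSERT INTO importantDates VALUES (?, ?)",
    [(1, "2013-02-01"), (2, "2013-02-02")],
)
conn.executemany(
    "INSERT INTO inserts VALUES (?, ?, ?)",
    [
        (1, "2013-02-01 09:00:00", 1),
        (2, "2013-02-01 09:05:00", 2),
        (3, "2013-02-02 09:00:00", 1),  # mike has nothing on 2013-02-02
    ],
)

# Dates are zero-padded here because SQLite compares them as plain strings.
missing = conn.execute(
    """
    SELECT sq.employe_id, sq.date
    FROM (SELECT e.employe_id, i.date
          FROM importantDates i
          CROSS JOIN employee e
          WHERE i.date BETWEEN '2013-01-01' AND '2013-02-23') sq
    LEFT OUTER JOIN inserts s
        ON sq.date = DATE(s.date) AND sq.employe_id = s.employe_id
    WHERE s.date IS NULL
    ORDER BY sq.employe_id, sq.date
    """
).fetchall()

print(missing)  # [(2, '2013-02-02')]
```

Joining on both the date and employe_id is what stops one worker's insert from hiding the day for everyone else.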
LEFT OUTER JOIN can be written as a plain LEFT JOIN (the two are equivalent), and the dates are better given in full 'YYYY-MM-DD' form:
SELECT
i.date
FROM importantDates i
LEFT JOIN inserts s
ON i.date = DATE(s.date)
WHERE i.date BETWEEN '2013-01-01'
AND '2013-02-23'
AND s.date IS NULL;
I'm not quite sure if I understand, what you're trying to achieve, but if you want to get every row from table_a which is not in table_b you can do this:
SELECT * FROM table_a
WHERE table_a.col NOT IN
(
SELECT col FROM table_b
)
So (if I understand you correctly) in your case:
SELECT i.date FROM importantDates i
WHERE i.date NOT IN
(
SELECT DATE(date) FROM inserts WHERE employe_id = 213
)
AND i.date BETWEEN '2013-01-01' AND '2013-02-23';
For more on the IN clause, see the MySQL documentation.
UPDATE:
To get the corresponding employee you can alter the statement to this:
SELECT i.date, e.* FROM importantDates i
JOIN employees e
WHERE i.date NOT IN
(
SELECT DATE(s.date) FROM inserts s WHERE s.employe_id = e.employe_id
)
AND i.date BETWEEN '2013-01-01' AND '2013-02-23';
However, this is not recommended, because the subquery is correlated with the main query and has to be evaluated for every outer row.