Crosstab Query on multiple data points - ms-access

I have a table that tracks employee quality assessment data. It includes the employee name, 5 yes/no fields tracking important items and the date the user did each task as column headings. Each employee gets 10 records a month so it includes a lot of data about how well our employees are doing at those 5 tasks.
I would like a report that shows me the monthly averages of these 5 yes/no fields: Appeal, NRP, Churn, Protocol, and Resub. I want those to be the Row Headers. I want the column headers to be sequential Months and the Averages to be the values. I can do this with a crosstab query for a single item, such as avg:Appeal as the value and the user as the row header. How can I construct my query to use all 5 yes/no fields? The hoped-for result would look like:
Table image showing how I want it to look
Comments on the Correct Answer:
June7 came up with a great answer! I changed the True to False in the DataUNION query because I wanted the Accuracy percentage, and True indicates an error on the employee evaluation. I also added in a few fields I didn't mention before. Thank you very much for helping a scrub out, June7! Reading through what you wrote inspired me to start taking an SQL course on Lynda. I know it's basic, but you have to start somewhere, and I'm getting to the point where Access's built-in functions aren't doing it for me. Hopefully with the next question I'll be able to address the concerns of the commenters below who were upset that I didn't post code I had tried first.
June7's revised Code

Consider:
Query1: DataUNION
SELECT ID AS SourceID, Emp, Year([TaskDate]) AS Yr, Format([TaskDate], "mmm") AS Mo, "Appeal" AS Trend
FROM Data
WHERE Appeal=True
UNION SELECT ID, Emp, Year([TaskDate]), Format([TaskDate], "mmm"), "NRP"
FROM Data WHERE NRP = True
UNION SELECT ID, Emp, Year([TaskDate]), Format([TaskDate], "mmm"), "Churn"
FROM Data WHERE Churn = True
UNION SELECT ID, Emp, Year([TaskDate]), Format([TaskDate], "mmm"), "Protocol"
FROM Data WHERE Protocol = True
UNION SELECT ID, Emp, Year([TaskDate]), Format([TaskDate], "mmm"), "Resub"
FROM Data WHERE Resub = True;
Query2: DataCOUNT
SELECT DataUNION.Yr, DataUNION.Mo, DataUNION.Trend,
Count(DataUNION.Emp) AS CountOfEmp, Q.CntYrMo, Count([Emp])/[CntYrMo]*100 AS Pct
FROM (SELECT Year([TaskDate]) AS Yr, Format([TaskDate],"mmm") AS Mo, Count(Data.ID) AS CntYrMo
FROM Data
GROUP BY Year([TaskDate]), Format([TaskDate],"mmm")) AS Q
INNER JOIN DataUNION ON (Q.Yr = DataUNION.Yr) AND (Q.Mo = DataUNION.Mo)
GROUP BY DataUNION.Yr, DataUNION.Mo, DataUNION.Trend, Q.CntYrMo;
Query3:
TRANSFORM First(DataCount.Pct) AS FirstOfPct
SELECT DataCount.Yr, DataCount.Trend
FROM DataCount
GROUP BY DataCount.Yr, DataCount.Trend
PIVOT DataCount.Mo In ("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec");
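For readers who want to try the same three-step idea (unpivot, count, crosstab) outside Access, here is a minimal sketch in Python using the standard-library sqlite3 module. It is not the code from the answer: SQLite has no TRANSFORM ... PIVOT, so the final crosstab step is done in Python, and the function name monthly_pct, the 0/1 encoding of the yes/no fields, and the ISO date strings are all assumptions made for illustration.

```python
import sqlite3

def monthly_pct(rows):
    """Return {(year, trend): {month: pct}} for the five yes/no fields.

    rows: (Emp, 'YYYY-MM-DD', Appeal, NRP, Churn, Protocol, Resub), flags as 0/1.
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE Data (ID INTEGER PRIMARY KEY, Emp TEXT, TaskDate TEXT, "
        "Appeal INTEGER, NRP INTEGER, Churn INTEGER, Protocol INTEGER, Resub INTEGER)"
    )
    con.executemany(
        "INSERT INTO Data (Emp, TaskDate, Appeal, NRP, Churn, Protocol, Resub) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        rows,
    )
    fields = ["Appeal", "NRP", "Churn", "Protocol", "Resub"]
    # Step 1 (like DataUNION): one row per record and field where the flag is set.
    union_sql = " UNION ALL ".join(
        "SELECT ID, strftime('%Y', TaskDate) AS Yr, strftime('%m', TaskDate) AS Mo, "
        f"'{name}' AS Trend FROM Data WHERE {name} = 1"
        for name in fields
    )
    # Step 2 (like DataCOUNT): flagged rows as a percentage of all rows that month.
    sql = f"""
        SELECT u.Yr, u.Mo, u.Trend, COUNT(*) * 100.0 / q.CntYrMo AS Pct
        FROM ({union_sql}) AS u
        JOIN (SELECT strftime('%Y', TaskDate) AS Yr,
                     strftime('%m', TaskDate) AS Mo,
                     COUNT(*) AS CntYrMo
              FROM Data
              GROUP BY strftime('%Y', TaskDate), strftime('%m', TaskDate)) AS q
          ON q.Yr = u.Yr AND q.Mo = u.Mo
        GROUP BY u.Yr, u.Mo, u.Trend, q.CntYrMo
    """
    # Step 3 (the crosstab): turn months into columns, done here in Python.
    table = {}
    for yr, mo, trend, pct in con.execute(sql):
        table.setdefault((yr, trend), {})[mo] = pct
    con.close()
    return table
```

As in the Access version, a field with no flagged records in a month simply has no entry for that month.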

Related

Gathering data from three separate tables, sql

I have three separate tables that represent student attendance for three weeks, respectively. I want to be able to generate four columns that break down the attendance by week for each of the students. If a student was present multiple times in a week, the number of times present should be added up. Also, if a student was present in one week and not the next, they would get 1 for the week present (assuming they were only present once) and 0 for the week absent. I have tried multiple variations of count() and joins, but to no avail. Any help would be greatly appreciated. The following is a truncated fiddle:
http://www.sqlfiddle.com/#!9/b847a
Here is a sample of what I am trying to achieve:
Name | CurrWeek | LastWeek | TwoWkAgo
Paula | 0 | 2 | 3
Rather than three tables you should have only one with a column for the week. So naturally one solution for your request is to build it on-the-fly with UNION ALL:
select
name,
sum(week = 'currentWeek') as currentWeek,
sum(week = 'lastWeek') as lastWeek,
sum(week = 'thirdWeek') as thirdWeek
from
(
select 'currentWeek' as week, name from currentWeek
union all
select 'lastWeek' as week, name from lastWeek
union all
select 'thirdWeek' as week, name from thirdWeek
) all_weeks
group by name
order by name;
(If you want to join the three tables instead, you'd need full outer joins, which MySQL does not support, if I remember correctly. Anyway, my advice is to change the data model.)
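The UNION ALL approach above can be tried out with a minimal sketch like the following, written in Python against an in-memory SQLite database. The single-column table layout (one row per attendance) and the function name weekly_attendance are assumptions, since the fiddle's schema is not reproduced here; SQLite, like MySQL, evaluates a comparison to 0 or 1, so SUM(week = '...') counts matching rows.

```python
import sqlite3

def weekly_attendance(current, last, third):
    """Count appearances per student per week; each argument is a list of names."""
    con = sqlite3.connect(":memory:")
    for table, names in (("currentWeek", current),
                         ("lastWeek", last),
                         ("thirdWeek", third)):
        # Hypothetical schema: one row per time a student was present.
        con.execute(f"CREATE TABLE {table} (name TEXT)")
        con.executemany(f"INSERT INTO {table} (name) VALUES (?)",
                        [(n,) for n in names])
    sql = """
        SELECT name,
               SUM(week = 'currentWeek') AS currentWeek,
               SUM(week = 'lastWeek')    AS lastWeek,
               SUM(week = 'thirdWeek')   AS thirdWeek
        FROM (
            SELECT 'currentWeek' AS week, name FROM currentWeek
            UNION ALL
            SELECT 'lastWeek', name FROM lastWeek
            UNION ALL
            SELECT 'thirdWeek', name FROM thirdWeek
        ) AS all_weeks
        GROUP BY name
        ORDER BY name
    """
    rows = con.execute(sql).fetchall()
    con.close()
    return rows
```

A student who is missing from one week's table still appears in the result with a 0 for that week, which is the behaviour the question asks for.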
You can try this query:
select currweek.name, currweek.att, lastweek.att, twoWkAgo.att from
(select name, count(attendance) as att from currentWeekTable group by name) currweek,
(select name, count(attendance) as att from lastWeekTable group by name) lastweek,
(select name, count(attendance) as att from twoWeekTable group by name) twoWkAgo
where twoWkAgo.name=currweek.name and twoWkAgo.name=lastweek.name;
This assumes your three attendance tables share name as a common field. Note that joining this way returns only students who appear in all three tables.

Concatenate references of duplicate values in MySQL

I have a table (chapter) that contains 5 columns for officers in an organization: ID (key), president, vice_president, secretary, treasurer. For each office there is the value of a reference number to an individual.
For some IDs, the same value is listed for more than one of the 4 offices. You can see a basic example of my data structure below:
ID president vice_president secretary treasurer
105 1051456 1051456 1051466 1051460
106 1060923 1060937 1060944 1060944
108 1081030 1081027 1081032 1081017
110 1100498 1100491 1100485 1100485
I have also posted the same at http://sqlfiddle.com/#!9/57df1
My goal is to identify when a value is in more than one field and to SELECT that value as well as a concatenated list of all of the column titles in which it is found. For example from the supplied sample dataset, I would ideally like to return the following:
member offices
1051456 president, vice_president
1060944 secretary, treasurer
1100485 secretary, treasurer
I have found a few other examples that are similar, but nothing seems to work towards what I am looking to do. I'm a novice but can piece things together from examples fairly well. I was also thinking that there might be an easier way by joining with the information_schema database, as that is how I have pulled column titles in the past. It doesn't seem that this should be as difficult as it is, and hopefully I am missing an easy and obvious solution. My full dataset is rather large and I would prefer to avoid any intensive sub-queries for the sake of performance. My SQL format is MySQL 5.5.
Any help or guidance would be greatly appreciated!
One method uses union all to unpivot the data and then re-aggregates:
select member, group_concat(office) as offices
from ((select id, president as member, 'president' as office from chapter) union all
(select id, vice_president, 'vice_president' as office from chapter) union all
(select id, secretary, 'secretary' as office from chapter) union all
(select id, treasurer, 'treasurer' as office from chapter)
) t
group by member
having count(distinct office) > 1;
If you want to control the order of the values, then add a priority:
select member, group_concat(office order by priority) as offices
from ((select id, president as member, 'president' as office, 1 as priority from chapter) union all
(select id, vice_president, 'vice_president' as office, 2 from chapter) union all
(select id, secretary, 'secretary' as office, 3 from chapter) union all
(select id, treasurer, 'treasurer' as office, 4 from chapter)
) t
group by member
having count(distinct office) > 1;
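To experiment with the unpivot-and-re-aggregate technique on the sample rows from the question, a minimal sketch in Python with SQLite might look like this. SQLite's group_concat stands in for MySQL's here; because the ordering inside the concatenated string is not guaranteed in this form, the sketch splits and sorts the office names in Python. The function name shared_offices is hypothetical.

```python
import sqlite3

def shared_offices(chapters):
    """Return {member: sorted list of offices} for members holding 2+ offices.

    chapters: rows of (id, president, vice_president, secretary, treasurer).
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE chapter (id INTEGER PRIMARY KEY, president INTEGER, "
        "vice_president INTEGER, secretary INTEGER, treasurer INTEGER)"
    )
    con.executemany("INSERT INTO chapter VALUES (?, ?, ?, ?, ?)", chapters)
    offices = ["president", "vice_president", "secretary", "treasurer"]
    # Unpivot: one (member, office) row per office column.
    unpivot = " UNION ALL ".join(
        f"SELECT {o} AS member, '{o}' AS office FROM chapter" for o in offices
    )
    # Re-aggregate: keep only members that show up under more than one office.
    sql = f"""
        SELECT member, group_concat(office) AS offices
        FROM ({unpivot}) AS u
        GROUP BY member
        HAVING COUNT(DISTINCT office) > 1
    """
    result = {m: sorted(o.split(",")) for m, o in con.execute(sql)}
    con.close()
    return result
```

Called with the four sample rows (IDs 105, 106, 108, 110), this returns entries for members 1051456, 1060944 and 1100485 only.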

Build sql query from multiple tables for cyfe dashboard

For the purpose of monitoring my users' data, I want to visualise it in a cohort analysis. Let's say that I have the following tables in my database:
Table: track_register
user_id, date, time
And the following table:
Table: track_login
user_id, date, time, succes
This is how I want my cohort analysis to look:
Months | Sign Ups | Loged in more then once
May | 40 | 80%
I am using Cyfe to visualise this so the data has to be formatted in a table like this:
Month,Sign Ups,Loged in more then once
May 2015,40,32
Jun 2015,60,55
Eventually I want to add more data to the cohort from other tables, such as the percentage of users who actually bought the product and more of that good stuff.
The first set of data (the sign-ups per month) is not the hard part. What I am struggling with is how to fetch the data from the track_login table. I will have to count the number of times a specific user has logged in, and if that's > 1 then +1. I can imagine that you use CASE for that. The trouble is separating it by the correct month, because the month where the +1 is supposed to go needs to be fetched from the track_register table.
It seems kind of hard to me to put this all in one single query. But if it couldn't be done, why go to the trouble of building a cohort analysis on Cyfe?
Hi, DATE is a reserved word and can't be used as a field name, so I used DATA.
You can try this code:
SELECT TO_CHAR(NVL(a.data, b.data), 'MON YYYY') months
, COUNT(DISTINCT a.login) sign_ups
, SUM(CASE WHEN COUNT(DISTINCT b.login) > 1 THEN 1 ELSE 0 END) Loged_in_more_then_once
FROM track_register a LEFT JOIN track_login b ON a.login = b.login
GROUP BY TO_CHAR(NVL(a.data, b.data), 'MON YYYY')
ORDER BY 1
Or:
SELECT TO_CHAR(NVL(a.data, b.data), 'MON YYYY') months
, COUNT(DISTINCT a.login) sign_ups
, SUM(CASE WHEN COUNT(DISTINCT b.login) > 1 THEN 1 ELSE 0 END) Loged_in_more_then_once
FROM track_register a LEFT JOIN track_login b
ON a.login = b.login AND LAST_DAY(a.data) = LAST_DAY(b.data)
GROUP BY TO_CHAR(NVL(a.data, b.data), 'MON YYYY')
ORDER BY 1
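Several engines (SQLite, MySQL and PostgreSQL among them) reject an aggregate nested directly inside another aggregate, so one alternative is to count logins per user in a subquery first and then join that to the registrations. The following is a minimal sketch of that idea in Python with SQLite, not a translation of the queries above. It assumes the user_id and date columns from the question, ISO date strings, one registration row per user, and that every login row counts; the function name cohort and the column name repeat_users are made up for illustration.

```python
import sqlite3

def cohort(registrations, logins):
    """Return (month, sign_ups, repeat_users) rows, grouped by sign-up month.

    registrations and logins are lists of (user_id, 'YYYY-MM-DD').
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE track_register (user_id INTEGER, date TEXT)")
    con.execute("CREATE TABLE track_login (user_id INTEGER, date TEXT)")
    con.executemany("INSERT INTO track_register VALUES (?, ?)", registrations)
    con.executemany("INSERT INTO track_login VALUES (?, ?)", logins)
    sql = """
        SELECT strftime('%Y-%m', r.date) AS month,
               COUNT(*) AS sign_ups,
               -- users with no logins have NULL n_logins and count as 0
               SUM(CASE WHEN l.n_logins > 1 THEN 1 ELSE 0 END) AS repeat_users
        FROM track_register AS r
        LEFT JOIN (SELECT user_id, COUNT(*) AS n_logins
                   FROM track_login
                   GROUP BY user_id) AS l
          ON l.user_id = r.user_id
        GROUP BY strftime('%Y-%m', r.date)
        ORDER BY 1
    """
    rows = con.execute(sql).fetchall()
    con.close()
    return rows
```

Because the month comes from track_register only, each user's "logged in more than once" flag lands in the month they signed up, regardless of when the logins happened.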

SQL Select statement from multiple tables while adding values

I'm having a bit of trouble figuring out a good statement to write. I am able to achieve what I want when I query a specific 'Company', but I want to get the values for all of the companies in the database.
Basically I have 3 tables: Users, Companies, Plans_ExchangeMailbox. What I need to do is query how many plans are in use for each company. The plans are assigned at the user level in the users table.
Here are my table layouts:
USERS
    DisplayName
    CompanyCode (This is the ID from the CompanyCode in the Companies table)
    MailboxPlan (This is the ID from the Plans_ExchangeMailbox Table)
Companies
    CompanyName
    CompanyCode
Plans_ExchangeMailbox
    MailboxPlanName
    MailboxPlanID
Here is the format I am looking to generate:
CompanyName, MailboxPlanName, Count (this is the number of MailboxPlanID for a company)
I was able to get this working but the problem is it can only do one company at a time and it doesn't get the CompanyName:
SELECT
Plans_ExchangeMailbox.MailboxPlanName,
SUM(CASE WHEN Users.MailboxPlan = Plans_ExchangeMailbox.MailboxPlanId THEN 1 ELSE 0 END) AS PlanCount
FROM
Plans_ExchangeMailbox, Users
WHERE
Users.CompanyCode='CC0'
GROUP BY
Plans_ExchangeMailbox.MailboxPlanName
The Final Format How it Should Be:
Headers: CompanyName, PlanName, Count
Values:
Microsoft, Bronze Plan, 5
Microsoft, Gold Plan, 20
Dell, Bronze Plan, 3
Dell, Silver Plan, 80
etc.....
Try this:
SELECT
C.CompanyName,
E.MailboxPlanName,
COUNT(1) Cnt
FROM Companies C
JOIN Users U
ON C.CompanyCode = U.CompanyCode
JOIN Plans_ExchangeMailbox E
ON U.MailboxPlan = E.MailboxPlanID
GROUP BY
C.CompanyCode,
C.CompanyName,
E.MailboxPlanID,
E.MailboxPlanName
The query groups by C.CompanyCode and E.MailboxPlanID in case there are different companies or mailbox plans with the same name. If there are none, you can remove them from the GROUP BY clause.
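As a quick way to check the join-and-group query against a few rows, here is a minimal sketch in Python with an in-memory SQLite database. The column types, the sample data in the usage note, and the function name plan_counts are assumptions; an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

def plan_counts(companies, plans, users):
    """Return (CompanyName, MailboxPlanName, Cnt) rows, one per company and plan.

    companies: (CompanyCode, CompanyName); plans: (MailboxPlanID, MailboxPlanName);
    users: (DisplayName, CompanyCode, MailboxPlan).
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Companies (CompanyCode TEXT, CompanyName TEXT)")
    con.execute("CREATE TABLE Plans_ExchangeMailbox "
                "(MailboxPlanID INTEGER, MailboxPlanName TEXT)")
    con.execute("CREATE TABLE Users "
                "(DisplayName TEXT, CompanyCode TEXT, MailboxPlan INTEGER)")
    con.executemany("INSERT INTO Companies VALUES (?, ?)", companies)
    con.executemany("INSERT INTO Plans_ExchangeMailbox VALUES (?, ?)", plans)
    con.executemany("INSERT INTO Users VALUES (?, ?, ?)", users)
    sql = """
        SELECT C.CompanyName, E.MailboxPlanName, COUNT(*) AS Cnt
        FROM Companies AS C
        JOIN Users AS U ON C.CompanyCode = U.CompanyCode
        JOIN Plans_ExchangeMailbox AS E ON U.MailboxPlan = E.MailboxPlanID
        GROUP BY C.CompanyCode, C.CompanyName, E.MailboxPlanID, E.MailboxPlanName
        ORDER BY C.CompanyName, E.MailboxPlanName
    """
    rows = con.execute(sql).fetchall()
    con.close()
    return rows
```

For example, with two companies, two plans and four users spread across them, the function returns one row per (company, plan) pair that has at least one user; pairs with no users are omitted because the joins are inner joins.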

Multiple LEFT JOINs to self with criteria to produce distribution

Although several questions come close to what I want (and as I write this stackoverflow has suggested several more, none of which quite capture my problem), I just don't seem to be able to find my way out of the SQL thicket.
I have a single table (let's call it the user_classification_fct) that has three fields: user, week, and class (e.g. user #1 in week #1 had a class of 'Regular User', while user #2 in week #1 has a class of 'Infrequent User'). (As an aside, I have implemented classes as INTs, but wanted to work with something legible in the form of VARCHAR while I sorted out the SQL.)
What I want to do is produce a summary report of how user behaviour is changing in aggregate along the lines of:
There were 50 users who were regular users in both week 1 and week 2 and ...
There were 10 users who were regular users in week 1, but fell to infrequent users in week 2
There were 5 users who went from infrequent in week 1 to regular in week 2
... and so on ...
What makes this slightly more tricky is that user #5000 might only have started using the service in week 2 and so have no record in the table for week 1. In that case, I'd want to see a NULL for week 1 and a 'Regular User' (or whatever is appropriate) for week 2. The size of the table is not strictly relevant, but with 5 weeks' worth of data I'm looking at 42 million rows, so I do not want to insert 4 'fake' rows of 'Non-User' for someone who only starts using the service in week 5 or something.
To me this seems rather obviously like a case for using a LEFT or RIGHT JOIN in MySQL because the NULL should come through on the 'missing' record.
I have tried using both WHERE and AND conditions on the LEFT JOINs and am just not getting the 'right' answers (i.e. I either get no NULL values at all in the case of trailing WHERE conditions, or my counts are far, far too high for the number of distinct users (which is ca. 10 million) in the case of the AND constraints used below). Here was my last attempt to get this working:
SELECT
ucf1.class_nm AS 'Class in 2012/15',
ucf2.class_nm AS 'Class in 2012/16',
ucf3.class_nm AS 'Class in 2012/17',
ucf4.class_nm AS 'Class in 2012/18',
ucf5.class_nm AS 'Class in 2012/19',
count(*) AS 'Count'
FROM
user_classification_fct ucf5
LEFT JOIN user_classification_fct ucf4
ON ucf5.user_id=ucf4.user_id
AND ucf5.week_key=201219 AND ucf4.week_key=201218
LEFT JOIN user_classification_fct ucf3
ON ucf4.user_id=ucf3.user_id
AND ucf4.week_key=201218 AND ucf3.week_key=201217
LEFT JOIN user_classification_fct ucf2
ON ucf3.user_id=ucf2.user_id
AND ucf3.week_key=201217 AND ucf2.week_key=201216
LEFT JOIN user_classification_fct ucf1
ON ucf2.user_id=ucf1.user_id
AND ucf2.week_key=201216 AND ucf1.week_key=201215
GROUP BY 1,2,3,4,5;
In looking at the various other questions on stackoverflow.com, it may well be that I need to perform the queries one-at-a-time and UNION the result sets together or use parentheses to chain them one-to-another, but those approaches are not ones that I'm familiar with (yet) and I can't even get a single LEFT JOIN (i.e. week 5 to week 1, dropping all the other weeks of data) to return something useful.
Any tips would be much, much appreciated and I would really appreciate suggestions that work in MySQL as switching database products is not an option.
You can do this with a group by. I would start by summarizing all the possible combinations for the five weeks as:
select c_201215, c_201216, c_201217, c_201218, c_201219,
count(*) as cnt
from (select user_id,
max(case when week_key=201215 then class_nm end) as c_201215,
max(case when week_key=201216 then class_nm end) as c_201216,
max(case when week_key=201217 then class_nm end) as c_201217,
max(case when week_key=201218 then class_nm end) as c_201218,
max(case when week_key=201219 then class_nm end) as c_201219
from user_classification_fct ucf
group by user_id
) t
group by c_201215, c_201216, c_201217, c_201218, c_201219
This may solve your problem. If you have 5 classes (including NULL), then this will return at most 5^5 or 3,125 rows.
This fits into Excel, so you can do the final processing there. Alternatively, you can still use the database.
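The summarizing query can be tried on a handful of rows with a minimal sketch like the one below, written in Python against in-memory SQLite rather than MySQL. The function name class_transitions and the idea of passing the week keys as a parameter are assumptions for illustration; the column names come from the question's query.

```python
import sqlite3

def class_transitions(rows, weeks):
    """Return {(class in week 1, ..., class in week N): number of users}.

    rows: (user_id, week_key, class_nm); weeks: list of integer week keys.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE user_classification_fct "
                "(user_id INTEGER, week_key INTEGER, class_nm TEXT)")
    con.executemany("INSERT INTO user_classification_fct VALUES (?, ?, ?)", rows)
    cols = ", ".join(f"c_{int(w)}" for w in weeks)
    # One column per week; MAX() ignores NULLs, so a missing week stays NULL.
    pivots = ", ".join(
        f"MAX(CASE WHEN week_key = {int(w)} THEN class_nm END) AS c_{int(w)}"
        for w in weeks
    )
    sql = f"""
        SELECT {cols}, COUNT(*) AS cnt
        FROM (SELECT user_id, {pivots}
              FROM user_classification_fct
              GROUP BY user_id) AS t
        GROUP BY {cols}
    """
    result = {tuple(r[:-1]): r[-1] for r in con.execute(sql)}
    con.close()
    return result
```

A user with no row for a given week shows up with None in that position of the key, which corresponds to the NULL the question asks for, and no self-joins are needed.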
If you want to extract pairs of weeks, then I would suggest putting the above into a temporary table, say "t", and doing a series of extracts with unions:
select *
from ((select '201215' as weekstart, c_201215, c_201216, sum(cnt) as cnt
from t
group by c_201215, c_201216
) union all
(select '201216', c_201216, c_201217, sum(cnt) as cnt
from t
group by c_201216, c_201217
) union all
(select '201217', c_201217, c_201218, sum(cnt) as cnt
from t
group by c_201217, c_201218
) union all
(select '201218', c_201218, c_201219, sum(cnt) as cnt
from t
group by c_201218, c_201219
)
) tg
order by 1, cnt desc
I suggest putting it in a subquery because you don't want to mess around with common-subquery optimizations on such a large table. You'll get to your final answer by summarizing first, and then bringing the data together.
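The week-pair rollup can be sketched the same way. The following minimal Python example loads an already-summarized result into a SQLite temporary table named t and builds one grouped SELECT per adjacent pair of weeks, joined with UNION ALL. The function name week_pair_counts, the from_class and to_class column aliases, and the shape of the summary input are assumptions; the sketch also assumes at least two weeks are given.

```python
import sqlite3

def week_pair_counts(summary, weeks):
    """Roll a per-combination summary up into counts for adjacent week pairs.

    summary: rows of (class_week1, ..., class_weekN, cnt), matching `weeks`.
    Returns rows of (weekstart, from_class, to_class, cnt).
    """
    con = sqlite3.connect(":memory:")
    cols = [f"c_{int(w)}" for w in weeks]
    col_defs = ", ".join(c + " TEXT" for c in cols)
    con.execute(f"CREATE TEMP TABLE t ({col_defs}, cnt INTEGER)")
    placeholders = ", ".join("?" * (len(cols) + 1))
    con.executemany(f"INSERT INTO t VALUES ({placeholders})", summary)
    # One grouped SELECT per adjacent pair: (week 1, week 2), (week 2, week 3), ...
    parts = [
        f"SELECT '{int(w)}' AS weekstart, {a} AS from_class, {b} AS to_class, "
        f"SUM(cnt) AS cnt FROM t GROUP BY {a}, {b}"
        for w, a, b in zip(weeks, cols, cols[1:])
    ]
    # Sort by starting week, then by count descending (columns 1 and 4).
    sql = " UNION ALL ".join(parts) + " ORDER BY 1, 4 DESC"
    rows = con.execute(sql).fetchall()
    con.close()
    return rows
```

Since the summary has at most a few thousand rows, these rollups run against a tiny table instead of the 42-million-row fact table.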