MySQL count across multiple tables with multiple conditions - mysql

I need to generate a report of how many registered users there are based on groups with more than 'x' members (members are users and can be registered or not registered).
A real simplified version would be:
table.users
userid int(11) NOT NULL,
username VARCHAR(40),
PRIMARY KEY (userid)
table.groups
gid int(11) NOT NULL,
guserid int(11) NOT NULL,
groupid VARCHAR(12) NOT NULL,
PRIMARY KEY (gid)
And some sample data:
INSERT INTO users (userid, username) VALUES
('1','bob'),('2','steve'),('3',''),('4','jill'),
('5',''),('6',''),('7','john'),('8','stan'),
('9',''),('10','rachel'),('11','lisa');
Out of those 11 users, 7 have usernames (registered)
INSERT INTO groups (gid, guserid, groupid) VALUES
('1','1','ABC123'),('2','2','ABC123'),('3','3','XYZ789'),('4','4','ABC123'),
('5','5','XYZ789'),('6','6','ABC123'),('7','7','DEF456'),('8','8','ABC123'),
('9','9','DEF456'),('10','10','XYZ789'),('11','11','XYZ789');
I need to get the groupid, the count of that groupid in the groups table, and then the count of the users registered for that group (username is not empty). For the sample data the expected result is:
'ABC123','5','4'
'XYZ789','4','2'
'DEF456','2','1'
The real data set is much larger, and I only need results for groups with more than 500 possible members (around 1,000 groups, each with anywhere from 500 to 25,000 possible members). Everything I've tried involves nested selects; I can get close, but not the exact data I need.

You just need to LEFT JOIN to the users table and COUNT:
SELECT groupid, COUNT(*), COUNT(DISTINCT u.username)
FROM groups AS g
LEFT JOIN users AS u ON u.userid = g.guserid AND u.username <> ''
GROUP BY groupid
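The query above does not yet apply the "more than x members" filter from the question. A minimal sketch of how that could look with a HAVING clause, run here against an in-memory SQLite database as a stand-in for MySQL (the threshold name MIN_MEMBERS and its value are made up for the sample data; COUNT(u.userid) is swapped in for COUNT(DISTINCT u.username) so that duplicate usernames would still be counted):

```python
import sqlite3

# SQLite stands in for MySQL. `groups` is backtick-quoted because GROUPS is a
# reserved word in MySQL 8.0+; SQLite accepts backticks for compatibility.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY, username VARCHAR(40));
CREATE TABLE `groups` (gid INTEGER PRIMARY KEY, guserid INTEGER NOT NULL,
                       groupid VARCHAR(12) NOT NULL);
INSERT INTO users (userid, username) VALUES
 (1,'bob'),(2,'steve'),(3,''),(4,'jill'),(5,''),(6,''),
 (7,'john'),(8,'stan'),(9,''),(10,'rachel'),(11,'lisa');
INSERT INTO `groups` (gid, guserid, groupid) VALUES
 (1,1,'ABC123'),(2,2,'ABC123'),(3,3,'XYZ789'),(4,4,'ABC123'),
 (5,5,'XYZ789'),(6,6,'ABC123'),(7,7,'DEF456'),(8,8,'ABC123'),
 (9,9,'DEF456'),(10,10,'XYZ789'),(11,11,'XYZ789');
""")

MIN_MEMBERS = 3  # hypothetical threshold; the question's real value is 500

rows = conn.execute("""
SELECT g.groupid, COUNT(*) AS members, COUNT(u.userid) AS registered
FROM `groups` AS g
LEFT JOIN users AS u ON u.userid = g.guserid AND u.username <> ''
GROUP BY g.groupid
HAVING COUNT(*) > ?          -- keep only groups above the threshold
ORDER BY members DESC
""", (MIN_MEMBERS,)).fetchall()

print(rows)  # [('ABC123', 5, 4), ('XYZ789', 4, 2)]
```

DEF456 has only 2 members, so it is filtered out by the HAVING clause.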

Something along the lines of:
SELECT groups.groupid,
COUNT(groups.gid),
SUM(CASE WHEN users.username <> '' THEN 1 ELSE 0 END)
FROM groups
JOIN users ON users.userid=groups.guserid
GROUP BY groups.groupid
Untested.

Related

How to create a query with JOIN and WHERE or how to make them friends?

I need to write a query that returns the clients' names and their number of orders per month.
Some clients have no orders in some months, and those fields must show 0.
The problem is that when I use WHERE and an OUTER JOIN (no matter which one) in the same query, the necessary zeros are cut off by the WHERE. How can I solve that?
Descriptions of the tables are attached.
SELECT name
, ordering.id_client
, COUNT(order_date)
FROM ordering
RIGHT OUTER JOIN client
ON client.id_client = ordering.id_client
WHERE month(order_date) = 1
GROUP BY name;
**Description**: (https://i.imgur.com/TrUGOLW.png)
**Example of my query** (there are 6 client records in my db, but only 4 of the 6 are shown):
(https://i.imgur.com/ABP6pP0.png)
**MRE stuff**
Client: create table client(id_client int primary key auto_increment, name varchar(50), passport_code int, addr varchar(70));
insert into client values(null, 'Penny Anderson', 6485, 'New Orlean');
Ordering: create table ordering(id_order int primary key auto_increment, id_client int, order_date date, foreign key(id_client) references client(id_client));
insert into ordering values(null, 1, date('2020-05-01'));
Try a simple left join starting from client's table
SELECT client.name
, client.id_client
, COUNT(order_date)
FROM client
LEFT JOIN ordering ON client.id_client = ordering.id_client
AND month(ordering.order_date) = 1
GROUP BY client.id_client;
If the condition applies to the left-joined table, put it in that join's ON clause rather than in WHERE. A WHERE condition on the joined table discards the NULL-extended rows, so the query then behaves like an inner join.
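To see the difference between the two placements, here is a small sketch using an in-memory SQLite database as a stand-in for MySQL. The second client and the extra order rows are invented for illustration, and SQLite's strftime('%m', ...) replaces MySQL's MONTH():

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE client (id_client INTEGER PRIMARY KEY, name VARCHAR(50));
CREATE TABLE ordering (id_order INTEGER PRIMARY KEY, id_client INTEGER,
                       order_date DATE);
-- 'Sam Hypothetical' and the order dates below are made-up sample data.
INSERT INTO client VALUES (1, 'Penny Anderson'), (2, 'Sam Hypothetical');
INSERT INTO ordering VALUES (1, 1, '2020-01-15'), (2, 1, '2020-05-01'),
                            (3, 2, '2020-05-03');
""")

# Month filter in WHERE: rows for clients without January orders are removed.
in_where = conn.execute("""
SELECT c.name, COUNT(o.order_date)
FROM client AS c
LEFT JOIN ordering AS o ON c.id_client = o.id_client
WHERE strftime('%m', o.order_date) = '01'
GROUP BY c.id_client, c.name
ORDER BY c.id_client
""").fetchall()

# Month filter in ON: every client survives, unmatched ones count as 0.
in_on = conn.execute("""
SELECT c.name, COUNT(o.order_date)
FROM client AS c
LEFT JOIN ordering AS o ON c.id_client = o.id_client
                       AND strftime('%m', o.order_date) = '01'
GROUP BY c.id_client, c.name
ORDER BY c.id_client
""").fetchall()

print(in_where)  # [('Penny Anderson', 1)]
print(in_on)     # [('Penny Anderson', 1), ('Sam Hypothetical', 0)]
```

COUNT(o.order_date) ignores NULLs, which is why the unmatched client shows 0 rather than 1.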

How do I write an SQL query that gives me the amount of times a value is repeated in a table?

So I have two tables which I have created:
CREATE TABLE IF NOT EXISTS `Advertising_Campaign` (
`CampaignID` VARCHAR(10) NOT NULL,
`AdvertName` varchar(45) NOT NULL,
`ProjectLead` VARCHAR(10) NULL,
`CostEstimate` decimal NULL,
`CampaignCost` decimal NULL,
`EndDateEst` date NULL,
`StartDate` date NULL,
`EndDate` date NULL,
`Theme` VARCHAR(45) NOT NULL,
`AdvertType` VARCHAR(45) NOT NULL,
PRIMARY KEY (`CampaignID`))
ENGINE = InnoDB;
CREATE TABLE IF NOT EXISTS `staff_works_campaign` (
`CampaignID` VARCHAR(10) NOT NULL,
`StaffID` VARCHAR(10) NOT NULL,
`SalaryGrade` Integer NOT NULL,
`isSup` VARCHAR(3) NOT NULL,
PRIMARY KEY (`StaffID`, `CampaignID`),
CONSTRAINT `FK_StaffID3` FOREIGN KEY (`StaffID`) REFERENCES `Staff` (`StaffID`),
CONSTRAINT `FK_CampaignID2` FOREIGN KEY (`CampaignID`) REFERENCES `Advertising_Campaign` (`CampaignID`))
ENGINE = InnoDB;
which gives the tables:
Basically, I want to write a query that will return me a list of the advertising_campaign.AdvertName with more than 2 staff members working on them and a count of the number of staff members whose staff_works_campaign.SalaryGrade is greater than 2.
I have tried:
select a.advertname, count(*) as 'Greater Than 2'
from advertising_campaign a inner join staff_works_campaign
where staff_works_campaign.SalaryGrade > 2;
Which isn't exactly what I want, it returns:
I am a bit unsure what this is returning exactly, because I would have thought it would return a count of 2 (since there are 2 entries with a SalaryGrade of 4 in the table). Might it be because of the way inner join works?
I am also a bit confused as to how to filter for 'more than 2 staff members'. My idea is to count the number of times a staff_works_campaign.CampaignID appears in the staff_works_campaign table to see how many staff members are a part of the same campaign.
I'm not sure how to structure it to count the number of times a campaignID is repeated and to return the names of the adverts that have a campaignID with 2 or more staff members working on it.
So in this case I would want it to return a table with AdvertName of only those campaigns with two or more people working on them and a count of those staff members who have a salary grade greater than 2.
SELECT
a.CampaignID
,a.AdvertName
,COUNT(DISTINCT s.StaffID) AS `Count of staff`
,SUM(
-- Use this to get a total of the staff who are
-- in a SalaryGrade greater than 2
CASE WHEN s.SalaryGrade > 2
THEN 1
ELSE 0 -- anyone who is under this level will be a 0 and not count
END
) AS `Count of staff above salary grade`
FROM
advertising_campaign AS a
INNER JOIN staff_works_campaign AS s
-- don't forget the join condition
ON a.CampaignID = s.CampaignID
-- Don't want a WHERE here, we want to include ALL staff.
-- WHERE
-- staff_works_campaign.SalaryGrade > 2
GROUP BY
a.CampaignID
,a.AdvertName
HAVING
-- more than two members of staff working on the same campaign.
COUNT(DISTINCT s.StaffID) > 2
Firstly you need a condition to join the two tables on. Secondly you can use a Group By and Having clause to put the filter on aggregation. Finally you need to count the number of staff with a salary grade > 2, which you can SUM a conditional for. Something like this:
select a.advertname, Sum(CASE WHEN c.SalaryGrade > 2 THEN 1 ELSE 0 END) as 'Greater Than 2'
from advertising_campaign a inner join staff_works_campaign c
on a.CampaignId = c.CampaignId
Group By a.advertname Having count(*) >= 2;
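A quick way to check the SUM(CASE ...) plus HAVING pattern is to run it on a tiny data set. The sketch below uses an in-memory SQLite database in place of MySQL; the table columns are trimmed to the ones the query needs, and all rows (including the two extra advert names) are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE advertising_campaign (CampaignID VARCHAR(10) PRIMARY KEY,
                                   AdvertName VARCHAR(45) NOT NULL);
CREATE TABLE staff_works_campaign (CampaignID VARCHAR(10) NOT NULL,
                                   StaffID VARCHAR(10) NOT NULL,
                                   SalaryGrade INTEGER NOT NULL,
                                   PRIMARY KEY (StaffID, CampaignID));
-- Made-up sample rows.
INSERT INTO advertising_campaign VALUES
 ('C1', 'Star Wars 3'), ('C2', 'Hypothetical Ad'), ('C3', 'Another Ad');
INSERT INTO staff_works_campaign VALUES
 ('C1', 'S1', 4), ('C1', 'S2', 4), ('C1', 'S3', 1),
 ('C2', 'S1', 3), ('C2', 'S4', 2),
 ('C3', 'S5', 1), ('C3', 'S6', 2), ('C3', 'S7', 1);
""")

rows = conn.execute("""
SELECT a.AdvertName,
       COUNT(*) AS staff,
       -- conditional count: 1 for each staff member above grade 2
       SUM(CASE WHEN s.SalaryGrade > 2 THEN 1 ELSE 0 END) AS above_grade_2
FROM advertising_campaign AS a
INNER JOIN staff_works_campaign AS s ON a.CampaignID = s.CampaignID
GROUP BY a.CampaignID, a.AdvertName
HAVING COUNT(*) > 2          -- strictly more than two staff members
ORDER BY a.CampaignID
""").fetchall()

print(rows)  # [('Star Wars 3', 3, 2), ('Another Ad', 3, 0)]
```

Because the salary condition lives inside the SUM rather than in a WHERE clause, a campaign with enough staff but nobody above grade 2 still appears, with a count of 0.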
If you want both conditions applied together (campaigns with at least 2 staff members whose SalaryGrade is greater than 2), you can do something like the following:
SELECT AdvertName
FROM Advertising_Campaign
WHERE CampaignID IN
(
SELECT
CampaignID
FROM
staff_works_campaign
WHERE
SalaryGrade > 2
GROUP BY
CampaignID
HAVING
COUNT(DISTINCT StaffID) >= 2
)
What you have received as a result from your query is the count of all staff members across all campaigns that have a salary grade greater than 2. It returns "Star Wars 3" as the advert name simply because it's the first name it came across in all the results that the COUNT operates over. (Some other SQL technologies such as Microsoft SQL Server actually won't allow you to do this kind of query to avoid this confusion.)
In order to get the results to be split by the campaign, you have to use the GROUP BY clause as suggested in the other answers. This will tell SQL to calculate any aggregate functions (i.e. COUNT) over groups of records that all match for one or more fields. In your case, you want to group by the campaignID, since you want the COUNT to be calculated for each campaign individually. You could do this on the advert name as well, but better to do it on the ID in case you have two with the same name. Modifying your query to do that, we get:
select a.campaignID, count(*) as 'Greater Than 2'
from advertising_campaign a inner join staff_works_campaign
on a.campaignID = staff_works_campaign.campaignID
where staff_works_campaign.SalaryGrade > 2
group by a.campaignID;
This still isn't quite going to work though, because the salary grade condition is applied before the COUNT, so staff at grade 2 or below are never counted towards the size of the campaign. We need to move that part out into a new query that wraps around this one. We also need to limit the campaigns down to those with more than two staff. Thankfully, we don't need yet another outer query for that: the HAVING keyword allows a condition to be applied after a GROUP BY, so we can do:
select a.campaignID, count(*) as 'staff_amount'
from advertising_campaign a inner join staff_works_campaign
on a.campaignID = staff_works_campaign.campaignID
group by a.campaignID
having staff_amount > 2;
Now, adding the staff salary condition and another select from advertising_campaign to get the advert name in an outer query, we finally get:
select advertising_campaign.advertname, count(*) as 'Greater Than 2'
from advertising_campaign
inner join staff_works_campaign on advertising_campaign.campaignid = staff_works_campaign.campaignid
inner join
(
select a.campaignID, count(*) as 'staff_amount'
from advertising_campaign a inner join staff_works_campaign s on a.campaignID = s.campaignID
group by a.campaignID
having staff_amount > 2
) large_campaigns on advertising_campaign.campaignid = large_campaigns.campaignid
where staff_works_campaign.salarygrade > 2
group by advertising_campaign.campaignid, advertising_campaign.advertname;

How can I select and order rows from one table based on values contained in another?

I am currently working with two tables.
status:
id INT(4) AUTO_INCREMENT
username VARCHAR(20)
text LONGTEXT
datetime VARCHAR(25)
attachment VARCHAR(11)
timestamp VARCHAR(50)
friends:
id INT(2) AUTO_INCREMENT
added INT(1)
userto VARCHAR(32)
userfrom VARCHAR(32)
I would like to add the option for a user to filter statuses down to only their friends' statuses, and display them with the newest one first. By this I mean the most recent status overall, not the newest per friend, which I suspect can be done with:
ORDER BY status.id DESC
How would I order the statuses based on the statuses of the users on the person's friends list?
Well, without any sample data it would be hard to do this for sure, but I would walk through it this way.
The statuses we want to show (regardless of order) are those the user is friends with. So, let's get the friends of Adam for example:
SELECT DISTINCT userto
FROM friends
WHERE userfrom = 'Adam'
UNION
SELECT DISTINCT userfrom
FROM friends
WHERE userto = 'Adam';
At this point, I should point out that in your friends table the usernames are VARCHAR(32) and in the status table they are VARCHAR(20). I would assume they should be the same.
So, now you can filter the status based on whether or not the username is in the above subquery, and you could order by id descending, assuming you add them in order, but the best way would be to order by the timestamp on the status:
SELECT *
FROM status
WHERE username IN
(SELECT DISTINCT userto
FROM friends
WHERE userfrom = 'Adam'
UNION
SELECT DISTINCT userfrom
FROM friends
WHERE userto = 'Adam')
ORDER BY datetime DESC;
EDIT
I would also rethink the types of your datetime and timestamp columns, as well as their names. While these names are valid, they are also MySQL keywords (DATETIME and TIMESTAMP are data types in MySQL), which makes queries harder to read. A possible reconstruction of your status table could be:
id INT(4) AUTO_INCREMENT
username VARCHAR(32)
statusText LONGTEXT
statusPostDate DATETIME
attachment VARCHAR(11)
The DATETIME column will hold both the date and time portions for you.
Try this
SELECT status.*
FROM status
JOIN friends
ON status.username = friends.userto
WHERE friends.userfrom = '$username'
UNION
SELECT status.*
FROM status
JOIN friends
ON status.username = friends.userfrom
WHERE friends.userto = '$username'
ORDER BY id DESC;
This query loads statuses from usernames on either end of a friendship that includes $username. UNION glues two lists top to bottom; JOIN glues two lists side by side and aligns them on the author's username. Because the ORDER BY applies to the whole UNION, it refers to the column as id rather than status.id (MySQL does not accept table-qualified names there).
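Interpolating '$username' straight into the SQL string is open to SQL injection, so a bound parameter is usually the safer route. Below is a rough sketch of the friends-filter idea with parameter binding, run on an in-memory SQLite database rather than MySQL. The column name status_text and all sample rows are made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE status (id INTEGER PRIMARY KEY, username VARCHAR(32),
                     status_text TEXT);
CREATE TABLE friends (id INTEGER PRIMARY KEY, userto VARCHAR(32),
                      userfrom VARCHAR(32));
-- Made-up sample rows: Adam is friends with Beth and Carl, not with Dina.
INSERT INTO friends (userto, userfrom) VALUES
 ('Beth', 'Adam'), ('Adam', 'Carl'), ('Dina', 'Beth');
INSERT INTO status (id, username, status_text) VALUES
 (1, 'Beth', 'a'), (2, 'Dina', 'b'), (3, 'Carl', 'c'),
 (4, 'Adam', 'd'), (5, 'Beth', 'e');
""")

def friend_statuses(conn, username):
    """Return (id, username) for statuses by friends of `username`, newest first."""
    return conn.execute("""
    SELECT s.id, s.username
    FROM status AS s
    WHERE s.username IN (
        SELECT userto FROM friends WHERE userfrom = ?
        UNION
        SELECT userfrom FROM friends WHERE userto = ?
    )
    ORDER BY s.id DESC
    """, (username, username)).fetchall()

adam_feed = friend_statuses(conn, "Adam")
print(adam_feed)  # [(5, 'Beth'), (3, 'Carl'), (1, 'Beth')]
```

Ordering by id assumes ids are assigned in posting order; with a real DATETIME column, ordering by that column would be the more robust choice.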

Alternative to GROUP_CONCAT? Multiple joins to same table, different columns

I'm not even sure of the correct terminology here. MySQL newbie, more or less.
Given a couple tables defined as follows:
CREATE TABLE users
( user_id int(11) NOT NULL auto_increment
, name VARCHAR(255)
, pri_location_id mediumint(8)
, sec_location_id mediumint(8)
, PRIMARY KEY (user_id)
);
CREATE TABLE locations
( location_id mediumint(8) NOT NULL AUTO_INCREMENT
, name varchar(255)
, PRIMARY KEY (location_id)
)
I'm trying to do a query to get the user name and both primary and secondary locations in one go.
I can get one like this:
SELECT u.name AS user_name, l.name as primary_location FROM users u, locations l WHERE u.pri_location_id=l.location_id
But I'm drawing a total blank on the correct syntax to use to get both in one query.
SELECT u.name AS user_name, l1.name as primary_location , l2.name as secondary_location
FROM users u
JOIN locations l1 ON(u.pri_location_id=l1.location_id)
JOIN locations l2 ON(u.sec_location_id = l2.location_id);
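One thing to keep in mind with the two inner joins above: a user whose sec_location_id is NULL will not appear in the result at all. If such users should still be listed, a LEFT JOIN variant is one option. Here is a small sketch on an in-memory SQLite database (standing in for MySQL), with invented sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name VARCHAR(255),
                    pri_location_id INTEGER, sec_location_id INTEGER);
CREATE TABLE locations (location_id INTEGER PRIMARY KEY, name VARCHAR(255));
-- Made-up sample rows; Bo has no secondary location.
INSERT INTO locations VALUES (1, 'Paris'), (2, 'Oslo');
INSERT INTO users VALUES (1, 'Ann', 1, 2), (2, 'Bo', 2, NULL);
""")

# The same table is joined twice under two aliases, one per foreign key.
rows = conn.execute("""
SELECT u.name  AS user_name,
       l1.name AS primary_location,
       l2.name AS secondary_location
FROM users AS u
LEFT JOIN locations AS l1 ON u.pri_location_id = l1.location_id
LEFT JOIN locations AS l2 ON u.sec_location_id = l2.location_id
ORDER BY u.user_id
""").fetchall()

print(rows)  # [('Ann', 'Paris', 'Oslo'), ('Bo', 'Oslo', None)]
```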
First off, I would strongly consider changing your DB schema, if allowable, to add a users_locations table that can properly describe this many-to-many relationship.
This table could look like:
user_id  location_id  location_type
1        1            primary
1        2            secondary
2        1            secondary
2        2            primary
and so forth.
You would likely want a compound primary key across all three columns, and location_type might best be an ENUM.
Your query would then be like
SELECT
u.name AS user_name,
l.name AS location_name,
ul.location_type AS location_type
FROM users AS u
INNER JOIN users_locations AS ul /* possibly use left join here if user can exist without a location */
ON u.user_id = ul.user_id
INNER JOIN locations AS l
ON ul.location_id = l.location_id
ORDER BY ul.location_type ASC
This would return up to two records per user (separate record for primary and secondary, primary listed first)
If you need this collapsed to a single record you could do this:
SELECT
u.name AS user_name,
MAX(CASE WHEN ul.location_type = 'primary' THEN l.name END) AS primary_location,
MAX(CASE WHEN ul.location_type = 'secondary' THEN l.name END) AS secondary_location
FROM users AS u
INNER JOIN users_locations AS ul /* possibly use left join here if user can exist without a location */
ON u.user_id = ul.user_id
INNER JOIN locations AS l
ON ul.location_id = l.location_id
GROUP BY u.user_id, u.name
If, however, you are stuck with the current schema, then the solution by #Jlil should work for you.
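For anyone who wants to try the one-row-per-user idea with the link table, here is a rough sketch on an in-memory SQLite database in place of MySQL. It collapses the rows with MAX(CASE ...), which ignores the NULLs produced for the non-matching location type. The user and location names are invented; the link rows follow the example table above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name VARCHAR(255));
CREATE TABLE locations (location_id INTEGER PRIMARY KEY, name VARCHAR(255));
CREATE TABLE users_locations (
    user_id INTEGER NOT NULL,
    location_id INTEGER NOT NULL,
    location_type VARCHAR(10) NOT NULL,   -- an ENUM in MySQL
    PRIMARY KEY (user_id, location_id, location_type));
-- Names are made up for the example.
INSERT INTO users VALUES (1, 'Ann'), (2, 'Bo');
INSERT INTO locations VALUES (1, 'Paris'), (2, 'Oslo');
INSERT INTO users_locations VALUES
 (1, 1, 'primary'), (1, 2, 'secondary'),
 (2, 1, 'secondary'), (2, 2, 'primary');
""")

rows = conn.execute("""
SELECT u.name AS user_name,
       MAX(CASE WHEN ul.location_type = 'primary'   THEN l.name END)
           AS primary_location,
       MAX(CASE WHEN ul.location_type = 'secondary' THEN l.name END)
           AS secondary_location
FROM users AS u
INNER JOIN users_locations AS ul ON u.user_id = ul.user_id
INNER JOIN locations AS l ON ul.location_id = l.location_id
GROUP BY u.user_id, u.name
ORDER BY u.user_id
""").fetchall()

print(rows)  # [('Ann', 'Paris', 'Oslo'), ('Bo', 'Oslo', 'Paris')]
```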

T-SQL - calculate a boolean field on the fly

I'm using SQL Server 2008.
Let's say I have two hypothetical tables like below:
CREATE TABLE [Department](
[Id] int IDENTITY(1,1),
[ManagerId] int NULL, -- << Foreign key to the Person table
-- other fields
)
CREATE TABLE [Person](
[Id] int IDENTITY(1,1),
[DepartmentId] int NOT NULL, -- << Foreign key to the Department table
-- other fields
)
Now, I want to return a list of rows from the [Person] table (i.e. list of staff for a given department). Only one (or zero) of these rows will match the [ManagerId] field in the [Department] table. And I want to flag the matched row with a boolean field on the fly... the resultant rowset will resemble the following schema:
[Id] INT,
[IsManager] BIT NOT NULL DEFAULT 0,
-- other fields
The [IsManager] field will be TRUE when [Department].[ManagerId] matches [Person].[Id].
This is fairly trivial to do with two (or more) queries. But how can I achieve this using a single SQL statement?
Add an expression to your SELECT clause that compares each person's Id with the ManagerId of that person's department:
SELECT
Person.Id,
Department.Id,
CAST(CASE WHEN Person.Id=Department.ManagerId THEN 1 ELSE 0 END AS BIT) AS IsManager
FROM Person
INNER JOIN Department ON Person.DepartmentId=Department.Id
WHERE Person.DepartmentId=<CONDITION>
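The CASE comparison can be tried out on a small data set. The sketch below uses an in-memory SQLite database as a stand-in for SQL Server, so the IDENTITY columns become INTEGER PRIMARY KEY and the flag is a plain 0/1 integer instead of a BIT; the sample rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (Id INTEGER PRIMARY KEY, ManagerId INTEGER NULL);
CREATE TABLE Person (Id INTEGER PRIMARY KEY, DepartmentId INTEGER NOT NULL);
-- Made-up rows: person 2 manages department 1; department 2 has no manager.
INSERT INTO Department VALUES (1, 2), (2, NULL);
INSERT INTO Person VALUES (1, 1), (2, 1), (3, 1), (4, 2);
""")

def staff_with_manager_flag(conn, department_id):
    """Return (PersonId, IsManager) rows for one department."""
    return conn.execute("""
    SELECT p.Id,
           CASE WHEN p.Id = d.ManagerId THEN 1 ELSE 0 END AS IsManager
    FROM Person AS p
    INNER JOIN Department AS d ON p.DepartmentId = d.Id
    WHERE p.DepartmentId = ?
    ORDER BY p.Id
    """, (department_id,)).fetchall()

dept1 = staff_with_manager_flag(conn, 1)
dept2 = staff_with_manager_flag(conn, 2)
print(dept1)  # [(1, 0), (2, 1), (3, 0)]
print(dept2)  # [(4, 0)]
```

When ManagerId is NULL the comparison evaluates to NULL, so the CASE falls through to the ELSE branch and the flag is 0.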
A left join from the Person table to the department table on ManagerId will do the trick for you:
SELECT p.Id AS PersonId, d.Id AS DepartmentId,
CAST(CASE WHEN d.Id IS NULL THEN 0 ELSE 1 END AS BIT) AS IsManager
FROM Person p LEFT JOIN Department d ON p.Id = d.ManagerId
How it works: All rows from Person are returned, regardless of the existence of a corresponding Department matching on ManagerId. For those Person records without a matching department, all of the Department fields in the result set are NULL, so we can use that to determine whether or not there is a match.
Note that this query may return duplicate Person records if a person is a manager for multiple departments. To make this visible, I have added the DepartmentId to the list. If you require a unique list of persons and their IsManager flag, drop d.Id from the select clause and insert DISTINCT after SELECT:
SELECT DISTINCT p.Id AS PersonId,
CAST(CASE WHEN d.Id IS NULL THEN 0 ELSE 1 END AS BIT) AS IsManager
FROM Person p LEFT JOIN Department d ON p.Id = d.ManagerId