Find unique sets in sql data - mysql

I have three tables:
Modules
| ID | name |
Subscription
| module_id | user_id | ...
User
| ID | user_name |
I need a list of unique subscription sets (i.e., x users subscribed to modules (1), y users subscribed to (1,2), etc.). Can I do this in SQL?

Let's set up some tables.
create table modules (
module_id integer not null,
module_name varchar(15) not null,
primary key (module_id),
unique (module_name)
);
insert into modules values (1, 'First module');
insert into modules values (2, 'Second module');
create table users (
user_id integer not null,
user_name varchar(15) not null,
primary key (user_id),
unique (user_name)
);
insert into users values (100, 'First user');
insert into users values (101, 'Second user');
create table subscriptions (
module_id integer not null,
user_id integer not null,
primary key (module_id, user_id),
foreign key (module_id)
references modules (module_id),
foreign key (user_id)
references users (user_id)
);
insert into subscriptions values (1, 100);
insert into subscriptions values (1, 101);
insert into subscriptions values (2, 100);
To get the right counts, all you need is a query on the table "subscriptions".
select module_id, count(*) as num_users
from subscriptions
group by module_id
order by module_id;
module_id num_users
--
1 2
2 1
You can use that statement in a join to get the module name instead of the module id number.
select t1.module_name, t2.num_users
from modules t1
inner join (select module_id, count(*) as num_users
from subscriptions
group by module_id
) t2
on t1.module_id = t2.module_id
order by t1.module_name;
module_name num_users
--
First module 2
Second module 1
You'll need to use the InnoDB engine in order to enforce foreign key constraints.
To get the users who are subscribed to both module ids 1 and 2, use a WHERE clause to pick the module id numbers, use a GROUP BY clause to get the count, and use a HAVING clause to restrict the output to only those user id numbers who have a count of 2. (That means they've subscribed to both those modules in the WHERE clause.)
select user_id, count(*) num_modules
from subscriptions
where module_id in (1, 2)
group by user_id
having count(*) = 2;
This kind of requirement can blow up in your face quite quickly if you need to report on all the possible combinations of modules. For only 10 modules, there are 2^10 - 1 = 1023 possible non-empty combinations. Usually, you'd want to write a program to either
generate dynamic SQL,
generate a static SQL statement for each of the possible combinations,
write a new SQL statement each time you're asked to report on a combination (usually, most combinations aren't interesting), or
renegotiate the requirements.
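As a rough illustration of the "generate dynamic SQL" option, the sketch below builds one statement per non-empty combination of module ids, reusing the WHERE / GROUP BY / HAVING pattern from the query above. Python is used only for illustration, and the function name combination_queries is hypothetical. Like the hand-written query, each generated statement finds users subscribed to at least the listed modules, not to exactly those modules.

```python
from itertools import combinations


def combination_queries(module_ids):
    """Yield (combo, sql) for every non-empty combination of module ids.

    Hypothetical helper: table and column names follow the schema above.
    """
    module_ids = list(module_ids)
    for size in range(1, len(module_ids) + 1):
        for combo in combinations(module_ids, size):
            # Ids are forced through int() so only numbers reach the SQL text.
            id_list = ", ".join(str(int(m)) for m in combo)
            sql = (
                "select user_id from subscriptions "
                f"where module_id in ({id_list}) "
                f"group by user_id having count(*) = {size}"
            )
            yield combo, sql


queries = list(combination_queries(range(1, 11)))
print(len(queries))      # number of statements for 10 modules
print(queries[0][1])     # the statement for the combination (1,)
```

Even in this small form, the number of statements doubles with every extra module, which is why reporting only on the combinations that actually occur is usually more practical.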

My colleague came up with an interesting solution
select module_combinations
from (select user_id, group_concat(module_id separator ', ') as module_combinations from subscriptions group by user_id) a
group by a.module_combinations
But this seems closer to answering the original question.
select module_combinations, count(*) as num_users
from (select group_concat(module_id order by module_id) as module_combinations
from subscriptions
group by user_id) a
group by a.module_combinations;
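If GROUP_CONCAT is not available, or the result is wanted in application code anyway, the same count can be done client-side. The sketch below is only an illustration under assumptions: it uses Python with an in-memory SQLite database as a stand-in for the MySQL tables, loaded with the sample subscriptions from the first answer.

```python
import sqlite3
from collections import Counter, defaultdict

# In-memory SQLite stands in for the MySQL database (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table subscriptions ("
    " module_id integer not null,"
    " user_id integer not null,"
    " primary key (module_id, user_id))"
)
conn.executemany(
    "insert into subscriptions values (?, ?)",
    [(1, 100), (1, 101), (2, 100)],
)

# Collect the set of modules for each user.
modules_by_user = defaultdict(set)
for module_id, user_id in conn.execute("select module_id, user_id from subscriptions"):
    modules_by_user[user_id].add(module_id)

# Count users per distinct module set; sorting makes (1, 2) and (2, 1) the same key.
set_counts = Counter(tuple(sorted(mods)) for mods in modules_by_user.values())
for combo, num_users in sorted(set_counts.items()):
    print(combo, num_users)
```

Sorting the module ids before using them as a key plays the same role as the `order by module_id` inside `group_concat`: without it, the same set could show up under two different keys.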

Related

MySQL select row from one table with multiple rows in a second table and get array of multi row in selected row

i have one table containing "Client" information, and another including "Tickets" information for each client.
int-------| varchar -------| varchar
client_id | client_name | client_tickets
----------+----------------+--------------
1 | Title one | 1,2
2 | Title two | 2,3
Simplified tickets table
int--------| varchar -------| varchar
ticket_id | ticket_name | ticket_price
-----------+-------------+--------------
1 | ticketone | 30
2 | tickettwo | 40
3 | ticketthree | 50
4 | ticketfour | 60
5 | ticketfive | 70
With the above two tables, I want to produce a single result set, with a single query, containing all the pertinent information needed to generate a search grid.
It should give the following output:
client_id | client_name | client_tickets | ticket_names | ticket_prices
----------+----------------+----------------+-----------------------+--
1 | Title one | 1,2 | ticketone,tickettwo | 30,40
2 | Title two | 2,3 | tickettwo,ticketthree | 40,50
ticket_names, ticket_ids and client_name are varchar.
I want to receive the final 5 columns with one request, for example:
SELECT s.*,
(SELECT GROUP_CONCAT(ticket_name SEPARATOR ',') FROM tickets_table WHERE ticket_id IN(s.client_tickets)) AS ticket_names,
(SELECT GROUP_CONCAT(ticket_price SEPARATOR ',') FROM tickets_table WHERE ticket_id IN(s.client_tickets)) AS ticket_prices
FROM client_table s where s.client_id=1
This seems to have a problem.
Do you have a better suggestion?
Update:
To clarify the result I want: the following code uses two queries, and I want the same thing done with a single query.
$client_result = $conn->query("SELECT * FROM client_table where client_id=1");
while ($client_row = $client_result->fetch_assoc()) {
    // client_tickets holds a comma-separated id list such as "1,2"
    $ticket_result = $conn->query("SELECT * FROM tickets_table where ticket_id IN ({$client_row['client_tickets']})");
    while ($ticket_row = $ticket_result->fetch_assoc()) {
        echo $ticket_row['ticket_name']."<br>";
    }
}
Update 2
I used @raxi's suggestion, but my MariaDB version is 10.4.17-MariaDB, which doesn't support JSON_ARRAYAGG. To work around that, I defined the function myself, following the reference "Creating an aggregate function, Using SQL":
DELIMITER //
DROP FUNCTION IF EXISTS JSON_ARRAYAGG//
CREATE AGGREGATE FUNCTION IF NOT EXISTS JSON_ARRAYAGG(next_value TEXT) RETURNS TEXT
BEGIN
DECLARE json TEXT DEFAULT '[""]';
DECLARE CONTINUE HANDLER FOR NOT FOUND RETURN json_remove(json, '$[0]');
LOOP
FETCH GROUP NEXT ROW;
SET json = json_array_append(json, '$', next_value);
END LOOP;
END //
DELIMITER ;
What you want is a fairly straightforward SELECT query with some LEFT/INNER JOIN(s).
This website has some good examples/explanations which seem very close to your need: https://www.mysqltutorial.org/mysql-inner-join.aspx
I would give you a quick working example, but it is not really clear to me what datatypes the relevant columns have. Both tables' _id columns are likely some variant of INTEGER; are they also both primary keys (or otherwise at least indexed)? The client_name/ticket_name columns are likely VARCHAR/TEXT/STRING types, but how exactly is the remaining column stored: as JSON, as an array, or as something else? (+details)
Also, you tagged your post with PHP: are you just after the SQL query, or are you looking for PHP code with the SQL inside it?
Update
Improved version of the schema
CREATE TABLE clients (
client_id SERIAL,
client_name VARCHAR(255) NOT NULL,
PRIMARY KEY (client_id)
);
CREATE TABLE tickets (
ticket_id SERIAL,
ticket_name VARCHAR(255) NOT NULL,
ticket_price DECIMAL(10,2) NOT NULL,
PRIMARY KEY (ticket_id)
);
-- A junction table to glue those 2 tables together (N to N relationship)
CREATE TABLE client_tickets (
client_id BIGINT UNSIGNED NOT NULL,
ticket_id BIGINT UNSIGNED NOT NULL,
PRIMARY KEY (client_id, ticket_id)
);
I have changed the datatypes.
client_name and ticket_name are still VARCHARs. I've flagged them as NOT NULL (i.e., required fields), but you can remove that part if you don't like it.
client_id/ticket_id/ticket_price are also NOT NULL, but changing that has negative side effects.
ticket_price is now a DECIMAL field, which can store numbers such as 1299.50 or 50.00. The (10,2) bit means it covers every possible number with up to 8 whole digits (dollars/euros/whatever) and 2 decimals (cents), so you can store anything from -99999999.99 to 99999999.99.
In SQL, always write numbers (let's say 70k) in this notation: 70000.00 (i.e., a dot, not a comma, and no thousands separators).
client_id and ticket_id are both SERIALs now, which is shorthand for BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE, and they're both PRIMARY KEYs on top of that. That probably sounds complicated, but they're still just ordinary integers with values like 4 or 12.
The UNIQUE bit prevents you from having 2 clients with the same ID number, and the AUTO_INCREMENT means that when you add a new client, you don't have to specify an ID (though you are allowed to); you can just do:
INSERT INTO clients (client_name) values ('Fantastic Mr Fox');
and the client_id will automatically be set (incrementing over time). And the same goes for ticket_id in the other table.
I've replaced your original client_tickets column with a separate junction table.
Records in there store the client_id of a client and the ticket_id that belongs to them.
A client can have multiple records in the junction table (one record for each ticket they own).
Likewise, a ticket can be mentioned on any number of rows.
It's possible for a certain client_id to not have any records in the junction table.
Likewise, it's possible for a certain ticket_id to not have any records in the junction table.
Identical records cannot exist in this table (enforced by PRIMARY KEY).
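To move existing data from the comma-separated client_tickets column into the junction table, each list has to be split into one (client_id, ticket_id) pair per ticket. The sketch below shows that step in Python, purely as an illustration: the function name to_junction_rows is hypothetical, and the input mirrors the two sample rows from the question.

```python
def to_junction_rows(clients):
    """Turn (client_id, "1,2") pairs into (client_id, ticket_id) rows.

    Hypothetical helper for a one-off migration; empty entries are skipped.
    """
    rows = []
    for client_id, ticket_csv in clients:
        for part in ticket_csv.split(","):
            part = part.strip()
            if part:
                rows.append((client_id, int(part)))
    return rows


# Sample rows from the question: client 1 has tickets 1,2 and client 2 has 2,3.
legacy = [(1, "1,2"), (2, "2,3")]
junction_rows = to_junction_rows(legacy)
print(junction_rows)
```

Each resulting pair maps directly onto one INSERT INTO client_tickets (client_id, ticket_id) statement.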
Test data
Next, we can put some data in there to be able to test it:
-- Create some tickets
INSERT INTO tickets (ticket_id, ticket_name, ticket_price) values (1, 'ticketone', '30' );
INSERT INTO tickets (ticket_id, ticket_name, ticket_price) values (2, 'tickettwo', '40' );
INSERT INTO tickets (ticket_id, ticket_name, ticket_price) values (3, 'ticketthree', '50' );
INSERT INTO tickets (ticket_id, ticket_name, ticket_price) values (4, 'ticketfour', '60' );
INSERT INTO tickets (ticket_id, ticket_name, ticket_price) values (5, 'ticketfive', '70' );
INSERT INTO tickets (ticket_id, ticket_name, ticket_price) values (6, 'ticketsix', '4' );
INSERT INTO tickets (ticket_id, ticket_name, ticket_price) values (7, 'ticketseven', '9' );
INSERT INTO tickets (ticket_id, ticket_name, ticket_price) values (8, 'ticketeight', '500' );
-- Create some clients, and link them to some of these tickets
INSERT INTO clients (client_id, client_name) values (1, 'John');
INSERT INTO client_tickets (client_id, ticket_id) values (1, 3);
INSERT INTO client_tickets (client_id, ticket_id) values (1, 7);
INSERT INTO client_tickets (client_id, ticket_id) values (1, 1);
INSERT INTO clients (client_id, client_name) values (2, 'Peter');
INSERT INTO client_tickets (client_id, ticket_id) values (2, 5);
INSERT INTO client_tickets (client_id, ticket_id) values (2, 2);
INSERT INTO client_tickets (client_id, ticket_id) values (2, 3);
INSERT INTO clients (client_id, client_name) values (3, 'Eddie');
INSERT INTO client_tickets (client_id, ticket_id) values (3, 8);
INSERT INTO clients (client_id, client_name) values (9, 'Fred');
-- Note: ticket #3 is owned by both client #1/#2;
-- Note: ticket #4 and #6 are unused;
-- Note: client #9 (Fred) has no tickets;
Queries
Get all the existing relationships (ticket-less clients are left out & owner-less tickets are left out)
SELECT clients.*
, tickets.*
FROM client_tickets AS ct
INNER JOIN clients ON ct.client_id = clients.client_id
INNER JOIN tickets ON ct.ticket_id = tickets.ticket_id
ORDER BY clients.client_id ASC
, tickets.ticket_id ASC ;
Get all the tickets that are still free (owner-less)
SELECT tickets.*
FROM tickets
WHERE tickets.ticket_id NOT IN (
SELECT ct.ticket_id
FROM client_tickets AS ct
)
ORDER BY tickets.ticket_id ASC ;
Get a list of ALL clients (even ticketless ones), and include how many tickets each has and the total price of their tickets.
SELECT clients.*
, COALESCE(COUNT(tickets.ticket_id), 0) AS amount_of_tickets
, COALESCE(SUM(tickets.ticket_price), 0.00) AS total_price
FROM clients
LEFT JOIN client_tickets AS ct ON ct.client_id = clients.client_id
LEFT JOIN tickets ON ct.ticket_id = tickets.ticket_id
GROUP BY clients.client_id
ORDER BY clients.client_id ASC ;
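A quick way to sanity-check the LEFT JOIN behaviour is to run the query against the test data. The sketch below does so in Python with an in-memory SQLite database standing in for MySQL/MariaDB (an assumption; the column types are simplified accordingly). COUNT is used without COALESCE here, because COUNT returns 0 rather than NULL when nothing matches.

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL/MariaDB (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clients (client_id INTEGER PRIMARY KEY, client_name TEXT NOT NULL);
CREATE TABLE tickets (ticket_id INTEGER PRIMARY KEY, ticket_name TEXT NOT NULL,
                      ticket_price NUMERIC NOT NULL);
CREATE TABLE client_tickets (client_id INTEGER NOT NULL, ticket_id INTEGER NOT NULL,
                             PRIMARY KEY (client_id, ticket_id));
""")
conn.executemany("INSERT INTO tickets VALUES (?, ?, ?)", [
    (1, "ticketone", 30), (2, "tickettwo", 40), (3, "ticketthree", 50),
    (4, "ticketfour", 60), (5, "ticketfive", 70), (6, "ticketsix", 4),
    (7, "ticketseven", 9), (8, "ticketeight", 500),
])
conn.executemany("INSERT INTO clients VALUES (?, ?)",
                 [(1, "John"), (2, "Peter"), (3, "Eddie"), (9, "Fred")])
conn.executemany("INSERT INTO client_tickets VALUES (?, ?)",
                 [(1, 3), (1, 7), (1, 1), (2, 5), (2, 2), (2, 3), (3, 8)])

# Every client appears once; clients without tickets get 0 and 0.
summary = conn.execute("""
SELECT clients.client_id, clients.client_name,
       COUNT(tickets.ticket_id)               AS amount_of_tickets,
       COALESCE(SUM(tickets.ticket_price), 0) AS total_price
FROM clients
LEFT JOIN client_tickets AS ct ON ct.client_id = clients.client_id
LEFT JOIN tickets ON ct.ticket_id = tickets.ticket_id
GROUP BY clients.client_id
ORDER BY clients.client_id
""").fetchall()

for row in summary:
    print(row)
```

Fred (client #9) still shows up in the result because both joins are LEFT JOINs; with INNER JOINs his row would disappear.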
Put all the juicy info together (owner-less tickets are left out)
SELECT clients.*
, COALESCE(COUNT(sub.ticket_id), 0) AS amount_of_tickets
, COALESCE(SUM(sub.ticket_price), 0.00) AS total_price
, JSON_ARRAYAGG(sub.js_tickets_row) AS js_tickets_rows
FROM clients
LEFT JOIN client_tickets AS ct ON ct.client_id = clients.client_id
LEFT JOIN (
SELECT tickets.*
, JSON_OBJECT( 'ticket_id', tickets.ticket_id
, 'ticket_name', tickets.ticket_name
, 'ticket_price', tickets.ticket_price
) AS js_tickets_row
FROM tickets
) AS sub ON ct.ticket_id = sub.ticket_id
GROUP BY clients.client_id
ORDER BY clients.client_id ASC ;
-- sidenote: output column `js_tickets_rows` (a json array) may contain NULL values
A list of all tickets with some aggregate data
SELECT tickets.*
, IF(COALESCE(COUNT(clients.client_id), 0) > 0
, TRUE, FALSE) AS active
, COALESCE( COUNT(clients.client_id), 0) AS amount_of_clients
, IF(COALESCE( COUNT(clients.client_id), 0) > 0
, GROUP_CONCAT(clients.client_name SEPARATOR ', ')
, NULL) AS client_names
FROM tickets
LEFT JOIN client_tickets AS ct ON ct.ticket_id = tickets.ticket_id
LEFT JOIN clients ON ct.client_id = clients.client_id
GROUP BY tickets.ticket_id
ORDER BY tickets.ticket_id ASC ;

What is the best way to query this Department-Employee table to get the department which have the exact employees?

I am using MySQL 5.6. I have three tables
Employee(id, ..... PRIMARY KEY (id))
Department(id, ...., PRIMARY KEY (id))
Department_Employee(d_id, e_id,
FOREIGN KEY(d_ID) REFERENCES Department(id),
FOREIGN KEY(e_id) REFERENCES Employee(id)
ADD CONSTRAINT PK_D_E_Mapping PRIMARY KEY (e_id, d_id) )
Department and Employee have a many-many relationship.
Let's say I'm given a list of Employee Ids(1, 2, 3) and I need to query the Department_employee table to get the Department_id which has only these 3 employees and no one else.
This is what I've managed to come up with so far.
SELECT d_id
FROM
( SELECT d_id
from Department_employee
where e_id in (1, 2, 3)
) AS t
GROUP
BY d_id HAVING COUNT(*) = 3;
I feel like there is definitely a better way to do this.
How can this query be improved?
You can group by department id and use GROUP_CONCAT() to set the condition in the HAVING clause:
SELECT d_id
FROM Department_employee
GROUP BY d_id
HAVING GROUP_CONCAT(e_id ORDER BY e_id) = '1,2,3' -- the ids in ascending order
Or:
SELECT d_id
FROM Department_employee
GROUP BY d_id
HAVING COUNT(*) = 3 AND SUM(e_id NOT IN (1, 2, 3)) = 0
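The second query generalises to any list of employee ids. The sketch below wraps it in a parameterised helper, written in Python against an in-memory SQLite database as a stand-in for MySQL 5.6 (an assumption). The function name departments_with_exactly and the sample rows are hypothetical and exist only to exercise the query.

```python
import sqlite3


def departments_with_exactly(conn, employee_ids):
    """Return d_ids whose employees are exactly the given ids (hypothetical helper)."""
    ids = sorted(set(employee_ids))
    if not ids:
        return []
    marks = ", ".join("?" for _ in ids)
    sql = (
        "SELECT d_id FROM Department_employee "
        "GROUP BY d_id "
        # Right size, and no member outside the requested list.
        f"HAVING COUNT(*) = ? AND SUM(e_id NOT IN ({marks})) = 0 "
        "ORDER BY d_id"
    )
    return [row[0] for row in conn.execute(sql, [len(ids), *ids])]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Department_employee (d_id INTEGER, e_id INTEGER, PRIMARY KEY (e_id, d_id))"
)
# Made-up data: dept 10 has exactly 1,2,3; dept 20 has an extra employee; dept 30 is missing one.
conn.executemany("INSERT INTO Department_employee VALUES (?, ?)", [
    (10, 1), (10, 2), (10, 3),
    (20, 1), (20, 2), (20, 3), (20, 4),
    (30, 1), (30, 2),
])

print(departments_with_exactly(conn, [1, 2, 3]))
```

The COUNT(*) check rejects departments that are missing someone, and the SUM check rejects departments with extra members; both are needed for an exact match.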

Group by whether or not the cell exists in another table as well

My SQL syntax is MariaDB (MySQL)
I have a table with organisation spokespersons, a table with VIP organisations, and a table with presentations. How do I group or sort by whether the spokesperson's organisation is VIP, so that VIP organisation spokespersons show up on top when retrieving all presentations?
Table presentations: int presentation_id, int person_id, varchar title, date date
Table persons: int person_id, varchar name, varchar function, varchar organisation
Table VIP_orgs: int org_id, varchar org_name
Schema, sample data and the query that doesn't work:
CREATE TABLE persons (
person_id INT AUTO_INCREMENT,
name VARCHAR(64),
organisation VARCHAR(64),
PRIMARY KEY (person_id)
);
INSERT INTO `persons` (name, organisation) VALUES
("Guy Fieri", "VIP-org"),
("Fiona", "VIP inc."),
("Mr. Robot", "Evil Corp"),
("Marcus Antonius", "Rome"),
("Cicero", "Rome"),
("Shrek", "VIP inc.");
CREATE TABLE presentations (
presentation_id INT AUTO_INCREMENT,
person_id INT,
PRIMARY KEY (presentation_id)
);
INSERT INTO `presentations` (person_id) VALUES
(1),(1),(1),(1), -- guy fieri has 4
(2),
(3),(3),(3),(3),(3),
(4),(4),(4),(4),
(5),(5),(5),
(6),(6),(6),(6);
CREATE TABLE VIP_orgs (
org_id INT AUTO_INCREMENT,
org_name VARCHAR(64),
PRIMARY KEY (org_id)
);
INSERT INTO `VIP_orgs` (org_name) VALUES
("VIP-org"),
("VIP inc.");
SELECT organisation, COUNT(*) AS count
FROM `presentations`
JOIN `persons` ON `presentations`.person_id = `persons`.person_id
GROUP BY (SELECT org_name FROM `VIP_orgs` WHERE `VIP_orgs`.org_name = organisation), organisation
ORDER BY count DESC;
What I expect it to do:
Return a table with org_name and the total combined number of presentations by all spokespersons of that org.
Sorted by count of presentations, grouped by organisation, VIP organisations grouped on top.
The VIP and non-VIP parts should be sorted by count independently. The returned table should thus look something like this:
name count
VIP inc. 5
VIP-org 4
Rome 7
Evil Corp 5
The query works 50%: it counts all presentations and sorts it, but it doesn't seem to group by VIP organizations. In actuality the returned table looks like this:
name count
Rome 7
VIP inc. 5
Evil Corp 5
VIP-org 4
The schema doesn't look right. I would suggest creating an organisations table with a vip BOOLEAN column and referencing it from the persons table with a foreign key. Make the following changes in the schema:
CREATE TABLE `organisations` (
organisation_id INT AUTO_INCREMENT,
name VARCHAR(64),
vip BOOLEAN,
PRIMARY KEY (organisation_id)
);
INSERT INTO `organisations` (name, vip) VALUES
("VIP-org", True),
("VIP inc.", True),
("Evil Corp", False),
("Rome", False);
CREATE TABLE persons (
person_id INT AUTO_INCREMENT,
name VARCHAR(64),
organisation_id INT,
PRIMARY KEY (person_id),
FOREIGN KEY (organisation_id) REFERENCES `organisations`(organisation_id)
);
INSERT INTO `persons` (name, organisation_id) VALUES
("Guy Fieri", 1),
("Fiona", 2),
("Mr. Robot", 3),
("Marcus Antonius", 4),
("Cicero", 4),
("Shrek", 2);
Now the query would look something like this:
SELECT `organisations`.name as organisation, COUNT(*) AS count
FROM `presentations`
JOIN `persons` ON `presentations`.person_id = `persons`.person_id
JOIN `organisations` ON `organisations`.organisation_id = `persons`.organisation_id
GROUP BY `organisations`.organisation_id
ORDER BY `organisations`.vip DESC, count DESC;
Output:
+--------------+------------+
| organisation | count |
+--------------+------------+
| VIP inc. | 5 |
| VIP-org | 4 |
| Rome | 7 |
| Evil Corp | 5 |
+--------------+------------+
You can see the result here: db <> fiddle
Instead of grouping, I needed to sort. D'oh!
Edit: this doesn't quite work. It does not sort by count. If I put the ORDER BY count clause first, it puts all vip orgs on the bottom.
Edit 2: using EXISTS, it seems to work
SELECT organisation, COUNT(*) AS count
FROM `presentations`
JOIN `persons` ON `presentations`.person_id = `persons`.person_id
GROUP BY organisation
ORDER BY EXISTS (SELECT org_name FROM `VIP_orgs` WHERE `VIP_orgs`.org_name = organisation) DESC, count DESC;
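The ordering used here is a two-level sort key: the VIP flag first, the count second. The sketch below shows the same idea outside the database, in Python and purely as an illustration, using the per-organisation counts from the question.

```python
# VIP organisations and per-organisation presentation counts from the question.
vip_orgs = {"VIP-org", "VIP inc."}
presentation_counts = {"Rome": 7, "VIP inc.": 5, "Evil Corp": 5, "VIP-org": 4}

# Key part 1: False (VIP) sorts before True (non-VIP).
# Key part 2: negated count gives descending order within each part.
ranked = sorted(
    presentation_counts.items(),
    key=lambda item: (item[0] not in vip_orgs, -item[1]),
)

for organisation, count in ranked:
    print(organisation, count)
```

Putting the count first in the key, as in the earlier attempt, lets the count decide the order and leaves the VIP flag as a mere tie-breaker, which is why the VIP organisations did not end up on top.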

MySQL Query using a Foreign Key to return Names in Ordered Format

I want to fetch records from a combination of two tables so that the result is sorted, via an ORDER BY clause, on a column that is reached through a foreign key.
I have two tables named users and orders.
Table: users
id name
1 John
2 Doe
Table: orders
id user_id for
1 2 cake
2 1 shake
2 2 milk
In table:orders, user_id is foreign key representing id in table:users.
Question:
I want to extract the records from table:orders but the ORDER should be based on name of users.
Desired Results:
user_id for
2 cake
2 milk
1 shake
Note: Here user_id with 2 is showing before user id with 1. This is because the name Doe should be shown before the name John because of order by.
What I have done right now:
I have no idea about MySQL joins. Searching the internet, I did not find a way to achieve this. I have written a query, but it does not fetch the records the way I want, and I have no idea what to change to make it work.
SELECT * FROM orders ORDER BY user_id
It fetches the records ordered by user_id, but not by name.
You are right: joining both tables is the simplest way to achieve that, and you can show the names as well, since you have them joined anyway.
CREATE TABLE orders (
`id` INTEGER,
`user_id` INTEGER,
`for` VARCHAR(5)
);
INSERT INTO orders
(`id`, `user_id`, `for`)
VALUES
('1', '2', 'cake'),
('2', '1', 'shake'),
('2', '2', 'milk');
CREATE TABLE users (
`id` INTEGER,
`name` VARCHAR(4)
);
INSERT INTO users
(`id`, `name`)
VALUES
('1', 'John'),
('2', 'Doe');
SELECT o.`user_id`, o.`for` FROM orders o INNER JOIN users u ON u.id = o.user_id ORDER BY u.name
user_id | for
------: | :----
2 | cake
2 | milk
1 | shake
db<>fiddle here
You can get your desired results by joining the orders table and the users table, using the query below.
SELECT o.user_id, o.`for` FROM orders o, users u WHERE o.user_id = u.id ORDER BY u.name;
The WHERE condition matches the rows where user_id in the orders table equals id in the users table. ORDER BY on the name column of the users table sorts the rows in ascending order of name. Only the user_id and for columns of the orders table are shown in the final result.
The columns are qualified with table aliases because both tables have an id column, so an unqualified id would be ambiguous. for is a reserved word in MySQL and has to be quoted with backticks.

Merge Duplicate Rows in MySQL

I have a database like this:
users
id name email phone
1 bill bill#fakeemail.com
2 bill bill#fakeemail.com 123456789
3 susan susan#fakeemail.com
4 john john#fakeemail.com 123456789
5 john john#fakeemail.com 987654321
I want to merge records considered duplicates based on the email field.
Trying to figure out how to use the following considerations.
Merge based on duplicate email
If one row has a null value use the row that has the most data.
If 2 rows are duplicates but other fields are different then use the one
with the highest id number (see the john#fakeemail.com row for an example.)
Here is a query I tried:
DELETE FROM users WHERE users.id NOT IN
(SELECT grouped.id FROM (SELECT DISTINCT ON (email) * FROM users) AS grouped)
Getting a syntax error.
I'm trying to get the database to transform to this, I can't figure out the correct query:
users
id name email phone
2 bill bill#fakeemail.com 123456789
3 susan susan#fakeemail.com
5 john john#fakeemail.com 987654321
Here is one option using a DELETE with a subquery:
DELETE
FROM users
WHERE id NOT IN (SELECT id
                 FROM (
                     -- Highest id that has a phone; otherwise the highest id
                     SELECT COALESCE(MAX(CASE WHEN phone IS NOT NULL THEN id END), MAX(id)) AS id
                     FROM users
                     GROUP BY email) t);
The logic of this delete is as follows:
Emails where there is only one record are not deleted
For emails with two or more records, we delete everything except for the record having the highest id value, where the phone is also defined. If none of the records has a phone, the COALESCE falls back to the highest id, which also keeps NULL out of the NOT IN list (a NULL there would stop the DELETE from matching any row).
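The keep/delete rule can be checked against the sample rows from the question. The sketch below does that in Python with an in-memory SQLite database standing in for MySQL (an assumption). It uses COALESCE to fall back to the highest id when no row for an email has a phone. SQLite allows the subquery to read the table being deleted from directly, whereas MySQL needs the extra derived table (the `t` alias) shown in the answer.

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, phone TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", [
    (1, "bill", "bill#fakeemail.com", None),
    (2, "bill", "bill#fakeemail.com", "123456789"),
    (3, "susan", "susan#fakeemail.com", None),
    (4, "john", "john#fakeemail.com", "123456789"),
    (5, "john", "john#fakeemail.com", "987654321"),
])

# Keep one id per email: the highest id with a phone, else the highest id.
conn.execute("""
DELETE FROM users
WHERE id NOT IN (
    SELECT COALESCE(MAX(CASE WHEN phone IS NOT NULL THEN id END), MAX(id))
    FROM users
    GROUP BY email
)
""")

remaining = conn.execute("SELECT id, name, phone FROM users ORDER BY id").fetchall()
for row in remaining:
    print(row)
```

Note that this keeps whole rows and does not merge fields across duplicates; if the newest row lacks a value that an older row has, that value is lost.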
Here's a solution that will give you the latest data for each field for each user in the result table, thus meeting your second criterion as well as the first and third. It will work for as many duplicates as you have, subject to the group_concat_max_len condition on GROUP_CONCAT. It uses GROUP_CONCAT to prepare a list of all values of a field for each user, sorted so that the most recent value is first. SUBSTRING_INDEX is then used to extract the first value in that list, which is the most recent. This solution uses a CREATE TABLE ... SELECT command to make a new users table, then DROPs the old one and renames the new table to users.
CREATE TABLE users
(`id` int, `name` varchar(5), `email` varchar(19), `phone` int)
;
INSERT INTO users
(`id`, `name`, `email`, `phone`)
VALUES
(1, 'bill', 'bill#fakeemail.com', 123456789),
(2, 'bill', 'bill#fakeemail.com', NULL),
(3, 'susan', 'susan#fakeemail.com', NULL),
(4, 'john', 'john#fakeemail.com', 123456789),
(5, 'john', 'john#fakeemail.com', 987654321)
;
CREATE TABLE newusers AS
SELECT id
, SUBSTRING_INDEX(names, ',', 1) AS name
, email
, SUBSTRING_INDEX(phones, ',', 1) AS phone
FROM (SELECT id
, GROUP_CONCAT(name ORDER BY id DESC) AS names
, email
, GROUP_CONCAT(phone ORDER BY id DESC) AS phones
FROM users
GROUP BY email) u;
DROP TABLE users;
RENAME TABLE newusers TO users;
SELECT * FROM users
Output:
id name email phone
1 bill bill#fakeemail.com 123456789
4 john john#fakeemail.com 987654321
3 susan susan#fakeemail.com (null)
Demo on SQLFiddle