Join on a string with individual values inside commas [duplicate] - mysql

This question already has answers here:
Is storing a delimited list in a database column really that bad?
I have a table (called the toy table) that lists, for each toy type, the sales IDs processed, separated by commas. Note that my underlying schema does not store the data this way; I aggregate it at the toy level to generate a report.
|------------------------|------------------------|
| Toy | IDs |
|------------------------|------------------------|
| Buzz Lightyear | 22,33,44 |
| Woody | 24,41 |
|------------------------|------------------------|
I have another table (called status table) which has status for each order ID.
|------------------------|------------------------|
| ID | Status |
|------------------------|------------------------|
| 22 | running |
| 33 | paused |
| 44 | running |
| 24 | cancelled |
| 41 | finished |
|------------------------|------------------------|
I want to make a table (through a join) in which I can have a column for status separated by commas, in the right order of sales IDs, and also a column for running IDs only. It would look like this:
|------------------------|------------------------|--------------------------|------------------------|
| Toy | IDs | Status | Running IDs |
|------------------------|------------------------|--------------------------|------------------------|
| Buzz Lightyear | 22,33,44 | running, paused, running | 22,44 |
| Woody | 24,41 | cancelled, finished | |
|------------------------|------------------------|--------------------------|------------------------|
I don't know how to write the join. I've tried finding the position of each ID with FIND_IN_SET(), but I can't seem to progress after that. I have to report at the toy level, so the final dataset cannot contain more than one row per toy.

You need to do your JOINs before you do the aggregation. Without seeing your table structures it's hard to be 100% certain, but the query you want should look something like this:
SELECT t.Toy,
       GROUP_CONCAT(o.ID ORDER BY o.ID) AS IDs,
       GROUP_CONCAT(s.Status ORDER BY s.ID) AS Status,
       GROUP_CONCAT(CASE WHEN s.Status = 'running' THEN s.ID END ORDER BY s.ID) AS `Running IDs`
FROM toys t
JOIN orders o ON o.Toy_ID = t.ID
JOIN status s ON s.ID = o.ID
GROUP BY t.Toy
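A minimal sketch of the join-first idea, using Python's built-in sqlite3 module as a stand-in for MySQL so it runs without a server. The table and column names (toys, orders, status, Toy_ID) are the answer's guesses about the schema, not confirmed by the question. Because older SQLite versions do not accept ORDER BY inside group_concat, this sketch orders the joined rows in SQL and builds the comma-separated columns in Python; in MySQL the GROUP_CONCAT query above does the same thing in one statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE toys   (ID INTEGER PRIMARY KEY, Toy TEXT);
    CREATE TABLE orders (ID INTEGER PRIMARY KEY, Toy_ID INTEGER);
    CREATE TABLE status (ID INTEGER PRIMARY KEY, Status TEXT);
    INSERT INTO toys   VALUES (1, 'Buzz Lightyear'), (2, 'Woody');
    INSERT INTO orders VALUES (22, 1), (33, 1), (44, 1), (24, 2), (41, 2);
    INSERT INTO status VALUES (22, 'running'), (33, 'paused'), (44, 'running'),
                              (24, 'cancelled'), (41, 'finished');
""")

# Step 1: join first. One row per (toy, order, status), sorted by toy and order ID.
rows = conn.execute("""
    SELECT t.Toy, o.ID, s.Status
    FROM toys t
    JOIN orders o ON o.Toy_ID = t.ID
    JOIN status s ON s.ID = o.ID
    ORDER BY t.Toy, o.ID
""").fetchall()

# Step 2: aggregate afterwards. Collect the three lists per toy.
collected = {}
for toy, order_id, status in rows:
    entry = collected.setdefault(toy, {"IDs": [], "Status": [], "Running IDs": []})
    entry["IDs"].append(str(order_id))
    entry["Status"].append(status)
    if status == "running":
        entry["Running IDs"].append(str(order_id))

# Turn each list into a comma-separated string, one report row per toy.
report = {toy: {col: ",".join(vals) for col, vals in cols.items()}
          for toy, cols in collected.items()}

for toy, cols in report.items():
    print(toy, cols)
```

Note that a toy with no running orders gets an empty string here, whereas MySQL's GROUP_CONCAT would return NULL for that column.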

Related

Using nested SELECT result for IN statement of another nested SELECT

Be gentle. I'm a high school principal coding on the side for our school site.
I have looked at answers, here, here, and here. I might just not know enough to ask the right question.
We have events that have multiple sessions and there are workshops that can be associated with multiple sessions in multiple events.
I'm trying to get a csv result, later to be put into an array, for the associated sessions and events for my Workshops.
The query below works without the second nested Select statement.
In the Alt_Events statement, I need to pull the Event_IDs that are associated with the Session_IDs that are pulled from the first nested Select.
Events
ID | Name | Description
1 | Flex Learning | A day of flexible learning.
2 | Moonshot Expo | A day to join partners to solve problems.
Event_Sessions
ID | Event_ID | Name | Description
1 | 1 | Morning Session | The first session of the day.
2 | 1 | Afternoon Session | The afternoon session.
3 | 1 | Tutoring Session | A chance to get help from teachers.
4 | 2 | Partner Field Trip | The first session of the day.
5 | 2 | Brainstorming Session | The afternoon session.
6 | 2 | Tutoring Session | A chance to get help from teachers.
Event_Workshops
ID | Name | Description
1 | Math Tutorial | Get help from your math teachers.
Event_Workshop_Links
ID | Workshop_ID | Session_ID
1 | 1 | 3
2 | 1 | 6
Output Table:
ID | Name | Description | ... | Alt_Sessions | Alt_Events
1 | Math Tutorial | Get help... | ... | 3,6 | 1,2
Here is my query.
SELECT
ws.ID, ws.Name, ws.Description, ws.Location, ws.Owner_ID, ws.Max_Attendees,
ws.Eng_Major_Allowed, ws.Eng_Minor_Allowed,
ws.HC_Major_Allowed, ws.HC_Minor_Allowed,
ws.IT_Major_Allowed, ws.IT_Minor_Allowed,
u.LastName as Owner_LastName, u.FirstName AS Owner_FirstName, u.Email AS Owner_Email,
(SELECT group_concat(SESSION_ID) FROM Events_Workshops_Links WHERE Workshop_ID = ws.ID) AS Alt_Sessions,
(SELECT group_concat(Event_ID) FROM Event_Sessions WHERE Session_ID IN Alt_Sessions) AS Alt_Events
FROM Event_Workshops as ws
LEFT JOIN users AS u
ON ws.Owner_ID = u.ID
WHERE ws.ID = ?
ORDER BY ws.Name
I need to be able to pull the all event_ids that are in the Alt_Sessions result.
I'm guessing I can't use the result of the first nested query in the second nested query. If that's the problem, how can I pull that list of event ids?
Any and all help is greatly appreciated.
(Updated to show expected output. Also fixed one error in transcribing the query: Session_ID instead of Event_ID in the second nested statement.)
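Alt_Sessions is a single comma-separated string, not a list of values, so it cannot be used with IN. Use the subquery itself (without group_concat) in the IN predicate, as below.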
(SELECT group_concat(SESSION_ID) FROM Events_Workshops_Links WHERE Workshop_ID = ws.ID) AS Alt_Sessions,
(SELECT group_concat(Event_ID) FROM Event_Sessions WHERE ID IN (SELECT SESSION_ID FROM Events_Workshops_Links WHERE Workshop_ID = ws.ID)) AS Alt_Events
Alternatively, aggregate Alt_Sessions and Alt_Events per workshop in a derived table first, and then join that to Event_Workshops. The event IDs live in Event_Sessions, so the derived table has to join to it.
SELECT * FROM Event_Workshops ws
JOIN
(
SELECT
wsl.Workshop_ID,
GROUP_CONCAT(wsl.Session_ID) Alt_Sessions,
GROUP_CONCAT(DISTINCT es.Event_ID) Alt_Events
FROM Event_Workshop_Links wsl
JOIN Event_Sessions es ON es.ID = wsl.Session_ID
GROUP BY wsl.Workshop_ID
) w
ON ws.ID = w.Workshop_ID
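To check the session-to-event lookup against the sample rows from the question, here is a small sketch using Python's built-in sqlite3 module in place of MySQL. It assumes the table layout shown in the question (Event_Sessions.ID is the session ID, Event_Workshop_Links.Session_ID points at it) and trims the tables to the columns needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Event_Sessions (ID INTEGER PRIMARY KEY, Event_ID INTEGER, Name TEXT);
    CREATE TABLE Event_Workshops (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Event_Workshop_Links (ID INTEGER PRIMARY KEY, Workshop_ID INTEGER, Session_ID INTEGER);
    INSERT INTO Event_Sessions VALUES
        (1, 1, 'Morning Session'), (2, 1, 'Afternoon Session'), (3, 1, 'Tutoring Session'),
        (4, 2, 'Partner Field Trip'), (5, 2, 'Brainstorming Session'), (6, 2, 'Tutoring Session');
    INSERT INTO Event_Workshops VALUES (1, 'Math Tutorial');
    INSERT INTO Event_Workshop_Links VALUES (1, 1, 3), (2, 1, 6);
""")

# Follow workshop -> link -> session -> event, then aggregate per workshop.
# LEFT JOINs keep a workshop in the result even if it has no linked sessions.
QUERY = """
    SELECT ws.ID, ws.Name,
           GROUP_CONCAT(DISTINCT l.Session_ID) AS Alt_Sessions,
           GROUP_CONCAT(DISTINCT es.Event_ID)  AS Alt_Events
    FROM Event_Workshops ws
    LEFT JOIN Event_Workshop_Links l ON l.Workshop_ID = ws.ID
    LEFT JOIN Event_Sessions es      ON es.ID = l.Session_ID
    WHERE ws.ID = ?
    GROUP BY ws.ID, ws.Name
"""

workshop_id, name, alt_sessions, alt_events = conn.execute(QUERY, (1,)).fetchone()

# Split the CSV strings into sorted lists; concatenation order is not guaranteed.
session_ids = sorted(int(x) for x in alt_sessions.split(","))
event_ids = sorted(int(x) for x in alt_events.split(","))
print(name, session_ids, event_ids)
```

Since the result is later turned into an array anyway, splitting the strings on the client side, as in the last lines, avoids depending on the concatenation order.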

MYSQL : Group by all weeks of a year with 0 included

I have a question about some mysql code.
I have a table that records employees with their date of arrival and the project ID. I want to count all the arrivals in the company and group them by week.
At the moment, I get this result:
Project ID | Week | Count
1 | 2019-S01 | 2
1 | 2019-S03 | 1
2 | 2019-S01 | 1
2 | 2019-S04 | 5
2 | 2019-S05 | 3
2 | 2019-S06 | 2
This is good, but I would like to have all the weeks returned, even if a week has 0 as result :
Project ID | Week | Count
1 | 2019-S01 | 2
1 | 2019-S02 | 0
1 | 2019-S03 | 1
...
2 | 2019-S01 | 1
2 | 2019-S02 | 0
2 | 2019-S03 | 0
2 | 2019-S04 | 5
2 | 2019-S05 | 3
2 | 2019-S06 | 2
...
Here is my current code:
SELECT
AP.SECTION_ANALYTIQUE AS SECTION,
FS_GET_FORMAT_SEMAINE(AP.DATE_ARRIVEE_PROJET) AS SEMAINE,
Count(*) AS COMPTE
FROM
RT00_AFFECTATIONS_PREV AP
WHERE
(AP.DATE_ARRIVEE_PROJET <= CURDATE() AND Year(AP.DATE_ARRIVEE_PROJET) >= Year(CURDATE()))
GROUP BY
SECTION, SEMAINE
ORDER BY
SECTION
Does anybody have a solution ?
I searched things on internet but didn't find anything accurate :(
Thank you in advance ! :)
The classic way to meet this requirement is to create a referential table to store all possible weeks.
create table all_weeks(week varchar(8) primary key);
insert into all_weeks values
('2019-S01'), ('2019-S02'), ('2019-S03'), ('2019-S04'), ('2019-S05'), ('2019-S06');
Once this is done, you can generate a cartesian product of all possible sections and weeks with a CROSS JOIN, and LEFT JOIN that with the original table.
Given your code snippet, this should look like:
SELECT
s.section_analytique AS section,
w.week AS semaine,
COUNT(ap.section_analytique) AS compte
FROM
(SELECT DISTINCT section_analytique from rt00_affectations_prev) s
CROSS JOIN all_weeks w
LEFT JOIN rt00_affectations_prev ap
ON s.section_analytique = ap.section_analytique AND w.week = FS_GET_FORMAT_SEMAINE(ap.date_arrivee_projet)
GROUP BY s.section_analytique, w.week
ORDER BY s.section_analytique
PS: be careful not to put conditions on the original table in the WHERE clause: this would defeat the purpose of the LEFT JOIN. If you need to do some filtering, use the referential table instead (you might need to add a few columns to it, like the starting date of the week maybe).
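The CROSS JOIN plus LEFT JOIN pattern can be tried on a small dataset with Python's built-in sqlite3 module. This is a sketch under simplifying assumptions: the arrivals table and its pre-computed week column are hypothetical stand-ins for rt00_affectations_prev and FS_GET_FORMAT_SEMAINE(), and only three weeks are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reference table holding every week that must appear in the report.
    CREATE TABLE all_weeks (week TEXT PRIMARY KEY);
    INSERT INTO all_weeks VALUES ('2019-S01'), ('2019-S02'), ('2019-S03');

    -- Hypothetical simplified fact table: one row per arrival, week label precomputed.
    CREATE TABLE arrivals (section INTEGER, week TEXT);
    INSERT INTO arrivals VALUES (1, '2019-S01'), (1, '2019-S01'), (1, '2019-S03'),
                                (2, '2019-S01');
""")

rows = conn.execute("""
    SELECT s.section, w.week, COUNT(a.section) AS compte
    FROM (SELECT DISTINCT section FROM arrivals) s
    CROSS JOIN all_weeks w                       -- every (section, week) pair
    LEFT JOIN arrivals a
           ON a.section = s.section AND a.week = w.week
    GROUP BY s.section, w.week
    ORDER BY s.section, w.week
""").fetchall()

for section, week, compte in rows:
    print(section, week, compte)
```

COUNT(a.section) counts only rows where the LEFT JOIN found a match, which is what produces 0 for empty weeks; COUNT(*) would report 1 for them instead.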

Left Join takes very long time on 150 000 rows

I am having some difficulties to accomplish a task.
Here is some data from orders table:
+----+---------+
| id | bill_id |
+----+---------+
| 3 | 1 |
| 9 | 3 |
| 10 | 4 |
| 15 | 6 |
+----+---------+
And here is some data from a bills table:
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
+----+
I want to list all the bills that have no order associated with them.
To achieve that, I thought a LEFT JOIN was appropriate, so I wrote this query:
SELECT * FROM bills
LEFT JOIN orders
ON bills.id = orders.bill_id
WHERE orders.bill_id IS NULL;
I thought that I would have the following result:
+----------+-----------+----------------+
| bills.id | orders.id | orders.bill_id |
+----------+-----------+----------------+
| 2 | NULL | NULL |
| 5 | NULL | NULL |
+----------+-----------+----------------+
But the query never finishes: it ran for more than 5 minutes without returning a result, so I stopped it, since that would be unacceptable in production anyway.
My real dataset has more than 150 000 orders and 100 000 bills. Is the dataset too big?
Is my request wrong somewhere?
Thank you very much for your tips!
EDIT: side note, the tables have no foreign keys defined... *flies away*
Your query is fine. I would use table aliases in writing it:
SELECT b.*
FROM bills b LEFT JOIN
orders o
ON b.id = o.bill_id
WHERE o.bill_id IS NULL;
You don't need the NULL columns from orders, probably.
You need an index on orders(bill_id):
create index idx_orders_billid on orders(bill_id);
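As a quick check of the anti-join on the sample rows, here is a sketch using Python's built-in sqlite3 module rather than MySQL. It also shows NOT EXISTS, an equivalent formulation that is not part of the answer above; both return the bills with no order. Timing on a real 150 000-row table is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bills  (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, bill_id INTEGER);
    INSERT INTO bills  VALUES (1), (2), (3), (4), (5), (6);
    INSERT INTO orders VALUES (3, 1), (9, 3), (10, 4), (15, 6);

    -- The index lets each bill be looked up in orders without scanning the table.
    CREATE INDEX idx_orders_billid ON orders(bill_id);
""")

# Anti-join written as LEFT JOIN ... IS NULL (the query from the question).
left_join_ids = [row[0] for row in conn.execute("""
    SELECT b.id
    FROM bills b
    LEFT JOIN orders o ON b.id = o.bill_id
    WHERE o.bill_id IS NULL
    ORDER BY b.id
""")]

# The same anti-join written with NOT EXISTS.
not_exists_ids = [row[0] for row in conn.execute("""
    SELECT b.id
    FROM bills b
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.bill_id = b.id)
    ORDER BY b.id
""")]

print(left_join_ids, not_exists_ids)
```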
From your WHERE clause, I assume you're looking for orders that have no bills.
If that's the case, you don't need to join to the bills table at all, since the matching bills would by definition not exist.
You will find
SELECT * FROM orders
WHERE orders.bill_id IS NULL;
to be a much better-performing query.
Edit:
Sorry, I missed your "I want to list all the bills that have no order associated with" when reading the question. As @gordon pointed out, an index would certainly help. However, if changing the schema is feasible, I would rather have a nullable bills.order_id column instead of orders.bill_id: you would not need a left join, and an inner join would suffice to get the bills that have orders, which should be a quicker query for your other assumed requirements.

MySQL query to select rows from table 2 if *all* rows from table 1 are not present

I'm doing a kind of point-of-sale system whose MySQL database has (among other things) a table with items for sale, a table with sales, and a table with purchases (a purchase being my ad-hoc notation for any single item bought in a sale; if the same person buys three items at once, for example, that's one sale consisting of three purchases). All these tables have logical IDs, viz. item_id, sale_id, purchase_id, and are easily joined with simple pivotal tables.
I am now trying to add a discount feature; basically your garden-variety supermarket discount: buy these particular items and pay X instead of paying the full sum of the regular item prices. These 'package deals' have their own table and are linked to the items table with a simple pivotal table containing deal_id and item_id.
My problem is getting to the point of figuring out when this is to be applied. To give some example data:
items
+---------+--------+---------+
| item_id | title | price |
+---------+--------+---------+
| 12 | Shoe | 10 |
| 76 | Coat | 23 |
| 82 | Whip | 19 |
+---------+--------+---------+
sales
+---------+-----------+
| sale_id | timestamp |
+---------+-----------+
| 2973 | 144995839 |
| 3092 | 144996173 |
+---------+-----------+
purchases
+-------------+-------------+---------+----------+---------+
| purchase_id | no_of_items | item_id | at_price | sale_id |
+-------------+-------------+---------+----------+---------+
| 12993 | 1 | 12 | 10 | 2973 |
| 12994 | 1 | 76 | 23 | 2973 |
| 12996 | 1 | 82 | 19 | 2973 |
| 13053 | 1 | 12 | 10 | 3092 |
| 13054 | 1 | 82 | 19 | 3092 |
+-------------+-------------+---------+----------+---------+
package_deals
+---------+-------+
| deal_id | price |
+---------+-------+
| 1 | 40 |
+---------+-------+
deals_items
+---------+---------+
| deal_id | item_id |
+---------+---------+
| 1 | 12 |
| 1 | 76 |
| 1 | 82 |
+---------+---------+
As is hopefully obvious from that, we have a shoe that costs $10 (let's just assume we use dollars as our currency here, it doesn't matter), a coat that costs $23, and a whip that costs $19. We also have a package deal: if you buy a shoe, a coat, and a whip, you get the whole thing for $40 altogether.
Of the two sales given, one (2973) has purchased all three things and will get the discount, while the other (3092) has purchased only the shoe and the whip and won't get the discount.
In order to find out whether or not to apply the package-deal discount, I of course have to find out whether all the item_ids in a package deal are present in the purchases table for a given sale_id.
How do I do this?
I thought I should be able to do something like this:
SELECT deal_id, item_id, purchase_id
FROM package_deals
LEFT JOIN deals_items
USING (deal_id)
LEFT JOIN purchases
USING (item_id)
WHERE
sale_id = 2973
AND item_id IS NULL
GROUP BY deal_id
In my head, that retrieved all rows from the package_deals table where at least one of the item_ids associated with the package deal in question does not have a corresponding match in the purchases table for the sale_id given. This would then have told me which packages don't apply; i.e., it would return zero rows for sale 2973 (since none of the items associated with package deal 1 are absent from the purchases table filtered on sale_id = 2973) and one row for sale 3092 (since one of the items associated with package deal 1, namely the coat, item_id 76, is absent from the purchases table filtered on sale_id = 3092).
Obviously, it doesn't do what I naïvely thought it would—rather, it just always returns zero rows, no matter what.
It doesn't really matter much to me whether the resulting set gives me one row for each package deal that should apply, or one for each package deal that shouldn't apply—but how do I get it to show me either in a single query?
Is it even possible?
The problem with your query above is that sale_id is also NULL in the missing row that you're interested in, due to the LEFT JOIN.
This query will return the deal_id for any deals that DO NOT apply to a given order:
SELECT DISTINCT
pd.deal_id
FROM package_deals pd
JOIN deals_items di on pd.deal_id = di.deal_id
WHERE di.item_id NOT IN (SELECT item_id FROM purchases WHERE sale_id = 3092)
From that it's easy to work out the ones that do apply. Note that for a fully functioning system, you'd still need to take the purchase quantities into account, e.g. if the customer had bought 2 of two of the items in the deal, but only 1 of the third.
A SQL fiddle demonstrating the query is here: http://sqlfiddle.com/#!9/f2ae4/8
Note that I've written my joins using the ON syntax, as I'm simply more familiar with it than with USING. I expect USING would work too if you prefer it.
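One way to work out the deals that do apply is a double negative: keep a deal only if none of its items is missing from the sale. Below is a sketch of that, run against the question's sample rows with Python's built-in sqlite3 module instead of MySQL. The helper name applicable_deals is made up for illustration, and quantities are ignored, as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE package_deals (deal_id INTEGER PRIMARY KEY, price INTEGER);
    CREATE TABLE deals_items   (deal_id INTEGER, item_id INTEGER);
    CREATE TABLE purchases     (purchase_id INTEGER PRIMARY KEY, item_id INTEGER, sale_id INTEGER);
    INSERT INTO package_deals VALUES (1, 40);
    INSERT INTO deals_items   VALUES (1, 12), (1, 76), (1, 82);
    INSERT INTO purchases     VALUES (12993, 12, 2973), (12994, 76, 2973), (12996, 82, 2973),
                                     (13053, 12, 3092), (13054, 82, 3092);
""")

# A deal applies when there is NO item in the deal that is NOT in the sale.
APPLICABLE_DEALS = """
    SELECT pd.deal_id
    FROM package_deals pd
    WHERE NOT EXISTS (
        SELECT 1
        FROM deals_items di
        WHERE di.deal_id = pd.deal_id
          AND di.item_id NOT IN (SELECT p.item_id FROM purchases p WHERE p.sale_id = ?)
    )
    ORDER BY pd.deal_id
"""

def applicable_deals(sale_id):
    """Return the deal_ids whose items are all present in the given sale."""
    return [row[0] for row in conn.execute(APPLICABLE_DEALS, (sale_id,))]

print(applicable_deals(2973))
print(applicable_deals(3092))
```

A deal with no rows in deals_items would count as applicable under this logic, so a real system would probably want to exclude empty deals explicitly.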

MySQL IN() Operator not working

How do I use the IN() operator? It's not working for me.
These tables are examples, but they have the same structure as my real database. I don't have permission to add or change tables.
Those are the tables:
students
+------+------+
| id | name |
+------+------+
| 1 | ali |
| 2 | man |
| 3 | sos |
+------+------+
Classes
+------+---------+
| c_id | students|
+------+---------+
| 1 | 1,2,3,4 |
| 2 | 88,33,55|
| 3 | 45,23,72|
+------+---------+
When I use this query, it returns only the student with id = 1, because "id IN (students)" matches only when the first value in the list is equal.
select name,c_id from students,classes where id IN (students);
When I fetch the list in PHP and then insert it into the query, it works fine. But this solution needs a loop and costs many queries.
select name,c_id from students,classes where id IN (1,2,3,4);
I also tried FIND_IN_SET(), and the same thing happened: it returns 1 only for the first value, and 0 if the value is in any other position.
The IN operator works just fine, where it's applicable for what it does.
First, consider restructuring your data to be normalized, and avoid storing values as comma separated lists.
Second, if you absolutely have to deal with columns containing comma separated lists of values, MySQL provides the FIND_IN_SET() function.
FOLLOWUP
Ditch the old-school comma syntax for the join operation, and use the JOIN keyword instead. And relocate the join predicates from the WHERE clause to the ON clause. Fully qualify column references, eg.
SELECT s.name
, c.c_id
FROM students s
JOIN classes c
ON FIND_IN_SET(s.id,c.students)
ORDER BY s.name, c.c_id
To reiterate, storing a "comma separated list" in a column is an anti-pattern; it flies against relational theory and normalization, and disregards the best practices around relational databases.
One might argue for improved performance, but this pattern doesn't improve performance; rather it adds unnecessary complexity in query and DML operations.
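To see what a FIND_IN_SET() join returns on the sample data without a MySQL server, the function can be emulated roughly in Python and registered with the built-in sqlite3 module. This is a sketch, not MySQL's implementation: find_in_set below only mirrors the documented behaviour of returning the 1-based position, 0 when the value is absent, and NULL for NULL input.

```python
import sqlite3

def find_in_set(needle, strlist):
    """Rough emulation of MySQL's FIND_IN_SET(needle, strlist)."""
    if needle is None or strlist is None:
        return None
    items = str(strlist).split(",")   # no trimming: ' 2' is not the same as '2'
    needle = str(needle)
    return items.index(needle) + 1 if needle in items else 0

conn = sqlite3.connect(":memory:")
conn.create_function("FIND_IN_SET", 2, find_in_set)
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE classes  (c_id INTEGER PRIMARY KEY, students TEXT);
    INSERT INTO students VALUES (1, 'ali'), (2, 'man'), (3, 'sos');
    INSERT INTO classes  VALUES (1, '1,2,3,4'), (2, '88,33,55'), (3, '45,23,72');
""")

# Join each student to every class whose CSV list contains the student's id.
matches = conn.execute("""
    SELECT s.name, c.c_id
    FROM students s
    JOIN classes c ON FIND_IN_SET(s.id, c.students) > 0
    ORDER BY s.name, c.c_id
""").fetchall()

print(matches)
print(find_in_set(3, "1,2,3,4"))   # position of 3 in the list
print(find_in_set(2, "1, 2"))      # a space after the comma prevents the match
```

The last line shows one common reason a lookup can appear to work only for the first value: if the stored list has spaces after the commas, every element but the first carries a leading space and no longer matches.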
You need three tables.
One table students, one table classes, and then one table, say, students_to_classes containing something like
c_id | student_id
1 | 1
1 | 2
1 | 3
1 | 4
2 | 88
and so on.
Then you can query
select c_id from students_to_classes where student_id in (1,2,3,4)
Google "n:m relationship" for background on this.
EDIT
I know you're not specifically asking for another table structure, but this is a way of having a data type (a single number) that works with IN. Please believe me that this is the right way to do it, the reason you run into trouble with something as simple as IN is that you're using a non-standard approach, which, for such a standard problem, is typically not a good idea.
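If the schema can be changed at some point, the comma-separated column can be converted into students_to_classes rows with a few lines of code. The sketch below uses Python with the built-in sqlite3 module; the helper name explode_csv_column is made up for illustration, and the input rows are the sample classes table from the question.

```python
import sqlite3

def explode_csv_column(rows):
    """Turn (c_id, 'id1,id2,...') rows into a list of (c_id, student_id) pairs."""
    pairs = []
    for c_id, csv_ids in rows:
        for token in csv_ids.split(","):
            token = token.strip()          # tolerate stray spaces around the IDs
            if token:                      # skip empty entries such as a trailing comma
                pairs.append((c_id, int(token)))
    return pairs

classes = [(1, "1,2,3,4"), (2, "88,33,55"), (3, "45,23,72")]
pairs = explode_csv_column(classes)

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE students_to_classes (
        c_id INTEGER, student_id INTEGER,
        PRIMARY KEY (c_id, student_id)
    )
""")
conn.executemany("INSERT INTO students_to_classes VALUES (?, ?)", pairs)

# With one ID per row, a plain IN works as expected.
class_ids = [row[0] for row in conn.execute("""
    SELECT DISTINCT c_id
    FROM students_to_classes
    WHERE student_id IN (1, 2, 3, 4)
    ORDER BY c_id
""")]

print(pairs[:4], class_ids)
```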
That's not how IN is supposed to work. You use IN when you have a list of possible matches. For example,
instead of:
WHERE id=1 or id=2 or id=3 or id=4
you use:
WHERE id IN (1,2,3,4)
Anyhow, your design is not correct. The relation between Class and Student is many-to-many, so a third table is needed. Let's call it student_class; it stores the students of each class.
student
+------+------+
| id | name |
+------+------+
| 1 | ali |
| 2 | man |
| 3 | sos |
+------+------+
class
+------+---------+
| id | name |
+------+---------+
| 1 | math |
| 2 | english |
| 3 | science |
+------+---------+
student_class
+------------+-------------+
| class_id | student_id |
+------------+-------------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 3 | 3 |
+------------+-------------+
In the example above, all students are in the math class and sos is also in the science class.
Finally, if you want to know which students are in a given class, let's say math, you can use:
SELECT s.id, s.name, c.name
FROM student s
INNER JOIN student_class sc ON sc.student_id=s.id
INNER JOIN class c ON sc.class_id = c.id
WHERE c.name = 'math';