MySQL: Group by all weeks of a year with 0 included

I have a question about some MySQL code.
I have a table referencing employees, with their date of arrival and the project id. I want to count all the arrivals in the company and group them by week.
At the moment, I get this result:
Project ID | Week | Count
1 | 2019-S01 | 2
1 | 2019-S03 | 1
2 | 2019-S01 | 1
2 | 2019-S04 | 5
2 | 2019-S05 | 3
2 | 2019-S06 | 2
This is good, but I would like all the weeks to be returned, even when a week has a count of 0:
Project ID | Week | Count
1 | 2019-S01 | 2
1 | 2019-S02 | 0
1 | 2019-S03 | 1
...
2 | 2019-S01 | 1
2 | 2019-S02 | 0
2 | 2019-S03 | 0
2 | 2019-S04 | 5
2 | 2019-S05 | 3
2 | 2019-S06 | 2
...
Here is my current code:
SELECT
AP.SECTION_ANALYTIQUE AS SECTION,
FS_GET_FORMAT_SEMAINE(AP.DATE_ARRIVEE_PROJET) AS SEMAINE,
Count(*) AS COMPTE
FROM
RT00_AFFECTATIONS_PREV AP
WHERE
(AP.DATE_ARRIVEE_PROJET <= CURDATE() AND Year(AP.DATE_ARRIVEE_PROJET) >= Year(CURDATE()))
GROUP BY
SECTION, SEMAINE
ORDER BY
SECTION
Does anybody have a solution?
I searched the internet but didn't find anything that fits :(
Thank you in advance! :)

The classic way to meet this requirement is to create a reference table that stores all possible weeks.
create table all_weeks(week varchar(8) primary key);
insert into all_weeks values
('2019-S01'), ('2019-S02'), ('2019-S03'), ('2019-S04'), ('2019-S05'), ('2019-S06');
Once this is done, you can generate a cartesian product of all possible sections and weeks with a CROSS JOIN, and LEFT JOIN that with the original table.
Given your code snippet, this should look like:
SELECT
s.section_analytique AS section,
w.week AS semaine,
COUNT(ap.section_analytique) AS compte
FROM
(SELECT DISTINCT section_analytique from rt00_affectations_prev) s
CROSS JOIN all_weeks w
LEFT JOIN rt00_affectations_prev ap
ON s.section_analytique = ap.section_analytique AND w.week = FS_GET_FORMAT_SEMAINE(ap.date_arrivee_projet)
GROUP BY s.section_analytique, w.week
ORDER BY s.section_analytique
PS: be careful not to put conditions on the original table in the WHERE clause: such a condition rejects the rows where ap is NULL, which turns the LEFT JOIN back into an inner join and drops the empty weeks again. If you need to do some filtering, filter on the reference table instead (you might need to add a few columns to it, such as the starting date of the week).
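The CROSS JOIN plus LEFT JOIN pattern can be tried on a small data set. The following is only a sketch under stated assumptions: it runs the SQL through Python's sqlite3 module as a stand-in for MySQL, and the table arrivals with its columns section and week is a hypothetical, simplified replacement for rt00_affectations_prev in which the week label is stored directly instead of being computed by FS_GET_FORMAT_SEMAINE().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE all_weeks (week TEXT PRIMARY KEY);
    INSERT INTO all_weeks VALUES ('2019-S01'), ('2019-S02'), ('2019-S03');

    -- Hypothetical stand-in for rt00_affectations_prev: the week label is
    -- stored as-is rather than derived from a date column.
    CREATE TABLE arrivals (section INTEGER, week TEXT);
    INSERT INTO arrivals VALUES
        (1, '2019-S01'), (1, '2019-S01'), (1, '2019-S03'),
        (2, '2019-S01');
""")

rows = conn.execute("""
    SELECT s.section, w.week, COUNT(a.section) AS compte
    FROM (SELECT DISTINCT section FROM arrivals) s
    CROSS JOIN all_weeks w                 -- every section paired with every week
    LEFT JOIN arrivals a                   -- unmatched pairs keep a NULL row
           ON a.section = s.section AND a.week = w.week
    GROUP BY s.section, w.week
    ORDER BY s.section, w.week
""").fetchall()

for row in rows:
    print(row)
```

COUNT(a.section) counts only non-NULL values, which is why the weeks without any arrival come out as 0 instead of 1.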

Related

Using nested SELECT result for IN statement of another nested SELECT

Be gentle. I'm a high school principal coding on the side for our school site.
I have looked at answers, here, here, and here. I might just not know enough to ask the right question.
We have events that have multiple sessions and there are workshops that can be associated with multiple sessions in multiple events.
I'm trying to get a csv result, later to be put into an array, for the associated sessions and events for my Workshops.
The query below works without the second nested Select statement.
In the Alt_Events statement, I need to pull the Event_IDs that are associated with the Session_IDs that are pulled from the first nested Select.
Events
ID | Name | Description
1 | Flex Learning | A day of flexible learning.
2 | Moonshot Expo | A day to join partners to solve problems.
Event_Sessions
ID | Event_ID | Name | Description
1 | 1 | Morning Session | The first session of the day.
2 | 1 | Afternoon Session | The afternoon session.
3 | 1 | Tutoring Session | A chance to get help from teachers.
4 | 2 | Partner Field Trip | The first session of the day.
5 | 2 | Brainstorming Session | The afternoon session.
6 | 2 | Tutoring Session | A chance to get help from teachers.
Event_Workshops
ID | Name | Description
1 | Math Tutorial | Get help from your math teachers.
Event_Workshop_Links
ID | Workshop_ID | Session_ID
1 | 1 | 3
2 | 1 | 6
Output Table:
ID | Name | Description | ... | Alt_Sessions | Alt_Events
1 | Math Tutorial | Get help... | ... | 3,6 | 1,2
Here is my query.
SELECT
ws.ID, ws.Name, ws.Description, ws.Location, ws.Owner_ID, ws.Max_Attendees,
ws.Eng_Major_Allowed, ws.Eng_Minor_Allowed,
ws.HC_Major_Allowed, ws.HC_Minor_Allowed,
ws.IT_Major_Allowed, ws.IT_Minor_Allowed,
u.LastName as Owner_LastName, u.FirstName AS Owner_FirstName, u.Email AS Owner_Email,
(SELECT group_concat(SESSION_ID) FROM Events_Workshops_Links WHERE Workshop_ID = ws.ID) AS Alt_Sessions,
(SELECT group_concat(Event_ID) FROM Event_Sessions WHERE Session_ID IN Alt_Sessions) AS Alt_Events
FROM Event_Workshops as ws
LEFT JOIN users AS u
ON ws.Owner_ID = u.ID
WHERE ws.ID = ?
ORDER BY ws.Name
I need to be able to pull all the event_ids for the sessions in the Alt_Sessions result.
I'm guessing I can't use the result of the first nested query in the second nested query. If that's the problem, how can I pull that list of event ids?
Any and all help is greatly appreciated.
(Updated to show the expected output and to fix one transcription error in the query: Session_ID instead of Event_ID in the second nested statement.)
Use the subquery instead of Alt_Sessions in the IN predicate like below.
(SELECT group_concat(SESSION_ID) FROM Events_Workshops_Links WHERE Workshop_ID = ws.ID) AS Alt_Sessions,
(SELECT group_concat(Event_ID) FROM Event_Sessions WHERE Session_ID IN (SELECT SESSION_ID FROM Events_Workshops_Links WHERE Workshop_ID = ws.ID)) AS Alt_Events
Alternatively, you can build the combinations of Alt_Sessions and Alt_Events per workshop first and then join the result to Event_Workshops.
SELECT * FROM Event_Workshops ws
JOIN
(
    SELECT
        wsl.Workshop_ID,
        GROUP_CONCAT(DISTINCT wsl.Session_ID) Alt_Sessions,
        GROUP_CONCAT(DISTINCT es.Event_ID) Alt_Events
    FROM Event_Workshop_Links wsl
    -- Event_Sessions.ID is the session key in the table listing above
    JOIN Event_Sessions es ON es.ID = wsl.Session_ID
    GROUP BY wsl.Workshop_ID
) w
ON ws.ID = w.Workshop_ID
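As a rough, runnable sketch of the join-based idea, the snippet below loads the sample rows from the question into SQLite (through Python's sqlite3 module, standing in for MySQL) and splits the comma-separated results into lists. It assumes the column names of the table listings in the question, in particular that Event_Sessions.ID is the session key; only the columns needed for the join are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Event_Sessions (ID INTEGER PRIMARY KEY, Event_ID INTEGER);
    INSERT INTO Event_Sessions VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 2);

    CREATE TABLE Event_Workshops (ID INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO Event_Workshops VALUES (1, 'Math Tutorial');

    CREATE TABLE Event_Workshop_Links (
        ID INTEGER PRIMARY KEY, Workshop_ID INTEGER, Session_ID INTEGER);
    INSERT INTO Event_Workshop_Links VALUES (1, 1, 3), (2, 1, 6);
""")

row = conn.execute("""
    SELECT ws.ID, ws.Name,
           GROUP_CONCAT(DISTINCT wsl.Session_ID) AS Alt_Sessions,
           GROUP_CONCAT(DISTINCT es.Event_ID)    AS Alt_Events
    FROM Event_Workshops ws
    LEFT JOIN Event_Workshop_Links wsl ON wsl.Workshop_ID = ws.ID
    LEFT JOIN Event_Sessions es        ON es.ID = wsl.Session_ID
    WHERE ws.ID = ?
    GROUP BY ws.ID, ws.Name
""", (1,)).fetchone()

# Turn the csv strings into sorted integer lists; the concatenation order
# of GROUP_CONCAT is not guaranteed, hence the explicit sort.
alt_sessions = sorted(int(x) for x in row[2].split(","))
alt_events = sorted(int(x) for x in row[3].split(","))
print(row[1], alt_sessions, alt_events)
```

A workshop without any link would produce NULL (None in Python) for both columns, so real code would need to guard the split calls.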

Left Join takes very long time on 150 000 rows

I am having some difficulty accomplishing a task.
Here is some data from orders table:
+----+---------+
| id | bill_id |
+----+---------+
| 3 | 1 |
| 9 | 3 |
| 10 | 4 |
| 15 | 6 |
+----+---------+
And here is some data from a bills table:
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
+----+
I want to list all the bills that have no order associated with them.
To achieve that, I thought a LEFT JOIN was appropriate, so I wrote this query:
SELECT * FROM bills
LEFT JOIN orders
ON bills.id = orders.bill_id
WHERE orders.bill_id IS NULL;
I thought that I would have the following result:
+----------+-----------+----------------+
| bills.id | orders.id | orders.bill_id |
+----------+-----------+----------------+
| 2 | NULL | NULL |
| 5 | NULL | NULL |
+----------+-----------+----------------+
But the query never finishes: it ran for more than 5 minutes without returning anything, and I stopped it because that would not be acceptable in production anyway.
My real dataset has more than 150 000 orders and 100 000 bills. Is the dataset too big?
Is my query wrong somewhere?
Thank you very much for your tips!
EDIT: side note, the tables have no foreign keys defined... *flies away*
Your query is fine. I would use table aliases in writing it:
SELECT b.*
FROM bills b LEFT JOIN
orders o
ON b.id = o.bill_id
WHERE o.bill_id IS NULL;
You don't need the NULL columns from orders, probably.
You need an index on orders(bill_id):
create index idx_orders_billid on orders(bill_id);
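A small sketch to check the expected result on the sample rows from the question, run through Python's sqlite3 module as a stand-in for MySQL (so it says nothing about MySQL timings). It also shows NOT EXISTS, which is not part of the answer above but is a common equivalent way to write the same anti-join; both forms look up orders.bill_id once per bill, which is the lookup the index serves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bills (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, bill_id INTEGER);
    INSERT INTO bills VALUES (1), (2), (3), (4), (5), (6);
    INSERT INTO orders VALUES (3, 1), (9, 3), (10, 4), (15, 6);

    -- Without this index, each bill may require a scan of the orders table.
    CREATE INDEX idx_orders_billid ON orders(bill_id);
""")

# Anti-join written as LEFT JOIN ... IS NULL, as in the answer.
left_join_ids = [r[0] for r in conn.execute("""
    SELECT b.id
    FROM bills b
    LEFT JOIN orders o ON b.id = o.bill_id
    WHERE o.bill_id IS NULL
    ORDER BY b.id
""")]

# The same anti-join written with NOT EXISTS.
not_exists_ids = [r[0] for r in conn.execute("""
    SELECT b.id
    FROM bills b
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.bill_id = b.id)
    ORDER BY b.id
""")]

print(left_join_ids, not_exists_ids)
```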
From your WHERE clause, I assume you're looking for orders that have no bills.
If that's the case, you don't need to join to the bills table, as the matching bills would by definition not exist.
You will find
SELECT * FROM orders
WHERE orders.bill_id IS NULL;
to be a much better performing query.
Edit:
Sorry, I missed your "I want to list all the bills that have no order associated with." when reading the question. As @gordon pointed out, an index would certainly help. However, if changing the schema is feasible, I would rather have a nullable bills.order_id column instead of orders.bill_id, because you would no longer need a left join: an inner join would suffice to get the bills that have an order, and it would be a quicker query for your other assumed requirements.

Returns distinct record in a joins query - Rails 4

I'm trying to get and display an order list including the current status.
@orders = Order.joins(order_status_details: :order_status)
.order('id DESC, order_status_details.created_at DESC')
.select("orders.id, order_status_details.status_id, order_statuses.name, order_status_details.created_at")
It works, but it returns every status row, so the order ids are duplicated like this:
+----+-----------+----------------------+---------------------+
| id | status_id | name | created_at |
+----+-----------+----------------------+---------------------+
| 8 | 1 | Pending | 2016-01-31 16:33:30 |
| 7 | 3 | Shipped | 2016-02-01 05:01:21 |
| 7 | 2 | Pending for shipping | 2016-01-31 05:01:21 |
| 7 | 1 | Pending | 2016-01-31 04:01:21 |
+----+-----------+----------------------+---------------------+
The correct result must return unique ids; for the example above, that means the first and second rows.
I have already tried distinct in the select, .distinct, .uniq and .group, but I get an error.
Thanks.
First of all, I believe your model is "an Order has many OrderStatusDetail". That is why you get several different names per order in your result.
You can modify the query like this:
@orders = Order.joins(order_status_details: :order_status)
.order('id DESC, order_status_details.created_at DESC')
.where('order_status_details.id IN (SELECT MAX(id) FROM order_status_details GROUP BY order_id)')
.select("orders.id, order_status_details.status_id, order_statuses.name, order_status_details.created_at")
The where condition keeps a single order_status_details row per order. I use MAX(id) as an example, which picks the most recently inserted detail; you can modify it as needed.
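To see what the MAX(id) filter does at the SQL level, here is a minimal sketch that runs a comparable query in SQLite through Python's sqlite3 module, outside of Rails. The rows mirror the example output in the question; the detail ids (1 to 4) and the join columns order_id and status_id are assumptions, with ids assigned in order of created_at so that the highest id per order is also its latest status.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE order_statuses (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE order_status_details (
        id INTEGER PRIMARY KEY, order_id INTEGER,
        status_id INTEGER, created_at TEXT);

    INSERT INTO orders VALUES (7), (8);
    INSERT INTO order_statuses VALUES
        (1, 'Pending'), (2, 'Pending for shipping'), (3, 'Shipped');
    -- Assumed ids: they grow with created_at.
    INSERT INTO order_status_details VALUES
        (1, 7, 1, '2016-01-31 04:01:21'),
        (2, 7, 2, '2016-01-31 05:01:21'),
        (3, 8, 1, '2016-01-31 16:33:30'),
        (4, 7, 3, '2016-02-01 05:01:21');
""")

rows = conn.execute("""
    SELECT o.id, d.status_id, s.name, d.created_at
    FROM orders o
    JOIN order_status_details d ON d.order_id = o.id
    JOIN order_statuses s       ON s.id = d.status_id
    -- Keep only the newest detail row of each order.
    WHERE d.id IN (SELECT MAX(id) FROM order_status_details GROUP BY order_id)
    ORDER BY o.id DESC
""").fetchall()

for row in rows:
    print(row)
```

If ids do not follow creation order in the real data, the subquery would have to select on created_at instead.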

mysql update with subquery 2 level deep

Thanks for taking a look at this question. I'm kind of lost and hope someone can help me. Below is an update query I would like to run.
The query currently returns an error:
1054 - Unknown column 'spi.et_cross_rank' in 'where clause'
Some background:
From the table tmp_ranking_tbl, I would like to get the nth (spi.et_return_rank) record
for the group with value x (spi.et_cross_rank).
SET @rownum=0;
UPDATE STRToer_Poule_indeling spi
SET spi.team_id = (SELECT R.team_poule_id
                   FROM (SELECT @rownum:=@rownum+1 AS rownum, trt.team_poule_id
                         FROM tmp_ranking_tbl trt
                         WHERE trt.overal_rank = spi.et_cross_rank
                         ORDER BY trt.punten DESC, (trt.goals_voor - trt.goals_tegen) DESC, trt.goals_voor DESC) R
                   WHERE R.rownum = spi.et_return_rank)
WHERE spi.et_ronde = v_et_ronde
  AND spi.poule_id IN (SELECT row_id FROM STRToer_Poules WHERE toernooi_onderdeel_id=v_onderdeel_id);
Data in tmp_ranking_tbl looks like:
team_poule_id | punten | goals_voor | goals_tegen | overal_rank
65 | 6 | 10 | 10 | 2
69 | 6 | 9 | 10 | 2
75 | 7 | 11 | 4 | 2
84 | 6 | 6 | 8 | 2
112 | 5 | 7 | 7 | 2
Thanks in advance for the help!
Update, after a question in the comments about the goal; I'll try to keep it short. :-)
This query is used on a website that keeps the scores of a tournament. Sometimes an odd number of teams goes through to the next round. In that case I want to select the best number 3 (spi.et_cross_rank) team across the poules. This setting is saved in STRToer_Poule_indeling: which rank per poule to look at, and whether to take the 1st, 2nd or nth team (spi.et_return_rank). The table tmp_ranking_tbl is filled with all the rank 3 teams across the poules. Once it is filled, I would like the 1st or 2nd record, depending on the setting in STRToer_Poule_indeling, to be returned.
Subset of the structure of the STRToer_Poule_indeling table:
row_id | team_id | et_ronde | et_cross_rank | et_return_rank
1 | null | 1 | 3 | 1
Just check if you have a column named et_cross_rank on your table STRToer_Poule_indeling
The problem seems to be that SQL can't find that column on your table.
Hope it helps.

Select a row for every date in the table, no matter the data

I have the following 3 tables in my database:
noobs
    id
    name
    img_url
    associations_id
noobs_has_points
    noobs_id
    points_id
points
    id
    amount
    create_time (as UNIX timestamp)
I want to get a result for every day (such as FROM_UNIXTIME(points.create_time,'%Y-%m-%d')). And in that result I want the noobs.id and his amount of points so SUM(points.amount). So whether a noob has actually scored points on that day doesn't matter, if he did not I would want a row with 0 in there as the amount, so that for every day I get to see how many points each noob scored.
However, I have no idea how to get this result. I have tried some things with left/right (or unioned) joins but I don't get the result I want. Can anyone help me with this?
Example results:
day | points.amount | noobs.id
2015-04-11 | 3 | 1
2015-04-11 | 0 | 2 (no points scored, no entry in database)
2015-04-12 | 0 | 1 (no points scored, no entry in database)
2015-04-12 | 1 | 2
Some sample data from the three tables:
Noobs
id | name | img_url | associations_id
1 | Rien | NULL | 1
2 | Peter| NULL | 1
noobs_has_points
noobs_id | points_id
1 | 1
2 | 3
points
id | amount | create_time
1 | 3 | 1428779292
2 | 1 | 1428805351
Because there may be no data for a given day for a given noob, you need a way to generate date values. Unfortunately, MySQL versions before 8.0 have no built-in way to do this. You can code a range into the query with a series of unions as a subquery, but it's ugly and not scalable.
I recommend creating a table to hold date values:
create table dates(_date date not null primary key);
And populating it with lots of dates (say everything from 1970-2020).
Then you can code:
select d._date day, coalesce(sum(p.amount), 0) total, n.id
from dates d
cross join noobs n
left join noobs_has_points np on np.noobs_id = n.id
left join points p on p.id = np.points_id
and date(from_unixtime(p.create_time)) = d._date
where d._date between ? and ?
group by 1, 3
The cross join gives every noob a row for every date in the specified range, the left joins keep the days on which a noob scored nothing, and coalesce turns the NULL sum of those days into 0. Because create_time is a UNIX timestamp, it is converted with from_unixtime before being compared to the date.
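The dates-table approach can be checked against the sample data with a short script. This is a sketch under assumptions: SQLite (via Python's sqlite3 module) stands in for MySQL, so date(create_time, 'unixepoch') replaces MySQL's from_unixtime and works in UTC; only two dates are loaded; and the second noobs_has_points row is assumed to point at points.id 2, since the sample data references a points row 3 that is not listed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dates (_date TEXT NOT NULL PRIMARY KEY);
    INSERT INTO dates VALUES ('2015-04-11'), ('2015-04-12');

    CREATE TABLE noobs (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO noobs VALUES (1, 'Rien'), (2, 'Peter');

    CREATE TABLE points (id INTEGER PRIMARY KEY, amount INTEGER, create_time INTEGER);
    INSERT INTO points VALUES (1, 3, 1428779292), (2, 1, 1428805351);

    -- Assumption: noob 2 is linked to points.id 2 (the question lists 3).
    CREATE TABLE noobs_has_points (noobs_id INTEGER, points_id INTEGER);
    INSERT INTO noobs_has_points VALUES (1, 1), (2, 2);
""")

rows = conn.execute("""
    SELECT d._date AS day, n.id, COALESCE(SUM(p.amount), 0) AS total
    FROM dates d
    CROSS JOIN noobs n
    LEFT JOIN noobs_has_points np ON np.noobs_id = n.id
    LEFT JOIN points p
           ON p.id = np.points_id
          AND date(p.create_time, 'unixepoch') = d._date
    WHERE d._date BETWEEN ? AND ?
    GROUP BY d._date, n.id
    ORDER BY d._date, n.id
""", ("2015-04-11", "2015-04-12")).fetchall()

for row in rows:
    print(row)
```

Every (day, noob) pair appears exactly once, with 0 for the days on which that noob has no points row.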