mysql return two minimum values

I have a table named: workers and a table named: schedule with the following format:
workers:
+----+-----------+------------+------------+-------------+
| id | name      | vacationA  | vacationB  | workhistory |
+----+-----------+------------+------------+-------------+
| 1  | Florin    | 2017-05-05 | 2017-05-25 | 2010-01-01  |
| 2  | Andrei    | 2017-06-05 | 2017-06-25 | 2010-01-01  |
| 3  | Alexandra | 2017-07-05 | 2017-07-25 | 2010-01-01  |
| 4  | Emilia    | 2017-08-05 | 2017-08-25 | 2010-01-01  |
| 5  | Nicoleta  | 2017-09-05 | 2017-09-25 | 2010-01-01  |
+----+-----------+------------+------------+-------------+
schedule:
+-----+-------+-----------+--------+
| day | month | name      | shifts |
+-----+-------+-----------+--------+
| 1   | 6     | Florin    | 0      |
| 1   | 6     | Andrei    | 1      |
| 1   | 6     | Alexandra | 2      |
| 1   | 6     | Emilia    | 3      |
| 1   | 6     | Nicoleta  | 4      |
+-----+-------+-----------+--------+
I need to query the "workers" table for 2 random names with the lowest number of shifts, excluding workers who are currently on vacation. Work history must also be longer than 18 months.
In this case, the query I need should return Florin and Andrei.
This is what I've got so far, but it doesn't work as intended:
SELECT name FROM workers WHERE (CURDATE() NOT BETWEEN vacationA AND vacationB) AND workhistory > (DATE_SUB(CURDATE(), INTERVAL 18 MONTH)) AND name IN (SELECT name FROM schedule ORDER BY shifts LIMIT 2) ORDER BY RAND() LIMIT 2;
This query returns
1235 - This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'.
Thank you!

Since the schedule table already has a name column (although that is not a good design), you don't need a join. You can just use ORDER BY with LIMIT, e.g.:
SELECT name
FROM schedule
WHERE day = ? AND month = ? -- remove this line if there are no criteria
ORDER BY shifts
LIMIT 2;

The obvious answer is just to sort the table by the number of shifts and grab the first two entries:
SELECT name FROM schedule ORDER BY shifts ASC LIMIT 2
I notice, however, that you already have an ORDER BY clause, so it seems you want the results in random order.
If you need the random order as well, then wrap the whole thing in a subquery like this:
SELECT name FROM (SELECT name FROM schedule ORDER BY shifts ASC LIMIT 2) AS t ORDER BY RAND()
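Error 1235 is raised because MySQL rejects LIMIT inside an IN (...) subquery, while the same subquery is accepted as a derived table in the FROM clause. A minimal sketch that joins workers to such a derived table and keeps the vacation and work-history filters from the question. The alias lowest is illustrative, and the work-history test is written with < on the assumption that workhistory stores the date the worker started:

```sql
SELECT w.name
FROM workers w
INNER JOIN (
    -- the two schedule rows with the fewest shifts;
    -- LIMIT is allowed here because this is a derived table, not an IN subquery
    SELECT name
    FROM schedule
    ORDER BY shifts ASC
    LIMIT 2
) AS lowest
    ON lowest.name = w.name
WHERE CURDATE() NOT BETWEEN w.vacationA AND w.vacationB
  -- assumption: workhistory is the start date, so "more than 18 months" means earlier than the cutoff
  AND w.workhistory < DATE_SUB(CURDATE(), INTERVAL 18 MONTH)
ORDER BY RAND();
```

Because the two lowest-shift rows are picked before the worker filters run, this can return fewer than two names when one of them is on vacation. If the filters should apply first, a plain join to schedule with ORDER BY shifts LIMIT 2 at the end is the simpler choice.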

Related

Selecting the most recent result from one table joining to another

I have two tables.
One table contains customer data, like name and email address. The other table contains a log of the status changes.
The status log table looks like this:
+-------------+------------+------------+
| customer_id | status | date |
+-------------+------------+------------+
| 1 | Bought | 2018-07-01 |
| 1 | Bought | 2018-07-02 |
| 2 | Ongoing | 2018-07-03 |
| 3 | Ongoing | 2018-07-04 |
| 1 | Not Bought | 2018-07-05 |
| 4 | Bought | 2018-07-06 |
| 4 | Not Bought | 2018-07-07 |
| 4 | Bought | 2018-07-08 | *
| 3 | Cancelled | 2018-07-09 |
+-------------+------------+------------+
And the customer data:
+----+--------+-----------+
| id | name   | email     |
+----+--------+-----------+
| 1  | Alex   | alex@home |
| 2  | John   | john@home |
| 3  | Simon  | si@home   |
| 4  | Philip | phil@home |
+----+--------+-----------+
I would like to select the customers who have "Bought" in July (07), but exclude customers whose status has most recently changed from "Bought" to anything else.
The result should be just one customer (Philip) - all the others have had their status change to something other than Bought most recently.
I have the following SQL:
SELECT
a.customer_id
FROM
statuslog a
WHERE
DATE(a.`date`) LIKE '2018-07-%'
AND a.status = 'Bought'
ORDER BY a.date DESC
LIMIT 1
But that is as far as I have got! The above query only returns one result, but essentially there could be more than one.
Any help is appreciated!
Here is an approach that uses a correlated subquery to get the most recent status record:
SELECT sl.customer_id
FROM wwym_statuslog sl
WHERE sl.date = (SELECT MAX(sl2.date)
                 FROM wwym_statuslog sl2
                 WHERE sl2.customer_id = sl.customer_id AND
                       sl2.date >= '2018-07-01' AND
                       sl2.date < '2018-08-01'
                ) AND
      sl.status = 'Bought'
ORDER BY sl.date DESC; -- no LIMIT: more than one customer may qualify
Notes:
Use meaningful table aliases! That is, abbreviations for the table names, rather than arbitrary letters such as a and b.
Use proper date arithmetic. LIKE is for strings. MySQL has lots of date functions that work.
In MySQL 8+, you would use ROW_NUMBER().
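The ROW_NUMBER() note could look roughly like this on MySQL 8 or later. It is a sketch only: the table name is taken from the answer's query, and the aliases ranked and rn are illustrative.

```sql
SELECT ranked.customer_id
FROM (
    SELECT sl.customer_id,
           sl.status,
           -- rn = 1 marks each customer's most recent status row within July
           ROW_NUMBER() OVER (PARTITION BY sl.customer_id
                              ORDER BY sl.date DESC) AS rn
    FROM wwym_statuslog sl
    WHERE sl.date >= '2018-07-01'
      AND sl.date <  '2018-08-01'
) AS ranked
WHERE ranked.rn = 1
  AND ranked.status = 'Bought';
```

With the sample rows from the question, only customer 4 has "Bought" as the latest July status, so that is the single row returned.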

Calculating difference of two values and sorting based on the result with mysql

game table:
| id | group_id | user_id | last_update | bonus |
| 1 | 4 | 1 | 2017-01-22 00:06:10 | 0 |
| 2 | 4 | 1 | 2017-01-12 00:11:34 | 300 |
| 3 | 4 | 1 | 2017-01-02 00:30:44 | -111 |
| 3 | 4 | 1 | 2017-02-02 00:21:44 | 4330 |
| 3 | 4 | 6 | 2017-01-02 01:02:27 | 30 |
| 3 | 4 | 6 | 2017-01-07 11:22:37 | 40 |
| 3 | 4 | 6 | 2017-03-04 11:22:37 | 0 |
I want to calculate bonus of the current date minus the bonus of the first day of the current month for every user of a given group.
The wanted output:
| user_id | january (last day bonus - first day bonus) |
| 5 | 1400 |
| 19 | 1377 |
| 1 | 806 |
| 14 | 140 |
| 50 | 14 |
Currently, I'm getting bonuses of the given month (1 query), calculating the difference between the last and first ones. I have 4000 users, so I'm performing 4000 queries to do what I want and it's too slow.
Is it possible to do that with only mysql?
Perhaps the "most correct" way would be to use variables or complex subqueries. However, another way that should work is to use group_concat()/substring_index():
select user_id, date_format(last_update, '%Y-%m') as yyyymm,
substring_index(group_concat(bonus order by last_update desc), ',', 1) as last_bonus,
substring_index(group_concat(bonus order by last_update asc), ',', 1) as first_bonus,
(substring_index(group_concat(bonus order by last_update desc), ',', 1) -
substring_index(group_concat(bonus order by last_update asc), ',', 1)
) as bonus_diff
from t
group by user_id, yyyymm;
Two caveats. First, this converts the bonus to a string and then back again to a number for the calculation. That is why I might call this "quick-and-dirty" or a "hack". However, it should work, and the conversions are safe because the values start out as numbers.
Second, group_concat() has a default limit of 1024 bytes. That should not be a problem for these aggregations unless you have hundreds of rows for a user within a month.
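To match the question more closely, the same trick can be restricted to one group and one month and sorted by the difference. This is a sketch under assumptions: the table is called game as in the question, group 4 and January 2017 are example values, and the explicit CAST only makes the string-to-number conversion visible.

```sql
SELECT user_id,
       -- bonus on the latest row of the month minus bonus on the earliest row
       CAST(SUBSTRING_INDEX(GROUP_CONCAT(bonus ORDER BY last_update DESC), ',', 1) AS SIGNED)
     - CAST(SUBSTRING_INDEX(GROUP_CONCAT(bonus ORDER BY last_update ASC), ',', 1) AS SIGNED)
       AS bonus_diff
FROM game
WHERE group_id = 4
  AND last_update >= '2017-01-01'
  AND last_update <  '2017-02-01'
GROUP BY user_id
ORDER BY bonus_diff DESC;
```

On the sample rows this gives 111 for user 1 (0 minus -111) and 10 for user 6 (40 minus 30), in a single query instead of one query per user.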

mysql return two minimum values with limit

I have a table named: workers and a table named: schedule with the following format:
workers:
+----+-----------+------------+------------+-------------+
| id | name      | vacationA  | vacationB  | workhistory |
+----+-----------+------------+------------+-------------+
| 1  | Florin    | 2017-05-05 | 2017-05-25 | 2010-01-01  |
| 2  | Andrei    | 2017-06-05 | 2017-06-25 | 2010-01-01  |
| 3  | Alexandra | 2017-07-05 | 2017-07-25 | 2010-01-01  |
| 4  | Emilia    | 2017-08-05 | 2017-08-25 | 2010-01-01  |
| 5  | Nicoleta  | 2017-09-05 | 2017-09-25 | 2010-01-01  |
+----+-----------+------------+------------+-------------+
schedule:
+-----+-------+-----------+--------+
| day | month | name      | shifts |
+-----+-------+-----------+--------+
| 1   | 6     | Florin    | 0      |
| 1   | 6     | Andrei    | 1      |
| 1   | 6     | Alexandra | 2      |
| 1   | 6     | Emilia    | 3      |
| 1   | 6     | Nicoleta  | 4      |
+-----+-------+-----------+--------+
I need to query the "workers" table for 2 random names with the lowest number of shifts, excluding workers who are currently on vacation. Work history must also be longer than 18 months.
In this case, the query I need should return Florin and Andrei.
This is what I've got so far, but it doesn't work as intended:
SELECT name
FROM workers
WHERE (CURDATE() NOT BETWEEN vacationA AND vacationB) AND
workhistory > (DATE_SUB(CURDATE(), INTERVAL 18 MONTH)) AND
name IN (SELECT name FROM schedule ORDER BY shifts LIMIT 2)
ORDER BY RAND() LIMIT 2;
This query returns
1235 - This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'.
Thank you!
Join to schedule instead and then use LIMIT 2:
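SELECT w.name
FROM workers w
INNER JOIN schedule s
    ON w.name = s.name
WHERE CURDATE() NOT BETWEEN w.vacationA AND w.vacationB AND
      -- assuming workhistory is the start date: more than 18 months of history
      -- means the start date is earlier than the cutoff
      w.workhistory < DATE_SUB(CURDATE(), INTERVAL 18 MONTH)
ORDER BY s.shifts
LIMIT 2;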
I don't understand the random ordering, because you are only returning two worker records, and those records are not chosen randomly, but rather belong to the smallest shift numbers.

MySQL - select average of column A for first N entries from column B

I have a ratings table, where each user can add one rating a day. But each user might miss several days between ratings.
I'd like to get the average rating for each user_id's first 7 entries of created_at.
My table:
mysql> desc entries;
+------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| rating | tinyint(4) | NO | | NULL | |
| user_id | int(10) unsigned | NO | MUL | NULL | |
| created_at | timestamp | YES | | NULL | |
+------------+------------------+------+-----+---------+----------------+
Ideally I'd just get something like:
+------------+------------------+
| day | average_rating |
+------------+------------------+
| 1 | 2.53 |
+------------+------------------+
| 2 | 4.30 |
+------------+------------------+
| 3 | 3.67 |
+------------+------------------+
| 4 | 5.50 |
+------------+------------------+
| 5 | 7.23 |
+------------+------------------+
| 6 | 6.98 |
+------------+------------------+
| 7 | 7.22 |
+------------+------------------+
The closest I've been able to get is:
SELECT rating, user_id, created_at FROM entries ORDER BY user_id asc, created_at desc
Which isn't very close at all...
Is it even possible? Will the performance be terrible? It's something that would need to run every time a web page is loaded, so would it be better to just run this once a day and save the results? (to another table!?)
edit - second attempt
Working towards a solution, I think this would get the rating for each user's first day:
select rating from entries where user_id in
(select user_id from entries order by created_at limit 1);
But I get:
ERROR 1235 (42000): This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
So now I'm going to play around with JOIN to see if that helps.
edit - third attempt, getting closer
I found this stackoverflow post, which is closer to what I want.
select e1.* from entries e1 left join entries e2
on (e1.user_id = e2.user_id and e1.created_at > e2.created_at)
where e2.id is null;
It gets the rating for the first day for each user.
Next step is to work out how to get days 2 to 7. I can't use e1.created_at > e2.created_at for that, so I'm really confused now.
edit - fourth attempt
Okay, I think it's not possible. Once I worked out how to turn off 'full group by' mode, I realised I'll probably need to use a subquery with limit <user_id>, <day_num>, for which I get:
ERROR 1235 (42000): This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
My current method is to just get the entire table, and use PHP to calculate the average for each day.
If I understand correctly you want to take the last 7 ratings the user gave, ordered by the date they gave the rating. The last 7 ratings of one user may fall on different days to another user, however they will be averaged together regardless of date.
First we need to order the data by user and date and give each user their own incrementing row count. I do this by adding two variables, one for the last user id and one for the row number:
select e.created_at,
       e.rating,
       if(@lastUser = user_id, @row := @row + 1, @row := 1) as row,
       @lastUser := e.user_id as user_id
from entries e,
     ( select @row := 0, @lastUser := 0 ) vars
order by e.user_id asc,
         e.created_at desc;
If the previous user_id is different we reset the row counter to 1. The result from this is:
+---------------------+--------+------+---------+
| created_at | rating | row | user_id |
+---------------------+--------+------+---------+
| 2017-01-10 00:00:00 | 1 | 1 | 1 |
| 2017-01-09 00:00:00 | 1 | 2 | 1 |
| 2017-01-08 00:00:00 | 1 | 3 | 1 |
| 2017-01-07 00:00:00 | 1 | 4 | 1 |
| 2017-01-06 00:00:00 | 1 | 5 | 1 |
| 2017-01-05 00:00:00 | 1 | 6 | 1 |
| 2017-01-04 00:00:00 | 1 | 7 | 1 |
| 2017-01-03 00:00:00 | 1 | 8 | 1 |
| 2017-01-02 00:00:00 | 1 | 9 | 1 |
| 2017-01-01 00:00:00 | 1 | 10 | 1 |
| 2017-01-13 00:00:00 | 1 | 1 | 2 |
| 2017-01-11 00:00:00 | 1 | 2 | 2 |
| 2017-01-09 00:00:00 | 1 | 3 | 2 |
| 2017-01-07 00:00:00 | 1 | 4 | 2 |
| 2017-01-05 00:00:00 | 1 | 5 | 2 |
| 2017-01-03 00:00:00 | 1 | 6 | 2 |
| 2017-01-01 00:00:00 | 1 | 7 | 2 |
| 2017-01-13 00:00:00 | 1 | 1 | 3 |
| 2017-01-01 00:00:00 | 1 | 2 | 3 |
| 2017-01-03 00:00:00 | 1 | 1 | 4 |
| 2017-01-01 00:00:00 | 1 | 2 | 4 |
| 2017-01-02 00:00:00 | 1 | 1 | 5 |
+---------------------+--------+------+---------+
We now simply wrap this in another statement to select the avg where the row number is less than or equal to seven.
select e1.row day, avg(e1.rating) avg
from (
    select e.created_at,
           e.rating,
           if(@lastUser = user_id, @row := @row + 1, @row := 1) as row,
           @lastUser := e.user_id as user_id
    from entries e,
         ( select @row := 0, @lastUser := 0 ) vars
    order by e.user_id asc,
             e.created_at desc) e1
where e1.row <= 7
group by e1.row;
This outputs:
+------+--------+
| day | avg |
+------+--------+
| 1 | 1.0000 |
| 2 | 1.0000 |
| 3 | 1.0000 |
| 4 | 1.0000 |
| 5 | 1.0000 |
| 6 | 1.0000 |
| 7 | 1.0000 |
+------+--------+
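On MySQL 8 or later, the per-user row counter can be written with a window function instead of user variables. The following is a sketch, not the answer's method: the aliases ranked and rn are illustrative, and the ordering is ascending because the question asks for each user's first 7 entries (switch to DESC for the most recent 7, as in the queries above).

```sql
SELECT ranked.rn AS day,
       AVG(ranked.rating) AS average_rating
FROM (
    SELECT e.rating,
           -- number each user's entries 1, 2, 3, ... in order of creation
           ROW_NUMBER() OVER (PARTITION BY e.user_id
                              ORDER BY e.created_at ASC) AS rn
    FROM entries e
) AS ranked
WHERE ranked.rn <= 7
GROUP BY ranked.rn
ORDER BY ranked.rn;
```

This avoids relying on the evaluation order of variable assignments inside a SELECT, and it does not need LIMIT inside a subquery, so error 1235 does not come up.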

MySQL query SELECT FROM 2 tables, COUNT the most used

I have these 2 tables and I need to return the most used office. Note: 1 office can be used by more than 1 guy, and the column ido in TableB is populated from TableA.
It is probably a query with GROUP BY and DESC LIMIT 1.
TableA
| ido| office | guy |
---------------------
| 1 | office1| guy1|
| 2 | office2| guy2|
| 3 | office1| guy3|
| 4 | office1| guy4|
| 5 | office5| guy5|
| 6 | office2| guy6|
TableB
| idb| vizit | ido|
---------------------
| 1 | date | 4 |
| 2 | date | 2 |
| 3 | date | 5 |
| 4 | date | 6 |
| 5 | date | 1 |
| 6 | date | 6 |
Thanks!
You were correct that GROUP BY, LIMIT and DESC are useful here; they lead to a fairly straightforward query:
SELECT TableA.office
FROM TableA
JOIN TableB
ON TableA.ido = TableB.ido
GROUP BY TableA.office
ORDER BY COUNT(*) DESC
LIMIT 1
What it does is basically create rows with all valid combinations, counting the number of generated rows per office. A plain descending sort by that count will give you the most frequently used office.
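If the number of visits is also wanted, the count can be selected alongside the office. A small sketch of that variant, where the aliases a, b and visits are illustrative:

```sql
SELECT a.office,
       COUNT(*) AS visits  -- one joined row per visit to a guy in that office
FROM TableA a
JOIN TableB b
    ON a.ido = b.ido
GROUP BY a.office
ORDER BY visits DESC
LIMIT 1;
```

With the sample data, office2 is visited three times (ido 2, 6 and 6), office1 twice and office5 once, so the row returned is office2 with 3 visits. When two offices tie for first place, LIMIT 1 returns only one of them.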