I have a database for a chat application.
CREATE TABLE Users (uid int PRIMARY KEY, name text, phone text );
CREATE TABLE Messages (recipient int REFERENCES Users(uid), sender int
REFERENCES Users(uid), time timestamp NOT NULL, message text NOT NULL,
PRIMARY KEY (recipient, sender, time));
http://www.sqlfiddle.com/#!9/bd36d1
I want to find, for each of the 5 users who have sent the most messages, the average length of the messages sent by that user.
I have written the following query:
SELECT avg(strlen(message))
FROM Messages
WHERE sender IN
(SELECT *
FROM (SELECT sender, COUNT(sender) AS NumberOfMessages
FROM Messages
GROUP BY sender) AS MessagesPerSender
ORDER BY NumberOfMessages DESC
LIMIT 5)
To start with, is this query correct? Does it give me the desired result? The problem is that I can't run it at all, because I get the error:
"This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery"
That is not the right approach for MySQL. This may do:
select sender,avg(length(message)),count(*)
from messages
group by sender
order by avg(length(message)) desc limit 5;
+--------+----------------------+----------+
| sender | avg(length(message)) | count(*) |
+--------+----------------------+----------+
| 1 | 9.0000 | 1 |
| 9 | 5.5000 | 2 |
| 2 | 5.0000 | 1 |
+--------+----------------------+----------+
3 rows in set (0.00 sec)
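Note that this may not handle ties the way you want. Note also that it ranks senders by average message length; to rank by number of messages sent, order by count(*) desc instead.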
You had two errors in your code:
First, there is no strlen function in MySQL (nor in Microsoft SQL Server, which uses LEN). Use length instead.
Second, the subquery you used returns two columns instead of one. This makes the query fail, because the IN operator compares sender against the values of a single column.
So here is your query:
select u.name, avg(length(m.message)), count(*)
from Messages m
inner join Users u on m.sender = u.uid
group by u.name
order by avg(length(m.message)) desc limit 5;
I improved on P. Salmon's answer since I provided you with the name of the sender rather than their ID.
Hope this helps :)
To find out, I changed the DBMS from MySQL to Postgres, which supports LIMIT in an inner query. Your query has correct syntax except for the strlen() function; the correct one is length().
However, your query fails for a simple reason: you are doing a where sender in (subquery), although your subquery returns two fields. The in operator only works with single-field subqueries. Moreover, your subquery is composed of two queries, which can be simplified to one. The following query works on Postgres 9.6, and should work on any version of MySQL that supports LIMIT inside such a subquery:
SELECT avg(length(message))
FROM Messages
WHERE sender IN (
SELECT sender
FROM Messages
GROUP BY sender
ORDER BY COUNT(sender) DESC
LIMIT 5
)
It produces the following result when run on your sample data:
+----------+
| avg |
+----------+
| 6.25 |
+----------+
Working SQL Fiddle (Postgres 9.6): http://www.sqlfiddle.com/#!17/bd36d/6/0
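For the per-sender averages the question asks for, one option that avoids LIMIT inside IN is to join against a derived table that holds the top senders. The following is a minimal sketch, run through Python's sqlite3 module only so that it is self-contained; the function names and the sample rows are made up for illustration, and the statement has not been tested against MySQL here.

```python
import sqlite3


def build_messages_db():
    """Create an in-memory Messages table with hypothetical sample rows.

    Sender s sends (7 - s) messages, each s characters long, so the
    message counts are 6, 5, 4, 3, 2, 1 for senders 1..6.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Messages (recipient INT, sender INT, time INT NOT NULL, "
        "message TEXT NOT NULL, PRIMARY KEY (recipient, sender, time))"
    )
    tick = 0
    for sender in range(1, 7):
        for _ in range(7 - sender):
            tick += 1
            conn.execute(
                "INSERT INTO Messages VALUES (?, ?, ?, ?)",
                (0, sender, tick, "x" * sender),
            )
    return conn


def top_sender_avg_lengths(conn, k=5):
    """Return (sender, average message length, message count) for the k
    senders with the most messages, most active sender first."""
    sql = """
        SELECT m.sender, AVG(LENGTH(m.message)) AS avg_len, top.n
        FROM Messages AS m
        JOIN (SELECT sender, COUNT(*) AS n
              FROM Messages
              GROUP BY sender
              ORDER BY n DESC
              LIMIT ?) AS top ON top.sender = m.sender
        GROUP BY m.sender, top.n
        ORDER BY top.n DESC, m.sender
    """
    return conn.execute(sql, (k,)).fetchall()
```

Calling top_sender_avg_lengths(build_messages_db()) returns one row per top sender. Ties at the cut-off are still broken arbitrarily by LIMIT.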
I am having trouble getting records out of the database in a specific order. I have a table 'test' whose rows I want to list, and a table 'drivers', joined on driverid, which I use to control the sort order of the listing from 'test'.
My query:
SELECT * FROM test JOIN drivers ON test.driverid=drivers.driverid ORDER BY queno
Table 'drivers' looks like:
driver | driverid | queno
-------------------
drv1 | 15 | 3
drv2 | 30 | 1
drv3 | 40 | 2
The problem is that when there is no value assigned to 'driverid' in the 'test' table, those results are listed at the very beginning. I would like to have them listed at the end.
How can I achieve that? Thanks in advance!
You can make drivers.driverid the primary key (PK) and test.driverid a foreign key (FK) to enforce data integrity. This will also eliminate your problem.
Place a minus sign (-) before the column name and switch ASC to DESC, or DESC to ASC (the opposite of the order you want).
Try this:
SELECT * FROM test JOIN drivers ON test.driverid=drivers.driverid ORDER BY -queno DESC;
Note: while this may work well for numbers and dates, it may not be the best solution for sorting fields with alphabetic or alphanumeric values.
I found a working solution elsewhere:
MySQL: Order by field, placing empty cells at end
SELECT * FROM test JOIN drivers ON test.driverid=drivers.driverid ORDER BY if(queno = '' or queno is null,1,0), queno
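The "flag first, value second" ordering can be tried out in isolation. Below is a minimal sketch using Python's sqlite3 module; the LEFT JOIN (so that rows without a driverid survive the join), the id column, and the sample rows in 'test' are assumptions for illustration, while the 'drivers' rows are the ones from the question.

```python
import sqlite3


def build_drivers_db():
    """Create in-memory 'test' and 'drivers' tables with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE drivers (driver TEXT, driverid INT, queno INT)")
    conn.execute("CREATE TABLE test (id INT, driverid INT)")
    conn.executemany(
        "INSERT INTO drivers VALUES (?, ?, ?)",
        [("drv1", 15, 3), ("drv2", 30, 1), ("drv3", 40, 2)],
    )
    # Row 2 has no driverid assigned (hypothetical data).
    conn.executemany(
        "INSERT INTO test VALUES (?, ?)",
        [(1, 15), (2, None), (3, 30), (4, 40)],
    )
    return conn


def list_tests_nulls_last(conn):
    """Sort by queno ascending, with rows lacking a queno at the end.

    (queno IS NULL) evaluates to 0 for real values and 1 for NULL, so
    sorting on it first pushes the NULL rows to the bottom.
    """
    sql = """
        SELECT t.id, d.driver, d.queno
        FROM test AS t
        LEFT JOIN drivers AS d ON t.driverid = d.driverid
        ORDER BY (d.queno IS NULL), d.queno
    """
    return conn.execute(sql).fetchall()
```

With the sample data, the row without a driver comes last and the others follow queno order.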
I need to make several select statements to get simple data (only one row containing one or several fields for each select).
Simplified example:
select name, price from article where id=125
select log, email from user where uid=241
I want to process only one single statement from php side (or: I do NOT want to prepare several statements, execute several statements, catch and handle exceptions for each execution and finally fetch result for each statement...).
I tried:
select * from (
(select name, price from article where id=125) as a,
(select log, email from user where uid=241) as b
)
which works great if every subselect returns values:
name | price | log | email
------------------------------------------
dummy | 12,04 | john | john#example.com
But if one of the subselects returns empty, the whole select returns empty.
What I want is: null values for empty resulting subselects.
I tried many things with ifnull() and coalesce(), but couldn't get the expected result (I know how to use them with null values, but I didn't find a way to deal with an empty result set).
I finally found a solution with left joins:
select * from (
(select 1) as thisWillNeverReturnEmpty
left join (select name, price from article where id=125) as a on 1
left join (select log, email from user where uid=241) as b on 1
)
which works perfectly even if one of the subqueries returns empty (or even both, therefore the "select 1").
Another way I found on SO would be to add a count(*) in each subquery to make sure there's a value.
But it all looks quite dirty and I can't believe there's no simple way just using something like ifnull().
What is the right way to do it?
The best way I finally found was:
select * from (
(select count(*) as nbArt, name, price from article where id=125) as a,
(select count(*) as nbUser, log, email from user where uid=241) as b
)
This way, no subquery ever returns empty, which solves the problem (there's always at least a "zero" count followed by null values).
Sample result when no article is found:
nbArt | name | price | nbUser | log | email
----------------------------------------------------------------
0 | null | null | 1 | john | john#example.com
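The left-join variant from the question can be sketched as a single parameterized statement. This is a minimal illustration using Python's sqlite3 module rather than PHP; the function names, the "anchor" column alias, and the sample rows are assumptions.

```python
import sqlite3


def build_shop_db():
    """Create in-memory 'article' and 'user' tables with one row each."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE article (id INT PRIMARY KEY, name TEXT, price REAL)")
    conn.execute("CREATE TABLE user (uid INT PRIMARY KEY, log TEXT, email TEXT)")
    conn.execute("INSERT INTO article VALUES (125, 'dummy', 12.04)")
    conn.execute("INSERT INTO user VALUES (241, 'john', 'john#example.com')")
    return conn


def fetch_article_and_user(conn, article_id, user_id):
    """Fetch both lookups in one statement.

    The one-row base table guarantees a result row; each lookup is
    LEFT JOINed to it, so a lookup that finds nothing yields NULLs
    instead of emptying the whole result.
    """
    sql = """
        SELECT a.name, a.price, b.log, b.email
        FROM (SELECT 1 AS anchor) AS base
        LEFT JOIN (SELECT name, price FROM article WHERE id = ?) AS a ON 1 = 1
        LEFT JOIN (SELECT log, email FROM user WHERE uid = ?) AS b ON 1 = 1
    """
    return conn.execute(sql, (article_id, user_id)).fetchone()
```

A missing article or user shows up as None values in the returned tuple, so the caller can check each half independently.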
I'm trying to write a query that returns a fixed number of results in a group concat. I don't think it's possible with a group concat, but I'm having trouble figuring out what sort of subquery to add.
Here's what I would like to do:
Query
select id,
group_concat(concat(user,'-',time) order by time limit 5)
from table
where id in(1,2,3,4)
group by 1
When I remove the "limit 5" from the group concat, the query works but spits out way too much information.
I'm open to structuring the query differently. Specific ID numbers will be supplied by the user of the query, and for each ID specified, I would like to list a fixed number of results. Let me know if there is a better way to achieve this.
Not sure the exact result set you want, but check out this SO post:
How to hack MySQL GROUP_CONCAT to fetch a limited number of rows?
As another example, I tried out the query/solution provided in the link and came up with this:
SELECT user_id, SUBSTRING_INDEX(GROUP_CONCAT(DISTINCT date_of_entry),',',5) AS logged_dates FROM log GROUP BY user_id;
Which returns:
user_id | logged_dates
1 | "2014-09-29,2014-10-18,2014-10-05,2014-10-12,2014-10-19"
2 | "2014-09-12,2014-09-03,2014-09-23,2014-09-22,2014-10-13"
3 | "2014-09-10"
6 | "2014-09-29,2014-09-27,2014-09-26,2014-09-25"
8 | "2014-09-26,2014-09-30,2014-09-27"
9 | "2014-09-28"
13 | "2014-09-29"
22 | "2014-10-12"
The above query will return every user id that has logged something, and up to 5 dates that the user has logged. If you want more or fewer results from the group concat, just change the number 5 in my query.
Following up, and merging my query with yours, I get:
SELECT user_id, SUBSTRING_INDEX(GROUP_CONCAT(date_of_entry ORDER BY date_of_entry ASC),',',3) AS logged_dates FROM log WHERE user_id IN(1,2,3,4) GROUP BY user_id
Which would return (notice that I changed the number of results returned from the group_concat):
user_id | logged_dates
1 | "2014-09-16,2014-09-17,2014-09-18"
2 | "2014-09-02,2014-09-03,2014-09-04"
3 | "2014-09-10"
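One caveat with the SUBSTRING_INDEX approach is that GROUP_CONCAT still builds the full string first, and MySQL truncates that string at group_concat_max_len (1024 bytes by default). An alternative is to pick the first N rows per group before concatenating. The sketch below swaps in a different technique, a correlated COUNT(*) subquery, and does the joining in Python via sqlite3; the function names and sample rows are made up, and it assumes a user never has two entries with the same date.

```python
import sqlite3
from itertools import groupby


def build_log_db():
    """Create an in-memory 'log' table with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (user_id INT, date_of_entry TEXT)")
    conn.executemany(
        "INSERT INTO log VALUES (?, ?)",
        [
            (1, "2014-09-18"),
            (1, "2014-09-16"),
            (1, "2014-09-17"),
            (1, "2014-09-20"),
            (2, "2014-09-02"),
        ],
    )
    return conn


def first_n_dates_per_user(conn, n):
    """Return {user_id: 'date,date,...'} with at most n earliest dates each.

    A row is kept when fewer than n rows of the same user have an
    earlier date, i.e. when it is among that user's first n entries.
    """
    sql = """
        SELECT l.user_id, l.date_of_entry
        FROM log AS l
        WHERE (SELECT COUNT(*)
               FROM log AS earlier
               WHERE earlier.user_id = l.user_id
                 AND earlier.date_of_entry < l.date_of_entry) < ?
        ORDER BY l.user_id, l.date_of_entry
    """
    rows = conn.execute(sql, (n,)).fetchall()
    # Rows arrive sorted by user_id, so groupby sees each user once.
    return {
        user_id: ",".join(date for _, date in group)
        for user_id, group in groupby(rows, key=lambda row: row[0])
    }
```

The correlated subquery is quadratic per user, so this is only a sketch for modest group sizes.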
I have a table hardBrake with the following schema:
Column Name | Type
---------------------------------
id | CHAR(36), primary key
vehicleId | CHAR(36)
address | VARCHAR(50)
time | TIMESTAMP
When I run my query, which groups on the time column using the day() function, EXPLAIN EXTENDED on the select query shows Using temporary and Using filesort, along with Using where and Using index.
I have an index on the time + vehicleId columns, and my select query is:
select count(1),CONVERT_TZ(time,'+00:00', :offset) as dateOfIncident
from hardBrake
where vehicleId in (vehicleIds)
and time between NOW() - INTERVAL 30 DAY and NOW()
group by day(dateOfIncident )
order by time DESC;
I pass the offset and vehicleIds values from my Java code: offset is the time zone difference between the user's time zone and the database time zone (which is GMT), and vehicleIds are the customer's vehicle ids.
Is it possible to remove temporary and filesort for a query that uses group by on a function?
A sneak peek of my problem:
I want to get the hardBrake incidents for a customer, date-wise, in his time zone. As an alternative, can I make any changes on the Java side without having to handle the time zone in MySQL?
I am afraid this query cannot avoid the temporary table.
Refer to this link: http://dev.mysql.com/doc/refman/5.7/en/group-by-optimization.html
The most important preconditions for using indexes for GROUP BY are that all GROUP BY columns reference attributes from the same index, and that the index stores its keys in order .....
Since the query uses the function CONVERT_TZ(time,'+00:00', :offset) to obtain the GROUP BY values, MySQL cannot retrieve these values from the index; it must calculate them on the fly using the function and store them in a temporary table.
The same problem applies to the filesort;
read this link: http://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html
In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that match the WHERE clause. These cases include the following:
..........
..........
- You have different ORDER BY and GROUP BY expressions.
The query has different GROUP BY and ORDER BY expressions:
group by day(dateOfIncident )
order by time DESC;
Therefore MySQL cannot use the index and must use a filesort.
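On the question of doing the time zone work outside MySQL: one possibility is to select the raw UTC timestamps with a plain range condition and bucket them by local calendar date in application code. The sketch below is in Python rather than Java, and the function name and the offset-in-minutes parameter are assumptions for illustration. It groups by full local date, not by day of month as day() does.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def count_by_local_day(utc_times, offset_minutes):
    """Count incidents per calendar date in the user's time zone.

    utc_times: naive datetime objects that are known to be in UTC
    offset_minutes: the user's fixed offset from UTC, in minutes
    """
    user_tz = timezone(timedelta(minutes=offset_minutes))
    counts = Counter()
    for ts in utc_times:
        # Mark the naive value as UTC, then shift it to the user's offset.
        local = ts.replace(tzinfo=timezone.utc).astimezone(user_tz)
        counts[local.date()] += 1
    return dict(counts)
```

A fixed offset ignores daylight saving changes inside the 30-day window; a named zone would be needed to handle those.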
I have the following schema:
id | order_ref | description | price
Currently I have the following duplicate issue:
1 | 34567 | This is the description | 19.99
2 | 34567 | This is the description | 13.99
This was due to the data I was importing having the description for each item duplicated. Is there a way I can keep the first row, and then UPDATE the description on subsequent (up to approx 20 rows) to be 'AS ABOVE'?
1 | 34567 | This is the description | 19.99
2 | 34567 | - AS ABOVE - | 13.99
Thanks
-------UPDATED
UPDATE documents_orders_breakdown
SET `desc` = '- AS ABOVE -'
WHERE NOT id IN (SELECT id
FROM documents_orders_breakdown AS D
WHERE D.`desc` <> `desc`
ORDER BY D.id
LIMIT 1)
But this returns [Err] 1235 - This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
--------UPDATED
UPDATE documents_orders_breakdown
SET `desc` = '- AS ABOVE -'
WHERE NOT id IN (SELECT MIN(id)
FROM documents_orders_breakdown AS t
WHERE t.`desc` = `desc`)
This now returns [Err] 1093 - You can't specify target table 'documents_orders_breakdown' for update in FROM clause
If this is a one-time thing, performance is not a big issue. You can run an UPDATE on all the records that are not returned by a SELECT with a LIMIT of 1.
UPDATE the_table
SET description = '- AS ABOVE -'
WHERE NOT id IN (SELECT id
FROM the_table t
WHERE t.description = the_table.description
ORDER BY t.id
LIMIT 1)
This query assumes you want to keep the description of the record whose id comes first (hence the ORDER BY).
Since you can't use LIMIT in subqueries, you can work around that by using the aggregate function MIN:
UPDATE the_table
SET description = '- AS ABOVE -'
WHERE NOT id IN (SELECT MIN(id)
FROM the_table t
WHERE t.description = the_table.description)
(Let's hope you can mix MIN and subqueries ;)
Apparently you can't SELECT from the table you're UPDATEing in MySQL. A workaround is to wrap the subquery in a derived table, which MySQL materializes as an implicit temporary table. A derived table generally cannot refer to the row being updated, so the inner query groups by description instead of correlating on it. This is bad for performance, but, again, given this is a one-time thing, that's not a big concern.
UPDATE the_table
SET description = '- AS ABOVE -'
WHERE id NOT IN (SELECT m FROM (SELECT MIN(id) AS m
FROM the_table
GROUP BY description) AS temp)
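A sketch of the derived-table form of the UPDATE, written with GROUP BY rather than a correlated reference, is shown below. It runs on SQLite through Python's sqlite3 module, which does not have MySQL's error 1093 restriction, so it only demonstrates what the statement does, not that the workaround is required. The function names and the third sample row are made up for illustration.

```python
import sqlite3


def build_orders_db():
    """Create an in-memory table with the duplicate rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE the_table (id INT PRIMARY KEY, order_ref INT, "
        "description TEXT, price REAL)"
    )
    conn.executemany(
        "INSERT INTO the_table VALUES (?, ?, ?, ?)",
        [
            (1, 34567, "This is the description", 19.99),
            (2, 34567, "This is the description", 13.99),
            (3, 40000, "Another description", 5.00),  # hypothetical row
        ],
    )
    return conn


def mark_repeated_descriptions(conn):
    """Keep the description on the lowest id of each group, replace the rest."""
    conn.execute(
        """
        UPDATE the_table
        SET description = '- AS ABOVE -'
        WHERE id NOT IN (SELECT m
                         FROM (SELECT MIN(id) AS m
                               FROM the_table
                               GROUP BY description) AS temp)
        """
    )
    return conn.execute(
        "SELECT id, description FROM the_table ORDER BY id"
    ).fetchall()
```

Grouping by description alone treats identical descriptions on different orders as duplicates; adding order_ref to the GROUP BY would keep them apart.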
Relational databases do not have a notion of "subsequent". Records in a table are not in any particular order. If you do not specify an order in a SELECT query, you have to assume that the records may be retrieved in an order that you do not expect.
The comment Oswald made about ordering (or lack thereof) of the rows is very important. You have no guarantee, period, that unsorted rows selected out of this table will be in the order you expect. This means that unless you specify the intended ordering every single time, things could be tagged 'AS ABOVE' even when this does not reflect reality. In addition, none of the solutions provided so far will deal properly with out-of-sequence records.
Overall, this sounds more like a database design issue (specifically, a normalization problem), than a query issue.
Ideally, the descriptions would be extracted to some master datatable (along with the necessary ids). Then, the choice about the description to use is left to when the 'SELECT' runs. This has the added benefit of making the 'AS ABOVE' safe for changes in ordering.
So, assuming that each distinct order_ref value has its own description (barring the 'AS ABOVE' bit), the tables can be refactored as follows:
id | order_ref | price
=======================
1 | 34567 | 19.99
2 | 34567 | 13.99
and
order_ref_fk | description
==========================================
34567 | "This is the description"
At this point, you join to the description table normally. Displaying a different description is usually a display issue regardless, to be handled by whatever program you have outputting the rows to display (not directly in the database).
If you insist on doing this in-db, you could write the SELECT in this vein:
SELECT Orders.id, Orders.order_ref, Orders.price,
COALESCE(Description.description, 'AS ABOVE')
FROM Orders
LEFT JOIN (SELECT order_ref, MIN(id) AS id
FROM Orders
GROUP BY order_ref) Ord
ON Ord.order_ref = Orders.order_ref
AND Ord.id = Orders.id
LEFT JOIN Description
ON Description.order_ref_fk = Ord.order_ref
ORDER BY Orders.order_ref, Orders.id
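The normalized layout and the COALESCE query can be exercised end to end. Below is a minimal sketch using Python's sqlite3 module; it writes the SELECT with two plain LEFT JOINs, which is an assumption about how best to express it portably, and the function names are made up for illustration. The rows are the ones from the question.

```python
import sqlite3


def build_normalized_db():
    """Create in-memory Orders and Description tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Orders (id INT PRIMARY KEY, order_ref INT, price REAL)")
    conn.execute(
        "CREATE TABLE Description (order_ref_fk INT PRIMARY KEY, description TEXT)"
    )
    conn.executemany(
        "INSERT INTO Orders VALUES (?, ?, ?)",
        [(1, 34567, 19.99), (2, 34567, 13.99)],
    )
    conn.execute(
        "INSERT INTO Description VALUES (34567, 'This is the description')"
    )
    return conn


def list_orders_with_descriptions(conn):
    """Show the description on the first row of each order_ref only.

    Ord matches just the lowest id per order_ref; for every other row
    it is NULL, so the Description join finds nothing and COALESCE
    falls back to 'AS ABOVE'.
    """
    sql = """
        SELECT Orders.id, Orders.order_ref, Orders.price,
               COALESCE(Description.description, 'AS ABOVE')
        FROM Orders
        LEFT JOIN (SELECT order_ref, MIN(id) AS id
                   FROM Orders
                   GROUP BY order_ref) AS Ord
               ON Ord.order_ref = Orders.order_ref
              AND Ord.id = Orders.id
        LEFT JOIN Description
               ON Description.order_ref_fk = Ord.order_ref
        ORDER BY Orders.order_ref, Orders.id
    """
    return conn.execute(sql).fetchall()
```

Because the 'AS ABOVE' text is computed at SELECT time, changing the ORDER BY or deleting the first row never leaves a stale marker in the stored data.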