I have a database with many rows and I would like to consecutively aggregate say 10 rows and calculate the average of one column. So row 1 to 10 will be average value no. one, row 11 to 20 will be average value no. two, etc.
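Can this be done in MySQL?
You'll need to GROUP BY FLOOR(something/10) to group each 10 rows. A primary auto-increment key without gaps would be best for this. Note that if the key starts at 1, FLOOR(something/10) puts only rows 1 to 9 in the first group; use FLOOR((something - 1)/10) to keep rows 1 to 10 together.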
SELECT FLOOR(something/10) as start_id, AVG(yourAmount)
FROM yourTable
GROUP BY FLOOR(something/10)
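A minimal sketch of the grouping idea, run against an in-memory SQLite database as a stand-in for MySQL. The table readings and its columns are hypothetical, and the ids are assumed to start at 1 with no gaps. SQLite's integer division plays the role of FLOOR() here.

```python
import sqlite3

# SQLite stands in for MySQL; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO readings (id, amount) VALUES (?, ?)",
    [(i, float(i)) for i in range(1, 26)],  # ids 1..25, amount equals id
)

# (id - 1) / 10 is integer division in SQLite, so ids 1-10 land in bucket 0,
# ids 11-20 in bucket 1, and so on. In MySQL this would be FLOOR((id - 1) / 10).
bucket_averages = conn.execute(
    """
    SELECT (id - 1) / 10 AS bucket, AVG(amount) AS avg_amount
    FROM readings
    GROUP BY bucket
    ORDER BY bucket
    """
).fetchall()

print(bucket_averages)  # [(0, 5.5), (1, 15.5), (2, 23.0)]
```

The last bucket only holds ids 21 to 25, so its average covers five rows rather than ten.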
I managed to solve the ordering issue by using aliases and group/order by those in one simple query.
SELECT FLOOR(id/10) AS id_tmp, id AS sorter, AVG(col_1) AS col_1_avg
FROM my_table
GROUP BY id_tmp
ORDER BY sorter DESC
I'm not sure why this works, but in MySQL 5.0 it does.
I am trying to do a simple test where I'm pulling from a table the information of a specific part number as such:
SELECT *
FROM table_name
WHERE part_no IN ('abc123')
This returns 25 rows. Now I want to count how many of them meet the "accepted" condition in a specific column, but only among the 10 most recent rows. My approach is to write it as follows:
Select Count(*)
FROM table_name
WHERE part_no IN ('abc123') AND lot IN ('accepted')
ORDER BY date DESC
LIMIT 10
I'm having a hard time getting the ORDER BY and LIMIT operations to work. I could use help just getting it to limit appropriately, and I can figure out the rest from there.
Edit: I understand that the operations are happening on the COUNT which only returns one row with a value; but I put the second clip to show where I am stuck in my thought process.
Your query SELECT Count(*) FROM ... will always return exactly one row.
It's not 100% clear what exactly you want to do, but if you want to know how many of the last 10 have been accepted, you could use a subquery - something like:
SELECT COUNT(*) FROM (
SELECT lot
FROM table_name
WHERE part_no IN ('abc123')
ORDER BY date DESC
LIMIT 10
) AS recent
WHERE lot IN ('accepted')
The inner query will return the 10 most recent rows for part abc123, then the outer query will count the accepted ones.
There are also other solutions (for example, you could have the inner query output a field that is 0 when the part is not accepted and 1 when the part is accepted, then take the sum). Depending on which exact dialect/database you are using, you may also have more elegant options.
SELECT COUNT(*) returns one row, so ORDER BY and LIMIT have no effect on the result.
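Both variants can be sketched as follows, using an in-memory SQLite database in place of MySQL. The table name parts and the sample data are made up for illustration; only the columns part_no, lot and date come from the question.

```python
import sqlite3

# SQLite stands in for MySQL; the table name and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part_no TEXT, lot TEXT, date TEXT)")
rows = []
for day in range(1, 26):  # 25 rows, one per day
    lot = "accepted" if day % 2 == 0 else "rejected"
    rows.append(("abc123", lot, f"2024-01-{day:02d}"))
conn.executemany("INSERT INTO parts VALUES (?, ?, ?)", rows)

# Variant 1: limit in the inner query, filter and count in the outer query.
accepted_count = conn.execute(
    """
    SELECT COUNT(*) FROM (
        SELECT lot FROM parts
        WHERE part_no = 'abc123'
        ORDER BY date DESC
        LIMIT 10
    ) AS recent
    WHERE lot = 'accepted'
    """
).fetchone()[0]

# Variant 2: turn each of the 10 most recent rows into 0 or 1 and sum them.
accepted_sum = conn.execute(
    """
    SELECT SUM(CASE WHEN lot = 'accepted' THEN 1 ELSE 0 END) FROM (
        SELECT lot FROM parts
        WHERE part_no = 'abc123'
        ORDER BY date DESC
        LIMIT 10
    ) AS recent
    """
).fetchone()[0]

print(accepted_count, accepted_sum)  # 5 5
```

The 10 most recent rows are days 16 to 25, of which five (the even days) are accepted.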
I am trying to understand how pagination works in MySQL. Given that a SELECT query has a lot of data to retrieve, how does adding different columns to the SELECT statement change the pagination?
e.g.
Select name from employee; vs Select name, employeeId from employee;
Will including employeeId in the select list help retrieve the data more efficiently, even though that field is not required? I am asking because employeeId is indexed.
Thanks
Pagination deals with rows, not columns, and is as simple as a LIMIT clause:
LIMIT 10 OFFSET 10
This gives you 10 rows starting from the 11th record (the first 10 are skipped).
There is no implicit pagination in MySQL. If you are looking to implement pagination based on your query results, the LIMIT clause may come in handy.
For example:
select name from employee limit a,b;
will return b rows from the table employee. These would be row numbers a+1 through a+b
select name from employee limit 0, 10
would return rows 1 through 10.
Using any number of columns will not affect the way you limit the number of rows, which is what is normally referred to as pagination.
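As a rough illustration of LIMIT-based pagination, here is a sketch using SQLite in place of MySQL. The helper fetch_page and the sample data are hypothetical; the ORDER BY is added because without it the database does not guarantee which rows end up on which page.

```python
import sqlite3

def fetch_page(conn, page, page_size):
    """Return the names on the given 1-based page (hypothetical helper)."""
    offset = (page - 1) * page_size  # rows to skip before this page
    cursor = conn.execute(
        "SELECT name FROM employee ORDER BY employeeId LIMIT ? OFFSET ?",
        (page_size, offset),
    )
    return [name for (name,) in cursor]

# SQLite stands in for MySQL; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employeeId INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(i, f"emp{i}") for i in range(1, 26)],  # 25 employees
)

first_page = fetch_page(conn, 1, 10)  # emp1 .. emp10
third_page = fetch_page(conn, 3, 10)  # emp21 .. emp25, a short last page
print(first_page[0], first_page[-1], third_page)
```

Selecting more or fewer columns changes how much data each row carries, but not which rows LIMIT and OFFSET pick.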
I was recently asked this question in an interview.
I tried this in MySQL and got the same final results.
All gave the number of rows in that particular table.
Can anyone explain the major difference between them?
Nothing really, unless you specify a field in a table or an expression within parentheses instead of constant values or *
Let me give you a detailed answer. COUNT gives you the number of non-null values of the given field or expression. Say you have a table named A.
select 1 from A
select 0 from A
select * from A
will all return the same number of records, namely the number of rows in table A. The output differs, though. Suppose the table has 3 records, with X and Y as field names:
select 1 from A will give you
1
1
1
select 0 from A will give you
0
0
0
select * from A will give you (assuming the table has two columns, X and Y)
X Y
-- --
value1 value1
value2 (null)
value3 (null)
So counting over any of the three returns the same number, unless you use
select count(Y) from A
Since Y has only one non-null value, you will get 1 as output.
COUNT(*) will count the number of rows, while COUNT(expression) will count non-null values in expression and COUNT(column) will count all non-null values in column.
Since both 0 and 1 are non-null values, COUNT(0)=COUNT(1) and they both will be equivalent to the number of rows COUNT(*). It's a different concept, but the result will be the same.
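A quick way to check this, sketched with SQLite as a stand-in for MySQL, on a three-row table A where column Y is NULL in two rows (the sample values are illustrative):

```python
import sqlite3

# SQLite stands in for MySQL; the sample values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (X TEXT, Y TEXT)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?)",
    [("value1", "value1"), ("value2", None), ("value3", None)],
)

# COUNT(*) counts rows; COUNT(1) and COUNT(0) count a constant that is never
# NULL, so they match COUNT(*); COUNT(Y) skips the rows where Y is NULL.
counts = conn.execute(
    "SELECT COUNT(*), COUNT(1), COUNT(0), COUNT(Y) FROM A"
).fetchone()

print(counts)  # (3, 3, 3, 1)
```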
Now - they should all perform identically.
In days gone by, though, COUNT(1) (or whatever constant you chose) was sometimes recommended over COUNT(*) because poor query optimisation code would make the database retrieve all of the field data prior to running the count. COUNT(1) was therefore faster, but it shouldn't matter now.
Since the expression 1 is a constant expression, they should always produce the same result, but the implementations might differ as some RDBMS might check whether 1 IS NULL for every single row in the group. This is still being done by PostgreSQL 11.3 as I have shown in this article.
I've benchmarked queries on 1M rows doing the two types of count:
-- Faster
SELECT count(*) FROM t;
-- 10% slower on PostgreSQL 11.3
SELECT count(1) FROM t;
One reason why people might use the less intuitive COUNT(1) could be that historically, it was the other way round.
The result will be the same. However, COUNT(*) may still be slower in some production environments, because database engines in production can stay in service for decades. I prefer COUNT(0), some people use COUNT(1), but I would not use COUNT(*): even if it is safe on modern database engines, I would rather not depend on the engine, especially when the difference is a single character. The code will also be more portable.
count(any integer value) is claimed to be faster than count(*); both count all rows, including those with null values.
count(column_name) omits nulls.
Example:
column name: id
values: 1, 1, null, null, 2, 2
count(0), count(1), count(*): result is 6
count(id): result is 4
Let's say we have table with columns
Table
-------
col_A col_B
System returns all column values (null and non-null) when we query
select col_A from Table
System returns the number of non-null values in the column when we query
select count(col_A) from Table
System returns the total number of rows when we query
select count(*) from Table
From the MySQL 5.6 manual:
InnoDB handles SELECT COUNT(*) and SELECT COUNT(1) operations in the same way. There is no performance difference.
12.19.1 Aggregate Function Descriptions
Checking the official documentation was the fastest way to settle this after I found many conflicting answers.
COUNT(*), COUNT(1) , COUNT(0), COUNT('Y') , ...
All of the above return the total number of records (including the null ones).
But COUNT('any constant') is sometimes said to be faster than COUNT(*); on modern engines there should be no difference.
I'm trying to find the sum of values in a particular column for the last ten rows selected by some criteria and ordered by some column. I tried the obvious:
SELECT SUM(column) AS abc FROM table WHERE criteria ORDER BY column DESC LIMIT 10
However this seems to sum the entire column!?
So after playing around this seems to work:
SELECT SUM(column) AS abc FROM (SELECT column FROM table WHERE criteria ORDER BY column DESC LIMIT 10) AS abc
My questions...
Why doesn't the more intuitive approach work?
I could access the result by using $data[0], but I prefer to have some meaningful variable. So why do I need to do AS abc twice?
Is there a tidier/better way to do the job?
I'm quite inexperienced with SQL queries so I would really appreciate any help.
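Because MySQL processes a query in the following order:
FROM -> WHERE -> GROUP BY -> HAVING -> ORDER BY -> LIMIT.
So LIMIT is applied after aggregation: it limits the aggregated result rows (here a single row holding the sum of everything), not the rows that go into the sum.
Regarding AS abc twice: every derived table (a subquery in FROM) must have an alias. This is a MySQL rule. The inner alias does not have to be abc; any name will do.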
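The difference between the two queries can be seen in a small sketch, run on SQLite instead of MySQL. The table scores and its values are hypothetical; the WHERE criteria from the question are left out to keep it short.

```python
import sqlite3

# SQLite stands in for MySQL; table name and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (value INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?)", [(v,) for v in range(1, 21)])

# SUM collapses all 20 rows into one result row first; LIMIT 10 then has
# nothing left to cut, so this is the sum of the whole column.
naive_total = conn.execute(
    "SELECT SUM(value) AS abc FROM scores ORDER BY value DESC LIMIT 10"
).fetchone()[0]

# Limiting inside a derived table picks the 10 rows first, then sums them.
# The derived table needs its own alias (here: top_rows).
top10_total = conn.execute(
    """
    SELECT SUM(value) AS abc FROM (
        SELECT value FROM scores ORDER BY value DESC LIMIT 10
    ) AS top_rows
    """
).fetchone()[0]

print(naive_total, top10_total)  # 210 155
```

Here 210 is the sum of 1 to 20, while 155 is the sum of the ten largest values, 11 to 20.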
What's the most efficient way to select the last n rows in a table using MySQL? The table contains millions of rows, and at any given time I don't know how large the table is (it is constantly growing). The table does have a column that is automatically incremented and used as a unique identifier for each row.
SELECT * FROM table_name ORDER BY auto_incremented_id DESC LIMIT n
Actually, the right way to get the last n rows in order is to use a subquery:
SELECT * FROM (SELECT id, title, description FROM my_table ORDER BY id DESC LIMIT 5) AS tbl
ORDER BY tbl.id ASC
This is the only way I know that returns them in the right order. The accepted answer is actually a solution for "Select first 5 rows from a set ordered by descending ID", but that is most probably what you need.
(Similar to marco's answer.)
My favourite is the MAX() function of MySQL, in a simple one-liner, though there are certainly other ways:
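A sketch of the "newest n rows, oldest first" pattern, using SQLite in place of MySQL. The helper last_n_rows and the sample rows are hypothetical.

```python
import sqlite3

def last_n_rows(conn, n):
    """Return the n rows with the highest ids, in ascending id order."""
    return conn.execute(
        """
        SELECT id, title FROM (
            SELECT id, title FROM my_table ORDER BY id DESC LIMIT ?
        ) AS tbl
        ORDER BY id ASC
        """,
        (n,),
    ).fetchall()

# SQLite stands in for MySQL; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [(i, f"title{i}") for i in range(1, 21)],  # ids 1..20
)

latest = last_n_rows(conn, 5)
print([row[0] for row in latest])  # [16, 17, 18, 19, 20]
```

The inner query decides which rows are kept; the outer ORDER BY only decides the order they are returned in.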
SELECT whatever FROM mytable WHERE id > (SELECT max(id)-10 FROM mytable);
... and you get everything above "last id minus 10", which is normally the last 10 entries of that table.
Putting MAX() in a subquery is a short way to avoid error 1111 ("Invalid use of group function"), and it is not limited to an auto_increment column (here id).
The MAX() function can be used in many ways.
Maybe order it by the unique id descending:
SELECT * FROM table ORDER BY id DESC LIMIT n
The only problem with this is that you might want to select in a different order, and this problem has made me have to select the last rows by counting the number of rows and then selecting them using LIMIT, but obviously that's probably not a good solution in your case.
Use ORDER BY to sort by the identifier column in DESC order, and use LIMIT to specify how many results you want.
You would probably also want to add a descending index (or whatever they're called in mysql) as well to make the select fast if it's something you're going to do often.
This is a lot faster when you have big tables because you don't have to order the entire table.
You just use id as a unique row identifier.
This is also more efficient when you have large amounts of data in some column(s), for example images (blobs). The ORDER BY in this case can be very time- and data-consuming.
select *
from TableName
where id > ((select max(id) from TableName) - NumberOfRowsYouWant)
order by id desc
The only problem is if rows have been deleted in the interval you want. In that case you wouldn't get the real "NumberOfRowsYouWant".
You can also easily use this to select n rows for each page when you need to show the table backwards across multiple web pages: multiply NumberOfRowsYouWant by the page number to get the lower bound for that page.
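The caveat about deleted rows can be made concrete with a small sketch on SQLite (standing in for MySQL). The table name and data are hypothetical, and the lower bound is assumed to be MAX(id) minus the number of rows wanted.

```python
import sqlite3

# SQLite stands in for MySQL; table name and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO TableName VALUES (?)", [(i,) for i in range(1, 21)])
conn.execute("DELETE FROM TableName WHERE id IN (15, 17)")  # create gaps

n = 5  # number of rows wanted

# Range on id: cheap, but it counts ids, not rows, so gaps shrink the result.
by_max = [row[0] for row in conn.execute(
    "SELECT id FROM TableName "
    "WHERE id > (SELECT MAX(id) FROM TableName) - ? ORDER BY id",
    (n,),
)]

# ORDER BY ... LIMIT counts actual rows, so it always returns n rows
# (as long as the table has at least n).
by_limit = [row[0] for row in conn.execute(
    "SELECT id FROM (SELECT id FROM TableName ORDER BY id DESC LIMIT ?) AS t "
    "ORDER BY id",
    (n,),
)]

print(by_max)    # [16, 18, 19, 20]
print(by_limit)  # [14, 16, 18, 19, 20]
```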
Here you can change the table name and column name according to your requirements. If you want to show the last 10 rows, put n=10; use n=20, n=30, etc. as needed.
select * from
(select * from employee
Order by emp_id desc limit n)
a Order by emp_id asc;