I have a simple MySQL table with about 5 columns. The number of rows in the table changes quite frequently.
One of these columns is named has_error, and is a column that has a value of either 1 or 0.
I want to create a single SQL query that will be the equivalent of the following simple equation:
(number of rows with has_error = 1 / total number of rows in the table) * 100
I can create the individual SQL queries (see below), but not sure how to put it all together.
SELECT COUNT(*) AS total_number_of_rows FROM my_table
SELECT COUNT(*) AS number_of_rows_with_errors FROM my_table WHERE has_error = 1
This is easy because you can just use AVG(has_error):
SELECT AVG(has_error) * 100
FROM my_table;
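Note that AVG ignores NULLs (it divides by the number of non-NULL rows), so if has_error is nullable and you want NULL counted as "no error", an equivalent explicit form is:

SELECT SUM(has_error) / COUNT(*) * 100 AS error_percentage
FROM my_table;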
Related
Is there a way to select a couple of columns using a LIMIT, but at the same time get the total count of rows in that table?
SELECT col1,col2 FROM table where col3=0 limit 500
Just like above: what do I add to get the total number of rows in the table?
Thank you
Try this; you only need to use one column in the count:
SELECT COUNT(col3) FROM table where col3=0
And here's a reference for more information: MySQL COUNT.
Note that, to keep it simple, you have to run two queries: this one for the count and yours for the records.
Your question lacks a precise problem statement.
My understanding of your question is as follows:
All of the following is necessary:
Need to get every row from the table, such that col3=0.
Need to get the total number of rows in the table (whether they satisfy col3=0 or not).
Need to limit your result set to columns col1, col2, and have at most 500 rows.
Need to execute a single query, rather than two separate queries.
If this is a correct interpretation of your question, then I would propose the following solution:
SELECT col1,col2, (SELECT COUNT(*) FROM table) AS total_rows FROM table where col3=0 limit 500
Such a query would produce a result in which at most 500 rows from your table that satisfy the condition col3=0 are present alongside total_rows, which gives the total number of rows in the table.
Example CSV of the result:
col1,col2,total_rows
a,b,1000
c,d,1000
d,e,1000
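If you are on MySQL 8.0 or later, a window function is another option; note that COUNT(*) OVER () is computed after WHERE but before LIMIT, so it returns the number of matching rows (the FOUND_ROWS-style count) rather than the total size of the table:

SELECT col1, col2, COUNT(*) OVER () AS matching_rows
FROM table
WHERE col3=0
LIMIT 500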
According to the MySQL documentation for FOUND_ROWS():
The SQL_CALC_FOUND_ROWS query modifier and accompanying FOUND_ROWS() function are deprecated as of MySQL 8.0.17; expect them to be removed in a future version of MySQL. As a replacement, consider executing your query with LIMIT, and then a second query with COUNT(*) and without LIMIT to determine whether there are additional rows.
Instead of these queries:
SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name WHERE id > 100 LIMIT 10;
SELECT FOUND_ROWS();
Use these queries instead:
SELECT * FROM tbl_name WHERE id > 100 LIMIT 10;
SELECT COUNT(*) FROM tbl_name WHERE id > 100;
COUNT(*) is subject to certain optimizations. SQL_CALC_FOUND_ROWS causes some optimizations to be disabled.
You can also check these related questions:
how-to-get-number-of-rows
mysql-what-is-the-row-count-information-function-for-select
I have two tables.
Table 1 has the fields id and book_ids:

id | book_ids
---+---------
1  | 1,2,3

Table 2 has all the book ids.
select *
from table2
where book_id in (select book_ids from table1 where id=1);

This statement is not returning all the book ids (1, 2, 3) from table2. Can anyone help?
You could use the FIND_IN_SET() function:
select *
from table2
where FIND_IN_SET(book_id, (select book_ids from table1 where id=1)) > 0;
Read the documentation I linked to for details on how that function works.
But only do this if your table remains small. Using this function spoils any opportunity to optimize the query with an index, so the larger your table gets, the worse the performance will be.
Also, FIND_IN_SET() doesn't work the way you expect if there are spaces in your comma-separated list.
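For example, since FIND_IN_SET() compares each list element as an exact string:

SELECT FIND_IN_SET('2', '1,2,3');   -- returns 2 (found at position 2)
SELECT FIND_IN_SET('2', '1, 2, 3'); -- returns 0, because ' 2' is not equal to '2'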
Try storing the table1 values in separate rows of a table instead of in a single field; then your SELECT ... IN query works.
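A minimal sketch of that normalized layout (table and column names here are hypothetical):

-- one row per (id, book_id) pair instead of a '1,2,3' string
CREATE TABLE book_links (
  id      INT NOT NULL,
  book_id INT NOT NULL,
  PRIMARY KEY (id, book_id)
);

INSERT INTO book_links (id, book_id) VALUES (1, 1), (1, 2), (1, 3);

-- the subquery now returns one book_id per row, so IN behaves as expected
-- and the optimizer can use indexes on both sides
SELECT *
FROM table2
WHERE book_id IN (SELECT book_id FROM book_links WHERE id = 1);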
I have a table in MySQL which I want to query in parallel, by executing multiple SELECT statements that select non-overlapping, equal parts of the table, like:
1. select * from mytable where col between 1 and 1000
2. select * from mytable where col between 1001 and 2000
...
The problem is that col in my case is a varchar. How can I split the query in this case?
In Oracle we could use NTILE in combination with rowids, but I didn't find a similar approach for MySQL.
That's why my thinking is to hash the col value and take it modulo the number of equal parts I want to have; alternatively, dynamically generated row numbers could be used. See the sketch below.
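For illustration, a minimal sketch of the hashing idea, assuming four parallel workers and using MySQL's built-in CRC32() as the hash; worker k keeps the rows in bucket k:

-- worker 0 of 4; the other workers run the same query with buckets 1, 2, 3
SELECT *
FROM mytable
WHERE MOD(CRC32(col), 4) = 0;

Note that each such query still reads every row to evaluate the hash, so this splits the work across connections rather than eliminating scans.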
What would be an optimal solution, considering that the table is big (xxxM rows) and I want to avoid full table scans for each of the queries?
You can use LIMIT for paging, so you will have:
1. select * from mytable limit 0, 1000
2. select * from mytable limit 1000, 1000
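Be aware that large OFFSET values get progressively slower, because the server still reads and discards all of the skipped rows. If the table has an indexed unique column (call it id, hypothetically), keyset pagination avoids that:

-- fetch the next chunk after the last id seen by this worker
SELECT *
FROM mytable
WHERE id > @last_seen_id
ORDER BY id
LIMIT 1000;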
You can cast the varchar column to an integer; note that in MySQL the syntax is CAST(col AS SIGNED), as INT is not a valid CAST target.
Without scanning the full table, it will produce results:
SELECT * FROM mytable
ORDER BY ID
OFFSET 0 ROWS
FETCH NEXT 100 ROWS ONLY
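Note that OFFSET ... FETCH is SQL-standard syntax (supported by SQL Server and Oracle, among others); MySQL does not accept it, so the MySQL equivalent would be:

SELECT * FROM mytable
ORDER BY ID
LIMIT 100 OFFSET 0;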
Can I divide the values in a column by the number of rows returned in the query, using a single query?
For example, I select some numeric value from a table based on some condition:
select value from table where ...
Let's say it returns N rows and I want the returned values divided by the number of rows returned:
select value / N from table where ...
It can be done with several queries (e.g. after the first query, query the number of rows and then run the query again). But can it be done in a single query, with some SQL trick and without duplicating the query condition, so that the actual selection (the WHERE part, which may be complicated) runs only once?
You can do it in a single query, but as far as I know, with MySQL you have to repeat the condition:
SELECT value / c.cnt
FROM table1
INNER JOIN (SELECT COUNT(*) AS cnt
            FROM table1
            WHERE [the same condition as in the main query]) AS c ON (1=1)
WHERE condition
Or you can just:
SELECT value/(SELECT COUNT(*) FROM table1 WHERE ...)
FROM table1
WHERE ...
I believe the optimizer should generate the same execution plan for both queries.
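If you are on MySQL 8.0 or later, a window function avoids repeating the condition entirely, since COUNT(*) OVER () counts the rows of the same filtered result set; a sketch:

SELECT value / COUNT(*) OVER () AS scaled_value
FROM table1
WHERE ...  -- the condition is written and evaluated only once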
I need to select sample rows from a set. For example, if my SELECT query returns x rows and x is greater than 50, I want only 50 rows returned, but not just the top 50: I want 50 rows that are evenly spread out over the result set. The table in this case records routes (GPS locations + DateTime).
I am ordering on DateTime and need a reasonable sample of the latitude and longitude values.
Thanks in advance
[ SQL Server 2008 ]
To get sample rows in SQL Server, use this query:
SELECT TOP 50 * FROM Table
ORDER BY NEWID();
If you want to get every n-th row (10th, in this example), try this query:
SELECT * From
(
SELECT *, (Dense_Rank() OVER (ORDER BY Column ASC)) AS Rank
FROM Table
) AS Ranking
WHERE Rank % 10 = 0;
More examples of queries selecting random rows for other popular RDBMS can be found here: http://www.petefreitag.com/item/466.cfm
Every n'th row to get 50 (the window function has to go in a derived table, since it can't be referenced directly in WHERE):
SELECT *
FROM (
    SELECT t.*, ROW_NUMBER() OVER () AS rn,
           (SELECT COUNT(*) FROM table) / 50 AS step
    FROM table t
) AS numbered
WHERE MOD(rn, step) = 0
FETCH FIRST 50 ROWS ONLY
And if you want a random sample, go with jimmy_keen's answer.
UPDATE:
In regard to the requirement for it to run on MS SQL, I think it should be changed to this (no MS SQL Server around to test, though):
SELECT TOP 50 *
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (ORDER BY t.DateTime) AS rn, -- SQL Server requires an ORDER BY inside OVER()
           (SELECT COUNT(*) FROM table) / 50 AS step
    FROM table t
) AS numbered
WHERE rn % step = 0
I suggest that you add a computed column to your result set that is obtained as a random number, and then select the top 50 rows sorted by that column. That will give you a random sample.
For example:
SELECT TOP 50 *, RAND(Id) AS Random
FROM SourceData
ORDER BY Random
where SourceData is your source data table or view. This assumes T-SQL on SQL Server 2008, by the way. It also assumes that you have an Id column with unique ids on your data source. If your ids are very low numbers, it is a good practice to multiply them by a large integer before passing them to RAND, like this:
RAND(Id * 10000000)
If you want a statistically correct sample, TABLESAMPLE is the wrong solution. A good solution, as I described here based on a Microsoft Research paper, is to create a materialized view over your table which includes an additional column like CAST(ROW_NUMBER() OVER (...) AS TINYINT) AS RAND_COL_. You can then add an index on this column, plus other interesting columns, and get statistically correct samples for your queries fairly quickly (by using WHERE RAND_COL_ = 1).
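A rough sketch of that idea, materialized into a table rather than a view (all names hypothetical; SQL Server's indexed views disallow ranking functions, so precomputing a table is one way to get the index):

-- ORDER BY NEWID() is one choice for the elided ORDER BY in the text above;
-- it assigns rows to the 0-255 buckets in random order
SELECT Id, Latitude, Longitude,
       CAST(ROW_NUMBER() OVER (ORDER BY NEWID()) % 256 AS TINYINT) AS RAND_COL_
INTO dbo.SourceDataSampled
FROM dbo.SourceData;

CREATE INDEX IX_SourceDataSampled_RandCol ON dbo.SourceDataSampled (RAND_COL_);

-- roughly a 1/256 sample, served by the index
SELECT * FROM dbo.SourceDataSampled WHERE RAND_COL_ = 1;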