LIMIT specialize - mysql

SELECT *, IFNULL(parent, id) AS p, IFNULL(reply_comment_id, id) AS r
FROM article_comments ORDER BY p ASC, r ASC, date DESC
I want to use LIMIT, but I want it to limit the result by "p" rather than by row.
In the image: ("p": 1, 1, 1, 1, 1) is ONE group, ("p": 2, 2, 2, 2, 2, 2, 2) is TWO, and so on.
For example, with LIMIT 1 I want only the ("p": 1, 1, 1, 1, 1) rows to be shown.

Here is the query I would use:
SELECT *
FROM article_comments
WHERE id = 1 OR parent = 1
ORDER BY parent ASC, reply_comment_id ASC, `date` DESC
Note that I see no need for your calculated columns here. When ordering by parent and reply_comment_id in ascending fashion, the NULL values will be first in sort.
Make sure you have indexes on the parent, reply_comment_id, and date fields (in addition to id, which I assume is the primary key).
Also, you might want to consider using a datetime field for your date column rather than a Unix timestamp. It is much more user-friendly when viewing the data in the database, as well as when querying date ranges. You can then write your filter like this:
WHERE `date` BETWEEN '2012-01-01 00:00:00' AND '2012-12-31 23:59:59'
instead of having to do timestamp conversions.
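To make the filter-by-thread idea concrete, here is a minimal sketch that runs the same WHERE and ORDER BY against an in-memory SQLite table from Python. SQLite is only a stand-in for MySQL here (both sort NULLs first in ascending order), and the sample rows and the thread_ids helper are made up for illustration.

```python
import sqlite3

# Hypothetical miniature of article_comments; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE article_comments (
        id INTEGER PRIMARY KEY,
        parent INTEGER,
        reply_comment_id INTEGER,
        "date" INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO article_comments VALUES (?, ?, ?, ?)",
    [
        (1, None, None, 100),  # top-level comment, thread 1
        (2, None, None, 110),  # top-level comment, thread 2
        (3, 1, None, 120),     # direct reply in thread 1
        (4, 1, 3, 130),        # reply to comment 3 in thread 1
        (5, 2, None, 140),     # reply in thread 2
    ],
)


def thread_ids(conn, thread_id):
    """Return the ids of one thread: the top-level comment plus its replies."""
    rows = conn.execute(
        """
        SELECT id
        FROM article_comments
        WHERE id = ? OR parent = ?
        ORDER BY parent ASC, reply_comment_id ASC, "date" DESC
        """,
        (thread_id, thread_id),
    ).fetchall()
    return [row[0] for row in rows]


thread_one = thread_ids(conn, 1)
print(thread_one)
```

The top-level comment comes out first because its parent is NULL, which is why the calculated p and r columns are not needed for this particular filter.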

Related

MySQL - Pull most recent value within date range for group of IDs

I have the query below
SELECT SUM(CAST(hd.value AS SIGNED)) as case_count
FROM historical_data hd
WHERE hd.tag_id IN (45,109,173,237,301,365,429)
AND hd.shift = 1
AND hd.timestamp BETWEEN '2018-04-10' AND '2018-04-11'
ORDER BY TIMESTAMP DESC
With this I'm trying to select a SUM of the value for each of the IDs passed, within the time frame in the BETWEEN condition, but using only the most recent row for each ID in that time frame. So the end result would be a SUM of the case_count values for each ID passed in, taken at the last timestamp that ID has in that date range.
I am having trouble figuring out HOW to accomplish this. My historical_data table is HUGE, however I do have very specific indexing on it that allows the queries to function fairly well - as well as partitioning on the table by YEAR.
Can anyone provide a pointer on how to get the data I need? I'd rather not loop over the list of IDs and run this query without the SUM and a LIMIT 1, but I guess I can if that's the only way.
Here is one method:
SELECT SUM(CAST(hd.value AS SIGNED)) as case_count
FROM historical_data hd
WHERE hd.tag_id IN (45, 109, 173, 237, 301, 365, 429) AND
hd.shift = 1 AND
hd.timestamp = (SELECT MAX(hd2.timestamp)
FROM historical_data hd2
WHERE hd2.tag_id = hd.tag_id AND
hd2.shift = hd.shift AND
hd2.timestamp BETWEEN '2018-04-10' AND '2018-04-11'
);
The optimal index for this query is on historical_data(shift, tag_id, timestamp).
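As a quick sanity check of the correlated-subquery approach, the sketch below runs the same shape of query against an in-memory SQLite table. This is an assumption-laden stand-in, not the real schema: the sample rows are invented, timestamps are stored as text, and SQLite spells the cast as INTEGER where MySQL uses SIGNED.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE historical_data "
    "(tag_id INTEGER, shift INTEGER, timestamp TEXT, value TEXT)"
)
conn.executemany(
    "INSERT INTO historical_data VALUES (?, ?, ?, ?)",
    [
        (45, 1, "2018-04-10 08:00:00", "10"),
        (45, 1, "2018-04-10 16:00:00", "12"),   # latest row for tag 45
        (109, 1, "2018-04-10 09:00:00", "7"),   # latest row for tag 109
        (109, 2, "2018-04-10 20:00:00", "99"),  # other shift, ignored
        (173, 1, "2018-04-12 09:00:00", "50"),  # outside the range, ignored
    ],
)

# For each tag, keep only the row whose timestamp equals that tag's
# maximum timestamp inside the range, then sum the values.
case_count = conn.execute(
    """
    SELECT SUM(CAST(hd.value AS INTEGER)) AS case_count
    FROM historical_data hd
    WHERE hd.tag_id IN (45, 109, 173)
      AND hd.shift = 1
      AND hd.timestamp = (SELECT MAX(hd2.timestamp)
                          FROM historical_data hd2
                          WHERE hd2.tag_id = hd.tag_id
                            AND hd2.shift = hd.shift
                            AND hd2.timestamp BETWEEN '2018-04-10' AND '2018-04-11')
    """
).fetchone()[0]
print(case_count)
```

Tag 173 contributes nothing because the subquery returns NULL for it, and a comparison with NULL never matches.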

Query Database Accurately Based on Timestamp

I am currently having an accuracy issue when querying price vs. time in a Google Big Query Dataset. What I would like is the price of an asset every five minutes, yet there are some assets that have an empty row for an exact minute.
For example, with VEN vs ICX, which are two cryptocurrencies, there might be a time at which price data is not available for a specific second. In my query, I am querying a database for every 300 seconds and taking the price data, yet some assets don't have a timestamp for 5 minutes and 0 seconds. Thus, I would like to get the last known price: a good price to use would be the one at 4 minutes and 58 seconds.
My query right now is:
SELECT MIN(price) AS PRICE, timestamp
FROM [coin_data]
WHERE coin="BTCUSD" AND TIMESTAMP_TO_SEC(timestamp) % 300 = 0
GROUP BY timestamp
ORDER BY timestamp ASC
This query results in this sort of gap in specific places:
Row((10339.25, datetime.datetime(2018, 2, 26, 21, 55, tzinfo=<UTC>)))
Row((10354.62, datetime.datetime(2018, 2, 26, 22, 0, tzinfo=<UTC>)))
Row((10320.0, datetime.datetime(2018, 2, 26, 22, 10[should be 5 for 5 min], tzinfo=<UTC>)))
This one should not be 10 in the last column as that is the minutes place and it should read 5 mins.
To select the row at the 5-minute mark if it exists, or the closest existing entry otherwise, you can use analytic (window) functions, which use OVER(), instead of aggregate functions, which use GROUP BY, as follows:
Group all rows into separate 5-minute groups.
Sort them by proximity to the desired time.
Select the first row from each partition.
Here I am using the OVER clause to create the window frames and sort the rows in them. RANK() then numbers the rows in each window frame in that sorted order.
Standard SQL
WITH
data AS (
SELECT *,
CAST(FLOOR(UNIX_SECONDS(timestamp)/300) AS INT64) AS timegroup
FROM
`coin_data` )
SELECT min(price) as min_price, timestamp
FROM
(SELECT *, RANK() OVER(PARTITION BY timegroup ORDER BY timestamp ASC) AS rank
FROM data)
WHERE rank = 1
group by timestamp
ORDER BY timestamp ASC
Legacy SQL
SELECT MIN(price) AS min_price, timestamp
FROM (
SELECT *,
RANK() OVER(PARTITION BY timegroup ORDER BY timestamp ASC) AS rank,
FROM (
SELECT *,
INTEGER(FLOOR(TIMESTAMP_TO_SEC(timestamp)/300)) AS timegroup
FROM [coin_data]) AS data )
WHERE rank = 1
GROUP BY timestamp
ORDER BY timestamp ASC
It seems that you have many prices for the same timestamp, in which case you may want to add another field to the OVER clause:
OVER(PARTITION BY timegroup, exchange ORDER BY timestamp ASC)
Notes:
Consider migrating to Standard SQL, which is the preferred SQL dialect for querying data stored in BigQuery. You can do that on single query basis, so you don't have to migrate everything at the same time.
My idea was to provide a general query that would illustrate the principle so I don't filter for empty rows, because it's not clear if they are null or empty string and it's not really necessary for the answer.
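The bucketing principle behind both queries can be sketched in plain Python, outside BigQuery. This is only an illustration under assumptions: the first_per_bucket and utc helpers are hypothetical, and the two prices at 22:05:02 and 22:07 are invented to fill the gap from the question. Like the rank = 1 filter, it keeps the earliest row of each 300-second bucket, which is the first price at or after the mark rather than the last price before it.

```python
from datetime import datetime, timezone


def first_per_bucket(rows, bucket_seconds=300):
    """Return the earliest (timestamp, price) row of each fixed-width time bucket."""
    chosen = {}
    for ts, price in sorted(rows):
        # Same role as the timegroup column: floor(unix_seconds / 300).
        bucket = int(ts.timestamp()) // bucket_seconds
        # Rows arrive in timestamp order, so the first one seen per bucket wins.
        chosen.setdefault(bucket, (ts, price))
    return [chosen[bucket] for bucket in sorted(chosen)]


def utc(hour, minute, second=0):
    """Build a UTC timestamp on the day used in the question's sample output."""
    return datetime(2018, 2, 26, hour, minute, second, tzinfo=timezone.utc)


rows = [
    (utc(21, 55), 10339.25),
    (utc(22, 0), 10354.62),
    (utc(22, 5, 2), 10341.10),  # made-up row: nothing exists exactly at 22:05:00
    (utc(22, 7), 10330.00),     # made-up row: same bucket as 22:05:02, so dropped
    (utc(22, 10), 10320.0),
]

picked = first_per_bucket(rows)
for ts, price in picked:
    print(ts.isoformat(), price)
```

If the last known price before each mark is what you need, the same structure works with the sort order inside each bucket reversed and the bucket boundaries shifted accordingly.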

MySQL fastest way to search by DATE if a record exist

I have found many ways to search for a MySQL record by DATE.
Method 1:
SELECT id FROM table WHERE datetime LIKE '2015-01-01%' LIMIT 1
Method 2 (same as method 1 + ORDER BY):
SELECT id FROM table WHERE datetime LIKE '2015-01-01%' ORDER BY datetime DESC LIMIT 1
Method 3:
SELECT id FROM table WHERE datetime BETWEEN '2015-01-01' AND '2015-01-01 23:59:59' LIMIT 1
Method 4:
SELECT id FROM table WHERE DATE_FORMAT( datetime, '%y.%m.%d' ) = DATE_FORMAT( '2015-01-01', '%y.%m.%d' )
Method 5 (I think is the slowest):
SELECT id FROM table WHERE DATE(`datetime`) = '2015-01-01' LIMIT 1
What is the fastest?
In my case the table has 1 million rows, and the date to search is always recent.
The fastest of the methods you've mentioned is
SELECT id
FROM table
WHERE datetime BETWEEN '2015-01-01' AND '2015-01-01 23:59:59'
LIMIT 1
This is made fast when you create an index on the datetime column. The index can be random-accessed to find the first matching row, and then scanned until the last matching row. So it's not necessary to read the whole table, or even the whole index. And, when you use LIMIT 1, it just reads the single row. Very fast, even on an enormous table.
Your other means of search apply a function to each row:
datetime LIKE '2015-01-01%' casts datetime to a string for each row.
Methods 4 and 5 use explicit functions (DATE_FORMAT() and DATE()) on the contents of each row.
The use of these functions defeats the use of indexes to find your data.
Pro tip: Don't use BETWEEN for date arithmetic because it handles the ending condition poorly. Instead use
WHERE datetime >= '2015-01-01'
AND datetime < '2015-01-02'
This performs just as well as BETWEEN and gets you out of having to write the last moment of 2015-01-01 explicitly as 23:59:59. That isn't correct with higher precision timestamps anyway.
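Here is a small sketch of why the half-open range is safer, run against an in-memory SQLite table from Python. Treat it as an approximation: SQLite compares these values as text rather than as a native DATETIME type, and the events table, its rows, and the index name are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, datetime TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        (1, "2015-01-01 00:00:00"),
        (2, "2015-01-01 23:59:59.500"),  # falls inside the last second of the day
        (3, "2015-01-02 00:00:00"),
    ],
)
conn.execute("CREATE INDEX idx_events_datetime ON events (datetime)")

# Closed range ending at 23:59:59: misses the fractional-second row.
between_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM events "
        "WHERE datetime BETWEEN '2015-01-01' AND '2015-01-01 23:59:59' "
        "ORDER BY id"
    )
]

# Half-open range: everything from midnight up to, but excluding, the next midnight.
half_open_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM events "
        "WHERE datetime >= '2015-01-01' AND datetime < '2015-01-02' "
        "ORDER BY id"
    )
]

print(between_ids, half_open_ids)
```

Both predicates compare the bare column against constants, so either one can use the index; only the end condition differs.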
The fastest way, assuming there's an index on the datetime column, is a variant of method 3 in which both range values are datetime literals:
SELECT id FROM table
WHERE datetime BETWEEN '2015-01-01 00:00:00' AND '2015-01-01 23:59:59'
LIMIT 1
Using literal of the same type as the column means there won't be any casting of the column to perform comparison, giving the best chance of using an index on the column. I have used this in production to great effect.

MySQL incorrect order of DECIMAL?

When I select data from a MySQL table and order it by a DECIMAL column DESCENDING, this is the order:
3, 2, 1, -1, 0
Why is this so?
How to correctly set the order so that it is:
3, 2, 1, 0, -1 ?
EDIT
Actually, the problem is with NULL data. This is the order it does:
3, 2, 1, -1, NULL, NULL
This is the desired order:
3, 2, 1, NULL, NULL, -1
Use COALESCE in your ORDER BY clause:
SELECT *
FROM tableName
ORDER BY COALESCE(columnName, 0) DESC
use the following syntax for that
ORDER BY column DESC
If you're ordering by a column with nulls, they always end up at the beginning or end of your results. Null means unknown, not zero.
If you want to treat them as zero you can use ifnull(column, 0) in your select statement.
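To see the two orderings side by side, here is a minimal sketch using an in-memory SQLite table from Python. SQLite stands in for MySQL (both put NULLs last in a plain descending sort), and the scores table and amount column are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (amount NUMERIC)")
conn.executemany(
    "INSERT INTO scores VALUES (?)",
    [(3,), (None,), (-1,), (1,), (None,), (2,)],
)

# Plain descending sort: NULLs end up after every real value.
plain = [row[0] for row in conn.execute(
    "SELECT amount FROM scores ORDER BY amount DESC")]

# Sorting on COALESCE(amount, 0) places NULLs where a zero would go,
# while the selected column still shows NULL.
as_zero = [row[0] for row in conn.execute(
    "SELECT amount FROM scores ORDER BY COALESCE(amount, 0) DESC")]

print(plain)
print(as_zero)
```

Note that COALESCE appears only in the ORDER BY, so the stored NULLs are not replaced in the output.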

Rotate table data with out update data

SELECT * FROM `your_table` LIMIT 0, 10
->This will display the first 1,2,3,4,5,6,7,8,9,10
SELECT * FROM `your_table` LIMIT 5, 5
->This will show records 6, 7, 8, 9, 10
I want to show the data as 2,3,4,5,6,7,8,9,10,1, and
the next day as 3,4,5,6,7,8,9,10,1,2,
and the day after that as 4,5,6,7,8,9,10,1,2,3.
Is it possible without updating any data in this table?
You can do this using the UNION syntax:
(SELECT * FROM `your_table` LIMIT 5, 5) UNION (SELECT * FROM `your_table`)
This will first select rows within your limit, and then combine the remainder from the second select. Note that you don't need to set a limit on the second select statement:
The default behavior for UNION is that duplicate rows are removed from the result. The optional DISTINCT keyword has no effect other than the default because it also specifies duplicate-row removal. With the optional ALL keyword, duplicate-row removal does not occur and the result includes all matching rows from all the SELECT statements.
I don't think this can be achieved using a simple SELECT (I may be wrong). I think you'll need a stored procedure.
You've tagged this as Oracle, though your SQL syntax would be invalid for Oracle because it doesn't support LIMIT.
However, here's a solution that will work in Oracle:
select *
from ( select row_number() over (order by user_id) as rn, -- plain ROWNUM would be assigned before the ORDER BY is applied
user_id
from admin_user
) X
where X.rn > :startRows
and X.rn <= :startRows + :limitRows
order by case when X.rn <= :baseRef
then X.rn + :limitRows
else
X.rn
end ASC
;
where :startRows and :limitRows are the values for your LIMIT, and :baseRef is a value between 0 and :limitRows-1 that should be incremented/cycled on a daily basis (i.e. on day 1 it should be 0; on day 2, 1; on day 10, 9; on day 11 it should revert to 0). You could use the current date, converted to Julian, and take the remainder when divided by :limitRows to automate calculating :baseRef.
(substitute your own column and table names as appropriate)
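The daily offset itself is easy to compute in application code before binding it to the query. A minimal sketch, assuming Python on the client side: base_ref is a hypothetical helper, and it uses the proleptic Gregorian day number from date.toordinal() instead of a Julian day number, which cycles the same way because both advance by one per day.

```python
from datetime import date


def base_ref(day, limit_rows):
    """Rotation offset in 0..limit_rows-1 that advances by one each calendar day."""
    return day.toordinal() % limit_rows


# Offsets for twelve consecutive days with a window of 10 rows.
offsets = [base_ref(date(2024, 1, d), 10) for d in range(1, 13)]
print(offsets)
```

The starting offset on any given date is arbitrary; what matters is that it steps by one per day and wraps after limit_rows days.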
Well, it might be a little bit late for the author of the question, but could be useful for people.
Short answer: It is possible to do the "spin" like author asked.
Long answer: [I'm going to explain for MySQL first - where I tested this]
Let's imagine that we have a table your_table (INT rn, ...). What you want is to sort it in a specific way (a "spin" beginning at rn = N). The first ordering condition is rn >= N desc. The idea (at least as I understand it) is that the boolean expression splits the table into two parts (rn >= N and rn < N), and sorting it in descending order puts the rn >= N part first. Then we order by rn ascending, which sorts each part independently. So here is our query:
select * from your_table where rn between 1 and 10
order by rn >= N desc, rn asc;
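A quick way to check this ordering is to run it against an in-memory SQLite table from Python. This is a sketch under assumptions: SQLite stands in for MySQL (both evaluate rn >= N to 1 or 0), the table holds only the rn column, and rotated is a made-up helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (rn INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO your_table VALUES (?)", [(i,) for i in range(1, 11)])


def rotated(conn, n):
    """Return rn values 1..10 rotated so that the list starts at rn = n."""
    rows = conn.execute(
        "SELECT rn FROM your_table "
        "WHERE rn BETWEEN 1 AND 10 "
        # (rn >= n) is 1 for the leading part and 0 for the wrapped-around part.
        "ORDER BY rn >= ? DESC, rn ASC",
        (n,),
    )
    return [row[0] for row in rows]


print(rotated(conn, 2))
print(rotated(conn, 3))
```

Nothing in the table is updated; only the bound value of N changes from day to day.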
If you don't have an rn column, you can always use the user-variable trick:
select t.*, @rownum := @rownum + 1 AS rn
from your_table t,
(SELECT @rownum := 0) r
where @rownum < 10 /* here be careful - we already increased by 1 the rownum */
order by @rownum >= N - 1 desc, /* another tricky place (cause we already increased rownum) */
@rownum asc;
I don't know if the last one is efficient, though.
For Oracle, you always can use rownum. And I believe that you will have the same result (I didn't test it!).
Hope it helps!