"A numeric filter is not working with one index while is working with another same index. What could be the reason?" - redisearch

The RediSearch numeric filter suddenly stopped working for one index in production, while it still works for an identical index on the same server.
The schema is as provided below, and I am trying to filter on the field ExpireAt.
I have already tried the queries below; all return 0 documents.
ft.search 9194_Cache "#ExpireAt:[0 inf]" Limit 0 2
ft.search 9194_Cache "#ExpireAt:[-1 inf]" Limit 0 2
ft.search 9194_Cache "#ExpireAt:[637024139268750000 637024139268750000]" Limit 0 2
(The schema, sample data for the top 2 records, and the query results for both indexes were attached as screenshots and are not reproduced here; the same query returned results on the other index.)
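Since even the open-ended ranges return 0 documents, a first diagnostic step is to confirm that the index contains documents at all and that ExpireAt is indexed as NUMERIC. A sketch of the checks, using standard RediSearch commands (not from the original post):
ft.info 9194_Cache
ft.search 9194_Cache "*" Limit 0 2
ft.search 9194_Cache "@ExpireAt:[-inf +inf]" Limit 0 2
FT.INFO reports num_docs and each field's type: if num_docs is 0, the documents were never indexed (or have expired), and if ExpireAt is missing or not NUMERIC, a numeric filter on it can never match. The wildcard query shows whether the index matches anything at all, independent of the filter.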

Related

MySQL - Convert single column into an array

I have a column of data, e.g. as follows:
select league_id from leagues
This gives me a single column (league_id) and 100+ rows for that column.
I want to convert it into a single cell (1 row, 1 column) with the following structure:
[1001, 1002, 42022, 203412, 24252, etc..]
Essentially converting the rows into one big array.
There must be a way of doing it, but I can't see how.
I'm using MariaDB 10.2.
You can use the GROUP_CONCAT() function for that.
Usage is straightforward:
id | val
---+-------
 1 | 1001
 2 | 1002
 3 | 42022
 4 | 203412
 5 | 24252
SELECT group_concat(val)
FROM tab
gives you
group_concat(val)
1001,1002,42022,203412,24252
(Note: before MariaDB 10.3.3 you cannot use the LIMIT clause inside GROUP_CONCAT, in case you need that.)
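The question asked for a bracketed array literal; if you need that exact shape, a small variation (a sketch reusing the same illustrative tab/val names) wraps the list:
SELECT CONCAT('[', GROUP_CONCAT(val SEPARATOR ', '), ']') AS league_ids
FROM tab;
which yields [1001, 1002, 42022, 203412, 24252].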

MYSQL slow duration or fetch time depending on "distinct" command

I have a pretty small, simple MySQL table for holding precalculated financial data. The table looks like:
refDate | instrument | rate | startDate | maturityDate | carry1 | carry2 | carry3
with 3 indexes defined as:
UNIQUE unique_ID (refDate, instrument)
INDEX refDate (refDate)
INDEX instrument (instrument)
There are about 10 million rows right now, though for each refDate there are only about 5,000 distinct instruments.
I have a query that self joins on this table to generate an output like:
refDate|rate instrument=X | rate instrument = Y| rate instrument=Z|....
basically returning time-series data that I can then run my own analytics on.
Here is the problem: my original query looked like:
Select distinct AUDSpot1yFq.refDate,AUDSpot1yFq.rate as 'AUDSpot1yFq',
AUD1y1yFq.rate as AUD1y1yFq
from audratedb AUDSpot1yFq inner join audratedb AUD1y1yFq on
AUDSpot1yFq.refDate=AUD1y1yFq.refDate
where AUDSpot1yFq.instrument = 'AUDSpot1yFq' and
AUD1y1yFq.instrument = 'AUD1y1yFq'
order by AUDSpot1yFq.refDate
Note: in the particular query I timed below, I was actually getting 10 different instruments, so the query was much longer, but it followed this same pattern of naming, inner joins, and WHERE conditions.
This was slow: in Workbench I timed it at a 7-8 second duration (but near-zero fetch time, as I have Workbench on the machine running the server). When I stripped the DISTINCT, the duration dropped to 0.25-0.5 seconds (far more manageable), and when I also stripped the ORDER BY it got even faster (<0.1 seconds, at which point I don't care). But my fetch time exploded to ~7 seconds. So in total I gained nothing; it has all become a fetch-time issue. When I run this query from the Python scripts that will be doing the lifting and work, I get roughly the same timing whether I include DISTINCT or not.
When I run an EXPLAIN on my cut-down query (which has the horrid fetch time) I get:
1 SIMPLE AUDSpot1yFq ref unique_ID,refDate,instrument instrument 39 const 1432 100.00 Using where
1 SIMPLE AUD1y1yFq ref unique_ID,refDate,instrument unique_ID 42 historicalratesdb.AUDSpot1yFq.refDate,const 1 100.00 Using where
1 SIMPLE AUD2y1yFq ref unique_ID,refDate,instrument unique_ID 42 historicalratesdb.AUDSpot1yFq.refDate,const 1 100.00 Using where
1 SIMPLE AUD3y1yFq ref unique_ID,refDate,instrument unique_ID 42 historicalratesdb.AUDSpot1yFq.refDate,const 1 100.00 Using where
1 SIMPLE AUD4y1yFq ref unique_ID,refDate,instrument unique_ID 42 historicalratesdb.AUDSpot1yFq.refDate,const 1 100.00 Using where
1 SIMPLE AUD5y1yFq ref unique_ID,refDate,instrument unique_ID 42 historicalratesdb.AUDSpot1yFq.refDate,const 1 100.00 Using where
1 SIMPLE AUD6y1yFq ref unique_ID,refDate,instrument unique_ID 42 historicalratesdb.AUDSpot1yFq.refDate,const 1 100.00 Using where
1 SIMPLE AUD7y1yFq ref unique_ID,refDate,instrument unique_ID 42 historicalratesdb.AUDSpot1yFq.refDate,const 1 100.00 Using where
1 SIMPLE AUD8y1yFq ref unique_ID,refDate,instrument unique_ID 42 historicalratesdb.AUDSpot1yFq.refDate,const 1 100.00 Using where
1 SIMPLE AUD9y1yFq ref unique_ID,refDate,instrument unique_ID 42 historicalratesdb.AUDSpot1yFq.refDate,const 1 100.00 Using where
I now realize DISTINCT is not required, and the ORDER BY is something I can throw out and sort in pandas once the output is in a DataFrame. That is great, but I don't know how to get the fetch time down. I'm not going to win any competency competitions on this website, but I have searched as much as I can and can't find a solution to this issue. Any help is greatly appreciated.
~cocoa
(I had to simplify the table aliases in order to read it:)
Select distinct
s.refDate,
s.rate as AUDSpot1yFq,
y.rate as AUD1y1yFq
from audratedb AS s
join audratedb AS y on s.refDate = y.refDate
where s.instrument = 'AUDSpot1yFq'
and y.instrument = 'AUD1y1yFq'
order by s.refDate
Index needed:
INDEX(instrument, refDate) -- To filter and sort, or
INDEX(instrument, refDate, rate) -- to also "cover" the query.
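For example, the covering version could be created like this (the index name is illustrative):
ALTER TABLE audratedb
    ADD INDEX idx_instrument_refdate_rate (instrument, refDate, rate);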
That assumes the query is not more complex than you said. I see that the EXPLAIN already has many more tables. Please provide SHOW CREATE TABLE audratedb and the entire SELECT.
Back to your questions...
DISTINCT is done one of two ways: (1) sort the table, then dedup, or (2) dedup in a hash in memory. Keep in mind that you are dedupping all 3 columns (refDate, s.rate, y.rate).
ORDER BY is a sort after gathering all the data. However, with the suggested index (not the indexes you had), the sort is not needed, since the index will get the rows in the desired order.
But... Having both DISTINCT and ORDER BY may confuse the Optimizer to the point where it does something 'dumb'.
You say that (refDate,instrument) is UNIQUE, but you do not mention a PRIMARY KEY, nor have you mentioned which Engine you are using. If you are using InnoDB, then PRIMARY KEY(instrument, refDate), in that order, would further speed things up, and avoid the need for any new index.
Furthermore, it is redundant to have an index on (a, b) and also one on (a). That is, your current schema does not need INDEX(refDate); after changing the PK, it is INDEX(instrument) that you would no longer need.
Bottom line: keep only
PRIMARY KEY(instrument, refDate),
INDEX(refDate)
and no other indexes (unless you can show some query that needs it).
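As DDL, that restructuring might look like the following sketch (assuming InnoDB and that the table has no explicit PRIMARY KEY yet; adjust if it does):
ALTER TABLE audratedb
    DROP INDEX unique_ID,
    DROP INDEX instrument,
    ADD PRIMARY KEY (instrument, refDate);
-- INDEX(refDate) is kept as-is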
More on the EXPLAIN. Notice how the Rows column says 1432, 1, 1, ... That means that it scanned an estimated 1432 rows of the first table. This is probably far more than necessary because of lack of a proper index. Then it needed to look at only 1 row in each of the other tables. (Can't get better than that.)
How many rows in the SELECT without the DISTINCT or ORDER BY? That tells you how much work was needed after doing the fetching and JOINing. I suspect it is only a few. A "few" is really cheap for DISTINCT and ORDER BY; hence I think you were barking up the wrong tree. Even 1432 rows would be very fast to process.
As for the buffer_pool... How big is the table? Do SHOW TABLE STATUS. I suspect the table is more than 1GB, hence it cannot fit in the buffer_pool. Hence raising that cache size would let the query run in RAM, not hitting the disk (at least after it gets cached). Keep in mind that running a query on a cold cache will have lots of I/O. As the cache warms up, queries will run faster. But if the cache is too small, you will continue to need I/O. I/O is the slowest part of the processing.
I hope you have at least 6GB of RAM; otherwise a 2G buffer pool could be dangerously large. Swapping is really bad for performance.
The question doesn't mention existing indexes, or show the output from an EXPLAIN for any of the queries.
The quick answer to improve performance is to add an index:
... ON audratedb (instrument,refdate,rate)
To answer why we'd want to add that index, we'd need to understand how MySQL processes SQL statements, what operations are possible, and which are required. To see how MySQL is actually processing your statement, you need to use EXPLAIN to see the query plan.
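For example, prefixing the simplified two-instrument query from above with EXPLAIN shows whether the suggested index is chosen:
EXPLAIN
SELECT s.refDate, s.rate AS AUDSpot1yFq, y.rate AS AUD1y1yFq
FROM audratedb AS s
JOIN audratedb AS y ON s.refDate = y.refDate
WHERE s.instrument = 'AUDSpot1yFq'
  AND y.instrument = 'AUD1y1yFq';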

How to write search (list) queries in MySQL

There is a search page in a web application (pagination is used: 10 records per page). Database used: MySQL. The table has around 100,000 records. The query is tuned, in that it uses an index (checked the EXPLAIN plan). The result set fetches around 17,000 rows and takes around 5 seconds. Can anyone suggest how to optimize the search query? (Note: I tried using LIMIT, but the query time did not improve.)
Query Eg:
Select * from abc
Join def on abc.id=def.id
where date >= '2013-09-03'
and date <='2014-10-01'
and def.state=1
-- id on both tables is indexed
-- the date and state columns cannot be indexed, as they have low selectivity
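Low selectivity of a single column does not rule out composite indexes, which would be the usual first thing to try here. A sketch, assuming date lives on abc (the query does not qualify it) and with illustrative index names:
ALTER TABLE def ADD INDEX idx_state_id (state, id);
ALTER TABLE abc ADD INDEX idx_date_id (date, id);
With these, MySQL can resolve the state filter and the date range from the indexes and join on id before reading the full rows.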

Distinct with highest value using Django ORM

I have records in a table like this:
ID Name priority
1 MyString 12
2 Search 20
3 MyString 50
4 MyString 10
5 Search 7
I want to get distinct rows with the highest priority value. For example, the data above should give the following result:
ID Name priority
2 Search 20
3 MyString 50
I was going through the docs and found out that DISTINCT on specific columns is not supported on MySQL. So I tried to perform a GROUP BY and a sort (descending on the priority column).
I tried this
model_name.objects.all().values('name','priority','id').annotate(Count('search_name')).order_by('-priority')
But I am not getting the desired result. Is it possible to do this in a single ORM query?
I am using Django 1.6 and MySQL as my database.
Have you tried adding the .distinct() method to your query? I hope this helps:
https://docs.djangoproject.com/en/dev/ref/models/querysets/#distinct
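If .distinct() alone does not produce the desired rows (MySQL has no DISTINCT ON), the underlying SQL is the classic group-wise maximum pattern; a sketch, assuming the model's table is named items with columns id, name, priority:
SELECT t.id, t.name, t.priority
FROM items AS t
JOIN (
    SELECT name, MAX(priority) AS max_priority
    FROM items
    GROUP BY name
) AS m ON m.name = t.name AND m.max_priority = t.priority;
On the example data this returns rows 2 (Search, 20) and 3 (MyString, 50); it can be run through the ORM as a raw query.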

Why does it take MySQL more time to get records starting at a higher offset? [duplicate]

Possible Duplicate:
How can I speed up a MySQL query with a large offset in the LIMIT clause?
In our application we are showing records from MySQL on a web page. Like in most such applications we use paging. So the query looks like this:
select * from sms_message
where account_group_name = 'scott'
and received_on > '2012-10-11' and
received_on < '2012-11-30'
order by received_on desc
limit 200 offset 3000000;
This query takes 104 seconds. If I change the offset to a low value or remove it completely, it takes only half a second. Why is that?
There is only one compound index; it is on account_group_name, received_on, and two other columns. The table is InnoDB.
UPDATE:
Explain returns:
1 SIMPLE sms_message ref all_filters_index all_filters_index 258 const 2190030 Using where
all_filters_index is the 4-column index mentioned above.
Yes, this is true: the time increases as the offset value increases. The reason is that OFFSET cannot use the index to jump straight to row x; the engine must read through and discard all the rows from 0 to x before returning anything.
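That is also why the standard workaround is keyset ("seek") pagination: remember where the previous page ended and seek straight to it through the index instead of counting rows. A sketch against the schema above (assuming received_on is distinct enough to paginate on; the boundary timestamp is illustrative):
select * from sms_message
where account_group_name = 'scott'
and received_on > '2012-10-11'
and received_on < '2012-11-28 14:02:00' -- last received_on seen on the previous page
order by received_on desc
limit 200;
With the existing compound index on (account_group_name, received_on, ...), this reads only the 200 rows it returns instead of scanning and discarding 3,000,000 first.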