Cursor Based Pagination Across Multiple Tables [closed] - mysql

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I have 3 tables that are accessed as individual feeds and also a group feed.
For individual feeds, I can implement cursor-based pagination based on each row's unique id.
How would I implement cursor-based pagination for the group feed, which basically combines all 3 tables into 1 feed?
Each table has unique ids and a timestamp for when it was created (although this is not unique).
I've considered using the timestamp as a pointer, for example returning results after a particular timestamp, but this could miss results. If you request 10 rows after a timestamp and those rows all share the same timestamp with another 20 rows, the next request (for rows after that timestamp) will skip those 20 rows.
How can this problem be tackled?

Window functions.
MySQL 8.0 introduced support for standard SQL window functions. See https://dev.mysql.com/doc/refman/8.0/en/window-functions.html
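-- ORDER BY inside OVER() keeps the numbering stable between requests;
-- created_at and id are assumed column names.
SELECT *
FROM (
SELECT ..., ROW_NUMBER() OVER (ORDER BY created_at, id) AS rownum
FROM <multiple tables joined>
) AS t
WHERE rownum BETWEEN ? AND ?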
No need for LIMIT. You just use parameters to select the range of rows corresponding to the current "page" you want to view.
If you answer "but I haven't upgraded to MySQL 8.0 yet," then I would say now you have a good reason to upgrade.
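For comparison, the tie problem described in the question is often handled with a composite cursor rather than row numbers. The following is only a minimal sketch, not the answerer's method: it uses Python's sqlite3 module as a stand-in for MySQL, and the table names (posts, photos, events), the column names, and the fetch_page helper are all hypothetical. The cursor is the triple (created_at, source table, id), which stays unique across the combined feed even when many rows share a timestamp.

```python
import sqlite3

def make_feed_db():
    """Build three hypothetical feed tables; many rows share a timestamp on purpose."""
    conn = sqlite3.connect(":memory:")
    for name in ("posts", "photos", "events"):
        conn.execute(f"CREATE TABLE {name} (id INTEGER PRIMARY KEY, created_at INTEGER)")
    conn.executemany("INSERT INTO posts VALUES (?, ?)", [(i, 100) for i in range(1, 6)])
    conn.executemany("INSERT INTO photos VALUES (?, ?)", [(i, 100) for i in range(1, 4)])
    conn.executemany("INSERT INTO events VALUES (?, ?)", [(i, 100 + i) for i in range(1, 4)])
    return conn

# The WHERE clause means "strictly after the cursor" in (created_at, src, id) order.
FEED_SQL = """
SELECT created_at, src, id FROM (
    SELECT created_at, 'events' AS src, id FROM events
    UNION ALL
    SELECT created_at, 'photos' AS src, id FROM photos
    UNION ALL
    SELECT created_at, 'posts' AS src, id FROM posts
) AS feed
WHERE created_at > :ts
   OR (created_at = :ts AND src > :src)
   OR (created_at = :ts AND src = :src AND id > :id)
ORDER BY created_at, src, id
LIMIT :n
"""

def fetch_page(conn, cursor=(-1, "", 0), n=4):
    """Return (rows, next_cursor); next_cursor is the last row, or None when the feed is exhausted."""
    ts, src, id_ = cursor
    rows = conn.execute(FEED_SQL, {"ts": ts, "src": src, "id": id_, "n": n}).fetchall()
    next_cursor = rows[-1] if rows else None
    return rows, next_cursor
```

The client passes back the last row of a page as the cursor for the next request. Because the ordering is total, a page boundary that falls in the middle of a run of equal timestamps does not skip or repeat rows.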

Related

MySQL - need help .One table or multiple tables? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
I have to decide whether to split my table into several tables or keep everything in one table. By my estimate, a single table would grow by about 300,000 rows per year. Some people tell me to split the table by year, for example 2019_table.
Others say to split it into 4 tables (subcategories). I need advice on how to do this.
This is my current table https://ibb.co/jfZMKQJ
300K records is not really a large amount, and even over a decade, it is only 3 million records, which also is not very large. Assuming you can tune your database with appropriate indices, I don't see any reason to split into multiple tables. Even if you did have the need for this, you could try something like partitioning the table first (see the documentation).
300K records is not a large amount. Instead of splitting the table, you would be better off putting an index on your datetime field, assuming it is one of the fields you will use to filter your data.
See this answer for more details: Is it a good idea to index datetime field in mysql?
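As a rough illustration of the indexing advice, the sketch below builds a small table, indexes its datetime column, and inspects the query plan for a date-range filter. It uses Python's sqlite3 module rather than MySQL, and the orders table, its columns, and the index name are hypothetical, so the plan output only hints at what MySQL's EXPLAIN would report.

```python
import sqlite3

def build_db():
    """Create a hypothetical orders table with an index on its datetime column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders (created_at, amount) VALUES (?, ?)",
        [(f"2019-{m:02d}-15 12:00:00", 10.0 * m) for m in range(1, 13)],
    )
    conn.execute("CREATE INDEX idx_orders_created_at ON orders (created_at)")
    return conn

RANGE_QUERY = "SELECT id, amount FROM orders WHERE created_at >= ? AND created_at < ?"

def query_plan(conn, start, end):
    """Return the human-readable plan lines SQLite reports for the range query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + RANGE_QUERY, (start, end)).fetchall()
    return [row[-1] for row in rows]

def rows_in_range(conn, start, end):
    """Run the range query itself."""
    return conn.execute(RANGE_QUERY, (start, end)).fetchall()
```

If the plan mentions idx_orders_created_at, the filter is served from the index instead of a full table scan, which is the effect the answers rely on when they say splitting the table is unnecessary.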

Search for a recent added keywords [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Say there is a table that has 100 million records, and data keeps being added and updated. Your mission is to search that table every 30 seconds for a recently added keyword, say "srinu", and display it.
What is an efficient way to do this?
No need to write any code. Just give your views/thoughts on this.
This is a rather abstract question and will have a lot of opinionated answers.
What are the criteria for "recently added"?
If I needed a quick query to see which records were added within the last 30 seconds, I would create a trigger and a secondary lookup table:
after update and after insert, insert into recently_added.
Then I would create an event that deletes from recently_added where the datetime field is more than 30 seconds old, and run it every 30 seconds.
This cleanup step can be moved into the trigger instead, with a criterion added to the SELECT.
This way I would run SELECT * FROM recently_added. If no records are found, I know that no records were updated within the last 30 seconds. Otherwise, all of the keywords updated within the last 30 seconds are listed.
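The trigger-plus-lookup-table idea can be sketched roughly as follows. This is an illustration under assumptions, not the answerer's code: it uses Python's sqlite3 module instead of MySQL, the keywords table, the column names, and the helper functions are hypothetical, timestamps are plain integers in seconds, and the periodic MySQL event is replaced by an explicit purge_old call.

```python
import sqlite3

SCHEMA = """
CREATE TABLE keywords (id INTEGER PRIMARY KEY, keyword TEXT, updated_at INTEGER);
CREATE TABLE recently_added (keyword TEXT, touched_at INTEGER);

-- Copy every inserted or updated keyword into the small lookup table.
CREATE TRIGGER keywords_ai AFTER INSERT ON keywords
BEGIN
    INSERT INTO recently_added VALUES (NEW.keyword, NEW.updated_at);
END;

CREATE TRIGGER keywords_au AFTER UPDATE ON keywords
BEGIN
    INSERT INTO recently_added VALUES (NEW.keyword, NEW.updated_at);
END;
"""

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def recent_keywords(conn, now, window=30):
    """Keywords touched within the last `window` seconds (criterion in the SELECT)."""
    rows = conn.execute(
        "SELECT DISTINCT keyword FROM recently_added WHERE touched_at >= ?",
        (now - window,),
    ).fetchall()
    return sorted(r[0] for r in rows)

def purge_old(conn, now, window=30):
    """Stand-in for the scheduled event: drop lookup rows older than the window."""
    conn.execute("DELETE FROM recently_added WHERE touched_at < ?", (now - window,))
```

The polling query only ever touches recently_added, which stays small as long as the purge runs regularly, so its cost does not grow with the 100 million rows in the main table.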
Use Elasticsearch. First index the data from the database using a river, such as the MongoDB River Plugin for Elasticsearch, so that any new data added to the database is automatically synced to Elasticsearch. From there you can search for the most recently added documents.

Alternative method for using DISTINCT in mySQL [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I retrieve records from a Sphinx index using DISTINCT, over a large number of records. As a backup, I'm going to retrieve the records in bulk from MySQL.
Does the same DISTINCT query work well on the MySQL table too, or is there another way, other than DISTINCT, to retrieve data from a large MySQL table without a delay?
If the column is indexed, there isn't a difference between DISTINCT and GROUP BY. Some would argue that DISTINCT doesn't have to sort, but you can get the same behavior from GROUP BY by adding ORDER BY NULL in MySQL, so I would say they aren't different at all.
However, perhaps I am missing the question altogether.
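To make the comparison concrete, here is a small sketch showing that the two forms return the same set of values. It runs on SQLite through Python's sqlite3 module, with a hypothetical docs table and category column, so it only demonstrates that the results match; it says nothing about how MySQL executes or sorts either query.

```python
import sqlite3

def distinct_vs_group_by(values):
    """Return the value sets produced by SELECT DISTINCT and by GROUP BY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (category TEXT)")
    conn.execute("CREATE INDEX idx_docs_category ON docs (category)")
    conn.executemany("INSERT INTO docs VALUES (?)", [(v,) for v in values])
    via_distinct = conn.execute("SELECT DISTINCT category FROM docs").fetchall()
    via_group_by = conn.execute("SELECT category FROM docs GROUP BY category").fetchall()
    # Compare as sets because neither query promises an output order here.
    return {r[0] for r in via_distinct}, {r[0] for r in via_group_by}
```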

Using TOP, Limit while fetching data using ID from database [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
I read somewhere on the Internet that using TOP (MSSQL) or LIMIT (MySQL) in your query is best practice.
The author explained: if the database has millions of records and you use LIMIT, the database stops filtering further data once it has the records you requested with LIMIT or TOP.
My question: when you fetch a record using a WHERE condition on ID, does LIMIT or TOP make any difference, given that the ID is unique in the database?
A primary key is applied on that column.
SELECT TOP 1 *
FROM TABLE_MASTER
WHERE ID = 10
OR
SELECT *
FROM TABLE_MASTER
WHERE ID = 10 LIMIT 1
If this question has already been asked, please give me a link, as I was unable to find a Stack Overflow thread.
If you have a WHERE clause picking a specific row by a unique id, then the query is already restricted. It will scan only the single row matching the specific value. There is no benefit to using TOP or LIMIT in this case.
If someone says to you, "feature X is best practice" that doesn't mean you should use feature X even when it makes no difference.
Using TOP or LIMIT is useful if you have no condition in the WHERE clause, or a condition that would match a very large number of rows. Instead of returning thousands (or even millions) of rows you don't need, you can restrict the quantity of rows.
If there is any chance that ID is not unique, more than one record could be found, and having no LIMIT or TOP clause could break your code if you only expect one record. As such, it usually doesn't hurt to put the LIMIT / TOP clause in there just in case. If the ID is already a unique PK, it won't make any difference on an efficiently coded database engine (aka pretty much all of them).
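The point that LIMIT adds nothing to a primary-key lookup can be checked with a small experiment. The sketch below is an assumption-laden illustration using Python's sqlite3 module rather than MySQL or MSSQL: the table contents and the lookup_both_ways helper are hypothetical, and SQLite's plan text differs from what other engines print.

```python
import sqlite3

def lookup_both_ways(target_id):
    """Run the same primary-key lookup with and without LIMIT 1; return rows and plan text."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_master (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO table_master VALUES (?, ?)",
        [(i, f"row-{i}") for i in range(1, 101)],
    )
    plain = "SELECT * FROM table_master WHERE id = ?"
    limited = plain + " LIMIT 1"
    results = {}
    for label, sql in (("plain", plain), ("limited", limited)):
        rows = conn.execute(sql, (target_id,)).fetchall()
        plan = [r[-1] for r in conn.execute("EXPLAIN QUERY PLAN " + sql, (target_id,))]
        results[label] = (rows, plan)
    return results
```

Both variants return the same single row, and in both cases the plan is a direct search on the primary key, which matches the answers: the unique WHERE condition already restricts the work.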

Can I replace this code to use IN operator as well? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
Our code uses create/drop table while generating VR4 queue orders in our database.
When the number of websites is less than 250, the code uses the IN operator to generate reports: ce.website_id in (" . (join ",", #{$website_id}) . ")
When we have more than 250 websites, our code creates tables (with names like Temp_tablename) and uses a table join instead of the IN operator. Can I replace this code to use the IN operator as well? Will there be any performance issue if the IN operator is used with more input values?
As mentioned by Stan, using a temporary table rather than a large IN is the preferred way to go.
When MySQL gets a large data block from the user it stores it in a temporary table and uses a JOIN to look through it. This is easier for MySQL to do than to actually look for each of your values in the IN SQL part.
You can skip this temporary table by first storing your website list in a permanent table:
REPLACE INTO tblWebSitesToHandle
(Session_ID, WebSite)
VALUES
('**unique_number**', '**website_id**'),
('**unique_number**', '**website_id**'), ...
Here unique_number is some number you choose and throw away once the query ends. It identifies the list of websites that belongs to this particular query.
Then, in the SQL you are currently using, instead of IN (...) you JOIN to this table and select the rows with the relevant Session_ID.
After that is done, just delete the Session_ID data from tblWebSitesToHandle, since it is no longer needed (I believe).
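The three steps above (store the list under a session id, join against it, then clean up) can be sketched as follows. This is a hedged illustration only: it uses Python's sqlite3 module rather than MySQL, the ce table layout, the counting query, and the report_for_websites helper are hypothetical, and a random UUID stands in for the unique number.

```python
import sqlite3
import uuid

def make_db():
    """Hypothetical schema: ce holds per-website rows, tblWebSitesToHandle holds the lists."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ce (id INTEGER PRIMARY KEY, website_id INTEGER)")
    conn.execute(
        "CREATE TABLE tblWebSitesToHandle ("
        "Session_ID TEXT, WebSite INTEGER, PRIMARY KEY (Session_ID, WebSite))"
    )
    conn.executemany("INSERT INTO ce (website_id) VALUES (?)", [(w,) for w in (1, 1, 2, 3, 3, 3, 4)])
    return conn

def report_for_websites(conn, website_ids):
    """Count ce rows per website, filtering through a JOIN instead of a long IN list."""
    session_id = uuid.uuid4().hex  # throwaway identifier for this query's website list
    conn.executemany(
        "REPLACE INTO tblWebSitesToHandle (Session_ID, WebSite) VALUES (?, ?)",
        [(session_id, w) for w in website_ids],
    )
    try:
        return conn.execute(
            """
            SELECT ce.website_id, COUNT(*)
            FROM ce
            JOIN tblWebSitesToHandle AS h ON h.WebSite = ce.website_id
            WHERE h.Session_ID = ?
            GROUP BY ce.website_id
            ORDER BY ce.website_id
            """,
            (session_id,),
        ).fetchall()
    finally:
        # Remove this session's rows whether or not the query succeeded.
        conn.execute("DELETE FROM tblWebSitesToHandle WHERE Session_ID = ?", (session_id,))
```

Keying the rows by session id lets concurrent reports share one permanent table without creating and dropping a table per run.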