SQL: paging without repetition from a dynamic table - mysql

I have an SQL table called articles from which I load rows divided by pages. My SQL is
SELECT ...
ORDER BY $orderCol DESC
LIMIT $offset, $numPerPage
On page one the limit is 0, $numPerPage; on page two it's $numPerPage, $numPerPage; and so on (the first value is the offset, the second the number of rows).
The problem: When a new row is inserted before page 2 is loaded, the last article from page 1 will be the first article in page 2 etc. How can I avoid this?
I thought about adding a WHERE clause to select articles starting from the last $orderCol value, but this field is not unique (it's a date in my case), so I'd miss articles that share the same value. The primary index doesn't help either, because it isn't ordered the same way as $orderCol.
The newly added row doesn't have to appear while paging; seeing it can require a refresh.

Your LIMIT should look more like the one below. Define a $page variable that runs from 1 to the number of pages you want.
LIMIT $offset, $page * $numPerPage
OK, in that case you will have to recalculate your $pages and $numPerPage variables on every refresh and define the paging accordingly.

The solution I found is to add a condition in the WHERE clause.
Rahul suggested that I count the new articles since the previous query and add that number to the offset. This was tricky, because counting the new articles implies a query that stops where the previous one began, and if I had a way to do that, I could just as easily have made the second query start where the first ended.
So I realized I needed a new column. I called it date-added, and the new condition for all subsequent queries is date-added < $time-of-first-query. The offset is then just the number of articles displayed so far.
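The accepted approach can be sketched as follows. This is a minimal illustration, not the poster's actual code: it uses Python's built-in sqlite3 in place of MySQL, and the table layout, the `published` column (standing in for $orderCol), the `date_added` spelling (hyphens are not valid in unquoted column names) and the `fetch_page` helper are all assumptions made for the example. The extra `id DESC` tiebreaker is also an addition, so that rows with the same date keep a stable order between queries.

```python
import sqlite3

def fetch_page(conn, snapshot_time, page, per_page):
    """Return the ids on one page, ignoring rows added at or after snapshot_time."""
    offset = (page - 1) * per_page
    rows = conn.execute(
        """
        SELECT id FROM articles
        WHERE date_added < ?
        ORDER BY published DESC, id DESC  -- id breaks ties between equal dates
        LIMIT ? OFFSET ?
        """,
        (snapshot_time, per_page, offset),
    ).fetchall()
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, published TEXT, date_added INTEGER)"
)
conn.executemany(
    "INSERT INTO articles (id, published, date_added) VALUES (?, ?, ?)",
    [(1, "2020-01-01", 10), (2, "2020-01-02", 20),
     (3, "2020-01-02", 30), (4, "2020-01-03", 40)],
)

snapshot = 45  # hypothetical time of the first query, remembered by the client
page1 = fetch_page(conn, snapshot, 1, 2)

# A new article arrives before page 2 is requested.
conn.execute("INSERT INTO articles VALUES (5, '2020-01-04', 50)")
page2 = fetch_page(conn, snapshot, 2, 2)

print(page1, page2)
```

Because the snapshot time is fixed when the first page is loaded, the row inserted afterwards never shifts the offsets of later pages.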

Related

Order by then select incrementally

I have a table of > 250k rows of 'names' (and ancillary info) which I am displaying using jQuery Datatables.
My Users can choose any 'name' (Row), which is then flagged as 'taken' (and timestamped).
A (very) cut down version of the table is:
Key, Name, Taken, Timestamp
I would like to be able to display the 'taken' rows (in timestamp order) first and then the untaken records in their key order [ASC] next.
The problem would be simple, but because of size constraints (both the visual UI and the data set size) my display mechanism paginates: 10 / 20 / 50 / 100 rows (user's choice).
This means that a) the total number of 'taken' rows will vary and b) the pagination length varies.
Thus I can see no obvious method of keeping track of the pagination.
(My DataTable tells me the index of the start record and the number of displayed records.)
My SQL (MySQL) at this level is weak, and I have no idea how to return a record set that accounts for the 'taken' offset without some kind of new (or internal MySQL) numeric index to paginate on.
I thought of:
1. Creating a temporary table with the key and a new numeric index on each pagination.
2. Creating a trigger that re-orders the table when a row is 'taken'.
3. Having a "Running order" column that is updated on each new 'taken'.
4. Some sort of cursor-based procedure (at this point my hair was ruffled as the explanations shot straight over the top of my head!).
All seem excessive.
I also thought of doing a lot of manipulation in PHP (involving separate queries, dependent on the pagination size and the number of names already taken, and keeping a running record of the pagination position).
To the human computer (the brain) the problem is untaxing, but translating it into SQL has foxed me, as has coming up with a fast alternative to 1-3 (the test case for the "Running order" solution took almost three minutes to complete!).
It 'feels' like there should be a smart SQL answer to this, but all my efforts with ORDER BY, LIMIT, and the like fall over unless I return the whole dataset and do a lot of nasty counting.
Is there an elephant in the room that I am missing, or am I stuck with the hard slog to get what I need?
A query that displays the 'taken' rows (in timestamp order) first and then the untaken rows in key order (ascending):
SELECT *
FROM `table_name`
ORDER BY `taken` DESC, IF(`taken` = 1, `Timestamp`, NULL) ASC, `Key` ASC
LIMIT 50, 10
The LIMIT values: 50 is the zero-based offset of the first row on page 6, and 10 is the page size.
Sorting on the timestamp and the key as separate expressions keeps each comparison within one data type; a single IF() that returns either a timestamp or a key may end up comparing the values as strings.
Adjust the condition `taken` = 1 to match the values you actually store in column taken. I assumed you store 1 when the row is 'taken' and 0 otherwise.
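As a rough, runnable illustration of this ordering, here is a sketch using Python's sqlite3 instead of MySQL. The table name `names`, the column names `id` and `taken_at`, and the `page_ids` helper are hypothetical; `start` and `length` mirror the two values DataTables is said to supply. SQLite has no IF(), so a CASE expression plays the same role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT, taken INTEGER, taken_at INTEGER)"
)
conn.executemany(
    "INSERT INTO names VALUES (?, ?, ?, ?)",
    [(1, "a", 0, None), (2, "b", 1, 200), (3, "c", 0, None),
     (4, "d", 1, 100), (5, "e", 0, None)],
)

def page_ids(conn, start, length):
    """Taken rows first (oldest timestamp first), then untaken rows by id."""
    rows = conn.execute(
        """
        SELECT id FROM names
        ORDER BY taken DESC,
                 CASE WHEN taken = 1 THEN taken_at END ASC,  -- NULL for untaken rows
                 id ASC
        LIMIT ? OFFSET ?
        """,
        (length, start),
    ).fetchall()
    return [r[0] for r in rows]

first = page_ids(conn, 0, 2)
second = page_ids(conn, 2, 2)
third = page_ids(conn, 4, 2)
print(first, second, third)
```

No bookkeeping of how many rows are 'taken' is needed: the single ORDER BY defines one global sequence, and any page size simply slices it.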

How to make turnover on MySQL database records

I am a website developer and I need help with a design question. My (future) website is more or less a villa directory. People can add their villas there, and each villa will be stored in a database.
I need to show 15 villas per page, but I want a "turnover" (not sure that's the correct word in English) of the villas: every hour, the villa that appears first on the first page becomes the last villa on the last page (so every villa moves up one rank, except the first one, which becomes the last). I want every villa to have the same chance (more or less) of appearing on the first page. I don’t want a totally random system.
I need help designing a simple system that does not take a lot of resources (it should work with a few million records).
Note: I don’t want to use the ID of the villa, because if a person posts 3 different villas at the same time, they will all be shown next to each other.
My proposal:
I create an INTEGER field called “random_order” for each villa, fill it with a random number between 0 and Max(INTEGER), and create an index on the “random_order” column.
Then, to get the records in the order I want, I store (I don’t know where yet) a variable that points to a record in the index. Every hour, I increase this variable by 1 (with a modulo).
I’m not an expert on indexes, so I’m not sure whether this is possible or how to do it. I don’t know whether there is a better way to do it, either.
Could you please tell me if this is correct or if you have better ideas?
Thank you
Another thing you could do, is store a count variable - from 0 to MAX, and constantly update that. Then query the server for the top 15 villas (using ORDER BY ASC/DESC) on (random_order + count). This will prevent the need to update the column every hour - only the count variable needs to be updated.
EDIT:
First you would get the count (from where you have stored it) and store it in a variable - count.
Then execute a query like
SELECT *, (random_order + <count>)%MAX_VAL AS villa_order
FROM villa_table
ORDER BY villa_order ASC
LIMIT 15
This avoids constant, unnecessary updates to your indexed column.
EDIT 2:
OK, after further analysis, this is how I would do it.
Execute a simple select query:
SELECT * FROM villa_table
WHERE random_order > <count>
ORDER BY random_order
LIMIT 15
If the result set has fewer than 15 rows, fill in the remaining records from the beginning using:
SELECT *
FROM villa_table
ORDER BY random_order ASC
LIMIT <number of rows to be filled>
Even with 20 million rows, on an indexed column this takes less than 0.5 s.
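The two-query wrap-around from EDIT 2 could look roughly like this. It is a sketch under assumptions: sqlite3 stands in for MySQL, the table name `villas` and the `rotated_page` helper are made up, and `pointer` is the stored counter from the answer. One deliberate deviation: the fill-in query adds `random_order <= pointer`, so that a table smaller than one page cannot return the same villa twice.

```python
import sqlite3

def rotated_page(conn, pointer, page_size):
    """First page of the rotation: rows after the pointer, wrapping to the start."""
    rows = conn.execute(
        "SELECT id FROM villas WHERE random_order > ? ORDER BY random_order LIMIT ?",
        (pointer, page_size),
    ).fetchall()
    missing = page_size - len(rows)
    if missing > 0:
        # Not enough rows after the pointer: continue from the lowest values.
        rows += conn.execute(
            "SELECT id FROM villas WHERE random_order <= ? ORDER BY random_order LIMIT ?",
            (pointer, missing),
        ).fetchall()
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE villas (id INTEGER PRIMARY KEY, random_order INTEGER)")
conn.execute("CREATE INDEX idx_random_order ON villas (random_order)")
conn.executemany(
    "INSERT INTO villas VALUES (?, ?)",
    [(1, 50), (2, 10), (3, 40), (4, 20), (5, 30)],
)

middle = rotated_page(conn, 25, 3)
wrapped = rotated_page(conn, 35, 3)
past_end = rotated_page(conn, 60, 3)
print(middle, wrapped, past_end)
```

Both queries are range scans on the indexed column, so only the stored pointer has to change each hour.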

MySQL LIMIT: is it possible to skip certain rows?

Sorry if this question is confusing.
I have inherited a site that is already built, so I can't really do anything too drastic.
The MySQL query on a certain page uses LIMIT to show only the relevant entries, like this:
comtitlesub.idcts = %s LIMIT 1,3
This skips the first record and displays the following three.
I have been asked to add a new record, which is fine, but this is record number 7. Records 5 and 6 are not supposed to display on this page so changing the query to:
comtitlesub.idcts = %s LIMIT 1,6
displays all 6 records as you would expect.
One confusing thing is that I have altered the IDs of the records so that my new one is ID 4, and yet this made no difference.
Is there a simple way to 'skip' the unwanted records or am I approaching this from the wrong direction?
Add "order by comtitlesub.idcts" at the end of your query, but before the LIMIT clause:
... comtitlesub.idcts = %s ORDER BY comtitlesub.idcts LIMIT 1,6
Basically, changing the ID doesn't reorder the rows. Without an ORDER BY, MySQL makes no guarantee about the order in which rows are returned; they often come back in the order they were stored, which is why the original query appeared to work.
Andrew, LIMIT cuts the result set according to whatever order the rows arrive in. In your case the default order happened to be the same as the ID order; now that you've changed the IDs, you will need to order by ID explicitly:
ORDER BY comtitlesub.idcts
I believe the easiest course would be to modify your WHERE clause to exclude the rows you want excluded. For example:
WHERE comtitlesub.idcts = %s AND someothercol NOT IN ('cat','frog','kazoo')
Ideally for maintainability, you would want someothercol to hold stable data rather than a numeric ID which might change as your application data changes.

Consistent random ordering in a MySQL query

I have a database of pictures and I want to let visitors browse the pictures. I have one "next" and one "previous" link.
But I want to show every visitor a different order of the pictures. How can I do that? If I use ORDER BY RAND(), I will sometimes show duplicate images.
Can someone help me please? Thank you!
You can try to use seed in random function:
SELECT something
FROM somewhere
ORDER BY rand(123)
Here 123 is the seed. With the same seed, RAND() produces the same sequence of values, so the ordering is repeatable from one page request to the next.
The problem arises from the fact that each page will run RAND() again and has no way of knowing if the returned pictures have already been returned before. You would have to compose your query in such a way that you can filter out the pictures already presented on the previous pages, so that RAND() will have fewer options to choose from.
An idea would be to randomize the pictures, select the IDs, store the IDs in the session, then SELECT using those IDs. This way, each user will have the pictures randomized, but they will be able to paginate through them without re-randomizing them on each page.
So, something like:
1. If the IDs are not in the session yet: SELECT id FROM pictures ORDER BY RAND() LIMIT x
2. Store the IDs in the session.
3. SELECT ... FROM pictures WHERE id IN (IDs from session) LIMIT x
Another idea is to store in session the IDs that the user already saw and filter them out. For example:
1. If the session doesn't contain any ID: SELECT ... FROM pictures ORDER BY RAND() LIMIT x
2. Append the IDs from the current query to the session.
3. On later pages: SELECT ... FROM pictures WHERE id NOT IN (IDs from session) ORDER BY RAND() LIMIT x
Another way seems to be to use a seed, as izi points out. I have to say I didn't know about the seed, but it seems to return the exact same results for the exact same seed value. So, run your usual query and use RAND(seed) instead of RAND(), where "seed" is an integer that differs per visitor. RAND() expects an integer seed, so rather than passing the session ID string directly, generate a random integer once per visitor and keep it in the session.
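The first session-based idea above can be sketched like this. It is an illustration only: a plain dict stands in for the web session, sqlite3 stands in for MySQL, and the shuffle is done in application code with a per-visitor seed rather than with ORDER BY RAND() in SQL. The names `visitor_order`, `page` and the `seed` key are hypothetical.

```python
import random
import sqlite3

def visitor_order(conn, session):
    """Shuffle the picture ids once per visitor and cache the order in the session."""
    if "picture_ids" not in session:
        ids = [r[0] for r in conn.execute("SELECT id FROM pictures ORDER BY id")]
        random.Random(session["seed"]).shuffle(ids)  # seeded, so repeatable
        session["picture_ids"] = ids
    return session["picture_ids"]

def page(conn, session, page_no, per_page):
    """Slice the visitor's fixed order; no picture can repeat across pages."""
    ids = visitor_order(conn, session)
    start = (page_no - 1) * per_page
    return ids[start:start + per_page]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pictures (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO pictures VALUES (?)", [(i,) for i in range(1, 11)])

session = {"seed": 123}  # hypothetical per-visitor integer seed
all_pages = []
for page_no in range(1, 5):
    all_pages += page(conn, session, page_no, 3)

print(all_pages)
```

Storing every id in the session is only practical for a modest number of pictures; for large tables, keeping just the seed and reproducing the order on demand uses far less session storage.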
You can seed the random function as suggested by izi, or keep track of visited images vs non-visited images as suggested by rdineiu.
I'd like to stress that neither option will perform well, however. Either will lead you to sort your entire table (or the part of it that is of interest) by an arbitrary criterion and extract the top n rows, possibly with an offset. It'll be dreadfully slow.
Thus, consider for a moment how important it is that every visitor should get a different image order. Probably, it'll be not that important, as long as things look random. Assuming this is the case, consider this alternative...
Add an extra float field to your table, call it sort_ord. Add an index on it. On every insert or update, assign it a random value. The point here is to end up with a seemingly random order (from the visitor's standpoint) without compromising performance.
Such a setup will allow you to grab the top n rows and paginate your images using an index, rather than by sorting your entire table.
At your option, have a cron job periodically set a new value:
update yourtable
set sort_ord = rand();
Also at your option, create several such fields and assign one to visitors when they visit your site (cookie or session).
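A small sketch of the sort_ord idea, with sqlite3 standing in for MySQL. The table layout, the index name and the `next_page` helper are assumptions for the example. Paging here continues from the last (sort_ord, id) pair seen instead of using an offset; that is one possible way to page on the index, not something the answer prescribes.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pictures (id INTEGER PRIMARY KEY, sort_ord REAL)")
conn.execute("CREATE INDEX idx_sort_ord ON pictures (sort_ord, id)")

rng = random.Random(0)
conn.executemany(
    "INSERT INTO pictures (id, sort_ord) VALUES (?, ?)",
    [(i, rng.random()) for i in range(1, 21)],  # random value assigned on insert
)

def next_page(conn, after, per_page):
    """Return (id, sort_ord) rows following `after`, a (sort_ord, id) pair or None."""
    if after is None:
        return conn.execute(
            "SELECT id, sort_ord FROM pictures ORDER BY sort_ord, id LIMIT ?",
            (per_page,),
        ).fetchall()
    return conn.execute(
        """
        SELECT id, sort_ord FROM pictures
        WHERE sort_ord > ? OR (sort_ord = ? AND id > ?)
        ORDER BY sort_ord, id
        LIMIT ?
        """,
        (after[0], after[0], after[1], per_page),
    ).fetchall()

seen, cursor = [], None
while True:
    rows = next_page(conn, cursor, 6)
    if not rows:
        break
    seen.extend(r[0] for r in rows)
    cursor = (rows[-1][1], rows[-1][0])  # remember the last row of this page

print(seen)
```

The order looks random to visitors but is fixed in the table, so each page is an index range scan rather than a full sort.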
This will solve:
SELECT DISTINCT RAND() as rnd, [rest of your query] ORDER BY rnd;
Use RAND(SEED). From the docs: "If a constant integer argument N is specified, it is used as the seed value." (http://dev.mysql.com/doc/refman/5.0/en/mathematical-functions.html#function_rand).
In the example below, the result order is always the same for a given seed. Simply change the seed (351) and you get a new random order.
SELECT * FROM your_table ORDER BY RAND(351);
You can change the seed every time the user hits the first page.
Without seeing the SQL I'd guess you could try SELECT DISTINCT...

'Natural sorting' with MySQL?

I'm trying to query a WordPress database and get the post titles to sort in the correct order.
The titles are formatted like this: Title 1, Title 2, ... I need to sort them in ascending order. How can I do this? If I just sort them ascending, they come out like: 1, 10, 11...
Right now my ORDER BY clause is the following, but it does nothing:
ORDER BY CONVERT(p.post_title,SIGNED) ASC;
Per-row functions are a bad idea in any database that you want to scale well. That's because they have to perform the calculation on every row you retrieve every time you do a select.
The intelligent DBA's way of doing this is to create a whole new column containing the computed sort key, and use an insert/update trigger to ensure it's set correctly. This means the calculation is performed only when needed, and its cost is amortised across all selects.
This is one of the few cases where it's okay to revert from third normal form, since the use of triggers prevents data inconsistency. Hardly anyone complains about the disk space taken up by their databases; the vast majority of questions concern speed.
And, by using this method and indexing the new column, your queries will absolutely scream along.
So basically, you create another column called natural_title mapped as follows:
title        natural_title
-----        -------------
title 1      title 00001
title 2      title 00002
title 10     title 00010
title 1024   title 01024
making sure that the mapping function used in the trigger pads to enough digits for the largest number you expect. Then you use a query like:
select title from articles
order by natural_title asc
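One possible mapping function for the natural_title column is sketched below. For brevity it computes the key in application code at insert time rather than in a database trigger, and it uses sqlite3 in place of MySQL; the `natural_key` name and the padding width of 5 are assumptions taken from the example table above.

```python
import re
import sqlite3

def natural_key(title, width=5):
    """Zero-pad every run of digits so that string order matches numeric order."""
    return re.sub(r"\d+", lambda m: m.group().zfill(width), title)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (title TEXT, natural_title TEXT)")
conn.execute("CREATE INDEX idx_natural_title ON articles (natural_title)")

for title in ["title 10", "title 2", "title 1024", "title 1"]:
    # The sort key is computed once per write, not once per row per select.
    conn.execute(
        "INSERT INTO articles (title, natural_title) VALUES (?, ?)",
        (title, natural_key(title)),
    )

ordered = [
    r[0]
    for r in conn.execute("SELECT title FROM articles ORDER BY natural_title ASC")
]
print(ordered)
```

Numbers with more digits than `width` would sort incorrectly, which is why the padding has to cover the largest value expected.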
If the number is always at the end like that, you can do some string manipulation to make it work:
SELECT *, CAST(SUBSTRING_INDEX(p.post_title, ' ', -1) AS UNSIGNED) AS TITLE_INDEX
FROM wp_posts p
ORDER BY TITLE_INDEX ASC
SUBSTRING_INDEX(..., ' ', -1) takes everything after the last space, so this handles numbers of any length (100+, 1000+), as long as the number is the final word of the title.