Limit the result starting from a specific row with a given Id? - mysql

I want to write a query to select a subset of a table, only starting from a given id.
I know about limit x, y, but x here is the number of the row to start from. In my case I want to start from a specific Id, no matter where it sits in the table.
What I mean is that the query below selects from row number 5, but I want it to select 10 records from row with id, say 213odin2d211d21:
SELECT * FROM my_table Limit 5, 10
I can't find a way to do this. Any help will be appreciated.
Note that the Id here is a string that mixes letters and digits, so I can't do
SELECT * FROM <table> WHERE id > (id)

What you want to do is not possible. By default, records in the database are not ordered. Without ORDER BY you can't expect the server to return rows in any particular order. Since you say that your id is some kind of digit/character identifier, for which "less than" and "greater than" are not defined, it is not clear which records "follow" your specific record.
You will either have to:
Define another column to sort your records on, or
Define a behaviour for comparing your ids (What is "less than"? What is "greater than"?)
That being said, you can of course define that you want to sort your id just like sorting strings! In this case, you can use STRCMP() to compare two strings. Your query would look like this:
SELECT * FROM <table> WHERE STRCMP(id,?) = 1 ORDER BY id LIMIT 10
This will select the first 10 records, with id "greater than" ?.
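As a rough sketch of the same idea, here is the "start after a given id" pattern run from Python. Everything in it is an assumption made for illustration: it uses an in-memory SQLite database as a stand-in for MySQL, a made-up payload column, and a hypothetical helper rows_after. SQLite has no STRCMP(), so the sketch uses the plain > operator, which MySQL also accepts for string comparison (subject to the column's collation).

```python
import sqlite3


def rows_after(conn, last_id, page_size=10):
    """Return up to page_size rows whose id sorts after last_id (hypothetical helper)."""
    cur = conn.execute(
        # Plain string comparison; the ORDER BY is what defines "after".
        "SELECT id, payload FROM my_table WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    )
    return cur.fetchall()


# Assumption: SQLite in memory stands in for the MySQL table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id TEXT PRIMARY KEY, payload TEXT)")
ids = ["9zz", "213odin2d211d21", "a1", "213odin2d211d20", "b7", "0001"]
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)", [(i, "row " + i) for i in ids]
)

# Fetch the next 3 rows that sort after the given id.
page = rows_after(conn, "213odin2d211d21", page_size=3)
print(page)
```

To get the following page, pass the last id of the previous page back in as last_id.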

Related

Select n last lines from table

Is it possible to grab the last n inserts in a relational table without using a Date field ?
For example in the table Author:
Author(authid, f_name, l_name)
Also, authid is not a natural number (eg. 1,2,3,4..) but a string (example: JohnM32015)
I am using MySQL.
If the authid is auto-increment then you can do
select * from author
order by authid desc
To get only a limited number of records use top n in SQL-Server or limit n in MySQL for instance.
A table doesn't have a guaranteed sort order. A query does, if it explicitly defines one. If, for example, your records have an incrementing authid value then the last N inserts would be the highest N values for that column. So you'd order by that column descending and take the top N:
SELECT * FROM Author ORDER BY authid DESC LIMIT 10
However you define "the last N", you specify that definition in your query in a descending sort order and take the top N records from that result.
If, as you say, you want the "most recent" 10 records, you would write your query just like the other answers say, but order by the field(s) that defines your case for "most recent". Like this:
SELECT * FROM MyTable ORDER BY <your date field(s)> DESC LIMIT 10;
Disclaimer: If you don't have data in the table to define whatever should be the "most recent" records (like a "DateInserted" field, or an auto-incrementing field), then you don't really have an easy way of doing this with SQL.

Which row's fields are returned when Grouping with MySQL?

I have a MySQL table with the fields id and string. ids are unique. strings are varchars and are non-unique.
I perform the following query:
SELECT id, string, COUNT( * ) AS frequency
FROM table
GROUP BY string
ORDER BY frequency DESC, id ASC
Questions
Assume the table contains three rows with identical string values, and ids 1, 2, and 3.
Which id is going to be returned ( 1, 2, or 3 )?
Which id is this query going to ORDER BY ( Same as is returned? ... see question 1 )?
Can you control which id is returned / used for ordering? eg. Return the largest id, or the first id from a GROUP.
What I'm ultimately trying to do is get a frequency occurrence for identical strings, order by that frequency, highest to lowest, and on a frequency tie, order by id with the smallest id from the group returned / ordered by. I made the situation more generic to figure out how MySQL handles this situation.
Which id is going to be returned ( 1, 2, or 3 )?
A: For all the records that share the same string, the server will return whichever id it wants (most likely the fastest to fetch, which is unpredictable). To cite the official documentation:
The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.
Much more information in this link.
Which id is this query going to ORDER BY ( Same as is returned? ... see question 1 )?
It makes no sense to find out in what order the data retrieved will be returned as you can't predict the result you are going to get. However, it is very likely that you get the result sorted by the unpredictable ID column.
Can you control which id is returned / used for ordering? eg. Return the largest id, or the first id from a GROUP.
At this point you should assume that you can't. Read the documentation again.
Making things even more clear: You can't predict the result of an improperly used GROUP BY clause. The main issue with MySQL is that it allows you to use it in a non-standard way but you need to know how to make use of that feature. The main point behind it is to group by fields that you know will always be the same. EG:
SELECT id, name, COUNT( * ) AS frequency
FROM table
GROUP BY id
Here, you know name will be unique as id functionally determines name. So the result you know is valid. If you grouped also by name this query would be more standard but will perform slightly worse in MySQL.
As a final note, take into account that, in my experience the results in those non-standard queries for the selected and non-grouped fields are usually the ones that you would get applying a GROUP BY and then an ORDER BY on that field. That is why so many times it seems to work. However, if you keep testing you will eventually find out that this happens 95% of the time. And you can not rely on that number.
The documentation says that when not grouping by all non-aggregate columns, one row for each unique combination of the grouped-by columns is returned. The row selected is up to the server, i.e. "random".
However, in practice it is the first row encountered during processing. You can control which is encountered first by selecting from an inner query that is ordered in the order of preference of return.
For example to get the lowest id for each name (yes, undocumented, blah blah, but it works!):
SELECT id, name, COUNT( * ) AS frequency
FROM (select * from table order by id) x
GROUP BY name
ORDER BY frequency DESC, id ASC
I personally am comfortable relying on this behaviour and have never seen or heard of it behaving differently in real life. Many shun this as undocumented and "risky", but if it works, it works.
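For what the question ultimately asks (frequency per string, ties broken by the smallest id in the group), a variant that does not depend on which row the server picks is to aggregate the id explicitly with MIN(). The following is only a sketch under assumptions: it runs against an in-memory SQLite database instead of MySQL, and the table name strings and the alias first_id are invented for the example.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; "strings" is a made-up table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE strings (id INTEGER PRIMARY KEY, string TEXT)")
conn.executemany(
    "INSERT INTO strings VALUES (?, ?)",
    [(1, "foo"), (2, "bar"), (3, "foo"), (4, "baz"),
     (5, "bar"), (6, "foo"), (7, "qux"), (8, "qux")],
)

# MIN(id) makes the returned id well defined for every group,
# so the ORDER BY tie-break is well defined too.
query = """
SELECT MIN(id) AS first_id, string, COUNT(*) AS frequency
FROM strings
GROUP BY string
ORDER BY frequency DESC, first_id ASC
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Swapping MIN(id) for MAX(id) returns the largest id of each group instead.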

How to grab rows which contain a dual id reference

I have a messages table:
messages:
id(int)
send_id(int)
receive_id(int)
And I want to be able to select rows from this only when a->b and b->a exist, e.g.:
id send_id receive_id
0, 15, 16
1, 16, 15
So that basically one message has been passed to each person. How would I be able to go about selecting just one of those two rows (either send or receive), and all of those for a specific id.
I want to only return results that have this duality.
My code currently uses a nested SELECT and doesn't work at all as needed.
You can achieve the result by taking advantage of MySQL's LEAST and GREATEST built-in functions.
SELECT *
FROM messages
WHERE (LEAST(send_id, receive_id), GREATEST(send_id, receive_id), id)
IN
(
SELECT LEAST(send_id, receive_id) as x,
GREATEST(send_id, receive_id) as y,
MAX(id) msg_ID
FROM messages
GROUP BY x, y
);
MySQL Comparison Operator (LEAST/GREATEST)
You have to define an additional synthesized column for this. There are different alternatives: a permanent, indexed column (fast), a temporary one if you only run the selection once a month, or one computed on the fly inside the actual query.
Whichever alternative you choose, that column should contain both ids, ordered numerically and concatenated, perhaps with a separator character like -. When you then put a uniqueness restriction on that column, only one of the two candidates can enter the result; the second is rejected because it would violate the uniqueness rule.
The trick is the ordered concatenation instead of a normal combined index that would allow both variants due to the different order of ids.
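If the requirement is strictly "only pairs where both a->b and b->a exist", one possible way to express it is an EXISTS check for the reverse row. This is a sketch, not taken from the answers above: it assumes an in-memory SQLite database in place of MySQL, uses the receive_id spelling from the table definition, and picks the "send" side of each pair for one specific user id.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, send_id INTEGER, receive_id INTEGER)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [(0, 15, 16), (1, 16, 15),   # 15 <-> 16: both directions exist
     (2, 15, 17),                # 15 -> 17 only, no reply
     (3, 18, 15), (4, 15, 18)],  # 15 <-> 18: both directions exist
)

# Keep a message sent by the given user only if a message in the
# opposite direction exists as well.
query = """
SELECT m.id, m.send_id, m.receive_id
FROM messages AS m
WHERE m.send_id = ?
  AND EXISTS (
      SELECT 1
      FROM messages AS r
      WHERE r.send_id = m.receive_id
        AND r.receive_id = m.send_id
  )
ORDER BY m.id
"""
mutual = conn.execute(query, (15,)).fetchall()
print(mutual)
```

Note that if the same user sent several messages to the same recipient, each of them is returned; grouping by receive_id would collapse them to one row per pair.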

What's wrong with this MySQL query

I have the following SQL query. It seems to run OK, but I am concerned that as my site grows it may not perform as expected. I would like some feedback on how effective and efficient this query really is:
select * from articles where category_id=XX AND city_id=XXX GROUP BY user_id ORDER BY created_date DESC LIMIT 10;
Basically, what I am trying to achieve is to get the 10 newest articles by created_date. Articles must only be selected if the following criteria are met:
City ID must equal the given value
Category ID must equal the given value
Only one article per user must be returned
Articles must be sorted by date and only the top 10 latest articles must be returned
You've got a GROUP BY clause which only contains one column, but you are pulling all the columns there are without aggregating them. Do you realise that the values returned for the columns not specified in GROUP BY and not aggregated are not guaranteed?
You are also referencing such a column in the ORDER BY clause. Since the values of that column aren't guaranteed, you have no guarantee what rows are going to be returned with subsequent invocations of this script even in the absence of changes to the underlying table.
So, I would at least change the ORDER BY clause to something like this:
ORDER BY MAX(created_date)
or this:
ORDER BY MIN(created_date)
Some potential improvements (for best query performance):
Make sure you have an index on the columns you query. Note: check whether you really need an index on every column, because each index costs performance when the DB has to maintain it. For more details take a look here: http://dev.mysql.com/doc/refman/5.1/en/optimization-indexes.html
SELECT * selects all columns of the table. SELECT only the ones you really require.
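To make the "one article per user, newest first" requirement deterministic, one common approach is to compute each user's latest created_date in a grouped subquery and join it back to the table. The sketch below is an illustration under assumptions: it runs on in-memory SQLite rather than MySQL, the sample rows and the aliases newest and latest are invented, and dates are stored as ISO strings so that they sort correctly as text.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE articles (
           id INTEGER PRIMARY KEY,
           user_id INTEGER,
           category_id INTEGER,
           city_id INTEGER,
           created_date TEXT
       )"""
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 7, 3, "2012-01-01"),
     (2, 1, 7, 3, "2012-03-01"),
     (3, 2, 7, 3, "2012-02-01"),
     (4, 2, 8, 3, "2012-04-01"),   # different category, must be ignored
     (5, 3, 7, 9, "2012-05-01")],  # different city, must be ignored
)

# The subquery finds each user's newest matching article date;
# the join picks the full row that carries that date.
query = """
SELECT a.id, a.user_id, a.created_date
FROM articles AS a
JOIN (
    SELECT user_id, MAX(created_date) AS latest
    FROM articles
    WHERE category_id = ? AND city_id = ?
    GROUP BY user_id
) AS newest
  ON newest.user_id = a.user_id AND newest.latest = a.created_date
WHERE a.category_id = ? AND a.city_id = ?
ORDER BY a.created_date DESC
LIMIT 10
"""
latest = conn.execute(query, (7, 3, 7, 3)).fetchall()
print(latest)
```

If a user has two matching articles with exactly the same created_date, both rows come back; breaking that tie would need an extra rule such as the highest id.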

MySQL: SELECT(x) WHERE vs COUNT WHERE?

This is going to be one of those questions but I need to ask it.
I have a large table which may or may not have one unique row. I therefore need a MySQL query that will just tell me TRUE or FALSE.
With my current knowledge, I see two options (pseudo code):
[id = primary key]
OPTION 1:
SELECT id FROM table WHERE x=1 LIMIT 1
... and then determine in PHP whether a result was returned.
OPTION 2:
SELECT COUNT(id) FROM table WHERE x=1
... and then just use the count.
Is either of these preferable for any reason, or is there perhaps an even better solution?
Thanks.
If the selection criterion is truly unique (i.e. yields at most one result), you are going to see massive performance improvement by having an index on the column (or columns) involved in that criterion.
create index my_unique_index on table(x)
If you want to enforce the uniqueness, the index is not even optional; you must have
create unique index my_unique_index on table(x)
Having this index, querying on the unique criterion will perform very well, regardless of minor SQL tweaks like count(*), count(id), count(x), limit 1 and so on.
For clarity, I would write
select count(*) from table where x = ?
I would avoid LIMIT 1 for two other reasons:
It is non-standard SQL. I am not religious about that, use the MySQL-specific stuff where necessary (i.e. for paging data), but it is not necessary here.
If for some reason, you have more than one row of data, that is probably a serious bug in your application. With LIMIT 1, you are never going to see the problem. This is like counting dinosaurs in Jurassic Park with the assumption that the number can only possibly go down.
AFAIK, if you have an index on your ID column both queries will be more or less equal performance. The second query will need 1 less line of code in your program but that's not going to make any performance impact either.
Personally I typically do the first one of selecting the id from the row and limiting to 1 row. I like this better from a coding perspective. Instead of having to actually retrieve the data, I just check the number of rows returned.
If I were to compare speeds, I would say not doing a count in MySQL would be faster. I don't have any proof, but my guess would be that MySQL has to get all of the rows and then count how many there are. Although, on second thought, it would have to do that in the first option as well so the code will know how many rows there are. But since you have COUNT(id) vs COUNT(*), I would say it might be slightly slower.
Intuitively, the first one could be faster since it can abort the table (or index) scan when it finds the first value. But you should retrieve x, not id, since if the engine is using an index on x, it doesn't need to go to the block where the row actually is.
Another option could be:
select exists(select 1 from mytable where x = ?) from dual
Which already returns a boolean.
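A small sketch of how the EXISTS and COUNT variants look from application code, for comparison. It is written in Python against an in-memory SQLite database purely as an assumption (the question uses PHP and MySQL); the table name items and the helpers row_exists and row_count are made up for the example. SQLite does not need the "from dual" part.

```python
import sqlite3


def row_exists(conn, x):
    """True if at least one row has the given x (hypothetical helper)."""
    (flag,) = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM items WHERE x = ?)", (x,)
    ).fetchone()
    return bool(flag)


def row_count(conn, x):
    """Number of rows with the given x; more than 1 reveals unexpected duplicates."""
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM items WHERE x = ?", (x,)
    ).fetchone()
    return n


# Assumption: SQLite in memory stands in for MySQL; "items" is a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, x INTEGER)")
conn.execute("CREATE INDEX idx_items_x ON items (x)")
conn.executemany("INSERT INTO items (x) VALUES (?)", [(1,), (2,), (2,)])

print(row_exists(conn, 1), row_exists(conn, 5), row_count(conn, 2))
```

The EXISTS form answers the TRUE/FALSE question directly, while the COUNT form also surfaces the duplicate case described above.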
Typically, you use a GROUP BY with a HAVING clause to determine if there are duplicate rows in a table. Say you have a table with an id and a name (assuming id is the primary key, and you want to know if name is unique or repeated). You would use
select name, count(*) as total from mytable group by name having total > 1;
The above will return the names which are repeated and the number of times each one occurs.
If you just want one query to get your answer as true or false, you can use a nested query, e.g.
select if(count(*) >= 1, True, False) from (select name, count(*) as total from mytable group by name having total > 1) a;
The above should return true, if your table has duplicate rows, otherwise false.