SELECT n-th row WHERE field = x value - mysql

format(sql, sizeof(sql), "SELECT * FROM `datab` WHERE License = %s", searchPlate);
Querying with this format gives me all the rows that match, but what I'm trying to do is take, for example, the third, fifth, or even tenth row that matches, not all of them. How can I do this?

Something like this should work in MySQL:
format(sql, sizeof(sql), "SELECT * FROM `datab` WHERE License = %s ORDER BY IDColumnNameGoesHere LIMIT %d, 1", searchPlate, MyDesiredRowInteger);
Some points:
%d might not be the right placeholder in your language; use whichever format specifier stands for an integer.
You are certainly using one specific RDBMS, so find out which one and consult its documentation. SQL is a standard; MySQL, SQL Server, and others are implementations of that standard, each with its own extensions.
Sticking variables into strings as you have done leaves you very vulnerable to SQL injection. You should always parameterize your queries.
LIMIT is not part of standard SQL. MySQL supports it, and in the form LIMIT offset, count the offset is zero-based, so pass n - 1 to get the n-th row. With a different RDBMS you may have to go another route. In SQL Server, for example, you can use TOP, but since it takes only one parameter you would have to combine it with MIN or MAX to get just the one record you want.
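The points about parameterization and the zero-based offset can be sketched roughly as follows. This is only an illustration in Python, using the standard-library sqlite3 module as a stand-in for MySQL (an assumption, since the original code is not Python). The `id` column is a hypothetical ordering key standing in for IDColumnNameGoesHere, and `nth_matching_row` is an illustrative helper name. The `LIMIT 1 OFFSET ?` form is used because both SQLite and MySQL accept it.

```python
import sqlite3


def nth_matching_row(conn, plate, n):
    """Return the n-th (1-based) row whose License equals plate, or None."""
    if n < 1:
        raise ValueError("n must be 1 or greater")
    # Placeholders keep user input out of the SQL text itself.
    # OFFSET is zero-based, so the n-th row sits at offset n - 1.
    cur = conn.execute(
        "SELECT * FROM datab WHERE License = ? ORDER BY id LIMIT 1 OFFSET ?",
        (plate, n - 1),
    )
    return cur.fetchone()


# Hypothetical sample data, only to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datab (id INTEGER PRIMARY KEY, License TEXT)")
conn.executemany(
    "INSERT INTO datab (License) VALUES (?)",
    [("ABC123",), ("XYZ789",), ("ABC123",), ("ABC123",)],
)
third = nth_matching_row(conn, "ABC123", 3)  # third row with this plate
```

Without the ORDER BY, "the n-th row" has no stable meaning, which is why the answer's query sorts by an ID column before applying the limit.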

Related

MySQL 5.7 RAND() and IF() without LIMIT leads to unexpected results

I have the following query
SELECT t.res, IF(t.res=0, "zero", "more than zero")
FROM (
SELECT table.*, IF (RAND()<=0.2,1, IF (RAND()<=0.4,2, IF (RAND()<=0.6,3,0))) AS res
FROM table LIMIT 20) t
which returns something like this:
That's exactly what you would expect. However, as soon as I remove the LIMIT 20 I receive highly unexpected results (there are more rows returned than 20, I cut it off to make it easier to read):
SELECT t.res, IF(t.res=0, "zero", "more than zero")
FROM (
SELECT table.*, IF (RAND()<=0.2,1, IF (RAND()<=0.4,2, IF (RAND()<=0.6,3,0))) AS res
FROM table) t
Side notes:
I'm using MySQL 5.7.18-15-log and this is a highly abstracted example (real query is much more difficult).
I'm trying to understand what is happening. I do not need answers that offer workarounds without explaining why the original version does not work. Thank you.
Update:
Instead of using LIMIT, GROUP BY id also works in the first case.
Update 2:
As requested by zerkms, I added t.res = 0 and t.res + 1 to the second example
The problem is caused by a change introduced in MySQL 5.7 on how derived tables in (sub)queries are treated.
Basically, to improve performance, MySQL may merge a derived table into the outer query instead of evaluating it once on its own. Expressions from the subquery can then be evaluated at different times and/or several times, which leads to unexpected results when the subquery is non-deterministic (as in my case with RAND()).
There are two easy (and equally ugly) workarounds that get MySQL to "materialize" the subquery, that is, to evaluate it once and store the result in a temporary table: use LIMIT <high number> or GROUP BY id. Either one forces materialization and returns the expected results.
The last option is to turn off derived_merge in the optimizer_switch variable: derived_merge=off (make sure to leave all the other flags as they are).
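The effect can be imitated with a toy model. The Python sketch below is not MySQL's optimizer, only an illustration of the difference between re-evaluating the RAND() expression for every reference to t.res and evaluating it once per row. All function names are made up for the example.

```python
import random


def res_expr(rng):
    """Mirror of the nested IF(RAND() <= ...) expression from the question."""
    if rng.random() <= 0.2:
        return 1
    if rng.random() <= 0.4:
        return 2
    if rng.random() <= 0.6:
        return 3
    return 0


def label(res):
    return "zero" if res == 0 else "more than zero"


def merged(row_count, rng):
    # Merged derived table: each reference to t.res is replaced by the
    # expression itself, so it runs twice per row with fresh random numbers.
    return [(res_expr(rng), label(res_expr(rng))) for _ in range(row_count)]


def materialized(row_count, rng):
    # Materialized derived table: the expression runs once per row and
    # both outer columns read the stored value.
    stored = [res_expr(rng) for _ in range(row_count)]
    return [(res, label(res)) for res in stored]


rng = random.Random(0)
consistent = materialized(1000, rng)
mismatches = [row for row in merged(1000, rng) if label(row[0]) != row[1]]
```

In the materialized version the two columns always agree. In the merged version some rows pair a non-zero res with "zero" (or the reverse), which is the kind of output the question describes.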
Further reading:
https://mysqlserverteam.com/derived-tables-in-mysql-5-7/
Subquery's rand() column re-evaluated for every repeated selection in MySQL 5.7/8.0 vs MySQL 5.6

How can a SET column be queried in MySQL while ignoring ordering?

I have a MySQL DB with a table that has a SET type column with the following definition:
CREATE TABLE t (
c SET('V','A','L','U','E')
)
I would like to write a SELECT query that returns all the rows where c equals ('A','L','E').
This can be done by writing the following query:
SELECT * FROM t WHERE c = 'A,L,E'
The query I would like to write should return the same result for unordered input as well, such as 'L','A','E'.
I couldn't find an elegant way to do this, nor anything helpful in the official documentation.
You can fix nacho's suggestion using the following:
WHERE floor(pow(2,FIND_IN_SET('A',c)-1))+
floor(pow(2,FIND_IN_SET('L',c)-1))+
floor(pow(2,FIND_IN_SET('E',c)-1))=c
This is by no means an "elegant solution"... I would rather use a simpler one if possible.
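FIND_IN_SET gives the position of the item in the SET definition, so raising 2 to the power of that position minus 1 yields the bit that represents the item in the internal (numeric) representation of the SET value.
The floor() function keeps a term at 0 when FIND_IN_SET returns 0, since pow(2, -1) is 0.5.
Note that you still risk false positives when the search list contains illegal SET values (e.g. looking for 'A','L','E' and 'X' will return true), because an item that is not found contributes 0 to the sum. For the same reason, a row that holds only some of the searched items (such as 'A,L') would also pass, so this comparison should be combined with a FIND_IN_SET(...) > 0 check for each item.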
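The arithmetic can be checked with a small Python model. This is a sketch under the assumption stated in the answer, namely that FIND_IN_SET on a SET column yields the item's position in the column definition. The helper names are invented for the illustration.

```python
MEMBERS = ("V", "A", "L", "U", "E")  # member order from the SET definition


def stored_value(row_items):
    """Number kept for a SET value: the i-th member (1-based) is bit 2**(i-1)."""
    return sum(2 ** MEMBERS.index(item) for item in set(row_items))


def find_in_set(item, row_items):
    """Stand-in for FIND_IN_SET(item, c) as the answer describes it:
    the member's 1-based position if the row holds it, else 0."""
    if item in MEMBERS and item in row_items:
        return MEMBERS.index(item) + 1
    return 0


def bitmask_sum(wanted, row_items):
    """Left-hand side of the WHERE clause; int() plays the role of floor(),
    turning 2**-1 (0.5) into 0 for items that are not found."""
    return sum(int(2 ** (find_in_set(item, row_items) - 1)) for item in wanted)


row = ("A", "L", "E")
# Order of the searched items does not matter: 4 + 2 + 16 == 22.
unordered_hit = bitmask_sum(("L", "A", "E"), row) == stored_value(row)
# The illegal item 'X' adds 0, so the comparison still succeeds.
illegal_hit = bitmask_sum(("A", "L", "E", "X"), row) == stored_value(row)
# A row holding only 'A,L' also compares equal: 2 + 4 + 0 == 6.
subset_hit = bitmask_sum(("A", "L", "E"), ("A", "L")) == stored_value(("A", "L"))
```

All three comparisons come out true in this model, which illustrates both the order independence and the false positives.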
You need to use FIND_IN_SET:
SELECT *
FROM t
WHERE FIND_IN_SET('A',c)>0 AND FIND_IN_SET('L',c)>0 AND FIND_IN_SET('E',c)>0
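I don't know if this will work (the MySQL manual notes that FIND_IN_SET does not work properly when its first argument contains a comma), but you can also try: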
SELECT *
FROM t
WHERE FIND_IN_SET('A,L,E',c)>0
Another possible approach is to check each item separately and also check that the sizes match (assuming the searched set has no repetitions):
SELECT *
FROM t
WHERE FIND_IN_SET('A',c)>0 AND FIND_IN_SET('L',c)>0 AND FIND_IN_SET('E',c)>0 AND BIT_COUNT(c) = 3
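If the searched items arrive as an unordered list from application code, a query of this shape could be generated rather than written by hand. The Python sketch below is one possible way to do that. `build_set_match` is a hypothetical helper, and the `%s` placeholder style is an assumption about your MySQL driver.

```python
def build_set_match(items, column="c", table="t"):
    """Build a parameterized query for rows whose SET column holds exactly items.

    column and table are interpolated into the SQL text, so they must be
    trusted identifiers, never user input.
    """
    unique = sorted(set(items))  # drop repeats so the count check stays valid
    if not unique:
        raise ValueError("need at least one item")
    # One membership check per item, plus the size check from the answer.
    checks = [f"FIND_IN_SET(%s, {column}) > 0" for _ in unique]
    checks.append(f"BIT_COUNT({column}) = %s")
    sql = f"SELECT * FROM {table} WHERE " + " AND ".join(checks)
    return sql, (*unique, len(unique))


sql, params = build_set_match(["L", "A", "E"])
```

The resulting `sql` and `params` would then be passed together to the driver's execute call, so the item values never become part of the SQL text.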

MySQL Add Column that Summarizes data from Another Column

I have a column in MySQL table which has 'messy' data stored as text like this:
**SIZE**
2
2-5
6-25
2-10
26-100
48
50
I want to create a new column "RevTextSize" that rewrites the data in this column to a pre-defined range of values.
If Size=2, then "RevTextSize"= "1-5"
If Size=2-5, then "RevTextSize"= "1-5"
If Size=6-25, then "RevTextSize"="6-25"
...
This is easy to do in Excel, SPSS and other such tools, but how can I do it in the MySQL table?
You can add a column like this:
ALTER TABLE messy_data ADD revtextsize VARCHAR(30);
To populate the column:
UPDATE messy_data
SET revtextsize
= CASE
WHEN size = '2' THEN '1-5'
WHEN size = '2-5' THEN '1-5'
WHEN size = '6-25' THEN '6-25'
ELSE size
END
This is a brute-force approach, identifying each distinct value of size and specifying a replacement.
You could use another SQL statement to help you build the CASE expression:
SELECT CONCAT(' WHEN size = ''',d.size,''' THEN ''',d.size,'''') AS stmt
FROM messy_data d
GROUP BY d.size
Save the result from that into your favorite SQL text editor, and hack away at the replacement values. That would speed up the creation of the CASE expression for the statement you need to run to set the revtextsize column (the first statement).
If you want to build something "smarter" that dynamically evaluates the contents of size and makes an intelligent choice, that would be more involved. If I were going to do that, I'd do it in the second statement, the one that generates the CASE expression. I'd prefer to review the result before I run the UPDATE statement, and I prefer an UPDATE statement that is easy to understand and easy to explain.
Use INSTR() to locate the "-" in your string and SUBSTRING(str, pos, len) to extract the start and end numbers. Then use BETWEEN conditions to build your CASE expression.
Hope this will help in building your solution.
Thanks
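The "smarter" idea of splitting at the "-" and testing the numbers against ranges could look roughly like this in Python. It is a sketch, not a finished rule set. The bucket list is an assumption built from the ranges that appear in the question, and `rev_text_size` is an invented name. Values that do not fit a single bucket are returned as None so that they can be reviewed by hand.

```python
BUCKETS = [(1, 5), (6, 25), (26, 100)]  # assumed target ranges


def rev_text_size(size):
    """Map a raw size such as '2' or '6-25' to a bucket label, or None if unclear."""
    # partition() plays the role of INSTR/SUBSTRING: split at the first '-'.
    low_text, dash, high_text = size.strip().partition("-")
    try:
        low = int(low_text)
        high = int(high_text) if dash else low
    except ValueError:
        return None  # not numeric; leave for manual review
    for start, end in BUCKETS:
        # BETWEEN-style test: both ends must fall inside the same bucket.
        if start <= low <= high <= end:
            return f"{start}-{end}"
    return None  # spans several buckets (e.g. '2-10') or lies outside them


labels = {raw: rev_text_size(raw) for raw in ["2", "2-5", "6-25", "2-10", "26-100", "48", "50"]}
```

The resulting mapping could feed the generated CASE expression, keeping the UPDATE statement itself simple and reviewable.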

mySQL: Can one rely on the implicit ORDER BY done by mySQL when using an IN-Statement?

I just noticed that when I execute the following query:
SELECT * FROM tbl WHERE some_key = 1 AND some_foreign_key IN (2,5,23,8,9);
the results come back in the same order they were given in the IN list, e.g. the row with some_foreign_key = 2 is the first row returned, the one with some_foreign_key = 9 is the last, and so on.
This is exactly the opposite of the behaviour described in this question:
MySQL WHERE IN - Ordering
Can one rely on this behaviour or modify it via some MySQL server setting?
I know the common wisdom is "no ORDER BY clause" == "the RDBMS can sort however it pleases", but for my current task (a really large import) this behaviour is quite helpful, and it would be great if I could rely on it.
EDIT: I know about the ORDER BY FIELD trick already; I just wanted to know whether I can safely avoid the ORDER BY clause by setting some config somewhere.
ORDER BY FIELD(some_foreign_key, 2, 5, 23, 8, 9)
isn't really that tough to implement - unless you're really simplifying this example. And as you already know it's the only way to be 100% sure of the output ordering.
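For readers unfamiliar with FIELD(), the Python sketch below imitates what it computes and why sorting by it reproduces the IN-list order. The `field` function and the sample rows are illustrative only.

```python
def field(value, *candidates):
    """Like MySQL's FIELD(): 1-based index of value in candidates, 0 if absent."""
    for position, candidate in enumerate(candidates, start=1):
        if value == candidate:
            return position
    return 0


wanted = (2, 5, 23, 8, 9)
# Rows as they might arrive without an ORDER BY: in no particular order.
rows = [{"some_foreign_key": key} for key in (9, 23, 2, 8, 5)]
ordered = sorted(rows, key=lambda row: field(row["some_foreign_key"], *wanted))
ordered_keys = [row["some_foreign_key"] for row in ordered]
```

Sorting by the computed position, rather than relying on arrival order, is what makes the result order explicit.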

Use SQL Server FTS Stemmer

Is there any way to directly access the stemmer used in the FORMSOF() option of a CONTAINS full-text search query so that it returns the stems/inflections of an input word, not just those derivations that exist in a search column?
For example, the query
SELECT * FROM dbo.MyDB WHERE contains(CHAR_COL,'FORMSOF(INFLECTIONAL, prettier)')
returns the stem "pretty" and other inflections such as "prettiest" if they exist in the CHAR_COL column. What I want is to call the FORMSOF() function directly without referencing a column at all. Any chance?
EDIT:
The query that met my needs ended up being
SELECT * FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY group_ID ORDER BY GROUP_ID) ord, display_term
from sys.dm_fts_parser('FORMSOF( FREETEXT, running) and FORMSOF(FREETEXT, jumping)', 1033, null, 1)) a
WHERE ord=1
Requires membership in the sysadmin fixed server role and access rights to the specified stoplist.
No, you cannot do this. You can't get access to the stemmer directly.
You can get an idea of how it works by looking into Solr source code. But it might (and I guess will) be different from the one implemented in MS SQL FT.
UPDATE: It turns out that in SQL Server 2008 R2 you can do something quite close to what you want. A special table-valued UDF was added:
sys.dm_fts_parser('query_string', lcid, stoplist_id, accent_sensitivity)
It allows you to get the tokenization result (i.e. the result after applying word breaking, the thesaurus, and the stop list). So if you feed it 'FORMSOF(....)', it will give you the result you want (though you will have to process the result set anyway). Here's the corresponding article in MSDN.