Filter Dataset using SQL Query - mysql

I am using Zeos and mysql in my delphi project.
what I would like to do is filter dataset using a textbox.
to do that, I am using following query in textbox 'OnChange' Event:
ZGrips.Active := false;
ZGrips.SQL.Clear;
ZGrips.SQL.Add('SELECT Part_Name, Description, OrderGerman, OrderEnglish FROM Part');
ZGrips.SQL.Add('WHERE Part_Name LIKE ' + '"%' + trim(txt_search.Text) + '%"');
ZGrips.Active := true;
after I run and type first character in textbox, I get empty dataset in my DBGrid,
so DBGrid is showing nothing, then If I type second character I get some result in DBGrid. and even more strange behavior: if I will use AS Clause in my SQL Query like:
Part_Name AS blablabla,
Description AS blablabla,
OrderGerman AS OG,
OrderEnglish AS OE
in that case DBGrid is showing only 2 columns: Part_Name and Description, I dont understand why it is ignoring 3rd and 4th columns.
thanks for any help in advance.

Always use parameters
Firstly you need to use parameters, otherwise your query will break or worse when the user enters the wrong characters in the search box.
See: How does the SQL injection from the "Bobby Tables" XKCD comic work?
Parameters also makes you query faster, because the database engine only have to decode the query once.
If you change a parameter the engine will know that the query itself has not changed and will not re-decode it.
Don't use clear and add
Just supply the SQL as text in one go, it's faster.
This is esp. true in a loop, outside the loop you will not notice the difference.
Your code should read something like:
procedure TForm1.SetupSearch; //run this only once.
var
SQL: string;
begin
ZGrips.Active:= false;
SQL:= 'SELECT Part_Name, Description, OrderGerman, OrderEnglish FROM Part' +
'WHERE Part_Name LIKE :searchtext'); //note no % here.
ZGrips.SQL.Text:= SQL; //don't use clear and don't use SQL.Add.
end;
//See: http://docwiki.embarcadero.com/Libraries/XE2/en/Vcl.StdCtrls.TEdit.OnChange
procedure TForm1.Edit1Change(Sender: TObject);
begin
if Edit1.Modified then begin
Timer1.Active:= true;
end;
end;
procedure TForm1.Timer1Timer(Sender: TObject);
begin
Timer1.Active:= false;
if Edit1.Text <> ZGrips.Params[0].AsString then begin
ZGrips.Params[0].AsString:= Edit1.Text + '%'
ZGrips.Active:= true;
end;
end;
Use a timer
As per #MartinA's suggestion, use a timer and start the query only ever so often.
The wierd behaviour you're getting maybe because you're stopping and reactivating a new query before the old one has had time to finish.
The Params[index: integer] property is a bit faster than the ParamsByName property.
Although this does not really matter outside a loop.
Allow the database to use an index!
Using only a trailing wildcard % is faster than using a leading wildcard because the database can only use an index is there is a trailing wildcard.
If you want to use a leading wildcard, then consider storing the data in reverse order and use a trailing wildcard instead.
Full-text indexes are much better than like
Of course if you use both a leading and a trailing wild card then you have to use a full-text index.
In MySQL you'll than use the MATCH AGAINST syntax,
see: Differences between INDEX, PRIMARY, UNIQUE, FULLTEXT in MySQL?
and: Which SQL query is better, MATCH AGAINST or LIKE?
The lastest versions of MySQL support full-text indexes in InnoDB.
Remember to never use MyISAM, it's unreliable.

Related

Unknown column in field list

I'm trying to insert some information to MySQL with Pascal, but when I run the program I get the error
unknown column 'mohsen' in field list
This is my code
procedure TForm1.Button1Click(Sender: TObject);
var
aSQLText: string;
aSQLCommand: string;
namee:string;
family:string;
begin
namee:='mohsen';
family:='dolatshah';
aSQLText:= 'INSERT INTO b_tbl(Name,Family) VALUES (%s,%s)';
aSQLCommand := Format(aSQLText, [namee, family]);
SQLConnector1.ExecuteDirect(aSQLCommand);
SQLTransaction1.Commit;
end;
How can I solve this problem?
It's because your
VALUES (%s,%s)
isn't surrounding the namee and family variable contents by quotes. Therefore, your back-end Sql engine thinks your mohsen is a column name, not a value.
Instead, use, e.g.
VALUES (''%s'',''%s'')
as in
Namee := 'mohsen';
Family := 'dolatshah';
aSQLText:= 'INSERT INTO b_tbl(Name,Family) VALUES (''%s'',''%s'')';
aSQLCommand := Format(aSQLText,[namee,family]);
In the original version of my answer, I explained how to fix your problem by "doubling up" single quotes in the Sql you were trying to build, because it seemed to me that you were having difficulty seeing (literally) what was wrong with what you were doing.
An alternative (and better) way to avoid your problem (and the one I always use in real life) is to use the QuotedStr() function. The same code would then become
aSQLText := 'INSERT INTO b_tbl (Name, Family) VALUES (%s, %s)';
aSQLCommand := Format(aSQLText, [QuotedStr(namee), QuotedStr(family)]);
According to the Online Help:
Use QuotedStr to convert the string S to a quoted string. A single quote character (') >is inserted at the beginning and end of S, and each single quote character in the string is >repeated.
What it means by "repeated" is what I've referred to as "doubling up". Why that's important, and the main reason I use QuotedStr is to avoid the Sql db-engine throwing an error when the value you want to send contains a single quote character as in O'Reilly.
Try adding a row containing that name to your table using MySql Workbench and you'll see what I mean.
So, not only does using QuotedStr make constructing SQL statements as strings in Delphi code less error-prone, but it also avoid problems at the back-end, too.
Just in case this will help anybody else I had the same error when I was parsing a python variable with a sql statement and it had an if statement in i.e.
sql="select bob,steve, if(steve>50,'y','n') from table;"
try as I might it coming up with this "unknown column y" - so I tried everything and then I was about to get rid of it and give it up as a bad job until I thought I would swap the " for ' and ' for "..... Hoooraaahh it works!
This is the statement that worked
sql='select bob,steve, if(steve>50,"y","n") from table;'
Hope it helps...
To avoid this sort of problem and SQL injection you should really look into using SQL parameters for this, not the Pascal format statement.

MySQL Add Column that Summarizes data from Another Column

I have a column in MySQL table which has 'messy' data stored as text like this:
**SIZE**
2
2-5
6-25
2-10
26-100
48
50
I want to create a new column "RevTextSize" that rewrites the data in this column to a pre-defined range of values.
If Size=2, then "RevTextSize"= "1-5"
If Size=2-5, then "RevTextSize"= "1-5"
If Size=6-25, then "RevTextSize"="6-25"
...
This is easy to do in Excel, SPSS and other such tools, but how can I do it in the MySQL table?
You can add a column like this:
ALTER TABLE messy_data ADD revtextsize VARCHAR(30);
To populate the column:
UPDATE messy_data
SET revtextsize
= CASE
WHEN size = '2' THEN '1-5'
WHEN size = '2-5' THEN '1-5'
WHEN size = '6-25' THEN '6-25'
ELSE size
END
This is a brute-force approach, identifying each distinct value of size and specifying a replacement.
You could use another SQL statement to help you build the CASE expression
SELECT CONCAT(' WHEN size = ''',d.size,''' THEN ''',d.size,'''') AS stmt
FROM messy_data d
GROUP BY d.size
Save the result from that into your favorite SQL text editor, and hack away at the replacement values. That would speed up the creation of the CASE expression for the statement you need to run to set the revtextsize column (the first statement).
If you want to build something "smarter", that dynamically evaluates the contents of size and makes an intelligent choice, that would be more involved. If was going to do that, I'd do it in the second statement, generating the CASE expression. I'd prefer to review that, befor I run the update statement. I prefer to have the update statement doing something that's easy to understand and easy to explain what it's doing.
Use InStr() to locate "-" in your string and use SUBSTRING(str, pos, len) to get start & End number. Then Use Between clause to build your Case clause.
Hope this will help in building your solution.
Thanks

MySQL Query Tuning - Why is using a value from a variable so much slower than using a literal?

UPDATE: I've answered this myself below.
I'm trying to fix a performance issue in a MySQL query. What I think I'm seeing, is that assigning the result of a function to a variable, and then running a SELECT with a compare against that variable is relatively slow.
If for testings sake however, I replace the compare to the variable with a compare to the string literal equivalent of what I know that function will return (for a given scenario), then the query runs much faster.
For example:
...
SET #metaphone_val := double_metaphone(p_parameter)); -- double metaphone is user defined
SELECT
SQL_CALC_FOUND_ROWS
t.col1,
t.col2,
...
FROM table t
WHERE
t.pre_set_metaphone_string = #metaphone_val -- OPTION A
t.pre_set_metaphone_string = 'PRN' -- OPTION B (Literal function return value for a given name)
If I use the line in option A, the query is slow.
If I use the line in option B, then the query is fast as you would expect any simple string compare to be.
Why?
Was finished writing the question when the answer hit me, so posting anyway for knowledge sharing!
I realised that the return value of the metaphone function was UTF8.
The compare to a latin1 field was obviously incurring a fairly heavy performance overhead.
I replaced the variable assignment with:
SET #metaphone_val:= CONVERT(double_metaphone(p_parameter) USING latin1);
Now the query runs as fast as I would expect.

Creating a stored procedure in SQL Server 2008 that will do a "facebook search"

I'm trying to implement a facebook search in my system (auto suggest while typing).
I've managed to code all the ajax stuff, but I'm not sure how to query the database.
I've created a table called People which contains the fields: ID, FirstName, LastName, MiddleName, Email.
I've also created a FTS-index on all those fields.
I want to create a stored procedure that will get as a parameter the text inserted in the query box and returns the suggestions.
For example, When I will write in the textbox the query "Joh Do"
It will translate to the query:
select * from People where contains(*, '"Joh*"') and contains(*, '"Do*"')
Is there a way to do that in stored procedure?
P.S
I've tried to use the syntax
select * from People where contains(*,'"Joh*" and "Do*"')
but it didn't returned the expected results, probably because it needs to search the words on different fields. Is there a way to fix that?
Thanks.
Try
select *
from People
where (FirstName Like '%'+ #FirstName + '%') and
(MiddleName Like '%'+ #MiddleName + '%') and
(LastName Like '%'+ #LastName + '%')
Also you may want to restrict the results to only return a maximum of say 10 by using:
select top 10
EDIT 1:
OK I now understand the problem better. I would use dynamic sql thus:
First create a split function e.g. Example Split function using XML trick
Then use dynamic sql:
declare #tstr varchar (500)
set #tstr = ''
select #tstr =#tstr + ' Contains(*, ''"'+ val + '*")' + ' and '
from dbo.split(#SearchStr, ' ')
set #tstr = left(#tstr,len(#tstr)-4)
select #tstr
Declare #dsql as varchar(500)
set #dsql = 'select * from People where '+ #tstr
exec (#dsql)
Also please note as per Remus, be aware of SQL Injections, the use of sp_executesql (instead of EXEC) would be better.
The problem is the open list nature of the argument. I can be Joh, it can be Joh Do, it can be Joh Do Na and so on and so forth. You have two main alternatives:
parse the input in the web app (in ASP I assume) and then call an appropriate procedure for the number of entries (ie. exec usp_findSuggestions1 'Joh', exec usp_findSuggestions2 'Joh', 'Do', exec usp_findSuggestions1 'Joh', 'Do', 'Na'). The first procedure uses 1 contains, the second has 2 contains .. and contains ... and the last has 3. This may look totally ugly from a DRY, code design and specially code maintenance pov, but is actually the best solution as far as T-SQL is concerned, due primarily to the plan stability of these queries.
pass the input straight into a single stored procedure, where you can split it into components and build a dynamic T-SQL query with as many contains as necessary.
Both solutions are imperfect. Ultimately, you have two problems, and both have been investigated before to quite some depth:
the problem of passing a list to a T-SQL procedure. See Arrays and Lists in SQL Server 2005 and Beyond
the problem of an undetermined number of conditions in the WHERE clause, see The Curse and Blessings of Dynamic SQL
The AJAX Toolkit has the "AutoComplete" control that provides this functionality out of the box. It is very simple to use.
Look at a sample here

Using SQL to determine word count stats of a text field

I've recently been working on some database search functionality and wanted to get some information like the average words per document (e.g. text field in the database). The only thing I have found so far (without processing in language of choice outside the DB) is:
SELECT AVG(LENGTH(content) - LENGTH(REPLACE(content, ' ', '')) + 1)
FROM documents
This seems to work* but do you have other suggestions? I'm currently using MySQL 4 (hope to move to version 5 for this app soon), but am also interested in general solutions.
Thanks!
* I can imagine that this is a pretty rough way to determine this as it does not account for HTML in the content and the like as well. That's OK for this particular project but again are there better ways?
Update: To define what I mean by "better": either more accurate, performs more efficiently, or is more "correct" (easy to maintain, good practice, etc). For the content I have available, the query above is fast enough and is accurate for this project, but I may need something similar in the future (so I asked).
The text handling capabilities of MySQL aren't good enough for what you want. A stored function is an option, but will probably be slow. Your best bet to process the data within MySQL is to add a user defined function. If you're going to build a newer version of MySQL anyway, you could also add a native function.
The "correct" way is to process the data outside the DB since DBs are for storage, not processing, and any heavy processing might put too much of a load on the DBMS. Additionally, calculating the word count outside of MySQL makes it easier to change the definition of what counts as a word. How about storing the word count in the DB and updating it when a document is changed?
Example stored function:
DELIMITER $$
CREATE FUNCTION wordcount(str LONGTEXT)
RETURNS INT
DETERMINISTIC
SQL SECURITY INVOKER
NO SQL
BEGIN
DECLARE wordCnt, idx, maxIdx INT DEFAULT 0;
DECLARE currChar, prevChar BOOL DEFAULT 0;
SET maxIdx=char_length(str);
SET idx = 1;
WHILE idx <= maxIdx DO
SET currChar=SUBSTRING(str, idx, 1) RLIKE '[[:alnum:]]';
IF NOT prevChar AND currChar THEN
SET wordCnt=wordCnt+1;
END IF;
SET prevChar=currChar;
SET idx=idx+1;
END WHILE;
RETURN wordCnt;
END
$$
DELIMITER ;
This is quite a bit faster, though just slightly less accurate. I found it 4% light on the count, which is OK for "estimate" scenarios.
SELECT
ROUND (
(
CHAR_LENGTH(content) - CHAR_LENGTH(REPLACE (content, " ", ""))
)
/ CHAR_LENGTH(" ")
) AS count
FROM documents
Simple solution for some similar cases (MySQL):
SELECT *,
(CHAR_LENGTH(student)-CHAR_LENGTH(REPLACE(student,' ','')))+1 as 'count'
FROM documents;
You can use the word_count() UDF from https://github.com/spachev/mysql_udf_bundle. I ported the logic from the accepted answer with a difference that my code only supports latin1 charset. The logic would need to be reworked to support other charsets. Also, both implementations always consider a non-alphanumeric character to be a delimiter, which may not always desirable - for example "teacher's book" is considered to be three words by both implementations.
The UDF version is, of course, significantly faster. For a quick test I tried both on a dataset from Project Guttenberg consisting of 9751 records totaling about 3 GB. The UDF did all of them in 18 seconds, while the stored function took 63 seconds to process just 30 records (which UDF does in 0.05 seconds). So the UDF is roughly 1000 times faster in this case.
UDF will beat any other method in speed that does not involve modifying MySQL source code. This is because it has access to the string bytes in memory and can operate directly on bytes without them having to be moved around. It is also compiled into machine code and runs directly on the CPU.
Well I tried to use the function defined above and it was great, except one scenario.
In English you have strong use of ' as part of the word. The function above, at least to me, counted "haven't" as 2.
So here is my little correction:
DELIMITER $$
CREATE FUNCTION wordcount(str TEXT)
RETURNS INT
DETERMINISTIC
SQL SECURITY INVOKER
NO SQL
BEGIN
DECLARE wordCnt, idx, maxIdx INT DEFAULT 0;
DECLARE currChar, prevChar BOOL DEFAULT 0;
SET maxIdx=CHAR_LENGTH(str);
WHILE idx < maxIdx DO
SET currChar=SUBSTRING(str, idx, 1) RLIKE '[[:alnum:]]' OR SUBSTRING(str, idx, 1) RLIKE "'";
IF NOT prevChar AND currChar THEN
SET wordCnt=wordCnt+1;
END IF;
SET prevChar=currChar;
SET idx=idx+1;
END WHILE;
RETURN wordCnt;
END
$$