MySQL Add Column that Summarizes data from Another Column - mysql

I have a column in MySQL table which has 'messy' data stored as text like this:
**SIZE**
2
2-5
6-25
2-10
26-100
48
50
I want to create a new column "RevTextSize" that rewrites the data in this column to a pre-defined range of values.
If Size=2, then "RevTextSize"= "1-5"
If Size=2-5, then "RevTextSize"= "1-5"
If Size=6-25, then "RevTextSize"="6-25"
...
This is easy to do in Excel, SPSS and other such tools, but how can I do it in the MySQL table?

You can add a column like this:
ALTER TABLE messy_data ADD revtextsize VARCHAR(30);
To populate the column:
UPDATE messy_data
SET revtextsize
= CASE
WHEN size = '2' THEN '1-5'
WHEN size = '2-5' THEN '1-5'
WHEN size = '6-25' THEN '6-25'
ELSE size
END
This is a brute-force approach, identifying each distinct value of size and specifying a replacement.
You could use another SQL statement to help you build the CASE expression:
SELECT CONCAT(' WHEN size = ''',d.size,''' THEN ''',d.size,'''') AS stmt
FROM messy_data d
GROUP BY d.size
Save the result from that into your favorite SQL text editor, and hack away at the replacement values. That would speed up the creation of the CASE expression for the statement you need to run to set the revtextsize column (the first statement).
If you want to build something "smarter" that dynamically evaluates the contents of size and makes an intelligent choice, that would be more involved. If I were going to do that, I'd do it in the second statement, the one that generates the CASE expression. I'd prefer to review the generated expression before I run the update statement. I prefer an update statement that is easy to understand and easy to explain.

Use INSTR() to locate the "-" in your string and SUBSTRING(str, pos, len) to extract the start and end numbers. Then use BETWEEN to build the conditions of your CASE expression.
Hope this will help in building your solution.
Thanks
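The parse-then-bucket idea from this answer can be sketched outside the database to check the mapping before writing the CASE expression. This is only an illustration in Python: the function names are hypothetical, and the '26-100' bucket is an assumption (the question only spells out '1-5' and '6-25').

```python
import re

# Hypothetical target buckets as inclusive (low, high, label) tuples.
# Only '1-5' and '6-25' come from the question; '26-100' is assumed.
BUCKETS = [(1, 5, "1-5"), (6, 25, "6-25"), (26, 100, "26-100")]

def parse_size(size):
    """Return (start, end) for '2-5', or (n, n) for a single number like '48'."""
    match = re.fullmatch(r"\s*(\d+)\s*(?:-\s*(\d+))?\s*", size)
    if match is None:
        return None
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    return start, end

def rev_text_size(size):
    """Map a raw size to the bucket containing its whole range, else keep it."""
    parsed = parse_size(size)
    if parsed is None:
        return size
    start, end = parsed
    for low, high, label in BUCKETS:
        if low <= start and end <= high:
            return label
    # A value such as '2-10' straddles two buckets and needs a manual decision.
    return size
```

Values that do not fit cleanly into one bucket are returned unchanged, which mirrors the ELSE size branch of the UPDATE statement in the first answer.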

Related

Using CASE WHEN Expression with where condition SQL Server

I have a new Spring Boot project that uses SQL Server, and I need to replace the MySQL native query on a Repository method in my old project with a SQL Server native query. It's a complex query with a CASE WHEN expression in the WHERE condition. When I test that query in SQL Server Management Studio, it shows errors like the ones in the image below.
[image not available: screenshot of the errors in SQL Server Management Studio]
And here's my old native query, used with MySQL on the Repository method, that I want to replace with a SQL Server query:
[image not available: screenshot of the MySQL native query]
Please help me to find the solution.
Thank you in advance!!
This is what you have and what you should have posted as text within your question. As text it becomes searchable and copyable by people trying to help YOU.
case when @num = 1 then p.merchant_name = @query else 1=1 end
CASE is an expression in T-SQL. It is not a control-of-flow construct like it is in many other languages. To use an "optional" filter, you need to construct a boolean expression using CASE which handles the "optional" attribute correctly. Often this is done with a bit more complexity using CASE like this:
case when @num = 1 and p.merchant_name <> @query then 0 else 1 end = 1
So here, CASE is used to return a value that can be tested in a comparison. There is no magic in using 0 or 1. Use any values of any type.
When @num is 1 and the values do NOT match, the THEN branch (0) is returned.
When @num is 1 and the values match, the ELSE branch (1) is returned.
When @num is anything but 1, the ELSE branch (1) is returned.
So when the CASE expression returns 0 (really - anything but 1), the row is ignored (removed from the resultset).
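The three cases can be checked with a small truth-table sketch. It is written in Python purely as an illustration; the function and variable names are hypothetical, and SQL NULL handling is deliberately ignored.

```python
def row_passes(num, merchant_name, query):
    """Mimic the CASE-based optional filter: the CASE yields 0 or 1, then is compared to 1."""
    case_value = 0 if (num == 1 and merchant_name != query) else 1
    return case_value == 1

def filter_rows(rows, num, query):
    """Keep only the rows for which the CASE expression evaluates to 1."""
    return [row for row in rows if row_passes(num, row["merchant_name"], query)]

# Made-up sample data for the illustration.
rows = [{"merchant_name": "Acme"}, {"merchant_name": "Globex"}]
```

With num set to 1 the filter is active and only matching merchants survive; with any other value every row passes, which is the "optional" behaviour described in the answer.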
Given that your query is actually constructed in an application, you should consider building the query dynamically and adding parameters as needed. That will likely generate a more efficient query that can be better optimized by the database engine. Alternatively you can review this kitchen sink discussion and Erland's discussion of dynamic search conditions. TBH it looks like someone used @num as a kludge to avoid adding parameters for the eight specific filter values. If I want to filter on both merchant name and store name, I can't with this approach.

Removing addresses from Column

I have a column within my database that holds text similar to this
CNEWS # Trinidad : "By Any Means Necessary" Watson Duke Swims And Sails To Toco http://somewebsitehere.com
What can I do to remove the entire http address from the column? Please note that some links may be broken so it may have http:// somewebsitehere.com
I was thinking of using a substring index but not sure that would work.
You could use whichever your favorite programming language is to iterate through the rows in the table, pluck out that column, apply a regular expression replacement rule to it, then update the row in the table with the new value.
Here is some pseudo-code:
theRows = SELECT * FROM TheTable WHERE 1;
foreach row in theRows
BEGIN
oldColumnValue = row[theColumnName]
// Removes any link appearing at the end of the column, including
// "broken" links that have whitespace after http://
newColumnValue = oldColumnValue.replace(/\s*https?:\/\/\s*\S*\s*$/, '')
UPDATE TheTable SET theColumnName = newColumnValue WHERE id = row[id]
END
For something as small and specific as this, you could use Perl with the DBI library to connect to MySQL. Here's a useful resource on regular expressions if you want to go more into it: http://www.regular-expressions.info/perl.html
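As a rough sketch of the replacement rule itself, here is one way to write it in Python. The pattern is an assumption about what counts as a link: it only targets a link at the end of the text, and it tolerates whitespace after the slashes to cover the "broken" form mentioned in the question. The database read and update steps are left out.

```python
import re

# Trailing link: 'http://' or 'https://', optional stray whitespace after the
# slashes (the "broken" form), then the rest of the URL up to the end of the text.
TRAILING_LINK = re.compile(r"\s*https?://\s*\S*\s*$")

def strip_trailing_link(text):
    """Return text with a trailing link removed; other text is left unchanged."""
    return TRAILING_LINK.sub("", text)
```

A link in the middle of the text does not match, because the pattern is anchored to the end of the string.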

How can a SET column be queried in MySQL while ignoring ordering?

I have a MySQL DB with a table that has a SET type column with the following definition:
CREATE TABLE t (
c SET('V','A','L','U','E')
)
I would like to write a SELECT query that returns all the rows where c equals ('A','L','E').
This can be done by writing the following query:
SELECT * FROM t WHERE c = 'A,L,E'
The query that I would like to write is one that returns the same result for an unordered input such as 'L','A','E'.
I couldn't find an elegant way to do so, and couldn't find anything helpful in the official documentation.
You can fix nacho's suggestion using the following:
WHERE floor(pow(2,FIND_IN_SET('A',c)-1))+
floor(pow(2,FIND_IN_SET('L',c)-1))+
floor(pow(2,FIND_IN_SET('E',c)-1))=c
This is by no means an "elegant solution"... I would rather use a simpler one if possible.
FIND_IN_SET provides the position in the enum, so we have to raise 2 by this number to get the internal representation of the SET value.
The floor() function is used to keep the expression 0 when find_in_set returns 0.
Note that you still have the risk of false positives when checking against illegal SET values (e.g. looking for 'A','L','E' and 'X' will return true)
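To make the bit arithmetic concrete, here is a small Python sketch of how an order-independent value can be computed from the member list. It assumes MySQL's documented numeric representation of SET values, where the first declared member owns the lowest bit; the function name is hypothetical.

```python
# SET definition from the question, in declaration order.
SET_MEMBERS = ["V", "A", "L", "U", "E"]

def set_bitmask(values):
    """Combine members into a bitmask; the i-th declared member (0-based) owns bit 2**i."""
    mask = 0
    for value in values:
        if value not in SET_MEMBERS:
            # Rejecting unknown members avoids the false positive noted above.
            raise ValueError(f"{value!r} is not a member of the SET")
        mask |= 1 << SET_MEMBERS.index(value)
    return mask
```

Because the mask depends only on which members are present, ['A','L','E'] and ['L','A','E'] produce the same number, which an application could then compare against the column's numeric value.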
You need to use FIND_IN_SET:
SELECT *
FROM t
WHERE FIND_IN_SET('A',c)>0 AND FIND_IN_SET('L',c)>0 AND FIND_IN_SET('E',c)>0
I don't know if this will work, but you can also try:
SELECT *
FROM t
WHERE FIND_IN_SET('A,L,E',c)>0
Another possible approach is to check each item separately + check that the sizes of the groups match (the assumption is that the searched set has no repetitions):
SELECT *
FROM t
WHERE FIND_IN_SET('A',c)>0 AND FIND_IN_SET('L',c)>0 AND FIND_IN_SET('E',c)>0 AND BIT_COUNT(c) = 3

Filter Dataset using SQL Query

I am using Zeos and mysql in my delphi project.
What I would like to do is filter a dataset using a textbox.
To do that, I am using the following query in the textbox's OnChange event:
ZGrips.Active := false;
ZGrips.SQL.Clear;
ZGrips.SQL.Add('SELECT Part_Name, Description, OrderGerman, OrderEnglish FROM Part');
ZGrips.SQL.Add('WHERE Part_Name LIKE ' + '"%' + trim(txt_search.Text) + '%"');
ZGrips.Active := true;
After I run the program and type the first character in the textbox, I get an empty dataset in my DBGrid,
so the DBGrid shows nothing. If I then type a second character, I get some results in the DBGrid. There is even stranger behavior: if I use an AS clause in my SQL query, like:
Part_Name AS blablabla,
Description AS blablabla,
OrderGerman AS OG,
OrderEnglish AS OE
then the DBGrid shows only 2 columns, Part_Name and Description. I don't understand why it is ignoring the 3rd and 4th columns.
Thanks in advance for any help.
Always use parameters
Firstly, you need to use parameters; otherwise your query will break (or worse) when the user enters the wrong characters in the search box.
See: How does the SQL injection from the "Bobby Tables" XKCD comic work?
Parameters also make your query faster, because the database engine only has to parse the query once.
If you change a parameter, the engine knows that the query itself has not changed and will not parse it again.
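The effect of binding the search text as a parameter can be tried out in a few lines. This sketch uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL and Zeos, so the placeholder syntax (? instead of :searchtext) and the sample rows are assumptions made only for the illustration.

```python
import sqlite3

def search_parts(conn, search_text):
    """Prefix search with a bound parameter; the user text never becomes part of the SQL."""
    sql = "SELECT Part_Name, Description FROM Part WHERE Part_Name LIKE ?"
    return conn.execute(sql, (search_text.strip() + "%",)).fetchall()

# In-memory stand-in for the Part table, with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Part (Part_Name TEXT, Description TEXT)")
conn.executemany(
    "INSERT INTO Part VALUES (?, ?)",
    [("Grip A", "small grip"), ("Grip B", "large grip"), ("Handle", "door handle")],
)
```

Input containing quotes is treated as plain data to match against, so it cannot terminate the string literal and change the statement.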
Don't use clear and add
Just supply the SQL as text in one go, it's faster.
This is esp. true in a loop, outside the loop you will not notice the difference.
Your code should read something like:
procedure TForm1.SetupSearch; // run this only once
begin
  ZGrips.Active := False;
  // Assigning SQL.Text replaces the whole statement, so Clear and Add are not needed.
  // Note the trailing space after 'Part', otherwise the text reads 'PartWHERE'.
  ZGrips.SQL.Text :=
    'SELECT Part_Name, Description, OrderGerman, OrderEnglish FROM Part ' +
    'WHERE Part_Name LIKE :searchtext'; // note: no % here
end;

// See: http://docwiki.embarcadero.com/Libraries/XE2/en/Vcl.StdCtrls.TEdit.OnChange
procedure TForm1.Edit1Change(Sender: TObject);
begin
  if Edit1.Modified then
    Timer1.Enabled := True; // a TTimer is switched on via Enabled
end;

procedure TForm1.Timer1Timer(Sender: TObject);
var
  SearchText: string;
begin
  Timer1.Enabled := False;
  SearchText := Edit1.Text + '%'; // trailing wildcard only
  if SearchText <> ZGrips.Params[0].AsString then begin
    ZGrips.Active := False; // close the dataset before changing the parameter
    ZGrips.Params[0].AsString := SearchText;
    ZGrips.Active := True;
  end;
end;
Use a timer
As per @MartinA's suggestion, use a timer and run the query only every so often.
The weird behaviour you're getting may be because you're stopping and reactivating a new query before the old one has had time to finish.
The Params[index: integer] property is a bit faster than the ParamByName method,
although this does not really matter outside a loop.
Allow the database to use an index!
Using only a trailing wildcard % is faster than using a leading wildcard, because the database can use an index only if the pattern does not start with a wildcard.
If you want to use a leading wildcard, then consider storing the data in reverse order and using a trailing wildcard instead.
Full-text indexes are much better than like
Of course, if you use both a leading and a trailing wildcard, then you have to use a full-text index.
In MySQL you'll then use the MATCH ... AGAINST syntax,
see: Differences between INDEX, PRIMARY, UNIQUE, FULLTEXT in MySQL?
and: Which SQL query is better, MATCH AGAINST or LIKE?
The latest versions of MySQL support full-text indexes in InnoDB.
Remember to never use MyISAM, it's unreliable.

ORDERBY "human" alphabetical order using SQL string manipulation

I have a table of posts with titles that are in "human" alphabetical order but not in computer alphabetical order. These are in two flavors, numerical and alphabetical:
Numerical: Figure 1.9, Figure 1.10, Figure 1.11...
Alphabetical: Figure 1A ... Figure 1Z ... Figure 1AA
If I order by title, the result is that 1.10-1.19 come between 1.1 and 1.2, and 1AA-1AZ come between 1A and 1B. But this is not what I want; I want "human" alphabetical order, in which 1.10 comes after 1.9 and 1AA comes after 1Z.
I am wondering if there's still a way in SQL to get the order that I want using string manipulation (or something else I haven't thought of).
I am not an expert in SQL, so I don't know if this is possible, but if there were a way to do conditional replacement, then it seems I could impose the order I want by doing this:
delete the period (which can be done with replace, right?)
if the remaining figure number is fewer than three characters, add a 0 (zero) after the first character.
This would seem to give me the outcome I want: 1.9 would become 109, which comes before 110; 1Z would become 10Z, which comes before 1AA. But can it be done in SQL? If so, what would the syntax be?
Note that I don't want to modify the data itself—just to output the results of the query in the order described.
This is in the context of a Wordpress installation, but I think the question is more suitably an SQL question because various things (such as pagination) depend on the ordering happening at the MySQL query stage, rather than in PHP.
My first thought is to add an additional column that is updated by a trigger or other outside mechanism.
1) Use that column to do the order by
2) Whatever mechanism updates the column will have the logic to create an acceptable order by surrogate (e.g. it would turn 1.1 into AAA or something like that).
Regardless... this is going to be a pain. I do not envy you.
You can create a function that contains the logic for the human sort order, like this:
CREATE FUNCTION [dbo].[GetHumanSortOrder] (@ColumnName VARCHAR(50))
RETURNS VARCHAR(20)
AS
BEGIN
    DECLARE @HumanSortOrder VARCHAR(20)
    SELECT @HumanSortOrder =
        CASE
            -- Two characters left after stripping 'Figure ' and the period: pad with a 0
            WHEN (LEN(replace(replace(<Column_Name>,'.',''),'Figure ',''))) = 2
            THEN
                CONCAT (SUBSTRING(replace(replace(<Column_Name>,'.',''),'Figure ',''),1,1),'0',SUBSTRING(replace(replace(<Column_Name>,'.',''),'Figure ',''),2,2))
            ELSE
                replace(replace(<Column_Name>,'.',''),'Figure ','')
        END
    FROM <Table_Name> AS a (NOLOCK)
    WHERE <Column_Name> = @ColumnName
    RETURN @HumanSortOrder
END
This function gives you values like 104, 107, 119, 10A, 10B etc., as desired.
And you can use this function in the ORDER BY:
SELECT * FROM <Table_Name> ORDER BY dbo.GetHumanSortOrder(<Column_Name>)
Hope this helps
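The padding rule from the question (strip the period, then insert a 0 after the first character when fewer than three characters remain) can be tried quickly outside the database. This Python sketch is only an illustration with a hypothetical function name; it assumes titles of the form 'Figure N...' with a single-digit figure number and at most two characters after it.

```python
def human_sort_key(title):
    """Build the surrogate key described in the question, e.g. 'Figure 1.9' -> '109'."""
    number = title.replace("Figure ", "").replace(".", "")
    if 0 < len(number) < 3:
        # Pad short numbers: '19' -> '109', '1Z' -> '10Z'
        number = number[0] + "0" + number[1:]
    return number

numeric = sorted(["Figure 1.10", "Figure 1.9", "Figure 1.1", "Figure 1.2"], key=human_sort_key)
alphabetic = sorted(["Figure 1AA", "Figure 1B", "Figure 1A", "Figure 1Z"], key=human_sort_key)
```

Sorting on the surrogate key places 1.10 after 1.9 and 1AA after 1Z, which is the same ordering the SQL function is meant to produce.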