MySQL: Number formatting issue

I'm working on a gig right now where the client wants the user to be able to search for a product by product code.
A product code is formatted like so: 123.4567.89
So, the search box should return that product whether the user enters the number with the periods, without the periods, or with spaces.
So, all of the following should return the product: 123.4567.89, 123456789, 123 4567 89.
My current query looks like so:
SELECT *
FROM products
WHERE product_code LIKE '%$search_code%'
I'm at a loss as to how I would revise that to include all the different possibilities of how a user would input these numbers.
Thanks in advance for any help.

[Front End] Limit the characters the user can enter. Allow only digits, periods and spaces; don't allow any alpha characters (if all your product SKUs are numerical).
[Middle Tier] After the form is posted, double-check the data for extraneous characters on the back end. If the client somehow managed to bypass the validation on the front end, you can catch it here. Use a simple search and replace in your language of choice.
[Database/Back-End] Once the data is restricted to numeric digits and you send the SKU to your database query, strip the periods from the product_code column in the comparison. If you know that periods are the only separators stored in the SKUs, just search excluding them, e.g.
SELECT *
FROM products
WHERE REPLACE(product_code,'.','') = #productCode
Avoid wildcard (%...%) searches; they're expensive.

You have to normalize the number input by the user before doing the search, that is: make it have the same format as the number stored in the database.
For instance, if the numbers are stored in the database without the periods (like 123456789), you have to pre-process the number input by the user to also remove the periods, spaces and any other characters from it.
Edit: if the numbers are stored in the database with the periods, then you also need to normalize them by removing the periods, as HertzaHaeon pointed out in his answer.

How about removing dots and spaces in both the database code and the search input, so you have just the digits? Something like this:
WHERE REPLACE(product_code, '.', '') LIKE '%formatted_search_code%'
For the search input, you can strip everything but digits from it with a regular expression or simple substring replacement.
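As a rough illustration of stripping the search input down to digits, here is a minimal Python sketch. The function name `normalize_code` is made up for this example, and it assumes product codes contain only ASCII digits plus separators.

```python
import re


def normalize_code(raw: str) -> str:
    """Return only the ASCII digits of a user-entered product code."""
    return re.sub(r"[^0-9]", "", raw)


# All three input styles from the question collapse to the same string.
for entered in ("123.4567.89", "123456789", "123 4567 89"):
    print(normalize_code(entered))
```

The normalized value can then be compared against the dot-free form of product_code.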

I think the best solution to your problem is to format the search value so that it matches the format used in your database. But then you will only find the product if the user fills in the whole product number. If this is not the desired solution and you want to let the user fill in any part of a product code and find all the products whose code contains that part, then I think you should filter out the periods in your database.
The fastest solution would be to do it in the database itself. Remove the dots from your product code or add an extra field containing the product code without dots. This will speed up the query when the dataset gets larger.
If you do not want to do that, you can always filter out the dots in the search query:
REPLACE(product_code,'.','') LIKE '%$search_code%'
This will do the trick but can be very slow when the dataset gets bigger.
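The "extra field without dots" idea could look roughly like the sketch below. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the column name `code_digits` and the helper `find_products` are assumptions made for the example, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# code_digits is a hypothetical extra column holding the code without dots.
conn.execute("CREATE TABLE products (product_code TEXT, code_digits TEXT)")
for code in ("123.4567.89", "987.6543.21"):
    conn.execute(
        "INSERT INTO products VALUES (?, ?)", (code, code.replace(".", ""))
    )


def find_products(search_digits: str) -> list[str]:
    """Return product codes whose dot-free form contains search_digits."""
    rows = conn.execute(
        "SELECT product_code FROM products "
        "WHERE code_digits LIKE ? ORDER BY product_code",
        (f"%{search_digits}%",),
    )
    return [row[0] for row in rows]


print(find_products("123456789"))
print(find_products("4567"))
```

The search value is passed as a bound parameter here rather than pasted into the SQL string, and it is expected to be already reduced to digits.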

I work with this all the time with social-security numbers. My solution is to take your input string, strip out any characters that are not digits, make sure that the string is the proper length, and then use the substring function to break the string up and put it back together with the delimiters. If you're using PHP, the function might look like this:
<?php
function FormatProductCode($String) {
    // Keep only the digits.
    $String = preg_replace("/[^0-9]/", "", $String);
    if ($String == "") {
        return null;
    }
    // Left-pad with zeros to nine digits, then re-insert the periods.
    $String = str_pad($String, 9, "0", STR_PAD_LEFT);
    return substr($String, 0, 3) . "." . substr($String, 3, 4) . "." . substr($String, 7, 2);
}
?>
Use this function any time that you need to input data into the database or compare data.
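For comparison, here is a hedged Python sketch of the same idea. The name `format_product_code` is hypothetical, and unlike the PHP version it rejects input that does not have exactly nine digits instead of padding it, which is one way to read "make sure that the string is the proper length".

```python
import re
from typing import Optional


def format_product_code(raw: str) -> Optional[str]:
    """Rebuild the canonical 123.4567.89 form, or return None if invalid."""
    digits = re.sub(r"[^0-9]", "", raw)
    if len(digits) != 9:
        # Assumption: a valid product code always has exactly nine digits.
        return None
    return f"{digits[:3]}.{digits[3:7]}.{digits[7:]}"


print(format_product_code("123 4567 89"))
print(format_product_code("1234"))
```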

Related

How can I group data by matching identical cells in same column then counting instances of a related column?

data output
I am pretty new to Webi and am having an issue creating a variable. I'm trying to check if there is more than 1 email address for each entity legacy account number and if 1 of the contact names contains "Annual Report". So when I flag each entity legacy account number for no email only the ones without a contact name that contains "Annual Report" will be pulled. In the example above only the yellow groups should be called no email. Right now all of them are being pulled into no email. I have tried using if and match as those are what I am most familiar with. Does anyone have any suggestions?
There are a number of ways you could do this. I am going to give an example using two variables, but you could easily combine them into one.
Has No Email Var=If(Match(Upper([Contact EmailAddress]); "NOEMAIL*"); 1; 0)
Annual Report Contact Name Var=If(Match(Upper([Contact Name]); "ANNUAL REPORT*"); 1; 0)
Then you would apply a report filter with two components...
Has No Email Var = 1
AND
Annual Report Contact Name Var = 0
Let me explain a few things...
The purpose of the Upper function is to work around the fact that the Match function is case-sensitive. If you know your email addresses are always lower case, you could remove the Upper function and have it match on "noemail*".
It is significant that I only have an asterisk ("*") at the end of the string being sought. That will only find a match where the corresponding column value starts with that string. If you want it to be true whenever the string is found anywhere in the column being searched, you would put asterisks on both ends.
You could also put limiting criteria in your query filter. But here is where things can get confusing. Within the query filter you can choose the Matches pattern operator. However, the wildcard character is different ("%" rather than "*") and you do not put double quotes around your search text. So you would have something like this...
Contact EmailAddress Matches pattern noemail%
AND
Contact Name Different from pattern Annual Report%
I am sure you noticed I didn't convert the search text to uppercase. In the Query Panel, Web Intelligence is case-insensitive and would likely follow the case-sensitivity of the database holding the source data. All of our databases are case-insensitive, so if yours is case-sensitive you may need to play around with this a bit. Or just go with the approach of creating the variables and report filters as I initially laid out.
If you want a wildcard for a single character rather than multiple characters (which is what "*" and "%" will do) you need to use a "?" within your variable definition or a "_" in your query filter.
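The difference between a trailing asterisk, asterisks on both ends, and a single-character wildcard can be tried out with Python's fnmatch module, which happens to use the same "*" and "?" characters. This is only an analogy and not Web Intelligence's Match function; the helper `matches` and the sample addresses are made up.

```python
from fnmatch import fnmatchcase


def matches(value: str, pattern: str) -> bool:
    """Case-sensitive wildcard match on the upper-cased value."""
    return fnmatchcase(value.upper(), pattern)


# Trailing "*" only: the value must start with the text.
print(matches("noemail@example.com", "NOEMAIL*"))
print(matches("bob.noemail@example.com", "NOEMAIL*"))
# "*" on both ends: the text may appear anywhere.
print(matches("bob.noemail@example.com", "*NOEMAIL*"))
# "?" stands for exactly one character.
print(matches("noemail1", "NOEMAIL?"))
```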
Hope this helps,
Noel

Remove string with wildcard in Notepad++

I'm trying to merge multiple JSON data sets into one large data set, due to a max limit of 100 on the server I'm pulling them from.
The easiest way to do this would be to eliminate the end of one set and the beginning of the next and replace it with "," so that there would be only one open and close to the entire large set. This is what appears between the last entry of one set and the first entry of the next currently:
],"version":"1.0"}{"error":"OK","limit":100,"offset":100,"number_of_page_results":100,
"number_of_total_results":20235,"status_code":1,"results":[
Again, I need that entire string replaced with just a comma, but the problem I'm encountering is that I had to change the offset between each data set to grab the next 100 entries, so the "offset":100, is different in each string ("offset":200, "offset":300, etc.). I can't seem to get wildcards to cooperate. I suspect it has something to do with all the brackets that are already in the string.
Any help would be appreciated. Thank you.
A regular expression that matches the whole input you provided (provided there's no new line characters) is:
\],"version":"1\.0"\}\{"error":"OK","limit":[0-9]+,"offset":[0-9]+,"number_of_page_results":[0-9]+,"number_of_total_results":[0-9]+,"status_code":[0-9]+,"results":\[
It will match any digits in place of all the numbers in your sample (except the version).
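Outside Notepad++, the same pattern can be applied in a script. The Python sketch below is one possible way to do it; the `\s*` pieces are an addition that tolerates a line break inside the boundary, and the two sample pages are invented to mirror the structure shown in the question.

```python
import json
import re

# The answer's pattern, plus \s* where the sample shows a line break.
BOUNDARY = re.compile(
    r'\],"version":"1\.0"\}\s*\{"error":"OK","limit":[0-9]+,"offset":[0-9]+,'
    r'"number_of_page_results":[0-9]+,\s*"number_of_total_results":[0-9]+,'
    r'"status_code":[0-9]+,"results":\['
)


def merge_pages(concatenated: str) -> str:
    """Replace every page boundary with a single comma."""
    return BOUNDARY.sub(",", concatenated)


# Made-up sample pages with one result each.
page1 = ('{"error":"OK","limit":100,"offset":0,"number_of_page_results":1,'
         '"number_of_total_results":2,"status_code":1,'
         '"results":[{"id":1}],"version":"1.0"}')
page2 = ('{"error":"OK","limit":100,"offset":100,"number_of_page_results":1,'
         '"number_of_total_results":2,"status_code":1,'
         '"results":[{"id":2}],"version":"1.0"}')

merged = json.loads(merge_pages(page1 + page2))
print([item["id"] for item in merged["results"]])
```

Parsing the result with json.loads is a quick check that the merged text is still valid JSON.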

run a between query using wildcards

Using Microsoft Access 2010
I have a field for [box_no]. I need to run a query to get a list of all box numbers within a range. Here is my issue....several box numbers have a letter in front of them (typically the letter "T"), several do not. If I use *Like* '*'+[Search Box Number]+'*' in the query I have no problem searching for a box with or without a letter. I can use *Between [beginning box number] And [ending box number]* in the query to retrieve a range of box numbers, as long as I include the corresponding letter(s). Is there a string or something I can write to get the result I want?
EXAMPLE: I want to retrieve a report for box numbers 732913000 to 732914000. 732913000 through 73213055 do not have a letter in the beginning. 73213056 has the letter T in the beginning (T73213056). I need to make sure all box numbers appear in the report, regardless of the beginning character.
I hope this makes sense! :-)
You can set up a function in VBA to strip out the leading character if it's not numeric and then use that function in your query.
The function would be;
Function StripChar(BoxNumber As String) As String
If IsNumeric(Left(BoxNumber, 1)) Then
StripChar = BoxNumber
Else
StripChar = Right(BoxNumber, Len(BoxNumber) - 1)
End If
End Function
You can then use the function in your query;
SELECT BoxNumber, StripChar([BoxNumber]) AS Stripped
FROM <YourTable> WHERE (Val(StripChar([BoxNumber])) Between 100 And 200);
You could probably put the whole thing together using SQL but it's probably easier to work with this because you can easily amend the VBA function to do the adaptation.
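To show the logic outside Access, here is a small Python sketch of the same approach: drop one leading non-digit character, then compare numerically. The function names and the sample box numbers are invented, and it assumes every box number is non-empty and numeric after the optional prefix.

```python
def strip_leading_letter(box_no: str) -> str:
    """Drop the first character if it is not a digit (e.g. a leading 'T')."""
    if box_no[:1].isdigit():
        return box_no
    return box_no[1:]


def in_range(box_no: str, low: int, high: int) -> bool:
    """True if the numeric part of box_no lies between low and high."""
    return low <= int(strip_leading_letter(box_no)) <= high


# Hypothetical sample data.
boxes = ["732913000", "T732913056", "732914001", "T732913999"]
print([b for b in boxes if in_range(b, 732913000, 732914000)])
```

Converting to a number before comparing matters, because a range check on text compares character by character.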

access 2003 text display leading zero

We are currently using Access 2003. I have a vehicle maintenance log which has a text field for vehicle numbers. The field needs to be text as it will have an R in it when a vehicle is retired. They would like to have the field be a four-digit number with leading zeros. So vehicle 22 would be displayed in the table and reports as 0022, and when retired it would be R0022. I have tried to change the format of the field to "0000" and "0000"# but neither of these will display the leading zeros.
Do you really want your users to manually edit that field?
I don't like solutions like this, because they are error-prone (unless you check a lot of things to make sure that no one enters invalid data) and feel inelegant to your users.
I would just save the following in the table:
the vehicle number in a numeric field (22, not 0022)
a boolean field which indicates if the vehicle is retired
This is much easier for your users to work with:
they can just enter new vehicle numbers without having to think about leading zeros
to retire a vehicle, they just need to set a checkbox, instead of putting the right letter in front of the vehicle number
Plus, showing the desired number R0022 now becomes just a matter of displaying/formatting the data from the table:
Public Function GetDisplayNo(VehicleNo As Integer, IsRetired As Boolean) As String
GetDisplayNo = IIf(IsRetired, "R", "") & Right("0000" & VehicleNo, 4)
End Function
You can use this function like this:
GetDisplayNo(22, True) returns R0022
GetDisplayNo(22, False) returns 0022
And if you need to display a list of vehicle numbers in a report or a continuous form, you can directly use this function in the underlying query:
SELECT
Vehicles.VehicleNo,
Vehicles.IsRetired,
GetDisplayNo([VehicleNo],[IsRetired]) AS DisplayNumber
FROM Vehicles;
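The same display rule can be written in Python as a quick sketch; `get_display_no` is a hypothetical port of the VBA function. One difference worth noting: the VBA version keeps only the last four characters, while this one would show all digits of a number above 9999.

```python
def get_display_no(vehicle_no: int, is_retired: bool) -> str:
    """Zero-pad the number to four digits and prefix 'R' when retired."""
    prefix = "R" if is_retired else ""
    return f"{prefix}{vehicle_no:04d}"


print(get_display_no(22, True))
print(get_display_no(22, False))
```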

Searching Mysql for similar string

People have different ideas of how to search for the same term.
For example Tri-Valley, Trivalley, Tri Valley (and possibly even incorrect spellings)
Currently that search is done like this
SELECT * FROM `table` WHERE `SchoolDistrict` LIKE '%tri valley%';
Is there an easy way to say 'space dash or no space' without writing out three like statements?
It seems like it could easily be done:
SELECT * FROM `table` WHERE `SchoolDistrict` LIKE '%tri%valley%';
But this only works if the initial input is 'tri-valley' or 'tri valley' If the initial input is 'trivalley' I have no idea where to place the % (theoretically that is, actually, I do, as we are only looking at about a dozen different school districts, but I'm looking to solve the larger problem)
You could consider using SOUNDEX, or SOUNDS LIKE if you have a lot of incorrect spellings. If you've got a lot of rows (or even if you don't), it might be wise to store the output of the SOUNDEX in an additional column.
I'd also recommend -- in the interests of accuracy -- introducing a separate table with an authoritative list of school districts, and run a query to find those which aren't in that list.
MySQL has an operator called SOUNDS LIKE.
An alternative here is to recast the problem from search to select, if possible. Instead of letting your users enter free-form text to choose a school district, if you have a set of school districts generate a dropdown (or set of cascading dropdowns if the list is large, say by county, then by school district) and allow the user to select the appropriate one. Use this both for "searching" and for data entry to eliminate non-canonical entries. Obviously this only works when you can enumerate all of the entries.
Alternatively you could allow the user to choose a starts with or contains type search and simply generate the appropriate SQL ('tri%' or '%tri%') based on the selected search type. If the user understands that the search type is starts with or contains, they will likely adjust their search string until it yields the results they need.
The second statement you posted should do the trick:
SELECT * FROM `table` WHERE `SchoolDistrict` LIKE '%tri%valley%';
What you should do before you pass the search term into the select statement is replace all spaces and other non-alphanumeric characters with the % sign. For example,
SearchTerm = SearchTerm.Replace(" ","%");
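Replacing separators with % does not help when the user types 'trivalley' with no separator at all. One alternative is to strip separators from both the stored name and the search term before comparing, sketched here with Python's sqlite3 standing in for MySQL. The `name_key` column, the `squash` and `search` helpers, and the sample rows are all assumptions made for the example.

```python
import re
import sqlite3


def squash(text: str) -> str:
    """Lower-case the text and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


conn = sqlite3.connect(":memory:")
# name_key is a hypothetical extra column holding the squashed name.
conn.execute("CREATE TABLE districts (name TEXT, name_key TEXT)")
for name in ("Tri-Valley", "Tri Valley", "Trivalley", "Triton Valley"):
    conn.execute("INSERT INTO districts VALUES (?, ?)", (name, squash(name)))


def search(term: str) -> list[str]:
    """Return district names whose squashed form contains the squashed term."""
    rows = conn.execute(
        "SELECT name FROM districts WHERE name_key LIKE ? ORDER BY name",
        (f"%{squash(term)}%",),
    )
    return [row[0] for row in rows]


print(search("tri valley"))
print(search("trivalley"))
```

With this approach 'Triton Valley' is not returned, whereas a '%tri%valley%' pattern would match it.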