How to create a table of wildcards - mysql

I have a table called blacklisted_usernames. These are usernames with wildcards in them that aren't allowed to be registered onto my site.
create table blacklisted_usernames (
name varchar(64) not null
);
Some dummy data:
insert into blacklisted_usernames (name) values
('%admin%'),
('king%'),
('bad'),
('%cool');
The % indicates the same thing as the wildcard in the LIKE function in MySQL. I want to create an efficient case insensitive query which tells me if a username is blacklisted or not. For example is the username AdminJohn allowed? The answer would be no, because of %admin% being in the blacklisted_usernames table.
I understand I can do something like
SELECT 1
WHERE 'AdminJohn' LIKE '%admin%'
or 'AdminJohn' LIKE 'king%'
or 'AdminJohn' LIKE 'bad'
or 'AdminJohn' LIKE '%cool'
But then I am manually typing out all the LIKE conditions. I also don't think it would be efficient to loop over the patterns and check them one by one. Is there an automatic but efficient way to check a username against the names in the blacklisted_usernames table and determine whether it is allowed?

select 1 from blacklisted_usernames where 'AdminJohn' like name limit 1
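For anyone who wants to try the query above from application code, here is a minimal sketch. It assumes Python with an in-memory SQLite database standing in for MySQL (SQLite's LIKE also treats % as a wildcard and ignores ASCII case by default, as MySQL does under its default collations); the helper name is_blacklisted is made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blacklisted_usernames (name VARCHAR(64) NOT NULL)")
conn.executemany(
    "INSERT INTO blacklisted_usernames (name) VALUES (?)",
    [("%admin%",), ("king%",), ("bad",), ("%cool",)],
)


def is_blacklisted(username):
    """Return True if any stored pattern matches the candidate username."""
    row = conn.execute(
        # The candidate is the left operand; the stored value is the pattern.
        "SELECT 1 FROM blacklisted_usernames WHERE ? LIKE name LIMIT 1",
        (username,),
    ).fetchone()
    return row is not None


if __name__ == "__main__":
    for candidate in ("AdminJohn", "John", "KingKong", "badger"):
        print(candidate, is_blacklisted(candidate))
```

Passing the username as a bound parameter rather than pasting it into the SQL string also keeps the check safe against injection.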

Another approach is to use a REGEXP to test them all in a single test:
WHERE 'AdminJohn' REGEXP 'admin|^king|^bad$|cool$'
(Note how the wildcards go away on some and "anchors" are needed on the others.)
Probably this is faster than the original OR+LIKEs or the JOIN+LIKEs. When checking one name REGEXP will be plenty fast (no table scan).
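If the patterns live in application code, the same "one regex for everything" idea can be sketched in Python. This is only an illustration: like_to_regex and build_blacklist_regex are hypothetical helpers, and the translation handles just % and _ (it ignores LIKE's backslash escapes).

```python
import re


def like_to_regex(pattern):
    """Translate a LIKE pattern into regex source (% -> .*, _ -> .)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return "".join(parts)


def build_blacklist_regex(patterns):
    """Combine all patterns into one case-insensitive alternation."""
    alternation = "|".join(f"(?:{like_to_regex(p)})" for p in patterns)
    return re.compile(alternation, re.IGNORECASE | re.DOTALL)


BLACKLIST = build_blacklist_regex(["%admin%", "king%", "bad", "%cool"])


def is_blacklisted(username):
    # fullmatch supplies the anchoring that LIKE applies implicitly.
    return BLACKLIST.fullmatch(username) is not None
```

Using fullmatch means patterns without a leading or trailing % behave as anchored, which is what the explicit ^ and $ in the REGEXP above achieve.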

If you need to check all existing names against the blacklisted patterns
SELECT usernames.name,
blacklisted_usernames.name blacklisted_pattern
FROM usernames
JOIN blacklisted_usernames
ON usernames.name LIKE blacklisted_usernames.name
Note that this is a full table scan: indexes won't be used, so the query will be slow.
You may need to check all existing names only against newly added or altered patterns. In that case, add a corresponding WHERE condition on the blacklisted_usernames table (for example, on a created_at or updated_at column).
If you need to check currently created username against all patterns then use the solution provided by ysth.
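A runnable sketch of the join with the suggested date filter, assuming Python with SQLite in place of MySQL. The created_at column, the sample rows, and the function name offending_names are all made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; created_at is a hypothetical column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE usernames (name VARCHAR(64) NOT NULL);
    CREATE TABLE blacklisted_usernames (
        name VARCHAR(64) NOT NULL,
        created_at TEXT NOT NULL
    );
    INSERT INTO usernames (name) VALUES ('AdminJohn'), ('alice'), ('KingKong');
    INSERT INTO blacklisted_usernames (name, created_at) VALUES
        ('%admin%', '2020-01-01'),
        ('king%',   '2020-06-01');
""")


def offending_names(since):
    """List (username, pattern) pairs for patterns added on or after `since`."""
    return conn.execute(
        """
        SELECT u.name, b.name AS blacklisted_pattern
        FROM usernames AS u
        JOIN blacklisted_usernames AS b ON u.name LIKE b.name
        WHERE b.created_at >= ?
        ORDER BY u.name
        """,
        (since,),
    ).fetchall()
```

The WHERE filter narrows the set of patterns, but every username is still compared against each remaining pattern, so the scan of usernames remains.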

Related

SQL OR statement in Wildcard

My current MySql wild card is LIKE '%A%B%'. This can return values that contain A and B.
Can anyone suggest how can I alter the wildcard statement to return values that contain either A or B.
Thanks in advance.
You can add as many LIKE conditions as you want inside the parentheses, joined with OR, as below
select * from tablename where (column_name like '%test%' or column_name like '%test1%' or
column_name like '%test2%' or column_name like '%test3%')
For more info have a look at the below link.
SQL Server using wildcard within IN
Hope that helps you
You can use REGEXP
select * from Table1 where some_column REGEXP '[AB]'
There are many ways to write this as a regular expression; the above basically means "containing A or B".
Generally you want to avoid REGEXP and LIKE '%something' because they cannot use indexes, so on large tables these operations become unusable. When you want to do a search of this kind it's always best to stop and ask: "Have I got the best database design?", "Can I use full text search instead?"
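To see the difference between '%A%B%' (A followed later by B) and the OR form (either one), here is a small sketch. It assumes Python with SQLite in place of MySQL and uses made-up sample rows; the search helper is hypothetical and only meant for fixed demo strings.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; LIKE ignores ASCII case in both.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (some_column TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?)",
    [("Alpha",), ("Bravo",), ("cab",), ("xyz",)],
)


def search(where_clause):
    """Run a fixed demo WHERE clause (never build SQL from user input)."""
    rows = conn.execute(
        f"SELECT some_column FROM Table1 WHERE {where_clause} "
        "ORDER BY some_column"
    )
    return [row[0] for row in rows]


# Requires an A and, somewhere after it, a B.
both_in_order = search("some_column LIKE '%A%B%'")
# Requires an A or a B anywhere.
either = search("some_column LIKE '%A%' OR some_column LIKE '%B%'")
```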

using regex in mysql based on middle of string value

I have a system that uses personalized URLs (i.e. JohnSmith.MyWebsite.com). In my database, these values are stored in the "purl" column.
If six months from now, I get another john smith I need to put into my system, I simply add a 1 to his name so that his purl becomes JohnSmith1.MyWebsite.com.
My database has grown so large that checking for this manually is a real time consumer. So, I'd like to make a quick app where I can enter in names, then check against the database to return the number I should add onto the end.
How can I use mysql to search if JohnSmith[ANY NUMBER].MyWebsite.com exists while not getting a positive hit on a purl like JohnSmithson1234.MyWebsite.com?
So basically, I need an exact match on the name, and domain, but need to get the latest number used so I can add 1 to it.
You could add additional field to your database with the number of times each subdomain is created
for example
JohnSmith.MyWebsite.com - 5
This would mean that you have to create JohnSmith6.MyWebsite.com, and after you create it, update the field to
JohnSmith.MyWebsite.com - 6
Or you can do 'order by purl DESC' like other users suggested, but if you use this method, add index to the purl field.
Sql Server does allow [0-9] to match one digit from 0-9. You might want to use
johnsmith[0-9]%.MyWebsite.com
to allow for more digits (though this would also match something like johnsmith123fooledyou.MyWebsite.com)
MySQL doesn't do regex searches like you're asking. However, you can easily do this in the application logic. Do something like this:
SELECT * FROM table WHERE purl LIKE 'JohnSmith%';
Then loop over the results in your app and see if you have anything with numbers in the purl column.
Also, you would be well served to downcase everything in the purl column, since DNS is case insensitive and MySQL comparisons are case sensitive when the column uses a binary or _cs collation. In that setup johnSmith may be searched for while JohnSmith is in the DB, and you will get no results.
EDIT:
Apparently MySQL does allow regex searches. To get the one with the highest number add an "ORDER BY purl DESC LIMIT 1"
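The application-side check could look roughly like this in Python. It is a sketch only: next_purl is a hypothetical helper, and it assumes the candidate purls (for example the rows returned by the LIKE 'JohnSmith%' query) are already loaded into a list.

```python
import re


def next_purl(name, domain, existing_purls):
    """Return the next free purl for `name`, e.g. JohnSmith2.MyWebsite.com."""
    # Exact name, optional digits, then the exact domain, ignoring case.
    pattern = re.compile(
        rf"^{re.escape(name)}(\d*)\.{re.escape(domain)}$", re.IGNORECASE
    )
    used = []
    for purl in existing_purls:
        match = pattern.match(purl)
        if match:
            # A purl without a number counts as 0.
            used.append(int(match.group(1) or 0))
    if not used:
        return f"{name}.{domain}"
    return f"{name}{max(used) + 1}.{domain}"


existing = [
    "JohnSmith.MyWebsite.com",
    "JohnSmith1.MyWebsite.com",
    "JohnSmith10.MyWebsite.com",
    "JohnSmithson1234.MyWebsite.com",  # must not count as a JohnSmith hit
]
print(next_purl("JohnSmith", "MyWebsite.com", existing))
```

Comparing the numbers as integers matters: ORDER BY purl DESC sorts strings, so JohnSmith9 would sort ahead of JohnSmith10 and the "latest" number would come out wrong.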

MySQL Sounds like to ignore 'The,a' etc

I have a table which includes names and each have a unique itemcode field, however I also have another table which we use as the root table for everything, this extra table is an extra I've recently added.
Because the names in my root table differ from the names in this new table, I need a bridge table, which I've created using a query that performs an INSERT ... SELECT where the name in one table is equal to the name in the other. This is great, however it has limited my results because some names include The or A at the beginning, so I'm missing them. I have now changed my a = b comparison to SOUNDS LIKE, but that only picked up names with differences at the end.
What I'm looking for is a way of ignoring a certain set of words such as:
The
A
Be
And
Etc and use the rest of the name? I can't do a 'LIKE %%' because that would capture too much.
You can remove the words The, A, Be etc by using the like statement. Ensure you have spaces on either side of the search terms so that it matches whole words, and not partial words.
SELECT * FROM namestable where names Like '% The %'
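Another option, if the matching can be done outside SQL, is to normalize both names before comparing them. A rough Python sketch follows; the stop-word list and the helper name normalize are assumptions for illustration, not something from the question.

```python
import re

# Hypothetical list of words to ignore; extend as needed.
STOP_WORDS = {"the", "a", "be", "and"}


def normalize(name):
    """Lowercase the name, split it into words, and drop the stop words."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(word for word in words if word not in STOP_WORDS)


if __name__ == "__main__":
    print(normalize("The Rolling Stones") == normalize("Rolling Stones"))
```

Because whole words are compared, a name like "Bean" is left alone even though it starts with "Be". The normalized value could be stored in an extra column on both tables and used for the equality join.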

How to search either on id or name for certain purchase orders

We would like to filter purchase orders either based on purchase order id (primary key) or name of the purchase order using a single search box.
We used the like parameter to search on the name field, but it doesn't seem to work on the primary key. It works only when we use the equal operator for id(s). But it would be preferable if we can filter purchase orders using like for id(s). How to do this?
create table purchase_orders (
id int(11) primary key,
name varchar(255),
...
)
Option 1
SELECT *
FROM purchase_orders
WHERE id LIKE '%123%'; -- tribute to TemporaryNickName
This is horrible, performance-wise :)
Option 2a
Add a text column which receives a string version of id. Maybe add some triggers to populate it automatically.
Option 2b
Change the type of id column to CHAR or VARCHAR (I believe CHAR should be preferred for a primary key).
In both 2a. and 2b. cases, add an index (maybe a FULLTEXT one) to this column.
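For a single search box, Option 1 can be combined with the name search in one query. Below is a minimal sketch assuming Python with SQLite in place of MySQL (in MySQL the cast would be written CAST(id AS CHAR)); the sample rows and the search_orders helper are made up.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchase_orders (id INTEGER PRIMARY KEY, name VARCHAR(255))"
)
conn.executemany(
    "INSERT INTO purchase_orders VALUES (?, ?)",
    [
        (123, "Office chairs"),
        (51234, "Printer paper"),
        (7, "Order 123 reissue"),
        (8, "Desks"),
    ],
)


def search_orders(term):
    """Match the term against the id (as text) or the name."""
    like = f"%{term}%"  # note: % or _ typed by the user act as wildcards
    rows = conn.execute(
        "SELECT id, name FROM purchase_orders "
        "WHERE CAST(id AS TEXT) LIKE ? OR name LIKE ? ORDER BY id",
        (like, like),
    )
    return rows.fetchall()
```

As noted above, the leading % and the cast both prevent index use, so this is fine for small tables but scans every row.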
I think LIKE should work. I assume that your SQL wasn't correctly written.
Let's assume that you have order name "ABCDEF" then you can find this using the following query structure.
SELECT id FROM purchase_orders WHERE name LIKE '%CD%';
To explain it, % sign means it's a wildcard. As a result this query is going to select any String that contains "CD" inside of it.
According to the table structure, the varchar can contain 255 characters. That is quite a large string, so searching it with LIKE will probably consume a lot of resources and take more time. You can always search by id with WHERE id = something, which is much faster, but I don't think an order id is user-friendly data; I would instead let users search by product name. My recommendation is to use Apache Lucene or MySQL's full text search feature (which can improve search performance).
Apache lucene
MySQL Full text search function
These are tools built to search certain pattern or word through list of large strings in much faster way. Many websites use this to build their own mini search engines. I found mysql full text search function requires pretty much no learning curve and straight forward to use =D

Selecting a column that is also a keyword in MySQL

For some reason, the developers at a new company I'm working for decided to name their columns "ignore" and "exists". Now when I run MySQL queries with those words in the where clause, I get a syntax error; however, I can't seem to figure out how to reference those columns without running into an error. I tried setting them as strings, but that doesn't make any sense.
Help?
Also, is there a term for this kind of mismatch?
put the names in backticks:
`ignore`, `exists`
If you're working across multiple tables or databases you need to escape the database name, table name, and field name separately (if each matches a keyword):
SELECT * FROM `db1`.`table1`
LEFT JOIN `db2`.`table2` on `db1`.`table1`.`field1`=`db2`.`table2`.`field2`
Only the portions that actually match a keyword have to be escaped, so things like:
select * from `db1`.table
are ok too.
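If the queries are built in application code, the quoting can be wrapped in a small helper. A sketch in Python; quote_identifier and qualified are hypothetical names. It relies on MySQL's rule that a backtick inside a quoted identifier is written as two backticks.

```python
def quote_identifier(name):
    """Wrap one MySQL identifier in backticks, doubling any inside it."""
    return "`" + name.replace("`", "``") + "`"


def qualified(*parts):
    """Quote each part (database, table, column) separately and join with dots."""
    return ".".join(quote_identifier(part) for part in parts)


if __name__ == "__main__":
    print("SELECT " + quote_identifier("ignore") + ", "
          + quote_identifier("exists") + " FROM " + qualified("db1", "table1"))
```

Quoting every identifier unconditionally is harmless and avoids having to track which words are reserved in a given MySQL version.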
The official term is "idiocy" :-) You can put backticks around the names such as
`ignore`
but I would give serious consideration to changing the names if possible. Backticks are not standard SQL, and I prefer my column names to be a little more expressive. For example, ignoreThisUser or orderExists (the general rule I try to follow is to have a noun and a verb in there somewhere).
Interestingly, some DBMS' can figure out not to treat it as a reserved word based on context. For example, DB2/z allows the rather hideous:
> CREATE TABLE SELECT ( SELECT VARCHAR(10) );
> INSERT INTO SELECT VALUES ('HELLO');
> SELECT SELECT FROM SELECT;
SELECT
---------+---------+---------+--------
HELLO
DSNE610I NUMBER OF ROWS DISPLAYED IS 1