Partial MySQL index from End of Field

A partial (prefix) index keeps the index smaller and makes INSERTs faster.
For instance
CREATE TABLE wine (
name VARCHAR(100),
...
INDEX (name(8)));
While names are something like
Chateau Mouton-Rothschild
Chateau Mouton-Cadet
Chateau Petrus
Chateau Lafite
Chateau Lafleur
...
In this (example) list, Chateau appears in every name. MySQL builds the index on the first 8 characters, so every row gets the same index key ("Chateau "), and a search for Chateau Petrus has to go sequentially through all the Chateau rows.
(In this particular case, splitting the first word (Chateau) and the rest of the name into two fields would make sense, but that is not the point.)
Is there a way to ask MySQL to create a partial index based on the end of a field?

Actually, I found a way in the meantime, with a bit of programming.
For the name field only:
all name entries are stored reversed in the DB
all name searches are made on the reversed string
the name in each row is reversed back before being sent to the client (user agent)
For instance in PHP
...query('INSERT ... name="' . strrev($name) . '"...
...query('SELECT * FROM wine WHERE name="' . strrev($name) . '"');
and, for instance, a search for %MOUTON% will actually search for %NOTUOM%
There is a bit of overhead from reversing, but it is negligible compared to the possible database gain.
The question was specifically asking for a pure MySQL solution, but if there is none, this is a workable workaround in any language. I'll accept this answer in a few days if there is nothing better.
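To illustrate the reversal idea in another language, here is a minimal Python sketch. The helper names (to_stored, from_stored, reverse_like_pattern) are made up for this example, and the pattern reversal assumes the LIKE pattern contains no escaped wildcards. It also shows why the 8-character prefix becomes useful once the names are reversed.

```python
def to_stored(name: str) -> str:
    """Reverse a name before writing it to the database (hypothetical helper)."""
    return name[::-1]


def from_stored(stored: str) -> str:
    """Reverse a stored value back before showing it to the user."""
    return stored[::-1]


def reverse_like_pattern(pattern: str) -> str:
    """Reverse a LIKE pattern so it can be run against the reversed column.

    Assumption: the pattern has no escaped wildcards such as \\% or \\_.
    """
    return pattern[::-1]


names = [
    "Chateau Mouton-Rothschild",
    "Chateau Mouton-Cadet",
    "Chateau Petrus",
    "Chateau Lafite",
    "Chateau Lafleur",
]

# Number of distinct 8-character prefixes, before and after reversing.
plain_prefixes = {n[:8] for n in names}
reversed_prefixes = {to_stored(n)[:8] for n in names}
print(len(plain_prefixes), len(reversed_prefixes))
```

With the sample names from the question, the plain prefixes collapse to a single value while the reversed ones stay distinct, which is the whole point of storing the field reversed.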

Aside from the solution you mentioned above, the only solution for this in MySQL is to explicitly store the value you want to index (e.g. the name without Chateau) in its own column.

Related

Performance of modelling inheritance in database using superclass table

My question is actually about the usability / performance of a concept / idea I had:
The Setup:
Throughout my database, two (actually three) fields keep reappearing: title and description (and created). The title is always a VARCHAR(100) and the description always a TEXT.
Now, to simplify those tables, I thought about something (and changed it that way): wouldn't it be more useful to just create a table named content, with id, title, description and created as its only fields, and always point to that table from all the others?
Example:
table tab has id, key and content_id (instead of title, description and created)
table chapter has id, story_id and content_id (" ")
etc
The Question:
Everything works fine so far, but my only fear is performance. Will I run into a bottleneck, doing it this way, or should I be fine? I have about 23 different tables pointing to content right now, and some of them will hold user-defined content (journals, comments, etc) - so the number of entries in content could get quite high.
Is this setup better, or equal to having title and description in every separate table?
Edit: And if it turns out to be a bad idea, what are the alternatives for maintaining/copying certain fields like title and description in ~25 tables?
Thanks in advance for the help!
There is no clear answer to your question because it mainly depends on how the tables are used, so just consider the following points:
How often will you need to write to the tables? With many inserts/updates, having the data in one big table can cause problems because all write operations will target the same table.
How often do you need the data stored in the table with the common fields? If title or description are not needed most of the time in your selects, this can be OK. If you need the title every time, then take into account that you will always have to JOIN the table with the common data.
How do you manage your database schema? It can be easier to write a simple tool for creating/checking the table structure. In MySQL you can easily access the data dictionary with DESCRIBE table_name or through the INFORMATION_SCHEMA database.
I'm working on a project with 700+ tables where some of the fields have to be present in every table (when the record was created, timestamp of last modification). We have a simple script that helps with this, because having all the data in one table would be disastrous.

using regex in mysql based on middle of string value

I have a system that uses personalized URLs (i.e. JohnSmith.MyWebsite.com). In my database, these values are stored in the "purl" column.
If six months from now, I get another john smith I need to put into my system, I simply add a 1 to his name so that his purl becomes JohnSmith1.MyWebsite.com.
My database has grown so large that checking for this manually is a real time consumer. So, I'd like to make a quick app where I can enter in names, then check against the database to return the number I should add onto the end.
How can I use mysql to search if JohnSmith[ANY NUMBER].MyWebsite.com exists while not getting a positive hit on a purl like JohnSmithson1234.MyWebsite.com?
So basically, I need an exact match on the name, and domain, but need to get the latest number used so I can add 1 to it.
You could add an additional field to your database holding the number of times each subdomain has been created
for example
JohnSmith.MyWebsite.com - 5
This would mean that you have to create JohnSmith6.MyWebsite.com, and after you create it, update the field to
JohnSmith.MyWebsite.com - 6
Or you can do 'ORDER BY purl DESC' like other users suggested, but if you use this method, add an index to the purl field.
SQL Server does allow [0-9] to match one digit from 0-9. You might want to use
johnsmith[0-9]%.MyWebsite.com
to allow for more digits (though this would also match something like johnsmith123fooledyou.MyWebsite.com)
MySQL doesn't do regex searches like you're asking. However, you can easily do this in the application logic. Do something like this:
SELECT * FROM table WHERE purl LIKE 'JohnSmith%';
Then loop over the results in your app and see if you have anything with numbers in the purl column.
Also, you would be well served to downcase everything in the purl column, since DNS is case insensitive and MySQL comparisons may not be: the default _ci collations ignore case, but binary or _cs collations do not. With a case-sensitive collation, a search for johnSmith will return no results when JohnSmith is in the DB.
EDIT:
Apparently MySQL does allow regex searches (with the REGEXP operator). To get the one with the highest number, add an "ORDER BY purl DESC LIMIT 1".
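For the "loop over the results in your app" part, here is a rough Python sketch. The function name next_purl and the sample rows are invented for the example. It uses an anchored pattern so that JohnSmithson1234 is not a hit, and it compares the numbers as integers. That matters because a plain string sort puts JohnSmith9 after JohnSmith10, so ORDER BY purl DESC alone can return the wrong row once you pass 9.

```python
import re


def next_purl(name: str, domain: str, existing: list[str]) -> str:
    """Return the next free purl for name, given the purls already in use.

    Hypothetical helper: 'existing' would come from something like
    SELECT purl FROM table WHERE purl LIKE 'JohnSmith%'.
    """
    # Exact name, optional digits, then the exact domain. Roughly the same
    # idea as MySQL's: purl REGEXP '^JohnSmith[0-9]*[.]MyWebsite[.]com$'
    pattern = re.compile(
        r"^" + re.escape(name) + r"(\d*)\." + re.escape(domain) + r"$",
        re.IGNORECASE,  # DNS names are case insensitive
    )
    numbers = []
    for purl in existing:
        match = pattern.match(purl)
        if match:
            # No digits means the first holder of the name; count it as 0.
            numbers.append(int(match.group(1) or 0))
    if not numbers:
        return f"{name}.{domain}"
    return f"{name}{max(numbers) + 1}.{domain}"


rows = [
    "JohnSmith.MyWebsite.com",
    "JohnSmith9.MyWebsite.com",
    "JohnSmith10.MyWebsite.com",
    "JohnSmithson1234.MyWebsite.com",
]
print(next_purl("JohnSmith", "MyWebsite.com", rows))
```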

Is MySQL FULLTEXT best solution for partial words?

I have a MySQL MyISAM table containing entries that describe airports. This table contains 3 varchar columns - code, name and tags.
code refers to the airport's code (like JFK and ORD), the name refers to the airport's name (John F Kennedy and O'Hare) and tags specify a semicolon separated list of tags that are associated with the airport (like N.Y.C;New York; and Chicago;).
I need to be able to lookup an airport (for an autocomplete) by either the code, name or tags, therefore I set a FULLTEXT index on (code, name, tags).
I have encountered two problems with FULLTEXT so far that prevent me from working with it:
1. There is no way to do partial matching - only postfix matching (is this true?)
2. When a period ('.') is specified in the term to match against, the matching works differently. I am assuming that the period is being parsed in a special way. For example, doing a FULLTEXT search on N.Y.C will not return JFK, although doing the same search on New York will
Is there any way to overcome these barriers? Otherwise, should I be looking at LIKE matching instead, or an entirely different storage engine? Thanks!
The best solution I came up with is to use both FULLTEXT and LIKE matching, and to combine the results with UNION.

MySQL 5.5 Database design. Problem with friendly URLs approach

I have a maybe stupid question but I need to ask it :-)
My Friendly URL (furl) database design approach is fairly summarized in the following diagram (InnoDB at MySQL 5.5 used)
Each item will generate as many furls as languages available on the website. The furl_redirect table represents the controller path for each item. I show you an example:
item.id = 1000
item.title = 'Example title'
furl_redirect = 'item/1000'
furl.url = 'en/example-title-1000'
furl.url = 'es/example-title-1000'
furl.url = 'it/example-title-1000'
When you insert a new item, its furl_redirect and furls must also be inserted. The problem appears because of the (necessary) unique constraint in the furl table. As you see above, in order to get unique urls, I use the title of the item (which is not necessarily unique) + the id to create the unique url. That means the order of inserting rows should be as follows:
1. Insert item -- (and get the id of the new item inserted) ERROR!! furl_redirect_id must not be null!!
2. Insert furl_redirect -- (need the item id to create the path)
3. Insert furl -- (need the item id to create the url)
I would like an elegant solution to this problem, but I can not get it!
Is there a way of getting the next AUTO_INCREMENT value on an InnoDB table, and is it recommended to use it?
Can you think of another way to ensure the uniqueness of the friendly urls that is independent of the items' id? Am I missing something crucial?
Any solution is welcome!
Thanks!
You can get an auto-increment in InnoDB, see here. Whether you should use it or not depends on what kind of throughput you need and can achieve. Any auto-increment/identity type column, when used as a primary key, can create a "hot spot" which can limit performance.
Another option would be to use an alphanumeric ID, like bit.ly or other URL shorteners. The advantage of these is that you can have short IDs that use base 36 (a-z+0-9) instead of base 10. Why is this important? Because you can use a random number generator to pick a number out of a fairly big domain - 6 characters gets you 2 billion combinations. You convert the number to base 36, and then check to see if you already have this number assigned. If not, you have your new ID and off you go, otherwise generate a new random number. This helps to avoid hotspots if that turns out to be necessary for your system. Auto-increment is easier and I'd try that first to see if it works under the loads that you're anticipating.
You could also use the base 36 ID and the auto-increment together so that your friendly URLs are shorter, which is often the point.
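A quick Python sketch of the random base 36 ID idea, in case it helps. The names to_base36 and new_id are made up, and the set called taken stands in for the "is this ID already assigned?" lookup you would really do against the database (ideally backed by a unique constraint).

```python
import random
import string

ALPHABET = string.digits + string.ascii_lowercase  # 0-9 then a-z, 36 symbols


def to_base36(n: int) -> str:
    """Convert a non-negative integer to a base 36 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    digits = []
    while n:
        n, rem = divmod(n, 36)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def new_id(taken: set[str], length: int = 6) -> str:
    """Pick random IDs until one is found that is not assigned yet."""
    while True:
        number = random.randrange(36 ** length)  # 36**6 = 2,176,782,336 values
        candidate = to_base36(number).rjust(length, "0")  # pad to fixed width
        if candidate not in taken:
            taken.add(candidate)
            return candidate


assigned: set[str] = set()
print(new_id(assigned))
```

The same to_base36 function also works on an auto-increment value if you only want shorter URLs and do not need the randomness.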
You might consider other ways to deal with your project.
First, you are using "en/" "de/" etc. for changing the language. May I ask how that works in the script? If you have different folders for different languages, your script and your users must suffer a lot. Try to use gettext or another localisation method (depending on the size of your project).
About the friendly URLs: my favorite method is to have only one extra column in the item's table. For example:
Table picture
id, path, title, alias, created
Values:
1, uploads/pics/mypicture.jpg, Great holidays, great-holidays, 2011-11-11 11:11:11
2, uploads/pics/anotherpic.jpg, Great holidays, great-holidays-1, 2011-12-12 12:12:12
Now in the script, while inserting the item, create the alias from the title and check whether the alias already exists. If it does, you can append the id, a random number, or a count (depending on how many identical titles you already have).
Once you store the alias like this, it's very simple. The user tries to access
http://www.mywebsite.com/picture/great-holidays
So in your script you just see that the user wants a picture, namely the picture with the alias great-holidays. Find it in the DB and show it.
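A rough Python sketch of the alias step, using the count variant. The function names slugify and unique_alias are invented here, and the set of existing aliases stands in for a lookup on the alias column. In a real application you would still want a unique index on alias to catch two inserts racing each other.

```python
import re


def slugify(title: str) -> str:
    """Lowercase the title and replace runs of other characters with '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def unique_alias(title: str, existing: set[str]) -> str:
    """Build an alias from the title, appending -1, -2, ... on collisions."""
    base = slugify(title)
    alias = base
    count = 0
    while alias in existing:
        count += 1
        alias = f"{base}-{count}"
    return alias


aliases = {"great-holidays"}
print(unique_alias("Great holidays", aliases))
```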

Generating a unique MySQL field based on the contents of other fields

To create unique custom reference codes (e.g. JOB1293XYZ), I have for a while been generating custom references/hashes in PHP, then checking my database to see whether the code already exists and, if so, generating another.
I'm curious whether in MySQL it is possible to generate the contents of a field based on the other fields in that row; it would be a handy shortcut. For example...
Table: Employees
employeeId
firstName
lastName
jobTitle
DOB
employeeCode
So for employeeCode can I perhaps use a MySQL Custom Function (which I know very little about) to generate the contents of the field. perhaps by taking the first letter of the firstname, first letter of the second name, the str length of the job title and the last two digits of the DOB? This is purely an example.
If this is possible and anyone can point me to any reference material that would be great, have been wading through the heavy MySQL documentation and will continue to do so for the time being.
Thanks in advance!
You could concatenate all of the fields together and do an MD5 on the resulting string.
UPDATE Employees SET md5str=MD5(CONCAT(field1, field2, field3))...
I wouldn't recommend this approach because then the question of what to do if you had a hash collision would be very difficult if not impossible to answer.
The above idea is not mine: I spotted this in the maatkit code.
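For comparison, here is what both ideas look like on the application side, as a small Python sketch. Everything in it is an assumption for illustration: the function names, the YYYY-MM-DD date format, reading "last two digits of the DOB" as the last two digits of the birth year, and the separator byte used when joining the fields. Neither function guarantees uniqueness on its own, so a unique index on the column is still needed.

```python
import hashlib


def employee_code(first_name: str, last_name: str, job_title: str, dob: str) -> str:
    """Build a code following the example rule from the question.

    Assumes dob is 'YYYY-MM-DD' and uses the last two digits of the year.
    """
    year = dob.split("-")[0]
    code = f"{first_name[0]}{last_name[0]}{len(job_title)}{year[-2:]}"
    return code.upper()


def row_md5(*fields: str) -> str:
    """MD5 over the joined fields, similar in spirit to MD5(CONCAT(...)).

    A separator is used so that ('ab', 'c') and ('a', 'bc') hash differently.
    """
    joined = "\x1f".join(fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


print(employee_code("John", "Smith", "Engineer", "1985-04-12"))
print(row_md5("John", "Smith", "Engineer", "1985-04-12"))
```

Note that in MySQL itself CONCAT returns NULL if any argument is NULL, so nullable columns need handling (for instance with IFNULL) before hashing.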