MySQL LEFT OUTER JOIN performance - mysql

I am updating an existing web-based inventory system that pulls data from a MySQL database. The main structures for the data stored are "items" and "tags", with a one-to-many relationship (items can have multiple corresponding tags).
The existing front-end for the data is a Backbone.js app that pulls the entire datastore on login and manipulates that data in memory, committing back to the database when necessary via a RESTful interface. (This is not how I would have designed the system, but it is now a common pattern in Backbone and Spine apps, and how most of the tutorials and books teach these frameworks.)
To serve the initial fetch performed by the front-end, in which it captures the entire dataset (about 1,000 items and 10,000 item tags at this point), the back-end performs a SELECT query against the items table, then a subsequent SELECT query against the tags table for each item fetched. Performance sucks, obviously. I thought this could be improved with a JOIN, figuring one SELECT query is better than 1,000. The following query fetches the data I need but takes over 15s to execute, even on my local development server. What gives? Can we improve this system or query without setting up additional infrastructure like a caching key-value store?
SELECT items.*, itemtags.id as `tag_id`, itemtags.tag, itemtags.type
FROM items
LEFT OUTER JOIN itemtags
ON items.id = itemtags.item_id
ORDER BY items.id;
Here are the table structures:
CREATE TABLE `items` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`num` int(11) NOT NULL,
`title` varchar(100) NOT NULL,
`length_inches` int(10) unsigned DEFAULT NULL,
`length_feet` int(10) unsigned DEFAULT NULL,
`width_inches` int(10) unsigned DEFAULT NULL,
`width_feet` int(10) unsigned DEFAULT NULL,
`height_inches` int(10) unsigned DEFAULT NULL,
`height_feet` int(10) unsigned DEFAULT NULL,
`depth_inches` int(10) unsigned DEFAULT NULL,
`depth_feet` int(10) unsigned DEFAULT NULL,
`retail_price` int(10) unsigned DEFAULT NULL,
`discount` int(10) unsigned DEFAULT NULL,
`decorator_price` int(10) unsigned DEFAULT NULL,
`new_price` int(10) unsigned DEFAULT NULL,
`sold` int(10) unsigned NOT NULL,
`push_date` int(10) unsigned DEFAULT NULL,
`updated` int(10) unsigned NOT NULL,
`created` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=1747 DEFAULT CHARSET=latin1;
CREATE TABLE `itemtags` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`item_id` int(10) unsigned NOT NULL,
`tag` varchar(100) NOT NULL,
`type` varchar(100) NOT NULL,
`created` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=61474 DEFAULT CHARSET=latin1;

I think you could use this:
SELECT *, a.id as `tag_id`, a.tag, a.type
FROM items LEFT OUTER JOIN
(SELECT id, item_id, tag, type from itemtags ORDER BY 1,2,3) a
ON items.id = a.item_id
ORDER BY items.id;
I didn't really change much, just added the alias; a doesn't signify anything important.
I didn't fill the tables, but your original query took 4ms and mine took 1ms.
http://sqlfiddle.com/#!2/b9551/6
Your application can pull the entire data store regardless of what is in your data set; data store and data set are not synonymous.
You don't have any secondary indexes either. You should put an index on itemtags.item_id (items.id is already the primary key) so the join can return results quicker. Note that the ORDER BY in my sub-query only pre-sorts the rows; it does not create an index. Hope this helps.
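As a hedged sketch of that advice (using the column names from the schema above), the missing index would be added like this:
ALTER TABLE `itemtags` ADD INDEX `idx_item_id` (`item_id`);
With that index in place, the join condition items.id = itemtags.item_id can be resolved with an index lookup per item instead of a scan over all of itemtags for every row.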

In terms of performance, you are probably not comparing like-to-like.
The SQL query is doing the following three things:
Joining the two tables together
Sorting the results by items.id
Returning all the results
Is the original version doing all three of these and waiting until they are completed?
My guess is that the original code is pulling the items back in the order you want them, and then only pulling the tags for a handful that are actually needed at any given time.
In addition, it is unclear how large the items.* data is. The way the query is formulated, you are pulling those columns roughly ten times for each item (once per tag), potentially a much larger result set than the original data.
The real question is why you need all this information in the memory of the application. You have the database; just pull back what you need, when you need it. Are you familiar with LIMIT and OFFSET? These may be what you are really looking for.
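As an illustrative sketch (not from the original post) of paging with LIMIT and OFFSET instead of fetching everything up front:
-- Page 3 of the items, 50 per page:
SELECT * FROM items ORDER BY id LIMIT 50 OFFSET 100;
-- Then fetch tags only for the items on that page:
SELECT id AS tag_id, item_id, tag, type
FROM itemtags
WHERE item_id IN (101, 102, 103); -- hypothetical ids returned by the page above
This keeps each round trip small and avoids duplicating the wide items.* columns once per tag.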

Related

How to optimize query by SUM of relations?

I have 3 simple tables
Invoices ( ~500k records )
Invoice items, one-to-many relation to invoices ( ~10 million records )
Invoice payments, one-to-many relation to invoices ( ~700k records )
Now, as simple as it sounds, I need to query for unpaid invoices.
Here is the query I am using:
select * from invoices
LEFT JOIN (SELECT invoice_id, SUM(price) as totalAmount
FROM invoice_items
GROUP BY invoice_id) AS t1
ON t1.invoice_id = invoices.id
LEFT JOIN (SELECT invoice_id, SUM(payed_amount) as totalPaid
FROM invoice_payment_transactions
GROUP BY invoice_id) AS t2
ON t2.invoice_id = invoices.id
WHERE totalAmount > totalPaid
Unfortunately, this query takes around 30 seconds, so it is far too slow.
Of course I have indexes set for "invoice_id" on both payments and items.
When I EXPLAIN the query, I can see that MySQL has to do a full table scan.
I also tried several other query approaches, using "EXISTS" or "IN" with subqueries, but I never got around the full table scan.
I'm pretty sure there is not much that can be done here (except using some caching approach), but maybe someone knows how to optimize this?
I need this query to run in roughly 2 seconds max.
EDIT:
Thanks to everybody for trying. Please just know that I absolutely know how to adopt different caching strategies here, but this question is purely about optimizing this query!
Here are the (simplified) table definitions
CREATE TABLE `invoices`
(
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`created_at` timestamp NOT NULL DEFAULT current_timestamp(),
`date` date NOT NULL,
`title` enum ('M','F','Other') DEFAULT NULL,
`first_name` varchar(191) DEFAULT NULL,
`family_name` varchar(191) DEFAULT NULL,
`street` varchar(191) NOT NULL,
`postal_code` varchar(10) NOT NULL,
`city` varchar(191) NOT NULL,
`country` varchar(2) NOT NULL,
PRIMARY KEY (`id`),
KEY `date` (`date`)
) ENGINE = InnoDB;
CREATE TABLE `invoice_items`
(
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`invoice_id` bigint(20) unsigned NOT NULL,
`created_at` timestamp NOT NULL DEFAULT current_timestamp(),
`name` varchar(191) DEFAULT NULL,
`description` text DEFAULT NULL,
`reference` varchar(191) DEFAULT NULL,
`quantity` smallint(6) NOT NULL,
`price` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `invoice_items_invoice_id_index` (`invoice_id`)
) ENGINE = InnoDB;
CREATE TABLE `invoice_payment_transactions`
(
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`invoice_id` bigint(20) unsigned NOT NULL,
`created_at` timestamp NOT NULL DEFAULT current_timestamp(),
`transaction_identifier` varchar(191) NOT NULL,
`payed_amount` mediumint(9) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `invoice_payment_transactions_invoice_id_index` (`invoice_id`)
) ENGINE = InnoDB;
Plan A:
A summary table by invoice_id and day, as Bill suggested (see: Summary Tables).
Plan B:
Change the design to be "current" and "history". Here the "payments" table is a "history" of money changing hands, while "invoices" would be "current" in that it contains a "balance_owed" column. This is a philosophy change; it could (and should) be encapsulated in a client subroutine and/or a database stored procedure.
Plan C: This may be useful if "most" of the invoices are paid off.
Have a flag in the invoices table to indicate paid-off, as sketched below. That will prevent "most" of the JOINs from occurring. (That said, adding and maintaining that column is about as much work as doing Plan B.)
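A hedged sketch of Plan C; the paid_off column name is mine, not part of the schema above, and the rewrite using correlated sub-queries is one possible shape, not tested against the real data:
ALTER TABLE `invoices`
ADD COLUMN `paid_off` tinyint(1) NOT NULL DEFAULT 0,
ADD INDEX (`paid_off`);
-- Application code (or a trigger) sets the flag once payments cover the total:
UPDATE `invoices` SET `paid_off` = 1 WHERE `id` = 42;
-- The unpaid-invoices query then only aggregates over the unpaid minority:
SELECT i.*
FROM invoices i
WHERE i.paid_off = 0
AND (SELECT COALESCE(SUM(price), 0) FROM invoice_items WHERE invoice_id = i.id)
> (SELECT COALESCE(SUM(payed_amount), 0) FROM invoice_payment_transactions WHERE invoice_id = i.id);
The COALESCE also addresses a subtle issue in the original query: an invoice with no payment rows yields a NULL totalPaid, and a NULL comparison never matches.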

MySQL LIMIT Performance

I have a large table in MySQL, about 1 million records.
I'm using a dynamic query with different parameters in the WHERE clause and the ORDER BY, so I can't use something like AND id > 34000 LIMIT 10.
I have indexes on the fields used in WHERE and ORDER BY, but an index alone doesn't help.
I need a better way than LIMIT 34000, 10. Is there any way to solve the offset delay?
I've included my table schema below, but only the more useful fields, without the indexes, because I'm using dynamic queries.
CREATE TABLE IF NOT EXISTS `p_apartmentbuy` (
`property_id` mediumint(8) unsigned NOT NULL,
`dateadd` int(10) unsigned NOT NULL,
`sqm` smallint(5) unsigned NOT NULL,
`sqmland` smallint(5) unsigned NOT NULL,
`age` tinyint(2) unsigned NOT NULL,
`price` bigint(12) unsigned NOT NULL,
`pricemeter` int(11) unsigned NOT NULL,
`floortotal` tinyint(3) unsigned NOT NULL,
`floorno` tinyint(3) unsigned NOT NULL,
`unittotal` smallint(4) unsigned NOT NULL,
`unitthisfloor` tinyint(3) unsigned NOT NULL,
`room` tinyint(1) unsigned NOT NULL,
`parking` tinyint(1) unsigned NOT NULL,
`renovate` tinyint(1) unsigned NOT NULL,
`address` varchar(255) COLLATE utf8_general_ci NOT NULL,
`describe` varchar(500) COLLATE utf8_general_ci NOT NULL,
`featured` tinyint(1) unsigned NOT NULL,
`l_location_id` smallint(5) unsigned NOT NULL,
`l_city_id` smallint(4) unsigned NOT NULL,
`pf_furnished_id` tinyint(2) unsigned NOT NULL,
PRIMARY KEY (`property_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
The problem with a table of 1 million records won't be AND id > 34000 LIMIT 10 versus LIMIT 34000, 10; it comes down to the structure and the rest of the query. That is, you need indexes, a PK, and FKs to speed the query up; beyond that, an ORDER BY will probably slow it down, and a search LIKE '%text%' will make your query SLOW. It also depends on the table's engine.
So don't expect that changing the LIMIT alone will make a huge difference. There are a couple of tools that will help you determine a 'better' query, but not all queries work the same, so don't expect the "best solution", because it doesn't exist.
You can use SHOW CREATE TABLE, DESCRIBE SELECT ..., or EXPLAIN to see what's going on, or use the BENCHMARK() function to see the approximate time of an expression you are applying, in order to improve it.
EDIT:
Some tools for MySQL
I'd recommend taking a look at these programs; they will help you with this part of performance.
mysqlslap (like BENCHMARK, but you can customize the workload and the reported results more).
SysBench (tests CPU performance, I/O performance, mutex contention, memory speed, and database performance).
MySQLTuner (analyzes general statistics, storage engine statistics, and performance metrics).
mk-query-profiler (performs analysis of a SQL statement).
mysqldumpslow (good for knowing which queries are causing problems).
MySQL is able to optimize LIMIT clauses (i.e. only scan / evaluate the rows in the range specified by LIMIT) if it is able to use only indexes to find rows matching the query.
For queries like SELECT * FROM users WHERE active = 1 ORDER BY created_at, adding an index on (active, created_at) is enough.
See http://www.mysqlperformanceblog.com/2006/09/01/order-by-limit-performance-optimization/
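A common technique from that post is the "late row lookup" (deferred join): page through a narrow index first, then join back for the full rows. A hedged sketch against the schema above, assuming the dynamic ORDER BY happens to be on dateadd and an index exists on (dateadd, property_id) so the inner query is index-only:
SELECT p.*
FROM p_apartmentbuy p
JOIN (SELECT property_id
FROM p_apartmentbuy
ORDER BY dateadd
LIMIT 34000, 10) AS page
ON page.property_id = p.property_id
ORDER BY p.dateadd;
The inner query can skip the first 34,000 entries by walking the index alone, which is far cheaper than materializing 34,010 full rows and discarding most of them.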

MySQL indexes creation strategy and inner logic

This question expects a generic answer to the broad problem of index creation on a MySQL database.
Let's take this table example :
CREATE TABLE IF NOT EXISTS `article` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`published` tinyint(1) NOT NULL DEFAULT '0',
`author_id` int(11) unsigned NOT NULL,
`modificator_id` int(11) unsigned DEFAULT NULL,
`category_id` int(11) unsigned DEFAULT NULL,
`title` varchar(200) COLLATE utf8_unicode_ci NOT NULL,
`headline` text COLLATE utf8_unicode_ci NOT NULL,
`content` text COLLATE utf8_unicode_ci NOT NULL,
`url_alias` varchar(50) COLLATE utf8_unicode_ci NOT NULL,
`priority` mediumint(11) unsigned NOT NULL DEFAULT '50',
`publication_date` datetime NOT NULL,
`creation_date` datetime NOT NULL,
`modification_date` datetime NOT NULL,
PRIMARY KEY (`id`)
);
Over such a table there is a wide range of queries that could be performed on different criteria:
category_id
published
publication_date
e.g.:
SELECT id FROM article WHERE NOT published AND category_id = '2' ORDER BY publication_date;
On many tables you can see a wide range of state fields (like published here), date fields, or reference fields (like author_id or category_id). What strategy should be picked when creating indexes?
This can be broken down into the following points:
Should I make an index on every field that can be used in a query (either in a WHERE clause or an ORDER BY), even if this can lead to a lot of indexes per table?
Should I also make an index on fields that have only a small set of values, like a boolean or an enum, or does this merely reduce the scanned range by a factor of n (assuming n distinct values, each used equally often)?
I've read that MySQL prior to 5.0 used only one index per query; how does the system pick it? (By choosing the most restrictive one?)
How is an OR condition processed?
How much will this slow down inserts?
Does the choice of InnoDB or MyISAM change anything about this problem?
I know the EXPLAIN statement can be used to find out whether a query is optimized, but a bit of concrete theory would really be more constructive than a purely empirical approach!
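As an illustrative sketch (mine, not part of the question): for the example query above, one composite index covering the equality columns first and the ORDER BY column last would typically let MySQL both filter and sort from the index:
ALTER TABLE `article` ADD INDEX `idx_cat_pub_date` (`category_id`, `published`, `publication_date`);
With this index, the condition written as category_id = 2 AND published = 0 (so the optimizer sees an equality rather than NOT published) ORDER BY publication_date can be answered with an index range scan and no filesort, at the cost of slightly slower inserts.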

Optimizing MySQL query with expensive INNER JOIN

Using trial and error, I've discovered that removing one join from the query below makes it run around 30 times quicker. Can someone explain why this would be, and whether it's possible to optimise the query to include the additional join without the performance hit?
This is a screenshot of the EXPLAIN output, which shows that the index isn't being used for the user_groups table.
http://i.imgur.com/9VDuV.png
This is the original query:
SELECT `comments`.`comment_id`, `comments`.`comment_html`, `comments`.`comment_time_added`, `comments`.`comment_has_attachments`, `users`.`user_name`, `users`.`user_id`, `users`.`user_comments_count`, `users`.`user_time_registered`, `users`.`user_time_last_active`, `user_profile`.`user_avatar`, `user_profile`.`user_signature_html`, `user_groups`.`user_group_icon`, `user_groups`.`user_group_name`
FROM (`comments`)
INNER JOIN `users` ON `comments`.`comment_user_id` = `users`.`user_id`
INNER JOIN `user_profile` ON `users`.`user_id` = `user_profile`.`user_id`
INNER JOIN `user_groups` ON `users`.`user_group_id` = `user_groups`.`user_group_id`
WHERE `comments`.`comment_enabled` = 1
AND `comments`.`comment_content_id` = 12
ORDER BY `comments`.`comment_time_added` ASC
LIMIT 20
If I remove the "user_groups" join then the query runs 30 times quicker as mentioned above.
SELECT `comments`.`comment_id`, `comments`.`comment_html`, `comments`.`comment_time_added`, `comments`.`comment_has_attachments`, `users`.`user_name`, `users`.`user_id`, `users`.`user_comments_count`, `users`.`user_time_registered`, `users`.`user_time_last_active`, `user_profile`.`user_avatar`, `user_profile`.`user_signature_html`
FROM (`comments`)
INNER JOIN `users` ON `comments`.`comment_user_id` = `users`.`user_id`
INNER JOIN `user_profile` ON `users`.`user_id` = `user_profile`.`user_id`
WHERE `comments`.`comment_enabled` = 1
AND `comments`.`comment_content_id` = 12
ORDER BY `comments`.`comment_time_added` ASC
LIMIT 20
My tables are below, can anyone offer any insight into how to avoid a performance hit for including the user_groups table?
--
-- Table structure for table `comments`
--
CREATE TABLE IF NOT EXISTS `comments` (
`comment_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`comment_content_id` int(10) unsigned NOT NULL,
`comment_user_id` mediumint(6) unsigned NOT NULL,
`comment_original` text NOT NULL,
`comment_html` text NOT NULL,
`comment_time_added` int(10) unsigned NOT NULL,
`comment_time_updated` int(10) unsigned NOT NULL,
`comment_enabled` tinyint(1) NOT NULL DEFAULT '0',
`comment_is_spam` tinyint(1) NOT NULL DEFAULT '0',
`comment_has_attachments` tinyint(1) unsigned NOT NULL,
`comment_has_edits` tinyint(1) NOT NULL,
PRIMARY KEY (`comment_id`),
KEY `comment_user_id` (`comment_user_id`),
KEY `comment_content_id` (`comment_content_id`),
KEY `comment_is_spam` (`comment_is_spam`),
KEY `comment_enabled` (`comment_enabled`),
KEY `comment_time_updated` (`comment_time_updated`),
KEY `comment_time_added` (`comment_time_added`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=352 ;
-- --------------------------------------------------------
--
-- Table structure for table `users`
--
CREATE TABLE IF NOT EXISTS `users` (
`user_id` mediumint(6) unsigned NOT NULL AUTO_INCREMENT,
`user_ipb_id` int(10) unsigned DEFAULT NULL,
`user_activated` tinyint(1) NOT NULL DEFAULT '0',
`user_name` varchar(64) CHARACTER SET latin1 NOT NULL,
`user_email` varchar(255) NOT NULL,
`user_password` varchar(40) NOT NULL,
`user_content_count` int(10) unsigned NOT NULL DEFAULT '0',
`user_comments_count` int(10) unsigned NOT NULL DEFAULT '0',
`user_salt` varchar(8) NOT NULL,
`user_api_key` varchar(32) NOT NULL,
`user_auth_key` varchar(32) DEFAULT NULL,
`user_paypal_key` varchar(32) DEFAULT NULL,
`user_timezone_id` smallint(3) unsigned NOT NULL,
`user_group_id` tinyint(3) unsigned NOT NULL,
`user_custom_permission_mask_id` tinyint(3) unsigned DEFAULT NULL,
`user_lang_id` tinyint(2) unsigned NOT NULL,
`user_time_registered` int(10) unsigned NOT NULL,
`user_time_last_active` int(10) unsigned NOT NULL,
PRIMARY KEY (`user_id`),
UNIQUE KEY `user_email` (`user_email`),
KEY `user_group_id` (`user_group_id`),
KEY `user_auth_key` (`user_auth_key`),
KEY `user_api_key` (`user_api_key`),
KEY `user_custom_permission_mask_id` (`user_custom_permission_mask_id`),
KEY `user_time_last_active` (`user_time_last_active`),
KEY `user_paypal_key` (`user_paypal_key`),
KEY `user_name` (`user_name`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=33 ;
-- --------------------------------------------------------
--
-- Table structure for table `user_groups`
--
CREATE TABLE IF NOT EXISTS `user_groups` (
`user_group_id` tinyint(3) unsigned NOT NULL AUTO_INCREMENT,
`user_group_name` varchar(32) NOT NULL,
`user_group_permission_mask_id` tinyint(3) unsigned NOT NULL,
`user_group_icon` varchar(32) DEFAULT NULL,
PRIMARY KEY (`user_group_id`),
KEY `user_group_permission_mask_id` (`user_group_permission_mask_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=8 ;
-- --------------------------------------------------------
--
-- Table structure for table `user_profile`
--
CREATE TABLE IF NOT EXISTS `user_profile` (
`user_id` mediumint(8) unsigned NOT NULL,
`user_signature_original` text,
`user_signature_html` text,
`user_avatar` varchar(64) DEFAULT NULL,
`user_steam_id` varchar(64) DEFAULT NULL,
`user_ps_id` varchar(16) DEFAULT NULL,
`user_xbox_id` varchar(64) DEFAULT NULL,
`user_wii_id` varchar(64) DEFAULT NULL,
PRIMARY KEY (`user_id`),
KEY `user_steam_id` (`user_steam_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Most database engines calculate their query plan based on statistics about the tables - for instance, if a table has a small number of rows, it's quicker to go to the table than the index. Those statistics are maintained during "normal" operation - e.g. inserts, updates and deletes - but can get out of sync when table definitions are changed, or when you do bulk inserts.
If you see unexpected behaviour in the query plan, you can force the database to update its statistics; in MySQL you can use OPTIMIZE TABLE, which does everything including re-ordering the table itself, or ANALYZE TABLE, which only refreshes the index statistics.
This is hard to do on production environments, as both operations lock the tables; if you can possibly negotiate a maintenance window, that's by far the simplest way to deal with the problem.
It's worth measuring performance of "optimize table" - on well-specified hardware, it should take only a couple of seconds for "normal" size tables (up to low millions of records, with only a few indices). That might mean you can have an "informal" maintenance window - you don't take the application off-line, you just accept that some users will have degraded performance while you're running the scripts.
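A hedged sketch of what such a maintenance script might run, using the table names from the schema above:
ANALYZE TABLE comments, users, user_profile, user_groups; -- refreshes index statistics only
OPTIMIZE TABLE user_groups; -- also rebuilds the table and sorts index pages
Both statements take locks on MyISAM tables while they run, which is why a maintenance window is recommended.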
MySQL has an EXPLAIN feature which will help you to understand the query:
$ mysql
> EXPLAIN SELECT `comments`.`comment_id`, `comments`.`comment_html`,`comments`.`comment_time_added`, `comments`.`comment_has_attachments`, `users`.`user_name`, `users`.`user_id`, `users`.`user_comments_count`, `users`.`user_time_registered`, `users`.`user_time_last_active`, `user_profile`.`user_avatar`, `user_profile`.`user_signature_html`
FROM (`comments`)
INNER JOIN `users` ON `comments`.`comment_user_id` = `users`.`user_id`
INNER JOIN `user_profile` ON `users`.`user_id` = `user_profile`.`user_id`
WHERE `comments`.`comment_enabled` = 1
AND `comments`.`comment_content_id` = 12
ORDER BY `comments`.`comment_time_added` ASC
LIMIT 20
MySQL might simply be missing an index, or skipping one.
You can learn more about understanding the output of EXPLAIN here from the documentation (a little hard-core), or better yet from a simpler explanation here (ignore the fact that it's on a Java site).
More than likely the amount of data, or an outdated or incomplete index, means that MySQL is needlessly doing a table scan. When you see table scans or sequential searches, you can often easily spot which field is missing an index, or which index is not usable.
Could you please try this one (you can remove the join with user_groups)? It can be faster in the case where the query retrieves a small data set from the comments table:
SELECT
c.comment_id, c.comment_html, c.comment_time_added, c.comment_has_attachments, u.user_name, u.user_id, u.user_comments_count, u.user_time_registered, u.user_time_last_active, user_profile.user_avatar, user_profile.user_signature_html, user_groups.user_group_icon, user_groups.user_group_name
FROM
(SELECT * FROM comments WHERE comment_content_id = 12 AND comment_enabled = 1) c
INNER JOIN users u ON c.comment_user_id = u.user_id
INNER JOIN user_profile ON u.user_id = user_profile.user_id
INNER JOIN user_groups ON u.user_group_id = user_groups.user_group_id
ORDER BY c.comment_time_added ASC
LIMIT 20
Try using LEFT JOINs on the NOT NULL relations.
Since inner joins are symmetric, MySQL will reorder them to put the best-looking (typically smallest) table first.
Left joins aren't always symmetric, so MySQL won't reorder them; thus you can use them to force the table order. With a NOT NULL join column, LEFT and INNER joins are equivalent, so your results won't change.
The table order determines which indexes are used, which can greatly impact performance.
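A hedged sketch of that idea applied to the original query (column list abbreviated; only the join types change, and the NOT NULL foreign keys keep the results identical):
SELECT comments.comment_id, users.user_name, user_groups.user_group_icon, user_groups.user_group_name
FROM comments
LEFT JOIN users ON comments.comment_user_id = users.user_id
LEFT JOIN user_profile ON users.user_id = user_profile.user_id
LEFT JOIN user_groups ON users.user_group_id = user_groups.user_group_id
WHERE comments.comment_enabled = 1
AND comments.comment_content_id = 12
ORDER BY comments.comment_time_added ASC
LIMIT 20
MySQL also supports an explicit STRAIGHT_JOIN modifier if you would rather force the join order without changing join semantics.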

What is the ideal solution for storing multiple ID's in a database table?

My real-estate application has a database table that holds the queries (inquiries) made by users. This table holds all the information about a particular real-estate query.
The database table looks something like this:
CREATE TABLE IF NOT EXISTS `queries` (
`id` bigint(20) NOT NULL auto_increment,
`category_id` int(10) NOT NULL,
`serial` varchar(30) NOT NULL,
`minBudget` int(11) NOT NULL,
`maxBudget` int(11) NOT NULL,
`city_id` int(10) NOT NULL,
`area_id` int(10) NOT NULL,
`locality` varchar(100) NOT NULL,
`status` tinyint(1) NOT NULL,
PRIMARY KEY  (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
For city_id and area_id, chances are there will be situations where I want to store multiple IDs; for example, 10, 12, 20, 50 and so on.
How should I be storing this in the database?
Should I use the varchar datatype and hold it as a string with a delimiter defined, then fetch the value in an array?
You can do that, but it's not really the preferred solution. This is the classic tradeoff in database normalization.
The "more correct" approach is to have tables "cities" and "areas", and tables "queries_cities" and "queries_areas" that are many-to-many to relate them.
Otherwise, what happens if a city or area ID changes? Rather than changing a single record in one place, you'll get to write a script to go through and update all the query records manually.
Do NOT use a varchar if those are IDs to another table. Use a many-to-many table mapping them together.
CREATE TABLE IF NOT EXISTS `query_cities` (
`id` bigint(20) NOT NULL auto_increment,
`query_id` bigint(20),
`city_id` bigint(20),
PRIMARY KEY (`id`)
);
CREATE TABLE IF NOT EXISTS `query_areas` (
`id` bigint(20) NOT NULL auto_increment,
`query_id` bigint(20),
`area_id` bigint(20),
PRIMARY KEY (`id`)
);
This will be much cleaner than stuffing things into a varchar; for instance, it allows you to say:
SELECT c.city_name, c.state, c.whatever
FROM queries q
JOIN query_cities qc ON (qc.query_id = q.id)
JOIN cities c ON (qc.city_id = c.id)
WHERE q.id = ?
Edit: meh, I'm lame and didn't include foreign keys there, but you get the point.
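For completeness, a hedged sketch of one junction table with the foreign keys added; this assumes InnoDB (MyISAM ignores FK constraints) and a hypothetical cities table whose primary key type matches city_id:
CREATE TABLE IF NOT EXISTS `query_cities` (
`id` bigint(20) NOT NULL auto_increment,
`query_id` bigint(20) NOT NULL,
`city_id` bigint(20) NOT NULL,
PRIMARY KEY (`id`),
FOREIGN KEY (`query_id`) REFERENCES `queries` (`id`),
FOREIGN KEY (`city_id`) REFERENCES `cities` (`id`)
) ENGINE=InnoDB;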