Same query, dramatically different performance on different data (MySQL)

So, I have this huge database with 40 million entries.
The query is a simple one (a_view is a view!):
select * from a_view where id > x LIMIT 10000
This is the behavior I get: if x is a small number (int), the query is super fast.
When x > 29 million the query starts to take minutes; if it is closer to 30 million it takes hours, and so on...
Why is that? What can I do to avoid it?
I am using InnoDB as the engine, and the tables have indexes.
The value of the limit is a critical one; it affects performance. If it is small, the query is always fast. But if x is close to 30 million then I need to be very careful not to set it too big (less than about 300), and even then the query is quite slow, though it doesn't take forever.
If you need more details, feel free to ask.
EDIT: here is the EXPLAIN output:
+----+-------------+-------+--------+-----------------+---------+---------+---------------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-----------------+---------+---------+---------------------+---------+-------------+
| 1 | SIMPLE | aH | index | PRIMARY | PRIMARY | 39 | NULL | 3028439 | Using index |
| 1 | SIMPLE | a | eq_ref | PRIMARY | PRIMARY | 4 | odb.aH.albumID | 1 | Using where |
| 1 | SIMPLE | aHT | ref | PRIMARY,albumID | albumID | 4 | odb.a.albumID | 4 | |
| 1 | SIMPLE | t | eq_ref | PRIMARY | PRIMARY | 4 | odb.aHT.id | 1 | Using where |
| 1 | SIMPLE | g | eq_ref | PRIMARY | PRIMARY | 4 | odb.t.genre | 1 | |
| 1 | SIMPLE | ar | eq_ref | PRIMARY | PRIMARY | 4 | odb.t.artist | 1 | |
+----+-------------+-------+--------+-----------------+---------+---------+---------------------+---------+-------------+

Here is a guess. Basically, your view is a select over some tables, and the "id" could be a row number. The larger your "x" is, the more rows need to be produced (and discarded) before you can get the data you want. That is why your query slows down as "x" increases.
If this is true, one solution could be to create a table that contains the row number and the primary key, sorted by whatever "order by" you are using. Once you have that table, you can join it with the rest of your data and select your data window by a row-number range.
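The row-number mapping idea can be sketched like this; a minimal illustration using Python's built-in sqlite3 as a stand-in for MySQL, with a made-up one-column payload behind the view:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Hypothetical stand-in for the data behind a_view.
cur.execute("CREATE TABLE a_view_data (id INTEGER PRIMARY KEY, payload TEXT)")
cur.executemany("INSERT INTO a_view_data VALUES (?, ?)",
                ((i, f"row {i}") for i in range(1, 1001)))

# Materialize a rownum -> primary-key mapping, sorted by the desired order.
cur.execute("""
    CREATE TABLE row_map AS
    SELECT ROW_NUMBER() OVER (ORDER BY id) AS rownum, id
    FROM a_view_data
""")
cur.execute("CREATE UNIQUE INDEX ix_row_map ON row_map (rownum)")

# Select a data window by rownum range; the cost of locating the window
# no longer depends on how deep into the result set it lies.
rows = cur.execute("""
    SELECT d.id, d.payload
    FROM row_map m
    JOIN a_view_data d ON d.id = m.id
    WHERE m.rownum BETWEEN 501 AND 510
    ORDER BY m.rownum
""").fetchall()
print(len(rows), rows[0][0])  # 10 501
```

The mapping table must be rebuilt (or trigger-maintained) when the underlying data changes, which is the trade-off of this approach.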

Related

How can I make this UPDATE query faster?

I need to make this update query more efficient.
UPDATE #table_name# SET #column_name2# = 1 WHERE #column_name1# in (A list of data)
Right now it takes more than 2 minutes to finish the job when my list of data is quite large. Here is the result of EXPLAIN for this query:
+----+-------------+--------------+-------+---------------+---------+---------+------+--------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+-------+---------------+---------+---------+------+--------+------------------------------+
| 1 | SIMPLE | #table_name# | index | NULL | PRIMARY | 38 | NULL | 763719 | Using where; Using temporary |
+----+-------------+--------------+-------+---------------+---------+---------+------+--------+------------------------------+
In class, I was told that an OK query should have at least a type of range, and that it is better to reach ref. Right now mine is index, which I think is the second slowest. I'm wondering if there's a way to optimize that.
Here is the table format:
+--------------------+-------------+------+-----+-------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+-------------+------+-----+-------------------+-------+
| #column_name1# | varchar(12) | NO | PRI | | |
| #column_name2# | tinyint(4) | NO | | 0 | |
| #column_name3# | tinyint(4) | NO | | 0 | |
| ENTRY_TIME | datetime | NO | | CURRENT_TIMESTAMP | |
+--------------------+-------------+------+-----+-------------------+-------+
My friend suggested that using EXISTS rather than an IN clause may help. However, it looks like I cannot use EXISTS with a literal list, as in exists (A list of data).
For this query:
UPDATE #table_name#
SET #column_name2# = 1
WHERE #column_name1# in (A list of data);
You want an index on #table_name#(#column_name1#).
Do note that the number of records being updated has a very big impact on performance. If the "list of data" is really a subquery, then other methods are likely to be more helpful for improving performance.
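As a sketch of why the index matters, here is a small runnable example using Python's sqlite3 (table and key names are invented); the same principle applies to MySQL, and it also shows building the IN list with placeholders rather than string concatenation:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Stand-in for #table_name#; #column_name1# becomes col1, #column_name2# col2.
cur.execute("CREATE TABLE t (col1 TEXT, col2 INTEGER NOT NULL DEFAULT 0)")
cur.executemany("INSERT INTO t (col1) VALUES (?)",
                ((f"key{i:06d}",) for i in range(10000)))

# Without this index, every value in the IN list forces a full-table scan.
cur.execute("CREATE INDEX ix_t_col1 ON t (col1)")

keys = ["key000123", "key004567", "key009999"]
placeholders = ",".join("?" * len(keys))
cur.execute(f"UPDATE t SET col2 = 1 WHERE col1 IN ({placeholders})", keys)
updated = cur.rowcount
print(updated)  # 3
```

With the index in place, each value in the list becomes an index seek instead of a scan, so the cost grows with the list size rather than the table size.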

MySQL query performance: individuate and rewrite faster query

Today I stumbled upon two different forms of the same query (which return the very same result) but which execute in very different durations:
ORIGINAL QUERY:
select count(distinct unit.ID)
from UNIT unit
left outer join AUTHORIZATION auth on unit.ID=auth.UNIT_ID
left outer join WORKFLOW_EXECUTION exec on unit.WORKFLOW_EXECUTION_ID=exec.ID
where
(
unit.RESPONSIBLE_ID=2
and
(
(
unit.STATUS<>'CLOSED'
and
unit.EXPECTEDRELEASEDATE is not null
)
or
exec.ACTIVE=1
)
)
or
(
exec.ACTIVE=1
and
auth.INTERVENTION=1
and
auth.SUBJECT_ID=2
);
plan:
+----+-------------+-------+------------+--------+-------------------------------------------------------------------+-------------------------------------+---------+----------------------------------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+-------------------------------------------------------------------+-------------------------------------+---------+----------------------------------+--------+----------+-------------+
| 1 | SIMPLE | unit | NULL | ALL | FK_UNIT_RESPONSIBLE_ID,IX_UNIT_STATUS,IX_UNIT_EXPECTEDRELEASEDATE | NULL | NULL | NULL | 451486 | 100.00 | NULL |
| 1 | SIMPLE | auth | NULL | ref | UK_AUTHORIZATION_UNIT_ID_SUBJECT_ID,FK_AUTHORIZATION_UNIT_ID | UK_AUTHORIZATION_UNIT_ID_SUBJECT_ID | 9 | edea2.unit.ID | 1 | 100.00 | Using where |
| 1 | SIMPLE | exec | NULL | eq_ref | PRIMARY | PRIMARY | 8 | edea2.unit.WORKFLOW_EXECUTION_ID | 1 | 100.00 | Using where |
+----+-------------+-------+------------+--------+-------------------------------------------------------------------+-------------------------------------+---------+----------------------------------+--------+----------+-------------+
duration:
+-------------------------+
| count(distinct unit.ID) |
+-------------------------+
| 538 |
+-------------------------+
1 row in set (2.46 sec)
Then I observed that, when executing this query with only one predicate in the WHERE clause, the duration decreased noticeably.
So I had the idea to rewrite it in a new style:
select count(distinct unit_root.ID)
from UNIT unit_root
where
unit_root.ID in
(
select unit.ID
from UNIT unit
left outer join WORKFLOW_EXECUTION exec on unit.WORKFLOW_EXECUTION_ID=exec.ID
where
(
unit.RESPONSIBLE_ID=2
and
(
(
unit.STATUS<>'CLOSED'
and
unit.EXPECTEDRELEASEDATE is not null
)
or
exec.ACTIVE=1
)
)
)
or
unit_root.ID in
(
select unit.ID
from UNIT unit
left outer join WORKFLOW_EXECUTION exec on unit.WORKFLOW_EXECUTION_ID=exec.ID
left outer join AUTHORIZATION auth on unit.ID=auth.UNIT_ID
where
(
exec.ACTIVE=1
and
auth.INTERVENTION=1
and
auth.SUBJECT_ID=2
)
);
plan:
+----+-------------+-----------+------------+--------+---------------------------------------------------------------------------------------------------------------------------+-----------------------------+---------+----------------------------------+--------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+--------+---------------------------------------------------------------------------------------------------------------------------+-----------------------------+---------+----------------------------------+--------+----------+--------------------------+
| 1 | PRIMARY | unit_root | NULL | index | PRIMARY,FK_UNIT_RESPONSIBLE_ID,FK_UNIT_WORKFLOW_EXECUTION_ID,IX_UNIT_EXPECTEDRELEASEDATE | IX_UNIT_EXPECTEDRELEASEDATE | 6 | NULL | 451486 | 100.00 | Using where; Using index |
| 3 | SUBQUERY | auth | NULL | ref | UK_AUTHORIZATION_UNIT_ID_SUBJECT_ID,FK_AUTHORIZATION_UNIT_ID,FK_AUTHORIZATION_SUBJECT_ID,IX_AUTHORIZATION_INTERVENTION | FK_AUTHORIZATION_SUBJECT_ID | 8 | const | 1 | 50.00 | Using where |
| 3 | SUBQUERY | unit | NULL | eq_ref | PRIMARY,FK_UNIT_WORKFLOW_EXECUTION_ID | PRIMARY | 8 | edea2.auth.UNIT_ID | 1 | 100.00 | NULL |
| 3 | SUBQUERY | exec | NULL | eq_ref | PRIMARY,IX_WORKFLOW_EXECUTION_ACTIVE | PRIMARY | 8 | edea2.unit.WORKFLOW_EXECUTION_ID | 1 | 26.47 | Using where |
| 2 | SUBQUERY | unit | NULL | ref | PRIMARY,FK_UNIT_RESPONSIBLE_ID,IX_UNIT_STATUS,IX_UNIT_EXPECTEDRELEASEDATE | FK_UNIT_RESPONSIBLE_ID | 8 | const | 225743 | 100.00 | NULL |
| 2 | SUBQUERY | exec | NULL | eq_ref | PRIMARY | PRIMARY | 8 | edea2.unit.WORKFLOW_EXECUTION_ID | 1 | 100.00 | Using where |
+----+-------------+-----------+------------+--------+---------------------------------------------------------------------------------------------------------------------------+-----------------------------+---------+----------------------------------+--------+----------+--------------------------+
duration:
+------------------------------+
| count(distinct unit_root.ID) |
+------------------------------+
| 538 |
+------------------------------+
1 row in set (0.51 sec)
Finally, the questions:
Why is there such a difference? Shouldn't the optimizer be able to optimize this kind of query?
Is there a tip to quickly identify this kind of query without having to measure execution times or investigate query plans?
Any tips on how to rewrite queries in the faster style?
Note that I'm using MySQL 5.7.10 and these queries are generated by Hibernate.
Thank you
You have too many questions, but here is some background.
First, you are asking way too much of the optimizer. I haven't looked through the details of your queries, but they may not be exactly the same logically. For instance, NULL values or duplicate values in a table might cause differences.
Second, it is often said (by me and many others) that SQL is a descriptive language, not a procedural one. However, this doesn't extend to query rewrites. It usually means that the optimizer can rearrange joins, use indexes, choose appropriate algorithms for joins and aggregations, decide the optimal place to do filtering, and a few other things. But the basic structure of the processing stays in place.
In terms of hints: GROUP BY and DISTINCT slow queries down. Don't avoid them; they are necessary parts of the language. But, as you discovered with these two queries, there can be savings. Another danger is OR, because it can prevent the use of relevant indexes.
Judging by the complexity of logic in the subqueries, it might be possible to further optimize the queries.
Finally, if you want to write efficient queries, you cannot depend on the query optimizer alone. You will need to put in the work to learn how the query optimizer works, as well as something about the algorithms used for the different components of a query, and fundamental optimization techniques such as indexing and partitioning.

Simple Query Slow In Mysql

Is there any way to get better performance out of this?
select * from p_all where sec='0P00009S33' order by date desc
Query took 0.1578 sec.
The table structure is shown below. There are more than 100 million records in this table.
+------------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------+---------------+------+-----+---------+-------+
| sec | varchar(10) | NO | PRI | NULL | |
| date | date | NO | PRI | NULL | |
| open | decimal(13,3) | NO | | NULL | |
| high | decimal(13,3) | NO | | NULL | |
| low | decimal(13,3) | NO | | NULL | |
| close | decimal(13,3) | NO | | NULL | |
| volume | decimal(13,3) | NO | | NULL | |
| unadjusted_close | decimal(13,3) | NO | | NULL | |
+------------------+---------------+------+-----+---------+-------+
EXPLAIN result
+----+-------------+-----------+------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+------+---------------+---------+---------+-------+------+-------------+
| 1 | SIMPLE | price_all | ref | PRIMARY | PRIMARY | 12 | const | 1731 | Using where |
+----+-------------+-----------+------+---------------+---------+---------+-------+------+-------------+
How can I speed up this query?
In your example you do a SELECT *, but you only have an INDEX that contains the columns sec and date.
As a result, MySQL's execution plan roughly looks like the following:
1. Find all rows that have sec = 0P00009S33 in the INDEX. This is fast.
2. Sort all returned rows by date. This is also possibly fast, depending on the size of your MySQL buffer. There is possibly room for improvement here by optimizing sort_buffer_size.
3. Fetch all columns (= the full row) for each row returned from the previous INDEX lookup. This is slow! (See the disclaimer below.)
You can optimize this drastically by reducing the SELECTed fields to the minimum. Example: if you only need the open price, do SELECT sec, date, open instead of SELECT *.
Once you have identified the minimum columns you need to query, add a combined INDEX that contains exactly those columns (all columns involved in the WHERE, SELECT or ORDER BY clauses).
This way you can completely skip the slow part of this query, step (3) above. When the INDEX already contains all the necessary columns, MySQL's optimizer can avoid looking up the full rows and serve your query directly from the INDEX.
Disclaimer: I'm unsure in which order MySQL executes the steps; possibly I ordered (2) and (3) the wrong way round. But that is not important for answering this question.
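The covering-index idea can be demonstrated with Python's sqlite3, whose EXPLAIN QUERY PLAN reports when a query is served entirely from an index (MySQL shows this as "Using index" in the Extra column of EXPLAIN). This is only a sketch; the schema is simplified and the data invented:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("""
    CREATE TABLE p_all (
        sec  TEXT NOT NULL,
        date TEXT NOT NULL,
        open REAL,
        close REAL,
        PRIMARY KEY (sec, date))
""")
cur.executemany("INSERT INTO p_all VALUES (?, ?, ?, ?)", [
    ("0P00009S33", "2020-01-01", 1.0, 1.1),
    ("0P00009S33", "2020-01-02", 1.2, 1.3),
    ("OTHER",      "2020-01-01", 9.0, 9.1),
])

# Covering index: contains every column the narrowed query touches.
cur.execute("CREATE INDEX ix_cover ON p_all (sec, date, open)")

query = """
    SELECT sec, date, open FROM p_all
    WHERE sec = '0P00009S33'
    ORDER BY date DESC
"""
plan = " ".join(r[3] for r in cur.execute("EXPLAIN QUERY PLAN " + query))
rows = cur.execute(query).fetchall()
print(plan)                   # ... USING COVERING INDEX ix_cover ...
print([r[1] for r in rows])   # ['2020-01-02', '2020-01-01']
```

Because the index also matches the ORDER BY, the rows come back already sorted, so the sort step disappears along with the row lookups.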

MySQL view taking too much time to select data

On the web page I'm working on I need to show some statistics based on different user details, which are spread over three tables. So I have the following query, which I then join to more tables:
SELECT *
FROM `user` `u`
LEFT JOIN `subscriptions` `s` ON `u`.`user_id` = `s`.`user_id`
LEFT JOIN `devices` `ud` ON `u`.`user_id` = `ud`.`user_id`
GROUP BY `u`.`user_id`
When I execute the query with LIMIT 1000 it takes about 0.05 seconds and since I'm using the data from all the three tables in a lot of queries I've decided to put it inside a VIEW:
CREATE VIEW `user_details` AS ( the same query from above )
And now when I run:
SELECT * FROM user_details LIMIT 1000
it takes about 7-10 seconds.
So my question is: can I do something to optimize the view, since the query itself seems to be pretty quick, or should I use the whole query instead of the view?
Edit: this is what EXPLAIN SELECT * FROM user_details returns:
+----+-------------+------------+--------+----------------+----------------+---------+------------------------+--------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+----------------+----------------+---------+------------------------+--------+-------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 322666 | |
| 2 | DERIVED | u | index | NULL | PRIMARY | 4 | NULL | 372587 | |
| 2 | DERIVED | s | eq_ref | PRIMARY | PRIMARY | 4 | db_users.u.user_id | 1 | |
| 2 | DERIVED | ud | ref | device_id_name | device_id_name | 4 | db_users.u.user_id | 1 | |
+----+-------------+------------+--------+----------------+----------------+---------+------------------------+--------+-------+
4 rows in set (8.67 sec)
And this is what EXPLAIN returns for the query itself:
+----+-------------+-------+--------+----------------+----------------+---------+------------------------+--------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------+----------------+---------+------------------------+--------+-------+
| 1 | SIMPLE | u | index | NULL | PRIMARY | 4 | NULL | 372587 | |
| 1 | SIMPLE | s | eq_ref | PRIMARY | PRIMARY | 4 | db_users.u.user_id | 1 | |
| 1 | SIMPLE | ud | ref | device_id_name | device_id_name | 4 | db_users.u.user_id | 1 | |
+----+-------------+-------+--------+----------------+----------------+---------+------------------------+--------+-------+
3 rows in set (0.00 sec)
Views and joins are extremely bad when it comes to performance. This is more or less true for all relational database management systems. It sounds strange, since that is what those systems are designed for, but it is true nevertheless.
Try to avoid the joins if this query is in heavy use on your page: instead, create a real table (not a view) that is filled from the three tables. You can automate that process using triggers, so that each time an entry is inserted into one of the original tables, the trigger takes care of propagating the data to the physical user_details table.
This strategy certainly means a one-time investment for the setup, but you will definitely get much better performance.
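A minimal sketch of the trigger approach, using Python's sqlite3 and an invented two-trigger setup (MySQL trigger syntax differs slightly, e.g. DELIMITER handling, but the idea is the same; real code would also need UPDATE and DELETE triggers):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE subscriptions (user_id INTEGER, plan TEXT);

    -- The physical table that replaces the view.
    CREATE TABLE user_details (user_id INTEGER PRIMARY KEY, name TEXT, plan TEXT);

    -- Propagate new users into the physical summary table.
    CREATE TRIGGER trg_user_insert AFTER INSERT ON user
    BEGIN
        INSERT INTO user_details (user_id, name) VALUES (NEW.user_id, NEW.name);
    END;

    -- Keep the summary row current when a subscription arrives.
    CREATE TRIGGER trg_sub_insert AFTER INSERT ON subscriptions
    BEGIN
        UPDATE user_details SET plan = NEW.plan WHERE user_id = NEW.user_id;
    END;
""")
cur.execute("INSERT INTO user VALUES (1, 'alice')")
cur.execute("INSERT INTO subscriptions VALUES (1, 'premium')")
row = cur.execute("SELECT * FROM user_details WHERE user_id = 1").fetchone()
print(row)  # (1, 'alice', 'premium')
```

Reads against user_details then cost a single indexed lookup, at the price of slightly slower writes on the source tables.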

Attempting to create index for MySQL query

I don't have a lot of experience yet with MySQL or with databases in general, though I'm going head-on into the development of a large-scale web app anyway. The following is the search query for my app that allows users to search for other users. Now that the primary table for this query, dev_Profile, has about 14K rows, the query is considerably slow (about 5 seconds when running a query that returns the largest possible set). I'm sure there are many optimization tweaks that could be made here, but would creating an index be the most fundamental first step? I've been trying to learn about indexes on my own, and how to make an index for a query with multiple joins, but I'm just not quite grasping it. I'm hoping that seeing things in the context of my actual query will be more educational.
Here's the basic query:
SELECT
dev_Profile.ID AS pid,
dev_Profile.Name AS username,
IF(TIMESTAMPDIFF(SECOND, st1.lastActivityTime, UTC_TIMESTAMP()) > 300 OR ISNULL(TIMESTAMPDIFF(SECOND, st1.lastActivityTime, UTC_TIMESTAMP())), 0, 1) AS online,
FLOOR(DATEDIFF(CURRENT_DATE, dev_Profile.DOB) / 365) AS age,
IF(dev_Profile.GenderID=1, 'M', 'F') AS sex,
IF(ISNULL(st2.Description), 0, st2.Description) AS relStatus,
st3.Name AS country,
IF(dev_Profile.RegionID > 0, st4.Name, 0) AS region,
IF(dev_Profile.CityID > 0, st5.Name, 0) AS city,
IF(ISNULL(st6.filename), 0, IF(st6.isApproved=1 AND st6.isDiscarded=0 AND st6.isModerated=1 AND st6.isRejected=0 AND isSizeAvatar=1, 1, 0)) AS hasPhoto,
IF(ISNULL(st6.filename), IF(dev_Profile.GenderID=1, 'http://www.mysite.com/lib/images/avatar-male-small.png', 'http://www.mysite.com/lib/images/avatar-female-small.png'), IF(st6.isApproved=1 AND st6.isDiscarded=0 AND st6.isModerated=1 AND st6.isRejected=0 AND isSizeAvatar=1, CONCAT('http://www.mysite.com/uploads/', st6.filename), IF(dev_Profile.GenderID=1, 'http://www.mysite.com/lib/images/avatar-male-small.png', 'http://www.mysite.com/lib/images/avatar-female-small.png'))) AS photo,
IF(ISNULL(dev_Profile.StatusMessage), IF(ISNULL(dev_Profile.AboutMe), IF(ISNULL(st7.AboutMyMatch), 0, st7.AboutMyMatch), dev_Profile.AboutMe), dev_Profile.StatusMessage) AS text
FROM
dev_Profile
LEFT JOIN dev_User AS st1 ON st1.ID = dev_Profile.UserID
LEFT JOIN dev_ProfileRelationshipStatus AS st2 ON st2.ID = dev_Profile.ProfileRelationshipStatusID
LEFT JOIN Country AS st3 ON st3.ID = dev_Profile.CountryID
LEFT JOIN Region AS st4 ON st4.ID = dev_Profile.RegionID
LEFT JOIN City AS st5 ON st5.ID = dev_Profile.CityID
LEFT JOIN dev_Photos AS st6 ON st6.ID = dev_Profile.PhotoAvatarID
LEFT JOIN dev_DesiredMatch AS st7 ON st7.ProfileID = dev_Profile.ID
WHERE
dev_Profile.ID != 11222 /* $_SESSION['ProfileID'] */
AND st1.EmailVerified = 'true'
AND st1.accountIsActive=1
ORDER BY st1.lastActivityTime DESC LIMIT 900;
The speed of this query (too slow, as you can see):
900 rows in set (5.20 sec)
The EXPLAIN for this query:
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
| 1 | SIMPLE | dev_Profile | range | PRIMARY | PRIMARY | 4 | NULL | 13503 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | st2 | eq_ref | PRIMARY | PRIMARY | 1 | syk.dev_Profile.ProfileRelationshipStatusID | 1 | |
| 1 | SIMPLE | st3 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.CountryID | 1 | |
| 1 | SIMPLE | st4 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.RegionID | 1 | |
| 1 | SIMPLE | st5 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.CityID | 1 | |
| 1 | SIMPLE | st1 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.UserID | 1 | Using where |
| 1 | SIMPLE | st6 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.PhotoAvatarID | 1 | |
| 1 | SIMPLE | st7 | ALL | NULL | NULL | NULL | NULL | 442 | |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
It's also possible that the query can have more WHERE and HAVING clauses, if a user's search contains additional criteria. The additional clauses are (set with example values):
AND dev_Profile.GenderID = 1
AND dev_Profile.CountryID=127
AND dev_Profile.RegionID=36
AND dev_Profile.CityID=601
HAVING (age >= 18 AND age <= 50)
AND online=1
AND hasPhoto=1
This is the EXPLAIN for the query using all possible WHERE and HAVING clauses:
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
| 1 | SIMPLE | dev_Profile | range | PRIMARY | PRIMARY | 4 | NULL | 13503 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | st2 | eq_ref | PRIMARY | PRIMARY | 1 | syk.dev_Profile.ProfileRelationshipStatusID | 1 | |
| 1 | SIMPLE | st3 | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| 1 | SIMPLE | st4 | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| 1 | SIMPLE | st5 | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| 1 | SIMPLE | st1 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.UserID | 1 | Using where |
| 1 | SIMPLE | st6 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.PhotoAvatarID | 1 | |
| 1 | SIMPLE | st7 | ALL | NULL | NULL | NULL | NULL | 442 | |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
I'm not even sure if this is TMI or not enough.
Is an index the right step to take here? If so, could someone get me going in the right direction?
The right step is whatever speeds up your query!
With your original query, I would say that you end up doing a table scan on the dev_Profile table, as there are no indexable conditions on it. With your modified query, it depends on the number of different values allowed in the column: if there are many duplicates, the index may not get used, since the table has to be fetched anyway in order to complete the rest of the query.
If I have read your plan correctly, then you are already joining all of your other tables on an indexed non-nullable column (except for st7, which doesn't seem to be using an index for some reason). It therefore looks as if you should not be using LEFT JOINs. This would then allow the use of an index on (EmailVerified, accountIsActive, lastActivityTime) on table st1.
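As an illustration of how such a composite index lets the engine both filter and return rows already in the desired order, here is a sketch with Python's sqlite3 (a simplified, invented dev_User schema). The plan shows an index search with no separate sort step, which corresponds to MySQL avoiding "Using filesort":

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("""
    CREATE TABLE dev_User (
        id INTEGER PRIMARY KEY,
        EmailVerified TEXT,
        accountIsActive INTEGER,
        lastActivityTime TEXT)
""")
# Equality columns first, then the ORDER BY column.
cur.execute("""
    CREATE INDEX ix_user_active
    ON dev_User (EmailVerified, accountIsActive, lastActivityTime)
""")

plan = " ".join(r[3] for r in cur.execute("""
    EXPLAIN QUERY PLAN
    SELECT id FROM dev_User
    WHERE EmailVerified = 'true' AND accountIsActive = 1
    ORDER BY lastActivityTime DESC
"""))
print(plan)  # SEARCH ... USING COVERING INDEX ix_user_active ...
```

If the ORDER BY column were listed before the equality columns, the index could not serve both the filter and the sort at once, which is why column order in a composite index matters.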
One should use indexes that are relevant for frequent queries. An index only slightly degrades write performance while immensely speeding up searches. As a rule of thumb, an object's own ID should be indexed as the PRIMARY key, and it's a good idea to have an index on groups of columns that always appear together in a query. I figure you should index GenderID, CountryID, RegionID, CityID, age, online and hasPhoto. You should provide the schema of at least dev_Profile if you think the right indexes are not being used.
Notice that country/region/city IDs might represent redundant information. Your design may be suboptimal.
Note 2: you're doing an awful lot of application logic in the SELECT. SQL is not designed for these IF-in-IF-in-IF clauses, and because of the URLs the query returns a much larger result than it would if you just requested the relevant fields (i.e. filename, genderID, and so on). There may be times when those exact interpreted values have to be returned by the query, but in general you are better off (for both speed and readability) coding these processing steps into your application code.
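As a sketch of moving that interpretation logic out of SQL, one could fetch the raw columns and derive the display values in application code. The function below is hypothetical (field names are invented) and mirrors the nested IFs for the photo URL:

```python
# Hypothetical stand-in for the nested IF(...) photo logic in the SELECT:
# fetch the raw columns, then derive display values in application code.
BASE = "http://www.mysite.com"

def photo_url(row):
    """Derive the avatar URL from raw columns instead of nested SQL IFs."""
    suffix = "male" if row["gender_id"] == 1 else "female"
    default = f"{BASE}/lib/images/avatar-{suffix}-small.png"
    approved = (row.get("filename")
                and row["is_approved"] and row["is_moderated"]
                and not row["is_discarded"] and not row["is_rejected"]
                and row["is_size_avatar"])
    return f"{BASE}/uploads/{row['filename']}" if approved else default

row = {"gender_id": 2, "filename": "pic.png", "is_approved": 1,
       "is_moderated": 1, "is_discarded": 0, "is_rejected": 0,
       "is_size_avatar": 1}
print(photo_url(row))  # http://www.mysite.com/uploads/pic.png
print(photo_url({"gender_id": 1, "filename": None}))  # ...avatar-male-small.png
```

The query then only ships the small raw columns over the wire, and the URL-building logic becomes testable on its own.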