I have a news table defined like this:
CREATE TABLE `news` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`creation_date` datetime DEFAULT NULL,
`modification_date` datetime DEFAULT NULL,
`active` bit(1) DEFAULT NULL,
`mark_for_delete` bit(1) DEFAULT NULL,
`verified` bit(1) DEFAULT NULL,
`bot_id` int(11) DEFAULT NULL,
`description` varchar(1000) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`hash` varchar(100) NOT NULL,
`published_at` datetime DEFAULT NULL,
`source` varchar(255) DEFAULT NULL,
`title` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`url` varchar(511) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UK_1dmji5m90xaiy84vttgkvsub2` (`hash`),
KEY `index_news_source` (`source`),
KEY `index_news_creation_date` (`creation_date`)
) ENGINE=InnoDB AUTO_INCREMENT=30887718 DEFAULT CHARSET=latin1
And a join table to tag news belonging to some popular names:
CREATE TABLE `star_news` (
`stars_id` bigint(20) NOT NULL,
`news_id` bigint(20) NOT NULL,
PRIMARY KEY (`stars_id`,`news_id`),
KEY `FK4eqjn8at6h4d9335q1plxkcnl` (`news_id`),
CONSTRAINT `FK1olc51y8amp8op1kbmx269bac` FOREIGN KEY (`stars_id`) REFERENCES `star` (`id`),
CONSTRAINT `FK4eqjn8at6h4d9335q1plxkcnl` FOREIGN KEY (`news_id`) REFERENCES `news` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Here is my query to return the latest news
SELECT DISTINCT n.*
FROM news n
JOIN star_news sn
ON n.id = sn.news_id
WHERE sn.stars_id IN (1234, 12345)
ORDER BY n.creation_date DESC
LIMIT 2;
Explain:
+----+-------------+-------+------------+--------+-------------------------------------+---------+---------+-----------------------+------+----------+-----------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+-------------------------------------+---------+---------+-----------------------+------+----------+-----------------------------------------------------------+
| 1 | SIMPLE | sn | NULL | range | PRIMARY,FK4eqjn8at6h4d9335q1plxkcnl | PRIMARY | 8 | NULL |196225| 100.00 | Using where; Using index; Using temporary; Using filesort |
| 1 | SIMPLE | n | NULL | eq_ref | PRIMARY | PRIMARY | 8 | cosmos_dev.sn.news_id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+-------------------------------------+---------+---------+-----------------------+------+----------+-----------------------------------------------------------+
This query takes 20 seconds on my machine. If I remove the ORDER BY clause, it returns in under a millisecond. How do I make the ORDER BY run faster?
I tried using FORCE INDEX on creation_date, since it's an indexed column, but that made performance worse.
First, write the query as:
SELECT n.*
FROM news n
WHERE EXISTS (SELECT 1
FROM star_news sn
WHERE n.id = sn.news_id AND
sn.stars_id IN (1234, 12345)
)
ORDER BY n.creation_date DESC
LIMIT 2 ;
This eliminates the outer SELECT DISTINCT, which should help.
Then, create an index on star_news(news_id, stars_id). This might also take advantage of an index on news(creation_date desc, id).
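As a rough sketch, the two indexes mentioned above could be created as follows. The index names are made up for illustration. MySQL versions before 8.0 parse DESC in an index definition but ignore it, so it is left out here; the optimizer can still read an ascending index backwards.

```sql
-- Hypothetical index names; adjust to your own naming convention.

-- Lets the EXISTS subquery look up (news_id, stars_id) pairs from the index alone.
CREATE INDEX idx_star_news_news_stars ON star_news (news_id, stars_id);

-- Lets MySQL walk news in creation_date order and stop after LIMIT rows.
CREATE INDEX idx_news_creation_date_id ON news (creation_date, id);
```

InnoDB already appends the primary key columns to every secondary index, so the existing single-column indexes may behave much like these. Compare the EXPLAIN output before and after to see whether the plan actually changes.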
So you have 196k news articles relating to those 2 stars? The explain extra tells you what is happening:
Using where; Using index; Using temporary; Using filesort
MySQL is creating a temporary table and sorting it to satisfy the ORDER BY, because it could not use an index that would serve both the join AND the ordering of articles by date.
We have this table:
CREATE TABLE `resource_grant` (
`resource_grant_id` int(11) NOT NULL AUTO_INCREMENT,
`member_type_id` int(11) NOT NULL,
`member_ref` varchar(36) NOT NULL,
`resource_type_id` int(11) NOT NULL,
`resource_ref` varchar(36) NOT NULL,
`role_id` int(11) NOT NULL,
`modified_by` varchar(255) NOT NULL,
`modified_timestamp` timestamp(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
`deleted` tinyint(1) NOT NULL DEFAULT '0',
`parent_grant_id` int(11) DEFAULT NULL,
PRIMARY KEY (`resource_grant_id`),
UNIQUE KEY `member_ref` (`member_ref`,`member_type_id`,`resource_type_id`,`resource_ref`),
KEY `member_type_id` (`member_type_id`),
KEY `resource_type_id` (`resource_type_id`),
KEY `role_id` (`role_id`),
KEY `resource_ref` (`resource_ref`,`resource_type_id`),
KEY `idx_rg_parent_grant_id` (`parent_grant_id`),
KEY `resource_ref_2` (`resource_ref`,`member_ref`,`resource_type_id`,`member_type_id`,`role_id`),
CONSTRAINT `resource_grant_ibfk_1` FOREIGN KEY (`member_type_id`) REFERENCES `member_type` (`member_type_id`),
CONSTRAINT `resource_grant_ibfk_2` FOREIGN KEY (`resource_type_id`) REFERENCES `resource_type` (`resource_type_id`),
CONSTRAINT `resource_grant_ibfk_3` FOREIGN KEY (`role_id`) REFERENCES `role` (`role_id`)
) ENGINE=InnoDB;
and these related tables:
CREATE TABLE `member_type` (
`member_type_id` int(11) NOT NULL AUTO_INCREMENT,
`member_type` varchar(36) NOT NULL,
`modified_by` varchar(36) NOT NULL,
`modified_timestamp` timestamp(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
`deleted` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`member_type_id`),
UNIQUE KEY `member_type` (`member_type`),
KEY `member_type_2` (`member_type`)
) ENGINE=InnoDB;
CREATE TABLE `resource_type` (
`resource_type_id` int(11) NOT NULL AUTO_INCREMENT,
`resource_type` varchar(36) NOT NULL,
`modified_by` varchar(36) NOT NULL,
`modified_timestamp` timestamp(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
`deleted` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`resource_type_id`),
UNIQUE KEY `resource_type` (`resource_type`),
KEY `resource_type_2` (`resource_type`)
) ENGINE=InnoDB;
CREATE TABLE `role` (
`role_id` int(11) NOT NULL AUTO_INCREMENT,
`role_ref` varchar(50) NOT NULL,
`name` varchar(256) NOT NULL,
`modified_by` varchar(36) NOT NULL,
`modified_timestamp` timestamp(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
`deleted` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`role_id`),
UNIQUE KEY `role_ref` (`role_ref`),
KEY `role_ref_2` (`role_ref`)
) ENGINE=InnoDB;
We need to run selects like these (using "Row Constructor Expression" syntax), which are basically "bulk selects":
SELECT rg.resource_grant_id
FROM resource_grant rg
JOIN resource_type rt ON rg.resource_type_id = rt.resource_type_id
JOIN member_type mt ON rg.member_type_id = mt.member_type_id
JOIN role r ON r.role_id = rg.role_id
WHERE
(rg.resource_ref, rg.member_ref, rt.resource_type, mt.member_type, r.role_ref)
IN
(
('759','624962','property','epc-user','role.171'),
('11974','624962','property','epc-user','role.171')
);
The selects take ~60s to run, which is unacceptably long.
Note that there IS an index on (resource_ref, member_ref, resource_type_id, member_type_id, role_id).
We also don't want to run n individual SELECT statements; we need these "bulk selects".
The MySQL 5.6 docs say that this style of select does not use indexes, but that you can make it use them with some tricks:
https://dev.mysql.com/doc/refman/5.6/en/row-constructor-optimization.html
https://dev.mysql.com/doc/refman/5.6/en/range-optimization.html
I'm not sure what we are missing to make it use the indexes.
EDIT: here's the plan
mysql> explain SELECT rg.resource_grant_id FROM resource_grant rg JOIN resource_type rt ON rg.resource_type_id = rt.resource_type_id JOIN member_type mt ON rg.member_type_id = mt.member_type_id JOIN role r ON r.role_id = rg.role_id WHERE (rg.resource_ref, rg.member_ref, rt.resource_type, mt.member_type, r.role_ref) IN ( ('759','624962','property','epc-user','role.171'), ('11974','624962','property','epc-user','role.171') );
+----+-------------+-------+--------+-----------------------------------------+----------------+---------+--------------------------+---------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-----------------------------------------+----------------+---------+--------------------------+---------+----------------------------------------------------+
| 1 | SIMPLE | rt | index | PRIMARY | resource_type | 38 | NULL | 3 | Using index |
| 1 | SIMPLE | mt | index | PRIMARY | member_type | 38 | NULL | 6 | Using index; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | rg | ref | member_type_id,resource_type_id,role_id | member_type_id | 4 | samsDB.mt.member_type_id | 2370237 | Using where |
| 1 | SIMPLE | r | eq_ref | PRIMARY | PRIMARY | 4 | samsDB.rg.role_id | 1 | Using where |
+----+-------------+-------+--------+-----------------------------------------+----------------+---------+--------------------------+---------+----------------------------------------------------+
4 rows in set (0.53 sec)
Start by changing the where clause to:
WHERE rg.member_ref = '624962' AND
rt.resource_type = 'property' AND
mt.member_type = 'epc-user' AND
r.role_ref = 'role.171' AND
rg.resource_ref IN ('759', '11974')
The existing indexes are not quite optimal for this. You need an index whose first two keys are (member_ref, resource_ref), except in recent versions of MySQL (8.0.13 and later), which implement skip-scan index optimizations.
You might be able to change resource_ref_2 to:
KEY `resource_ref_2` (`member_ref`, `resource_ref`, `resource_type_id`, `member_type_id`, `role_id`),
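A minimal sketch of that index change, assuming nothing else relies on the current column order of resource_ref_2. Rebuilding an index on a table with millions of rows can take a while, so it is worth trying on a copy first.

```sql
-- Replace resource_ref_2 so that member_ref leads, followed by resource_ref.
-- The existing `resource_ref` index still serves lookups that start with resource_ref.
ALTER TABLE resource_grant
  DROP INDEX resource_ref_2,
  ADD INDEX resource_ref_2 (member_ref, resource_ref, resource_type_id, member_type_id, role_id);
```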
I'm not surprised at 60s on 5.6. "Row constructors" have existed for a long time. But they were not optimized before 5.7.
Either upgrade or rewrite the WHERE as Gordon suggests.
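For reference, here is the original query with the rewritten WHERE clause put together. This is only a sketch of the two-row example from the question; it works because both tuples share every value except resource_ref.

```sql
SELECT rg.resource_grant_id
FROM resource_grant rg
JOIN resource_type rt ON rg.resource_type_id = rt.resource_type_id
JOIN member_type mt ON rg.member_type_id = mt.member_type_id
JOIN role r ON r.role_id = rg.role_id
WHERE rg.member_ref = '624962'           -- shared by both tuples
  AND rt.resource_type = 'property'      -- shared by both tuples
  AND mt.member_type = 'epc-user'        -- shared by both tuples
  AND r.role_ref = 'role.171'            -- shared by both tuples
  AND rg.resource_ref IN ('759', '11974');  -- the only value that differs
```

If the tuples in a batch differ in more than one column, this exact form no longer applies and the conditions would have to be grouped per distinct combination.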
MySQL version: 5.7.22
Table definition
CREATE TABLE `books` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`uuid` char(32) COLLATE utf8mb4_unicode_ci NOT NULL,
`title` varchar(254) COLLATE utf8mb4_unicode_ci NOT NULL,
`created` datetime(6) NOT NULL,
`modified` datetime(6) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `book_uuid` (`uuid`)
) ENGINE=InnoDB AUTO_INCREMENT=115 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
Query
SELECT DISTINCT `books`.`id`,
`books`.`uuid`,
`books`.`title`
FROM `books`
WHERE (`books`.`uuid` IN ("334222a0e99b4a3e97f577665055208e",
"979c059840964934816280ba85c67221",
"4e2978c765dd435998666ea3083666e5",
"535aa78ba80e4215bbf75fb1e20cc5f3",
"f969fb10c72b4875aabdf75c1b493524",
"1daa0015055444a4b1c0821618a7a4d9",
"04f34ede284a4b86b0adddb405d30a75",
"513cad12c88c44c6ab248d43643459b9",
"de2bde6d016f4381ad0ba714234386fa",
"f645c2c9f1594a199a960b97b7015986",
"3ce02c072f24447a8a7b269a19ec554f",
"75450daf9d024d9d9c0df038437ae2c2",
"0e822042b50b4f79bb38304e0acde6f0",
"38d808fb3f9a4f57b4f7b30a141e7169",
"ecd424abd3a94a339383f6f8e668655e"))
ORDER BY `books`.`id` DESC
LIMIT 15;
When I run EXPLAIN on this query, it doesn't pick the index:
+----+-------------+-----------------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
| 1 | SIMPLE | books | NULL | ALL | book_uuid | NULL | NULL | NULL | 107 | 12.15 | Using where; Using filesort |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
Strangely, the index is used correctly when 12 or fewer entries are passed to the IN clause (without forcing the index).
Forcing the index works, but I cannot do that because this query is generated by the Django ORM, and I cannot use django-mysql's force_index because I am on InnoDB.
Note: I know DISTINCT and LIMIT can be avoided in this query, but it is part of a bigger query, so I have kept it as is.
I am trying to prevent MySQL from creating a temporary table in this query:
SELECT `vendor_id`,SUM(`qty`) AS `qty`
FROM `inventory_transactions`
WHERE `inventory_transactions`.`date`
BETWEEN '2018-10-21 00:00:00' AND '2018-10-22 23:59:59'
GROUP BY `vendor_id`
I've tried rearranging the indexes, using SELECT DISTINCT, using MIN(vendor_id) and MAX(vendor_id), and adding a COUNT(*) column to see whether it would use the index to sort.
I've had ix(date, type, vendor_id) in every variation, as well as individual indexes.
I just can't figure out why MySQL keeps sorting from a temporary table. The only way it avoids a temp table is if I group by the date column, which is not what I want.
Does anyone have any insight into how to fix it?
Table Schema
CREATE TABLE `transactions` (
`id` int(11) NOT NULL,
`vendor_id` int(11) DEFAULT NULL,
`type` varchar(50) DEFAULT NULL,
`unit_cost` decimal(10,2) NOT NULL,
`qty` decimal(11,2) DEFAULT NULL,
`location_id` int(11) NOT NULL,
`date` timestamp NOT NULL,
PRIMARY KEY (`id`),
KEY `location_id` (`location_id`),
KEY `type` (`type`),
KEY `vendor_id` (`vendor_id`) USING BTREE,
KEY `date` (`date`,`vendor_id`,`type`) USING BTREE,
CONSTRAINT `transactions_ibfk_1` FOREIGN KEY (`location_id`) REFERENCES `locations` (`id`),
CONSTRAINT `transactions_ibfk_3` FOREIGN KEY (`vendor_id`) REFERENCES `vendors` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
EXPLAIN
+----+-------------+------------------------+------------+-------+----------------------+------------+---------+------+------+----------+-----------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------------------+------------+-------+----------------------+------------+---------+------+------+----------+-----------------------------------------------------------+
| 1 | SIMPLE | inventory_transactions | NULL | range | vendor_id,date | date | 4 | NULL | 1196 | 100.00 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+------------------------+------------+-------+----------------------+------------+---------+------+------+----------+-----------------------------------------------------------+
1 row in set, 1 warning (0.00 sec)
I'm having a problem with this query, which takes several seconds to complete. I have already tried many optimizations, but I'm out of ideas at this point.
The tables are the following (they are not fully normalized, especially the tracks table):
CREATE TABLE `tracks` (
`id` int(14) unsigned NOT NULL AUTO_INCREMENT,
`artist` varchar(200) NOT NULL,
`track` varchar(200) NOT NULL,
`album` varchar(200) NOT NULL,
`path` text NOT NULL,
`tags` text NOT NULL,
`priority` int(10) NOT NULL DEFAULT '0',
`lastplayed` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`lastrequested` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`usable` int(1) NOT NULL DEFAULT '0',
`accepter` varchar(200) NOT NULL DEFAULT '',
`lasteditor` varchar(200) NOT NULL DEFAULT '',
`hash` varchar(40) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `hash` (`hash`),
FULLTEXT KEY `searchindex` (`tags`,`artist`,`track`,`album`),
FULLTEXT KEY `artist` (`artist`,`track`,`album`,`tags`)
) ENGINE=MyISAM AUTO_INCREMENT=3336 DEFAULT CHARSET=utf8
CREATE TABLE `esong` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`hash` varchar(40) COLLATE utf8_bin NOT NULL,
`len` int(10) unsigned NOT NULL,
`meta` text COLLATE utf8_bin NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `hash` (`hash`)
) ENGINE=InnoDB AUTO_INCREMENT=16032 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
CREATE TABLE `efave` (
`id` int(10) unsigned NOT NULL DEFAULT '0',
`inick` int(10) unsigned NOT NULL,
`isong` int(10) unsigned NOT NULL,
UNIQUE KEY `inick` (`inick`,`isong`),
KEY `isong` (`isong`),
CONSTRAINT `inick` FOREIGN KEY (`inick`) REFERENCES `enick` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `isong` FOREIGN KEY (`isong`) REFERENCES `esong` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8
CREATE TABLE `enick` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`nick` varchar(30) COLLATE utf8_bin NOT NULL,
`dta` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`dtb` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`),
KEY `nick` (`nick`)
) ENGINE=InnoDB AUTO_INCREMENT=488 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
The query I'm trying to run at a reasonable speed is the following:
SELECT esong.meta, tracks.id
FROM tracks
RIGHT JOIN esong ON tracks.hash = esong.hash
JOIN efave ON efave.isong = esong.id
JOIN enick ON efave.inick = enick.id
WHERE enick.nick = lower('nickname');
If I replace the RIGHT JOIN with a plain JOIN, it is fast.
EXPLAIN gives me this result. There seems to be a small problem in the efave selection, but I have no idea how to get rid of it:
+----+-------------+--------+--------+---------------+---------+---------+-----------------------+------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+--------+---------------+---------+---------+-----------------------+------+----------+--------------------------+
| 1 | SIMPLE | enick | ref | PRIMARY,nick | nick | 92 | const | 1 | 100.00 | Using where; Using index |
| 1 | SIMPLE | efave | ref | inick,isong | inick | 4 | radiosite.enick.id | 12 | 100.00 | Using index |
| 1 | SIMPLE | esong | eq_ref | PRIMARY | PRIMARY | 4 | radiosite.efave.isong | 1 | 100.00 | |
| 1 | SIMPLE | tracks | ALL | hash | NULL | NULL | NULL | 3210 | 100.00 | |
+----+-------------+--------+--------+---------------+---------+---------+-----------------------+------+----------+--------------------------+
Your EXPLAIN looks clean. The only thing that stands out to me is that the esong table uses the utf8_bin collation, while the tracks table doesn't specify a collation, which means it uses the default for utf8 (utf8_general_ci). Try aligning your collations and see how the join performs.
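One way to align the collations, sketched under the assumption that it is acceptable for tracks.hash to be compared byte-for-byte (case-sensitively), is to change only that column to utf8_bin:

```sql
-- Make tracks.hash use the same collation as esong.hash so the join
-- can compare the two columns without a collation conversion.
ALTER TABLE tracks
  MODIFY `hash` varchar(40) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL;
```

After the change, re-run EXPLAIN and check whether the tracks row still shows type ALL.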
Have you checked your execution plan? If not, run your query so that it includes the plan. Your RIGHT JOIN may be doing an index scan instead of an index seek, or you may be lacking indexes. Either way, you need to look at the execution plan so you can optimize the query properly. No one will really be able to tell you how to make it faster using a RIGHT JOIN (or a JOIN, for that matter) until you know what the real problem is. Here are some links:
For MySQL: http://dev.mysql.com/doc/refman/5.5/en/execution-plan-information.html
For SqlServer: http://www.sql-server-performance.com/2006/query-execution-plan-analysis/
I have this query (shown below) which currently uses temporary and filesort in order to generate a grouped by set of ordered results. I would like to get rid of their usage if possible. I have looked into the underlying indexes used in this query and I just can't see what is missing.
SELECT
b.institutionid AS b__institutionid,
b.name AS b__name,
COUNT(DISTINCT f2.facebook_id) AS f2__0
FROM education_institutions b
LEFT JOIN facebook_education_matches f ON b.institutionid = f.institutionid
LEFT JOIN facebook_education f2 ON f.school_uid = f2.school_uid
WHERE
(
b.approved = '1'
AND f2.facebook_id IN ( [lots of facebook ids here ])
)
GROUP BY b__institutionid
ORDER BY f2__0 DESC
LIMIT 10
Here is the output for EXPLAIN EXTENDED :
+----+-------------+-------+--------+--------------------------------+----------------+---------+----------------------------------+------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+--------+--------------------------------+----------------+---------+----------------------------------+------+----------+----------------------------------------------+
| 1 | SIMPLE | f | index | PRIMARY,institutionId | institutionId | 4 | NULL | 308 | 100.00 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | f2 | ref | facebook_id_idx,school_uid_idx | school_uid_idx | 9 | f.school_uid | 1 | 100.00 | Using where |
| 1 | SIMPLE | b | eq_ref | PRIMARY | PRIMARY | 4 | f.institutionId | 1 | 100.00 | Using where |
+----+-------------+-------+--------+--------------------------------+----------------+---------+----------------------------------+------+----------+----------------------------------------------+
The CREATE TABLE statements for each table are shown below so you know the schema.
CREATE TABLE facebook_education (
education_id int(11) NOT NULL AUTO_INCREMENT,
name varchar(255) DEFAULT NULL,
school_uid bigint(20) DEFAULT NULL,
school_type varchar(255) DEFAULT NULL,
year smallint(6) DEFAULT NULL,
facebook_id bigint(20) DEFAULT NULL,
degree varchar(255) DEFAULT NULL,
PRIMARY KEY (education_id),
KEY facebook_id_idx (facebook_id),
KEY school_uid_idx (school_uid),
CONSTRAINT facebook_education_facebook_id_facebook_user_facebook_id FOREIGN KEY (facebook_id) REFERENCES facebook_user (facebook_id)
) ENGINE=InnoDB AUTO_INCREMENT=484 DEFAULT CHARSET=utf8;
CREATE TABLE facebook_education_matches (
school_uid bigint(20) NOT NULL,
institutionId int(11) NOT NULL,
created_at timestamp NULL DEFAULT NULL,
updated_at timestamp NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (school_uid),
KEY institutionId (institutionId),
CONSTRAINT fk_facebook_education FOREIGN KEY (school_uid) REFERENCES facebook_education (school_uid) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT fk_education_institutions FOREIGN KEY (institutionId) REFERENCES education_institutions (institutionId) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB;
CREATE TABLE education_institutions (
institutionId int(11) NOT NULL AUTO_INCREMENT,
name varchar(100) NOT NULL,
type enum('School','Degree') DEFAULT NULL,
approved tinyint(1) NOT NULL DEFAULT '0',
deleted tinyint(1) NOT NULL DEFAULT '0',
normalisedName varchar(100) NOT NULL,
created_at timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (institutionId)
) ENGINE=InnoDB AUTO_INCREMENT=101327 DEFAULT CHARSET=utf8;
Any guidance would be greatly appreciated.
The filesort probably happens because you have no suitable index for the ORDER BY.
This is covered in the MySQL "ORDER BY Optimization" docs.
What you can do is load a temp table and then select from it. When you load the temp table, use ORDER BY NULL. When you select from the temp table, use ORDER BY ... LIMIT.
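The temp-table approach described above might look roughly like this. The temp table name, the matches alias, and the literal IDs are placeholders standing in for the real list from the question.

```sql
-- Step 1: aggregate into a temp table, suppressing any sort with ORDER BY NULL.
CREATE TEMPORARY TABLE tmp_institution_counts AS
SELECT
  b.institutionid,
  b.name,
  COUNT(DISTINCT f2.facebook_id) AS matches
FROM education_institutions b
LEFT JOIN facebook_education_matches f ON b.institutionid = f.institutionid
LEFT JOIN facebook_education f2 ON f.school_uid = f2.school_uid
WHERE b.approved = '1'
  AND f2.facebook_id IN (1001, 1002, 1003)  -- placeholder ids
GROUP BY b.institutionid
ORDER BY NULL;

-- Step 2: sort only the (much smaller) aggregated result.
SELECT institutionid, name, matches
FROM tmp_institution_counts
ORDER BY matches DESC
LIMIT 10;
```

The final sort still needs a filesort, but it runs over one row per institution rather than over the joined rows.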
The issue is that GROUP BY adds an implicit ORDER BY <group by clause> ASC (in MySQL 5.7 and earlier) unless you disable that behavior by adding ORDER BY NULL.
It's one of those MySQL-specific gotchas.
I can see two possible optimizations:
b.approved = '1': you definitely need an index on the approved column for quick filtering.
f2.facebook_id IN ( [lots of facebook ids here ]): store the Facebook IDs in a temp table, create an index on the temp table, and then join with it instead of using the IN clause.
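Both suggestions could be sketched as follows. The index name, the temp table name, and the inserted IDs are hypothetical. Inner joins are used because the filter on f2.facebook_id already discards rows without a match, so the LEFT JOINs in the original return the same rows.

```sql
-- Suggestion 1: index the filter column (hypothetical index name).
CREATE INDEX idx_institutions_approved ON education_institutions (approved);

-- Suggestion 2: put the ids in an indexed temp table and join against it.
CREATE TEMPORARY TABLE tmp_facebook_ids (
  facebook_id bigint(20) NOT NULL,
  PRIMARY KEY (facebook_id)
) ENGINE=InnoDB;

INSERT INTO tmp_facebook_ids (facebook_id)
VALUES (1001), (1002), (1003);  -- placeholder ids

SELECT
  b.institutionid AS b__institutionid,
  b.name AS b__name,
  COUNT(DISTINCT f2.facebook_id) AS f2__0
FROM education_institutions b
JOIN facebook_education_matches f ON b.institutionid = f.institutionid
JOIN facebook_education f2 ON f.school_uid = f2.school_uid
JOIN tmp_facebook_ids t ON t.facebook_id = f2.facebook_id
WHERE b.approved = 1
GROUP BY b.institutionid
ORDER BY f2__0 DESC
LIMIT 10;
```

Whether this beats a long IN list depends on the number of IDs and the MySQL version, so it is worth comparing the EXPLAIN output of both forms.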