Avoid UNION for two almost identical tables in MySQL - mysql

I'm not very good at MySQL and i'm going to write a query to count messages sent by an user, based on its type and is_auto field.
Messages can be of type "small text message" or "newsletter". I created two entities with a few fields that differs between them. The important one is messages_count that is absent in table newsletter and it's used in the query:
CREATE TABLE IF NOT EXISTS `small_text_message` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`messages_count` int(11) NOT NULL,
`username` varchar(255) NOT NULL,
`method` varchar(255) NOT NULL,
`content` longtext,
`sent_at` datetime DEFAULT NULL,
`status` varchar(255) NOT NULL,
`recipients_count` int(11) NOT NULL,
`customers_count` int(11) NOT NULL,
`sheduled_at` datetime DEFAULT NULL,
`sheduled_for` datetime DEFAULT NULL,
`is_auto` tinyint(1) NOT NULL,
`user_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
And:
CREATE TABLE `newsletter` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`subject` varchar(78) DEFAULT NULL,
`content` longtext,
`sent_at` datetime DEFAULT NULL,
`status` varchar(255) NOT NULL,
`recipients_count` int(11) NOT NULL,
`customers_count` int(11) NOT NULL,
`sheduled_at` datetime DEFAULT NULL,
`sheduled_for` datetime DEFAULT NULL,
`is_auto` tinyint(1) NOT NULL,
`user_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
I ended up with a UNION query. Can this query be shortened or optimized since the only difference is messages_count that should be always 1 for newsletter?
SELECT
CONCAT('sms_', IF(is_auto = 0, 'user' , 'auto')) AS subtype,
SUM(messages_count * (customers_count + recipients_count)) AS count
FROM small_text_message WHERE status <> 'pending' AND user_id = 1
GROUP BY is_auto
UNION
SELECT
CONCAT('newsletter_', IF(is_auto = 0, 'user' , 'auto')) AS subtype,
SUM(customers_count + recipients_count) AS count
FROM newsletter WHERE status <> 'pending' AND user_id = 1
GROUP BY is_auto

I don't see any easy way to avoid a UNION (or UNION ALL) operation, that will return the specified result set.
I would recommend you use a UNION ALL operator in place of the UNION operator. Then the execution plan will not include the step that eliminates duplicate rows. (You already have GROUP BY operations on each query, and there is no way that those two queries can produce an identical row.)
Otherwise, your query looks fine just as it is written.
(It's always a good thing to consider the question, might there be a better way? To get the result set you are asking for, from the schema you have, your query looks about as good as it's going to get.)

If you are looking for more general DB advice, I recommend restructuring the tables to factor the common elements into one table, perhaps called outbound_communication or something, with all of your common fields, then perhaps have "sub tables" for the specific types to host the fields which are unique to that type. It does mean a simple JOIN is necessary to select all of the fields you want, but the again, it's normalized and actually makes situations like this one easier (one table holds all of the entities of interest). Additionally, you have the option of writing that JOIN just once as a "view", and then your existing code would not even need to change to see the two tables as if they never changed.
CREATE TABLE IF NOT EXISTS `outbound_communicaton` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`content` longtext,
`sent_at` datetime DEFAULT NULL,
`status` varchar(255) NOT NULL,
`recipients_count` int(11) NOT NULL,
`customers_count` int(11) NOT NULL,
`sheduled_at` datetime DEFAULT NULL,
`sheduled_for` datetime DEFAULT NULL,
`is_auto` tinyint(1) NOT NULL,
`user_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
CREATE TABLE `small_text_message` (
`oubound_communication_id` int(11) NOT NULL,
`messages_count` int(11) NOT NULL,
`username` varchar(255) NOT NULL,
`method` varchar(255) NOT NULL,
PRIMARY KEY (`outbound_communication_id`),
FOREIGN KEY (outbound_communication_id)
REFERENCES outbound_communicaton(id)
) ENGINE=InnoDB;
CREATE TABLE `newsletter` (
`oubound_communication_id` int(11) NOT NULL,
`subject` varchar(78) DEFAULT NULL,
PRIMARY KEY (`outbound_communication_id`),
FOREIGN KEY (outbound_communication_id)
REFERENCES outbound_communicaton(id)
) ENGINE=InnoDB;
Then selecting a text msg is like this:
SELECT *
FROM outbound_communication AS parent
JOIN small_text_message
ON parent.id = small_text_message.outbound_communication_id
WHERE parent.id = 1234;

The nature of the query is inherently the union of the data from the small text message and the newsletter tables, so the UNION query is the only realistic formulation. There's no join of relevance between the two tables, for example.
So, I think you're very much on the right lines with your query.
Why are you worried about a UNION?

Related

How to optimize query by SUM of relations?

I have 3 simple tables
Invoices ( ~500k records )
Invoice items, one-to-many relation to invoices ( ~10 million records )
Invoice payments, one-to-many relation to invoices ( ~700k records )
Now, as simple as it sounds, I need to query for unpaid invoices.
Here is the query I am using:
select * from invoices
LEFT JOIN (SELECT invoice_id, SUM(price) as totalAmount
FROM invoice_items
GROUP BY invoice_id) AS t1
ON t1.invoice_id = invoices.id
LEFT JOIN (SELECT invoice_id, SUM(payed_amount) as totalPaid
FROM invoice_payment_transactions
GROUP BY invoice_id) AS t2
ON t2.invoice_id = invoices.id
WHERE totalAmount > totalPaid
Unfortunately, this query takes around 30 seconds, so way to slow.
Of course I have indexes set for "invoice_id" on both payments and items.
When I "EXPLAIN" the query, I can see that mysql has to do a full table scan.
I also tried several other query approaches, using "EXISTS" or "IN" with subqueries, but I never got around the full table scan.
Pretty sure there is not much that can be done here ( except use some caching approach ), but maybe someone knows how to optimize this ?
I need this query to run in a +/-2 seconds max.
EDIT:
Thanks to everybody for trying. Please just know that I absolutely know how to adopt different caching strategies here, but this question is purely about optimizing this query !
Here are the ( simplified ) table definitions
CREATE TABLE `invoices`
(
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`created_at` timestamp NOT NULL DEFAULT current_timestamp(),
`date` date NOT NULL,
`title` enum ('M','F','Other') DEFAULT NULL,
`first_name` varchar(191) DEFAULT NULL,
`family_name` varchar(191) DEFAULT NULL,
`street` varchar(191) NOT NULL,
`postal_code` varchar(10) NOT NULL,
`city` varchar(191) NOT NULL,
`country` varchar(2) NOT NULL,
PRIMARY KEY (`id`),
KEY `date` (`date`)
) ENGINE = InnoDB
CREATE TABLE `invoice_items`
(
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`invoice_id` bigint(20) unsigned NOT NULL,
`created_at` timestamp NOT NULL DEFAULT current_timestamp(),
`name` varchar(191) DEFAULT NULL,
`description` text DEFAULT NULL,
`reference` varchar(191) DEFAULT NULL,
`quantity` smallint(6) NOT NULL,
`price` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `invoice_items_invoice_id_index` (`invoice_id`),
) ENGINE = InnoDB
CREATE TABLE `invoice_payment_transactions`
(
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`invoice_id` bigint(20) unsigned NOT NULL,
`created_at` timestamp NOT NULL DEFAULT current_timestamp(),
`transaction_identifier` varchar(191) NOT NULL,
`payed_amount` mediumint(9) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `invoice_payment_transactions_invoice_id_index` (`invoice_id`),
) ENGINE = InnoDB
Plan A:
Summary table by invoice_id and day. (as Bill suggested) Summary Tables
Plan B:
Change the design to be "current" and "history". This is where the "payments" is a "history" of money changing hands. Meanwhile "invoices" would be "current" in that it contains a "balance_owed" column. This is a philosophy change; it could (should) be encapsulated in a client subroutine and/or a database Stored Procedure.
Plan C: This may be useful if "most" of the invoices are paid off.
Have a flag in the invoices table to indicate paid-off. That will prevent "most" of the JOINs from occurring. (Well, adding that column is just as hard as doing Plan B.)

Slow search query with a one to many join

My problem is a slow search query with a one-to-many relationship between the tables. My tables look like this.
Table Assignment
CREATE TABLE `Assignment` (
`Id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`ProjectId` int(10) unsigned NOT NULL,
`AssignmentTypeId` smallint(5) unsigned NOT NULL,
`AssignmentNumber` varchar(30) NOT NULL,
`AssignmentNumberExternal` varchar(50) DEFAULT NULL,
`DateStart` datetime DEFAULT NULL,
`DateEnd` datetime DEFAULT NULL,
`DateDeadline` datetime DEFAULT NULL,
`DateCreated` datetime DEFAULT NULL,
`Deleted` datetime DEFAULT NULL,
`Lat` double DEFAULT NULL,
`Lon` double DEFAULT NULL,
PRIMARY KEY (`Id`),
KEY `idx_assignment_assignment_type_id` (`AssignmentTypeId`),
KEY `idx_assignment_assignment_number` (`AssignmentNumber`),
KEY `idx_assignment_assignment_number_external`
(`AssignmentNumberExternal`)
) ENGINE=InnoDB AUTO_INCREMENT=5280 DEFAULT CHARSET=utf8;
Table ExtraFields
CREATE TABLE `ExtraFields` (
`assignment_id` int(10) unsigned NOT NULL,
`name` varchar(30) NOT NULL,
`value` text,
PRIMARY KEY (`assignment_id`,`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
My search query
SELECT
`Assignment`.`Id`, COL_5_72, COL_5_73, COL_5_74, COL_5_75, COL_5_76,
COL_5_77 FROM (
SELECT
`Assignment`.`Id`,
`Assignment`.`AssignmentNumber` AS COL_5_72,
`Assignment`.`AssignmentNumberExternal` AS COL_5_73 ,
`AssignmentType`.`Name` AS COL_5_74,
`Assignment`.`DateStart` AS COL_5_75,
`Assignment`.`DateEnd` AS COL_5_76,
`Assignment`.`DateDeadline` AS COL_5_77 FROM `Assignment`
CASE WHEN `ExtraField`.`Name` = "WorkDistrict" THEN
`ExtraField`.`Value` end as COL_5_78 FROM `Assignment`
LEFT JOIN `ExtraFields` as `ExtraField` on
`ExtraField`.`assignment_id` = `Assignment`.`Id`
WHERE `Assignment`.`Deleted` IS NULL -- Assignment should not be removed.
AND (1=1) -- Add assignment filters.
) AS q1
GROUP BY `Assignment`.`Id`
HAVING 1 = 1
AND COL_5_78 LIKE '%Amsterdam East%'
ORDER BY COL_5_72 ASC, COL_5_73 ASC;
When the table is only around 3500 records my query takes a couple of seconds to execute and return the results.
What is a better way to search in the related data? Should I just add a JSON field to the Assignment table and use the MySQL 5.7 Json query features? Or did I made a mistake in designing my database?
You are using select from subquery that forces MySQL to create unindexed temp table for each execution. Remove subquery (you really don't need it here) and it will be much faster.

Query large table with 50 million rows

trying to query a large table (senddb.order_histories) that has close to 50M rows and this is the MySQL query I am using:
FIRST APPROACH- inner join:
select a.id,
a.order_number,
a.sku_id,
a.fulfillment_status,
a.modified_by,
a.created_at,
a.updated_at
from senddb.order_line_items a
inner join (
select order_line_item_id,
order_number,
order_status,
order_status_description,
action,
modified_by,
created_at,
max(updated_at) as updated_at
from senddb.order_histories
where order_status in ('x','y','z')
and fulfillment_location = 'abcd'
group by order_line_item_id) as b
on a.id = b.order_line_item_id
and a.fulfillment_status = '2';
EXPLAIN output :
SECOND APPROACH- nested select:
select a.id,
a.order_number,
a.sku_id,
a.fulfillment_status,
a.modified_by,
a.created_at,
a.updated_at
from senddb.order_line_items a
where a.fulfillment_status = '2'
and a.id in (
select b.order_line_item_id from(
select order_line_item_id,
order_number,
order_status,
order_status_description,
action,
modified_by,
created_at,
max(updated_at) as updated_at
from senddb.order_histories
where
order_status in ('x','y','z')
and fulfillment_location = 'abcd'
group by order_line_item_id) as b);
I believe nested select is a bad approach on large data but i anyhow added it here because it worked on my sample set. Anyway both the queries eventually time out after 600 seconds with the message : Error Code: 2013. Lost connection to MySQL server during query.
I would like to know if there are any ways to alter the query to make it run faster. I have already tried reducing the columns in the inner select / inner join but that should not really be an issue IMO. I also looked up a solution that says "create a clustered index" but i wasn't really able to follow. Any help is appreciated.
TABLE order_histories :
order_histories CREATE TABLE `order_histories` (
`id` int(4) unsigned NOT NULL AUTO_INCREMENT,
`order_number` varchar(24) DEFAULT NULL,
`order_status_description` varchar(255) DEFAULT NULL,
`datetime_stamp` datetime DEFAULT NULL,
`action` varchar(32) DEFAULT NULL,
`fulfillment_location` int(8) DEFAULT NULL,
`order_status` int(8) DEFAULT NULL,
`user_id` int(8) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`modified_by` varchar(32) DEFAULT NULL,
`order_line_item_id` int(11) DEFAULT NULL,
`pooled` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `order_histories_ecash_idx` (`order_number`),
KEY `order_line_item_id` (`order_line_item_id`)
) ENGINE=InnoDB AUTO_INCREMENT=454738178 DEFAULT CHARSET=latin1
TABLE order_line_items :
order_line_items CREATE TABLE `order_line_items` (
`id` int(4) unsigned NOT NULL AUTO_INCREMENT,
`order_number` varchar(24) DEFAULT NULL,
`sku_id` int(8) DEFAULT NULL,
`original_price` float DEFAULT NULL,
`dept_description` varchar(100) DEFAULT NULL,
`description` varchar(100) DEFAULT NULL,
`quantity_ordered` int(8) DEFAULT NULL,
`gift_indicator` char(1) DEFAULT NULL,
`gift_wrap_flag` char(1) DEFAULT NULL,
`shipping_record_flag` char(1) DEFAULT NULL,
`gift_comments` varchar(100) DEFAULT NULL,
`item_status` char(1) DEFAULT NULL,
`tax_amount` float DEFAULT NULL,
`tax_rate` float DEFAULT NULL,
`upc` varchar(20) DEFAULT NULL,
`final_price` float DEFAULT NULL,
`line_number` int(8) DEFAULT NULL,
`master_line_number` int(8) DEFAULT NULL,
`gift_wrap_flag_type` char(1) DEFAULT NULL,
`color_code` varchar(4) DEFAULT NULL,
`size_id` varchar(6) DEFAULT NULL,
`width_id` varchar(6) DEFAULT NULL,
`brand` varchar(15) DEFAULT NULL,
`vpn` varchar(30) DEFAULT NULL,
`dept_number` int(8) DEFAULT NULL,
`class_number` int(8) DEFAULT NULL,
`non_merch_item` char(1) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`modified_by` varchar(32) DEFAULT NULL,
`chain_id` int(11) DEFAULT NULL,
`fulfillment_location` int(11) DEFAULT NULL,
`fulfillment_date` datetime DEFAULT NULL,
`fulfillment_status` int(11) DEFAULT NULL,
`fulfillment_sales_associate` int(11) DEFAULT NULL,
`gift_wrap_line_number` int(11) DEFAULT NULL,
`shipping_type` int(11) DEFAULT NULL,
`order_track_info_id` int(11) DEFAULT NULL,
`store_tlog_updated` varchar(1) DEFAULT NULL,
`shipping_tlx_code` int(11) DEFAULT NULL,
`store_closed` tinyint(1) DEFAULT NULL,
`flags` int(11) DEFAULT NULL,
`deal_based_index` int(11) DEFAULT NULL,
`tlog_calc_ret_price` float DEFAULT NULL,
`tlog_amount` float DEFAULT NULL,
`tlog_retail_price` float DEFAULT NULL,
`tlog_ext_amount` float DEFAULT NULL,
`tlog_flag_1` int(11) DEFAULT NULL,
`tlog_flag_2` int(11) DEFAULT NULL,
`tlog_flag_3` int(11) DEFAULT NULL,
`time_remaining` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `order_line_items_ecash_idx` (`order_number`),
KEY `order_line_item_fulfillment_location_idx` (`fulfillment_location`),
KEY `order_line_item_fulfillment_status_idx` (`fulfillment_status`),
KEY `upc_idx` (`upc`),
KEY `sku_id_idx` (`sku_id`),
KEY `order_line_items_idx001` (`order_number`,`id`,`fulfillment_status`),
KEY `order_track_info_id` (`order_track_info_id`),
KEY `shipping_type_idx` (`shipping_type`,`non_merch_item`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=11367052 DEFAULT CHARSET=latin1
This query can be simplified:
select a.id,
a.order_number,
a.sku_id,
a.fulfillment_status,
a.modified_by,
a.created_at,
a.updated_at
from senddb.order_line_items a
inner join senddb.order_histories b on a.id = b.order_line_item_id
where b.order_status in ('x','y','z')
and b.fulfillment_location = 'abcd'
and a.fulfillment_status = '2';
Since you're only selecting values from table a, you don't need to select specific values from table b and can instead just apply your conditions. Outside of this, you need to ensure that b.order_line_item_id has an index on it. You can find more about creating indexes here. I'm not an expert in MySQL but something similar to this should work if senddb.order_histories.order_line_item_id isn't already the primary key.
CREATE INDEX IX_order_histories_order_line_item_id ON order_histories (order_line_item_id);
You need to read up the optimization section of the MySQL docs. It contains a lot of information on how you can optimize your queries and data sets. The main idea here is to add indexes to the fields that are being used as the criteria in the WHERE clause of the SQL statements.
Basically, both of your alternatives are using a "sub-SELECT, not an INNER JOIN.
The syntax of a true JOIN is one of the following:
SELECT ...
FROM X INNER JOIN Y USING (field_list)
... or ...
SELECT ...
FROM X INNER JOIN Y ON (x.field1 = y.field2) ...
But in both cases the objects being joined are tables or views.
I'm going to presume ... admittedly, without checking ... that Nick Larsen's answer #1 adequately re-expresses your original query using JOINs.
(Notice how, in his answer, the shorthand identifiers A and B are introduced as referring to each of the two table-names mentioned in his query.)
Firstly, you need to decide if a 50 million resultset is what you are asking for. Big data tables are not there so that you can select all their rows. They are there so that you can ask them questions using sql queries. SQL is a query language, it's not a data loading language.
What's your purpose? If you want to copy the data you can do that by loading the data, for example, 1000 rows per query in a for loop. if you are loading the data for processing, you can do that in the same way.
If you want to derive statistical information, you can use outer join and return a low number of rows, using aggregate functions. But you shouldn't do that either, what you "should" do is to decide what you want from the table and preferably, run aggregate functions to store useful information in a different table. (mostly SELECT INTO queries) You should never need to join a table of 50 million records in the first place.
Telling you how to do something wrong using indexes wouldn't be the right thing here.

MySQL Left join take too long for huge data

I have two tables .Property tables and it related photo.One property may have many photo but I want only one any of it related photo, When I use left join MySQL query it become too slow.
Here is my query
SELECT `sp_property`.`id` as propertyid, `sp_property`.`property_name`, `sp_property`.`property_price`, `sp_property`.`adv_type`, `sp_property`.`usd`, `images`.`filepath_name`
FROM (`sp_property`)
LEFT JOIN (select id, Max(property_id) as pid,filepath_name
from sp_property_images
group by property_id) `images`
ON `images`.`pid` = `sp_property`.`id`
WHERE `sp_property`.`published` = 'yes'
GROUP BY `propertyid`
ORDER BY `sp_property`.`feature_listing` desc, `submit_date` desc
LIMIT 1,20
CREATE TABLE IF NOT EXISTS `sp_property_images` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`property_id` varchar(100) NOT NULL,
`filepath_name` text,
`label_name` varchar(45) DEFAULT NULL,
`primary` char(10) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `property_id` (`property_id`),
KEY `id` (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=12941 ;
CREATE TABLE IF NOT EXISTS `sp_property` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`propertytype` varchar(50) NOT NULL,
`adv_type` varchar(45) NOT NULL,
`property_name` text,
`division` varchar(45) NOT NULL,
`township` varchar(45) NOT NULL,
`property_price` decimal(20,2) unsigned DEFAULT NULL,
`price_type` varchar(45) NOT NULL,
`availability` varchar(100) DEFAULT NULL,
`property_address` text,
`p_dimension_length` varchar(45) NOT NULL,
`p_dimension_width` varchar(45) NOT NULL,
`p_dimension_sqft` varchar(45) NOT NULL,
`p_dimension_acre` varchar(45) NOT NULL,
`floor` varchar(45) NOT NULL,
`phone` varchar(100) DEFAULT NULL,
`aircorn` varchar(45) NOT NULL,
`ownership` varchar(45) NOT NULL,
`bedroom` varchar(45) NOT NULL,
`bathroom` varchar(45) NOT NULL,
`special_feature` text,
`amentites` text,
`property_detail` text,
`submit_date` datetime DEFAULT NULL,
`published` varchar(45) NOT NULL,
`published_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`agent_id` varchar(45) NOT NULL,
`source` varchar(45) NOT NULL,
`contact_name` varchar(100) NOT NULL,
`contact_no` varchar(100) NOT NULL,
`contact_address` text NOT NULL,
`contact_email` varchar(100) NOT NULL,
`unit_type` varchar(100) DEFAULT NULL,
`map_lat` varchar(100) DEFAULT NULL,
`map_long` varchar(100) DEFAULT NULL,
`show_map` varchar(3) DEFAULT 'no',
`total_view` bigint(20) NOT NULL DEFAULT '0',
`feature_listing` varchar(10) NOT NULL DEFAULT 'no',
`new_homes_id` int(11) NOT NULL,
`publish_price` int(1) NOT NULL DEFAULT '0',
`usd` decimal(20,2) NOT NULL,
`tag_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=18524 ;
Have you added indices on your tables? You would need three indices on the following columns:
article_photo.a_id for grouping and joining
article_photo.p_id for sorting
article.a_id for joining (although this is hopefully already the PK of your table)
The result of joins is not guaranteed to be sorted in any order, so you probably want to move your ORDER BY clause from the subquery to the outer query:
SELECT * from `article`
LEFT JOIN (
SELECT * from `article_photo`
GROUP BY `a_id`) as images
ON article.a_id = images.a_id
ORDER BY images.p_id DESC
Also, you have no guarantee on which article_photo you will get, since select data without an aggregate function (and only MySQL will allow you to do that).
Now that the question contains the real tables and all information essential to answering, here's my take – first, here's your query:
SELECT `sp_property`.`id` as propertyid, `sp_property`.`property_name`, `sp_property`.`property_price`, `sp_property`.`adv_type`, `sp_property`.`usd`, `images`.`filepath_name`
FROM (`sp_property`)
LEFT JOIN (select id, Max(property_id) as pid,filepath_name
from sp_property_images
group by property_id) `images`
ON `images`.`pid` = `sp_property`.`id`
WHERE `sp_property`.`published` = 'yes'
GROUP BY `propertyid`
ORDER BY `sp_property`.`feature_listing` desc, `submit_date` desc
LIMIT 1,20
Let's see. you are joining sp_property_images.property_id with sp_property.id. These columns have different types (int vs. varchar) and I suppose this results in a severe performance penalty (since the values have to be converted to the same type).
Then, you are filtering by sp_property.published, so I suggest adding an index on this column as well. Also, examine if you really need to have this column as varchar. A bool/bit flag probably suffices as well (if it doesn't, an enum might be a better choice still).
Ordering benefits from an index too. Add an index spanning both columns sp_property.feature_listing and sp_property.submit_date.
If all of the above still doesn't help, you might have to remove the sub-select. It might prevent the mysql engine from recognizing (and using!) the index you have defined on the sp_property_images.property_id column.
This simple query will give you what you asked for, all articles with their photos
SELECT ar.a_id, ar.a_title, ap.p_id, ap.photo_name
FROM article ar
JOIN article_photo ap on ar.a_id = ap.a_id
No reason for left join and grouping there or you wanna get sum on photos by article?

MySQL query issue with combining two tables

I have two tables:
`search_chat` (
`pubchatid` varchar(255) NOT NULL,
`profile` varchar(255) DEFAULT NULL,
`prefs` varchar(255) DEFAULT NULL,
`init` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`session` varchar(255) DEFAULT NULL,
`device` varchar(255) DEFAULT NULL,
`uid` int(10) DEFAULT NULL,
PRIMARY KEY (`pubchatid`)
and
`chats` (
`id` int(10) NOT NULL AUTO_INCREMENT,
`chatlog` varchar(255) DEFAULT NULL,
`block` varchar(2) DEFAULT '',
`whenadded` datetime DEFAULT NULL,
`pubchatid1` varchar(255) DEFAULT NULL,
`pubchatid2` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
So basically people chat with each other through a search system based on prefrences. The further they are apart, the worse it is. So the query I have is simple:
SELECT *
FROM search_chat
WHERE levenshtein(profile, "[user_prefs]") < 20
AND pubchatid <> "[user_pubchatid]"
ORDER BY
levenshtein(profile, "[user_prefs]")
LIMIT 1
It is a shitty query in itself, but it does the job (everything between "[]" is a variable I put in, just to make it clear).
As you can see this query only makes a selection between two peoples preferences (prefs) and how they are (profile). So far so good.
I have been bugging around some to make this query also check if they have had previous chats. That is where "chats" comes in. I can not get the query to check for a proper user and see if they have an open chat.
In chats, the "search_chat.pubchatid" can be either "chats.pubchatid1" or "chats.pubchatid2"
So somehow I have got to make these two work, making chats rule out options in search_chat.
Do you want something like this:
-- ... ( start of query as per your question )
and not exists (
select *
from chats
where ( ( chats.pubchatid1 = search_chat.pubchatid )
or ( chats.pubchatid2 = search_chat.pubchatid ) )
and -- ... add any restriction on how recent the chat was
)