Which column(s) to index in MySQL - mysql

I'm trying to optimize the following table, according to phpMyAdmin several stats regarding Table Scans are high and indices do not exist or are not being used. (Handler read rnd next 5.7 M)
1.
$query = "
SELECT * FROM apps_discrep
WHERE discrep_station = '$station'
AND discrep_date = '$date'
ORDER BY discrep_timestart";
2.
$query = "
SELECT * FROM apps_discrep
WHERE discrep_date BETWEEN '$keyword' AND '$keyword3'
AND (discrep_station like '$keyword2%') ORDER BY discrep_date";
Would it be correct to Index discrep_station, discrep_date, and discrep_timestart?
There currently only exist the Primary Unique Index on the auto-increment ID.
-- Table structure
`index` int(11) NOT NULL AUTO_INCREMENT,
discrep_station varchar(5) NOT NULL,
discrep_timestart time NOT NULL,
discrep_timestop time NOT NULL,
discrep_date date NOT NULL,
discrep_datetime timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
discrep_show varchar(31) NOT NULL,
discrep_text text NOT NULL,
discrep_by varchar(11) NOT NULL,
discrep_opr varchar(11) NOT NULL,
email_traffic varchar(3) NOT NULL,
email_techs varchar(3) NOT NULL,
email_promos varchar(3) NOT NULL,
email_spots varchar(3) NOT NULL,
eas_row varchar(11) NOT NULL,
PRIMARY KEY (`index`)
ENGINE=MyISAM DEFAULT CHARSET=utf8;

It looks to me like you can get both queries with the same BTREE index, since that allows you to use the left-most tuples as a separate index.
Consider this MySQL doc page as a reference.
ALTER TABLE xxx ADD KEY `key1` (`discrep_station`, `discrep_date`, `discrep_timestart`) USING BTREE;
Your first query will use all 3 fields in the index. The second query will only use the first 2 fields in the index.

Related

Improving the performance of a MYSQL query with a one-to-many relationship

I have a query in my DB that is taking 25 seconds to return results, which is way too long. It seems like it should be pretty simple. Two tables; the main table (document) is a standard table with some data columns, the join table is a mapping table with only two columns (parent_id, division_id). Previously there wasn't an index on the mapping table so I added one and that changed the "explain" to include the index but doesn't seem to have had an impact on the performance.
The query looks like this:
explain SELECT DISTINCT doc.*
FROM document doc
LEFT JOIN multi_division_mapper divisions ON doc.id = divisions.parent_id
WHERE doc.clientId = 'SOME_GUID'
AND (divisions.division_id IS NULL OR divisions.division_id IN ('SOME_GUID'));
and the results of explain are:
Total number of rows in document: 6720
Total number of rows in mapper: 6173
From what I've been able to gather I need to improve either the "type" or the "extra" to make the query faster. What can I do here?
Create table statements:
CREATE TABLE `document` (
`id` varchar(36) NOT NULL,
`addedBy` varchar(255) DEFAULT NULL,
`addedDate` datetime NOT NULL,
`editedBy` varchar(255) DEFAULT NULL,
`editedDate` datetime NOT NULL,
`deleted` bit(1) DEFAULT NULL,
`clientId` varchar(36) NOT NULL,
`departmentId` varchar(36) DEFAULT NULL,
`documentParentId` varchar(36) DEFAULT NULL,
`documentParent` varchar(50) DEFAULT NULL,
`fileId` varchar(255) DEFAULT NULL,
`fileUrl` varchar(600) DEFAULT NULL,
`documentName` varchar(500) NOT NULL,
`displayName` varchar(255) NOT NULL,
`documentId` varchar(45) DEFAULT NULL,
`notes` varchar(1000) DEFAULT NULL,
`visibility` varchar(45) NOT NULL DEFAULT 'PRIVATE',
`documentType` varchar(45) NOT NULL,
`restrictDelete` bit(1) NOT NULL,
`customData` text,
`releaseDate` datetime NOT NULL,
`expirationDate` datetime NOT NULL,
`isApproved` bit(1) NOT NULL DEFAULT b'0',
`userSupplier` varchar(36) DEFAULT NULL,
`complianceCertificateId` varchar(36) DEFAULT NULL,
`Status` varchar(50) DEFAULT 'NEUTRAL',
PRIMARY KEY (`id`),
KEY `idx_client` (`clientId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `multi_division_mapper` (
`parent_id` varchar(36) NOT NULL,
`division_id` varchar(36) NOT NULL,
PRIMARY KEY (`parent_id`,`division_id`),
KEY `idx_parent` (`parent_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
I was able to get a more favorable EXPLAIN report in a test by creating the following index:
ALTER TABLE multi_division_mapper
DROP INDEX idx_parent,
ADD INDEX (division_id, parent_id);
I also dropped idx_parent because it's redundant; it's a prefix of the primary key.
id
select_type
table
partitions
type
possible_keys
key
key_len
ref
rows
filtered
Extra
1
SIMPLE
doc
NULL
ref
idx_client
idx_client
110
const
1
100.00
Using temporary
1
SIMPLE
divisions
NULL
ref
PRIMARY,division_id
division_id
38
const
1
100.00
Using where; Using index; Distinct
The type: ref is better than type: index.
The query I tested is slightly different, but I believe it returns the same result:
SELECT DISTINCT doc.*
FROM document doc
LEFT JOIN multi_division_mapper divisions
ON doc.id = divisions.parent_id AND divisions.division_id in ('SOME_GUID')
WHERE doc.clientId = 'SOME_GUID'

Mysql: Selecting id, then * on id is much faster than selecting *. Why?

I have a MySQL database table (roughly 100K rows):
id BIGINT(indexed), external_barcode VARCHAR(indexed), other simple columns, and a LongText column.
The LongText column is a JSON data dump. I save the large JSON objects because I will need to extract more of the data in the future.
When I run this query it takes 29+ seconds:
SELECT * FROM scraper_data WHERE external_barcode = '032429257284'
EXPLAIN
#id select_type table partitions type possible_keys key key_len ref rows filtered Extra
'1' 'SIMPLE' 'scraper_data' NULL 'ALL' NULL NULL NULL NULL '119902' '0.00' 'Using where'
This more complex query takes 0.00 seconds:
SELECT * FROM scraper_data WHERE id = (
SELECT id FROM scraper_data WHERE external_barcode = '032429257284'
)
EXPLAIN
# id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, Extra
'1', 'PRIMARY', 'scraper_data', NULL, 'const', 'PRIMARY,id_UNIQUE', 'PRIMARY', '8', 'const', '1', '100.00', NULL
'2', 'SUBQUERY', 'scraper_data', NULL, 'ALL', NULL, NULL, NULL, NULL, '119902', '0.00', 'Using where'
Less than 6 rows are returned from these queries. Why is the LONGTEXT slowing down the first query given that its not being referenced in the where clause?
CREATE TABLE
CREATE TABLE `scraper_data` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`bzic` varchar(10) NOT NULL,
`pzic` varchar(10) DEFAULT NULL,
`internal_barcode` varchar(20) DEFAULT NULL,
`external_barcode_type` enum('upc','isbn','ean','gtin') DEFAULT NULL,
`external_barcode` varchar(15) DEFAULT NULL,
`url` varchar(255) NOT NULL,
`title` varchar(255) DEFAULT NULL,
`category` varchar(3) DEFAULT NULL,
`description` text,
`logo_image_url` varchar(255) DEFAULT NULL,
`variant_image_urls` text,
`parent_brand` varchar(10) DEFAULT NULL,
`parent_brand_name` varchar(255) DEFAULT NULL,
`manufacturer` varchar(10) DEFAULT NULL,
`manufacturer_name` varchar(255) DEFAULT NULL,
`manufacturer_part_number` varchar(255) DEFAULT NULL,
`manufacturer_model_number` varchar(255) DEFAULT NULL,
`contributors` text,
`content_info` text,
`content_rating` text,
`release_date` timestamp NULL DEFAULT NULL,
`reviews` int(11) DEFAULT NULL,
`ratings` int(11) DEFAULT NULL,
`internal_path` varchar(255) DEFAULT NULL,
`price` int(11) DEFAULT NULL,
`adult_product` tinyint(4) DEFAULT NULL,
`height` varchar(255) DEFAULT NULL,
`length` varchar(255) DEFAULT NULL,
`width` varchar(255) DEFAULT NULL,
`weight` varchar(255) DEFAULT NULL,
`scraped` tinyint(4) NOT NULL DEFAULT '0',
`scraped_timestamp` timestamp NULL DEFAULT NULL,
`scrape_attempt_timestamp` timestamp NULL DEFAULT NULL,
`processed` tinyint(4) NOT NULL DEFAULT '0',
`processed_timestamp` timestamp NULL DEFAULT NULL,
`modified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`scrape_dump` longtext,
PRIMARY KEY (`id`),
UNIQUE KEY `id_UNIQUE` (`id`),
UNIQUE KEY `url_UNIQUE` (`url`),
UNIQUE KEY `internal_barcode_UNIQUE` (`internal_barcode`),
KEY `bzic` (`bzic`),
KEY `pzic` (`pzic`),
KEY `internal_barcode` (`internal_barcode`),
KEY `external_barcode` (`external_barcode`,`external_barcode_type`) /*!80000 INVISIBLE */,
KEY `scrape_attempt` (`bzic`,`scraped`,`scrape_attempt_timestamp`)
) ENGINE=InnoDB AUTO_INCREMENT=121674 DEFAULT CHARSET=latin1;
The second query could benefit from the cache that already contains the result of the first query.
In addition in the second subquery you just select use two column (id, external_barcode ) in these two column are in a index all the query result is obtained only with the index scan while in the first query for retrieve all the data the query must scan all the tables row ..
For avoiding the long time for the first query, you should add a proper index on external_barcode column
create index my_idx on scraper_data (external_barcode, id)
Your queries are not equivalent, and your second query will throw an error if you have more than one row with that barcode:
Error Code: 1242. Subquery returns more than 1 row
This is probably what happens here: you do not actually get a result, just an error. Since MySQL can stop the full table scan as soon as it finds a second row, you can get this error faster than a correct result, including "0.00s" if those rows are among the first rows that are scanned (for example in ids 1 and 2).
From the execution plan, you can see that both do a full table scan (which, up to current versions, includes reading the blob field), and thus should perform similarly fast (as the first entry in your 2nd explain plan is neglectable for only a few rows).
So with a barcode that doesn't throw an error, both of your queries, as well as the corrected 2nd query (where you use IN instead of =),
SELECT * FROM scraper_data WHERE id IN ( -- IN instead of = !!
SELECT id FROM scraper_data WHERE external_barcode = '032429257284'
)
as well as running your subquery
SELECT id FROM scraper_data WHERE external_barcode = '032429257284'
separately (which, if your assumption is correct, have to be even faster than your 2nd query) will have a similar (long) execution time.
As scaisEdge mentioned in his answer, an index on external_barcode will improve the performance significantly, as you do not not need to do a full table scan, as well as you do not need to read the blob field. You actually have such an index, but you disabled it (invisible). You can simply reenable it by using
ALTER TABLE scraper_data ALTER INDEX `external_barcode` VISIBLE;

Mysql Partitioning Query Performance

i have created partitions on pricing table. below is the alter statement.
ALTER TABLE `price_tbl`
PARTITION BY HASH(man_code)
PARTITIONS 87;
one partition consists of 435510 records. total records in price_tbl is 6 million.
EXPLAIN query showing only one partion is used for the query . Still the query takes 3-4 sec to execute. below is the query
EXPLAIN SELECT vrimg.image_cap_id,vm.man_name,vr.range_code,vr.range_name,vr.range_url, MIN(`finance_rental`) AS from_price, vd.der_id AS vehicle_id FROM `range_tbl` vr
LEFT JOIN `image_tbl` vrimg ON vr.man_code = vrimg.man_code AND vr.type_id = vrimg.type_id AND vr.range_code = vrimg.range_code
LEFT JOIN `manufacturer_tbl` vm ON vr.man_code = vm.man_code AND vr.type_id = vm.type_id
LEFT JOIN `derivative_tbl` vd ON vd.man_code=vm.man_code AND vd.type_id = vr.type_id AND vd.range_code=vr.range_code
LEFT JOIN `price_tbl` vp ON vp.vehicle_id = vd.der_id AND vd.type_id = vp.type_id AND vp.product_type_id=1 AND vp.maintenance_flag='N' AND vp.man_code=164
AND vp.initial_rentals_id =(SELECT rental_id FROM `rentals_tbl` WHERE rental_months='9')
AND vp.annual_mileage_id =(SELECT annual_mileage_id FROM `mileage_tbl` WHERE annual_mileage='8000')
WHERE vr.type_id = 1 AND vm.man_url = 'audi' AND vd.type_id IS NOT NULL GROUP BY vd.der_id
Result of EXPLAIN.
Same query without partitioning takes 3-4 sec.
Query with partitioning takes 2-3 sec.
how we can increase query performance as it is too slow yet.
attached create table structure.
price table - This consists 6 million records
CREATE TABLE `price_tbl` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`lender_id` bigint(20) DEFAULT NULL,
`type_id` bigint(20) NOT NULL,
`man_code` bigint(20) NOT NULL,
`vehicle_id` bigint(20) DEFAULT NULL,
`product_type_id` bigint(20) DEFAULT NULL,
`initial_rentals_id` bigint(20) DEFAULT NULL,
`term_id` bigint(20) DEFAULT NULL,
`annual_mileage_id` bigint(20) DEFAULT NULL,
`ref` varchar(255) DEFAULT NULL,
`maintenance_flag` enum('Y','N') DEFAULT NULL,
`finance_rental` decimal(20,2) DEFAULT NULL,
`monthly_rental` decimal(20,2) DEFAULT NULL,
`maintenance_payment` decimal(20,2) DEFAULT NULL,
`initial_payment` decimal(20,2) DEFAULT NULL,
`doc_fee` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`,`type_id`,`man_code`),
KEY `type_id` (`type_id`),
KEY `vehicle_id` (`vehicle_id`),
KEY `term_id` (`term_id`),
KEY `product_type_id` (`product_type_id`),
KEY `finance_rental` (`finance_rental`),
KEY `type_id_2` (`type_id`,`vehicle_id`),
KEY `maintenanace_idx` (`maintenance_flag`),
KEY `lender_idx` (`lender_id`),
KEY `initial_idx` (`initial_rentals_id`),
KEY `man_code_idx` (`man_code`)
) ENGINE=InnoDB AUTO_INCREMENT=5830708 DEFAULT CHARSET=latin1
/*!50100 PARTITION BY HASH (man_code)
PARTITIONS 87 */
derivative table - This consists 18k records.
CREATE TABLE `derivative_tbl` (
`type_id` bigint(20) DEFAULT NULL,
`der_cap_code` varchar(20) DEFAULT NULL,
`der_id` bigint(20) DEFAULT NULL,
`body_style_id` bigint(20) DEFAULT NULL,
`fuel_type_id` bigint(20) DEFAULT NULL,
`trans_id` bigint(20) DEFAULT NULL,
`man_code` bigint(20) DEFAULT NULL,
`range_code` bigint(20) DEFAULT NULL,
`model_code` bigint(20) DEFAULT NULL,
`der_name` varchar(255) DEFAULT NULL,
`der_url` varchar(255) DEFAULT NULL,
`der_intro_year` date DEFAULT NULL,
`der_disc_year` date DEFAULT NULL,
`der_last_spec_date` date DEFAULT NULL,
KEY `der_id` (`der_id`),
KEY `type_id` (`type_id`),
KEY `man_code` (`man_code`),
KEY `range_code` (`range_code`),
KEY `model_code` (`model_code`),
KEY `body_idx` (`body_style_id`),
KEY `capcodeidx` (`der_cap_code`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
range table - This consists 1k records
CREATE TABLE `range_tbl` (
`type_id` bigint(20) DEFAULT NULL,
`man_code` bigint(20) DEFAULT NULL,
`range_code` bigint(20) DEFAULT NULL,
`range_name` varchar(255) DEFAULT NULL,
`range_url` varchar(255) DEFAULT NULL,
KEY `range_code` (`range_code`),
KEY `type_id` (`type_id`),
KEY `man_code` (`man_code`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY HASH is essentially useless if you are hoping for improved performance. BY RANGE is useful in a few use cases_.
In most situations, improvements in indexes are as good as trying to use partitioning.
Some likely problems:
No explicit PRIMARY KEY for InnoDB tables. Add a natural PK, if applicable, else an AUTO_INCREMENT.
No "composite" indexes -- they often provide a performance boost. Example: The LEFT JOIN between vr and vrimg involves 3 columns; a composite index on those 3 columns in the 'right' table will probably help performance.
Blind use of BIGINT when smaller datatypes would work. (This is an I/O issue when the table is big.)
Blind use of 255 in VARCHAR.
Consider whether most of the columns should be NOT NULL.
That query may be a victim of the "explode-implode" syndrome. This is where you do JOIN(s), which create a big intermediate table, followed by a GROUP BY to bring the row-count back down.
Don't use LEFT unless the 'right' table really is optional. (I see LEFT JOIN vd ... vd.type_id IS NOT NULL.)
Don't normalize "continuous" values (annual_mileage and rental_months). It is not really beneficial for "=" tests, and it severely hurts performance for "range" tests.
Same query without partitioning takes 3-4 sec. Query with partitioning takes 2-3 sec.
The indexes almost always need changing when switching between partitioning and non-partitioning. With the optimal indexes for each case, I predict that performance will be close to the same.
Indexes
These should help performance whether or not it is partitioned:
vm: (man_url)
vr: (man_code, type_id) -- either order
vd: (man_code, type_id, range_code, der_id)
-- `der_id` 4th, else in any order (covering)
vrimg: (man_code, type_id, range_code, image_cap_id)
-- `image_cap_id` 4th, else in any order (covering)
vp: (type_id, der_id, product_type_id, maintenance_flag,
initial_rentals, annual_mileage, man_code)
-- any order (covering)
A "covering" index is an extra boost, in that it can do all the work just in the index's BTree, without touching the data's BTree.
Implement a bunch of what I recommend, then come back (in another Question) for further tweaking.
Usually the "partition key" should be last in a composite index.

Slow search query with a one to many join

My problem is a slow search query with a one-to-many relationship between the tables. My tables look like this.
Table Assignment
CREATE TABLE `Assignment` (
`Id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`ProjectId` int(10) unsigned NOT NULL,
`AssignmentTypeId` smallint(5) unsigned NOT NULL,
`AssignmentNumber` varchar(30) NOT NULL,
`AssignmentNumberExternal` varchar(50) DEFAULT NULL,
`DateStart` datetime DEFAULT NULL,
`DateEnd` datetime DEFAULT NULL,
`DateDeadline` datetime DEFAULT NULL,
`DateCreated` datetime DEFAULT NULL,
`Deleted` datetime DEFAULT NULL,
`Lat` double DEFAULT NULL,
`Lon` double DEFAULT NULL,
PRIMARY KEY (`Id`),
KEY `idx_assignment_assignment_type_id` (`AssignmentTypeId`),
KEY `idx_assignment_assignment_number` (`AssignmentNumber`),
KEY `idx_assignment_assignment_number_external`
(`AssignmentNumberExternal`)
) ENGINE=InnoDB AUTO_INCREMENT=5280 DEFAULT CHARSET=utf8;
Table ExtraFields
CREATE TABLE `ExtraFields` (
`assignment_id` int(10) unsigned NOT NULL,
`name` varchar(30) NOT NULL,
`value` text,
PRIMARY KEY (`assignment_id`,`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
My search query
SELECT
`Assignment`.`Id`, COL_5_72, COL_5_73, COL_5_74, COL_5_75, COL_5_76,
COL_5_77 FROM (
SELECT
`Assignment`.`Id`,
`Assignment`.`AssignmentNumber` AS COL_5_72,
`Assignment`.`AssignmentNumberExternal` AS COL_5_73 ,
`AssignmentType`.`Name` AS COL_5_74,
`Assignment`.`DateStart` AS COL_5_75,
`Assignment`.`DateEnd` AS COL_5_76,
`Assignment`.`DateDeadline` AS COL_5_77 FROM `Assignment`
CASE WHEN `ExtraField`.`Name` = "WorkDistrict" THEN
`ExtraField`.`Value` end as COL_5_78 FROM `Assignment`
LEFT JOIN `ExtraFields` as `ExtraField` on
`ExtraField`.`assignment_id` = `Assignment`.`Id`
WHERE `Assignment`.`Deleted` IS NULL -- Assignment should not be removed.
AND (1=1) -- Add assignment filters.
) AS q1
GROUP BY `Assignment`.`Id`
HAVING 1 = 1
AND COL_5_78 LIKE '%Amsterdam East%'
ORDER BY COL_5_72 ASC, COL_5_73 ASC;
When the table is only around 3500 records my query takes a couple of seconds to execute and return the results.
What is a better way to search in the related data? Should I just add a JSON field to the Assignment table and use the MySQL 5.7 Json query features? Or did I made a mistake in designing my database?
You are using select from subquery that forces MySQL to create unindexed temp table for each execution. Remove subquery (you really don't need it here) and it will be much faster.

Avoid UNION for two almost identical tables in MySQL

I'm not very good at MySQL and i'm going to write a query to count messages sent by an user, based on its type and is_auto field.
Messages can be of type "small text message" or "newsletter". I created two entities with a few fields that differs between them. The important one is messages_count that is absent in table newsletter and it's used in the query:
CREATE TABLE IF NOT EXISTS `small_text_message` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`messages_count` int(11) NOT NULL,
`username` varchar(255) NOT NULL,
`method` varchar(255) NOT NULL,
`content` longtext,
`sent_at` datetime DEFAULT NULL,
`status` varchar(255) NOT NULL,
`recipients_count` int(11) NOT NULL,
`customers_count` int(11) NOT NULL,
`sheduled_at` datetime DEFAULT NULL,
`sheduled_for` datetime DEFAULT NULL,
`is_auto` tinyint(1) NOT NULL,
`user_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
And:
CREATE TABLE `newsletter` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`subject` varchar(78) DEFAULT NULL,
`content` longtext,
`sent_at` datetime DEFAULT NULL,
`status` varchar(255) NOT NULL,
`recipients_count` int(11) NOT NULL,
`customers_count` int(11) NOT NULL,
`sheduled_at` datetime DEFAULT NULL,
`sheduled_for` datetime DEFAULT NULL,
`is_auto` tinyint(1) NOT NULL,
`user_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
I ended up with a UNION query. Can this query be shortened or optimized since the only difference is messages_count that should be always 1 for newsletter?
SELECT
CONCAT('sms_', IF(is_auto = 0, 'user' , 'auto')) AS subtype,
SUM(messages_count * (customers_count + recipients_count)) AS count
FROM small_text_message WHERE status <> 'pending' AND user_id = 1
GROUP BY is_auto
UNION
SELECT
CONCAT('newsletter_', IF(is_auto = 0, 'user' , 'auto')) AS subtype,
SUM(customers_count + recipients_count) AS count
FROM newsletter WHERE status <> 'pending' AND user_id = 1
GROUP BY is_auto
I don't see any easy way to avoid a UNION (or UNION ALL) operation, that will return the specified result set.
I would recommend you use a UNION ALL operator in place of the UNION operator. Then the execution plan will not include the step that eliminates duplicate rows. (You already have GROUP BY operations on each query, and there is no way that those two queries can produce an identical row.)
Otherwise, your query looks fine just as it is written.
(It's always a good thing to consider the question, might there be a better way? To get the result set you are asking for, from the schema you have, your query looks about as good as it's going to get.)
If you are looking for more general DB advice, I recommend restructuring the tables to factor the common elements into one table, perhaps called outbound_communication or something, with all of your common fields, then perhaps have "sub tables" for the specific types to host the fields which are unique to that type. It does mean a simple JOIN is necessary to select all of the fields you want, but the again, it's normalized and actually makes situations like this one easier (one table holds all of the entities of interest). Additionally, you have the option of writing that JOIN just once as a "view", and then your existing code would not even need to change to see the two tables as if they never changed.
CREATE TABLE IF NOT EXISTS `outbound_communicaton` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`content` longtext,
`sent_at` datetime DEFAULT NULL,
`status` varchar(255) NOT NULL,
`recipients_count` int(11) NOT NULL,
`customers_count` int(11) NOT NULL,
`sheduled_at` datetime DEFAULT NULL,
`sheduled_for` datetime DEFAULT NULL,
`is_auto` tinyint(1) NOT NULL,
`user_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
CREATE TABLE `small_text_message` (
`oubound_communication_id` int(11) NOT NULL,
`messages_count` int(11) NOT NULL,
`username` varchar(255) NOT NULL,
`method` varchar(255) NOT NULL,
PRIMARY KEY (`outbound_communication_id`),
FOREIGN KEY (outbound_communication_id)
REFERENCES outbound_communicaton(id)
) ENGINE=InnoDB;
CREATE TABLE `newsletter` (
`oubound_communication_id` int(11) NOT NULL,
`subject` varchar(78) DEFAULT NULL,
PRIMARY KEY (`outbound_communication_id`),
FOREIGN KEY (outbound_communication_id)
REFERENCES outbound_communicaton(id)
) ENGINE=InnoDB;
Then selecting a text msg is like this:
SELECT *
FROM outbound_communication AS parent
JOIN small_text_message
ON parent.id = small_text_message.outbound_communication_id
WHERE parent.id = 1234;
The nature of the query is inherently the union of the data from the small text message and the newsletter tables, so the UNION query is the only realistic formulation. There's no join of relevance between the two tables, for example.
So, I think you're very much on the right lines with your query.
Why are you worried about a UNION?