Optimizing a query for optional fields from another table - mysql

I have an InnoDB table called items that powers an ecommerce site. The search system lets you filter on optional/additional fields, so that you can, for example, search only for repaired computers or only for cars older than 2000.
This is done via an additional table called items_fields.
It has a very simple design:
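+------------+------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+------------------------+------+-----+---------+----------------+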
| id | int(11) | NO | PRI | NULL | auto_increment |
| field_id | int(11) | NO | MUL | NULL | |
| item_id | int(11) | NO | MUL | NULL | |
| valueText | varchar(500) | YES | | NULL | |
| valueInt | decimal(10,1) unsigned | YES | | NULL | |
+------------+------------------------+------+-----+---------+----------------+
There is also a table called fields which contains only field names and types.
The main query, which returns search results, is the following:
SELECT items...
FROM items
WHERE items... AND (
SELECT count(id)
FROM items_fields
WHERE items_fields.field_id = "59" AND items_fields.item_id = items.id AND
items_fields.valueText = "Damaged")>0
ORDER by ordering desc LIMIT 35;
At large scale (4 million+ queries per day from search alone), I need to optimize this advanced search even further. Currently, the average advanced search query takes around 100 ms.
How can I speed up this query? Do you have any other suggestions or advice for optimization? Both tables are InnoDB and the server stack is excellent, but I still have this query to solve :)

Add an index on (item_id, field_id, valueText), since these are the columns your search uses.
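A minimal sketch of that index follows. The index name idx_item_field_value is hypothetical, and the 100-character prefix on valueText is an assumption: a full index on a varchar(500) column can exceed InnoDB's index key length limit on older versions or with multi-byte character sets, so adjust or drop the prefix to suit your setup.

```sql
-- Hypothetical index name; the valueText prefix length (100) is an assumption.
ALTER TABLE items_fields
  ADD INDEX idx_item_field_value (item_id, field_id, valueText(100));

-- Check whether the per-item lookup now uses the new index.
EXPLAIN
SELECT COUNT(id)
FROM items_fields
WHERE field_id = 59
  AND item_id = 12345          -- hypothetical item id, for illustration only
  AND valueText = 'Damaged';
```

Comparing the EXPLAIN output before and after adding the index is the simplest way to confirm it is actually picked up.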
Get rid of the inner select! MySQL up to 5.5 cannot optimize queries with inner selects well. As far as I know, MariaDB 5.5 is the only MySQL replacement that currently supports inner select optimization.
SELECT i.*, f2.* FROM items i
JOIN items_fields f ON f.field_id = 59
AND f.item_id = i.id
AND f.valueText = "Damaged"
JOIN items_fields f2 ON f2.item_id = i.id
ORDER BY i.ordering DESC
LIMIT 35;
The first join limits the set being returned. The second join fetches all items_fields rows for the items that pass the first join. Between the first and last joins, you can add more join conditions that filter the results on additional fields. For example:
SELECT i.*, f3.* FROM items i
JOIN items_fields f ON f.field_id = 59
AND f.item_id = i.id
AND f.valueText = "Damaged"
JOIN items_fields f2 ON f2.field_id = 22
AND f2.item_id = i.id
AND f2.valueText = "Green"
JOIN items_fields f3 ON f3.item_id = i.id
ORDER BY i.ordering DESC
LIMIT 35;
This would return all items that have field 59 with the value "Damaged" and field 22 with the value "Green", along with all of their items_fields rows.

Related

How to use GROUP BY which takes into account two columns?

I have a message table like this in MySQL.
+--------------------+--------------+------+-----+---------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+--------------+------+-----+---------------------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| subject | varchar(120) | NO | | NULL | |
| body | longtext | NO | | NULL | |
| sent_at | datetime | YES | | NULL | |
| recipient_read | tinyint(1) | NO | | 0 | |
| recipient_id | int(11) | NO | MUL | 0 | |
| sender_id | int(11) | NO | MUL | 0 | |
| thread_id | int(11) | NO | MUL | 0 | |
+--------------------+--------------+------+-----+---------------------+----------------+
Messages in a recipient's inbox are to be grouped by thread_id like this:
SELECT * FROM message WHERE recipient_id=42 GROUP BY thread_id ORDER BY sent_at DESC
How can I take recipient_read into account so that each row in the result also shows the recipient_read value of the last message in the thread?
In the original query, the ORDER BY is only satisfied after the GROUP BY operation. The ORDER BY affects the order of the returned rows. It does not influence which rows are returned.
With the non-aggregate expression in the SELECT list, it is indeterminate which values will be returned; the value of each column will be from some row in the collapsed group. But it's not guaranteed to be the first row, or the latest row, or any other specific row. The behavior of MySQL (allowing the query to run without throwing an error) is enabled by a MySQL extension.
Other relational databases would throw a "non-aggregate in SELECT list not in GROUP BY" type error with the query. MySQL exhibits a similar (standard) behavior when ONLY_FULL_GROUP_BY is included in sql_mode system variable. MySQL allows the original query to run (and return unexpected results) because of a non-standard, MySQL-specific extension.
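To see the standard behavior for yourself, one option is to enable ONLY_FULL_GROUP_BY for the current session and re-run the original query. This is only a sketch; note that assigning sql_mode this way replaces any other modes set for the session.

```sql
-- Replace the session's sql_mode with ONLY_FULL_GROUP_BY
-- (any other modes are dropped, for this session only).
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- With the mode enabled, MySQL should reject this statement instead of
-- returning values from an arbitrary row in each group.
SELECT * FROM message WHERE recipient_id = 42 GROUP BY thread_id ORDER BY sent_at DESC;
```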
The pattern of the original query is essentially broken.
To get a resultset that satisfies the specification, we can write a query to get the latest (maximum) sent_at datetime for each thread_id, for a given set of recipient_id (in the example query, the set is a single recipient_id.)
SELECT lm.recipient_id
, lm.thread_id
, MAX(lm.sent_at) AS latest_sent_at
FROM message lm
WHERE lm.recipient_id = 42
GROUP
BY lm.recipient_id
, lm.thread_id
We can use the result from that query in another query by making it an inline view (wrap it in parens, reference it in the FROM clause like a table, and assign it an alias).
We can join that resultset to the original table to retrieve all of the columns from the rows that match.
Something like this:
SELECT m.id
, m.subject
, m.body
, m.sent_at
, m.recipient_read
, m.recipient_id
, m.sender_id
, m.thread_id
FROM (
SELECT lm.recipient_id
, lm.thread_id
, MAX(lm.sent_at) AS latest_sent_at
FROM message lm
WHERE lm.recipient_id = 42
GROUP
BY lm.recipient_id
, lm.thread_id
) l
JOIN message m
ON m.recipient_id = l.recipient_id
AND m.thread_id = l.thread_id
AND m.sent_at = l.latest_sent_at
ORDER
BY ...
Note that if (recipient_id,thread_id,sent_at) is not guaranteed to be unique, there is a potential that there will be multiple rows with the same "maximum" sent_at; that is, we could get more than one row back for a given maximum sent_at.
We can order that result however we want, with whatever expressions. That will affect only the order that the rows are returned in, not which rows are returned.
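To check whether the ties mentioned above actually exist in the data, one option is to count rows per (recipient_id, thread_id, sent_at). This is a sketch; any combination it returns could produce duplicate rows in the join above if it happens to be the latest sent_at for its thread.

```sql
-- List (recipient_id, thread_id, sent_at) combinations that occur more than once.
SELECT m.recipient_id
     , m.thread_id
     , m.sent_at
     , COUNT(*) AS cnt
  FROM message m
 WHERE m.recipient_id = 42
 GROUP
    BY m.recipient_id
     , m.thread_id
     , m.sent_at
HAVING COUNT(*) > 1;
```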
If you want the last message, you want filtering, not aggregation:
SELECT m.*
FROM message m
WHERE m.recipient_id = 42 AND
m.sent_at = (SELECT MAX(m2.sent_at)
FROM message m2
WHERE m2.thread_id = m.thread_id
AND m2.recipient_id = m.recipient_id
)
ORDER BY m.sent_at DESC;

Fastest way to order by having true result on a left join in MYSQL

I am trying to set up something where data is being matched on two different tables. The results would be ordered by some data being true on the second table. However, not everyone in the first table is in the second table. My problem is twofold. 1) Speed. My current MYSQL query takes 4 seconds to go through several thousand results on each table. 2) Not ordering correctly. I need it to order the results by who is online, but still be alphabetical. As it stands now it orders everyone by whether or not they are online according to chathelp table, then fills in the rest with the users table.
What I have:
SELECT u.name, u.id, u.url, c.online
FROM users AS u
LEFT JOIN livechat AS c ON u.url = CONCAT('http://www.software.com/', c.chat_handle)
WHERE u.live_account = 'y'
ORDER BY c.online DESC, u.name ASC
LIMIT 0, 24
users
+-----------------------------------------------------------+--------------+
| id | name | url | live_account |
+-----------------------------------------------------------+--------------|
| 1 | Lisa Fuller | http://www.software.com/LisaHelpLady | y |
| 2 | Eric Reiner | | y |
| 3 | Tom Lansen | http://www.software.com/SaveUTom | y |
| 4 | Billy Bob | http://www.software.com/BillyBob | n |
+-----------------------------------------------------------+--------------+
chathelp
+------------------------------------+
| chat_id | chat_handle | online |
+------------------------------------+
| 12 | LisaHelpLady | 1 |
| 34 | BillyBob | 0 |
| 87 | SaveUTom | 0 |
+------------------------------------+
What I would like the data I receive to look like:
+----------------------------------------------------------------------+
| name | id | url | online |
+----------------------------------------------------------------------+
| Lisa Fuller | 1 | http://www.software.com/LisaHelpLady | 1 |
| Eric Reiner | 2 | | 0 |
| Tom Lansen | 3 | http://www.software.com/SaveUTom | 0 |
+----------------------------------------------------------------------+
Explanation: Billy is excluded right off the bat for not having a live account. Lisa comes before Eric because she is online. Tom comes after Eric because he is offline and alphabetically later in the data. The only matching data between the two tables is a portion of the url column with the chat_handle column.
What I am getting instead:
(basically, I am getting Lisa, Tom, then Eric)
I am getting everybody in the chathelp table listed first, whether or not they are online. So 600 people come first, and then I get the remaining people from the users table who aren't in both tables. I need people who are offline in the chathelp table to be sorted alphabetically together with the rest of the users table. So if Lisa and Tom were the only users online, they would come first, but everyone else from the users table, regardless of whether they set up a chathelp handle, would come alphabetically after those two users.
Again, I need to sort them and figure out how to do this in less than 4 seconds. I have tried indexes on both tables, but they don't help. Explain says it is using a key (name) on table users hitting rows 4771 -> Using where;Using temporary; Using filesort and on table2 NULL for key with 1054 rows and nothing in the extra column.
Any help would be appreciated.
Edit to add table info and EXPLAIN output
CREATE TABLE `chathelp` (
`chat_id` int(13) NOT NULL,
`chat_handle` varchar(100) NOT NULL,
`online` tinyint(1) NOT NULL DEFAULT '0',
UNIQUE KEY `chat_id` (`chat_id`),
KEY `chat_handle` (`chat_handle`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
CREATE TABLE `users` (
`id` int(8) NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL,
`url` varchar(250) NOT NULL,
`live_account` varchar(1) NOT NULL DEFAULT 'n',
PRIMARY KEY (`id`),
KEY `livenames` (`live_account`,`name`)
) ENGINE=MyISAM AUTO_INCREMENT=9556 DEFAULT CHARSET=utf8
+----+-------------+------------+------+---------------+--------------+---------+-------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+---------------+--------------+---------+-------+------+----------------------------------------------+
| 1 | SIMPLE | users | ref | livenames | livenames | 11 | const | 4771 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | chathelp | ALL | NULL | NULL | NULL | NULL | 1144 | |
+----+-------------+------------+------+---------------+--------------+---------+-------+------+----------------------------------------------+
We're going to guess that online is an integer datatype.
You can modify the expression in your order by clause like this:
ORDER BY IFNULL(online,0) DESC, users.name ASC
^^^^^^^ ^^^
The problem is that for rows in users that don't have a matching row in chathelp, the value of the online column in the resultset is NULL. MySQL treats NULL as lower than any non-NULL value, so with ORDER BY ... DESC those rows sort after all the rows with a non-NULL online value, including the offline ones.
If we assume that a missing row in chathelp is to be treated the same as a row in chathelp that has 0 for online, we can replace the NULL value with 0. (If there are NULL values in the online column itself, this expression in the ORDER BY won't let us distinguish between those and a missing row in chathelp.)
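Put together with the query from the question, the change would look something like the sketch below. It assumes the second table is named chathelp, as in the table listing (the question's query calls it livechat), and it also applies IFNULL in the SELECT list so that users without a chat row are shown with online = 0.

```sql
SELECT u.name, u.id, u.url, IFNULL(c.online, 0) AS online
FROM users AS u
LEFT JOIN chathelp AS c
  ON u.url = CONCAT('http://www.software.com/', c.chat_handle)
WHERE u.live_account = 'y'
-- Missing chat rows count as offline (0), so they sort together with
-- the users who have a chat row but are not online.
ORDER BY IFNULL(c.online, 0) DESC, u.name ASC
LIMIT 0, 24;
```

For the sample rows in the question, this should order Lisa first, then Eric, then Tom.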
EDIT
Optimizing Performance
To address performance, we'd need to see the output from EXPLAIN.
With the query as it's written above, there's no getting around the "Using filesort" to get the rows returned in the order specified, on that expression.
We may be able to re-write the query to get an equivalent result faster.
But I suspect the "Using filesort" operation is not really the problem, unless there's a boatload (thousands and thousands) of rows to sort.
I suspect that suitable indexes aren't available for the join operation.
But before we go to the knee-jerk "add an index!", we really need to look at EXPLAIN, and look at the table definitions including the indexes. (The output from SHOW CREATE TABLE is suitable.)
We just don't have enough information to make recommendations yet.
Reference: 8.8.1 Optimizing Queries with EXPLAIN
As a guess, we might want to try a query like this:
SELECT u.name
, u.id
, u.url
, l.online
FROM users u
LEFT
JOIN livechat l
ON u.url = CONCAT('http://www.software.com/', l.chat_handle)
AND l.online = 1
WHERE u.live_account = 'y'
ORDER
BY IF(l.online=1,0,1) ASC
, u.name ASC
LIMIT 0,24
After we've added covering indexes, e.g.
... ON users (live_account, url, name, id)
... ON livechat (chat_handle, online)
(If query is using a covering index, EXPLAIN should show "Using index" in the Extra column.)
One approach might be to break the query into two parts: an inner join and an anti-join. This is just a guess at something we might try, but again, we'd want to compare the EXPLAIN output.
Sometimes, we can get better performance with a pattern like this. But for better performance, both of the queries below are going to need to be more efficient than the original query:
( SELECT u.name
, u.id
, u.url
, l.online
FROM users u
JOIN livechat l
ON u.url = CONCAT('http://www.software.com/', l.chat_handle)
AND l.online = 1
WHERE u.live_account = 'y'
ORDER
BY u.name ASC
LIMIT 0,24
)
UNION ALL
( SELECT u.name
, u.id
, u.url
, 0 AS online
FROM users u
LEFT
JOIN livechat l
ON u.url = CONCAT('http://www.software.com/', l.chat_handle)
AND l.online = 1
WHERE l.chat_handle IS NULL
AND u.live_account = 'y'
ORDER
BY u.name ASC
LIMIT 0,24
)
ORDER BY 4 DESC, 1 ASC
LIMIT 0,24

Switching Raw greatest-n-per-group MySQL query to Laravel query builder

I want to move a raw mysql query into Laravel 4's query builder, or preferably Eloquent.
The Setup
A database for storing discount keys for games.
Discount keys are stored in key sets where each key set is associated with one game (a game can have multiple keysets).
The following query is intended to return a table of key sets and relevant data, for viewing on an admin page.
The 'keys used so far' is calculated by a scheduled event and periodically stored/updated in log entries in a table keySetLogs. (it's smart enough to only log data when the count changes)
We want to show the most up-to-date value of 'keys used', which is a 'greatest-n-per-group' problem.
The Raw Query
SELECT
`logs`.`id_keySet`,
`games`.`name`,
`kset`.`discount`,
`kset`.`keys_total`,
`logs`.`keys_used`
FROM `keySets` AS `kset`
INNER JOIN
(
SELECT
`ksl1`.*
FROM `keySetLogs` AS `ksl1`
LEFT OUTER JOIN `keySetLogs` AS `ksl2`
ON (`ksl1`.`id_keySet` = `ksl2`.`id_keySet` AND `ksl1`.`set_at` < `ksl2`.`set_at`)
WHERE `ksl2`.`id_keySet` IS NULL
ORDER BY `id_keySet`
)
AS `logs`
ON `logs`.`id_keySet` = `kset`.`id`
INNER JOIN `games`
ON `games`.`id` = `kset`.`id_game`
ORDER BY `kset`.`id_game` ASC, `kset`.`discount` DESC
Note: the nested query gets the most up-to-date keys_used value from the logs. It uses the greatest-n-per-group technique discussed in this question.
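To illustrate how the nested query picks the latest log row per key set, here is a self-contained sketch meant for a scratch database. The column types and sample rows are assumptions made up for illustration; only the table and column names come from the query above.

```sql
-- Hypothetical minimal schema; column types are assumptions.
CREATE TABLE keySetLogs (
  id        INT AUTO_INCREMENT PRIMARY KEY,
  id_keySet INT NOT NULL,
  keys_used INT NOT NULL,
  set_at    DATETIME NOT NULL
);

-- Made-up sample rows: two log entries for key set 5, one for key set 6.
INSERT INTO keySetLogs (id_keySet, keys_used, set_at) VALUES
  (5,  2, '2014-01-01 10:00:00'),
  (5,  4, '2014-01-02 10:00:00'),
  (6, 20, '2014-01-01 12:00:00');

-- A row survives only if no other row for the same key set has a later set_at,
-- so this keeps (5, 4) and (6, 20) and drops the older (5, 2) entry.
SELECT ksl1.id_keySet, ksl1.keys_used, ksl1.set_at
FROM keySetLogs AS ksl1
LEFT JOIN keySetLogs AS ksl2
  ON ksl1.id_keySet = ksl2.id_keySet
 AND ksl1.set_at < ksl2.set_at
WHERE ksl2.id_keySet IS NULL;
```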
Example Output:
+-----------+-------------+----------+------------+-----------+
| id_keySet | name | discount | keys_total | keys_used |
+-----------+-------------+----------+------------+-----------+
| 5 | Test_Game_1 | 100.00 | 10 | 4 |
| 6 | Test_Game_1 | 50.00 | 100 | 20 |
| 3 | Test_Game_2 | 100.00 | 10 | 8 |
| 4 | Test_Game_2 | 50.00 | 100 | 14 |
| 1 | Test_Game_3 | 100.00 | 10 | 1 |
| 2 | Test_Game_3 | 50.00 | 100 | 5 |
...
The Question(s)
I have KeySet, KeySetLog and Game Eloquent Models created with relationship functions set up.
How would I write the nested query in query builder?
Is it possible to write the query entirely with eloquent (no manually writing joins)?
I don't know Laravel or Eloquent so I probably shouldn't comment, but if performance isn't at stake then it seems to me that this query could be rewritten something like this:
SELECT ksl1.id_keySet
, g.name
, k.discount
, k.keys_total
, ksl1.keys_used
FROM keySetLogs ksl1
LEFT
JOIN keySetLogs ksl2
ON ksl1.id_keySet = ksl2.id_keySet
AND ksl1.set_at < ksl2.set_at
LEFT
JOIN keySets k
ON k.id = ksl1.id_keySet
LEFT
JOIN games g
ON g.id = k.id_game
WHERE ksl2.id_keySet IS NULL
ORDER
BY k.id_game ASC
, k.discount DESC

SQL LIMIT to get latest records

I am writing a script which will list 25 items of all 12 categories. Database structure is like:
tbl_items
---------------------------------------------
item_id | item_name | item_value | timestamp
---------------------------------------------
tbl_categories
-----------------------------
cat_id | item_id | timestamp
-----------------------------
There are around 600,000 rows in the table tbl_items. I am using this SQL query:
SELECT e.item_id, e.item_value
FROM tbl_items AS e
JOIN tbl_categories AS cat WHERE e.item_id = cat.item_id AND cat.cat_id = 6001
LIMIT 25
Using the same query in a loop for cat_id from 6000 to 6012. But I want the latest records of every category. If I use something like:
SELECT e.item_id, e.item_value
FROM tbl_items AS e
JOIN tbl_categories AS cat WHERE e.item_id = cat.item_id AND cat.cat_id = 6001
ORDER BY e.timestamp
LIMIT 25
..the query goes computing for approximately 10 minutes which is not acceptable. Can I use LIMIT more nicely to give the latest 25 records for each category?
Can anyone help me achieve this without ORDER BY? Any ideas or help will be highly appreciated.
EDIT
tbl_items
+---------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+---------+-------+
| item_id | int(11) | NO | PRI | 0 | |
| item_name | longtext | YES | | NULL | |
| item_value | longtext | YES | | NULL | |
| timestamp | datetime | YES | | NULL | |
+---------------------+--------------+------+-----+---------+-------+
tbl_categories
+----------------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+------------+------+-----+---------+-------+
| cat_id | int(11) | NO | PRI | 0 | |
| item_id | int(11) | NO | PRI | 0 | |
| timestamp | datetime | YES | | NULL | |
+----------------+------------+------+-----+---------+-------+
Can you add indices? If you add an index on the timestamp and other appropriate columns the ORDER BY won't take 10 minutes.
First of all:
There seems to be an N:M relation between items and categories: an item may be in several categories. I say this because tbl_categories has an item_id foreign key.
If it is not an N:M relationship, then you should consider changing the design. If it is a 1:N relationship, where a category has several items, then the item table must contain a category_id foreign key.
Working with N:M:
I have rewritten your query to use an inner join instead of a cross join:
SELECT e.item_id, e.item_value
FROM
tbl_items AS e
JOIN
tbl_categories AS cat
on e.item_id = cat.item_id
WHERE
cat.cat_id = 6001
ORDER BY
e.timestamp DESC
LIMIT 25
To optimize performance, the required index is:
create index idx_1 on tbl_categories( cat_id, item_id)
An index on tbl_items is not mandatory, because the primary key is already indexed.
An index that contains timestamp won't help as much. To be sure, you can try an index on tbl_items with item_id and timestamp, so that values are read from the index without accessing the table:
create index idx_2 on tbl_items( item_id, timestamp)
To increase performance, you can replace your loop over categories with a single query:
select T.cat_id, T.item_id, T.item_value from
(SELECT cat.cat_id, e.item_id, e.item_value
FROM
tbl_items AS e
JOIN
tbl_categories AS cat
on e.item_id = cat.item_id
ORDER BY
e.timestamp
LIMIT 25
) T
WHERE
T.cat_id between 6001 and 6012
ORDER BY
T.cat_id, T.item_id
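Note that the LIMIT 25 in the derived table above is applied across all categories before the outer filter, so it cannot yield 25 rows per category. If window functions are available (MySQL 8.0 or later, which is an assumption about your server), a per-category limit could be sketched like this; the alias rn is hypothetical:

```sql
SELECT t.cat_id, t.item_id, t.item_value
FROM (
    SELECT cat.cat_id, e.item_id, e.item_value,
           -- Number the rows within each category, newest first.
           ROW_NUMBER() OVER (PARTITION BY cat.cat_id
                              ORDER BY e.`timestamp` DESC) AS rn
    FROM tbl_items AS e
    JOIN tbl_categories AS cat
      ON e.item_id = cat.item_id
    WHERE cat.cat_id BETWEEN 6001 AND 6012
) AS t
WHERE t.rn <= 25          -- keep at most 25 rows per category
ORDER BY t.cat_id, t.rn;
```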
Please try these queries and come back with your comments so we can refine them if necessary.
Leaving aside all other factors, I can tell you that the main reason the query is so slow is that the result involves longtext columns.
BLOB and TEXT fields in MySQL are mostly meant to store complete files, textual or binary. They are stored separately from the row data for InnoDB tables. Each time a query involves sorting (explicitly or for a GROUP BY), MySQL is sure to use disk for the sorting (because it cannot be sure in advance how large any value is).
As a rule of thumb: if a query needs to return more than a single row of a column, the column type should almost never be TEXT or BLOB; use VARCHAR or VARBINARY instead.
UPD
If you cannot alter the table, the query will hardly be fast with the current indexes and column types. But, anyway, here is a similar question and a popular solution to your problem: How to SELECT the newest four items per category?

Optimizing MySQL query with inner join

I've spent a lot of time optimizing this query but it's starting to slow down with larger tables. I imagine these are probably the worst types of questions but I'm looking for some guidance. I'm not really at liberty to disclose the database schema so hopefully this is enough information. Thanks,
SELECT tblA.id, tblB.id, tblC.id, tblD.id
FROM tblA, tblB, tblC, tblD
INNER JOIN (SELECT max(tblB.id) AS xid
FROM tblB
WHERE tblB.rdd = 11305
GROUP BY tblB.index_id
ORDER BY NULL) AS rddx
ON tblB.id = rddx.xid
WHERE
tblA.id = tblB.index_id
AND tblC.name = tblD.s_type
AND tblD.name = tblA.s_name
GROUP BY tblA.s_name
ORDER BY NULL;
There is a one-to-many relationship between:
tblA.id and tblB.index_id
tblC.name and tblD.s_type
tblD.name and tblA.s_name
+----+-------------+------------+--------+---------------+-----------+---------+------------------------------+-------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+-----------+---------+------------------------------+-------+------------------------------+
| 1 | PRIMARY | derived2 | ALL | NULL | NULL | NULL | NULL | 32568 | Using temporary |
| 1 | PRIMARY | tblB | eq_ref | PRIMARY | PRIMARY | 8 | rddx.xid | 1 | |
| 1 | PRIMARY | tblA | eq_ref | PRIMARY | PRIMARY | 8 | tblB.index_id | 1 | Using where |
| 1 | PRIMARY | tblD | eq_ref | PRIMARY | PRIMARY | 22 | tblA.s_name | 1 | Using where |
| 1 | PRIMARY | tblC | eq_ref | PRIMARY | PRIMARY | 22 | tblD.s_type | 1 | |
| 2 | DERIVED | tblB | ref | rdd_idx | rdd_idx | 7 | | 65722 | Using where; Using temporary |
+----+-------------+------------+--------+---------------+-----------+---------+------------------------------+-------+------------------------------+
Unless I've misunderstood the information that you've provided I believe you could re-write the above query as follows
EXPLAIN SELECT tblA.id, MAX(tblB.id), tblC.id, tblD.id
FROM tblA
LEFT JOIN tblD ON tblD.name = tblA.s_name
LEFT JOIN tblC ON tblC.name = tblD.s_type
LEFT JOIN tblB ON tblA.id = tblB.index_id
WHERE tblB.rdd = 11305
ORDER BY NULL;
I can't provide an EXPLAIN for this, as EXPLAIN depends on the data in your database. It would be interesting to see the EXPLAIN output for this query.
Keep in mind that EXPLAIN only gives you an estimate of what will happen. You can use SHOW SESSION STATUS to see in detail what happened when you ran an actual query. Make sure to run FLUSH STATUS before the query you are investigating so that the counters start clean. So in this case you would run
FLUSH STATUS;
SELECT tblA.id, MAX(tblB.id), tblC.id, tblD.id
FROM tblA
LEFT JOIN tblD ON tblD.name = tblA.s_name
LEFT JOIN tblC ON tblC.name = tblD.s_type
LEFT JOIN tblB ON tblA.id = tblB.index_id
WHERE tblB.rdd = 11305
ORDER BY NULL;
SHOW SESSION STATUS LIKE 'ha%';
This gives you a number of indicators to show what actually happened when a query executed.
Handler_read_rnd_next - Number of requests to read next row in the data file
Handler_read_key - Number of requests to read a row based on a key
Handler_read_next - Number of requests to read the next row in key order
Using these values you can see exactly what is going on under the hood.
Unfortunately without knowing the data in the tables, engine type and the data types used in the queries it is quite hard to advise on how you could optimize.
I have updated the query to use explicit joins instead of joining within the WHERE clause. By looking at it, as a developer, you can also see the relationships between the tables directly.
A->B, A->D and D->C. Now, on table B, where you want the highest ID based on the common "ID=Index_ID" and RDD = 11305, a complete sub-query isn't required. However, this moves the MAX() up into the field selection clause. I would make sure you have an index on tblB on (index_id, rdd). Finally, using STRAIGHT_JOIN helps force the query to run in the order the tables are listed.
-- EDIT FROM COMMENT --
It appears you are getting NULLs from tblB. This typically indicates a valid tblA record, but no tblB record with the same ID that has RDD = 11305. That said, it appears you are only concerned with entries associated with 11305, so I'm adjusting the query accordingly. Please make sure you have an index on tblB based on the "RDD" column (in the first position, if it is a multi-column index).
As you can see in this one, I'm pre-querying from table B only for 11305 entries and pre-grouping by the index_ID (as linked to tblA). This gives me one record per index where they will exist... From THIS result, I'm joining back to A, then directly back to B again, but based on that highest match ID found, then D and C as was before. So NOW, you can get any column from any of the tables and get proper record in question... There should be no NULL values left in this query.
Hopefully, I've clarified HOW I'm getting the pieces together for you.
SELECT STRAIGHT_JOIN
PreQuery.HighestPerIndexID,
tblA.id,
tblA.AnotherAField,
tblA.Etc,
tblB.SomeOtherField,
tblB.AnotherField,
tblC.id,
tblD.id
FROM
( select PQ1.Index_ID,
max( PQ1.ID ) as HighestPerIndexID
from tblB PQ1
where PQ1.RDD = 11305
group by PQ1.Index_ID ) PreQuery
JOIN tblA
on PreQuery.Index_ID = tblA.ID
join tblB
on PreQuery.HighestPerIndexID = tblB.ID
join tblD
on tblA.s_Name = tblD.name
join tblC
on tblD.s_type = tblC.Name
ORDER BY
tblA.s_Name
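The index advice above could be sketched as follows. The index name is hypothetical, and the column order (rdd first, then index_id) is one reading of the recommendation, not something confirmed against the real schema. In InnoDB a secondary index also carries the primary key, so if id is the primary key this index may be enough to answer the pre-query on its own; compare the EXPLAIN output before and after to check.

```sql
-- Hypothetical index name; rdd is placed first so that the equality
-- filter (rdd = 11305) narrows the range before grouping by index_id.
CREATE INDEX idx_tblB_rdd_index_id ON tblB (rdd, index_id);

-- Inspect the plan of the pre-query on its own.
EXPLAIN
SELECT PQ1.index_id,
       MAX(PQ1.id) AS HighestPerIndexID
FROM tblB PQ1
WHERE PQ1.rdd = 11305
GROUP BY PQ1.index_id;
```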