Updating multiple columns with data from subquery in MySQL - mysql

I am trying to update multiple columns in a row, with data from multiple columns in a subquery.
The following approaches did not work for me, and I can't find different ones that suit my needs:
UPDATE
beers,
(SELECT AVG(appearance) AS appearance, AVG(palate) AS palate, AVG(taste) AS taste, AVG(aroma) AS aroma, AVG(overall) AS overall, beer_id FROM reviews) AS review_total
SET
beers.appearance = review_total.appearance,
beers.palate = review_total.palate,
beers.taste = review_total.taste,
beers.aroma = review_total.aroma,
beers.overall = review_total.overall
WHERE
review_total.beer_id = beers.id
AND
beers.id = 43
I don't get an error for this one, but 5 warnings and the row is not updated:
Query OK, 0 rows affected, 5 warnings (0.01 sec)
Show warnings gives me:
+-------+------+----------------------------------------------------+
| Level | Code | Message |
+-------+------+----------------------------------------------------+
| Note | 1265 | Data truncated for column 'appearance' at row 9991 |
| Note | 1265 | Data truncated for column 'palate' at row 9991 |
| Note | 1265 | Data truncated for column 'taste' at row 9991 |
| Note | 1265 | Data truncated for column 'aroma' at row 9991 |
| Note | 1265 | Data truncated for column 'overall' at row 9991 |
+-------+------+----------------------------------------------------+
I know this issue has to do with the data type, but the data type is float, and I believe that's what AVG's result is too:
mysql> describe beers;
+-------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(90) | YES | | NULL | |
| aroma | float | YES | | NULL | |
| appearance | float | YES | | NULL | |
| palate | float | YES | | NULL | |
| taste | float | YES | | NULL | |
| overall | float | YES | | NULL | |
+-------------+---------------+------+-----+---------+----------------+
The next query is slightly different:
UPDATE
beers
SET
beers.appearance = review_total.appearance,
beers.palate = review_total.palate,
beers.taste = review_total.taste,
beers.aroma = review_total.aroma,
beers.overall = review_total.overall
FROM
INNER JOIN (SELECT AVG(appearance) AS appearance, AVG(palate) AS palate, AVG(taste) AS taste, AVG(aroma) AS aroma, AVG(overall) AS overall, beer_id FROM reviews) review_total ON review_total.beer_id = beers.id
WHERE
beers.id = 43
The error I got for this one is:
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'FROM INNER JOIN (SELECT AVG(appearance) AS appearance, AVG(palate) AS palate, AV' at line 9
I really can't find a way to get this working and I hope someone sees what I'm doing wrong. Thank you very much in advance!

UPDATE beers b
JOIN
( SELECT beer_id
, AVG(appearance) appearance
, AVG(palate) palate
, AVG(taste) taste
, AVG(aroma) aroma
, AVG(overall) overall
FROM reviews
GROUP
BY beer_id
) review_total
ON review_total.beer_id = b.id
SET b.appearance = review_total.appearance
, b.palate = review_total.palate
, b.taste = review_total.taste
, b.aroma = review_total.aroma
, b.overall = review_total.overall
WHERE b.id = 43;
or something like that
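To see why the GROUP BY matters: the subquery in the question has no GROUP BY, so the aggregate collapses all reviews into a single row, and the beer_id on that row is whichever one the server happens to pick. That is probably why the join against beers.id = 43 matched nothing. Below is a minimal sketch of that behaviour, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so it runs without a server). The sample rows and the reduced column set are invented for illustration. SQLite does not accept MySQL's UPDATE ... JOIN form, so the update here is written with correlated subqueries instead.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE beers (id INTEGER PRIMARY KEY, name TEXT, taste REAL, aroma REAL);
    CREATE TABLE reviews (beer_id INTEGER, taste REAL, aroma REAL);
    -- Invented sample data.
    INSERT INTO beers (id, name) VALUES (42, 'Stout'), (43, 'Lager');
    INSERT INTO reviews VALUES (42, 2.0, 3.0), (43, 4.0, 5.0), (43, 5.0, 3.0);
""")

# No GROUP BY: every review is averaged into ONE row, with an arbitrary beer_id.
ungrouped = conn.execute("SELECT AVG(taste), beer_id FROM reviews").fetchall()

# With GROUP BY: one row of averages per beer.
grouped = conn.execute("""
    SELECT beer_id, AVG(taste), AVG(aroma)
    FROM reviews
    GROUP BY beer_id
    ORDER BY beer_id
""").fetchall()

# Portable form of the update: one correlated subquery per column.
conn.execute("""
    UPDATE beers
    SET taste = (SELECT AVG(r.taste) FROM reviews r WHERE r.beer_id = beers.id),
        aroma = (SELECT AVG(r.aroma) FROM reviews r WHERE r.beer_id = beers.id)
    WHERE id = 43
""")
updated = conn.execute("SELECT taste, aroma FROM beers WHERE id = 43").fetchone()

print(len(ungrouped), grouped, updated)
```

On a real MySQL server the UPDATE ... JOIN statement above is the more direct way to write it; the correlated form is only there so the sketch runs in SQLite.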

Related

MySQL - How can you select multiple columns on a nested IFNULL...GROUP_CONCAT() condition?

I have a web application which is connected to a MySQL (5.5.64-MariaDB) database.
One of the queries is as follows:
SELECT
d.id,
d.label AS display_label,
d.anchor,
r.id AS regulation_id,
IFNULL(
(SELECT GROUP_CONCAT(value) FROM display_substances `ds`
WHERE `ds`.`display_id` = `d`.`id`
AND ds.substance_id = 1 -- For example, substance ID = 1
GROUP BY `ds`.`display_id`
), "Not Listed"
) `display_value` FROM displays `d`
JOIN groups g ON d.group_id = g.id
JOIN regulations r ON g.regulation_id = r.id
An example of the output is as follows:
+-----+------------------------------------+------------------------------------------------------------------------------------------+
| id | name | display_value |
+-----+------------------------------------+------------------------------------------------------------------------------------------+
| 4 | techfunction | Intermediate / monomer; Corrosion inhibitor / anodiser / galvaniser; Catalyst; Additive |
| 323 | russia_chemsafety_register_display | Not Listed |
| 733 | peru_pcb_display | Not Listed |
+-----+------------------------------------+------------------------------------------------------------------------------------------+
This query does what we need. For explanatory purposes:
There are 2 tables, displays and display_substances
The query is obtaining display_substances.value for each displays.id
If there is no corresponding display_substances.value then the string "Not Listed" (refer to query above) is returned. If there is a corresponding value then display_substances.value is returned. So in the example data above, IDs 323 and 733 refer to a scenario where there is no corresponding entry, therefore we want "Not Listed". Conversely ID 4 does have a value ("Intermediate / monomer; Corrosion inhibitor / anodiser / galvaniser; Catalyst; Additive") so we get that.
The table structures are as follows:
DESCRIBE displays;
+----------+----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+----------------------+------+-----+---------+----------------+
| id | smallint(5) unsigned | NO | PRI | NULL | auto_increment |
| name | varchar(127) | NO | | NULL | |
| label | varchar(255) | NO | | NULL | |
+----------+----------------------+------+-----+---------+----------------+
DESCRIBE display_substances;
+--------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-----------------------+------+-----+---------+----------------+
| id | mediumint(8) unsigned | NO | PRI | NULL | auto_increment |
| display_id | smallint(5) unsigned | NO | MUL | NULL | |
| substance_id | mediumint(8) unsigned | NO | MUL | NULL | |
| value | text | NO | | NULL | |
| automated | tinyint(4) | YES | | NULL | |
+--------------+-----------------------+------+-----+---------+----------------+
I want to be able to return display_substances.automated (refer to table structure above) as a column from my query. But I can't see how to do this.
The reference to the display_substances table is ds, so I cannot use that in the initial SELECT statement because at that point there's no alias. Equally there is no JOIN condition that would make it possible, because not every row returned obtains data from display_substances (i.e. those that are "Not Listed" are not getting anything from that table).
If I want an additional column next to display_value in the sample output above that shows display_substances.automated, or NULL if it doesn't exist, how can I achieve that?
For reference the automated field either contains a 1 (to represent data that has been obtained through automated processes by our application), or NULL if it isn't automated.
there is no JOIN condition that would make it possible, because not
every row returned obtains data from display_substances
For this case you can use a LEFT JOIN:
SELECT d.id, d.label display_label, d.anchor, r.id regulation_id,
COALESCE(ds.value, 'Not Listed') display_value,
ds.automated
FROM displays d
INNER JOIN groups g ON d.group_id = g.id
INNER JOIN regulations r ON g.regulation_id = r.id
LEFT JOIN (
SELECT display_id, GROUP_CONCAT(value) value, MAX(automated) automated
FROM display_substances
WHERE substance_id = 1
GROUP BY display_id
) ds ON ds.display_id = d.id
I used MAX(automated) as the returned column, but you can use GROUP_CONCAT(automated) just like you do for value and also COALESCE():
COALESCE(ds.automated, 'Not Listed')
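Here is a small sketch of the LEFT JOIN approach, run against an in-memory SQLite database through Python's sqlite3 module (a stand-in for MariaDB, chosen only so it can be run without a server). The joins to groups and regulations are left out to keep it short, and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MariaDB (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE displays (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE display_substances (
        id INTEGER PRIMARY KEY,
        display_id INTEGER,
        substance_id INTEGER,
        value TEXT,
        automated INTEGER
    );
    -- Invented sample data: display 4 has values, display 323 has none.
    INSERT INTO displays VALUES (4, 'techfunction'),
                                (323, 'russia_chemsafety_register_display');
    INSERT INTO display_substances (display_id, substance_id, value, automated) VALUES
        (4, 1, 'Catalyst', 1),
        (4, 1, 'Additive', NULL),
        (4, 2, 'Other substance', NULL);
""")

rows = conn.execute("""
    SELECT d.id,
           COALESCE(ds.value, 'Not Listed') AS display_value,
           ds.automated
    FROM displays d
    LEFT JOIN (
        -- One row per display for the chosen substance.
        SELECT display_id,
               GROUP_CONCAT(value) AS value,
               MAX(automated)      AS automated
        FROM display_substances
        WHERE substance_id = 1
        GROUP BY display_id
    ) ds ON ds.display_id = d.id
    ORDER BY d.id
""").fetchall()

for row in rows:
    print(row)
```

Display 323 has no matching row in the derived table, so it still appears, with 'Not Listed' as the value and NULL (None in Python) for automated.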

MySql LEFT OUTER JOIN causing duplicate rows

I'm running a query to grab the first 10 profiles (think of them as an article that shows when a shop opens and holds information about that shop). I'm using the OUTER JOIN to select all images that belong to the profile PK.
I'm running the following query; the main part I'm trying to focus on is the JOIN. I won't post the whole query as it's just a whole bunch of 'table'.'colname' = 'table.colname'.
But here is where the magic happens during my outer join.
LEFT JOIN `content_image` AS `image` ON `profile`.`content_ptr_id` = `image`.`content_id`
Full Query:
I've formatted like this so everyone can see the query without scrolling endlessly to the right.
select `profile`.`content_ptr_id` AS `profile.content_ptr_id`,
`profile`.`body` AS `profile.body`,
`profile`.`web_site` AS `profile.web_site`,
`profile`.`email` AS `profile.email`,
`profile`.`hours` AS `profile.hours`,
`profile`.`price_range` AS `profile.price_range`,
`profile`.`price_range_high` AS `profile.price_range_high`,
`profile`.`primary_category_id` AS `profile.primary_category_id`,
`profile`.`business_contact_email` AS `profile.business_contact_email`,
`profile`.`business_contact_phone` AS `profile.business_contact_phone`,
`profile`.`show_in_directory` AS `profile.show_in_directory`,
`image`.`id` AS `image.id`,
`image`.`content_id` AS `image.content_id`,
`image`.`type` AS `image.type`,
`image`.`order` AS `image.order`,
`image`.`caption` AS `image.caption`,
`image`.`author_id` AS `image.author_id`,
`image`.`image` AS `image.image`,
`image`.`link_url` AS `image.link_url`
FROM content_profile AS profile
LEFT JOIN `content_image` AS `image` ON `profile`.`content_ptr_id` = `image`.`content_id`
GROUP BY profile.content_ptr_id
LIMIT 10, 12
Is there a way I can group my results per profile? E.g., all images will show in the one profile result? I can't use GROUP BY as I'm getting an error:
Error: ER_WRONG_FIELD_WITH_GROUP: Expression #12 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'broadsheet.image.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by]
code: 'ER_WRONG_FIELD_WITH_GROUP',
errno: 1055,
sqlState: '42000',
index: 0 }
Is there a possible way around this group by error or another query I could run?
Tables:
content_image
+------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| content_id | int(11) | NO | MUL | NULL | |
| type | varchar(255) | NO | | NULL | |
| order | int(11) | NO | | NULL | |
| caption | longtext | NO | | NULL | |
| author_id | int(11) | YES | MUL | NULL | |
| image | varchar(255) | YES | | NULL | |
| link_url | varchar(200) | YES | | NULL | |
+------------+--------------+------+-----+---------+----------------+
content_profile
+------------------------+----------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------------+----------------------+------+-----+---------+-------+
| content_ptr_id | int(11) | NO | PRI | NULL | |
| body | longtext | NO | | NULL | |
| web_site | varchar(200) | NO | | NULL | |
| email | varchar(75) | NO | | NULL | |
| menu | longtext | NO | | NULL | |
| hours | longtext | NO | | NULL | |
| price_range | smallint(5) unsigned | YES | MUL | NULL | |
| price_range_high | smallint(5) unsigned | YES | | NULL | |
| primary_category_id | int(11) | NO | | NULL | |
| business_contact_name | varchar(255) | NO | | NULL | |
| business_contact_email | varchar(75) | NO | | NULL | |
| business_contact_phone | varchar(20) | NO | | NULL | |
| show_in_directory | tinyint(1) | NO | | NULL | |
+------------------------+----------------------+------+-----+---------+-------+
From reading your question, I think you don't have a grasp of how the GROUP BY clause works.
So the short summary of my answer is: learn the fundamentals of the GROUP BY clause.
I will use only a small number of columns to make the explanation easier.
The first problem with your query is that you are not using the group by clause properly - when using a group by clause, all columns that are selected must be either in the group by clause OR be selected with an aggregate function.
Let's suppose these are the only columns you are selecting:
profile.content_ptr_id
profile.body
profile.web_site
image.id
image.content_id
And the query looked like this:
SELECT `profile.content_ptr_id`, `profile.body`, `profile.web_site`, `image.id`, `image.content_id`
FROM ...
GROUP BY `profile.content_ptr_id`
This query will error out because you did not specify how to consolidate multiple rows into one row for profile.body, profile.web_site, image.id and image.content_id. The database cannot decide this for you: each of those columns must either be added to the GROUP BY clause or be wrapped in an aggregate function such as min(), max(), count(), etc.
So one solution to fix the error raised in the query above would be the following:
SELECT `profile.content_ptr_id`, `profile.body`, `profile.web_site`, `image.id`, `image.content_id`
FROM ...
GROUP BY `profile.content_ptr_id`, `profile.body`, `profile.web_site`, `image.id`, `image.content_id`
Here, I put all the columns in the group by clause which makes the query group and select all the unique combinations of profile.content_ptr_id, profile.body, profile.web_site, image.id, image.content_id columns.
Following is an example query which does not have all the columns included in the group by clause:
Let's say you want to find out how many images there are for each of the profiles. You can use a query such as the following:
SELECT `profile.content_ptr_id`, `profile.body`, `profile.web_site`, COUNT(`image.id`)
FROM ...
GROUP BY `profile.content_ptr_id`, `profile.body`, `profile.web_site`
This query lets you find out how many images there are for every unique combination of profile.content_ptr_id, profile.body, profile.web_site columns.
Be aware that in my previous two examples, all the columns that are selected are either included in the group by clause or are selected with an aggregate function. This is a rule all queries need to follow when using the group by clause, otherwise an error will be raised by the database.
Now, let's get on to answering your question:
"Is there a way I can group my results per profile? E.g all images will show in the one profile result?"
I will use the following mock data to explain:
profile
+----------------+--------------+---------------+
| content_ptr_id | body | web_site |
+----------------+--------------+---------------+
| 100 | body1 | web1 |
+----------------+--------------+---------------+
image
+--------+-------------+
| id | content_id |
+--------+-------------+
| iid1 | 100 |
| iid2 | 100 |
+--------+-------------+
The following is what the result looks like if you do the join without any grouping:
SELECT `profile.content_ptr_id`, `profile.body`, `profile.web_site`, `image.id`, `image.content_id`
FROM ...
+----------------+--------------+---------------+--------+-------------+
| content_ptr_id | body | web_site | id | content_id |
+----------------+--------------+---------------+--------+-------------+
| 100 | body1 | web1 | iid1 | 100 |
| 100 | body1 | web1 | iid2 | 100 |
+----------------+--------------+---------------+--------+-------------+
You can't achieve your objective of grouping your results per profile (combining to only show one line per profile) by grouping by all the columns as the result will be the same:
SELECT `profile.content_ptr_id`, `profile.body`, `profile.web_site`, `image.id`, `image.content_id`
FROM ...
GROUP BY `profile.content_ptr_id`, `profile.body`, `profile.web_site`, `image.id`, `image.content_id`
will return
+----------------+--------------+---------------+--------+-------------+
| content_ptr_id | body | web_site | id | content_id |
+----------------+--------------+---------------+--------+-------------+
| 100 | body1 | web1 | iid1 | 100 |
| 100 | body1 | web1 | iid2 | 100 |
+----------------+--------------+---------------+--------+-------------+
The question you need to answer is how you want to display the non-unique columns you want to combine - in this case image.id. You can use count, but this will only return you a number. If you want to display all the text, you can use GROUP_CONCAT() which will concatenate all the values delimited by comma by default. If you use GROUP_CONCAT() the result will look like the following:
SELECT `profile.content_ptr_id`, `profile.body`, `profile.web_site`, GROUP_CONCAT(`image.id`), GROUP_CONCAT(`image.content_id`)
FROM ...
GROUP BY `profile.content_ptr_id`, `profile.body`, `profile.web_site`
This query will return:
+----------------+--------------+---------------+--------------------+--------------------------+
| content_ptr_id | body | web_site | GROUP_CONCAT(id) | GROUP_CONCAT(content_id) |
+----------------+--------------+---------------+--------------------+--------------------------+
| 100 | body1 | web1 | iid1,iid2 | 100,100 |
+----------------+--------------+---------------+--------------------+--------------------------+
If GROUP_CONCAT() is what you want to use for all the image columns, then go ahead, but doing this for many columns consolidating many rows may make the table less readable. But either way, I would suggest you read some articles to familiarise yourself with how the GROUP BY clause works.
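For anyone who wants to try this out, here is a rough sketch of the mock data above in an in-memory SQLite database, driven from Python's sqlite3 module (SQLite is only a stand-in for MySQL here, and the second profile with no images is invented to show what the LEFT JOIN does). It compares the plain join with the grouped version.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profile (content_ptr_id INTEGER PRIMARY KEY, body TEXT, web_site TEXT);
    CREATE TABLE image (id TEXT PRIMARY KEY, content_id INTEGER);
    -- Profile 100 is the mock data above; profile 200 (no images) is invented.
    INSERT INTO profile VALUES (100, 'body1', 'web1'), (200, 'body2', 'web2');
    INSERT INTO image VALUES ('iid1', 100), ('iid2', 100);
""")

# Plain LEFT JOIN: one row per (profile, image) pair, so profile 100 repeats.
joined = conn.execute("""
    SELECT p.content_ptr_id, i.id
    FROM profile p
    LEFT JOIN image i ON i.content_id = p.content_ptr_id
    ORDER BY p.content_ptr_id, i.id
""").fetchall()

# Grouped: every selected column is either in GROUP BY or inside an aggregate.
grouped = conn.execute("""
    SELECT p.content_ptr_id, p.body, p.web_site,
           COUNT(i.id)        AS image_count,
           GROUP_CONCAT(i.id) AS image_ids
    FROM profile p
    LEFT JOIN image i ON i.content_id = p.content_ptr_id
    GROUP BY p.content_ptr_id, p.body, p.web_site
    ORDER BY p.content_ptr_id
""").fetchall()

print(joined)
print(grouped)
```

The grouped query returns one row per profile, and COUNT(i.id) is 0 for a profile without images because COUNT skips NULLs.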
Remove the GROUP BY clause.
I suspect you didn't want to do a GROUP BY operation, given that the expression in the group by is the PRIMARY KEY of the content_profile table.
What is up with all the single quotes? Those are used to enclose string literals, not identifiers.
Thanks for sparing us from "scrolling endlessly to the right".
Are you aware that spaces and linebreaks can be included in the SQL text, without altering the meaning of the statement? The parser can easily deal with extra whitespace, and adding the extra whitespace to format the statement can make it much easier for a human reader to decipher.
It's not at all clear why the statement is skipping over the first ten rows, and then returning the next twelve. Very strange.
SELECT p.content_ptr_id AS `profile.content_ptr_id`
, p.body AS `profile.body`
, p.web_site AS `profile.web_site`
, p.email AS `profile.email`
, p.hours AS `profile.hours`
, p.price_range AS `profile.price_range`
, p.price_range_high AS `profile.price_range_high`
, p.primary_category_id AS `profile.primary_category_id`
, p.business_contact_email AS `profile.business_contact_email`
, p.business_contact_phone AS `profile.business_contact_phone`
, p.show_in_directory AS `profile.show_in_directory`
, i.id AS `image.id`
, i.content_id AS `image.content_id`
, i.type AS `image.type`
, i.order AS `image.order`
, i.caption AS `image.caption`
, i.author_id AS `image.author_id`
, i.image AS `image.image`
, i.link_url AS `image.link_url`
FROM `content_profile` p
LEFT
JOIN `content_image` i
ON i.content_id = p.content_ptr_id
ORDER
BY p.content_ptr_id
, i.id
Because content_id is not unique in the content_image table, duplicate rows from content_profile are the expected result.
If your code can't handle the "duplicate" rows, i.e. identifying when the row that was just fetched has the same value for content_ptr_id as the previous row, then your SQL shouldn't do a join operation that creates the duplicated values.

Unknown column in field list error

I am doing an inner join on a bunch of tables and I made sure that all of those tables exist. But every time I run my query, it always says:
Unknown column 'tbl_undertime.ut_date' in 'field list'
I am very sure that tbl_undertime is a table under my database but I don't know why it keeps on returning such error. I've already dropped the table and made a new table again with the same name and column but still gives me the same error.
Any help would be very much appreciated.
select tbl_employee.lname, tbl_employee.fname, tbl_employee.mi,
tbl_employee.sss_no,
tbl_employee.philhealth_no, tbl_employee.dept_id, tbl_employee.salaryperday,
tbl_earlyout.timeout_date, tbl_late.late_date, tbl_overtime.ot_date,
tbl_absent.absentdate,
tbl_leave.leave_type, tbl_leave.start_date, tbl_leave.end_date,
tbl_undertime.utdate, tbl_cashadv.cashadv_date,
tbl_pay15.gross_sal
from tbl_employee
inner join tbl_earlyout
on tbl_employee.empid = tbl_earlyout.empid
inner join tbl_late
on tbl_late.empid = tbl_overtime.empid
inner join tbl_overtime
on tbl_overtime.empid = tbl_absent.empid
inner join tbl_absent
on tbl_absent.empid = tbl_leave.empid
inner join tbl_leave
on tbl_leave.empid = tbl_undertime.empid
inner join tbl_cashadv
on tbl_cashadv.empid = tbl_pay15.empid;
+---------------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+---------+------+-----+---------+-------+
| ut_id | int(11) | NO | PRI | NULL | |
| empid | int(11) | YES | MUL | NULL | |
| utdate | date | YES | | NULL | |
| ut_mins_hours | double | YES | | NULL | |
+---------------+---------+------+-----+---------+-------+
Looking at your code, it would appear your table column is utdate, not ut_date. I would imagine a typo is your issue, hence Unknown column 'tbl_undertime.ut_date'.

MySQL - Select everything from one table, but only first matching value in second table

I'm feeling a little rusty with creating queries in MySQL. I thought I could solve this, but I'm having no luck and searching around doesn't result in anything similar...
Basically, I have two tables. I want to select everything from one table and the matching row from the second table. However, I only want to have the first result from the second table. I hope that makes sense.
The rows in the daily_entries table are unique. There will be one row for each day, but maybe not every day. The second table, notes, contains many rows, each of which is associated with ONE row from daily_entries.
Below are examples of my tables:
Table One
mysql> desc daily_entries;
+----------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+--------------+------+-----+---------+----------------+
| eid | int(11) | NO | PRI | NULL | auto_increment |
| date | date | NO | | NULL | |
| location | varchar(100) | NO | | NULL | |
+----------+--------------+------+-----+---------+----------------+
Table Two
mysql> desc notes;
+---------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+---------+------+-----+---------+----------------+
| task_id | int(11) | NO | PRI | NULL | auto_increment |
| eid | int(11) | NO | MUL | NULL | |
| notes | text | YES | | NULL | |
+---------+---------+------+-----+---------+----------------+
What I need to do, is select all entries from notes, with only one result from daily_entries.
Below is an example of how I want it to look:
+----------------------------------------------+---------+------------+----------+-----+
| notes | task_id | date | location | eid |
+----------------------------------------------+---------+------------+----------+-----+
| Another note | 3 | 2014-01-02 | Home | 2 |
| Enter a note. | 1 | 2014-01-01 | Away | 1 |
| This is a test note. To see what happens. | 2 | | Away | 1 |
| Testing another note | 4 | | Away | 1 |
+----------------------------------------------+---------+------------+----------+-----+
4 rows in set (0.00 sec)
Below is the query that I currently have:
SELECT notes.notes, notes.task_id, daily_entries.date, daily_entries.location, daily_entries.eid
FROM daily_entries
LEFT JOIN notes ON daily_entries.eid=notes.eid
ORDER BY daily_entries.date DESC
Below is an example of how it looks with my query:
+----------------------------------------------+---------+------------+----------+-----+
| notes | task_id | date | location | eid |
+----------------------------------------------+---------+------------+----------+-----+
| Another note | 3 | 2014-01-02 | Home | 2 |
| Enter a note. | 1 | 2014-01-01 | Away | 1 |
| This is a test note. To see what happens. | 2 | 2014-01-01 | Away | 1 |
| Testing another note | 4 | 2014-01-01 | Away | 1 |
+----------------------------------------------+---------+------------+----------+-----+
4 rows in set (0.00 sec)
At first I thought I could simply GROUP BY daily_entries.date, however that returned only the first row of each matching set. Can this even be done? I would greatly appreciate any help someone can offer. Using Limit at the end of my query obviously limited it to the value that I specified, but applied it to everything which was to be expected.
Basically, there's nothing wrong with your query. I believe it is exactly what you need because it is returning the data you want. You should not look at it as if it is duplicating your daily_entries; you should look at it as returning all notes, each with its associated daily_entry.
Of course, you can achieve what you described in your question (there's already an answer that solves this issue), but think twice before you do it, because such nested queries will add noticeable performance overhead to your database server.
I'd recommend keeping your query as simple as possible, with one single LEFT JOIN (which is all you need), and then letting consuming applications manipulate the data and present it the way they need to.
Use MySQL's non-standard GROUP BY functionality:
SELECT n.notes, n.task_id, de.date, de.location, de.eid
FROM notes n
LEFT JOIN (select * from
(select * from daily_entries ORDER BY date DESC) x
group by eid) de ON de.eid = n.eid
You need to do these queries with explicit filtering for the first row (the one with the lowest task_id). This example uses a join to do this:
SELECT n.notes, n.task_id, de.date, de.location, de.eid
FROM daily_entries de LEFT JOIN
notes n
ON de.eid = n.eid LEFT JOIN
(select n.eid, min(task_id) as min_task_id
from notes n
group by n.eid
) nmin
on n.task_id = nmin.min_task_id
ORDER BY de.date DESC;
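If the goal is the exact layout from the question (the date shown only next to the first note of each entry), one possible reading is to compare each note's task_id with the minimum task_id for its eid and blank the date otherwise. This is only a sketch under that assumption, run on an in-memory SQLite database via Python's sqlite3 module as a stand-in for MySQL. The entry_date alias and the join of nmin on eid are choices made for this example, not taken from the answers above.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_entries (eid INTEGER PRIMARY KEY, date TEXT, location TEXT);
    CREATE TABLE notes (task_id INTEGER PRIMARY KEY, eid INTEGER, notes TEXT);
    -- Rows taken from the sample output in the question.
    INSERT INTO daily_entries VALUES (1, '2014-01-01', 'Away'), (2, '2014-01-02', 'Home');
    INSERT INTO notes VALUES
        (1, 1, 'Enter a note.'),
        (2, 1, 'This is a test note. To see what happens.'),
        (3, 2, 'Another note'),
        (4, 1, 'Testing another note');
""")

rows = conn.execute("""
    SELECT n.notes,
           n.task_id,
           -- Show the date only on the first note (lowest task_id) of each entry.
           CASE WHEN n.task_id = nmin.min_task_id THEN de.date END AS entry_date,
           de.location,
           de.eid
    FROM daily_entries de
    LEFT JOIN notes n ON n.eid = de.eid
    LEFT JOIN (
        SELECT eid, MIN(task_id) AS min_task_id
        FROM notes
        GROUP BY eid
    ) nmin ON nmin.eid = de.eid
    ORDER BY de.date DESC, n.task_id
""").fetchall()

for row in rows:
    print(row)
```

Rows that are not the first note for their entry come back with NULL (None in Python) in the entry_date column, which the application can render as blank.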

Get distinct results from several tables

I need to implement a MySQL query to calculate the space used by a user's mailbox.
A message thread may have multiple messages (reply, follow up) by 2 parties
(sender/recipient) and is tagged with one or more tags (Inbox, Sent etc.).
The following conditions have to be met:
a) user is either recipient OR author of the message;
b) message IS TAGGED by any of the tags: 1,2,3,4;
c) distinct records only, ie if the thread, containing messages is tagged with
more than one of the 4 tags (for example 1 and 4: Inbox and Sent) the calculation
is done on one tag only
I have tried the following query but I am not able to get distinct values - the
subject/body values are duplicated:
SELECT SUM(LENGTH(subject)+LENGTH(body)) AS sum
FROM om_msg_message omm
JOIN om_msg_index omi ON omm.mid = omi.mid
JOIN om_msg_tags_index omti ON omi.thread_id = omti.thread_id AND omti.uid = user_id
WHERE (omi.recipient = user_id OR omi.author = user_id) AND omti.tag_id IN (1,2,3,4)
GROUP BY omi.mid;
Structure of the tables:
om_msg_message - fields subject and body are the ones to be calculated
+--------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+------------------+------+-----+---------+----------------+
| mid | int(10) unsigned | NO | PRI | NULL | auto_increment |
| subject | varchar(255) | NO | | NULL | |
| body | longtext | NO | | NULL | |
| timestamp | int(10) unsigned | NO | | NULL | |
| reply_to_mid | int(10) unsigned | NO | | 0 | |
+--------------+------------------+------+-----+---------+----------------+
om_msg_index
+-----+-----------+-----------+--------+--------+---------+
| mid | thread_id | recipient | author | is_new | deleted |
+-----+-----------+-----------+--------+--------+---------+
| 1 | 1 | 1392 | 1211 | 0 | 0 |
| 2 | 1 | 1211 | 1392 | 1 | 0 |
+-----+-----------+-----------+--------+--------+---------+
om_msg_tags_index
+--------+------+-----------+
| tag_id | uid | thread_id |
+--------+------+-----------+
| 1 | 1211 | 1 |
| 4 | 1211 | 1 |
| 1 | 1392 | 1 |
| 4 | 1392 | 1 |
+--------+------+-----------+
Here's another solution:
SELECT SUM(LENGTH(omm.subject) + LENGTH(omm.body)) as totalLength
FROM om_msg_message omm
JOIN om_msg_index omi
ON omi.mid = omm.mid
AND (omi.recipient = user_id OR omi.author = user_id)
JOIN (SELECT DISTINCT thread_id
FROM om_msg_tags_index
WHERE uid = user_id
AND tag_id IN (1, 2, 3, 4)) omti
ON omti.thread_id = omi.thread_id
I'm assuming that:
user_id is a parameter marker/host variable, being queried for an individual user.
You want the total of all messages per user, not the total length of each message (which is what the GROUP BY clause in your version was getting you).
That mid in both om_msg_message and om_msg_index is unique.
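To illustrate the double counting and how the DISTINCT derived table avoids it, here is a small sketch using the index and tag rows from the question in an in-memory SQLite database (Python's sqlite3 module, used as a stand-in for MySQL). The subject and body texts are invented, and user_id is bound as a named parameter with the value 1211. Note that LENGTH counts characters in SQLite but bytes in MySQL; with ASCII-only text the totals are the same.

```python
import sqlite3

USER_ID = 1211  # example user taken from the sample rows in the question

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE om_msg_message (mid INTEGER PRIMARY KEY, subject TEXT, body TEXT);
    CREATE TABLE om_msg_index (mid INTEGER, thread_id INTEGER, recipient INTEGER, author INTEGER);
    CREATE TABLE om_msg_tags_index (tag_id INTEGER, uid INTEGER, thread_id INTEGER);
    -- Subjects and bodies are invented; index and tag rows follow the question.
    INSERT INTO om_msg_message VALUES (1, 'Hi', 'Hello'), (2, 'Re: Hi', 'Hello back');
    INSERT INTO om_msg_index VALUES (1, 1, 1392, 1211), (2, 1, 1211, 1392);
    INSERT INTO om_msg_tags_index VALUES (1, 1211, 1), (4, 1211, 1), (1, 1392, 1), (4, 1392, 1);
""")

# Joining the tag table directly yields one row per (message, tag) pair,
# so a thread carrying two tags is counted twice.
naive = conn.execute("""
    SELECT SUM(LENGTH(omm.subject) + LENGTH(omm.body))
    FROM om_msg_message omm
    JOIN om_msg_index omi ON omi.mid = omm.mid
    JOIN om_msg_tags_index omti
      ON omti.thread_id = omi.thread_id AND omti.uid = :uid
    WHERE (omi.recipient = :uid OR omi.author = :uid)
      AND omti.tag_id IN (1, 2, 3, 4)
""", {"uid": USER_ID}).fetchone()[0]

# Collapsing the tags to DISTINCT thread_id first keeps one row per message.
deduped = conn.execute("""
    SELECT SUM(LENGTH(omm.subject) + LENGTH(omm.body))
    FROM om_msg_message omm
    JOIN om_msg_index omi
      ON omi.mid = omm.mid
     AND (omi.recipient = :uid OR omi.author = :uid)
    JOIN (SELECT DISTINCT thread_id
          FROM om_msg_tags_index
          WHERE uid = :uid
            AND tag_id IN (1, 2, 3, 4)) omti
      ON omti.thread_id = omi.thread_id
""", {"uid": USER_ID}).fetchone()[0]

print(naive, deduped)
```

With two tags on the thread, the naive total comes out at exactly twice the deduplicated one.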
So, your problem is the IN clause. I'm not a MySQL guru, but in T-SQL you could change it to a WHERE clause with an EXISTS subquery so that your join doesn't produce two rows. You need to compensate for the fact that you have two rows with different tag IDs associated with each row of your primary join data.
The way I could do it cross-platform would be with four left-joins that linked tables then demanded a non-null value for 1, 2, 3, or 4. Fairly inefficient; I'm sure there's a better way to do it in MySQL, but now that you know what the problem is you might know a better solution.