A friend of mine asked a question a while ago. There is an answer under it, and it is good, but it does not fit my case. The idea of that solution is to join the table to itself. That seems expensive and inefficient for me, because my real query already has four joins on this table (votes, favorites, comments, viewed).
Now I want to know how I can do that using a CASE expression. Something like this:
... ORDER BY Type, CASE WHEN AcceptedAnswerId = Id THEN 1 ELSE 0, timestamp
Or is there any better solution?
For readability, I have pasted the examples here:
I have a table like this:
// Mytable
+----+--------------------+------+------------------+-----------+
| Id | QuestionOrAnswer | Type | AcceptedAnswerId | timestamp |
+----+--------------------+------+------------------+-----------+
| 1 | question1 | 0 | 3 | 1 |
| 2 | answer1 | 1 | NULL | 2 |
| 3 | answer2 | 1 | NULL | 3 | -- accepted answer
| 4 | answer3 | 1 | NULL | 4 |
+----+--------------------+------+------------------+-----------+
Now I want this result: (please focus on the order)
+----+--------------------+------+------------------+-----------+
| Id | QuestionOrAnswer | Type | AcceptedAnswerId | timestamp |
+----+--------------------+------+------------------+-----------+
| 1 | question1 | 0 | 3 | 1 |
| 3 | answer2 | 1 | NULL | 3 | -- accepted answer
| 2 | answer1 | 1 | NULL | 2 |
| 4 | answer3 | 1 | NULL | 4 |
+----+--------------------+------+------------------+-----------+
// ^ 0 means question and 1 means answer
CASE would work, but you are missing the END. But in this case, you could also just use IF(AcceptedAnswerId = Id,1,0).
In the simple case you show, you could just do:
order by type, if(type=0, (@accepted:=acceptedanswerid), id<>@accepted), timestamp
but I don't know if that would work in your real case.
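A minimal sketch of the CASE ... END idea, run against SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption; the ORDER BY itself is plain SQL). Because AcceptedAnswerId lives on the question row, the CASE compares each Id against a scalar subquery rather than against the row's own AcceptedAnswerId. It assumes the result set contains exactly one question, as in the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Mytable (Id INTEGER, QuestionOrAnswer TEXT, Type INTEGER,
                      AcceptedAnswerId INTEGER, timestamp INTEGER);
INSERT INTO Mytable VALUES
  (1, 'question1', 0, 3,    1),
  (2, 'answer1',   1, NULL, 2),
  (3, 'answer2',   1, NULL, 3),
  (4, 'answer3',   1, NULL, 4);
""")

# Question first (Type 0), then the accepted answer, then the rest by timestamp.
rows = conn.execute("""
SELECT Id, QuestionOrAnswer
FROM Mytable
ORDER BY Type,
         CASE WHEN Id = (SELECT AcceptedAnswerId FROM Mytable WHERE Type = 0)
              THEN 0 ELSE 1 END,
         timestamp
""").fetchall()

ordered_ids = [r[0] for r in rows]
print(ordered_ids)
```

With several questions in one result set the subquery would need to be correlated on whatever column links an answer to its question.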
Given the table definition (without proper indices) + sample data
CREATE TABLE Table1
(`Id` int, `QuestionOrAnswer` varchar(9), `Type` int, `AcceptedAnswerId` varchar(4), `related` int NOT NULL, `timestamp` int)
;
INSERT INTO Table1
(`Id`, `QuestionOrAnswer`, `Type`, `AcceptedAnswerId`, `related`, `timestamp`)
VALUES
(1, 'question1', 0, '3', 1, 1),
(2, 'answer1', 1, NULL, 1, 2),
(3, 'answer2', 1, NULL, 1, 3),
(4, 'answer3', 1, NULL, 1, 4)
you can use the query
SELECT
t2.*
FROM
table1 as t1
JOIN
table1 as t2
ON
t1.related=t2.related
WHERE
t1.related = 1
AND t1.Type = 0
ORDER BY
t2.Type, t2.Id=t1.AcceptedAnswerId desc, t2.Id
to get the question/answer set of a specific question (t1.related = 1 <- adjust that parameter for other questions).
And no, with the right indices this query is not "expensive".
example at http://sqlfiddle.com/#!9/24954/4 (yeah, took me 4 attempts to get it right, grrrrr)
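To check the self-join idea outside the fiddle, here is a small sketch using SQLite via Python's sqlite3 (a stand-in for MySQL). The index name idx_related_type and its column choice are my own assumptions about what "the right indices" could look like, not something taken from the answer, and AcceptedAnswerId is stored as an integer here for simplicity. The ORDER BY is written so the question comes first, then the accepted answer, then the remaining answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (Id INTEGER, QuestionOrAnswer TEXT, Type INTEGER,
                     AcceptedAnswerId INTEGER, related INTEGER NOT NULL,
                     timestamp INTEGER);
INSERT INTO Table1 VALUES
  (1, 'question1', 0, 3,    1, 1),
  (2, 'answer1',   1, NULL, 1, 2),
  (3, 'answer2',   1, NULL, 1, 3),
  (4, 'answer3',   1, NULL, 1, 4);
-- hypothetical index: lets the engine find one question's rows directly
CREATE INDEX idx_related_type ON Table1 (related, Type);
""")

# t1 is the question row, t2 is every row belonging to that question.
rows = conn.execute("""
SELECT t2.Id
FROM Table1 AS t1
JOIN Table1 AS t2 ON t1.related = t2.related
WHERE t1.related = 1
  AND t1.Type = 0
ORDER BY t2.Type, (t2.Id = t1.AcceptedAnswerId) DESC, t2.Id
""").fetchall()

joined_ids = [r[0] for r in rows]
print(joined_ids)
```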
Related
I have two tables, first one is 'file_details':
+---------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+-------------+------+-----+---------+-------+
| file_name | varchar(40) | YES | | NULL | |
| creation_date | date | YES | | NULL | |
+---------------+-------------+------+-----+---------+-------+
and second one is 'logs':
+-----------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| sl_no | varchar(20) | YES | | NULL | |
| file_name | varchar(40) | YES | | NULL | |
| status | varchar(100) | YES | | NULL | |
+-----------+--------------+------+-----+---------+-------+
values in the tables are:
file_details:
+-----------+---------------+
| file_name | creation_date |
+-----------+---------------+
| a1 | 2020-01-09 |
| a2 | 2020-01-08 |
+-----------+---------------+
logs:
+-------+-----------+---------+
| sl_no | file_name | status |
+-------+-----------+---------+
| 1 | a1 | created |
| 2 | a1 | step1 |
| 1 | a2 | created |
| 2 | a2 | step1 |
| 3 | a2 | step2 |
+-------+-----------+---------+
now I want to retrieve the following data:
+-----------+---------------+--------+
| file_name | creation_date | status |
+-----------+---------------+--------+
| a1 | 2020-01-09 | step1 |
| a2 | 2020-01-08 | step2 |
+-----------+---------------+--------+
using the below query:
select f.file_name,f.creation_date,
l.status
from file_details f
inner join logs l on f.file_name=l.file_name
and l.status=(select status
from logs
where sl_no=(
select max(convert(sl_no,unsigned))
from logs));
But the above query gives me the below output:
+-----------+---------------+--------+
| file_name | creation_date | status |
+-----------+---------------+--------+
| a2 | 2020-01-08 | step2 |
+-----------+---------------+--------+
which is not a required solution. So, please help me out.
There are a couple of things to discuss here. You mentioned in the comments that you are new to SQL, so I will provide some links to look at. The first is normalization, which is used to reduce data redundancy (which you have with your status descriptions).
Also, what you are trying to do is essentially make the engine "guess" which status is the most up-to-date one. Using MAX the way you have only deals with alphabetical order, so it does not scale if you want to add a status such as "completed". You would have to hard-code the order in something like a CASE statement, which gets really messy with multiple conditions.
Lastly, here is a tutorial site on SELECT query basics, with links to other data manipulation commands.
For the answer I came up with, I made a status table to store the description, and the log table stores the status_id. This addresses the normalization issue I mentioned earlier. Creating this table also lets me assign each status a rank to order by, which addresses the other issue I discussed.
SELECT t.file_name,
t.creation_date,
s.description
FROM status_details s
JOIN (SELECT f.file_name,
f.creation_date,
MAX(s2.rank_no) rank_no
FROM file_details f
JOIN logs l
ON l.file_name = f.file_name
JOIN status_details s2
ON s2.status_id = l.status_id
GROUP BY f.file_name,
f.creation_date) t
ON t.rank_no = s.rank_no
Now, I don't want you to blindly copy this query without understanding what it is doing. The general gist is that the inner select gets each file name and creation date together with the highest status rank number logged for that file. The outer select then joins that result back onto the status table to get the status description for that rank number. This gives the output:
file_name creation_date description
a1 2020-01-09 step1
a2 2020-01-08 step2
If you would like to see the query working I have created a fiddle for you to try.
These are the data scripts I used to create the environment:
create table file_details (file_name varchar(40), creation_date date);
create table logs (sl_no varchar(20), file_name varchar(40), status_id int);
create table status_details (status_id int, description varchar(100), rank_no int);
insert into file_details values ('a1', '2020-01-09');
insert into file_details values ('a2', '2020-01-08');
insert into status_details values (1, 'created', 1);
insert into status_details values (2, 'step1', 2);
insert into status_details values (3, 'step2', 3);
insert into logs values ('1', 'a1', 1);
insert into logs values ('2', 'a1', 2);
insert into logs values ('1', 'a2', 1);
insert into logs values ('2', 'a2', 2);
insert into logs values ('3', 'a2', 3);
The subquery select max(convert(sl_no,unsigned)) from logs returns 3 in your example, because it looks at the whole logs table rather than at each file, so the join only matches:
+-------+-----------+---------+
| sl_no | file_name | status |
+-------+-----------+---------+
| 3 | a2 | step2 |
+-------+-----------+---------+
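One way to get the latest status per file on the original tables is to make the MAX subquery correlated, so it is evaluated per file_name instead of once for the whole table. The sketch below uses SQLite through Python's sqlite3 as a stand-in for MySQL; CAST(sl_no AS INTEGER) plays the role of convert(sl_no,unsigned) there. It assumes sl_no increases with each new status of a file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file_details (file_name TEXT, creation_date TEXT);
CREATE TABLE logs (sl_no TEXT, file_name TEXT, status TEXT);
INSERT INTO file_details VALUES ('a1', '2020-01-09'), ('a2', '2020-01-08');
INSERT INTO logs VALUES
  ('1', 'a1', 'created'), ('2', 'a1', 'step1'),
  ('1', 'a2', 'created'), ('2', 'a2', 'step1'), ('3', 'a2', 'step2');
""")

# The inner MAX is restricted to the current file, not the whole logs table.
latest = conn.execute("""
SELECT f.file_name, f.creation_date, l.status
FROM file_details f
JOIN logs l ON l.file_name = f.file_name
WHERE CAST(l.sl_no AS INTEGER) = (
        SELECT MAX(CAST(l2.sl_no AS INTEGER))
        FROM logs l2
        WHERE l2.file_name = f.file_name)
ORDER BY f.file_name
""").fetchall()

print(latest)
```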
Per the example data below, I need a query that returns every row, where if the 'contingent_on' field is NULL, it is returned as NULL, but if it is not NULL it is returned with the 'ticket_name' corresponding to the 'primary_key' value.
I tried self join queries but could only get them to return the not NULL rows.
example table data:
primary_key | ticket_name | contingent_on
1 | site preparation | NULL
2 | tender process | NULL
3 | construction | 1
All rows should be returned, and in the 'construction' row, 'site preparation' should appear in place of '1' in the 'contingent_on' field.
You need a self left join:
select
t.primary_key,
t.ticket_name,
tt.ticket_name ticket_name2
from tablename t left join tablename tt
on tt.primary_key = t.contingent_on
order by t.primary_key
See the demo.
Results:
| primary_key | ticket_name | ticket_name2 |
| ----------- | ---------------- | ---------------- |
| 1 | site preparation | null |
| 2 | tender process | null |
| 3 | construction | site preparation |
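For anyone who wants to reproduce this locally, here is a small sketch of the same self left join on the sample data, using SQLite via Python's sqlite3 (an assumption, in place of MySQL). The table name tablename follows the query above. The LEFT JOIN is what keeps rows 1 and 2, whose contingent_on is NULL and therefore matches nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tablename (primary_key INTEGER, ticket_name TEXT, contingent_on INTEGER);
INSERT INTO tablename VALUES
  (1, 'site preparation', NULL),
  (2, 'tender process',   NULL),
  (3, 'construction',     1);
""")

# t is the ticket itself, tt is the ticket it is contingent on (if any).
tickets = conn.execute("""
SELECT t.primary_key, t.ticket_name, tt.ticket_name AS ticket_name2
FROM tablename t
LEFT JOIN tablename tt ON tt.primary_key = t.contingent_on
ORDER BY t.primary_key
""").fetchall()

print(tickets)
```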
This looks like a simple query:
select
t.primary_key,
t.ticket_name,
case when t.contingent_on is not null
then (select p.ticket_name from <<your_table>> p where p.primary_key = t.contingent_on)
else null end as contingent_on
from <<your_table>> t
order by t.primary_key
Given the following table:
+--------+-------------------+-----------+
| ID | Name | Priority |
+--------+-------------------+-----------+
| 1 | Andy | 1 |
| 2 | Bob | 2 |
| 3 | David | 8 |
| 4 | Edward | 9 |
| 5 | CHARLES | 15 |
+--------+-------------------+-----------+
I would like to move CHARLES to between Bob and David by Priority value (ignore the alphabetical list, this is just to make the desired result obvious).
(Also note the Priority values may not be sequential)
To do this I need to change CHARLES' current Priority (15) to Bob's Priority+1, and update David and Edward's Priority to Priority+1.
I can DO this if I know two things, the id of CHARLES and the Priority value of the row he must be after (Bob):
UPDATE mytable SET Priority =
IF(ID = :charles_id, :bob_priority + 1,
IF(Priority >= :bob_priority,
Priority + 1, Priority))
The PROBLEM or at least question is, how could I compress the resulting values to 1,2,3,4,5 instead of 1,2,3,9,10 - and do it in one shot?
Oracle has a "pseudo field" which is the index of the row, but I don't know of anything equivalent in MySQL.
The first part of the problem is fairly trivial...
DROP TABLE IF EXISTS priorities;
CREATE TABLE priorities
(ID SERIAL PRIMARY KEY
,Name VARCHAR(12) NOT NULL
,Priority INT NOT NULL
,INDEX(priority)
);
INSERT INTO priorities VALUES
(101,'Andy',1),
(108,'Bob',2),
(113,'David',8),
(124,'Edward',9),
(155,'CHARLES',15);
UPDATE priorities a
JOIN
( SELECT x.id,x.name, @i:=@i+1 priority FROM priorities x, (SELECT @i:=0) vars ORDER BY id) b
ON b.id = a.id
SET a.priority = b.priority;
SELECT * FROM priorities
+-----+---------+----------+
| ID | Name | Priority |
+-----+---------+----------+
| 101 | Andy | 1 |
| 108 | Bob | 2 |
| 113 | David | 3 |
| 124 | Edward | 4 |
| 155 | CHARLES | 5 |
+-----+---------+----------+
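As a hedged alternative to user variables, the move and the renumbering can be combined by ranking on a temporary sort key: CHARLES gets Bob's priority plus 0.5, everyone else keeps their current priority, and ROW_NUMBER() turns that order into 1..n. The sketch runs on SQLite via Python's sqlite3 and needs window functions (SQLite 3.25+; MySQL has ROW_NUMBER() from 8.0). The parameter names charles_id and bob_priority mirror the placeholders in the question, and it assumes priorities are distinct integers so that +0.5 always lands in a gap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE priorities (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL,
                         Priority INTEGER NOT NULL);
INSERT INTO priorities VALUES
  (101, 'Andy', 1), (108, 'Bob', 2), (113, 'David', 8),
  (124, 'Edward', 9), (155, 'CHARLES', 15);
""")

params = {"charles_id": 155, "bob_priority": 2}  # values taken from the sample data

# Rank every row by its sort key; the moved row sorts just after Bob.
new_order = conn.execute("""
SELECT ID,
       ROW_NUMBER() OVER (
           ORDER BY CASE WHEN ID = :charles_id THEN :bob_priority + 0.5
                         ELSE Priority END
       ) AS new_priority
FROM priorities
""", params).fetchall()

# Write the compressed priorities back inside one transaction.
with conn:
    conn.executemany("UPDATE priorities SET Priority = ? WHERE ID = ?",
                     [(prio, row_id) for row_id, prio in new_order])

result = conn.execute(
    "SELECT Name, Priority FROM priorities ORDER BY Priority").fetchall()
print(result)
```

In MySQL 8.0 the same ranking query could probably be used as the derived table in an UPDATE ... JOIN, like the one above, to keep it to a single statement; I have not tested that variant here.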
I need to write a MySQL query to calculate the space used by a user's mailbox.
A message thread may have multiple messages (reply, follow up) by 2 parties
(sender/recipient) and is tagged with one or more tags (Inbox, Sent etc.).
The following conditions have to be met:
a) user is either recipient OR author of the message;
b) message IS TAGGED by any of the tags: 1,2,3,4;
c) distinct records only, ie if the thread, containing messages is tagged with
more than one of the 4 tags (for example 1 and 4: Inbox and Sent) the calculation
is done on one tag only
I have tried the following query but I am not able to get distinct values - the
subject/body values are duplicated:
SELECT SUM(LENGTH(subject)+LENGTH(body)) AS sum
FROM om_msg_message omm
JOIN om_msg_index omi ON omm.mid = omi.mid
JOIN om_msg_tags_index omti ON omi.thread_id = omti.thread_id AND omti.uid = user_id
WHERE (omi.recipient = user_id OR omi.author = user_id) AND omti.tag_id IN (1,2,3,4)
GROUP BY omi.mid;
Structure of the tables:
om_msg_message - fields subject and body are the ones to be calculated
+--------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+------------------+------+-----+---------+----------------+
| mid | int(10) unsigned | NO | PRI | NULL | auto_increment |
| subject | varchar(255) | NO | | NULL | |
| body | longtext | NO | | NULL | |
| timestamp | int(10) unsigned | NO | | NULL | |
| reply_to_mid | int(10) unsigned | NO | | 0 | |
+--------------+------------------+------+-----+---------+----------------+
om_msg_index
+-----+-----------+-----------+--------+--------+---------+
| mid | thread_id | recipient | author | is_new | deleted |
+-----+-----------+-----------+--------+--------+---------+
| 1 | 1 | 1392 | 1211 | 0 | 0 |
| 2 | 1 | 1211 | 1392 | 1 | 0 |
+-----+-----------+-----------+--------+--------+---------+
om_msg_tags_index
+--------+------+-----------+
| tag_id | uid | thread_id |
+--------+------+-----------+
| 1 | 1211 | 1 |
| 4 | 1211 | 1 |
| 1 | 1392 | 1 |
| 4 | 1392 | 1 |
+--------+------+-----------+
Here's another solution:
SELECT SUM(LENGTH(omm.subject) + LENGTH(omm.body)) as totalLength
FROM om_msg_message omm
JOIN om_msg_index omi
ON omi.mid = omm.mid
AND (omi.recipient = user_id OR omi.author = user_id)
JOIN (SELECT DISTINCT thread_id
FROM om_msg_tags_index
WHERE uid = user_id
AND tag_id IN (1, 2, 3, 4)) omti
ON omti.thread_id = omi.thread_id
I'm assuming that:
user_id is a parameter marker/host variable, being queried for an individual user.
You want the total of all messages per user, not the total length of each message (which is what the GROUP BY clause in your version was getting you).
That mid in both om_msg_message and om_msg_index is unique.
So, your problem is the IN clause. I'm not a MySQL guru, but in T-SQL you could move it into a WHERE clause with an EXISTS subquery so your join doesn't produce two rows. You need to compensate for the fact that two rows with different tag IDs are associated with each row of your primary join data.
The way I could do it cross-platform would be with four left-joins that linked tables then demanded a non-null value for 1, 2, 3, or 4. Fairly inefficient; I'm sure there's a better way to do it in MySQL, but now that you know what the problem is you might know a better solution.
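The EXISTS idea can be sketched like this, using SQLite via Python's sqlite3 as a stand-in for MySQL. The index and tag rows come from the question; the two message rows (subjects and bodies) are made up for illustration, since the question does not show any. The plain join counts each message once per matching tag, while the EXISTS version only asks whether at least one matching tag exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE om_msg_message (mid INTEGER PRIMARY KEY, subject TEXT, body TEXT);
CREATE TABLE om_msg_index (mid INTEGER, thread_id INTEGER,
                           recipient INTEGER, author INTEGER);
CREATE TABLE om_msg_tags_index (tag_id INTEGER, uid INTEGER, thread_id INTEGER);
-- made-up messages: lengths are 5+3 and 9+6, so 23 in total
INSERT INTO om_msg_message VALUES (1, 'hello', 'abc'), (2, 're: hello', 'abcdef');
INSERT INTO om_msg_index VALUES (1, 1, 1392, 1211), (2, 1, 1211, 1392);
INSERT INTO om_msg_tags_index VALUES (1, 1211, 1), (4, 1211, 1),
                                     (1, 1392, 1), (4, 1392, 1);
""")
user = {"uid": 1211}

# Plain join: every message is repeated once per matching tag row.
duplicated = conn.execute("""
SELECT SUM(LENGTH(omm.subject) + LENGTH(omm.body))
FROM om_msg_message omm
JOIN om_msg_index omi ON omi.mid = omm.mid
JOIN om_msg_tags_index omti
  ON omti.thread_id = omi.thread_id AND omti.uid = :uid
WHERE (omi.recipient = :uid OR omi.author = :uid)
  AND omti.tag_id IN (1, 2, 3, 4)
""", user).fetchone()[0]

# EXISTS: the tag table only filters, it never multiplies rows.
total = conn.execute("""
SELECT SUM(LENGTH(omm.subject) + LENGTH(omm.body))
FROM om_msg_message omm
JOIN om_msg_index omi ON omi.mid = omm.mid
WHERE (omi.recipient = :uid OR omi.author = :uid)
  AND EXISTS (SELECT 1 FROM om_msg_tags_index omti
              WHERE omti.thread_id = omi.thread_id
                AND omti.uid = :uid
                AND omti.tag_id IN (1, 2, 3, 4))
""", user).fetchone()[0]

print(duplicated, total)
```

Note that LENGTH counts bytes in MySQL and characters in SQLite; for the ASCII sample text the two agree.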
Previous Related Posts:
MySQL: how to convert to EAV?
MySQL: how to convert to EAV - Part 2?
Given a table:
TABLE: foo
===============================
| id | first_name | last_name |
===============================
| 1 | John | Doe |
| 2 | Jane | Smith |
| 3 | Ronald | McDonald |
-------------------------------
How do I take this table and convert it to these tables (an EAV implementation)?:
TABLE: attribute
===========================
| id | fk_id | attribute |
===========================
| 1 | 100 | first_name |
| 2 | 100 | last_name |
---------------------------
TABLE: value
=========================================
| id | attribute_id | row_id | value |
=========================================
| 1 | 1 | 1 | John |
| 2 | 2 | 1 | Doe |
| 3 | 1 | 2 | Jane |
| 4 | 2 | 2 | Smith |
| 5 | 1 | 3 | Ronald |
| 6 | 2 | 3 | McDonald |
-----------------------------------------
NOTES:
attribute.fk_id will be provided.
value.row_id is used to identify how the values are grouped as records in the original table.
UPDATE: Also, how do I query the EAV tables so that I can make it look like table foo again.
I give +1 to @Phil's solution for populating the EAV table. Insert one attribute at a time.
Here's another solution to reverse an EAV transformation:
SELECT v.row_id AS id,
MAX(IF(a.attribute='first_name',v.value,NULL)) AS first_name,
MAX(IF(a.attribute='last_name',v.value,NULL)) AS last_name
FROM value v INNER JOIN attribute a
ON v.attribute_id = a.id
GROUP BY v.row_id
Except that by using EAV, you've put all your values into a column of VARCHAR(255) or whatever, so you have lost information about the respective data types of the original columns.
There's really no way to do it "dynamically" without hard-coding the attribute names, either as joins as @Phil shows, or as columns as I show. It's essentially the same problem as trying to write dynamic pivot queries.
I have written more about EAV in my presentation Practical Object-Oriented Models in SQL and in my book, SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.
I think your only hope is if you use the foo table. bar is essentially useless without the ID.
Try something like this (assuming attribute.id is an auto-increment primary key)
INSERT INTO `attribute` (`fk_id`, `attribute`)
VALUES (100, 'first_name');
INSERT INTO `value` (`attribute_id`, `row_id`, `value`)
SELECT LAST_INSERT_ID(), `id`, `first_name`
FROM `foo`;
INSERT INTO `attribute` (`fk_id`, `attribute`)
VALUES (100, 'last_name');
INSERT INTO `value` (`attribute_id`, `row_id`, `value`)
SELECT LAST_INSERT_ID(), `id`, `last_name`
FROM `foo`;
To reconstruct the foo table, try this
SELECT
`fn`.`row_id` AS `id`,
`fn`.`value` AS `first_name`,
`ln`.`value` AS `last_name`
FROM `value` `fn`
INNER JOIN `attribute` `fn_a`
ON `fn`.`attribute_id` = `fn_a`.`id`
AND `fn_a`.`attribute` = 'first_name'
INNER JOIN `value` `ln`
ON `fn`.`row_id` = `ln`.`row_id`
INNER JOIN `attribute` `ln_a`
ON `ln`.`attribute_id` = `ln_a`.`id`
AND `ln_a`.`attribute` = 'last_name'
Ergh, thanks for reminding me why I hate this pattern
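For completeness, a round-trip sketch of both directions: populate the EAV tables one attribute at a time, then pivot back with conditional aggregation. It runs on SQLite via Python's sqlite3 as a stand-in for MySQL, so cursor.lastrowid takes the place of LAST_INSERT_ID() and CASE takes the place of IF(). The fk_id of 100 is the value given in the question; the attribute names are still hard-coded, as discussed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE foo (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
INSERT INTO foo VALUES (1, 'John', 'Doe'), (2, 'Jane', 'Smith'),
                       (3, 'Ronald', 'McDonald');
CREATE TABLE attribute (id INTEGER PRIMARY KEY AUTOINCREMENT,
                        fk_id INTEGER, attribute TEXT);
CREATE TABLE "value" (id INTEGER PRIMARY KEY AUTOINCREMENT,
                      attribute_id INTEGER, row_id INTEGER, value TEXT);
""")

# foo -> EAV: one attribute row per column, then one value row per foo row.
for column in ("first_name", "last_name"):  # fixed list, safe to interpolate
    cur = conn.execute(
        "INSERT INTO attribute (fk_id, attribute) VALUES (100, ?)", (column,))
    conn.execute(
        f'INSERT INTO "value" (attribute_id, row_id, value) '
        f'SELECT ?, id, "{column}" FROM foo', (cur.lastrowid,))

# EAV -> foo: pick each attribute's value per row_id with MAX(CASE ...).
rebuilt = conn.execute("""
SELECT v.row_id AS id,
       MAX(CASE WHEN a.attribute = 'first_name' THEN v.value END) AS first_name,
       MAX(CASE WHEN a.attribute = 'last_name'  THEN v.value END) AS last_name
FROM "value" v
JOIN attribute a ON v.attribute_id = a.id
GROUP BY v.row_id
ORDER BY v.row_id
""").fetchall()

original = conn.execute(
    "SELECT id, first_name, last_name FROM foo ORDER BY id").fetchall()
print(rebuilt == original)
```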