Need help with a MySQL SELECT (within-group aggregate) - mysql

Background
I have a table of comments. Each comment belongs to a specific row in the 'lead' table or the 'prod' table. There can be multiple comments for each row in the 'lead' and 'prod' tables. There can also be multiple comments with the same 'kstatus' for each row.
comments
INT id /* primary key */
ENUM('lead', 'prod') type /* tells me whether the comment belongs to a row in the lead table or the prod table */
INT fid /* foreign key, refers to the id column in either the lead or prod table */
MEDIUMTEXT comment /* the comment itself */
INT timestamp /* when the comment was written, i.e. 1300530201 */
INT kstatus /* foreign key, refers to the id column in the kstatus table */
lead
INT id
...
prod
INT id
...
kstatus
INT id
...
A comment with type = 'lead' and fid = 105 refers to the row in lead with lead.id = 105.
A comment with type = 'prod' and fid = 105 refers to the row in prod with prod.id = 105.
Question
I want to SELECT the first comment given (if any, and based on timestamp) for each row in the 'lead' table with a certain kstatus (say 14).
Here is my attempt, which returns way too few results. Maybe due to mixing of fid's referring to lead.id and fid's referring to prod.id?
SELECT C1.*
FROM comments AS C1
LEFT JOIN comments AS C2 ON (C1.timestamp > C2.timestamp AND C1.fid = C2.fid)
WHERE C2.timestamp is NULL
AND C1.type = 'lead'
AND C1.kstatus = 14
GROUP BY C1.fid

select c2.*
from (
select min(c.id) realfirstcomment
from (
select fid, min(timestamp) as firstcomment
from comments
where type='lead' and kstatus=14
group by fid
) f, comments c
where f.fid = c.fid
and c.type='lead' and c.kstatus=14
and f.firstcomment = c.timestamp
group by c.fid
) f2, comments c2
where f2.realfirstcomment = c2.id

You are losing results not only because leads and products get mixed, but because kstatus values get mixed as well: your left join finds all rows in C2 with an earlier timestamp, regardless of type and kstatus, as long as the fid is the same.
One solution would be to add AND C1.type = C2.type AND C1.kstatus = C2.kstatus to the join condition.
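As a minimal sketch of that suggestion, the original attempt could look like the query below once the extra join conditions are added. It assumes the comments table described in the question; the GROUP BY is dropped here because the anti-join already leaves one row per fid, unless two comments share the same earliest timestamp, in which case both are returned.

```sql
-- Earliest comment per lead, restricted to kstatus = 14.
-- C2 only matches earlier comments of the same fid, type and kstatus,
-- so prod comments and other kstatus values can no longer hide a result.
SELECT C1.*
FROM comments AS C1
LEFT JOIN comments AS C2
       ON C2.fid = C1.fid
      AND C2.type = C1.type
      AND C2.kstatus = C1.kstatus
      AND C2.timestamp < C1.timestamp
WHERE C2.id IS NULL          -- no earlier matching comment exists
  AND C1.type = 'lead'
  AND C1.kstatus = 14;
```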

Related

why does index not work as expected in mysql?

I really want to know why my index is not working.
I have two tables, post and post_log.
create table post
(
id int auto_increment
primary key,
comment int null,
is_used tinyint(1) default 1 not null,
is_deleted tinyint(1) default 0 not null
);
create table post_log
(
id int auto_increment
primary key,
post_id int not null,
created_at datetime not null,
user int null,
constraint post_log_post_id_fk
foreign key (post_id) references post (id)
);
create index post_log_created_at_index
on post_log (created_at);
When I run the query below, the created_at index is used as expected.
explain
SELECT *
FROM post p
INNER JOIN post_log pl ON p.id = pl.post_id
WHERE pl.created_at > DATE('2022-06-01')
AND pl.created_at < DATE('2022-06-08')
AND p.is_used is TRUE
AND p.is_deleted is FALSE;
When I run the query below, the index is not used and the post table gets a full scan.
explain
SELECT *
FROM post p
INNER JOIN post_log pl ON p.id = pl.post_id
WHERE pl.created_at > DATE('2022-06-01')
AND pl.created_at < DATE('2022-06-08')
AND p.is_used = 1
AND p.is_deleted = 0;
The query below does not use the index either.
explain
SELECT *
FROM post p
INNER JOIN post_log pl ON p.id = pl.post_id
WHERE pl.created_at > DATE('2022-06-01')
AND pl.created_at < DATE('2022-06-08')
and p.comment = 111
What is the difference between 'tinyint = 1' and 'tinyint is true'?
And why does the first query use the index while the others do not?
When making the query plan, MySQL has to decide whether to first filter the post_log table using the index, or first filter the post table using the is_used and is_deleted columns.
= 1 tests for the specific value 1, while IS TRUE is true for any non-zero value. I guess it decides that when you're searching for specific values, it will be more efficient to filter the post table first because there will likely be fewer matches (since these columns aren't indexed, it doesn't know that 0 and 1 are the only values).
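A small illustration of that difference, using a literal instead of the table columns (the value 2 and the column aliases are chosen only for the example):

```sql
-- "= 1" matches only the value 1, while "IS TRUE" matches any non-zero value.
SELECT 2 = 1     AS equals_one,  -- 0
       2 IS TRUE AS is_true;     -- 1
```

For a column that only ever holds 0 or 1, both predicates select the same rows, so any difference in the plan comes from how the optimizer estimates them, not from the result set.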

MySQL. How to make a selection by multiple columns

I have a database with the following base structure.
create table objects
(
id int auto_increment primary key
);
create table object_attribute_values
(
id int auto_increment primary key,
object_id int not null,
attribute_id int not null,
value varchar(255) null
);
create table attributes
(
id int auto_increment primary key,
attribute varchar(20) null
);
And so let's say the attributes table has 3 rows:
id   attribute
1    color
2    rating
3    size
I need to select all objects that have color='black', rating IN (5, 10), size=10.
I understand how to get all objects in black
SELECT o.id
FROM objects o
INNER JOIN object_attribute_values oav ON oav.object_id = o.id
INNER JOIN attributes a ON a.id = oav.attribute_id
WHERE a.attribute = 'color' AND oav.value = 'black'
The result should be like this:
object_id   attributes
1           color:black,rating:6,size:10
7           color:black,rating:6,size:10
12          color:black,rating:9,size:10
What you are dealing with is a key/value table. I don't like them much, because they make querying data more complex and don't guarantee consistency (data type, obligatory/optional values) as normal columns do. But sometimes they are necessary.
Anyway, the typical way to query key/value tables is by aggregation:
SELECT
o.id as object_id,
GROUP_CONCAT(CONCAT(a.attribute, ':', oav.value) ORDER BY a.id SEPARATOR ';') AS attributes
FROM objects o
INNER JOIN object_attribute_values oav ON oav.object_id = o.id
INNER JOIN attributes a ON a.id = oav.attribute_id
GROUP BY o.id
HAVING SUM(a.attribute = 'color' AND oav.value = 'black') > 0;
The HAVING clause looks for all objects that have color = black. Others are dismissed. This works because in MySQL true = 1 and false = 0, so we can just add up the condition results.
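Extending the same idea to all three conditions from the question might look like the sketch below. It assumes the tables from the question and that value stores numbers as strings. The rating test follows the question's literal IN (5, 10); since the sample output shows ratings 6 and 9, a range may have been intended, in which case something like CAST(oav.value AS UNSIGNED) BETWEEN 5 AND 10 could replace it.

```sql
-- One SUM(...) > 0 test per required attribute; an object is kept only
-- if every test finds at least one matching attribute row.
SELECT
  o.id AS object_id,
  GROUP_CONCAT(CONCAT(a.attribute, ':', oav.value) ORDER BY a.id SEPARATOR ',') AS attributes
FROM objects o
INNER JOIN object_attribute_values oav ON oav.object_id = o.id
INNER JOIN attributes a ON a.id = oav.attribute_id
GROUP BY o.id
HAVING SUM(a.attribute = 'color'  AND oav.value = 'black') > 0
   AND SUM(a.attribute = 'rating' AND oav.value IN ('5', '10')) > 0
   AND SUM(a.attribute = 'size'   AND oav.value = '10') > 0;
```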
Yes, you can do this. The knack you need is the concept that there are two ways of getting tables out of the table server. One way is ..
FROM TABLE A
The other way is
FROM (SELECT col as name1, col2 as name2 FROM ...) B

Join unrelated tables in summarized manner

I am having a hard time summarizing the SQL tables.
Objective: from the given tables I have to join and summarize the table.
col1 = Name_of_student,
col2 = Name_of_subject(where she/he scored highest),
col3= highest_number,
col4 = faculty_Name(where she/he scored highest),
col5 = Name_of_subject(where she/he scored lowest),
col6 = lowest marks,
col7 = faculty_Name(where she/he scored lowest)
Note - I have to write only one query for the given output.
There are four tables:
Students.
Students_subject.
Faculty.
Marks.
You can copy the code in my SQL script for understanding the tables.
create database university ;
use university ;
create table students (id int auto_increment primary key,
student_name varchar(250) NOT NULL,
dob DATE NOT NULL) ;
create table faculty ( id int auto_increment primary key,
faculty_name varchar(250) NOT NULL,
date_of_update datetime default NOW()) ;
create table Students_subject ( id int auto_increment primary key,
subject_name varchar(250) default 'unknown' NOT NULL,
subject_faculty int not null,
foreign key(subject_faculty) references faculty(id));
create table marks (id int auto_increment primary key,
student_id int NOT NULL,
subject_id int NOT NULL,
marks int NOT NULL,
date_of_update datetime default now() ON UPDATE NOW(),
foreign key(student_id) references students(id),
foreign key(subject_id) references students_subject(id));
insert into students ( student_name, dob) values
('rob', '2001-03-06'),
('bbb', '2001-09-06'),
('rab', '1991-03-06'),
('root', '2001-03-16') ;
insert into faculty(faculty_name) values
('kaka'),
('dope'),
('kallie'),
('kim');
insert into students_subject (subject_name, subject_faculty) values
('maths', 2),
('physics', 3),
('english', 4),
('biology', 1),
('statistics', 2),
('french', 4),
('economics',3);
insert into marks ( student_id, subject_id, marks) values
(1,1,70),
(1,2,60),
(1,3,98),
(1,4,75),
(1,5,90),
(1,6,30),
(1,7,40),
(2,1,70),
(2,2,60),
(2,3,70),
(2,4,105),
(2,5,95),
(2,6,30),
(2,7,10),
(3,1,70),
(3,2,60),
(3,3,70),
(3,4,75),
(3,5,99),
(3,6,30),
(3,7,10),
(4,1,70),
(4,2,60),
(4,3,70),
(4,4,89),
(4,5,99),
(4,6,30),
(4,7,19);
I wrote a query myself to work this out, but I cannot get it to produce the right result.
select students.id, table_high.marks, table_high.faculty_name as high_faculty, table_high.subject_name as sub_high,
student_low.marks , student_low.faculty_name as faculty_low, student_low.subject_name as sub_low from students
inner join
(select students.id, students.student_name ,marks.marks, subject_joined.faculty_name, students_subject.subject_name from marks
inner join (select students_subject.id,students_subject.subject_name, faculty.faculty_name, students_subject.subject_faculty
from students_subject left join faculty on students_subject.subject_faculty = faculty.id)
as subject_joined on subject_joined.id = marks.subject_id
inner join faculty on subject_joined.subject_faculty = faculty.id
inner join students_subject on students_subject.id = marks.subject_id
inner join students on students.id = marks.student_id
order by 1, 3 desc) as table_high on table_high.id = students.id
inner join
(select students.id, students.student_name ,marks.marks, subject_joined.faculty_name, students_subject.subject_name from marks
inner join (select students_subject.id,students_subject.subject_name, faculty.faculty_name, students_subject.subject_faculty
from students_subject left join faculty on students_subject.subject_faculty = faculty.id)
as subject_joined on subject_joined.id = marks.subject_id
inner join faculty on subject_joined.subject_faculty = faculty.id
inner join students_subject on students_subject.id = marks.subject_id
inner join students on students.id = marks.student_id
order by 1, 3 ) as student_low on student_low.id = students.id
group by 1 ;
Finally resolved this question!
The basic tweak needed to summarize this table is that each sub-table has to be joined on a combination of two columns (student_id and marks). GROUP BY on its own just shows the first row's value for the non-aggregated columns, so it was not possible to show the subject and faculty for both the max and the min at the same time. Instead, I created sub-tables that filter rows through two-column joins, and finally joined them to the main students table.
The main table which was joined is students.
Sub-table 1 - hw (which summarizes the data for the highest marks)
Sub-table 1.1 - high, for tagging the highest marks.
Sub-table 2 - lw (which summarizes the data for the lowest marks)
Sub-table 2.1 - low, for tagging the lowest marks.
Query >>
select students.id, students.student_name, lw.min_marks, lw.lower_subject, lw.lower_faculty,
hw.high_marks, hw.subject_name as high_subject, hw.faculty_name as higher_faculty
from students inner join
(select high.student_id, high.high_marks, high.subject_id, high.subject_name, high.faculty_name
from
(select marks.student_id, marks.marks as high_marks, sub_with_faculty.subject_id, sub_with_faculty.subject_name,
sub_with_faculty.faculty_name from marks
left join
(select students_subject.id as subject_id, students_subject.subject_name, faculty.faculty_name
from students_subject
left join faculty on students_subject.subject_faculty = faculty.id) as sub_with_faculty
on sub_with_faculty.subject_id = marks.subject_id) as high
inner join (select marks.student_id, max(marks) as marks from marks group by 1) as maximum on
maximum.student_id = high.student_id and maximum.marks = high.high_marks) as hw on
hw.student_id = students.id
inner join
(select low.student_id, low.low_marks as min_marks, low.subject_id as lower_subjectID, low.subject_name as lower_subject, low.faculty_name as lower_faculty
from
(select marks.student_id, marks.marks as low_marks, sub_with_faculty.subject_id, sub_with_faculty.subject_name,
sub_with_faculty.faculty_name from marks
left join
(select students_subject.id as subject_id, students_subject.subject_name, faculty.faculty_name
from students_subject
left join faculty on students_subject.subject_faculty = faculty.id) as sub_with_faculty
on sub_with_faculty.subject_id = marks.subject_id) as low
inner join (select marks.student_id, min(marks) as marks from marks group by 1) as minimum on
minimum.student_id = low.student_id and minimum.marks = low.low_marks) as lw on
lw.student_id = students.id;
This could be a good exercise for someone who's new to MySQL like me.
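For comparison, here is a shorter sketch that uses window functions instead of the two-column joins above. This is a different technique from the one in the answer and assumes MySQL 8.0 or later; the names ranked, rn_high and rn_low are made up for the example. Unlike the join on max/min marks, it returns exactly one row per student, breaking ties by the lower subject_id.

```sql
-- Rank each student's marks from both ends, then pick rank 1 of each.
WITH ranked AS (
  SELECT m.student_id, m.marks, ss.subject_name, f.faculty_name,
         ROW_NUMBER() OVER (PARTITION BY m.student_id
                            ORDER BY m.marks DESC, m.subject_id) AS rn_high,
         ROW_NUMBER() OVER (PARTITION BY m.student_id
                            ORDER BY m.marks ASC, m.subject_id)  AS rn_low
  FROM marks m
  JOIN students_subject ss ON ss.id = m.subject_id
  JOIN faculty f ON f.id = ss.subject_faculty
)
SELECT s.student_name,
       h.subject_name AS high_subject, h.marks AS high_marks, h.faculty_name AS high_faculty,
       l.subject_name AS low_subject,  l.marks AS low_marks,  l.faculty_name AS low_faculty
FROM students s
JOIN ranked h ON h.student_id = s.id AND h.rn_high = 1
JOIN ranked l ON l.student_id = s.id AND l.rn_low = 1;
```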

Update with subquery in MySql

I've two tables:
work (work_id (AI, PK), sent_date, received_date, visit_date)
history_work(id_history_work (AI, PK), work_id (FK), sent_date, received_date, visit_date)
The relationship should be 1->n.
I want to update the work table so that sent_date, received_date and visit_date have the values of the last inserted record in the history_work table (the highest id_history_work value) with the same work_id value.
You can do this by using join. Join once to the history table. Join a second time to get the maximum id (which is presumably the most recent insertion).
update work w join
history_work h
on w.work_id = h.work_id join
(select work_id, max(id_history_work) as maxihw
from history_work
group by work_id
) hw
on hw.maxihw = h.id_history_work
set w.sent_date = h.sent_date,
w.received_date = h.received_date,
w.visit_date = h.visit_date;

multiple conditions on same column

Have seen multiple posts on this but I can't see any which answer my question.
Basically I have 3 tables in my database which relate to members and their categorisation/classification.
members (defines list of members and associated data)
member_taxonomy (defines categories, subcategories and facilities using a combination of parent id and the enumerated field tax_type (CATEGORY, SUBCATEGORY, FACILITY))
member_taxonomy_map (defines mapping between members and member_taxonomy)
I have a members page within which are a number of options to refine the search by specifying one or more subcategory and one or more facility. I have been trying to search on the table using the query:
SELECT members.*
FROM (members)
JOIN member_taxonomy_map ON member_taxonomy_map.member_id = members.id
WHERE member_taxonomy_map.taxonomy_id = 1
AND member_taxonomy_map.taxonomy_id = 26
AND members.active = 1
AND members.deleted = 0;
However this returns 0 rows which is something to do with having multiple where clauses on the same column but I can't figure out how this query should look. Each time the search is refined (and there could be up to 30 different options to refine the search) I need add an additional AND clause so that only members with these mappings are returned.
An IN clause will not work as this is effectively returning any rows which match any of these particular values but this is incorrect as it needs to match exactly the values specified.
Hopefully someone can give me a few pointers in the right direction.
Thanks in advance.
UPDATE
It is possible that taxonomy_id can be 1 and 26. I prob need to include the schema for the members_taxonomy_map table.
id int
tax_name varchar
slug varchar
tax_type enum ('CATEGORY','CLASSIFICATION','FACILITY','RATING')
parent_id int
active int
Any tax_type with no parent_id set is a top-level CATEGORY. Subcategories have a CATEGORY as their parent_id. CLASSIFICATIONs and FACILITYs also have a CATEGORY as their parent_id.
For example, a category could be accommodation, a subcategory could be hotel and a facility could be wifi. If a member has all three of these items, they will have 3 entries in the mapping table. I need to be able to build up the query so that it filters by accommodation type (i.e. subcategories: entries with a tax_type of CATEGORY that also have a parent id) and then, within this, by classification. The query may return multiple entries for the same member, but I can deal with this by filtering them out with extra SQL clauses.
SELECT members.*
FROM (members)
JOIN member_taxonomy_map ON member_taxonomy_map.member_id = members.id
WHERE (member_taxonomy_map.taxonomy_id = 1
OR member_taxonomy_map.taxonomy_id = 26)
AND members.active = 1
AND members.deleted = 0;
It's probably not possible for a record to have a taxonomy_id of 1 and 26; you are probably trying to get a record that contains one or the other.
SELECT mb.*
FROM members mb
JOIN member_taxonomy_map tm ON tm.member_id = mb.id
WHERE tm.taxonomy_id IN ( 1, 26)
AND mb.active = 1
AND mb.deleted = 0
;
... Or ...
SELECT mb.*
FROM members mb
WHERE EXISTS ( SELECT *
FROM member_taxonomy_map tm
WHERE tm.member_id = mb.id
AND tm.taxonomy_id IN ( 1, 26)
)
AND mb.active = 1
AND mb.deleted = 0
;
... Or ...
SELECT mb.*
FROM members mb
WHERE mb.id IN ( SELECT tm.member_id
FROM member_taxonomy_map tm
WHERE tm.taxonomy_id IN ( 1, 26)
)
AND mb.active = 1
AND mb.deleted = 0
;
You need to JOIN member_taxonomy_map once for every taxonomy_id:
SELECT members.*
FROM members
JOIN member_taxonomy_map mtm1 ON mtm1.member_id = members.id AND mtm1.taxonomy_id=1
JOIN member_taxonomy_map mtm26 ON mtm26.member_id = members.id AND mtm26.taxonomy_id=26
WHERE members.active = 1
AND members.deleted = 0;
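With up to 30 refinement options, one join per taxonomy_id can get unwieldy. A possible alternative, sketched here under the assumption that members.id is the primary key (so that selecting m.* with GROUP BY m.id is accepted under ONLY_FULL_GROUP_BY), is to filter with IN and then require that all listed ids were matched:

```sql
-- Keep only members that are mapped to every taxonomy_id in the list.
-- The number in HAVING must equal the number of ids in the IN list.
SELECT m.*
FROM members m
JOIN member_taxonomy_map mtm ON mtm.member_id = m.id
WHERE mtm.taxonomy_id IN (1, 26)
  AND m.active = 1
  AND m.deleted = 0
GROUP BY m.id
HAVING COUNT(DISTINCT mtm.taxonomy_id) = 2;
```

Adding another refinement then only means appending an id to the IN list and incrementing the count, and each member is returned at most once.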