duplicate result in group_concat - mysql

I have this data in my database (3,15,6,4,15), and I tried to show it in my table using GROUP_CONCAT. The problem is that I get duplicated results (3,15,6,4,15,3,15,6,4,15,3,15,6,4,15).
I tried DISTINCT, but it also eliminates the second "15".
What is the best solution for this?
Thanks!
This is my query:
SELECT users.*, GROUP_CONCAT(written.score SEPARATOR ' - ') AS Wscore,
student_subject.*, SUM(written.score) AS total, SUM(written.item) AS item
FROM users
JOIN written ON users.idnumber = written.idnumber
JOIN student_subject ON users.idnumber = student_subject.idnumber
WHERE student_subject.teacher = '$login_session' AND written.section = '$section'
AND written.level = '$level' AND written.year = '$year' AND written.subject = '$subject'
AND users.gender = 'male' AND written.period = 'first'
GROUP BY users.idnumber
ORDER BY users.lname

You probably just want distinct in the group_concat():
SELECT u.*, GROUP_CONCAT(distinct w.score separator ' - ') as Wscore,
ss.*, SUM(w.score) as total, SUM(w.item) as item
FROM users u JOIN
written w
ON u.idnumber = w.idnumber JOIN
student_subject ss
ON u.idnumber = ss.idnumber
WHERE ss.teacher = '$login_session' AND w.section='$section' AND
w.level = '$level' AND w.year = '$year' AND w.subject = '$subject' AND
u.gender = 'male' AND w.period = 'first'
GROUP BY u.idnumber
order by u.lname;
Notice how table aliases make the query easier to write and to read.
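If DISTINCT also removes a legitimate repeated score, the repetition may be coming from the join rather than from the data. The following is a minimal sketch in Python, with SQLite standing in for MySQL (SQLite writes the separator as a second argument instead of SEPARATOR). It assumes, hypothetically, that student_subject holds several rows per student, so each extra row multiplies the written rows. The table contents are invented for illustration. Aggregating written in a derived table before joining keeps both 15s and the correct total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (idnumber INTEGER PRIMARY KEY, lname TEXT);
CREATE TABLE written (idnumber INTEGER, score INTEGER);
CREATE TABLE student_subject (idnumber INTEGER, subject TEXT);
INSERT INTO users VALUES (1, 'Cruz');
INSERT INTO written VALUES (1, 3), (1, 15), (1, 6), (1, 4), (1, 15);
-- Assumption: three subject rows for the same student cause the fan-out.
INSERT INTO student_subject VALUES (1, 'math'), (1, 'science'), (1, 'english');
""")

# Joining both child tables directly repeats every score once per subject row.
fanned_out = conn.execute("""
SELECT GROUP_CONCAT(w.score, ' - '), SUM(w.score)
FROM users u
JOIN written w ON u.idnumber = w.idnumber
JOIN student_subject ss ON u.idnumber = ss.idnumber
GROUP BY u.idnumber
""").fetchone()

# Aggregate written first, then only check that a matching subject row exists.
pre_aggregated = conn.execute("""
SELECT w.Wscore, w.total
FROM users u
JOIN (SELECT idnumber,
             GROUP_CONCAT(score, ' - ') AS Wscore,
             SUM(score) AS total
      FROM written
      GROUP BY idnumber) w ON u.idnumber = w.idnumber
WHERE EXISTS (SELECT 1 FROM student_subject ss WHERE ss.idnumber = u.idnumber)
""").fetchone()

# Sort the pieces because the concatenation order is not guaranteed here.
fanned_scores = sorted(int(s) for s in fanned_out[0].split(" - "))
clean_scores = sorted(int(s) for s in pre_aggregated[0].split(" - "))
print(len(fanned_scores), fanned_out[1])
print(clean_scores, pre_aggregated[1])
```

In this sketch the direct join yields 15 scores and an inflated sum, while the pre-aggregated version yields the five original scores. Note that the EXISTS form no longer returns the student_subject columns, so it only fits if those columns are not needed in the output.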

Optimize Query Mysql to count data in each district

I have this query to calculate the success total for each district. The query works, but it takes up to 2 minutes to return data; I have 15k rows in orders.
SELECT
nsf.id,
nsf.province,
nsf.city,
nsf.district,
nsf.shipping_fee,
IFNULL((SELECT COUNT(orders.id) FROM orders
JOIN users ON orders.customer_id = users.id
JOIN addresses ON addresses.user_id = users.id
JOIN subdistricts ON subdistricts.id = addresses.subdistrict_id
WHERE orders.status_tracking IN ("Completed","Successful Delivery")
AND subdistricts.ninja_fee_id = nsf.id
AND orders.transfer_to = "cod"),0) as success_total
from ninja_shipping_fees nsf
GROUP BY nsf.id
ORDER BY nsf.province;
The output should be like this.
Can you help me improve the performance? Thanks.
Try performing the grouping and counting in a joined "derived table" instead of a "correlated subquery":
SELECT
nsf.id
, nsf.province
, nsf.city
, nsf.district
, nsf.shipping_fee
, COALESCE( g.order_count, 0 ) AS success_total
FROM ninja_shipping_fees nsf
LEFT JOIN (
SELECT
subdistricts.ninja_fee_id
, COUNT( orders.id ) AS order_count
FROM orders
JOIN users ON orders.customer_id = users.id
JOIN addresses ON addresses.user_id = users.id
JOIN subdistricts ON subdistricts.id = addresses.subdistrict_id
WHERE orders.status_tracking IN ('Completed', 'Successful Delivery')
AND orders.transfer_to = 'cod'
GROUP BY subdistricts.ninja_fee_id
) AS g ON g.ninja_fee_id = nsf.id
ORDER BY nsf.province;
"Correlated subqueries" are often a source of poor performance.
Other notes: I prefer to use COALESCE() because it is ANSI standard and available in most SQL implementations now. Single quotes are the more typical way to denote string literals.
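To check that the derived-table rewrite returns the same counts as the correlated subquery, the two forms can be run side by side on a tiny data set. This is a rough sketch in Python with SQLite standing in for MySQL. The column lists are trimmed and the sample rows are invented, so it only illustrates equivalence of the results, not the timing difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ninja_shipping_fees (id INTEGER PRIMARY KEY, province TEXT);
CREATE TABLE subdistricts (id INTEGER PRIMARY KEY, ninja_fee_id INTEGER);
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE addresses (user_id INTEGER, subdistrict_id INTEGER);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                     status_tracking TEXT, transfer_to TEXT);
-- Invented sample data: both users live in a subdistrict of fee 1.
INSERT INTO ninja_shipping_fees VALUES (1, 'Bali'), (2, 'Jawa Barat');
INSERT INTO subdistricts VALUES (10, 1), (20, 2);
INSERT INTO users VALUES (100), (200);
INSERT INTO addresses VALUES (100, 10), (200, 10);
INSERT INTO orders VALUES
  (1, 100, 'Completed', 'cod'),
  (2, 100, 'Cancelled', 'cod'),
  (3, 200, 'Successful Delivery', 'cod'),
  (4, 200, 'Completed', 'bank');
""")

# Original shape: the inner query is re-evaluated for every fee row.
correlated = conn.execute("""
SELECT nsf.id,
       IFNULL((SELECT COUNT(o.id)
               FROM orders o
               JOIN users u ON o.customer_id = u.id
               JOIN addresses a ON a.user_id = u.id
               JOIN subdistricts s ON s.id = a.subdistrict_id
               WHERE o.status_tracking IN ('Completed', 'Successful Delivery')
                 AND s.ninja_fee_id = nsf.id
                 AND o.transfer_to = 'cod'), 0) AS success_total
FROM ninja_shipping_fees nsf
ORDER BY nsf.id
""").fetchall()

# Rewritten shape: count once per fee id, then LEFT JOIN the totals.
derived = conn.execute("""
SELECT nsf.id, COALESCE(g.order_count, 0) AS success_total
FROM ninja_shipping_fees nsf
LEFT JOIN (SELECT s.ninja_fee_id, COUNT(o.id) AS order_count
           FROM orders o
           JOIN users u ON o.customer_id = u.id
           JOIN addresses a ON a.user_id = u.id
           JOIN subdistricts s ON s.id = a.subdistrict_id
           WHERE o.status_tracking IN ('Completed', 'Successful Delivery')
             AND o.transfer_to = 'cod'
           GROUP BY s.ninja_fee_id) AS g ON g.ninja_fee_id = nsf.id
ORDER BY nsf.id
""").fetchall()

print(correlated)
print(derived)
```

The LEFT JOIN plus COALESCE is what keeps fee rows with no matching orders in the result with a count of 0.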

not showing DISTINCT record on 3 table join

I tried the following query to display DISTINCT records from 3 tables, but it still displays repeated user records.
SELECT DISTINCT users.sid, users.username, users.registration_date, users.FirstName,
users.LastName, users.phoneNumber, listings.Resume, uploaded_files.saved_file_name
FROM users JOIN listings
ON users.sid = listings.user_sid
JOIN uploaded_files
ON listings.Resume=uploaded_files.id
WHERE listings.listing_type_sid = '7' AND listings.Resume != 'NULL'
This makes some massive assumptions about the structure of your tables.
The obvious way is to join against a subquery that gets the latest listing date for each user, and then join that against listings to get the listing fields for that date.
SELECT users.sid,
users.username,
users.registration_date,
users.FirstName,
users.LastName,
users.phoneNumber,
listings.Resume,
uploaded_files.saved_file_name
FROM users
INNER JOIN
(
SELECT user_sid, MAX(resume_date) AS latest_resume
FROM listings
GROUP BY user_sid
) sub0
ON users.sid = sub0.user_sid
INNER JOIN listings
ON sub0.user_sid = listings.user_sid
AND sub0.latest_resume = listings.resume_date
INNER JOIN uploaded_files
ON listings.Resume=uploaded_files.id
WHERE listings.listing_type_sid = '7'
AND listings.Resume != 'NULL'
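The "latest row per user" join can be tried out on a toy data set. This is a minimal sketch in Python with SQLite standing in for MySQL; the resume_date column is the same assumption the query above makes, and the rows are invented. The listing type and Resume filters are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (sid INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE listings (user_sid INTEGER, Resume INTEGER, resume_date TEXT);
CREATE TABLE uploaded_files (id INTEGER PRIMARY KEY, saved_file_name TEXT);
INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
INSERT INTO uploaded_files VALUES
  (11, 'ann_old.pdf'), (12, 'ann_new.pdf'), (21, 'bob.pdf');
-- ann has two listings; only the one with the latest date should survive.
INSERT INTO listings VALUES
  (1, 11, '2014-01-05'), (1, 12, '2014-03-09'), (2, 21, '2014-02-01');
""")

rows = conn.execute("""
SELECT users.username, uploaded_files.saved_file_name
FROM users
INNER JOIN (SELECT user_sid, MAX(resume_date) AS latest_resume
            FROM listings
            GROUP BY user_sid) sub0
        ON users.sid = sub0.user_sid
INNER JOIN listings
        ON sub0.user_sid = listings.user_sid
       AND sub0.latest_resume = listings.resume_date
INNER JOIN uploaded_files
        ON listings.Resume = uploaded_files.id
ORDER BY users.username
""").fetchall()

print(rows)
```

If two listings of the same user share the latest date, this pattern returns both, so a unique tie-breaker would be needed in that case.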
A bit of a fiddle is to use GROUP_CONCAT to get all the saved files ordered by date, then use SUBSTRING_INDEX to get the first one. (I have just used the default comma to split the files up, but you should really use a delimiter that will never appear in any of the file names.)
SELECT users.sid,
users.username,
users.registration_date,
users.FirstName,
users.LastName,
users.phoneNumber,
listings.Resume,
SUBSTRING_INDEX(GROUP_CONCAT(uploaded_files.saved_file_name ORDER BY listings.resume_date DESC), ',', 1) AS saved_file_name
FROM users
INNER JOIN listings
ON users.sid = listings.user_sid
INNER JOIN uploaded_files
ON listings.Resume=uploaded_files.id
WHERE listings.listing_type_sid = '7'
AND listings.Resume != 'NULL'
GROUP BY users.sid,
users.username,
users.registration_date,
users.FirstName,
users.LastName,
users.phoneNumber,
listings.Resume
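To see what the GROUP_CONCAT plus SUBSTRING_INDEX combination does to one user's rows, here is a small pure-Python imitation. The substring_index helper is a hypothetical stand-in written for this sketch (it assumes a non-empty delimiter), and the rows are invented; it is not how MySQL implements the functions.

```python
def substring_index(text: str, delim: str, count: int) -> str:
    """Imitate MySQL SUBSTRING_INDEX for a non-empty delimiter.

    Positive count: keep everything before the count-th delimiter.
    Negative count: keep everything after the count-th delimiter from the right.
    """
    parts = text.split(delim)
    if count >= 0:
        return delim.join(parts[:count])
    return delim.join(parts[count:])


# Hypothetical (resume_date, saved_file_name) rows for a single user.
rows = [
    ("2014-01-05", "old.pdf"),
    ("2014-03-09", "new.pdf"),
    ("2014-02-01", "mid.pdf"),
]

# Rough equivalent of:
#   GROUP_CONCAT(saved_file_name ORDER BY resume_date DESC SEPARATOR '|')
concatenated = "|".join(name for _, name in sorted(rows, reverse=True))

# Rough equivalent of SUBSTRING_INDEX(..., '|', 1): the newest file name.
latest = substring_index(concatenated, "|", 1)
print(concatenated)
print(latest)

# With a comma separator, a file name that contains a comma is cut short.
truncated = substring_index("cv, final.pdf,old.pdf", ",", 1)
print(truncated)
```

The last call shows why the separator matters: the name "cv, final.pdf" comes back as "cv". In MySQL the concatenated string is also limited by group_concat_max_len, so very large groups can be truncated before SUBSTRING_INDEX sees them.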

Grouping by multiple columns in SQL

I am trying to bring through the site.Site_Name for each hive.hiveno and its max(hiverdg.invdate). Running the code below doesn't work because site.Site_Name is not aggregated. If I add site.Site_Name to the GROUP BY, the code runs, but the output displays the results repeated, once for each site.Site_Name.
select site.Site_Name ,hive.hiveno, max(hiverdg.invdate)
from hiverdg
inner join hive
on hiveRdg.hive_Link = hive.hive_Link
inner join Customer
on customer.Customer_Link = hive.Customer_Link
inner join site
on site.Customer_Link = customer.Customer_Link
where
(hiverdg.xtype = 'N'
and customer.CustomerName = 'Cust1')
or
(hiverdg.xtype = 'A'
and customer.CustomerName = 'Cust1')
group by hive.hiveno
The easiest way to do this, with your query, is the substring_index()/group_concat() trick:
select substring_index(group_concat(s.Site_Name order by rdg.invdate desc separator '|'
), '|', 1
) as SiteName,
h.hiveno, max(rdg.invdate)
from hiverdg rdg inner join
hive h
on rdg.hive_Link = h.hive_Link inner join
Customer c
on c.Customer_Link = h.Customer_Link inner join
site s
on s.Customer_Link = c.Customer_Link
where rdg.xtype in ('N', 'A') and c.CustomerName = 'Cust1'
group by h.hiveno;
I also made the following changes to your query:
Introduced table aliases, to make the query easier to write and to read.
Changed the where to use in, simplifying the logic.

Limit concatenatings in group_concat

I've run into an issue that I have no clue how to solve. It is related to this (solved) problem.
As mentioned in the other post, I have a Media table that can hold many records for the same user, but I only want to display a maximum of six records (in order: in every case only one Type 1 image, followed by a maximum of five Type 2 images).
The query I have now works fine as long as there is only one Type 1 image, but when I add another Type 1 image, the query displays them both. Unfortunately, something like ORDER BY UserMedia.m_Type = 1 DESC LIMIT 1 inside a GROUP_CONCAT doesn't work, but it is exactly what I need. Does anybody have a clever idea for how to achieve this?
I have a SQL Fiddle here with the relevant code. My query looks like this:
SELECT
User.u_UserName, User.u_UserMail, User.u_UserRegistration,
Status.us_PaymentStatus,
Sex.us_Gender, Sex.us_Interest,
Personal.up_Name, Personal.up_Dob, Personal.up_City, Personal.up_Province,
UserMedia.m_Id, UserMedia.m_Type, SUBSTRING_INDEX(
GROUP_CONCAT(
CONCAT(
UserMedia.m_Type, ':', UserMedia.m_File
)
ORDER BY UserMedia.m_Type = 1, UserMedia.m_Date DESC SEPARATOR '|'
),'|',6
) AS userFiles
FROM User AS User
JOIN User_Status AS Status ON Status.User_u_UserId = User.u_UserId
JOIN User_Sex_Info AS Sex ON Sex.User_u_UserId = User.u_UserId
LEFT JOIN User_Personal_Info AS Personal ON Personal.User_u_UserId = User.u_UserId
LEFT JOIN Media AS UserMedia ON UserMedia.User_u_UserId = User.u_UserId
WHERE User.u_UserId = '18'
GROUP BY User.u_UserId
I went for a walk and came back with the following solution. It may not be the most beautiful one, but at least it works. I also realised I didn't need the CONCAT function.
SELECT
User.u_UserName, User.u_UserMail, User.u_UserRegistration,
Status.us_PaymentStatus,
Sex.us_Gender, Sex.us_Interest,
Personal.up_Name, Personal.up_Dob, Personal.up_City, Personal.up_Province,
UserMedia.m_Id, UserMedia.m_Type, SUBSTRING_INDEX(
GROUP_CONCAT(
UserMedia.m_File
ORDER BY UserMedia.m_Type = 1 DESC
SEPARATOR '|'
),'|',1
) AS userFiles,
SUBSTRING_INDEX(
GROUP_CONCAT(
UserMedia.m_File
ORDER BY UserMedia.m_Date DESC
SEPARATOR '|'
),'|',5
) AS userTypes,
SUBSTRING_INDEX(
GROUP_CONCAT(
Interests.ui_Interest SEPARATOR '|'
),'|',5
) AS userInterests
FROM User AS User
JOIN User_Status AS Status ON Status.User_u_UserId = User.u_UserId
JOIN User_Sex_Info AS Sex ON Sex.User_u_UserId = User.u_UserId
LEFT JOIN User_Personal_Info AS Personal ON Personal.User_u_UserId = User.u_UserId
LEFT JOIN Media AS UserMedia ON UserMedia.User_u_UserId = User.u_UserId
LEFT JOIN User_Interest_Info AS Interests ON Interests.User_u_UserId = User.u_UserId
WHERE User.u_UserId = :uId
GROUP BY User.u_UserId
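An alternative shape for the "one Type 1 image plus at most five Type 2 images" requirement is to limit each type in its own subquery and combine the two. This is only a sketch under assumptions: it uses Python with SQLite standing in for MySQL, looks at the Media table alone, invents the sample rows, and assumes the most recent Type 1 image is the one to keep.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Media (User_u_UserId INTEGER, m_Type INTEGER, "
    "m_File TEXT, m_Date TEXT)"
)

# Invented data: two Type 1 images and seven Type 2 images for user 18.
media = [
    (18, 1, "profile_old.jpg", "2013-01-01"),
    (18, 1, "profile_new.jpg", "2013-06-01"),
]
media += [(18, 2, f"photo{i}.jpg", f"2013-02-0{i}") for i in range(1, 8)]
conn.executemany("INSERT INTO Media VALUES (?, ?, ?, ?)", media)

rows = conn.execute("""
SELECT m_Type, m_Date, m_File
FROM (SELECT m_Type, m_Date, m_File FROM Media
      WHERE User_u_UserId = :uId AND m_Type = 1
      ORDER BY m_Date DESC LIMIT 1) AS type1
UNION ALL
SELECT m_Type, m_Date, m_File
FROM (SELECT m_Type, m_Date, m_File FROM Media
      WHERE User_u_UserId = :uId AND m_Type = 2
      ORDER BY m_Date DESC LIMIT 5) AS type2
ORDER BY m_Type, m_Date DESC
""", {"uId": 18}).fetchall()

# Build the "type:file" list in application code instead of GROUP_CONCAT.
user_files = "|".join(f"{m_type}:{m_file}" for m_type, _, m_file in rows)
print(user_files)
```

Because each LIMIT applies to a single type, a second Type 1 image can no longer take up one of the slots, and the list never exceeds six entries.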

MySQL using Aliases

I have the following syntactically incorrect query with aliases in_Degree and out_degree:
insert into userData
select user_name,
(select COUNT(*) from tweets where rt_user_name = u.USER_NAME)in_degree,
(select COUNT(*) from tweets where source_user_name = u.user_name)out_degree,
in_degree + out_degree(freq)
from users u
The problem in the query is the 4th item in the select list, aliased as freq. I want the 4th item to have the value in_degree + out_degree. The brute-force, extremely slow solution would be to copy and paste both subqueries and add them.
How can I make this fast and as simple as in_degree + out_degree?
You could use a subquery:
insert into userData
select user_name,
in_degree,
out_degree,
in_degree + out_degree
from
(
select user_name,
(select COUNT(*) from tweets where rt_user_name = u.USER_NAME)in_degree,
(select COUNT(*) from tweets where source_user_name = u.user_name)out_degree
from users u
) src
Or you might be able to use:
insert into userData
select user_name,
count(distinct in_t.*) in_degree,
count(distinct out_t.*) out_degree,
count(distinct in_t.*) + count(distinct out_t.*)
from users u
left join tweets in_t
on u.USER_NAME = in_t.rt_user_name
left join tweets out_t
on u.USER_NAME = out_t.source_user_name
group by u.user_name
As you have discovered, you can't reference the aliases given in that select list, except in a HAVING clause or an ORDER BY clause.
One option is to use your query as an "inline view" and write a wrapper query around it:
Remove the 4th (invalid) expression from the select list in your query.
Wrap your query in a set of parentheses.
Follow the closing parenthesis with an alias (e.g. s).
Write a query around that, referencing the inline view as if it were a table.
The select list of the outer query can then reference the aliases defined in the inline view.
However, if you want to make this "fast", you might consider (as an option) taking an entirely different tack. Rather than using correlated subqueries to get the count for each individual user, you could get the counts for all users at once, and then use the LEFT JOIN operator, e.g.
SELECT u.user_name
, IFNULL(i.cnt,0) AS in_degree
, IFNULL(o.cnt,0) AS out_degree
, IFNULL(i.cnt,0)+IFNULL(o.cnt,0) AS freq
FROM users u
LEFT
JOIN (SELECT rt_user_name, COUNT(*) AS cnt FROM tweets
GROUP BY rt_user_name) i
ON i.rt_user_name = u.user_name
LEFT
JOIN (SELECT source_user_name, COUNT(*) AS cnt FROM tweets
GROUP BY source_user_name) o
ON o.source_user_name = u.user_name
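As a sanity check, the pre-aggregated LEFT JOIN form and the inline-view form with correlated subqueries can be compared on a few rows. This is a minimal sketch in Python with SQLite standing in for MySQL; the userData column layout and the sample tweets are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_name TEXT PRIMARY KEY);
CREATE TABLE tweets (source_user_name TEXT, rt_user_name TEXT);
-- Assumed layout of the target table.
CREATE TABLE userData (user_name TEXT, in_degree INTEGER,
                       out_degree INTEGER, freq INTEGER);
INSERT INTO users VALUES ('alice'), ('bob'), ('carol');
-- alice retweets bob twice, bob retweets alice once, carol is inactive.
INSERT INTO tweets VALUES ('alice', 'bob'), ('alice', 'bob'), ('bob', 'alice');
""")

# Count once per name in each derived table, then LEFT JOIN the totals.
conn.execute("""
INSERT INTO userData
SELECT u.user_name,
       IFNULL(i.cnt, 0),
       IFNULL(o.cnt, 0),
       IFNULL(i.cnt, 0) + IFNULL(o.cnt, 0)
FROM users u
LEFT JOIN (SELECT rt_user_name, COUNT(*) AS cnt
           FROM tweets GROUP BY rt_user_name) i
       ON i.rt_user_name = u.user_name
LEFT JOIN (SELECT source_user_name, COUNT(*) AS cnt
           FROM tweets GROUP BY source_user_name) o
       ON o.source_user_name = u.user_name
""")
joined = conn.execute("SELECT * FROM userData ORDER BY user_name").fetchall()

# Inline view: the outer select list can reuse the inner aliases.
inline_view = conn.execute("""
SELECT user_name, in_degree, out_degree, in_degree + out_degree
FROM (SELECT user_name,
             (SELECT COUNT(*) FROM tweets
              WHERE rt_user_name = u.user_name) AS in_degree,
             (SELECT COUNT(*) FROM tweets
              WHERE source_user_name = u.user_name) AS out_degree
      FROM users u) src
ORDER BY user_name
""").fetchall()

print(joined)
print(inline_view)
```

Both forms produce the same rows here, including the all-zero row for a user with no tweets, which is what the IFNULL calls are for in the LEFT JOIN version.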
This should work:
insert into userData
SELECT T.user_name,
T.in_degree,
T.out_degree,
(T.in_degree + T.out_degree) as freq
FROM (SELECT user_name,
(select COUNT(*) from tweets where rt_user_name = u.USER_NAME) as in_degree,
(select COUNT(*) from tweets where source_user_name = u.user_name) as out_degree
FROM users u) T
As a quick approach, I would do something like:
insert into userData
select
TMP.user_name,
TMP.in_degree,
TMP.out_degree,
(TMP.in_degree + TMP.out_degree) degreeSum
from(
select user_name,
(select COUNT(*) from tweets where rt_user_name = u.USER_NAME)in_degree,
(select COUNT(*) from tweets where source_user_name = u.user_name)out_degree
from users u
) TMP