Filtering without duplicates in mysql - mysql

I'm fetching data from four different tables. This is how I did:
select value_split.id,
count(invoice_request.id) as counts,
GROUP_CONCAT(`category`.`category_name`) as `category_name`
from value_split
left join invoice_request
on invoice_request.id = value_split.id
and MONTH(invoice_request.from_date) = '05'
and YEAR(invoice_request.from_date) = '2016'
and invoice_request.status = '1'
left join `category_value` on value_split.id = category_value.id
left join `category` on category_value.category_id = category.category_id
where MONTH(value_split.date) = '05'
and YEAR(value_split.date) = '2016'
group by value_split.id, value_split.date
limit 0, 2147483647
The expected answer is:
id|counts|category_name
1 | 1 | service, drop
2 | 2 | drop
3 | 1 | service
But, what I get is:
id|counts|category_name
1 | 5 | service, drop
2 | 4 | drop
3 | 5 | service
If I remove line 10 and 11, I get the correct answer. But, if I include them, I get the wrong one. But, I want to display that too. So, what should I do to get the correct output? What's wrong with this?

Use
COUNT(DISTINCT invoice_request.id) AS counts,

Related

How to find all the opposite combinations between two columns in SQL

I am making a web dating app that needs to match users and let them chat with each other.
I want to figure out how to find all the matches for a particular user.
Right now I have a table called follows that has 2 columns.
UserID | MatchUserID
--------------------
1 | 2
2 | 1
1 | 3
1 | 4
1 | 5
4 | 1
5 | 4
The idea is that for two users to match they need to follow one another. The table above shows which user follows which.
Assuming that the user who is currently logged on is UserID = 1.
I need a query that will return from the MatchUserID table the following results:
2, 4
In a way, I am looking to find all the opposite combinations between the two columns.
This is the code I use to create the table.
CREATE TABLE Match
(
UserID INT NOT NULL,
MatchUserID INT NOT NULL,
PRIMARY KEY (UserID, MatchUserID)
);
You can do it with a self join:
select m.MatchUserID
from `Match` m inner join `Match` mm
on mm.MatchUserID = m.UserId
where
m.UserId = 1
and
m.MatchUserID = mm.UserId
See the demo.
Results:
| MatchUserID |
| ----------- |
| 2 |
| 4 |
The simplest way possibly is to use EXISTS and a correlated subquery that searches for the other match.
SELECT t1.matchuserid
FROM elbat t1
WHERE t1.userid = 1
AND EXISTS (SELECT *
FROM elbat t2
WHERE t2.matchuserid = t1.userid
AND t2.userid = t1.matchuserid);

Find duplicates from same table and constraint them from another table in sql

Oh, my title is not the best one and as English is not my main language maybe someone can fix that instead of downvoting if they've understood the issue here.
Basically i have two tables - tourneyplayers and results. Tourneyplayers is like a side table which gathers together tournament information across multiple tables - results, tournaments, players etc. I want to check duplicates from the results table over column day1_best, from single tournament and return all the tourneyplayers who have duplicates.
Tourneyplayers contain rows:
Tourneyplayers
tp_id | resultid | tourneyid
1 | 2 | 91
2 | 21 | 91
3 | 29 | 91
4 | 1 | 91
5 | 3 | 92
Results contains rows:
Results:
r_id | day1_best
1 | 3
2 | 1
3 | 4
.. | ..
21 | 1
.. | ..
29 | 2
Now tourney with id = 91 has in total 4 results, with id's 1,2,21 and 29. I want to return values which have duplicates, so currently the result would be
Result
tp_id | resultid | day1_best
1 | 2 | 1
2 | 21 | 1
I tried writing something like this:
SELECT *
FROM tourneyplayers
WHERE resultid
IN (
SELECT r1.r_id
FROM results AS r1
INNER JOIN results AS r2 ON ( r1.day1_best = r2.day1_best )
AND (
r1.r_id <> r2.r_id
)
)
AND tourneyid =91
But in addition to values which had the same day1_best it chose two more which did not have the same. How could i improve my SQL or rewrite it?
First you JOIN both tables, so you know how the data looks like.
SELECT *
FROM tourney_players t
JOIN results r
ON t.`resultid` = r.`r_id`;
Then using the same query you GROUP to see what tourneyid, day1_best combination has multiple rows
SELECT `tourneyid`, `day1_best`, count(*) as total
FROM tourney_players t
JOIN results r
ON t.`resultid` = r.`r_id`
GROUP BY `tourneyid`, `day1_best`;
Finally you use the base JOIN and perform a LEFT JOIN to see what rows has a match and show only those rows.
SELECT t.`tp_id`, r.`r_id`, r.`day1_best`
FROM tourney_players t
JOIN results r
ON t.`resultid` = r.`r_id`
LEFT JOIN (SELECT `tourneyid`, `day1_best`, count(*) as total
FROM tourney_players t
JOIN results r
ON t.`resultid` = r.`r_id`
GROUP BY `tourneyid`, `day1_best`
HAVING count(*) > 1) as filter
ON t.`tourneyid` = filter.`tourneyid`
AND r.`day1_best` = filter.`day1_best`
WHERE filter.`tourneyid` IS NOT NULL;
SQL DEMO
OUTPUT
Please try this :
Select tp.tp_id , tp.resultid ,r.day1_best from (Select * from Tourneyplayers
where tourneyid = 91)as tp inner join (select * from Result day1_best in(select
day1_best from result group by day1_best having count(*)>1 ) )as r on tp.resultid
= r.r_id ;

Issue with mysql query that calls column name from another table

I have two tables, one is an index (or map) which helps when other when pulling queries.
SELECT v.*
FROM smv_ v
WHERE (SELECT p.network
FROM providers p
WHERE p.provider_id = v.provider_id) = 'RUU='
AND (SELECT p.va
FROM providers p
WHERE p.provider_id = v.provider_id) = 'MjU='
LIMIT 1;
Because we do not know the name of the column that holds the main data, we need to look it up, using the provider_id which is in both tables, and then query.
I am not getting any errors, but also no data back. I have spent the past hour trying to put this on sqlfiddle, but it kept crashing, so I just wanted to check if my code is really wrong, hence the crashing?
In the above example, I am looking in the providers table for column network, where the provider_id matches, and then use that as the column on smv.
I am sure i have done this before just like this, but after the weekend trying I thought i would ask on here.
Thanks in Advance.
UPDATE
Here is an example of the data:
THis is the providers, this links so no matter what the name of the column on the smv table, we can link them.
+---+---+---------------+---------+-------+--------+-----+-------+--------+
| | A | B | C | D | E | F | G | H |
+---+---+---------------+---------+-------+--------+-----+-------+--------+
| 1 | 1 | Home | network | batch | bs | bp | va | bex |
| 2 | 2 | Recharge | code | id | serial | pin | value | expire |
+---+---+---------------+---------+-------+--------+-----+-------+--------+
In the example above, G will mean in the smv column for recharge we be value. So that is what we would look for in our WHERE clause.
Here is the smv table:
+---+---+-----------+-----------+---+----+---------------------+-----+--+
| | A | B | C | D | E | F | value | va |
+---+---+-----------+-----------+---+----+---------------------+-----+--+
| 1 | 1 | X2 | Home | 4 | 10 | 2016-09-26 15:20:58 | | 7 |
| 2 | 2 | X2 | Recharge | 4 | 11 | 2016-09-26 15:20:58 | 9 | |
+---+---+-----------+-----------+---+----+---------------------+-----+--+
value in the same example as above would be 9, or 'RUU=' decoded.
So we do not know the name of the rows, until the row from smv is called, once we have this, we can look up what column name we need to get the correct information.
Hope this helps.
MORE INFO
At the point of triggering, we do not know what the row consists of the right data because some many of the fields would be empty. The map is there to help we query the right column, to get the right row (smv grows over time depending on whats uploaded.)
1) SELECT p.va FROM providers p WHERE p.network = 'Recharge' ;
2) SELECT s.* FROM smv s, providers p WHERE p.network = 'Recharge';
1) gives me the correct column I need to look up and query smv, using the above examples it would come back with "value". So I need to now look up, within the smv table, C = Recharge, and value = '9'. This should bring me back row 2 of the smv table.
So individually both 1 and 2 queries work, but I need them put together so the query is done on the database server.
Hope this gives more insight
Even More Info
From reading other posts, which are not really doing what I need, i have come up with this:
SELECT s.*
FROM (SELECT
(SELECT p.va
FROM dh_smv_providers p
WHERE p.provider_name = 'vodaphone'
LIMIT 1) AS net,
(SELECT p.bex
FROM dh_smv_providers p
WHERE p.provider_name = 'vodaphone'
LIMIT 1) AS bex
FROM dh_smv_providers) AS val, dh_smv_ s
WHERE s.provider_id = 'vodaphone' AND net = '20'
ORDER BY from_base64(val.bex) DESC;
The above comes back blank, but if i replace net, in the WHERE clause with a column I know exists, I do get the results expected:
SELECT s.*
FROM (SELECT
(SELECT p.va
FROM dh_smv_providers p
WHERE p.provider_name = 'vodaphone'
LIMIT 1) AS net,
(SELECT p.bex
FROM dh_smv_providers p
WHERE p.provider_name = 'vodaphone'
LIMIT 1) AS bex
FROM dh_smv_providers) AS val, dh_smv_ s
WHERE s.provider_id = 'vodaphone' AND value = '20'
ORDER BY from_base64(val.bex) DESC;
So what I am doing wrong, which is net, not showing the value derived from the subquery "value" ?
Thanks
SELECT
v.*,
p.network, p.va
FROM
smv_ v
INNER JOIN
providers p ON p.provider_id = v.provider_id
WHERE
p.network = 'RUU=' AND p.va = 'MjU='
LIMIT 1;
The tables talk to each other via the JOIN syntax. This completely circumvents the need (and limitations) of sub-selects.
The INNER JOIN means that only fully successful matches are returned, you may need to adjust this type of join for your situation but the SQL will return a row of all v columns where p.va = MjU and p.network = RUU and p.provider_id = v.provider_id.
What I was trying to explain in comments is that subqueries do not have any knowledge of their outer query:
SELECT *
FROM a
WHERE (SELECT * FROM b WHERE a)
AND (SELECT * FROM c WHERE a OR b)
This layout (as you have in your question) is that b knows nothing about a because the b query is executed first, then the c query, then finally the a query. So your original query is looking for WHERE p.provider_id = v.provider_id but v has not yet been defined so the result is false.

Trying to join two tables, one ordered to get flagged row first

I have these two tables:
prt_gebouw
id | name
----+------------
1 | Building A
2 | Building B
3 | Building C
prt_image
id | building_id | name | is_primary
----+---------------+-----------+------------
1 | 1 | img1.jpg | 0
2 | 1 | img2.jpg | 0
3 | 2 | img3.jpg | 0
4 | 1 | img4.jpg | 1
5 | 2 | img5.jpg | 1
As you can see here, some buildings have more than one image and some have none. When a building has one image or more, only one image can be marked as primary; can, for this is not mandatory.
Now, what I am trying to do is list all buildings (each building once) and join this with the images table, preferrably the primary image, empty cells if no image can be found.
So first I tried this:
SELECT
pgb.id,
pgb.name,
img.id AS image_id,
img.name AS image_name,
img.is_primary AS is_primary
FROM
prt_gebouw pgb
LEFT JOIN prt_image img ON pgb.id = img.object_id AND img.kind = 'object'
GROUP BY pgb.id
ORDER BY img.is_primary DESC, pgb.id ASC;
I suspect that the grouping is done before the ordering, because the wrong image is joined with each building that has more than one image ("wrong" being here: not the primary one).
Then I tried:
SELECT
pgb.id,
pgb.name,
img.id AS image_id,
img.name AS image_name,
img.is_primary
FROM
prt_gebouw pgb
LEFT JOIN (SELECT * FROM prt_image ORDER BY is_primary DESC) AS img ON img.object_id = pgb.id
ORDER BY pgb.id ASC;
I was hoping that for each building the primary image would be listed first, but not so. I suspect this is also the problem in the previous query, but is it?
And, more importantly, how can I solve this?
I think a correlated subquery might be easier for what you want:
select pgb.*,
(select i.id
from prt_image i
where i.object_id = pgb.id and i.kind = 'object'
order by is_primary desc
limit 1
) as img_id
from prt_gebouw pgb;
If you want the other fields from the image, join them in afterwards:
select pgb.*, i.* -- I'm using `*` for inconvenience; list the columns here
from (select pgb.*,
(select i.id
from prt_image i
where i.object_id = pgb.id and i.kind = 'object'
order by is_primary desc
limit 1
) as img_id
from prt_gebouw pgb
) pgb left join
prt_image i
on pgb.img_id = i.id;

Improving this MySQL Query - Select as sub-query

I have this query
SELECT
shot.hole AS hole,
shot.id AS id,
(SELECT s.id FROM shot AS s
WHERE s.hole = shot.hole AND s.shot_number > shot.shot_number AND shot.round_id = s.round_id
ORDER BY s.shot_number ASC LIMIT 1) AS next_shot_id,
shot.distance AS distance_remaining,
shot.type AS hit_type,
shot.area AS onto
FROM shot
JOIN course ON shot.course_id = course.id
JOIN round ON shot.round_id = round.id
WHERE round.uID = 78
This returns 900~ rows in around 0.7 seconds. This is OK-ish, but there are more lines like this required
(SELECT s.id FROM shot AS s
WHERE s.hole = shot.hole AND s.shot_number > shot.shot_number AND shot.round_id = s.round_id
ORDER BY s.shot_number ASC LIMIT 1) AS next_shot_id,
For example
(SELECT s.id FROM shot AS s
WHERE s.hole = shot.hole AND s.shot_number < shot.shot_number AND shot.round_id = s.round_id
ORDER BY s.shot_number ASC LIMIT 1) AS past_shot_id,
Adding this increases the load time to 10s of seconds which is far too long and the page often doesn't load at all or MySQL just locks up and using show processlist shows that the query is just sat there sending data.
Removing the ORDER BY s.shot_number ASC clause in those sub queries reduces the query time down to 0.05 seconds which is much much better. But the ORDER BY is required to ensure that the next or past row (shot) is returned, rather than any old random row.
How can I improve this query to make it run faster and return the same results. Perhaps my approach for obtaining the next and past rows is sub optimal and I need to look at a different way of returning those next and previous row IDs?
EDIT - additional background info
The query was fine on my testing domain, a subdomain. But when moved to the live domain the issues started. Hardly anything was changed yet the whole site came to halt because of these new slow queries. Key notes:
Different domain
Different folder in /var/www
Same DB
Same DB credentials
Same code
Added indexes in an attempt to fix - this didn't help
Could any of these affected the load time?
This will get marked down in a minute for 'not being an answer', but it illustrates a possible solution without simply handing it to you on a plate....
SELECT * FROM ints;
+---+
| i |
+---+
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
+---+
SELECT x.i, MIN(y.i) FROM ints x LEFT JOIN ints y ON y.i > x.i GROUP BY x.i;
+---+----------+
| i | MIN(y.i) |
+---+----------+
| 0 | 1 |
| 1 | 2 |
| 2 | 3 |
| 3 | 4 |
| 4 | 5 |
| 5 | 6 |
| 6 | 7 |
| 7 | 8 |
| 8 | 9 |
| 9 | NULL |
+---+----------+
To expand on Strawberry's answer, doing additional left-join for a "pre-query" to get all the prior / next IDs, then join out to get whatever details you need.
select
Shot.ID,
Shot.Hole,
Shot.Distance as Distance_Remaining,
Shot.Type as Hit_Type,
Shot.Area as Onto
PriorShot.Hole as PriorHole,
PriorShot.Distance as PriorDistanceRemain,
NextShot.Hole as NextHole,
NextShot.Distance as NextDistanceRemain
from
( SELECT
shot.id,
MIN(nextshot.id) as NextShotID,
MAX(priorshot.id) as PriorShotID
FROM
round
JOIN shot
on round.id = shot.round_id
LEFT JOIN shot nextshot
ON shot.round_id = nextshot.round_id
AND shot.hole = nextshot.hole
AND shot.shot_number < nextshot.shot_number
LEFT JOIN shot priorshot
ON shot.round_id = priorshot.round_id
AND shot.hole = priorshot.hole
AND shot.shot_number > priorshot.shot_number
WHERE
round.uID = 78
GROUP BY
shot.id ) AllShots
JOIN Shot
on AllShots.id = Shot.ID
LEFT JOIN shot PriorShot
on AllShots.PriorShotID = PriorShot.ID
LEFT JOIN shot NextShot
on AllShots.NextShotID = NextShot.ID
The inner query gets only those for round.uID = 78, then you can join to the next / prior as needed. I did not add the joins to the course and round tables as no result columns were presented, but could easily be added.
I wonder how well the following performs. It replaces the joining operations with string operations.
SELECT shot.hole AS hole, shot.id AS id,
substring_index(substring_index(shots, ',', find_in_set(shot.id, ss.shots) + 1), ',', -1
) as nextsi,
substring_index(substring_index(shots, ',', find_in_set(shot.id, ss.shots) - 1), ',', -1
) as prevsi,
shot.distance AS distance_remaining, shot.type AS hit_type, shot.area AS onto
FROM shot JOIN
course
ON shot.course_id = course.id JOIN
round
ON shot.round_id = round.id join
(select s.round_id, s.hole, group_concat(s.id order by s.shot_number) as shots
from shot s
group by s.round_id, s.hole
) ss
on ss.round_id = shot.round_id and ss.hole = shot.hole
WHERE round.uID = 78
Note that this doesn't work fully -- it will produce erroneous results on the first and last shot. I'm wondering how the performance is before fixing those details.