CROSS JOIN doesn't work properly - mysql

Does anyone know how it's possible that the queries
SELECT a.id, b.id FROM a CROSS JOIN b
and
SELECT a.id, b.id FROM b CROSS JOIN a
return the same result? In both cases, the records from the smaller table are assigned to the larger table. I want to get something like this:
+------+------+
| a.id | b.id |
+------+------+
| 1    | q    |
| 1    | w    |
| 1    | e    |
| 1    | r    |
| 2    | q    |
| 2    | w    |
| 2    | e    |
| 2    | r    |
+------+------+
but I'm getting a result like this:
+------+------+
| a.id | b.id |
+------+------+
| 1    | q    |
| 2    | q    |
| 1    | w    |
| 2    | w    |
| 1    | e    |
| 2    | e    |
| 1    | r    |
| 2    | r    |
+------+------+
It's strange that MySQL automatically chooses the order of the cross-joined tables depending on their size. I know I can use ORDER BY, but I need to do this with CROSS JOIN alone.
The underlying problem is more complex: I want to get 10 records per a.id. I saw a solution for that: row counting with an IF condition in the SELECT clause. That row counting requires the raw result (without ORDER BY) to be sorted by a.id. Is there any other way to do that?

No. Without an ORDER BY clause, no specific order is guaranteed. If you want a specific order to be maintained every time, use an ORDER BY clause. In your case:
SELECT a.id, b.id FROM a CROSS JOIN b
ORDER BY a.id;
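Regarding "I want to get 10 records per a.id":
Use a LIMIT clause along with ORDER BY as below; without ORDER BY you can never be sure of any order. Note that LIMIT applies to the whole result, so this returns the first 10 rows overall rather than 10 rows for each a.id. Check the MySQL documentation for more information.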
SELECT a.id, b.id FROM a CROSS JOIN b
ORDER BY a.id
LIMIT 0,10;

First, the two results that you show are the same. With no ORDER BY clause, SQL result sets, like SQL tables, represent unordered sets. The ordering is immaterial, so the sets are the same.
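As a rough illustration of the point about unordered sets, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, which is an assumption made only so the example is runnable; the helper name cross_join_rows and the sample rows are hypothetical. It compares the two CROSS JOIN queries as sets and then shows that adding ORDER BY makes the row order identical.

```python
import sqlite3

def cross_join_rows(order_by=""):
    """Cross join a hypothetical 2-row table a with a 4-row table b, both ways."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE a (id INTEGER);
        CREATE TABLE b (id TEXT);
        INSERT INTO a VALUES (1), (2);
        INSERT INTO b VALUES ('q'), ('w'), ('e'), ('r');
    """)
    # order_by is a fixed literal in this sketch, never user input.
    ab = conn.execute("SELECT a.id, b.id FROM a CROSS JOIN b " + order_by).fetchall()
    ba = conn.execute("SELECT a.id, b.id FROM b CROSS JOIN a " + order_by).fetchall()
    conn.close()
    return ab, ba

# Without ORDER BY only the set of pairs is defined, not their sequence.
ab, ba = cross_join_rows()
print(set(ab) == set(ba))

# With ORDER BY both queries return the rows in the same sequence.
ab_sorted, ba_sorted = cross_join_rows("ORDER BY a.id, b.id")
print(ab_sorted == ba_sorted)
```

Any row order seen without ORDER BY is an artifact of the engine's join strategy and should not be relied on.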
Your problem is quite different from this. If you want ten rows from table b for each record in table a, then you need to enumerate them. Typically, the fastest way in MySQL is to use a subquery and variables:
select a.*, b.*
from a left join
     (select b.*,
             (@rn := if(@a = b.aid, @rn + 1,
                        if(@a := b.aid, 1, 1)
                       )
             ) as seqnum
      from b cross join
           (select @rn := 0, @a := 0) params
      order by b.aid
     ) b
     on b.aid = a.aid  -- join condition assumed; a left join needs an ON clause
where seqnum <= 10
order by a.aid;
There are other solutions, but in MySQL versions without window functions this is usually the fastest.
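One of the other solutions can be sketched with a correlated subquery instead of user variables. This is a different technique from the variable-based query above, shown only as an illustration: it runs on Python's built-in sqlite3 (an assumption, standing in for MySQL), the table b with columns id and aid is hypothetical sample data, and n plays the role of the limit of 10.

```python
import sqlite3

def first_n_per_group(n):
    """Return the first n rows of b (by id) for each aid."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE b (id INTEGER PRIMARY KEY, aid INTEGER);
        INSERT INTO b (id, aid) VALUES
            (1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 2), (7, 2);
    """)
    rows = conn.execute("""
        SELECT b.aid, b.id
        FROM b
        -- Count the rows of the same group that sort at or before this one;
        -- that count is the row's position within its group.
        WHERE (SELECT COUNT(*)
               FROM b AS prev
               WHERE prev.aid = b.aid AND prev.id <= b.id) <= ?
        ORDER BY b.aid, b.id
    """, (n,)).fetchall()
    conn.close()
    return rows

print(first_n_per_group(2))  # [(1, 1), (1, 2), (2, 4), (2, 5)]
```

The correlated count does not depend on the physical row order, at the cost of one extra lookup per row, so an index on (aid, id) would matter on larger tables.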

Related

change the mysql order of the result with union

I get 10 results from my first select and 1 from the second one after a union like this:
(SELECT a.*,
b.*
FROM all a,
names b
WHERE b.name_id = a.name_id
ORDER BY name_id DESC
LIMIT 10)
UNION
(SELECT a.*,
b.*
FROM all a,
names b
WHERE b.name_id = a.name_id
ORDER BY request_id ASC
LIMIT 1)
I would like to get the result of the second select as the second-to-last row, like this:
********
name_id 100
name_id 99
name_id 98
name_id 97
name_id 96
name_id 95
name_id 94
name_id 93
name_id 92
name_id 1 <- second select result as second last result
name_id 91
********
Can someone help, please?
Synthesize a row number column for the query as it stands and shuffle positions as needed.
SELECT x.name
     , x.name_id
  FROM (
         SELECT @rownum := @rownum + 1 as row_number,
                t.name,
                t.name_id
           FROM (
                  -- original query from the question starts here
                  (SELECT b.name,
                          a.name_id
                     FROM allx a,
                          names b
                    WHERE b.name_id = a.name_id
                    ORDER BY name_id DESC
                    LIMIT 10)
                  UNION
                  (SELECT b.name,
                          a.name_id
                     FROM allx a,
                          names b
                    WHERE b.name_id = a.name_id
                    ORDER BY request_id ASC
                    LIMIT 1)
                ) t,
                (SELECT @rownum := 0) r
       ) x
 ORDER BY CASE row_number
            WHEN 10 THEN 11
            WHEN 11 THEN 10
            ELSE row_number
          END
;
(Note that the query has been slightly modified to avoid syntax errors and to support the demo: table all has been renamed to allx, and the union's subqueries use explicit projections.)
That gets complicated quickly, so outside of ad hoc reporting it is preferable to synthesize an attribute in the subqueries of the union that reflects a global order.
Demo here (SQL fiddle)
Credits
Row number synthesizing taken from this SO answer
Interesting question. Given
+----+--------+
| id | sname |
+----+--------+
| 1 | sname1 |
| 2 | sname2 |
| 3 | sname3 |
| 4 | sname4 |
| 5 | sname5 |
| 6 | sname6 |
+----+--------+
6 rows in set (0.001 sec)
(select id, sname, @r := @r + 1 rn
 from users
 cross join (select @r := 0) r
 order by sname desc limit 3
)
union
(
 select u.id, u.sname,
        @r := @r - .9
 from users u
 left join (select id from users order by sname desc limit 3) u1 on u1.id = u.id
 where u1.id is null
 order by u.id asc limit 0,1
)
order by rn;
A variable is used to calculate a row number in the first subquery. Since this variable is not reset in the second subquery, a simple piece of arithmetic works out where to position the second subquery's result. Note that the second subquery uses a left join to check that its result has not already appeared in the first subquery.
I would suggest union all and three selects:
SELECT an.*
FROM ((SELECT a.*, n.*, 1 as ord
       FROM `all` a JOIN  -- all is a reserved word, so it is quoted
            names n
            ON n.name_id = a.name_id
       ORDER BY n.name_id DESC
       LIMIT 9
      ) UNION ALL
      (SELECT a.*, n.*, 3 as ord
       FROM `all` a JOIN
            names n
            ON n.name_id = a.name_id
       ORDER BY n.name_id DESC
       LIMIT 1 OFFSET 9
      ) UNION ALL
      (SELECT a.*, n.*, 2 as ord
       FROM `all` a JOIN
            names n
            ON n.name_id = a.name_id
       ORDER BY request_id ASC
       LIMIT 1
      )
     ) an
ORDER BY ord, name_id DESC;
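The idea of giving each branch of the union a synthetic ordering attribute can be tried out on a small data set. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 rather than MySQL, a single hypothetical table req(name_id, request_id) instead of the joined tables from the question, and 100 made-up rows in which name_id 1 has the lowest request_id. Each branch is wrapped in its own subquery because SQLite does not accept ORDER BY or LIMIT directly inside a compound select.

```python
import sqlite3

def second_to_last_order():
    """Top 9 by name_id, then the row with the lowest request_id, then the 10th."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE req (name_id INTEGER, request_id INTEGER)")
    # Hypothetical data: name_id 1..100, request_id equal to name_id.
    conn.executemany("INSERT INTO req VALUES (?, ?)",
                     [(i, i) for i in range(1, 101)])
    rows = conn.execute("""
        SELECT name_id
        FROM (
            SELECT name_id, 1 AS ord
            FROM (SELECT name_id FROM req ORDER BY name_id DESC LIMIT 9)
            UNION ALL
            SELECT name_id, 2 AS ord
            FROM (SELECT name_id FROM req ORDER BY request_id ASC LIMIT 1)
            UNION ALL
            SELECT name_id, 3 AS ord
            FROM (SELECT name_id FROM req ORDER BY name_id DESC LIMIT 1 OFFSET 9)
        )
        ORDER BY ord, name_id DESC
    """).fetchall()
    conn.close()
    return [r[0] for r in rows]

print(second_to_last_order())
# [100, 99, 98, 97, 96, 95, 94, 93, 92, 1, 91]
```

Because the final ORDER BY sorts on ord first, the position of each branch is explicit and does not rely on the order in which the union happens to emit rows.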

Find the same sets of pairs

I have this schema in MySQL:
TableA (id integer PK, pid integer, mid integer)
Ex. data:
id | pid | mid
1 | 2 | 2
2 | 2 | 4
3 | 3 | 4
4 | 4 | 2
5 | 4 | 4
6 | 3 | 2
7 | 3 | 5
I have a pid with some mids and want to find all pids with the same set of mids. In the example, for pid=2 the answer is 2,4.
group_concat is not suitable for me.
I think it should be simple, but the answer eludes me.
UPD:
I have tried group_concat:
SELECT DISTINCT(b.pid) FROM (SELECT pid, group_concat(mid) as concated FROM TableA where pid=100293) as a, (select pid, group_concat(mid) as concated, COUNT(1) as count FROM TableA group by pid) as b where a.concated=b.concated;
Since you are working with integers, instead of group_concat you could generate a bitmask on distinct mid values for each pid and join on that. Then it's just math all the way down:
SELECT DISTINCT pid
FROM (SELECT pid, sum(pow(2,mid)) as midmask FROM (SELECT distinct pid, mid FROM tableA) as t1a GROUP BY pid) as t1
INNER JOIN (SELECT pid, sum(pow(2,mid)) as midmask FROM (SELECT distinct pid, mid FROM tableA) as t2a GROUP BY pid) as t2
ON t1.midmask = t2.midmask
If mid is already distinct for each pid, then you can get rid of the innermost subqueries.
The following uses @GordonLinoff's excellent single-subquery approach, where GROUP_CONCAT is only used in the main query (where it won't be so expensive). Instead of group_concat in the inner query, it uses the bitmask approach, which may be quicker.
SELECT midmask>>1, group_concat(pid)
FROM (SELECT pid, sum(pow(2,mid)) as midmask FROM (SELECT distinct pid, mid FROM tableA) as t1a GROUP BY pid) as t1
GROUP BY midmask;
Results:
+---------+-------------------+
| midmask | group_concat(pid) |
+---------+-------------------+
| 10 | 2,4 |
| 26 | 3 |
+---------+-------------------+
Obviously that midmask in the result set isn't super necessary, but you can pick out the values from the bitmask if you want to see the mid values that contributed to the match if you like.
I'm using the bit right-shift operator to ensure that the proper bit is set in the midmask result; otherwise you'll be off by one. If you don't care about the output of the midmask, then don't bother with the >>1 portion of the query.
You can use this query. It will give you comma separated pids.
select `mid`, group_concat(`pid`) from `tableA` group by `mid`;
In MySQL, I would approach this using group_concat():
select mids, group_concat(pid)
from (select pid, group_concat(mid order by mid) as mids
from t
group by pid
) t
group by mids;
This solves the general problem, for all pids. Solving for 1 pid is a bit tricky in MySQL (no window functions), but you can try:
select t.pid, t2.pid, count(*)
from t join
t t2
on t.mid = t2.mid and t2.pid = 2
group by t.pid, t2.pid
having count(*) = (select count(*) from t t3 where t3.pid = t.pid) and
       count(*) = (select count(*) from t t4 where t4.pid = t2.pid);
For this, you want indexes on t(mid, pid) and t(pid).
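The single-pid query can be checked against the sample data from the question. This is a minimal sketch, assuming Python's built-in sqlite3 as a stand-in for MySQL; the function name pids_with_same_mids is hypothetical, and the inner subqueries use their own aliases (t3, t4) so that they compare against the outer rows rather than against themselves.

```python
import sqlite3

def pids_with_same_mids(target_pid):
    """Return every pid whose set of mids equals that of target_pid."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE TableA (id INTEGER PRIMARY KEY, pid INTEGER, mid INTEGER);
        INSERT INTO TableA VALUES
            (1, 2, 2), (2, 2, 4), (3, 3, 4), (4, 4, 2),
            (5, 4, 4), (6, 3, 2), (7, 3, 5);
    """)
    rows = conn.execute("""
        SELECT t.pid
        FROM TableA t
        JOIN TableA t2 ON t.mid = t2.mid AND t2.pid = ?
        GROUP BY t.pid
        -- The number of shared mids must equal both set sizes.
        HAVING COUNT(*) = (SELECT COUNT(*) FROM TableA t3 WHERE t3.pid = t.pid)
           AND COUNT(*) = (SELECT COUNT(*) FROM TableA t4 WHERE t4.pid = ?)
        ORDER BY t.pid
    """, (target_pid, target_pid)).fetchall()
    conn.close()
    return [r[0] for r in rows]

print(pids_with_same_mids(2))  # [2, 4]
```

This relies on (pid, mid) pairs being unique, as they are in the sample data; duplicated pairs would inflate the counts.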

left join two tables where my variable is less than 5

I'm creating a query that selects from two tables and computes a total by counting a field in one of them.
Example:
Table A:
ID | email
1 | test@test
2 | test2@test
3 | test3@test
Table B
ID | email_id | username_id
1 | 1 | 11
2 | 1 | 22
3 | 2 | 33
My query:
select a.id, a.email, count(c.id) as total
from tableA a
left join tableC c on c.email_id = a.id AND total <= 5
group by a.email LIMIT 1
Output:
Unknown column 'total' in 'on clause'
I need to select the first "a.id" that has total <= 5. How can I do it?
Logically, SELECT is processed after the FROM (including ON) and WHERE clauses, so you cannot use a SELECT alias in them.
Use a HAVING clause instead:
select a.id, a.email, count(c.id) as total
from tableA a
left join tableC c on c.email_id = a.id
group by a.email
Having count(c.id) <= 5
LIMIT 1
I think MySQL also allows you to use the alias here:
Having total <= 5
Try HAVING Count(c.id) <= 5
Just to make this a bit clearer, since the correct answer has already been provided: you don't have to use the HAVING clause, and HAVING is not always the solution for this kind of problem.
The HAVING clause is usually used to filter on aggregated columns (SUM, COUNT, MAX, MIN, etc.). When you have a calculated column (colA + colB AS calc_column, for example), another approach, which should work here as well, is to wrap the query in another SELECT; the new column is then available in the WHERE clause:
SELECT *
FROM (The query here ) s
WHERE s.total <= 5
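Both forms, HAVING on the aggregate and a wrapping SELECT with WHERE, can be compared side by side. The sketch below is an illustration only: it runs on Python's built-in sqlite3 instead of MySQL (an assumption), joins the question's Table B under the name tableB, groups by both a.id and a.email, and adds ORDER BY a.id so that "first" is well defined. The function name and the limit parameter are hypothetical.

```python
import sqlite3

def first_email_with_at_most(limit):
    """Return the first row (by id) whose count is <= limit, computed two ways."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tableA (id INTEGER PRIMARY KEY, email TEXT);
        CREATE TABLE tableB (id INTEGER PRIMARY KEY, email_id INTEGER,
                             username_id INTEGER);
        INSERT INTO tableA VALUES (1, 'test@test'), (2, 'test2@test'),
                                  (3, 'test3@test');
        INSERT INTO tableB VALUES (1, 1, 11), (2, 1, 22), (3, 2, 33);
    """)
    # Variant 1: filter on the aggregate with HAVING.
    having = conn.execute("""
        SELECT a.id, a.email, COUNT(b.id) AS total
        FROM tableA a
        LEFT JOIN tableB b ON b.email_id = a.id
        GROUP BY a.id, a.email
        HAVING COUNT(b.id) <= ?
        ORDER BY a.id
        LIMIT 1
    """, (limit,)).fetchone()
    # Variant 2: wrap the grouped query so the alias is visible to WHERE.
    wrapped = conn.execute("""
        SELECT s.id, s.email, s.total
        FROM (SELECT a.id, a.email, COUNT(b.id) AS total
              FROM tableA a
              LEFT JOIN tableB b ON b.email_id = a.id
              GROUP BY a.id, a.email) s
        WHERE s.total <= ?
        ORDER BY s.id
        LIMIT 1
    """, (limit,)).fetchone()
    conn.close()
    return having, wrapped

print(first_email_with_at_most(1))
# ((2, 'test2@test', 1), (2, 'test2@test', 1))
```

With a limit of 5 every address qualifies, so both variants return the row for id 1.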

MySQL find user rank for each category

Let's say I have the following table:
user_id | category_id | points
-------------------------------
1 | 1 | 4
2 | 1 | 2
2 | 1 | 5
1 | 2 | 3
2 | 2 | 2
1 | 3 | 1
2 | 3 | 4
1 | 3 | 8
Could someone please help me to construct a query to return user's rank per category - something like this:
user_id | category_id | total_points | rank
-------------------------------------------
1 | 1 | 4 | 2
1 | 2 | 3 | 1
1 | 3 | 9 | 1
2 | 1 | 7 | 1
2 | 2 | 2 | 2
2 | 3 | 4 | 2
First, you need to get the total points per category. Then you need to enumerate them. In MySQL this is most easily done with variables:
SELECT user_id, category_id, points,
       (@rn := if(@cat = category_id, @rn + 1,
                  if(@cat := category_id, 1, 1)
                 )
       ) as rank
FROM (SELECT u.user_id, u.category_id, SUM(u.points) as points
      FROM users u
      GROUP BY u.user_id, u.category_id
     ) g cross join
     (SELECT @user := -1, @cat := -1, @rn := 0) vars
ORDER BY category_id, points desc;
You want to get the SUM of points for each user within each category_id:
SELECT u.user_id, u.category_id, SUM(u.points)
FROM users AS u
GROUP BY u.user_id, u.category_id
MySQL (before version 8.0) doesn't have analytic functions like other databases (Oracle, SQL Server), which would be very convenient for returning a result like this.
The first three columns are straightforward, just GROUP BY user_id, category_id and a SUM(points).
Getting the rank column returned is a bit more of a problem. Aside from doing that on the client, if you need to do that in the SQL statement, you could make use of MySQL user-defined variables.
SELECT @rank := IF(@prev_category = r.category_id, @rank+1, 1) AS rank
     , @prev_category := r.category_id AS category_id
     , r.user_id
     , r.total_points
  FROM (SELECT @prev_category := NULL, @rank := 1) i
 CROSS
  JOIN ( SELECT s.category_id, s.user_id, SUM(s.points) AS total_points
           FROM users s
          GROUP BY s.category_id, s.user_id
          ORDER BY s.category_id, total_points DESC
       ) r
 ORDER BY r.category_id, r.total_points DESC, r.user_id DESC
The purpose of the inline view aliased as i is to initialize user defined variables. The inline view aliased as r returns the total_points for each (user_id, category_id).
The "trick" is to compare the category_id value of the previous row with the value of the current row; if they match, we increment the rank by 1. If it's a "new" category, we reset the rank to 1. Note this only works if the rows are ordered by category, and then by total_points descending, so we need the ORDER BY clause. Also note that the order of the expressions in the SELECT list is important; we need to do the comparison of the previous value BEFORE it's overwritten with the current value, so the assignment to @prev_category must follow the conditional test.
Also note that if two users have the same total_points in a category, they will get distinct values for rank; the query above doesn't give the same rank for a tie. (The query could be modified to do that as well, but we'd also need to preserve total_points from the previous row, so we can compare it to the current row.)
Also note that this syntax is specific to MySQL, and that this behavior is not guaranteed.
If you need the columns in the particular sequence and/or the rows in a particular order (to get the exact resultset specified), we'd need to wrap the query above as an inline view.
SELECT t.user_id
     , t.category_id
     , t.total_points
     , t.rank
  FROM (
         SELECT @rank := IF(@prev_category = r.category_id, @rank+1, 1) AS rank
              , @prev_category := r.category_id AS category_id
              , r.user_id
              , r.total_points
           FROM (SELECT @prev_category := NULL, @rank := 1) i
          CROSS
           JOIN ( SELECT s.category_id, s.user_id, SUM(s.points) AS total_points
                    FROM users s
                   GROUP BY s.category_id, s.user_id
                   ORDER BY s.category_id, total_points DESC
                ) r
          ORDER BY r.category_id, r.total_points DESC, r.user_id DESC
       ) t
 ORDER BY t.user_id, t.category_id
NOTE: I've not set up a SQL Fiddle demonstration. I've given an example query which has only been desk checked.
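For comparison, the same ranking can be computed without user variables by counting, for each row, how many totals in the same category are higher. This is a swapped-in technique, not the variable-based query above, sketched on Python's built-in sqlite3 as a stand-in for MySQL (an assumption). The function name and the rnk alias are hypothetical; the data is the sample table from the question.

```python
import sqlite3

def rank_per_category():
    """Return (user_id, category_id, total_points, rank) for the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (user_id INTEGER, category_id INTEGER, points INTEGER);
        INSERT INTO users VALUES
            (1, 1, 4), (2, 1, 2), (2, 1, 5), (1, 2, 3),
            (2, 2, 2), (1, 3, 1), (2, 3, 4), (1, 3, 8);
    """)
    rows = conn.execute("""
        WITH totals AS (
            SELECT user_id, category_id, SUM(points) AS total_points
            FROM users
            GROUP BY user_id, category_id
        )
        SELECT t.user_id, t.category_id, t.total_points,
               -- Rank is one more than the number of higher totals
               -- in the same category, so tied totals share a rank.
               1 + (SELECT COUNT(*)
                    FROM totals o
                    WHERE o.category_id = t.category_id
                      AND o.total_points > t.total_points) AS rnk
        FROM totals t
        ORDER BY t.user_id, t.category_id
    """).fetchall()
    conn.close()
    return rows

for row in rank_per_category():
    print(row)
```

Unlike the variable-based version, this does not depend on the order in which rows are processed, and users with equal totals receive the same rank.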

Using 'GROUP BY' while preferring rows associated in another table

I have a table tbl_entries with the following structure:
+----+------+------+------+
| id | col1 | col2 | col3 |
+----+------+------+------+
| 11 | a | b | c |
| 12 | d | e | a |
| 13 | a | b | c |
| 14 | X | e | 2 |
| 15 | a | b | c |
+----+------+------+------+
And another table tbl_reviewlist with the following structure:
+----+-------+------+------+------+
| id | entid | cola | colb | colc |
+----+-------+------+------+------+
| 1 | 12 | N | Y | Y |
| 2 | 13 | Y | N | Y |
| 3 | 14 | Y | N | N |
+----+-------+------+------+------+
Basically, tbl_reviewlist contains reviews about the entries in tbl_entries. However, for some known reason, the entries in tbl_entries are duplicated. I am extracting the unique records by the following query:
SELECT * FROM `tbl_entries` GROUP BY `col1`, `col2`, `col3`;
However, any one of the duplicate rows from tbl_entries will be returned, regardless of whether it has been reviewed or not. I want the query to prefer rows that have been reviewed. How can I do that?
EDIT: I want to prefer rows that have been reviewed, but if there are rows that have not been reviewed yet, it should return those as well.
Thanks in advance!
Have you actually tried anything?
A hint: The SQL standard requires that every column in the result set of a query with a group by clause must be either
a grouping column
an aggregate function — sum(), count(), etc.,
a constant value/literal, or
an expression derived solely from the above.
Some broken implementations (and I believe MySQL is one of them) allow other columns to be included and offer their own...creative...behavior. If you think about it, group by essentially says to do the following:
Order this table by the grouping expressions
Partition it into subsets based on the group by sequence
Collapse each such partition into a single row computing the aggregate expressions as you go.
Once you've done that, what does it mean to ask for something that isn't uniform across the collapsed group partition?
If you have a table foo containing columns A, B, C, D and E and say something like
select A,B,C,D,E from foo group by A,B,C
per the standard, you should get a compile error. Deviant implementations [usually] treat this sort of query as the [rough] equivalent of
select *
from foo t
join ( select A,B,C
from foo
group by A,B,C
) x on x.A = t.A
and x.B = t.B
and x.C = t.C
But I wouldn't necessarily count on that without reviewing the documentation for the specific implementation that you are using.
If you want to find just reviewed entries, then something like this:
select *
from tbl_entries t
where exists ( select *
from tbl_reviewlist x
where x.entid = t.id
)
will do you. If, however, you want to find reviewed entries that are duplicated on col1, col2 and col3 then something like this should do you:
select *
from tbl_entries t
join ( select col1,col2,col3
from tbl_entries x
group by col1,col2,col3
having count(*) > 1
) d on d.col1 = t.col1
and d.col2 = t.col2
and d.col3 = t.col3
where exists ( select *
from tbl_reviewlist x
where x.entid = t.id
)
Since your problem statement is rather unclear, another take might be something along these lines:
select t.col1 ,
t.col2 ,
t.col3 ,
t.duplicate_count ,
coalesce(x.review_count,0) as review_count
from ( select col1 ,
col2 ,
col3 ,
count(*) as duplicate_count
from tbl_entries
group by col1 ,
col2 ,
col3
) t
left join ( select cola, colb, colc , count(*) as review_count
from tbl_reviewList
group by cola, colb, colc
having count(*) > 1
) x on x.cola = t.col1
and x.colb = t.col2
and x.colc = t.col3
order by sign(coalesce(x.review_count,0)) desc ,
t.col1 ,
t.col2 ,
t.col3
This query
summarizes the entries table, developing a count of how many times each col1/2/3 combination exists.
summarizes the review table, developing a count of reviews for each cola/b/c combination
joins them together matching cols a:1, b:2 c:3
orders them
preferring reviewed items to non-reviewed items by placing them first,
then by the col1/2/3 values.
I think there's a way with less repetition, but this should be a start:
select
tbl_entries.ID,
col1,
col2,
col3,
cola -- ... you get the idea ...
from (
select coalesce(min(entid), min(tbl_entries.ID)) as favID
from tbl_entries left join tbl_reviewlist on entid = tbl_entries.ID
group by col1, col2, col3
) as A join tbl_entries on tbl_entries.ID = favID
left join tbl_reviewlist on entid = tbl_entries.ID
Basically you distill the desired output to a list of core ID's and then re-map back to the data...
SELECT e.col1, e.col2, e.col3,
COALESCE(MIN(r.entid), MIN(e.id)) AS id
FROM tbl_entries AS e
LEFT JOIN tbl_reviewlist AS r
ON r.entid = e.id
GROUP BY e.col1, e.col2, e.col3 ;
Tested at SQL-Fiddle
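The COALESCE(MIN(...), MIN(...)) query can be run against the sample tables from the question. This is a minimal sketch on Python's built-in sqlite3, used here as a stand-in for MySQL (an assumption). Entry 16 is a hypothetical extra row, not part of the question, added to show the fallback for a group with no review; the keep_id alias and the final ORDER BY are also additions for the sake of a stable result.

```python
import sqlite3

def preferred_entries():
    """Pick one id per (col1, col2, col3) group, preferring reviewed entries."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tbl_entries (id INTEGER PRIMARY KEY,
                                  col1 TEXT, col2 TEXT, col3 TEXT);
        CREATE TABLE tbl_reviewlist (id INTEGER PRIMARY KEY, entid INTEGER,
                                     cola TEXT, colb TEXT, colc TEXT);
        INSERT INTO tbl_entries VALUES
            (11, 'a', 'b', 'c'), (12, 'd', 'e', 'a'), (13, 'a', 'b', 'c'),
            (14, 'X', 'e', '2'), (15, 'a', 'b', 'c'),
            (16, 'z', 'z', 'z');  -- hypothetical unreviewed entry
        INSERT INTO tbl_reviewlist VALUES
            (1, 12, 'N', 'Y', 'Y'), (2, 13, 'Y', 'N', 'Y'), (3, 14, 'Y', 'N', 'N');
    """)
    rows = conn.execute("""
        SELECT e.col1, e.col2, e.col3,
               -- MIN(r.entid) is NULL when no row of the group was reviewed,
               -- in which case the lowest entry id is used instead.
               COALESCE(MIN(r.entid), MIN(e.id)) AS keep_id
        FROM tbl_entries AS e
        LEFT JOIN tbl_reviewlist AS r ON r.entid = e.id
        GROUP BY e.col1, e.col2, e.col3
        ORDER BY keep_id
    """).fetchall()
    conn.close()
    return rows

for row in preferred_entries():
    print(row)
```

For the duplicated group (a, b, c) the reviewed entry 13 is chosen over 11 and 15, while the unreviewed group falls back to its own id.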