MySQL Query for Combinations - mysql

Can anyone please guide me in writing a MySQL query for the following scenario?
The data in the table looks like this:
Table Name: Vals
| V1 | V2 | V3 |
+-----------+----+---------+
| 143 | 1 | 1 |
| 2003 | 2 | 6 |
I want the result to look like this, which is every combination of the values in V2 and V3 for each value of V1:
| V1 | V2 | V3 |
+-----------+----+---------+
| 143 | 1 | 1 |
| 143 | 1 | 6 |
| 143 | 2 | 1 |
| 143 | 2 | 6 |
| 2003 | 1 | 1 |
| 2003 | 1 | 6 |
| 2003 | 2 | 1 |
| 2003 | 2 | 6 |

You need to use something like this to get all combinations:
SELECT DISTINCT a.V1,
b.V2,
c.V3
FROM Vals a,
Vals b,
Vals c
To get the result sorted, add ORDER BY, and the query then looks like this:
SELECT DISTINCT a.V1,
b.V2,
c.V3
FROM Vals a,
Vals b,
Vals c
ORDER BY 1,
2,
3
Tested it on my table and it worked, hope it helps you.
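For anyone who wants to try this without a MySQL server, here is a minimal sketch that runs the same kind of self cross join against an in-memory SQLite database from Python. SQLite standing in for MySQL is an assumption of the sketch, and the explicit CROSS JOIN spelling is used in place of the comma syntax; the two are equivalent here.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Vals (V1 INTEGER, V2 INTEGER, V3 INTEGER)")
conn.executemany(
    "INSERT INTO Vals VALUES (?, ?, ?)",
    [(143, 1, 1), (2003, 2, 6)],
)

# Each alias contributes one column, so the self cross join pairs every V1
# with every V2 and every V3. DISTINCT drops duplicate combinations.
combos = conn.execute(
    """
    SELECT DISTINCT a.V1, b.V2, c.V3
    FROM Vals a
    CROSS JOIN Vals b
    CROSS JOIN Vals c
    ORDER BY a.V1, b.V2, c.V3
    """
).fetchall()

for row in combos:
    print(row)
```

With two rows in the table, the join produces 2 * 2 * 2 = 8 combinations, which matches the desired output in the question.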

Related

Precalculate numbers of records for each possible combination

I have a mySQL database table containing cellphones information like this:
ID Brand Model Price Type Size
==== ===== ===== ===== ====== ====
1 Apple A71 3128 A 40
2 Samsung B7C 3128 B 20
3 Apple ZX5 3128 A 30
4 Huawei Q32 2574 B 40
5 Apple A21 2574 A 25
6 Apple A71 3369 A 30
7 Samsung A71 7413 C 40
Now I want to create another table, that would contain counts for every possible combination of the parameters.
Params Count
============================================== =======
ALL 1000000
Brand(Apple) 20000
Brand(Apple,Samsung) 40000
Brand(Apple),Model(A71) 7100
Brand(Apple),Type(A) 6000
Brand(Apple),Model(A71,B7C),Type(A,B) 7
Model(A71) 12514
Model(A71,B7C) 26584
Model(A71),Type(A) 6521
Model(A71),Type(A,B) 8958
Model(A71),Type(A,B),Size(40) 85
And so on for every possible combination. I was thinking about creating a stored procedure (that I would execute periodically) that would run a query for every existing condition like that, but I am a little stuck on exactly what it should look like. Or is there a better way to do this?
Edit: the reason why I want to store information like this is to be able to show number of results in filter in client application, like in the picture.
I would like to create index on the Params column to be able to get the Count number for given hash instantly, improving performance.
I also tried querying and caching the values dynamically, but I want to try this approach as well, so I can compare which one is more effective.
This is how I am calculating the counts now:
SELECT COUNT(*) FROM products;
SELECT COUNT(*) FROM products WHERE Brand IN ('Apple');
SELECT COUNT(*) FROM products WHERE Brand IN ('Apple', 'Samsung');
SELECT COUNT(*) FROM products WHERE Brand IN ('Apple') AND Model IN ('A71');
etc.
You can use a ROLLUP for this.
SELECT
model, type, size, COUNT(*)
FROM mytab
GROUP BY 1, 2, 3
WITH ROLLUP
With your sample data, we get the following:
| model | type | size | COUNT(*) |
| ----- | ---- | ---- | -------- |
| A21 | A | 25 | 1 |
| A21 | A | | 1 |
| A21 | | | 1 |
| A71 | A | 30 | 1 |
| A71 | A | 40 | 1 |
| A71 | A | | 2 |
| A71 | C | 40 | 1 |
| A71 | C | | 1 |
| A71 | | | 3 |
| B7C | B | 20 | 1 |
| B7C | B | | 1 |
| B7C | | | 1 |
| Q32 | B | 40 | 1 |
| Q32 | B | | 1 |
| Q32 | | | 1 |
| ZX5 | A | 30 | 1 |
| ZX5 | A | | 1 |
| ZX5 | | | 1 |
| | | | 7 |
The subtotals are present in the rows with null values in different columns, and the total is the last row where all group by columns are null.
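To make the subtotal rows concrete, here is a rough sketch that reproduces the same 19 rows from Python. It assumes SQLite instead of MySQL; SQLite has no WITH ROLLUP, so the sketch swaps in a UNION ALL of one GROUP BY per rollup level, which is a different technique that gives the same result. The table name mytab is taken from the answer; the lowercase column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytab (id INTEGER, brand TEXT, model TEXT, "
    "price INTEGER, type TEXT, size INTEGER)"
)
conn.executemany(
    "INSERT INTO mytab VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Apple", "A71", 3128, "A", 40),
        (2, "Samsung", "B7C", 3128, "B", 20),
        (3, "Apple", "ZX5", 3128, "A", 30),
        (4, "Huawei", "Q32", 2574, "B", 40),
        (5, "Apple", "A21", 2574, "A", 25),
        (6, "Apple", "A71", 3369, "A", 30),
        (7, "Samsung", "A71", 7413, "C", 40),
    ],
)

# One SELECT per rollup level: full detail, (model, type), model, grand total.
rollup = conn.execute(
    """
    SELECT model, type, size, COUNT(*) FROM mytab GROUP BY model, type, size
    UNION ALL
    SELECT model, type, NULL, COUNT(*) FROM mytab GROUP BY model, type
    UNION ALL
    SELECT model, NULL, NULL, COUNT(*) FROM mytab GROUP BY model
    UNION ALL
    SELECT NULL, NULL, NULL, COUNT(*) FROM mytab
    """
).fetchall()

# Index the counts by (model, type, size); None marks a rolled-up column.
counts = {(m, t, s): n for m, t, s, n in rollup}
print(counts[("A71", None, None)])  # subtotal for model A71
print(counts[(None, None, None)])   # grand total
```

Note that a rollup only produces the hierarchical subtotals (model, then model and type, then all three). Counts for other subsets, such as type on its own, would need additional GROUP BY queries.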

How to update a column with the number of rows that have a matching column pair?

I have a table called related_clues which lists the id's of pairs of clues which are related
| id | clue_id | related_clue_id | relatedness |
+----+---------+-----------------+-------------+
| 1 | 1 | 232 | 1 |
| 2 | 1 | 306 | 1 |
| 3 | 1 | 458 | 1 |
| 4 | 2 | 620 | 1 |
| 5 | 2 | 72 | 1 |
| 6 | 3 | 212 | 1 |
| 7 | 3 | 232 | 1 |
| 8 | 3 | 412 | 1 |
| 9 | 3 | 300 | 1 |
+----+---------+-----------------+-------------+
Eventually after a while we may reach two id's such as:
+--------+---------+-----------------+-------------+
| id | clue_id | related_clue_id | relatedness |
+--------+---------+-----------------+-------------+
| 121267 | 1636 | 38 | 1 |
| 121331 | 1636 | 38 | 1 |
+--------+---------+-----------------+-------------+
So in this case, for two distinct id values, we have the same (clue_id, related_clue_id) pair
In this case I would like the relatedness value to be updated to 2, signalling that there are two examples of this (clue_id, related_clue_id) pair. Like so:
+--------+---------+-----------------+-------------+
| id | clue_id | related_clue_id | relatedness |
+--------+---------+-----------------+-------------+
| 121267 | 1636 | 38 | 2 |
| 121331 | 1636 | 38 | 2 |
+--------+---------+-----------------+-------------+
So essentially I would like to run some SQL that sets the relatedness value to the number of times a (clue_id, related_clue_id) pair appears.
When I have no relatedness column present, and I simply run the SQL:
SELECT id, clue_id, related_clue_id, COUNT(*) AS relatedness
FROM `related_clues`
GROUP BY clue_id, related_clue_id
It gives me the required result, but of course this doesn't store the relatedness column, it simply shows the column if I run this select. So how do I permanently have this relatedness column?
You could use an UPDATE with a JOIN. The HAVING clause limits the update to pairs that occur more than once:
UPDATE related_clues a
INNER JOIN (
SELECT clue_id, related_clue_id, COUNT(*) AS relatedness
FROM `related_clues`
GROUP BY clue_id, related_clue_id
HAVING COUNT(*) > 1
) t ON t.clue_id = a.clue_id
AND t.related_clue_id = a.related_clue_id
SET a.relatedness = t.relatedness
I would approach this as an update/join but filter out rows that don't need to be updated:
update related_clues rc join
(select clue_id, related_clue_id, COUNT(*) AS cnt
from `related_clues`
group by clue_id, related_clue_id
) t
on t.clue_id = rc.clue_id and
t.related_clue_id = rc.related_clue_id
set rc.relatedness = t.cnt
where rc.relatedness <> t.cnt;
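As a quick way to check the expected outcome, here is a small sketch using Python's built-in sqlite3 module. It is not the MySQL UPDATE ... JOIN syntax from the answers: it assumes SQLite and swaps in a correlated subquery, which SQLite accepts but MySQL rejects when the subquery reads the table being updated. The sample rows are a subset of the ones in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE related_clues (
        id INTEGER PRIMARY KEY,
        clue_id INTEGER,
        related_clue_id INTEGER,
        relatedness INTEGER DEFAULT 1
    )
    """
)
conn.executemany(
    "INSERT INTO related_clues VALUES (?, ?, ?, 1)",
    [(1, 1, 232), (2, 1, 306), (121267, 1636, 38), (121331, 1636, 38)],
)

# For every row, count how many rows share its (clue_id, related_clue_id) pair.
conn.execute(
    """
    UPDATE related_clues
    SET relatedness = (
        SELECT COUNT(*)
        FROM related_clues AS t
        WHERE t.clue_id = related_clues.clue_id
          AND t.related_clue_id = related_clues.related_clue_id
    )
    """
)

# Map id -> relatedness so the stored values are easy to inspect.
result = dict(conn.execute("SELECT id, relatedness FROM related_clues").fetchall())
print(result)
```

The two rows for the pair (1636, 38) end up with relatedness 2, and the pairs that occur once keep relatedness 1.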

MySQL: Move data from multiple rows to one row based on column value

I have MySQL table in the following format. This is an output from a program that I run and I cannot change it.
+---+------------------------+
| | A B C D E |
+---+------------------------+
| | model amz wmt abt tgt |
| 1 | c3000 100 |
| 2 | c3000 200 |
| 3 | c3000 150 |
| 4 | c3000 125 |
| 5 | A1234 135 |
| 6 | A1234 105 |
+---+------------------------+
I want to move all the rows into one single row based on the value in column 1 i.e model. The caveat is that the blank rows are not actually blank and contain a null character
DESIRED OUTPUT:
+---+-----------------------+
| | A B C D E |
+---+-----------------------+
| | model amz wmt abt tgt |
| 1 | c3000 100 200 150 125 |
| 2 | A1234 200 105 135 |
+---+-----------------------+
I tried using
select model,group_concat(wmt),group_concat(amz)
from table_name
group by model
And the output that I get is riddled with commas
+---+----------------------------------+
| | A B |
+---+----------------------------------+
| | model amz wmt |
| 1 | c3000 ,,,,100,,,, ,,,200,,,, |
| 2 | A1234 ,,200,,,,,, ,105,,,,,, |
+---+----------------------------------+
You can use TRIM and IF to convert blank values to null.
SELECT
model,
GROUP_CONCAT(IF(TRIM(wmt) = '', NULL, wmt)),
GROUP_CONCAT(IF(TRIM(amz) = '', NULL, amz))
FROM
table_name
GROUP BY model
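Alternatively, if the blank cells are real NULLs, you can aggregate each column with MIN, which ignores NULL values:
SELECT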
model,
MIN(amz) AS amz,
MIN(wmt) AS wmt,
MIN(abt) AS abt,
MIN(tgt) AS tgt
FROM
table_name
GROUP BY
model
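The two ideas above (turn blanks into NULL, then aggregate per model) can be combined. The sketch below is only an illustration under several assumptions: SQLite instead of MySQL, a hypothetical table name prices, blank cells stored as empty strings or spaces (a literal NUL byte would need a different cleanup step), and a guessed placement of each value in its column, since the flattened sample in the question does not show that.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices (model TEXT, amz TEXT, wmt TEXT, abt TEXT, tgt TEXT)"
)
# Column placement of each value is assumed for illustration.
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?, ?, ?)",
    [
        ("c3000", "100", "", "", ""),
        ("c3000", " ", "200", "", ""),
        ("c3000", "", "", "150", ""),
        ("c3000", "", "", "", "125"),
        ("A1234", "", "", "135", ""),
        ("A1234", "", "105", "", ""),
    ],
)

# NULLIF(TRIM(x), '') maps blank cells to NULL; MAX then skips the NULLs
# and keeps the single real value per model and column.
merged = conn.execute(
    """
    SELECT model,
           MAX(NULLIF(TRIM(amz), '')) AS amz,
           MAX(NULLIF(TRIM(wmt), '')) AS wmt,
           MAX(NULLIF(TRIM(abt), '')) AS abt,
           MAX(NULLIF(TRIM(tgt), '')) AS tgt
    FROM prices
    GROUP BY model
    ORDER BY model
    """
).fetchall()

for row in merged:
    print(row)
```

A column with no value for a model comes back as NULL (None in Python) instead of a run of commas.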

What happens if I select two tables with no WHERE clause?

I had a technical interview last week, and my interviewer asked me what happens if I run the following query:
SELECT * FROM tbl1, tbl2
I think I answered it correctly, but it wasn't an in-depth answer.
I said that I would select all the columns in both tables. For example if tbl1 has 3 columns, and tbl2 has 4 columns. The result set would have 7 columns.
Then he asked me why 7? and I said because I was selecting everything from each table.
That was a bad answer, but I couldn't think of anything else.
To cut to the chase, after the interview I executed that statement using two tables.
Table A had 3 animals: dog, cat and elephant.
Table B had 2 names: Mat and Beth
This is the result set that I got after the statement being executed:
*********************************************
| id_tbl1 | name_tbl1 | id_tbl2 | name_tbl2 |
*********************************************
| 1 | dog | 1 | Mat |
| 2 | cat | 1 | Mat |
| 3 | elephant | 1 | Mat |
| 1 | dog | 2 | Beth |
| 2 | cat | 2 | Beth |
| 3 | elephant | 2 | Beth |
*********************************************
So my question is, why does the statement behave like that?
In other words:
Why do Table B's records repeat themselves until I reach the end of Table A, and then it starts all over again?
How would you have answered the question in a way that it would've "WOW'd" the interviewer?
If this question does not belong to SO, feel free to delete it or close it!
If you do a select like this, all rows in one resultset are joined to all rows in the other resultset (Cartesian Product).
So you get a list of all rows of the first table with the first row of the second table, Then all entries for the second row and so on. The order may be an implementation detail. Not sure if it is defined that the first order is by the first table, it might be different across implementations.
If you join three tables (or more), then the same happens with all rows of all tables. This, of course, is not only for tables, but for any result set from joins.
The result will be a Cartesian product.
Take a look at this example:
SQL Example
You can see there are two tables: one has 5 records and the other has 4, and the result is 20 records. That is 5 * 4 = 20 rows, not 5 + 4 = 9 as you are assuming.
Table1
| IDX | VAL |
---------------
| 1 | 1val1 |
| 1 | 1val2 |
| 2 | 2val1 |
| 2 | 2val2 |
| 2 | 2val3 |
Table2
| ID | POINTS |
---------------
| 1 | 2 |
| 2 | 10 |
| 3 | 21 |
| 4 | 29 |
Result of below query
SELECT * FROM Table1 , Table2
| IDX | VAL | ID | POINTS |
-----------------------------
| 1 | 1val1 | 1 | 2 |
| 1 | 1val1 | 2 | 10 |
| 1 | 1val1 | 3 | 21 |
| 1 | 1val1 | 4 | 29 |
| 1 | 1val2 | 1 | 2 |
| 1 | 1val2 | 2 | 10 |
| 1 | 1val2 | 3 | 21 |
| 1 | 1val2 | 4 | 29 |
| 2 | 2val1 | 1 | 2 |
| 2 | 2val1 | 2 | 10 |
| 2 | 2val1 | 3 | 21 |
| 2 | 2val1 | 4 | 29 |
| 2 | 2val2 | 1 | 2 |
| 2 | 2val2 | 2 | 10 |
| 2 | 2val2 | 3 | 21 |
| 2 | 2val2 | 4 | 29 |
| 2 | 2val3 | 1 | 2 |
| 2 | 2val3 | 2 | 10 |
| 2 | 2val3 | 3 | 21 |
| 2 | 2val3 | 4 | 29 |
I think you are confusing yourself by running an example with two tables that have identical fields. You seem to be thinking of a UNION, which appends the rows of one table to those of another; for tables of 3 and 4 rows that gives at most 3 + 4 = 7 rows.
The comma-separated FROM clause performs a cross join, which pairs every row of Table X with every row of Table Y. That yields (rows in X) * (rows in Y) rows; for tables of 3 and 4 rows that is 3 * 4 = 12.
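The row and column arithmetic is easy to verify with the animal and name tables from the question. This is a small sketch using Python's sqlite3 module, assuming SQLite treats the comma-separated FROM list the same way MySQL does, which holds for a plain cross join like this one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl1 (id_tbl1 INTEGER, name_tbl1 TEXT)")
conn.execute("CREATE TABLE tbl2 (id_tbl2 INTEGER, name_tbl2 TEXT)")
conn.executemany(
    "INSERT INTO tbl1 VALUES (?, ?)", [(1, "dog"), (2, "cat"), (3, "elephant")]
)
conn.executemany("INSERT INTO tbl2 VALUES (?, ?)", [(1, "Mat"), (2, "Beth")])

# No join condition: every row of tbl1 is paired with every row of tbl2.
cur = conn.execute("SELECT * FROM tbl1, tbl2")
rows = cur.fetchall()
columns = [d[0] for d in cur.description]

print(len(columns), "columns:", columns)  # column counts add up: 2 + 2
print(len(rows), "rows")                  # row counts multiply: 3 * 2
```

So the column counts of the two tables are added, while the row counts are multiplied. The order in which the pairs come back is not guaranteed without an ORDER BY.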

ORDER BY complex

I have a MySQL query: select * from table ORDER BY v1, v2 ASC
Can a query be written that sorts v1, v2 as shown below?
+---------------+-----------------------+------+-----+
| id | name | v1 | v2 |
+---------------+-----------------------+------+-----+
| 1 | a | 1 | A |
| 2 | a | 2 | B |
| 3 | a | 3 | C |
| 4 | a | 1 | A |
| 5 | a | 2 | B |
| 6 | a | 3 | C |
| 7 | a | 1 | A |
| 7 | a | 2 | B |
| 7 | a | 3 | C |
+---------------+-----------------------+------+-----+
SQL fiddle
You need another column to sort like that. You have to tell MySQL why ids 1,2,3 come before 4,5,6. If you have another column that is e.g. 1 for 1,2,3, 2 for 4,5,6 etc, you can sort with:
ORDER BY missing_col, v1, v2
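Here is a small sketch of that idea, run against SQLite from Python. The table name items, the grouping column batch (playing the role of missing_col), and the six sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, v1 INTEGER, v2 TEXT, batch INTEGER)")
# Rows are inserted out of order on purpose; batch says which group a row is in.
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        (4, 1, "A", 2),
        (2, 2, "B", 1),
        (6, 3, "C", 2),
        (1, 1, "A", 1),
        (5, 2, "B", 2),
        (3, 3, "C", 1),
    ],
)

# Sorting on v1, v2 alone puts all the 1/A rows first.
plain = conn.execute("SELECT id, v1, v2 FROM items ORDER BY v1, v2").fetchall()

# Sorting on the grouping column first restores the repeating 1, 2, 3 pattern.
ordered = conn.execute(
    "SELECT id, v1, v2 FROM items ORDER BY batch, v1, v2"
).fetchall()

print([v1 for _, v1, _ in plain])
print([v1 for _, v1, _ in ordered])
```

Without the extra column, the database has nothing to distinguish the first 1/A row from the second, so no ORDER BY on v1 and v2 alone can produce the repeating pattern.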
You can try this:
Select id, name, v1, v2, (v1 + ASCII(v2)) as mySum
from table
order by mySum
ASCII http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_ascii