Generate list of distinct dimensions in a table - mysql

I have a table with 3 dimensions: A, B, and C.
I essentially want rows for all possible combinations of these dimensions, with every measure (M) populated as 0 when a combination isn't present.
Suppose I have the table:
If I do this, I get:
select a, b, c, sum(m) from fact group by a, b, c
But I would like all possible combinations:
Currently I am doing a cross join like below, but is there a faster way to do this (my table has about 1M records)?
select * from (
select distinct f1.a, f2.b, f3.c
from fact f1
cross join fact f2
cross join fact f3 ) all_combos
left join
( select a, b, c, sum(m) as m from fact group by a, b, c ) s
on all_combos.a = s.a and all_combos.b = s.b and all_combos.c = s.c

If this is Oracle Database, then this is exactly what cube is for.
select a, b, c, sum(m)
from my_table
group by cube(a,b,c)

MySQL:
GROUP BY a,b,c will produce 1 row per combination that exists in the table.
If you want all possible combinations, you need to build 1 (or 3) more tables to list all the possible values, then do a LEFT JOIN from them. You may also want COALESCE(col, 0) to turn NULLs into zeros.
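A minimal sketch of the dimension-list approach described above, run through Python's built-in sqlite3 module so it can be tried without a MySQL server. The sample rows, the derived-table aliases (da, db, dc, s) and the column alias total are assumptions made for illustration. The dimension lists here are derived from fact itself with SELECT DISTINCT, so only values that occur at least once are covered; separate lookup tables would take their place if they exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact (a TEXT, b TEXT, c TEXT, m INTEGER);
    INSERT INTO fact VALUES
        ('a1', 'b1', 'c1', 10),
        ('a1', 'b1', 'c1', 5),
        ('a2', 'b2', 'c1', 7);
""")

# Build one small list of distinct values per dimension, cross join the
# three lists, then LEFT JOIN the aggregated facts onto every combination.
query = """
    SELECT da.a, db.b, dc.c, COALESCE(s.total, 0) AS total
    FROM (SELECT DISTINCT a FROM fact) AS da
    CROSS JOIN (SELECT DISTINCT b FROM fact) AS db
    CROSS JOIN (SELECT DISTINCT c FROM fact) AS dc
    LEFT JOIN (
        SELECT a, b, c, SUM(m) AS total
        FROM fact
        GROUP BY a, b, c
    ) AS s
      ON s.a = da.a AND s.b = db.b AND s.c = dc.c
    ORDER BY da.a, db.b, dc.c
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Taking DISTINCT on each column before the cross join keeps the intermediate result at (distinct a) x (distinct b) x (distinct c) rows, rather than the cube of the full table size.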

Related

Explaining MySQL query with multiple tables listed in FROM

Tables a and b are not directly related.
How do a and b each contribute to the results of this query?
select * from a,b where b.id in (1,2,3)
Can you explain this SQL?
Since you haven't specified a relationship between a and b, this produces a cross product. It's equivalent to:
SELECT *
FROM a
CROSS JOIN b
WHERE b.id IN (1, 2, 3)
It will combine every row in a with the three selected rows from b. If a has 100 rows, the result will be 300 rows.
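The row-count arithmetic above can be checked with a small sketch using Python's built-in sqlite3 module. The column val in table a and the five ids in table b are assumptions made up for the demonstration; SQLite is standing in for MySQL here, and both treat the comma form as a cross join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (val INTEGER)")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY)")

# 100 rows in a; ids 1..5 in b, of which the query keeps 1, 2 and 3.
conn.executemany("INSERT INTO a (val) VALUES (?)", [(i,) for i in range(100)])
conn.executemany("INSERT INTO b (id) VALUES (?)", [(i,) for i in range(1, 6)])

comma_count = conn.execute(
    "SELECT COUNT(*) FROM a, b WHERE b.id IN (1, 2, 3)"
).fetchone()[0]
cross_count = conn.execute(
    "SELECT COUNT(*) FROM a CROSS JOIN b WHERE b.id IN (1, 2, 3)"
).fetchone()[0]

# Both forms pair each of the 100 rows in a with the 3 selected rows in b.
print(comma_count, cross_count)
```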
What you are using is a multi-table SELECT.
Multitable SELECT (M-SELECT) is similar to the join operation. You
select values from different tables, use WHERE clause to limit the
rows returned and send the resulting single table back to the
originator of the query.
The difference with M-SELECT is that it would return multiple tables
as the result set. For more details: https://dev.mysql.com/worklog/task/?id=358
In other words, your query is:
SELECT *
FROM a
CROSS JOIN b
WHERE b.id in (1,2,3)

MySQL: SELECT DISTINCT multiple columns, with the first result on some columns

Let's say I have columns a, b in a table in a MySQL database. What I'm trying to do is to select the distinct values of a with an arbitrary value of b - let's say the first one, but I actually don't care which one.
Something like the query below will give me all distinct values on both columns, so it is not good for me (too many results in my case).
SELECT DISTINCT a, b
FROM my_table;
Any suggestions?
In case I want 2 values of b for each a value, how is that possible?
Use the GROUP BY feature, like:
SELECT a, b
FROM my_table
GROUP BY a;
See my SQL Fiddle.
UPDATE
No DISTINCT is needed at all.
Thanks to dnoeth for the suggestion.
This is just a guess at what I think you're trying to do:
SELECT DISTINCT t1.a AS distinct_a,
( SELECT t2.b FROM my_table t2 WHERE t2.a = t1.a LIMIT 1 ) AS arbitrary_b
FROM my_table t1;
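A correlated subquery with LIMIT 1 is one way to get a single, unspecified b per distinct a. The sketch below runs that idea through Python's built-in sqlite3 module; the sample rows and the aliases t1, t2 and arbitrary_b are assumptions for illustration. Without an ORDER BY inside the subquery, which b is returned is up to the engine, so the check only asserts that it belongs to the right a.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (a TEXT, b INTEGER);
    INSERT INTO my_table VALUES
        ('x', 1), ('x', 2), ('y', 3), ('y', 4), ('z', 5);
""")

# For each outer row, the subquery picks one b that shares the same a.
rows = conn.execute("""
    SELECT DISTINCT t1.a,
           (SELECT t2.b FROM my_table AS t2
             WHERE t2.a = t1.a
             LIMIT 1) AS arbitrary_b
    FROM my_table AS t1
    ORDER BY t1.a
""").fetchall()

allowed = {"x": {1, 2}, "y": {3, 4}, "z": {5}}
for a, b in rows:
    print(a, b, b in allowed[a])
```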

Select distinct rows by multiple columns and retrieve yet another column which is not taking part in determining the distinctness

So here's how I select distinct rows by a combination of multiple columns (a, b and c):
select distinct a,b,c from my_table
This is good, but I need yet another column retrieved for these rows (d) which I can't add to the select part, because then it also plays a role in determining row uniqueness which I don't want.
How can I retrieve an additional column without it affecting row uniqueness?
You can do this using a group by. In MySQL, you can do:
select a, b, c, d
from my_table
group by a, b, c
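This chooses an arbitrary value for "d", which would typically (but not guaranteed!) be the first value encountered. This uses a feature of MySQL called Hidden Columns. Note that with the ONLY_FULL_GROUP_BY SQL mode enabled (the default since MySQL 5.7.5), this query is rejected unless d is functionally dependent on the grouped columns.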
For code that works in MySQL and other databases, you need to be more explicit:
select a, b, c, min(d)
from my_table
group by a, b, c
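As a quick check of the portable version, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration. MIN(d) makes the choice of d deterministic: each (a, b, c) group yields its smallest d.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (a INTEGER, b INTEGER, c INTEGER, d TEXT);
    INSERT INTO my_table VALUES
        (1, 1, 1, 'pear'),
        (1, 1, 1, 'apple'),
        (1, 2, 1, 'kiwi'),
        (2, 2, 2, 'plum'),
        (2, 2, 2, 'fig');
""")

# One row per distinct (a, b, c); d does not affect the grouping.
rows = conn.execute("""
    SELECT a, b, c, MIN(d) AS d
    FROM my_table
    GROUP BY a, b, c
    ORDER BY a, b, c
""").fetchall()
for row in rows:
    print(row)
```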
Getting an actual random value for d in MySQL is a bit trickier and requires more work. Here is one way:
select distinct a, b, c,
(select d from my_table mt2
where mt.a = mt2.a and mt.b = mt2.b and mt.c = mt2.c
order by rand()
limit 1
) d
from my_table mt
Looks like a job for a join... probably a LEFT JOIN. Something like this:
SELECT DISTINCT L.a, L.b, L.c FROM `my_table` L
LEFT JOIN (
SELECT a, b, c, d FROM `my_table`
) R ON L.a=R.a AND L.b=R.b AND L.c=R.c

join two different tables?

How can I join table a and table b and get records for each? Not an actual join... not sure what this is called.
So if I have 3 records in a, and 5 records in b, I want 8 records back.
In a record for a, all b fields can be null. In a record for b, all a fields can be null.
edit: My tables have different fields.
Error Code: 1222. The used SELECT statements have a different number of columns
As the others mentioned, you need a UNION:
SELECT intColumn, varcharColumn, intColumn FROM a
UNION
SELECT intColumn, varcharColumn, 0 FROM b
but you must have the same number of columns and they must also have similar data types.
Here's a good tutorial about it
Also, if you want columns that are not in both tables, you can fill with nulls or constants.
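A minimal sketch of the NULL-padding idea, using Python's built-in sqlite3 module. The column names (a_id, a_name, b_id, b_price) and the sample rows are hypothetical. Note that this sketch uses UNION ALL rather than plain UNION, since UNION removes duplicate rows and the question asks for every record from both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (a_id INTEGER, a_name TEXT);
    CREATE TABLE b (b_id INTEGER, b_price REAL);
    INSERT INTO a VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO b VALUES (10, 1.5), (11, 2.5), (12, 3.5), (13, 4.5), (14, 5.5);
""")

# Each SELECT lists the same four columns; the side that lacks a column
# supplies NULL for it, so both halves have the same number of columns.
rows = conn.execute("""
    SELECT a_id, a_name, NULL AS b_id, NULL AS b_price FROM a
    UNION ALL
    SELECT NULL, NULL, b_id, b_price FROM b
""").fetchall()

from_a = [r for r in rows if r[0] is not None]
from_b = [r for r in rows if r[2] is not None]
print(len(rows), len(from_a), len(from_b))
```

Padding both sides to the same column list is also what avoids error 1222 from the question.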
You want a UNION:
SELECT something FROM a
UNION
SELECT something FROM b
Try this
SELECT * FROM a
LEFT JOIN b ON a.id1 = b.id2
UNION
SELECT * FROM a
RIGHT JOIN b ON a.id1 = b.id2
Just make sure that a and b have different IDs.
You can also join on some field other than id, as long as its values are not shared between the two tables.

Transforming a Complicated Requirement into a SQL Query

I am having trouble with the relational algebra and transformation into SQL of this rather complicated query:
I need to select all values from table A joined to table B where there are no matching records in table B, or there are matching records but the set of matching records do not have a field that contains one of 4 of a possible 8 total values.
Database is MySQL 5.0... using an InnoDB engine for the tables.
Select
a.*
from
a
left join
b
on
a.id=b.id
where
b.id is null
or
b.field1 not in ("value1","value2","value3","value4");
I'm not sure if there is any real performance improvement but one other way is:
SELECT
*
FROM
tableA
WHERE
id NOT IN ( SELECT id FROM tableB WHERE field1 IN ("value1", "value2"));
Your requirements are a bit unclear. My 1st interpretation is that you only want the A columns, and never more than 1 instance of a given A row.
select * from A where not exists (
select B.id
from B
where B.id=A.id
and B.field in ('badVal1','badVal2','badVal3','badVal4')
)
My 2nd interpretation is that you want all columns from (A outer joined to B), with perhaps more than one instance of an A row if there are multiple B rows, as long as no B row with a forbidden value exists for that A row.
select * from A
left outer join B on A.id=B.id
where not exists (
select C.id
from B as C
where A.id=C.id
and C.field in ('badVal1','badVal2','badVal3','badVal4')
)
Both queries could be expressed using NOT IN instead of correlated NOT EXISTS. It's hard to know which would be faster without knowing the data.
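The two interpretations can be compared on a small data set. This is a sketch using Python's built-in sqlite3 module; the sample rows and the name column are assumptions, and SQLite stands in for MySQL. Row 1 has no B rows, row 2 has only allowed values, and row 3 has one forbidden value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE B (id INTEGER, field TEXT);
    INSERT INTO A VALUES (1, 'no B rows'), (2, 'only allowed'), (3, 'has forbidden');
    INSERT INTO B VALUES (2, 'ok1'), (2, 'ok2'), (3, 'ok1'), (3, 'badVal1');
""")

# Interpretation 1: A rows only, each at most once.
first = conn.execute("""
    SELECT A.id FROM A
    WHERE NOT EXISTS (
        SELECT B.id FROM B
        WHERE B.id = A.id
          AND B.field IN ('badVal1', 'badVal2', 'badVal3', 'badVal4')
    )
    ORDER BY A.id
""").fetchall()

# Interpretation 2: A outer joined to B, one row per matching B row.
second = conn.execute("""
    SELECT A.id, B.field FROM A
    LEFT OUTER JOIN B ON A.id = B.id
    WHERE NOT EXISTS (
        SELECT C.id FROM B AS C
        WHERE A.id = C.id
          AND C.field IN ('badVal1', 'badVal2', 'badVal3', 'badVal4')
    )
    ORDER BY A.id, B.field
""").fetchall()

print(first)
print(second)
```

In both cases row 3 is excluded entirely, including its allowed B row, because the NOT EXISTS test applies to the whole set of B rows for that id.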