I want to retrieve the items that have all of a given set of values.
The table:
+------+------+
| item | val |
+------+------+
| 1 | x |
| 1 | y |
| 2 | a |
| 2 | b |
| 3 | a |
| 3 | x |
| 3 | y |
+------+------+
For example, I want the items that have both x and y as values (items 1 and 3).
My SQL query:
SELECT item, GROUP_CONCAT(DISTINCT val) AS vals
FROM test
GROUP BY item
HAVING
FIND_IN_SET('x', vals) AND
FIND_IN_SET('y', vals)
That works, but I think there is a better solution that doesn't use the FIND_IN_SET function.
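One common way to avoid FIND_IN_SET is to filter down to the wanted values and then count the distinct matches per item. Here is a minimal sketch of that idea. It runs through Python's sqlite3 module as a stand-in for MySQL (an assumption, made only so the example is self-contained), and the `wanted` list is an illustrative name, not something from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (item INTEGER, val TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(1, "x"), (1, "y"), (2, "a"), (2, "b"), (3, "a"), (3, "x"), (3, "y")],
)

wanted = ["x", "y"]  # hypothetical: the values every returned item must have
placeholders = ", ".join("?" for _ in wanted)

# Keep only rows with a wanted value, then require that each item
# matched as many distinct values as were asked for.
query = f"""
    SELECT item
    FROM test
    WHERE val IN ({placeholders})
    GROUP BY item
    HAVING COUNT(DISTINCT val) = ?
    ORDER BY item
"""
items = [row[0] for row in conn.execute(query, wanted + [len(wanted)])]
print(items)
```

The same SELECT should carry over to MySQL unchanged, with the literal values and their count written in place of the placeholders.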
In MySQL I have three tables:
A:
-----------------
|g_id | txt |
|------|--------|
| 1 | cat |
| 3 | dog |
and
B:
--------------------------
|g_id | txt | txt2 |
|------|---------|--------
| 1 | hat | chat |
| 3 | that | NULL |
| 3 | that | NULL |
and
C:
------------------
|g_id | txt |
|------|---------|
| 1 | hat |
| 1 | mat |
| 3 | that |
My goal is to sum the lengths of the text columns (txt and txt2) of each row across tables A, B, and C, grouped by g_id...
So after my query, the expected result would be:
----------------
|g_id | size |
|------|-------|
| 1 | 16 |
| 3 | 15 |
My query fails:
SELECT
g_id,
SUM(length(txt)) + SUM(length(txt2)) as size
FROM ((SELECT a.g_id, a.txt FROM a) UNION ALL
(SELECT b.g_id, b.txt, b.txt2 FROM b) UNION ALL
(SELECT c.g_id, c.txt FROM c)
) abc
GROUP BY g_id;
Error: The used SELECT statements have a different number of columns
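Every SELECT in a UNION ALL must return the same number of columns, so give txt2 its own branch instead of selecting it as a third column:
SELECT g_id,
SUM(length(txt)) as size
FROM ((SELECT a.g_id, a.txt FROM a) UNION ALL
(SELECT b.g_id, b.txt FROM b) UNION ALL
(SELECT b.g_id, b.txt2 FROM b) UNION ALL
(SELECT c.g_id, c.txt FROM c)
) abc
GROUP BY g_id;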
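As a check against the sample data, here is a sketch that stacks txt and txt2 as separate UNION ALL branches and sums the lengths. It uses Python's sqlite3 module in place of MySQL (an assumption for the sake of a runnable example). NULL values of txt2 have a NULL length, which SUM skips.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (g_id INTEGER, txt TEXT);
    CREATE TABLE b (g_id INTEGER, txt TEXT, txt2 TEXT);
    CREATE TABLE c (g_id INTEGER, txt TEXT);
    INSERT INTO a VALUES (1, 'cat'), (3, 'dog');
    INSERT INTO b VALUES (1, 'hat', 'chat'), (3, 'that', NULL), (3, 'that', NULL);
    INSERT INTO c VALUES (1, 'hat'), (1, 'mat'), (3, 'that');
""")

# Each branch returns exactly two columns; txt2 gets a branch of its own.
query = """
    SELECT g_id, SUM(LENGTH(txt)) AS size
    FROM (
        SELECT g_id, txt FROM a
        UNION ALL
        SELECT g_id, txt FROM b
        UNION ALL
        SELECT g_id, txt2 FROM b
        UNION ALL
        SELECT g_id, txt FROM c
    ) AS abc
    GROUP BY g_id
    ORDER BY g_id
"""
sizes = conn.execute(query).fetchall()
print(sizes)
```

On the sample rows this gives 16 for g_id 1 and 15 for g_id 3, matching the expected result in the question.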
With the following sample data, I want a query that finds where an item/desc combination is duplicated and its data fields are not mutually exclusive. In other words, there should be only one data value per column for a given item/desc combination.
Sample table records:
id | item | desc | data1 | data2 | data3
----+-------+-------+-------+-------+-------
1 | 1 | cat | a | |
2 | 1 | cat | | b |
3 | 1 | cat | | e |
4 | 2 | dog | a | |
5 | 2 | dog | | h | f
6 | 3 | apple | k | | m
7 | 3 | worm | a | g | x
8 | 4 | rock | p | | s
9 | 4 | rock | | | s
10 | 4 | rock | | t | z
Expected query result:
item | desc
-------+-------
1 | cat (because of conflict in data2 with b & e)
4 | rock (because of conflict in data3 with s,s & z)
This should work:
select distinct
item,
`desc`
from
table
group by item, `desc`
HAVING count(distinct data1) > 1 or count(distinct data2) > 1 or count(distinct data3) > 1
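You could use a simple COUNT aggregation and a HAVING clause to achieve this. Note that COUNT(col) counts every non-NULL value, so a value repeated within a group also counts as a conflict.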
SELECT `item`,
`desc`,
COUNT(`data1`) AS `data1count`,
COUNT(`data2`) AS `data2count`,
COUNT(`data3`) AS `data3count`
FROM table
GROUP BY `item`, `desc`
HAVING `data1count` > 1
OR `data2count` > 1
OR `data3count` > 1
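To see the COUNT(DISTINCT ...) variant against the sample records, here is a small sketch using Python's sqlite3 module instead of MySQL (an assumption for runnability). The table name `records` is hypothetical, and the blank cells are assumed to be NULL rather than empty strings; with empty strings, COUNT(DISTINCT ...) would count '' as a value of its own.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption for the demo).
conn = sqlite3.connect(":memory:")
# "records" is a hypothetical table name; blank cells are loaded as NULL.
conn.execute(
    'CREATE TABLE records (id INTEGER, item INTEGER, "desc" TEXT, '
    "data1 TEXT, data2 TEXT, data3 TEXT)"
)
rows = [
    (1, 1, "cat", "a", None, None),
    (2, 1, "cat", None, "b", None),
    (3, 1, "cat", None, "e", None),
    (4, 2, "dog", "a", None, None),
    (5, 2, "dog", None, "h", "f"),
    (6, 3, "apple", "k", None, "m"),
    (7, 3, "worm", "a", "g", "x"),
    (8, 4, "rock", "p", None, "s"),
    (9, 4, "rock", None, None, "s"),
    (10, 4, "rock", None, "t", "z"),
]
conn.executemany("INSERT INTO records VALUES (?, ?, ?, ?, ?, ?)", rows)

# A group conflicts when any data column holds more than one distinct value.
query = """
    SELECT item, "desc"
    FROM records
    GROUP BY item, "desc"
    HAVING COUNT(DISTINCT data1) > 1
        OR COUNT(DISTINCT data2) > 1
        OR COUNT(DISTINCT data3) > 1
    ORDER BY item
"""
conflicts = conn.execute(query).fetchall()
print(conflicts)
```

Only cat (b and e in data2) and rock (s and z in data3) come back, which is the expected result from the question.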
My list can be arbitrarily long. As an example, I would like to take a table such as this one:
+-------+------+
| group | name |
+-------+------+
| 1 | A |
| 1 | B |
| 1 | C |
| 1 | D |
| 1 | E |
| 1 | F |
| 1 | G |
| 1 | H |
| 1 | I |
| 2 | J |
| 2 | L |
| 2 | M |
| 3 | N |
| 4 | O |
| 4 | P |
| 4 | Q |
| 4 | R |
| 4 | S |
| 5 | U |
| 6 | V |
+-------+------+
And run a query that produces results in this order:
+-------+------+
| group | name |
+-------+------+
| 1 | A |
| 2 | J |
| 3 | N |
| 4 | O |
| 5 | U |
| 6 | V |
| 1 | B |
| 2 | L |
| 4 | P |
| 1 | C |
| 2 | M |
| 4 | Q |
| 1 | D |
| 4 | R |
| 1 | E |
| 4 | S |
| 1 | F |
| 1 | G |
| 1 | H |
| 1 | I |
+-------+------+
select `group`, name from (
select
t.*,
@rn := if(`group` != @ng, 1, @rn + 1) as ng,
@ng := `group`
from t
, (select @rn:=0, @ng:=null) v
order by `group`, name
) sq
order by ng, `group`, name
see it working live in an sqlfiddle
To explain it a little...
, (select @rn:=0, @ng:=null) v
This line is just a fancy way to initialize the variables on the fly. It's the same as omitting this line but having SET @rn := 0; SET @ng := NULL; before the SELECT.
Then the ORDER BY in the subquery is very important. In a relational database there is no order unless you specify it.
Here
@rn := if(`group` != @ng, 1, @rn + 1) as ng,
@ng := `group`
the first line is a simple check, if the value of group in the current row is different from the value of @ng. If yes, assign 1 to @rn, if not, increment @rn.
The order of the columns in the SELECT clause is very important. MySQL processes them one by one. In the second line, we assign the value of group of the current row to @ng. When the next row of the table is processed by the query, in the first line of the two above, @ng will therefore hold the value of the previous row.
The outer select is just cosmetics, to hide unnecessary columns. Feel free to ask if anything is still unclear. Oh, and here you can read more about user defined variables in MySQL.
Note though, that needing variables is rather the exception. They often result in full table scans. Whatever you want to achieve with variables in select statements is often better done on application level, rather than database level.
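For comparison, here is a rough sketch of doing the same ordering at application level, as suggested above. It assumes the rows have already been fetched as (group, name) tuples; the function name `round_robin` is made up for this illustration.

```python
from itertools import groupby

# Rows as (group, name), e.g. as fetched by "SELECT `group`, name FROM t".
rows = [
    (1, "A"), (1, "B"), (1, "C"), (1, "D"), (1, "E"),
    (1, "F"), (1, "G"), (1, "H"), (1, "I"),
    (2, "J"), (2, "L"), (2, "M"),
    (3, "N"),
    (4, "O"), (4, "P"), (4, "Q"), (4, "R"), (4, "S"),
    (5, "U"),
    (6, "V"),
]


def round_robin(rows):
    """Order rows so every group gives its 1st name, then its 2nd, and so on."""
    ranked = []
    # Sorting first plays the role of ORDER BY `group`, name in the subquery.
    for group, members in groupby(sorted(rows), key=lambda r: r[0]):
        # position is the per-group row number (what the @rn variable tracks).
        for position, (_, name) in enumerate(members, start=1):
            ranked.append((position, group, name))
    # Sort by row number first, then by group and name.
    ranked.sort()
    return [(group, name) for _, group, name in ranked]


ordered = round_robin(rows)
for group, name in ordered:
    print(group, name)
```

The per-group position here does the job of the row-number variable in the query, and the final sort corresponds to the outer ORDER BY.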
I am running Hive 0.7.1. I have a table with multiple rows that share the same column value, e.g.
| x | y |
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
| 2 | 2 |
| 3 | 2 |
| 3 | 1 |
I want the x column to be unique, removing the extra rows that share the same x value, e.g.
| x | y |
| 1 | 2 |
| 2 | 2 |
| 3 | 2 |
or
| x | y |
| 1 | 4 |
| 2 | 2 |
| 3 | 1 |
Both are fine. Since DISTINCT works only on the whole result set in Hive, I couldn't find a way to do it.
Help please. Thanks.
Some options:
1) This will give you the max value of y for each value of x
select x, max(y) from table1 group by x
Equally you could use avg() or min()
2) OR, you could collect all the values of y in a list:
select x, collect_set(y) from table1 group by x
This will give you:
x|y
1|2,3,4
2|2
3|1,2
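To illustrate option 1 on the sample rows, here is a quick sketch. It runs the GROUP BY queries through Python's sqlite3 module rather than Hive (an assumption, only so the example can be executed anywhere); the aggregate queries themselves are plain SQL.

```python
import sqlite3

# In-memory SQLite stands in for Hive here (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, 2), (1, 3), (1, 4), (2, 2), (3, 2), (3, 1)],
)

# One row per x, keeping the largest or the smallest y of each group.
max_rows = conn.execute(
    "SELECT x, MAX(y) FROM table1 GROUP BY x ORDER BY x"
).fetchall()
min_rows = conn.execute(
    "SELECT x, MIN(y) FROM table1 GROUP BY x ORDER BY x"
).fetchall()

print(max_rows)
print(min_rows)
```

Either aggregate leaves exactly one row per x, and with max() or min() the y in each row is a value that really occurs for that x. avg() would not guarantee that.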
You can use the distinct keyword:
SELECT DISTINCT x FROM table
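Try the following query to get the result (it needs windowing functions, which older Hive releases lack; they were added in Hive 0.11):
select A.x, A.y
from (
select x, y, rank() over (partition by x order by y) as ranked
from testingg
) A
where ranked = 1;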