SQL group by bit set? - mysql

Is there any way of grouping the results of a select by individual bits set in a column? That is, ignoring combinations of various bits (otherwise a simple group by column would suffice).
For example, assuming a table has the values 1 through 10 for the group-by column exactly once (presented below with binary representation to simplify the construction/verification of the following group-by statement result):
1 0001
2 0010
3 0011
4 0100
5 0101
6 0110
7 0111
8 1000
9 1001
10 1010
the group-by effect I want to achieve would look like:
select <bit>, count(*), group_concat(<column>) from <table> group by <...>
0 5 1,3,5,7,9
1 5 2,3,6,7,10
2 4 4,5,6,7
3 3 8,9,10
assuming the first bit is "bit 0".
I'm using MySQL at the moment, so ideally a solution should work there; but I'd be interested in other RDBMS solutions, if any.

You would need to split the values up and then reaggregate. Something like this:
-- col1 is the column holding the bit values
select n.n as bit, count(*) as cnt, group_concat(col1) as vals
from t join
(select 0 as n union all select 1 union all select 2 union all select 3
) n
on ((1 << n.n) & t.col1) > 0
group by n.n;
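To check the split-and-reaggregate idea against the example in the question, here is a minimal sketch that runs the same join in SQLite through Python's `sqlite3` module instead of MySQL. The table name `t` and column name `col1` are illustrative assumptions, and the members of each group are sorted in Python because the order inside `GROUP_CONCAT` is not guaranteed.

```python
import sqlite3

# SQLite stands in for MySQL here; t and col1 are illustrative names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER)")
conn.executemany("INSERT INTO t (col1) VALUES (?)", [(v,) for v in range(1, 11)])

query = """
SELECT n.n AS bit, COUNT(*) AS cnt, GROUP_CONCAT(t.col1) AS vals
FROM t
JOIN (SELECT 0 AS n UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3) AS n
  ON ((1 << n.n) & t.col1) > 0
GROUP BY n.n
ORDER BY n.n
"""

# Each row is joined once per bit it has set, then grouped by that bit.
bit_groups = {
    bit: (cnt, sorted(int(v) for v in vals.split(",")))
    for bit, cnt, vals in conn.execute(query)
}
for bit, (cnt, members) in bit_groups.items():
    print(bit, cnt, members)
```

With the values 1 through 10 this should reproduce the four groups listed in the question.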

Related

Get the average of values in every specific epoch ranges in unix timestamp which returns -1 in specific condition in MySQL

I have a MySQL table which has some records as follows:
unix_timestamp value
1001 2
1003 3
1012 1
1025 5
1040 0
1101 3
1105 4
1130 0
...
I want to compute the average for every 10 epochs to see the following results:
unix_timestamp_range avg_value
1001-1010 2.5
1011-1020 1
1021-1030 5
1031-1040 0
1041-1050 -1
1051-1060 -1
1061-1070 -1
1071-1080 -1
1081-1090 -1
1091-1100 -1
1101-1110 3.5
1111-1120 -1
1121-1130 0
...
I saw some similar answers, but they are not a solution for my specific question. How can I get the above results?
The easiest way to do this is to use a calendar table. Consider this approach:
SELECT
CONCAT(CAST(cal.ts AS CHAR(50)), '-', CAST(cal.ts + 9 AS CHAR(50))) AS unix_timestamp_range,
CASE WHEN COUNT(t.value) > 0 THEN AVG(t.value) ELSE -1 END AS avg_value
FROM
(
SELECT 1001 AS ts UNION ALL
SELECT 1011 UNION ALL
SELECT 1021 UNION ALL
...
) cal
LEFT JOIN yourTable t
ON t.unix_timestamp BETWEEN cal.ts AND cal.ts + 9
GROUP BY
cal.ts
ORDER BY
cal.ts;
In practice, if you have the need to do this sort of query often, instead of the inline subquery labelled as cal above, you might want to have a full dedicated table representing all timestamp ranges.
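As a rough illustration of the calendar-table approach, the sketch below runs the same LEFT JOIN against the sample data in SQLite via Python's `sqlite3`. Two things differ from the answer and are assumptions of this sketch: the calendar rows are generated with a recursive CTE rather than typed out with UNION ALL (MySQL 8.0 also supports `WITH RECURSIVE`), and SQLite's `||` operator replaces `CONCAT`. The upper bound 1121 is chosen only to cover the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (unix_timestamp INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [(1001, 2), (1003, 3), (1012, 1), (1025, 5),
     (1040, 0), (1101, 3), (1105, 4), (1130, 0)],
)

query = """
WITH RECURSIVE cal(ts) AS (
    SELECT 1001                      -- start of the first range
    UNION ALL
    SELECT ts + 10 FROM cal WHERE ts + 10 <= 1121
)
SELECT cal.ts || '-' || (cal.ts + 9) AS unix_timestamp_range,
       CASE WHEN COUNT(t.value) > 0 THEN AVG(t.value) ELSE -1 END AS avg_value
FROM cal
LEFT JOIN yourTable t
       ON t.unix_timestamp BETWEEN cal.ts AND cal.ts + 9
GROUP BY cal.ts
ORDER BY cal.ts
"""

# Ranges with no matching rows keep a NULL-only group, so COUNT is 0 and -1 is returned.
range_averages = conn.execute(query).fetchall()
for label, avg_value in range_averages:
    print(label, avg_value)
```

The LEFT JOIN is what keeps the empty ranges in the result; an inner join would silently drop them.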

Why does count(distinct ..) return different values on the same table?

select count(distinct a,b,c,d) from mytable;
select count(distinct concat(a,'-',b),concat(c,'-',d)) from mytable;
Since '-' never appears in fields a, b, c, or d, the two queries above should return the same result. Am I right?
Actually that is not the case: the difference is 4 rows out of ~60M, and I can't figure out how this is possible.
Any idea or example?
Thanks
First, I am assuming that you are using MySQL, because that is the only database of your original tags where your syntax would be accepted.
Second, this does not directly answer your question. Given your types and expressions, I do not see how you can get different results. However, very similar constructs can produce different results.
It is very important to note that NULL is not the culprit. If any argument is NULL, COUNT(DISTINCT) skips that row, and CONCAT() returns NULL, which is not counted either.
However, spaces at the end of strings can be an issue. Consider the results from this query:
select count(distinct x, y),
count(distinct concat(x, '-', y)),
count(distinct concat(y, '-', x))
from (select 1 as x, 'a' as y union all
select 1, 'a ' union all
select 1, NULL
) a
I would expect the second and third columns to return the same thing. But spaces at the end of the string cause differences. COUNT(DISTINCT) ignores trailing spaces when comparing values. However, CONCAT() will embed them in the middle of the string, where they are no longer trailing. Hence, the above returns
1 1 2
And the two values are different.
In other words, two values may not be exactly the same, but COUNT(DISTINCT) might regard them as the same. Spaces are one example. Collations are another potential culprit.
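The trailing-space effect can be imitated outside the database. The following is only a toy model in plain Python, not MySQL itself: it assumes a comparison that ignores trailing spaces (as a PAD SPACE collation would), and `pad_key`, `count_distinct`, and `concat` are hypothetical helpers written for this sketch.

```python
# Toy model only: imitates a trailing-space-insensitive comparison in Python.
def pad_key(value):
    """Compare strings as if trailing spaces were insignificant."""
    return value.rstrip(" ") if isinstance(value, str) else value

def count_distinct(values):
    """Count distinct non-NULL values under the padded comparison."""
    return len({pad_key(v) for v in values if v is not None})

def concat(*parts):
    """Like CONCAT(): the result is NULL (None) if any part is NULL."""
    if any(p is None for p in parts):
        return None
    return "".join(str(p) for p in parts)

rows = [(1, "a"), (1, "a "), (1, None)]

# COUNT(DISTINCT x, y): rows containing a NULL are skipped.
pairs = len({(pad_key(x), pad_key(y))
             for x, y in rows if x is not None and y is not None})
x_dash_y = count_distinct(concat(x, "-", y) for x, y in rows)
y_dash_x = count_distinct(concat(y, "-", x) for x, y in rows)
print(pairs, x_dash_y, y_dash_x)  # prints: 1 1 2
```

In the third expression the space ends up in the middle of the string ('a -1'), so it is no longer a trailing space and the two values stay distinct.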
Take the following sample data as an example:
A B C D
1 2 3 4
5 6 7 8
1 2 5 7
1 2 5 7
1 3 3 4
1 3 3 4
then count(distinct a, b, c, d) = 4, because the distinct rows are:
A B C D
1 2 3 4
5 6 7 8
1 2 5 7
1 3 3 4
whereas the distinct values of (a,-,b) and of (c,-,d), each taken separately, number 3:
distinct (a,-,b) distinct (c,-,d)
1 2 3 4
5 6 7 8
1 3 5 7

What is the difference between MySQL LIMIT range of 0,500 and 1, 500?

If I want rows 1 through 500 in MySQL, should I use LIMIT 0, 500 or LIMIT 1, 500? What is the difference? Thanks!
The first one starts from the first record of the whole result, while the second one starts on the second record of the result.
Consider the following records
ID
1 -- index of the first record is zero.
2
3
4
5
6
if you execute
LIMIT 0, 3
-- the result will be ID: 1,2,3
LIMIT 1, 3
-- the result will be ID: 2,3,4
In MySQL, the meaning of LIMIT n1, n2 is:
n1 : starting index (zero-based offset)
n2 : number of records you want to show
For example :
ID
-------------------------
1 ------------ > index 0
2
3
4
5
6
7
8
9
10 ------------ > index 9
Now if you write a query like
SELECT * from tbl_name LIMIT 0,5
Output :
1
2
3
4
5
And if you write a query like
SELECT * from tbl_name LIMIT 2,7
Output :
3
4
5
6
7
8
9
@JohnWoo That's not correct. The order of rows from a SELECT statement with no ORDER BY clause is unspecified. Even though the output of such a query may appear to be in a specific order, that order is not guaranteed and therefore not reliable. If you require a result set to be ordered in a certain way, you must use an ORDER BY clause.
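A small sketch of offset-and-count behaviour combined with an explicit ORDER BY, run in SQLite through Python's `sqlite3` (SQLite accepts the same `LIMIT offset, count` form). The table `tbl_name` with a single `id` column is an assumption for illustration, and the rows are inserted in reverse on purpose so that nothing depends on insertion order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_name (id INTEGER)")
# Insert 10, 9, ..., 1 so the stored order differs from the sorted order.
conn.executemany("INSERT INTO tbl_name VALUES (?)", [(i,) for i in range(10, 0, -1)])

def limited(offset, count):
    """Return ids for LIMIT offset, count with a deterministic ORDER BY."""
    sql = "SELECT id FROM tbl_name ORDER BY id LIMIT ?, ?"
    return [row[0] for row in conn.execute(sql, (offset, count))]

print(limited(0, 3))  # [1, 2, 3]
print(limited(1, 3))  # [2, 3, 4]
print(limited(2, 7))  # [3, 4, 5, 6, 7, 8, 9]
```

The first argument skips that many rows of the sorted result; the second is how many rows to return, not an end index.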
ID
1
2
3
4
5 ------------ > index 4 (the offset)
6
7
8
9 ------------ > fifth row counted from the offset
10
If you want the rows with IDs 5 through 9, the offset is 4 and the row count is 5, so the query should be
SELECT * from table_name ORDER BY ID LIMIT 4,5

SQL - counting rows with specific value

I have a table that looks somewhat like this:
id value
1 0
1 1
1 2
1 0
1 1
2 2
2 1
2 1
2 0
3 0
3 2
3 0
Now for each id, I want to count the number of occurrences of 0 and 1 and the total number of rows for that id (the value can be any integer), so the end result should look something like this:
id n0 n1 total
1 2 2 5
2 1 2 4
3 2 0 3
I managed to get the first and last row with this statement:
SELECT id, COUNT(*) FROM mytable GROUP BY id;
But I'm sort of lost from here. Any pointers on how to achieve this without a huge statement?
With MySQL, you can use SUM(condition):
SELECT id, SUM(value=0) AS n0, SUM(value=1) AS n1, COUNT(*) AS total
FROM mytable
GROUP BY id
See it on sqlfiddle.
As @Zane commented above, the typical method is to use CASE expressions to perform the pivot.
SQL Server now has a PIVOT operator that you might see. DECODE() and IIF() were older approaches on Oracle and Access that you might still find lying around.
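For comparison, here is a minimal sketch of the CASE-based pivot next to the SUM(condition) form, both run on the sample rows in SQLite via Python's `sqlite3`. SQLite, like MySQL, evaluates a comparison to 0 or 1, which is why the shorter form works there too; other databases may require the CASE version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 0), (1, 1), (1, 2), (1, 0), (1, 1),
     (2, 2), (2, 1), (2, 1), (2, 0),
     (3, 0), (3, 2), (3, 0)],
)

# Portable form: CASE turns each matching row into a 1, everything else into 0.
case_query = """
SELECT id,
       SUM(CASE WHEN value = 0 THEN 1 ELSE 0 END) AS n0,
       SUM(CASE WHEN value = 1 THEN 1 ELSE 0 END) AS n1,
       COUNT(*) AS total
FROM mytable
GROUP BY id
ORDER BY id
"""
# Shorthand that relies on comparisons evaluating to 0 or 1.
sum_query = """
SELECT id, SUM(value = 0) AS n0, SUM(value = 1) AS n1, COUNT(*) AS total
FROM mytable
GROUP BY id
ORDER BY id
"""
case_counts = conn.execute(case_query).fetchall()
sum_counts = conn.execute(sum_query).fetchall()
print(case_counts)
```

Both queries should give the table requested in the question: (1, 2, 2, 5), (2, 1, 2, 4), (3, 2, 0, 3).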

MySQL: Matching inexact values using "ON"

I'm way out of my league here...
I have a mapping table (table1) to assign particular values (value) to a whole number (map_nu). My second table (table2), is a collection of averages (avg) for each user (user_id).
(I couldn't figure out how to properly make a markdown table, please feel free to edit!)
table1:                table2:
(value)  (Map_nu)      (user_id)  (avg)
-------  --------      ---------  -------
1        1             1          1.111
1.045    2             2          1.2
1.09     3             3          1.33333
1.135    4             4          1
1.18     5             5          1.389
1.225    6             6          1.42
1.27     7             7          1.07
1.315    8
1.36     9
1.405    10
The value Map_nu is a special number that each user gets assigned according to their average. I need to find a way to match the averages from table2 to the closest value in table1. I only need to match to the second digit past the decimal point, so I've added the TRUNCATE function:
SELECT table2.user_id, map_nu
FROM `table1`
JOIN table2 ON TRUNCATE(table1.value,2)=TRUNCATE(table2.avg,2)
I still miss the values that don't match the averages exactly. Is there a way to pick the nearest truncated value, or even to round to the second decimal? Rounding up or down won't matter as long as it's applied to all values the same way.
I am trying to have the following result (if rounded up):
(user_id)(Map_nu)
----
1 4
2 6
3 6
4 1
5 10
6 11
7 3
Thanks!
I think you might have to do this in two separate queries. There is no 'nearest' operator in SQL, so you can either calculate it in your software, or you could use
select map_nu from table1 ORDER BY abs(value - $avg) LIMIT 1
inside a loop. However, that cannot be used directly as a join condition, because it relies on ORDER BY and LIMIT, which are not valid in a join condition.
Another way of looking at it: your map_nu and value seem to be deterministic in relation to each other, with value = 1 + ((map_nu - 1) * 0.045), so maybe you could make use of that fact and calculate an integer based on that equation, assuming the relationship holds true for all values of map_nu.
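If the linear relationship value = 1 + (map_nu - 1) * 0.045 really does hold, it can be inverted to get the nearest map_nu directly. A minimal sketch of that arithmetic follows; the function name `nearest_map_nu` and its `step` and `base` parameters are hypothetical, and the result is not clamped to the map_nu values that actually exist in table1.

```python
def nearest_map_nu(avg, step=0.045, base=1.0):
    """Invert value = base + (map_nu - 1) * step and round to the nearest integer.

    Assumes the mapping is exactly linear; no clamping to existing rows is done.
    """
    return round((avg - base) / step) + 1

# The averages from table2, keyed by user_id.
averages = {1: 1.111, 2: 1.2, 3: 1.33333, 4: 1, 5: 1.389, 6: 1.42, 7: 1.07}
mapped = {user_id: nearest_map_nu(avg) for user_id, avg in averages.items()}
print(mapped)
```

In MySQL the same idea would be an expression along the lines of ROUND((avg - 1) / 0.045) + 1, which avoids the join entirely but depends on the formula staying valid.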
This is an awkward database design. What is the data representing and what are you trying to solve? There might be a better way.
Maybe do something like...
SELECT a.user_id, b.map_nu, abs(a.avg - b.value)
FROM
table2 a
join table1 b
left join table1 c on abs(a.avg - b.value) > abs(a.avg - c.value)
where c.value is null
order by a.user_id
This doesn't actually produce the same output as the one you were expecting (it doesn't do any rounding), though you should be able to tweak it from there. The above query will produce the output below (with the data you've provided):
user_id map_nu abs(a.avg - b.value)
------- ------ --------------------
1 3 0.0209999999999999
2 5 0.02
3 8 0.01833
4 1 0
5 10 0.016
6 10 0.0149999999999999
7 3 0.02
Beware though if you're dealing with large tables. Check the EXPLAIN output of the above query to see whether it is practical to run within MySQL or better done outside it.
Note also: it will produce duplicate rows if an avg value is equidistant from two values in table1 (e.g. if the values for map_nu 11 and 12 are 2 and 3 and someone gets an avg of 2.5). Your question doesn't really specify what to do in that case, so you might want to take it into account.
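The nearest-value self-join can be tried on the sample data without a MySQL server. This sketch runs it in SQLite via Python's `sqlite3`, as a stand-in; the only change to the query is writing the unconditioned join as an explicit CROSS JOIN. It keeps a row (a, b) only when no third row c is strictly closer to the user's average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (value REAL, map_nu INTEGER)")
conn.execute("CREATE TABLE table2 (user_id INTEGER, avg REAL)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, 1), (1.045, 2), (1.09, 3), (1.135, 4), (1.18, 5),
     (1.225, 6), (1.27, 7), (1.315, 8), (1.36, 9), (1.405, 10)],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(1, 1.111), (2, 1.2), (3, 1.33333), (4, 1),
     (5, 1.389), (6, 1.42), (7, 1.07)],
)

query = """
SELECT a.user_id, b.map_nu
FROM table2 a
CROSS JOIN table1 b
LEFT JOIN table1 c
       ON abs(a.avg - b.value) > abs(a.avg - c.value)  -- c is strictly closer than b
WHERE c.value IS NULL                                   -- keep b only if nothing is closer
ORDER BY a.user_id
"""
nearest = conn.execute(query).fetchall()
print(nearest)
```

Because the comparison is strict, two equally distant values would both survive the filter, which is the duplicate-row case described above.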
It's taking a little extra work, but I figure the easiest way to get my results will be to map all values to the second decimal place in table1:
1 1
1.01 1
1.02 1
1.03 1
1.04 1
1.05 2
1.06 2
1.07 2
1.08 2
1.09 3
1.1 3
1.11 3
1.12 3
1.13 3
1.14 4
...
Thanks for the suggestions! Sorry I couldn't present the question more clearly.