Database:
+------------+
| Number |
+------------+
| 0050000235 |
+------------+
| 5532003644 |
+------------+
| 1122330505 |
+------------+
| 1103220311 |
+------------+
| 1103000011 |
+------------+
| 1103020012 |
+------------+
To select numbers that contain a pair of "0"s three times, I tried:
SELECT * FROM numbers
WHERE Number LIKE "%00%00%00%"
OR Number LIKE "%00%0000%"
OR Number LIKE "%0000%00%"
OR Number LIKE "0000%00%"
OR Number LIKE "%00%0000"
OR Number LIKE "00%0000%"
OR Number LIKE "%0000%00"
OR Number LIKE "%0000%00"
OR Number LIKE "%000000%"
OR Number LIKE "000000%"
OR Number LIKE "%000000"
This gives me:
0050000235
But I don't think the approach I am using is a clean one.
Question: how do I fetch the numbers that contain 3 pairs with a clean SQL query?
The result should be:
0050000235, 5532003644, 1122330505, 1103220311 & 1103000011
where Number rlike '((00|11|22|33|44|55|66|77|88|99).*){3}'
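A quick way to sanity-check this pattern outside the database is to run it over the sample numbers. The sketch below uses Python's re module rather than MySQL's RLIKE, so it only illustrates how the pattern behaves; the names sample_numbers and three_pairs are made up for the example.

```python
import re

# Sample values copied from the question's table.
sample_numbers = [
    "0050000235", "5532003644", "1122330505",
    "1103220311", "1103000011", "1103020012",
]

# Same pattern as the RLIKE condition: a doubled digit followed by
# anything, repeated three times.
three_pairs = re.compile(r"((00|11|22|33|44|55|66|77|88|99).*){3}")

matches = [n for n in sample_numbers if three_pairs.search(n)]
print(matches)
```

Only 1103020012 is left out, since it contains just two pairs (11 and 00).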
Create a series of numbers from 0 to 9 with UNION ALL and cross join to the table.
Each of these digits is doubled (00, 11, and so on) and every occurrence of the doubled digit in the column is replaced with an empty string. The length differences of all the replacements are summed; if the sum is at least 6, the number contains at least 3 pairs:
select
n.number
from (
select 0 d union all select 1 d union all select 2 union all
select 3 union all select 4 union all select 5 union all
select 6 union all select 7 union all select 8 union all select 9
) s cross join numbers n
group by n.number
having sum(
length(n.number) - length(replace(n.number, repeat(d, 2), ''))
) >= 6
See the demo.
Results:
| number |
| ---------- |
| 0050000235 |
| 1103000011 |
| 1103220311 |
| 1122330505 |
| 5532003644 |
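The length-difference idea can be sketched outside SQL as well. The following is a minimal Python illustration, assuming str.replace removes non-overlapping occurrences from left to right in the same way the query relies on REPLACE doing; the function name has_three_pairs is hypothetical.

```python
def has_three_pairs(number: str) -> bool:
    """Sum, over the digits 0-9, how many characters disappear when the
    doubled digit is removed from the string."""
    removed = sum(
        len(number) - len(number.replace(str(d) * 2, ""))
        for d in range(10)
    )
    # Each non-overlapping pair removes 2 characters, so 3 pairs remove >= 6.
    return removed >= 6


sample_numbers = [
    "0050000235", "5532003644", "1122330505",
    "1103220311", "1103000011", "1103020012",
]
selected = sorted(n for n in sample_numbers if has_three_pairs(n))
print(selected)
```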
How about using regular expressions?
where number regexp '00.*00.*00'
Or slightly shorter:
where number regexp '(00.*){3}'
You can readily generalize this to any two digits (note that [0-9]{2} matches any two consecutive digits, not only a repeated digit):
where number regexp '([0-9]{2}.*){3}'
If you want to ensure exactly six '0' (as opposed to more):
where number regexp '^[^0]*00[^0]*00[^0]*00[^0]*$'
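To see how these three patterns differ, they can be tried against the sample numbers from the question. This is a rough check using Python's re module, not MySQL's REGEXP, and the dictionary labels are invented for the example.

```python
import re

sample_numbers = [
    "0050000235", "5532003644", "1122330505",
    "1103220311", "1103000011", "1103020012",
]

patterns = {
    "three 00 pairs": r"(00.*){3}",
    "any two digits, three times": r"([0-9]{2}.*){3}",
    "exactly three 00 pairs": r"^[^0]*00[^0]*00[^0]*00[^0]*$",
}

# For each pattern, collect the sample numbers it matches.
results = {
    label: [n for n in sample_numbers if re.search(pattern, n)]
    for label, pattern in patterns.items()
}

for label, matched in results.items():
    print(label, matched)
```

The second pattern accepts every 10-digit value here, because any six digits satisfy it.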
My MySQL table has a column with comma-separated numbers. See the example below:
| style_ids |
| ---------- |
| 5,3,10,2,7 |
| 1,5,12,9 |
| 6,3,5,9,4 |
| 8,3,5,7,12 |
| 7,4,9,3,5 |
My expected result should contain the top 5 numbers by appearance count, in descending order, as 5 rows like this:
| number | appearance_count_in_all_rows |
| -------|----------------------------- |
| 5 | 5 |
| 3 | 4 |
| 9 | 3 |
| 7 | 3 |
| 4 | 2 |
Is it possible to get the above result with a MySQL query?
As already alluded to in the comments, this is a really bad idea. But here is one way of doing it -
WITH RECURSIVE seq (n) AS (
SELECT 1 UNION ALL SELECT n+1 FROM seq WHERE n < 20
), tbl (style_ids) AS (
SELECT '5,3,10,2,7' UNION ALL
SELECT '1,5,12,9' UNION ALL
SELECT '6,3,5,9,4' UNION ALL
SELECT '8,3,5,7,12' UNION ALL
SELECT '7,4,9,3,5'
)
SELECT seq.n, COUNT(*) appearance_count_in_all_rows
FROM seq
JOIN tbl ON FIND_IN_SET(seq.n, tbl.style_ids)
GROUP BY seq.n
ORDER BY appearance_count_in_all_rows DESC
LIMIT 5;
Just replace the tbl cte with your table.
As already pointed out you should fix the data if possible.
For further details read Is storing a delimited list in a database column really that bad?.
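For comparison, the counting that the FIND_IN_SET join performs can be sketched in a few lines of Python. This is only an illustration of the logic, assuming the five sample rows from the question; style_ids_rows and top_five are hypothetical names.

```python
from collections import Counter

# The five sample rows from the question.
style_ids_rows = ["5,3,10,2,7", "1,5,12,9", "6,3,5,9,4", "8,3,5,7,12", "7,4,9,3,5"]

counts = Counter()
for row in style_ids_rows:
    # set() mirrors the join on FIND_IN_SET: a value is counted once per row
    # in which it occurs, no matter how often it appears in that row.
    counts.update(set(row.split(",")))

# The five most frequent values; the order among equal counts is not defined.
top_five = counts.most_common(5)
print(top_five)
```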
You could use the query below (it relies on a helper table numbers holding the integers 1, 2, 3, ... in a column n):
select distinct_nr,count(distinct_nr) as appearance_count_in_all_rows
from ( select substring_index(substring_index(style_ids, ',', n), ',', -1) as distinct_nr
from test
join numbers on char_length(style_ids) - char_length(replace(style_ids, ',', '')) >= n - 1
) x
group by distinct_nr
order by appearance_count_in_all_rows desc ;
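The nested substring_index call is the part that usually needs explaining: the inner call keeps everything up to the n-th comma, and the outer call keeps what follows the last remaining comma. Below is a small Python sketch of that behaviour; nth_item and item_count are hypothetical helpers, not MySQL functions.

```python
def nth_item(csv: str, n: int) -> str:
    """Mimic substring_index(substring_index(csv, ',', n), ',', -1)
    for 1 <= n <= number of items."""
    first_n = ",".join(csv.split(",")[:n])  # inner call: text before the n-th comma
    return first_n.split(",")[-1]           # outer call: text after the last comma


def item_count(csv: str) -> int:
    # Number of commas plus one, the same arithmetic as the join condition.
    return len(csv) - len(csv.replace(",", "")) + 1


row = "5,3,10,2,7"
items = [nth_item(row, n) for n in range(1, item_count(row) + 1)]
print(items)
```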
I have 5 different datasets from 5 different tables. From those 5 tables I have taken the grouped data below.
select number,count(*) as total from tb01 group by number limit 5;
select number,count(*) as total from tb02 group by number limit 5;
Like that I can retrieve 5 different datasets. Here is an example.
+-----------+-------+
| number | total |
+-----------+-------+
| 114000259 | 1 |
| 114000400 | 1 |
| 114000686 | 1 |
| 114000858 | 1 |
| 114003895 | 1 |
+-----------+-------+
Now I need to combine those 5 tables into a tabular format like the one below.
+-----------+-------+-------+-------+
| number | tb01 | tb02 | tb03 |
+-----------+-------+-------+-------+
| 114000259 | 1 | 2 | 1 |
| 114000400 | 1 | 0 | 1 |
| 114000686 | 1 | 3 | 1 |
| 114000858 | 1 | 1 | 5 |
| 114003895 | 1 | 0 | 1 |
+-----------+-------+-------+-------+
Can someone help me combine those 5 grouped data sets and get the union as shown above?
Note: the headers don't need to match the table names; they can be anything.
Also, I don't need the LIMIT 5; it is only there to show a sample of 5 rows. I have a large dataset.
It's a job for JOINs and subqueries. My answer will consider three tables. It should be obvious how to expand it to five.
Your first subquery: get all possible numbers.
SELECT number FROM tb01 UNION
SELECT number FROM tb02 UNION
SELECT number FROM tb03
Then you have a subquery for each table to get the count.
SELECT number, COUNT(*) AS total
FROM tb02 GROUP BY number
Then you LEFT JOIN everything and SELECT from that.
SELECT numbers.number,
tb01.total tb01,
tb02.total tb02,
tb03.total tb03
FROM (
SELECT number FROM tb01 UNION
SELECT number FROM tb02 UNION
SELECT number FROM tb03
) numbers
LEFT JOIN (
SELECT number, COUNT(*) AS total
FROM tb01 GROUP BY number
) tb01 ON numbers.number = tb01.number
LEFT JOIN (
SELECT number, COUNT(*) AS total
FROM tb02 GROUP BY number
) tb02 ON numbers.number = tb02.number
LEFT JOIN (
SELECT number, COUNT(*) AS total
FROM tb03 GROUP BY number
) tb03 ON numbers.number = tb03.number
You can add ORDER BY and LIMIT clauses to that overall query as necessary.
The first subquery together with the LEFT JOIN ensures that you get results even if some of your tables are missing number rows. (Some DBMSs have FULL OUTER JOIN, but MySQL does not.)
Pro tip: If you use LIMIT without ORDER BY, you get an unpredictable subset of your rows. Unpredictable is worse than random, because you get the same subset in testing with small tables, but when your tables grow you may start getting different subsets. You'll never catch the problem in unit testing. LIMIT without ORDER BY is a serious error.
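The UNION plus LEFT JOIN shape described above can be tried on a throwaway database. The sketch below runs it in SQLite through Python's sqlite3 module, not in MySQL; the sample rows, the derived-table aliases c1 to c3, and the COALESCE calls (which turn missing counts into 0, as in the desired output) are additions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb01 (number INTEGER);
    CREATE TABLE tb02 (number INTEGER);
    CREATE TABLE tb03 (number INTEGER);
    INSERT INTO tb01 VALUES (1), (2), (2);
    INSERT INTO tb02 VALUES (2), (3);
    INSERT INTO tb03 VALUES (3), (3), (3);
""")

query = """
SELECT numbers.number,
       COALESCE(c1.total, 0) AS tb01,
       COALESCE(c2.total, 0) AS tb02,
       COALESCE(c3.total, 0) AS tb03
FROM (
    SELECT number FROM tb01 UNION
    SELECT number FROM tb02 UNION
    SELECT number FROM tb03
) numbers
LEFT JOIN (SELECT number, COUNT(*) AS total FROM tb01 GROUP BY number) c1
       ON numbers.number = c1.number
LEFT JOIN (SELECT number, COUNT(*) AS total FROM tb02 GROUP BY number) c2
       ON numbers.number = c2.number
LEFT JOIN (SELECT number, COUNT(*) AS total FROM tb03 GROUP BY number) c3
       ON numbers.number = c3.number
ORDER BY numbers.number
"""

# One row per distinct number, with one count column per source table.
rows = conn.execute(query).fetchall()
print(rows)
```

Each join condition compares numbers.number with the count subquery for its own table; the ORDER BY keeps the output deterministic.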
I have a table containing some similar rows representing objects for a game. I use this table as a way to select objects randomly. Of course, I don't know the size of the table in advance. My problem is that I would like a single query that returns the probability of selecting each object, and I don't know how to proceed.
I can get the total number of objects I have in my table:
select count(id) from objects_desert_tb;
Which returns
+-----------+
| count(id) |
+-----------+
| 81 |
+-----------+
1 row in set (0.00 sec)
and I have a query that returns the number of occurrences of every object in the table:
select name, (count(name)) from objects_desert_tb group by name;
which gives:
+-------------------+---------------+
| name | (count(name)) |
+-------------------+---------------+
| carrots | 5 |
| metal_scraps | 14 |
| plastic_tarpaulin | 8 |
| rocks_and_stones | 30 |
| wood_scraps | 24 |
+-------------------+---------------+
5 rows in set (0.00 sec)
Computing the probability for each object just consists of dividing (count(name)) by the total number of rows in the table. For example, for the carrots row, compute 5/81 from the two queries given above. I would like a single query that would return:
+-------------------+--------------------+
| carrots | 5/81 = 0.06172839 |
| metal_scraps | 0.1728... |
| plastic_tarpaulin | 0.09876... |
| rocks_and_stones | 0.37... |
| wood_scraps | 0.29... |
+-------------------+--------------------+
Is there a way to use the size of the table as a variable inside a SQL query? Maybe by nesting several queries?
Cross join your queries:
select c.name, c.counter / t.total probability
from (
select name, count(name) counter
from objects_desert_tb
group by name
) c cross join (
select count(id) total
from objects_desert_tb
) t
In MySQL 8+, you would just use window functions:
select name, count(*) as cnt,
count(*) / sum(count(*)) over () as ratio
from objects_desert_tb
group by name;
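Both queries compute the same thing: each group's count divided by the sum of all counts. A minimal Python sketch of that arithmetic, using the counts shown in the question (the variable names are made up for the example):

```python
from collections import Counter

# Occurrence counts copied from the question's GROUP BY output.
counts = Counter({
    "carrots": 5,
    "metal_scraps": 14,
    "plastic_tarpaulin": 8,
    "rocks_and_stones": 30,
    "wood_scraps": 24,
})

# Equivalent of count(id) over the whole table, or sum(count(*)) over ().
total = sum(counts.values())

# Per-name probability: group count divided by the table total.
probabilities = {name: c / total for name, c in counts.items()}

for name, p in probabilities.items():
    print(f"{name}: {p:.8f}")
```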
I'm using GROUP BY to get some products from my little shop, like:
select name, ProductID from blog group by ProductID
+----------------------------------------------------------+
| name |
+----------------------------------------------------------+
| AAA |
| BBBB |
| CCCC |
| DDDDDDDD |
+----------------------------------------------------------+
Is it possible to get the name of average length within each group?
EDIT (from OP, placed in answer):
mysql> select length(name) as len, name from product where article=40 order by len asc;
+------+----------------------------------------------------------+
| len | name |
+------+----------------------------------------------------------+
| 3 | aaa |
| 6 | BBBBBB |
| 6 | CCCCCC |
| 8 | dddddddd |
+------+----------------------------------------------------------+
4 rows in set (0.00 sec)
In this example I need to get one value like BBBBBB or CCCCCC (AVG?)
Your example can't return the name of average length, because there is no such thing. The average length would be (8 + 3 + 6 + 6) / 4 = 5.75, and no name has that length. I think you want the median, which is the value such that 50% are bigger and 50% are smaller.
Here is one way to get the median (assuming that names don't contain the separator '||' and that the concatenation doesn't exceed group_concat_max_len):
select ProductID,
substring_index(substring_index(group_concat(name order by length(name) separator '||'
), '||', 1 + count(*)/2
), '||', -1) as MedianLengthName
from blog
group by ProductID;
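The idea behind the query is to sort the names by length and pick the one in the middle. A short Python sketch of that idea, using the four names from the question's EDIT; it assumes integer division for the middle position, so with an even count it picks the upper of the two middle names.

```python
names = ["aaa", "BBBBBB", "CCCCCC", "dddddddd"]

# Same ordering as group_concat(name order by length(name)); Python's sort is
# stable, so names of equal length keep their original relative order.
ordered = sorted(names, key=len)

# 0-based index len // 2 corresponds to the 1-based position 1 + count/2.
median_name = ordered[len(ordered) // 2]
print(median_name)
```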
Try this query:
SELECT AVG(CHAR_LENGTH(name)) AS avg FROM tbl;
If you are looking for the names closest to the mean length (which implies you would have to accept the integer lengths just above and below that decimal value), you can use this:
SELECT *
FROM (
SELECT AVG(CHAR_LENGTH(name)) AS average
FROM product) AS calculated
JOIN product
ON CHAR_LENGTH(name) BETWEEN FLOOR(average) AND CEILING(average);
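The same filter is easy to check by hand with the sample names from the question. A minimal Python sketch (the list and variable names are assumptions made for the example):

```python
import math

names = ["aaa", "BBBBBB", "CCCCCC", "dddddddd"]

# AVG(CHAR_LENGTH(name)): (3 + 6 + 6 + 8) / 4
average = sum(len(n) for n in names) / len(names)

# Keep names whose length lies between FLOOR(average) and CEILING(average).
closest = [
    n for n in names
    if math.floor(average) <= len(n) <= math.ceil(average)
]
print(average, closest)
```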
I have a table like this:
Table: p
+----------------+
| id | w_id |
+---------+------+
| 5 | 8 |
| 5 | 10 |
| 5 | 8 |
| 5 | 10 |
| 5 | 8 |
| 6 | 5 |
| 6 | 8 |
| 6 | 10 |
| 6 | 10 |
| 7 | 8 |
| 7 | 10 |
+----------------+
What is the best SQL to get the following result? :
+-----------------------------+
| id | most_used_w_id |
+---------+-------------------+
| 5 | 8 |
| 6 | 10 |
| 7 | 8 |
+-----------------------------+
In other words, to get, per id, the most frequent related w_id.
Note that on the example above, id 7 is related to 8 once and to 10 once.
So, either (7, 8) or (7, 10) will do as result. If it is not possible to
pick up one, then both (7, 8) and (7, 10) on result set will be ok.
I have come up with something like:
select counters2.p_id as id, counters2.w_id as most_used_w_id
from (
select p.id as p_id,
w_id,
count(w_id) as count_of_w_ids
from p
group by id, w_id
) as counters2
join (
select p_id, max(count_of_w_ids) as max_counter_for_w_ids
from (
select p.id as p_id,
w_id,
count(w_id) as count_of_w_ids
from p
group by id, w_id
) as counters
group by p_id
) as p_max
on p_max.p_id = counters2.p_id
and p_max.max_counter_for_w_ids = counters2.count_of_w_ids
;
but I am not sure at all whether this is the best way to do it. And I had to repeat the same sub-query two times.
Any better solution?
Try using user-defined variables:
select id, w_id
FROM
( select T.*,
if(@id<>id,1,0) as row,
@id:=id FROM
(
select id, w_id, Count(*) as cnt FROM p Group by id, w_id
) as T,(SELECT @id:=0) as T1
ORDER BY id,cnt DESC
) as T2
WHERE row=1
SQLFiddle demo
Formal SQL
In fact, your solution is correct in terms of standard SQL. Why? Because you have to join values from the original data to the grouped data, so your query cannot be simplified. MySQL allows mixing non-grouped columns with aggregate functions, but that is totally unreliable, so I do not recommend relying on that behavior.
MySQL
Since you're using MySQL, you can use variables. I'm not a big fan of them, but for your case they may be used to simplify things:
SELECT
c.*,
IF(@id!=id, @i:=1, @i:=@i+1) AS num,
@id:=id AS gid
FROM
(SELECT id, w_id, COUNT(w_id) AS w_count
FROM t
GROUP BY id, w_id
ORDER BY id DESC, w_count DESC) AS c
CROSS JOIN (SELECT @i:=-1, @id:=-1) AS init
HAVING
num=1;
So for your data result will look like:
+------+------+---------+------+------+
| id | w_id | w_count | num | gid |
+------+------+---------+------+------+
| 7 | 8 | 1 | 1 | 7 |
| 6 | 10 | 2 | 1 | 6 |
| 5 | 8 | 3 | 1 | 5 |
+------+------+---------+------+------+
Thus, you've found your id and the corresponding w_id. The idea is to count the rows and enumerate them, relying on the fact that we order them in the subquery. So we only need the first row of each group (because it represents the data with the highest count).
This may be replaced with a single GROUP BY id, but, again, the server is free to choose any row in that case (it works because it takes the first row, but the documentation does not guarantee that in the general case).
One nice thing about this approach is that you can also select, for example, the 2nd or 3rd most frequent value; it's very flexible.
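The enumerate-within-group idea can also be illustrated outside SQL. The following Python sketch ranks each id's w_id values by frequency and picks the entry at a given rank; it uses the rows of table p from the question, and nth_most_used is a hypothetical helper, not part of any answer above.

```python
from collections import Counter, defaultdict

# (id, w_id) rows copied from table p in the question.
rows = [
    (5, 8), (5, 10), (5, 8), (5, 10), (5, 8),
    (6, 5), (6, 8), (6, 10), (6, 10),
    (7, 8), (7, 10),
]

# Equivalent of GROUP BY id, w_id with COUNT(*).
per_id = defaultdict(Counter)
for id_, w_id in rows:
    per_id[id_][w_id] += 1


def nth_most_used(rank: int) -> dict:
    """Return {id: w_id} for the rank-th most frequent w_id of each id
    (rank 1 = most used). Ids with fewer distinct w_ids are skipped."""
    result = {}
    for id_, counter in per_id.items():
        ranked = counter.most_common()  # highest count first
        if len(ranked) >= rank:
            result[id_] = ranked[rank - 1][0]
    return result


most_used = nth_most_used(1)
second_most_used = nth_most_used(2)
print(most_used, second_most_used)
```

For id 7 both w_id values occur once, so either one is an acceptable answer, as the question notes.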
Performance
To increase performance, you can create an index on (id, w_id); it will be used for ordering and grouping the records. Variables and HAVING, however, will cause a row-by-row scan of the set derived by the inner GROUP BY. That isn't as bad as a full scan of the original data, but it is still a drawback of doing this with variables. On the other hand, doing it with a JOIN and subquery as in your query won't be much different, because a temporary table is created for the subquery result set too.
But to be certain, you'll have to test. And keep in mind that you already have a valid solution which, by the way, isn't tied to DBMS-specific features and is good in terms of standard SQL.
Try this query
select p_id, ccc , w_id from
(
select p.id as p_id,
w_id, count(w_id) ccc
from p
group by id,w_id order by id,ccc desc) xxx
group by p_id having max(ccc)
Here is the SQL Fiddle link.
You can also use this code if you do not want to rely on the first record of non-grouping columns
select p_id, ccc , w_id from
(
select p.id as p_id,
w_id, count(w_id) ccc
from p
group by id,w_id order by id,ccc desc) xxx
group by p_id having ccc=max(ccc);