groupby get the average length name - mysql

I'm using the group by function to get some products from my little shop like:
select name, ProductID from blog group by ProductID
+----------------------------------------------------------+
| name |
+----------------------------------------------------------+
| AAA |
| BBBB |
| CCCC |
| DDDDDDDD |
+----------------------------------------------------------+
Is it possible to get the average length name in the groupby function?
EDIT (from OP, placed in answer):
mysql> select length(name) as len, name from product where article=40 order by len asc;
+------+----------------------------------------------------------+
| len | name |
+------+----------------------------------------------------------+
| 3 | aaa |
| 6 | BBBBBB |
| 6 | CCCCCC |
| 8 | dddddddd |
+------+----------------------------------------------------------+
4 rows in set (0.00 sec)
In this example I need to get one value, like BBBBBB or CCCCCC (AVG?)

Your example doesn't get the average length name, because there is no such thing. The average length would be (8 + 3 + 6 + 6) / 4 = 5.75, and no name has that length. I think you want the median, which is the length such that 50% of the names are longer and 50% are shorter.
Here is one way to get the median (assuming that names don't contain the separator '||' and that the concatenated string doesn't exceed group_concat_max_len, which defaults to 1024 bytes):
select ProductID,
substring_index(substring_index(group_concat(name order by length(name) separator '||'
), '||', 1 + floor(count(*) / 2)
), '||', -1) as MedianLengthName
from blog
group by ProductID;
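To make the selection rule easier to see, here is a minimal Python sketch of the same idea: sort each group's names by length and take the middle one (the upper middle one when the count is even). The function name and the row layout are assumptions for illustration; they are not part of the original query.

```python
from collections import defaultdict

def median_length_names(rows):
    """rows: iterable of (product_id, name) pairs (hypothetical layout).

    Returns a dict mapping each product_id to the name of median length.
    """
    groups = defaultdict(list)
    for product_id, name in rows:
        groups[product_id].append(name)

    result = {}
    for product_id, names in groups.items():
        ordered = sorted(names, key=len)  # stable sort by length
        # Zero-based index len // 2 is the middle element
        # (the upper middle one for an even count).
        result[product_id] = ordered[len(ordered) // 2]
    return result

rows = [(40, "aaa"), (40, "BBBBBB"), (40, "CCCCCC"), (40, "dddddddd")]
print(median_length_names(rows))  # {40: 'CCCCCC'}
```

Unlike the GROUP_CONCAT approach, this sketch has no separator or length limit to worry about, which is why it is only an illustration of the logic and not a drop-in replacement.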

Try this query:
SELECT AVG(CHAR_LENGTH(name)) AS avg FROM tbl;

If you are looking for the mean average (which implies you would have to accept the integer lengths just below and above that decimal value), you can use this:
SELECT *
FROM (
SELECT AVG(CHAR_LENGTH(name)) AS average
FROM product) AS calculated
JOIN product
ON CHAR_LENGTH(name) BETWEEN FLOOR(average) AND CEILING(average);
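As a rough illustration of what the join above selects, the following Python sketch computes the mean length and keeps the names whose length falls between its floor and ceiling. The function name is hypothetical and the sample names come from the question's edit.

```python
import math

def names_near_mean_length(names):
    """Return (mean_length, names whose length lies in [floor(mean), ceil(mean)])."""
    mean = sum(len(n) for n in names) / len(names)
    low, high = math.floor(mean), math.ceil(mean)
    return mean, [n for n in names if low <= len(n) <= high]

mean, near = names_near_mean_length(["aaa", "BBBBBB", "CCCCCC", "dddddddd"])
print(mean)  # 5.75
print(near)  # ['BBBBBB', 'CCCCCC']
```

With a mean of 5.75 the accepted lengths are 5 and 6, so both six-character names are returned.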

Related

How to get maximum appearance count of number from comma separated number string from multiple rows in MySQL?

My MySQL table has a column containing comma-separated numbers. See the example below -
| style_ids |
| ---------- |
| 5,3,10,2,7 |
| 1,5,12,9 |
| 6,3,5,9,4 |
| 8,3,5,7,12 |
| 7,4,9,3,5 |
My expected result is the top 5 numbers by appearance count, in descending order, as 5 rows like this -
| number | appearance_count_in_all_rows |
| -------|----------------------------- |
| 5 | 5 |
| 3 | 4 |
| 9 | 3 |
| 7 | 3 |
| 4 | 2 |
Is it possible to get above result by MySQL query ?
As already alluded to in the comments, this is a really bad idea. But here is one way of doing it -
WITH RECURSIVE seq (n) AS (
SELECT 1 UNION ALL SELECT n+1 FROM seq WHERE n < 20
), tbl (style_ids) AS (
SELECT '5,3,10,2,7' UNION ALL
SELECT '1,5,12,9' UNION ALL
SELECT '6,3,5,9,4' UNION ALL
SELECT '8,3,5,7,12' UNION ALL
SELECT '7,4,9,3,5'
)
SELECT seq.n, COUNT(*) appearance_count_in_all_rows
FROM seq
JOIN tbl ON FIND_IN_SET(seq.n, tbl.style_ids)
GROUP BY seq.n
ORDER BY appearance_count_in_all_rows DESC
LIMIT 5;
Just replace the tbl cte with your table.
As already pointed out you should fix the data if possible.
For further details read Is storing a delimited list in a database column really that bad?.
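For comparison, the same counting can be sketched outside the database. The Python snippet below is only an illustration under the assumption that the rows have already been fetched as strings; the function name is hypothetical. Like the FIND_IN_SET join, it counts a number at most once per row.

```python
from collections import Counter

def top_style_ids(rows, limit=5):
    """Count in how many rows each number appears and return the top entries."""
    counts = Counter()
    for style_ids in rows:
        # A set ensures a number is counted once per row.
        counts.update({int(part) for part in style_ids.split(",")})
    # Highest count first; ties broken by the smaller number.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]

rows = ["5,3,10,2,7", "1,5,12,9", "6,3,5,9,4", "8,3,5,7,12", "7,4,9,3,5"]
print(top_style_ids(rows))  # [(5, 5), (3, 4), (7, 3), (9, 3), (4, 2)]
```

The tie-breaking rule is a choice made for this sketch; the SQL query leaves the order of equal counts unspecified.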
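You could use the answer below, which is explained in detail in the linked post; a working fiddle is also available. It assumes a helper table numbers with a column n that holds consecutive integers starting at 1.
Try: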
select distinct_nr,count(distinct_nr) as appearance_count_in_all_rows
from ( select substring_index(substring_index(style_ids, ',', n), ',', -1) as distinct_nr
from test
join numbers on char_length(style_ids) - char_length(replace(style_ids, ',', '')) >= n - 1
) x
group by distinct_nr
order by appearance_count_in_all_rows desc ;

Select pair numbers in SQL Query

Database:
+------------+
| Number |
+------------+
| 0050000235 |
+------------+
| 5532003644 |
+------------+
| 1122330505 |
+------------+
| 1103220311 |
+------------+
| 1103000011 |
+------------+
| 1103020012 |
+------------+
To select numbers that contain the pair "00" three times, I tried:
SELECT * FROM numbers
WHERE Number LIKE "%00%00%00%"
OR Number LIKE "%00%0000%"
OR Number LIKE "%0000%00%"
OR Number LIKE "0000%00%"
OR Number LIKE "%00%0000"
OR Number LIKE "00%0000%"
OR Number LIKE "%0000%00"
OR Number LIKE "%0000%00"
OR Number LIKE "%000000%"
OR Number LIKE "000000%"
OR Number LIKE "%000000"
This returns:
0050000235
But I don't think the method I am using is clean.
Question: How can I fetch numbers that contain 3 pairs with a clean SQL query?
The result will be:
0050000235, 5532003644, 1122330505, 1103220311 & 1103000011
where Number rlike '((00|11|22|33|44|55|66|77|88|99).*){3}'
Create a series of the digits 0 to 9 with UNION ALL and cross join it to the table.
Each digit is doubled and that pair is replaced in the column with an empty string. The differences in length are summed per number, and a sum of at least 6 means that at least 3 pairs exist:
select
n.number
from (
select 0 d union all select 1 d union all select 2 union all
select 3 union all select 4 union all select 5 union all
select 6 union all select 7 union all select 8 union all select 9
) s cross join numbers n
group by n.number
having sum(
length(n.number) - length(replace(n.number, repeat(d, 2), ''))
) >= 6
See the demo.
Results:
| number |
| ---------- |
| 0050000235 |
| 1103000011 |
| 1103220311 |
| 1122330505 |
| 5532003644 |
How about using regular expressions?
where number regexp '00.*00.*00'
Or slightly shorter:
where number regexp '(00.*){3}'
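You can generalize this to a pair of any digit, but note that '([0-9]{2}.*){3}' would match any two digits, identical or not, so the pairs have to be listed explicitly:
where number regexp '((00|11|22|33|44|55|66|77|88|99).*){3}'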
If you want to ensure exactly six '0' (as opposed to more):
where number regexp '^[^0]*00[^0]*00[^0]*00[^0]*$'
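To check the pair logic against the sample data, here is a small Python sketch. It uses a backreference, which Python's re module supports, to match any digit followed by the same digit; this is an illustration of the counting rule, not a claim about what MySQL's regex engine accepts. The function names are hypothetical.

```python
import re

# A digit followed by the same digit; matches do not overlap,
# so "0000" counts as two pairs and "000" as one.
PAIR = re.compile(r"(\d)\1")

def count_pairs(number):
    """Return the number of non-overlapping identical-digit pairs."""
    return len(PAIR.findall(number))

def has_three_pairs(number):
    return count_pairs(number) >= 3

numbers = ["0050000235", "5532003644", "1122330505",
           "1103220311", "1103000011", "1103020012"]
print([n for n in numbers if has_three_pairs(n)])
# ['0050000235', '5532003644', '1122330505', '1103220311', '1103000011']
```

Only 1103020012 is rejected, since it contains just two pairs (11 and 00).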

SQL sort by number in column

id| name | created_at
------------------------------
1 | name 1 | 2017-05-20
2 | name 2 | 2017-05-22
3 | name 66 | 2017-05-24
4 | name 44 | 2017-05-25
I have a table Orders.
I have to sort it by the number in the name column,
like this:
id| name | created_at
------------------------------
1 | name 66 | 2017-05-20
2 | name 44 | 2017-05-22
3 | name 2 | 2017-05-24
4 | name 1 | 2017-05-25
I have tried SELECT * FROM Orders ORDER BY name DESC; but with no luck.
How can I do it?
If all the names are in the format name, space, number you can use this query. The SUBSTRING_INDEX extracts the characters from the last space to the end and they are then CAST as an unsigned integer, which allows them to be sorted.
SELECT *
FROM Orders
ORDER BY CAST(SUBSTRING_INDEX(name, ' ', -1) AS UNSIGNED) DESC
Output:
id name created_at
3 name 66 2017-05-24
4 name 44 2017-05-25
2 name 2 2017-05-22
1 name 1 2017-05-20
There is a simple trick to do this, that is to order by length first, then by name. Example:
SELECT *
FROM Orders
ORDER BY LENGTH(name) DESC, name DESC
Here's an example of this in action: SQL Fiddle
Edit: Please note, this will only work if your strings are as consistent as they are in your example data.
The following query extracts the number from the name column with the SUBSTRING function and then orders by it.
SELECT *
FROM Orders
ORDER BY CAST(SUBSTRING(name, length('name')+2 , 3) AS UNSIGNED) DESC
If 'name' is not the same on each row, then you can do:
order by substring_index(name, ' ', 1),
substring_index(name, ' ', -1) + 0
This will work if you have values such as:
a 1
a 2
a 10
b 1
b 100
(and of course if all start with "name" as well)
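The prefix-then-number ordering can be sketched in Python as a sort key. This is a minimal illustration that assumes every value ends in a space followed by an integer; the function name is hypothetical. It splits on the last space, which is equivalent to the query above when each value contains exactly one space.

```python
def sort_key(name):
    """Split 'prefix number' on the last space and compare the number numerically."""
    prefix, _, number = name.rpartition(" ")
    return (prefix, int(number))

values = ["b 100", "a 10", "a 2", "b 1", "a 1"]
print(sorted(values, key=sort_key))
# ['a 1', 'a 2', 'a 10', 'b 1', 'b 100']

orders = ["name 1", "name 2", "name 66", "name 44"]
print(sorted(orders, key=sort_key, reverse=True))
# ['name 66', 'name 44', 'name 2', 'name 1']
```

Sorting the raw strings instead would put 'a 10' before 'a 2', which is the problem the question describes.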

SQL - Retrieve Avg Score from Group

My database contains the following:
| ID | UID | file | score | time |
| 1 | a827vgj28df | jack_123 | 75 | 12:44 |
| 2 | ayeskfkfjhk | jack_999 | 5 | 12:12 |
| 3 | b83759 | adam_123 | 7 | 12:12 |
Goal:
I am trying to write a query that displays the average score for each file prefix (jack/adam).
It should display like this:
| Key | AVG |
| jack | 40 |
| adam | 7 |
You can use substring_index to extract the name prefixes. From there on, it's just a simple use of avg:
SELECT SUBSTRING_INDEX(file, '_', 1) AS `key`, AVG(score)
FROM mytable
GROUP BY SUBSTRING_INDEX(file, '_', 1)
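As a quick check of the expected numbers, the grouping can be sketched in Python. The function name and the (file, score) row layout are assumptions for illustration; the prefix is everything before the first underscore, as with SUBSTRING_INDEX(file, '_', 1).

```python
from collections import defaultdict

def average_by_prefix(rows):
    """rows: iterable of (file, score). Returns {prefix: average score}."""
    totals = defaultdict(lambda: [0, 0])  # prefix -> [sum, count]
    for file, score in rows:
        prefix = file.split("_", 1)[0]
        totals[prefix][0] += score
        totals[prefix][1] += 1
    return {prefix: total / count for prefix, (total, count) in totals.items()}

rows = [("jack_123", 75), ("jack_999", 5), ("adam_123", 7)]
print(average_by_prefix(rows))  # {'jack': 40.0, 'adam': 7.0}
```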
You can get the key using SUBSTRING_INDEX and then group by it (key is a reserved word in MySQL, so the alias needs backticks):
SELECT
SUBSTRING_INDEX(file, '_', 1) AS `key`
, avg(score) average
from my_table
group by `key`
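It should be something like this (assuming every prefix is exactly 4 characters long, and with mytable standing in for your table name):
select SUBSTRING(file, 1, 4) as `Key`, AVG(score)
from mytable
group by SUBSTRING(file, 1, 4)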
You might get a better response in the future if you share what you've already tried.

Retrieve distinct values without reducing number of results

I'm writing a MySQL query for retrieving data from a list of questions.
The table looks like this :
-----------------------------------------------------
| id | answer_name | rating | question_id | answers |
-----------------------------------------------------
Where several rows can have the same answer_name value, since several questions can be asked about the same answer.
Now, for retrieving the data I use a LIMIT clause which is calculated from ratings and the total number of rows.
For example, if I want to get the data between 80% and 100% of rating, and there are 100 rows, I would use ORDER BY rating LIMIT 80, 20.
My problem is the following: I need to retrieve rows with distinct values in the answer_name column, but a GROUP BY clause reduces the number of rows through aggregation, so the top percentage ranges return nothing because the LIMIT offset points past the end of the result.
Does anyone know if there is a way to keep the number of results the same and still retrieve distinct results for the answer_name column?
EDIT :
Here are some sample rows and expected output :
game_data table :
-----------------------------------------------------
| id | answer_name | rating | question_id | answers |
|----|-------------|--------|-------------|---------|
| 1 | A. Merkel | 40 | 1 | [1,2,3] |
| 2 | A. Merkel | 45 | 2 | [2,3,4] |
| 3 | B. Clinton | 55 | 1 | [2,5,8] |
| 4 | B. Clinton | 50 | 2 | [3,5,8] |
| 5 | L. Messi | 17 | 4 | [7,8,9] |
| 6 | L. Messi | 18 | 5 | [7,8,9] |
| 7 | L. Messi | 25 | 6 | [7,8,9] |
| 8 | D. Beckham | 21 | 4 | [6,7,8] |
| 9 | D. Beckham | 52 | 5 | [6,7,8] |
| 10 | D. Beckham | 41 | 6 | [6,7,8] |
-----------------------------------------------------
Where answers is an array of ids referring to another table.
Let's say I want to retrieve the 50% to 80% range of the table, ordered by rating.
SELECT id FROM game_data GROUP BY answer_name ORDER BY rating LIMIT 5, 3
Here the problem is that GROUP BY answer_name reduces the number of rows, so instead of returning 3 results the query returns an empty set.
Also, I want the row selected for each group in the GROUP BY clause to be chosen randomly.
Using group by like this goes against pretty much every instinct (and it only runs when the ONLY_FULL_GROUP_BY SQL mode is disabled), but you said you want random values, so it's good enough.
select * from (
select q.*, @rank := @rank + 1 as `rank`
from (
select * from game_data
group by answer_name
order by rating desc
) q, (select @rank := 0) qq
) qqq
where `rank` between (@rank * .5) and (@rank * .8)
demo here
How does it work? First (in the innermost query) we group by your answer_name, to get your distinct results, and we order them by the rating as required.
Then in the query wrapping around that one, we give those results a ranking from 1 to however many rows are in the result. Once this level of the query completes, we know our best answer is answer 1, and our 'worst' answer is the last value of our @rank variable.
Then we get to the outermost query. We can use that @rank variable to determine our percentages, which we use to filter in the where clause.
In all likelihood this will give you the same results each time you run the same query, but the values chosen are indeterminate, so they could change. If you want truly random values (i.e. ones that change with each execution), that's a different kettle of fish altogether.
(Note: the , (select @rank := 0) qq part is there purely to initialise the variable. The rank alias is quoted with backticks because RANK is a reserved word in MySQL 8.0.)
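The three steps described above can be sketched in Python to show which rows end up in the 50% to 80% band. This is only an illustration: the function name is hypothetical, rows are assumed to be (id, answer_name, rating) tuples, and taking the first row seen per answer_name stands in for MySQL's indeterminate choice.

```python
def distinct_slice(rows, low=0.5, high=0.8):
    """Keep one row per answer_name, rank by rating, return the requested band."""
    chosen = {}
    for row in rows:
        # First row seen per answer_name (stand-in for an arbitrary pick).
        chosen.setdefault(row[1], row)
    ranked = sorted(chosen.values(), key=lambda r: r[2], reverse=True)
    total = len(ranked)  # plays the role of the final rank counter
    return [row for rank, row in enumerate(ranked, start=1)
            if total * low <= rank <= total * high]

game_data = [
    (1, "A. Merkel", 40), (2, "A. Merkel", 45),
    (3, "B. Clinton", 55), (4, "B. Clinton", 50),
    (5, "L. Messi", 17), (6, "L. Messi", 18), (7, "L. Messi", 25),
    (8, "D. Beckham", 21), (9, "D. Beckham", 52), (10, "D. Beckham", 41),
]
print(distinct_slice(game_data))
# [(1, 'A. Merkel', 40), (8, 'D. Beckham', 21)]
```

With four distinct names the band covers ranks 2 to 3.2, so ranks 2 and 3 are returned. The percentages apply to the number of distinct names, not to the original row count.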
It's simple.
Group by id instead of answer_name, because GROUP BY collapses duplicate values:
SELECT * FROM game_data GROUP BY id ORDER BY rating