mysql sorting cells with letters and words - mysql

I got cells that might have the following data (number of errors)
1
2
3
PASS
NoFileFound
NoLog
99
10
2
I would like sort with ascending order and descending order where I would PASS to be treated as a value of 0 and any other text based value should be treated as value of 1 error. As of now, these cells are stored as 'text' in the mysql database. How can this be done for MYSQL? What changes do I need to do?

Try this one:
SELECT Number FROM (
SELECT IF(valueField='PASS',0,1) as Number FROM TableMix
WHERE concat('',valueField * 1) <> valueField ) A
UNION ALL
SELECT Number FROM (
SELECT CAST(valueField as UNSIGNED) as Number FROM TableMix
WHERE concat('',valueField * 1) = valueField ) B
ORDER BY Number
See my SqlFiddle Demo
And this one with original values included:
SELECT Number, valueField FROM (
SELECT IF(valueField='PASS',0,1) as Number, valueField FROM TableMix
WHERE concat('',valueField * 1) <> valueField ) A
UNION ALL
SELECT Number, valueField FROM (
SELECT CAST(valueField as UNSIGNED) as Number, valueField FROM TableMix
WHERE concat('',valueField * 1) = valueField ) B
ORDER BY Number
See this Demo.

Related

Selecting multiple rows, where a difference in value is greater than x%

I'm facing the following problem...
Given this data:
table : votes
=========
value
=========
10
25
38
90
92
93
98
100
120
I would like to return the value only, if the difference between next and previously accepted value is bigger than 10% of the first one:
if abs(int(a)-int(b))*100/int(a) < 10:
return True
So the end list should be (I have added % difference in square brackets):
==========
result
==========
10 ()
25 (150%)
38 (52%)
90 (136%)
100 (11%)
120 (20%)
The query should also sort those values first.
I'm able to do it with code (as shown above), but haven't got any chance in coming even close to a direct query.
MySQL v.8.0.19
You don't mention what version of MySQL you are using, so I'll assume it's a mordern one (8.x). You can use LAG(). For example:
select
concat('', value,
case when prev_value is null then ''
else concat('', 100 * (value - prev_value) / prev_value, '%')
end
) as result
from (
select
value,
lag(value) over (order by value) as prev_value
from t
) x
where prev_value is null or value > prev_value * 1.1
order by value
In MySQL 8.0, you can do this with lag(). Assuming that you want to sort rows by value, that would be:
select value
from (
select
value,
lag(value, 1, 0) over(order by value) lag_value
from mytable t
) t
where value > lag_value * 1.10
If you want to use an different ordering column, then you can change the order by clause to use the relevant column.
In earlier versions, one option is a correlated subquery:
select value
from mytable t
where value > 1.10 * coalesce(
(
select t1.value
from mytable t1
where t1.value < t.value
order by t1.value desc
limit 1
),
0
)
To use another ordering column here, you need to change the where clause and the order by clause of the subquery.
On the other hand, if you want to select the next row according to the ratio against the previously selected row, then that's a different question. You need some kind of iterative process: in SQL, one approach is a recursive query:
with
data as (
select value, row_number() over(order by value) rn
from mytable t
) d,
cte as (
select 1 is_valid, value, rn from data where rn = 1
union all
select
(d.value > 1.1 * c.value),
case when d.value > 1.1 * c.value then d.value else c.value end,
d.rn
from cte c
inner join data d on d.rn = c.rn + 1
)
select value from cte where is_valid order by value
The query enumerates the values, then walks the dataset sequentially while keeping track of the last selected value, and setting flags on records that should appear in the final resultset.

MySQL Query to average 3 columns and exclude 0's?

This is obviously wrong, but what would be the correct way to average the SUM of 3 columns and exclude the 0's?
SELECT (
AVG(NULLIF(`dices`.`Die1`,0)) +
AVG(NULLIF(`dices`.`Die2`,0)) +
AVG(NULLIF(`dices`.`Die3`,0))
) /3 as avgAllDice
FROM (
SELECT `Die1`,`Die2`,`Die3` FROM `GameLog`
WHERE PlayerId = "12345"
) dices
Thanks.
If I was keeping the inline view query (it's not clear why it's needed). I'd probably do something like this:
SELECT AVG( NULLIF( CASE d.i
WHEN 1 THEN dices.`Die1`
WHEN 2 THEN dices.`Die2`
WHEN 3 THEN dices.`Die3`
END
,0)
) AS `avgAllDice`
FROM ( SELECT gl.`Die1`
, gl.`Die2`
, gl.`Die3`
FROM `GameLog` gl
WHERE gl.playerId = '12345'
) dices
CROSS
JOIN ( SELECT 1 AS i UNION ALL SELECT 2 UNION ALL SELECT 3 ) d
The trick is the cross join operation, giving me three rows for each row returned from dices, and an expression that picks out values of Die1, Die2 and Die3 on each of three rows, respectively.
To exclude values of 0, we replace 0 with with NULL (since AVG doesn't include NULL values.)
Now with all of the non-zero DieN values stacked into a single column, we can just use the AVG function.
Another way to do it would be to get the numerator and denominator for each of Die1, Die2, Die3.... and then total up the numerators, total up the denominators, and then divide the total numerator by the total denominator.
This will should give an equivalent result.
SELECT ( IFNULL(t.n_die1,0) + IFNULL(t.n_die2,0) + IFNULL(t.n_die3,0) )
/ ( t.d_die1 + t.d_die2 + t.d_die3 )
AS avgAllDice
FROM ( SELECT SUM( NULLIF(gl.die1,0)) AS n_die1
, COUNT(NULLIF(gl.die1,0)) AS d_die1
, SUM( NULLIF(gl.die2,0)) AS n_die2
, COUNT(NULLIF(gl.die2,0)) AS d_die2
, SUM( NULLIF(gl.die3,0)) AS n_die3
, COUNT(NULLIF(gl.die3,0)) AS d_die3
FROM `GameLog` gl
WHERE gl.playerid = '12345'
) t
(I didn't work out what gets returned in the edge and corner cases... no matching rows in GameLog, all values of Die1, Die2 and Die3 are zero, etc., for either query. The results might be slightly different, returning a zero instead of NULL, divide by zero edge case, etc.)
FOLLOWUP
I ran a quick test of both queries.
CREATE DATABASE d20170228 ;
USE d20170228 ;
CREATE TABLE GameLog
( playerid VARCHAR(5) DEFAULT '12345'
, die1 TINYINT
, die2 TINYINT
, die3 TINYINT
);
INSERT INTO GameLog (die1,die2,die3)
VALUES (3,0,0),(2,1,0),(4,3,3),(3,3,3),(0,0,0),(4,4,4),(5,4,0),(0,0,2)
;
SELECT (3+2+1+4+3+3+3+3+3+4+4+4+5+4+2)/15 AS manual_avg
manual_avg is coming out 3.2.
Both queries are also returning 3.2
If you want to eliminate zeroes and NULLs, you can simply SELECT from the filtered master set multiple times, doing a UNION ALL on the results, then averaging against that.
SELECT AVG(`allDice`.`DieResult`)
FROM (
SELECT `Die1` AS `DieResult` FROM `GameLog` WHERE COALESCE(`Die1`, 0) <> 0 AND PlayerId = '12345'
UNION ALL
SELECT `Die2` FROM `GameLog` WHERE COALESCE(`Die2`, 0) <> 0 AND PlayerId = '12345'
UNION ALL
SELECT `Die3` FROM `GameLog` WHERE COALESCE(`Die3`, 0) <> 0 AND PlayerId = '12345'
) AS `allDice`
There's no need to overthink this one, it's not too difficult a problem

how to select rows of a table in 3th multiply order in sql like 3,6,9 etc

My table consists of 1024 rows and two columns.
I want to select and display rows like 3,4,6,etc..
Is it possible in sql query.
If possible what is the code for that..
In a relational database there is no concept of the "3rd" or "6th" row unless you can define a sort order.
Assuming you do have a column by which you can order the result in order to define a row as "3rd", or "4th", the following will work with Postgres:
select *
from (
select col1, col2, col3,
row_number() over (order by your_sort_column) as rn
from your_table
) t
where rn in (3,4,6);
Where your_sort_column is the one that defines the order or rows. This could be an incremental id or a timestamp column that stores the time of insert or update of the row.
In MySQL it can be accomplished as follows:
SELECT * from
(SELECT (#row := #row + 1) num, tab.* FROM your_table, (SELECT #row := 0) r
ORDER by your_sort_column) t
WHERE (num % 3) = 0
We have used a MODULO operator to get row which is at multiple of 3.
See it working at SQL Fiddle: http://www.sqlfiddle.com/#!2/192b0/4
In PostgreSQL it can be accomplished as follows:
select *
from (
select *, row_number() over (order by your_sort_column) as rn
from your_table
) t
where (rn % 3)=0 ;
http://www.sqlfiddle.com/#!15/cc649/5

Calculate medians for multiple columns in the same table in one query call

StackOverflow to the rescue!, I need to find the medians for five columns at once, in one query call.
The median calculations below work for single columns, but when combined, multiple uses of "rownum" throws the query off. How can I update this to work for multiple columns? THANK YOU. It's to create a web tool where nonprofits can compare their financial metrics to user-defined peer groups.
SELECT t1_wages.totalwages_pctoftotexp AS median_totalwages_pctoftotexp
FROM (
SELECT #rownum := #rownum +1 AS `row_number` , d_wages.totalwages_pctoftotexp
FROM data_990_c3 d_wages, (
SELECT #rownum :=0
)r_wages
WHERE totalwages_pctoftotexp >0
ORDER BY d_wages.totalwages_pctoftotexp
) AS t1_wages, (
SELECT COUNT( * ) AS total_rows
FROM data_990_c3 d_wages
WHERE totalwages_pctoftotexp >0
) AS t2_wages
WHERE 1
AND t1_wages.row_number = FLOOR( total_rows /2 ) +1
--- [that was one median, below is another] ---
SELECT t1_solvent.solvent_days AS median_solvent_days
FROM (
SELECT #rownum := #rownum +1 AS `row_number` , d_solvent.solvent_days
FROM data_990_c3 d_solvent, (
SELECT #rownum :=0
)r_solvent
WHERE solvent_days >0
ORDER BY d_solvent.solvent_days
) AS t1_solvent, (
SELECT COUNT( * ) AS total_rows
FROM data_990_c3 d_solvent
WHERE solvent_days >0
) AS t2_solvent
WHERE 1
AND t1_solvent.row_number = FLOOR( total_rows /2 ) +1
[those are two - there are five in total I'll eventually need to find medians for at once]
This kind of thing is a big pain in the neck in MySQL. You might be wise to use the free Oracle Express Edition or postgreSQL if you're going to do tonnage of this statistical ranking work. They all have MEDIAN(value) aggregate functions that are either built-in or available as extensions. Here's a little sqlfiddle demonstrating that. http://sqlfiddle.com/#!4/53de8/6/0
But you didn't ask about that.
In MySQL, your basic problem is the scope of variables like #rownum. You also have a pivoting problem: that is, you need to turn rows of your query into columns.
Let's tackle the pivot problem first. What you're going to do is create a union of several big fat queries. For example:
SELECT 'median_wages' AS tag, wages AS value
FROM (big fat query making median wages) A
UNION
SELECT 'median_volunteer_hours' AS tag, hours AS value
FROM (big fat query making median volunteer hours) B
UNION
SELECT 'median_solvent_days' AS tag, days AS value
FROM (big fat query making median solvency days) C
So here are your results in a table of tag / value pairs. You can pivot that table like so, to get one row with a value in each column.
SELECT SUM( CASE tag WHEN 'median_wages' THEN value ELSE 0 END
) AS median_wages,
SELECT SUM( CASE tag WHEN 'median_volunteer_hours' THEN value ELSE 0 END
) AS median_volunteer_hours,
SELECT SUM( CASE tag WHEN 'median_solvent_days' THEN value ELSE 0 END
) AS median_solvent_days
FROM (
/* the above gigantic UNION query */
) Q
That's how you pivot up rows (from the UNION query in this case) to columns. Here's a tutorial on the topic. http://www.artfulsoftware.com/infotree/qrytip.php?id=523
Now we need to tackle the median-computing subqueries. The code in your question looks pretty good. I don't have your data so it's hard for me to evaluate it.
But you need to avoid reusing the #rownum variable. Call it #rownum1 in one of your queries, #rownum2 in the next one, and so on. Here's a dinky sql fiddle doing just one of these. http://sqlfiddle.com/#!2/2f770/1/0
Now let's build it up a bit, doing two different medians. Here's the fiddle http://sqlfiddle.com/#!2/2f770/2/0 and here's the UNION query. Notice the second half of the union query uses #rownum2 instead of #rownum.
Finally, here's the full query with the pivoting. http://sqlfiddle.com/#!2/2f770/13/0
SELECT SUM( CASE tag WHEN 'Boston' THEN value ELSE 0 END ) AS Boston,
SUM( CASE tag WHEN 'Bronx' THEN value ELSE 0 END ) AS Bronx
FROM (
SELECT 'Boston' AS tag, pop AS VALUE
FROM (
SELECT #rownum := #rownum +1 AS `row_number` , pop
FROM pops,
(SELECT #rownum :=0)r
WHERE pop >0 AND city = 'Boston'
ORDER BY pop
) AS ordered_rows,
(
SELECT COUNT( * ) AS total_rows
FROM pops
WHERE pop >0 AND city = 'Boston'
) AS rowcount
WHERE ordered_rows.row_number = FLOOR( total_rows /2 ) +1
UNION ALL
SELECT 'Bronx' AS tag, pop AS VALUE
FROM (
SELECT #rownum2 := #rownum2 +1 AS `row_number` , pop
FROM pops,
(SELECT #rownum2 :=0)r
WHERE pop >0 AND city = 'Bronx'
ORDER BY pop
) AS ordered_rows,
(
SELECT COUNT( * ) AS total_rows
FROM pops
WHERE pop >0 AND city = 'Bronx'
) AS rowcount
WHERE ordered_rows.row_number = FLOOR( total_rows /2 ) +1
) D
This is just two medians. You need five. I think it's easy to make the case that this median computation is absurdly difficult to do in MySQL in a single query.
Suppose you have a table with three columns like table(key, value1, value2).
this query gives you the median value of the two value columns for each key:
SELECT key,
((array_agg(value1 order by value1 asc) )[floor( (count(*)+1)::float/2)] + (array_agg(value1 order by value1 asc) )[ceiling( (count(*)+1)::float/2) ] )/2,
((array_agg(value2 order by value2 asc) )[floor( (count(*)+1)::float/2)] + (array_agg(value2 order by value2 asc) )[ceiling( (count(*)+1)::float/2) ] )/2
FROM table
GROUP BY key

MySQL - match post code based on one or two first characters

I'm trying to create a SQL statement to find the matching record based on the provided post code and stored post codes in the database plus the weight aspect.
The post codes in the database are between 1 or 2 characters i.e. B, BA ...
Now - the value passed to the SQL statement will always have 2 first characters of the client's post code. How can I find the match for it? Say I have a post code B1, which would only match the single B in the database plus the weight aspect, which I'm ok with.
Here's my current SQL statement, which also takes the factor of the free shipping above certain weight:
SELECT `s`.*,
IF (
'{$weight}' > (
SELECT MAX(`weight_from`)
FROM `shipping`
WHERE UPPER(SUBSTRING(`post_code`, 1, 2)) = 'B1'
),
(
SELECT `cost`
FROM `shipping`
WHERE UPPER(SUBSTRING(`post_code`, 1, 2)) = 'B1'
ORDER BY `weight_from` DESC
LIMIT 0, 1
),
`s`.`cost`
) AS `cost`
FROM `shipping` `s`
WHERE UPPER(SUBSTRING(`s`.`post_code`, 1, 2)) = 'B1'
AND
(
(
'{$weight}' > (
SELECT MAX(`weight_from`)
FROM `shipping`
WHERE UPPER(SUBSTRING(`post_code`, 1, 2)) = 'B1'
)
)
OR
('{$weight}' BETWEEN `s`.`weight_from` AND `s`.`weight_to`)
)
LIMIT 0, 1
The above however uses the SUBSTRING() function with hard coded number of characters set to 2 - this is where I need some help really to make it match only number of characters that matches the provided post code - in this case B1.
Marcus - thanks for the help - outstanding example - here's what my code look like for those who also wonder:
First I've run the following statement to get the right post code:
(
SELECT `post_code`
FROM `shipping`
WHERE `post_code` = 'B1'
)
UNION
(
SELECT `post_code`
FROM `shipping`
WHERE `post_code` = SUBSTRING('B1', 1, 1)
)
ORDER BY `post_code` DESC
LIMIT 0, 1
Then, based on the returned value assigned to the 'post_code' index my second statement followed with:
$post_code = $result['post_code'];
SELECT `s`.*,
IF (
'1000' > (
SELECT MAX(`weight_from`)
FROM `shipping`
WHERE `post_code` = '{$post_code}'
),
(
SELECT `cost`
FROM `shipping`
WHERE `post_code` = '{$post_code}'
ORDER BY `weight_from` DESC
LIMIT 0, 1
),
`s`.`cost`
) AS `cost`
FROM `shipping` `s`
WHERE `s`.`post_code` = '{$post_code}'
AND
(
(
'1000' > (
SELECT MAX(`weight_from`)
FROM `shipping`
WHERE `post_code` = '{$post_code}'
ORDER BY LENGTH(`post_code`) DESC
)
)
OR
('1000' BETWEEN `s`.`weight_from` AND `s`.`weight_to`)
)
LIMIT 0, 1
The following query will get all results where the post_code in the shipping table matches the beginning of the passed in post_code, then it orders it most explicit to least explicit, returning the most explicit one:
SELECT *
FROM shipping
WHERE post_code = SUBSTRING('B1', 1, LENGTH(post_code))
ORDER BY LENGTH(post_code) DESC
LIMIT 1
Update
While this query is flexible, it's not very fast, since it can't utilize an index. If the shipping table is large, and you'll only pass in up to two characters, it might be faster to make two separate calls.
First, try the most explicit call.
SELECT *
FROM shipping
WHERE post_code = 'B1'
If it doesn't return a result then search on a single character:
SELECT *
FROM shipping
WHERE post_code = SUBSTRING('B1', 1, 1)
Of course, you can combine these with a UNION if you must do it in a single call:
SELECT * FROM
((SELECT *
FROM shipping
WHERE post_code = 'B1')
UNION
(SELECT *
FROM shipping
WHERE post_code = SUBSTRING('B1', 1, 1))) a
ORDER BY post_code DESC
LIMIT 1