mysql - check if selected rows contain the same value - mysql

I need to check in MySQL whether certain columns contain the same value in every row, but I don't actually know the value yet. All the solutions I have found so far use COUNT in combination with a WHERE clause. That doesn't work for me, because I don't know the values of the columns. For example:
Index ColB ColC ColD ColE
1 1 cat 1.3 black
2 1 cat 1.3 black
3 1 cat 1.3 white
4 1 cat 1.3 tiger
5 1 cat 1.3 white
I would like to check whether each of the 3 columns ColB, ColC and ColD has the same value in every row. For the table above it should return true. However, for the following table it should return false:
Index ColB ColC ColD ColE
1 1 dog 1.3 black
2 1 cat 1.3 black
3 2 cat 1.3 white
4 1 cat 1.3 tiger
5 1 cat 2.7 white
The rule should be something like this: if (ColB_hasDifferentValues || ColC_hasDifferentValues || ColD_hasDifferentValues) { return false; }
Is that possible? As I said before, I don't know which animals are included in ColC, as users can insert new animals.
Thanks a lot in advance!

Just use max() and min():
select (case when max(b) = min(b) and max(c) = min(c) and max(d) = min(d)
then 'same'
else 'different'
end)
from t;
This logic ignores NULL values (the OP does not mention NULL values at all). The idea can be extended, but the logic is a wee bit more complex.
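As a quick check of the MIN()/MAX() idea, here is a minimal sketch that loads the two example tables from the question and runs the query against each. It uses Python's built-in sqlite3 module purely as a stand-in so the example is self-contained. The table names t_same and t_diff and the short column names are made up for illustration, and I am assuming the SELECT carries over to MySQL unchanged.

```python
import sqlite3

# Query from the answer above; the table name is filled in per call.
QUERY = """
SELECT CASE WHEN MAX(b) = MIN(b) AND MAX(c) = MIN(c) AND MAX(d) = MIN(d)
            THEN 'same' ELSE 'different' END
FROM {table}
"""

def check(rows, table):
    """Load rows into an in-memory table and return 'same' or 'different'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"CREATE TABLE {table} (idx INTEGER, b INTEGER, c TEXT, d REAL, e TEXT)"
    )
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY.format(table=table)).fetchone()[0]
    conn.close()
    return result

# First example table from the question: ColB, ColC, ColD never change.
rows_same = [
    (1, 1, "cat", 1.3, "black"),
    (2, 1, "cat", 1.3, "black"),
    (3, 1, "cat", 1.3, "white"),
    (4, 1, "cat", 1.3, "tiger"),
    (5, 1, "cat", 1.3, "white"),
]
# Second example table: each of the three columns has one deviating row.
rows_diff = [
    (1, 1, "dog", 1.3, "black"),
    (2, 1, "cat", 1.3, "black"),
    (3, 2, "cat", 1.3, "white"),
    (4, 1, "cat", 1.3, "tiger"),
    (5, 1, "cat", 2.7, "white"),
]

result_same = check(rows_same, "t_same")
result_diff = check(rows_diff, "t_diff")
print(result_same, result_diff)
```

ColE is deliberately left out of the comparison, so its varying values do not affect the result.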

Related

Why does count(distinct ..) return different values on the same table?

select count(distinct a,b,c,d) from mytable;
select count(distinct concat(a,'-',b),concat(c,'-',d)) from mytable;
Since '-' never appears in the a, b, c, d fields, the 2 queries above should return the same result. Am I right?
Actually that is not the case: the difference is 4 rows out of ~60M, and I can't figure out how this is possible.
Any ideas or examples?
Thanks
First, I am assuming that you are using MySQL, because that is the only database of your original tags where your syntax would be accepted.
Second, this does not directly answer your question. Given your types and expressions, I do not see how you can get different results. However, very similar constructs can produce different results.
It is very important to note that NULL is not the culprit. If any argument is NULL for either COUNT(DISTINCT) or CONCAT(), then the result is NULL -- and NULLs are not counted.
However, spaces at the end of strings can be an issue. Consider the results from this query:
select count(distinct x, y),
count(distinct concat(x, '-', y)),
count(distinct concat(y, '-', x))
from (select 1 as x, 'a' as y union all
select 1, 'a ' union all
select 1, NULL
) a
I would expect the second and third expressions to return the same thing. But spaces at the end of the string cause differences. COUNT(DISTINCT) ignores them. However, CONCAT() will embed them in the string. Hence, the above returns
1 1 2
And the two values are different.
In other words, two values may not be exactly the same, but COUNT(DISTINCT) might regard them as the same. Spaces are one example. Collations are another potential culprit.
Take this sample data as an example:
A B C D
1 2 3 4
5 6 7 8
1 2 5 7
1 2 5 7
1 3 3 4
1 3 3 4
then count (distinct (a, b, c, d)) = 4
A B C D
1 2 3 4
5 6 7 8
1 2 5 7
1 3 3 4
and count (distinct (a,-,b), distinct (c,-,d)) = 3
dist (a,-,b) dist (c,-,d)
1 2 3 4
5 6 7 8
1 3 5 7

MYSQL Can WHERE IN default to ALL if no rows returned

I have an existing table of results like this:
race_id race_num racer_id place
1 0 32 2
1 1 32 3
1 2 32 1
1 3 32 6
1 0 44 2
1 1 44 2
1 2 44 2
1 3 44 2
etc...
I have lots of PHP scripts that access this table and output the results in a nice format.
Now I have a case where I need to output the results for only certain race_nums.
So I have created this table races_included.
race_view race_id race_num
Day 1 1 0
Day 1 1 1
Day 2 1 2
Day 2 1 3
And can use this query to get the right results.
SELECT racer_id, place from results WHERE race_id=1
AND race_num IN
(SELECT race_num FROM races_included WHERE race_id='1' AND race_view='Day 1')
This is great, but I only need this feature for a few races. To have it work in a compatible mode for the simple case of showing all races, I need to add a lot of rows to the races_included table, like:
race_view race_id race_num
All 1 0
All 1 1
All 1 2
All 1 3
95% of my races don't use the daily feature.
So I am looking for a way to change the query so that if there are no records in the races_included table for race 1, it defaults to all races. In addition, I need it to be close to the same execution speed as the query without the IN clause, because this query, or variations of it, is used a lot.
One way that does work is to redefine the table as races_excluded and use NOT IN. This works great but is a pain to manage the table when races are added or deleted.
Is there a simple way to use EXISTS and IN in tandem as a subquery to get the desired results? Or some other neat trick I am missing.
To clarify I have found a working but very slow solution.
SELECT * FROM race_results WHERE race_id=1
AND FIND_IN_SET(race_num, (SELECT IF((SELECT Count(*) FROM races_excluded
WHERE rid=1>0),(SELECT GROUP_CONCAT(rnum) FROM races_excluded
WHERE rid=1 AND race_view='Day 1' GROUP BY rid),race_num)))
It basically checks if any records exist for that race_id. If not, it returns a set equal to the current race_num; if so, it returns a list of the included race nums.
You can do this by using or in the subquery:
SELECT racer_id, place
from results
WHERE race_id = 1 AND
race_num IN (SELECT race_num
FROM races_included
WHERE race_id = '1' AND (race_view = 'Day 1' or race_view = 'All')
);
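If the goal is to avoid the extra 'All' rows entirely, one option is to combine NOT EXISTS with IN, so that a race with no rows in races_included falls back to every race_num. This is only a sketch under assumptions: it runs on Python's sqlite3 so that it is self-contained, the race 2 rows are invented test data, and whether it is fast enough on MySQL would need checking with EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE results (race_id INTEGER, race_num INTEGER, racer_id INTEGER, place INTEGER);
CREATE TABLE races_included (race_view TEXT, race_id INTEGER, race_num INTEGER);
-- Race 1 rows come from the question; race 2 rows are invented test data.
INSERT INTO results VALUES
  (1, 0, 32, 2), (1, 1, 32, 3), (1, 2, 32, 1), (1, 3, 32, 6),
  (2, 0, 32, 4), (2, 1, 32, 5);
-- Only race 1 uses the daily views; race 2 has no rows here at all.
INSERT INTO races_included VALUES
  ('Day 1', 1, 0), ('Day 1', 1, 1), ('Day 2', 1, 2), ('Day 2', 1, 3);
""")

# Keep a row if the race has no races_included rows at all,
# or if its race_num is listed for the requested view.
QUERY = """
SELECT r.race_num, r.place
FROM results r
WHERE r.race_id = :race_id
  AND (
        NOT EXISTS (SELECT 1 FROM races_included i
                    WHERE i.race_id = r.race_id)
        OR r.race_num IN (SELECT i.race_num FROM races_included i
                          WHERE i.race_id = r.race_id
                            AND i.race_view = :view)
      )
ORDER BY r.race_num
"""

day1 = conn.execute(QUERY, {"race_id": 1, "view": "Day 1"}).fetchall()
fallback = conn.execute(QUERY, {"race_id": 2, "view": "Day 1"}).fetchall()
print(day1)      # race 1 restricted to the 'Day 1' race_nums
print(fallback)  # race 2 has no races_included rows, so all its rows come back
```

With this shape, races_included only needs rows for the few races that actually use the daily feature.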

Compare rows and get percentage

I found it hard to find a fitting title. For simplicity let's say I have the following table:
cook_id cook_rating
1 2
1 1
1 3
1 4
1 2
1 2
1 1
1 3
1 5
1 4
2 5
2 2
Now I would like to get an output of 'good' cooks. A good cook is someone who has a rating of at least 70% of 1, 2 or 3, but not 4 or 5.
So in my example table, the cook with id 1 has a total of 10 ratings, 7 of which have type 1, 2 and 3. Only three have type 4 or 5. Therefore the cook with id 1 would be a 'good' cook, and the output should be the cook's id with the number of good ratings.
cook_id cook_rating
1 7
The cook with id 2, however, doesn't satisfy my condition, therefore should not be listed at all.
select cook_id, count(cook_rating) - sum(case when cook_rating = 4 OR cook_rating = 5 then 1 else 0 end) as numberOfGoodRatings from cook
where cook_rating in (1,2,3,4,5)
group by cook_id
order by numberOfGoodRatings desc
However, this doesn't take into account the fact that there might be more 4 or 5 than good ratings, resulting in negative outputs. Plus, the requirement of at least 70% is not included.
You can get this with a comparison in your HAVING clause. If you must have just the two columns in the result set, this can be wrapped as a sub-select: select cook_id, positive_ratings FROM (...)
SELECT
cook_id,
sum(cook_rating < 4) as positive_ratings,
count(*) as total_ratings
FROM cook
GROUP BY cook_id
HAVING (positive_ratings / total_ratings) >= 0.70
ORDER BY positive_ratings DESC
Edit: Note that sum(cook_rating < 4) is intended to only count rows where the rating is less than 4. In MySQL a comparison evaluates to 1 when true and 0 when false, so summing it gives the number of matching rows. count(cook_rating < 4) would not work here: count counts every non-NULL value, and a false comparison is 0, not NULL, so it would count every row. If you prefer count, wrap the condition as count(IF(cook_rating < 4, 1, NULL)).
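To check how COUNT and a conditional SUM treat a condition, here is a small sketch on the sample ratings from the question. It uses Python's sqlite3 as a stand-in for MySQL, and a CASE expression rather than a bare comparison so that it does not depend on how either engine represents booleans. Writing the 70% threshold with integer arithmetic is my own choice, to sidestep integer-division differences between engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cook (cook_id INTEGER, cook_rating INTEGER)")
# Sample data from the question: ten ratings for cook 1, two for cook 2.
ratings = [(1, r) for r in (2, 1, 3, 4, 2, 2, 1, 3, 5, 4)] + [(2, 5), (2, 2)]
conn.executemany("INSERT INTO cook VALUES (?, ?)", ratings)

# COUNT(expr) counts every non-NULL value, and a false comparison is 0,
# not NULL, so this returns the total number of ratings per cook.
counted = conn.execute(
    "SELECT cook_id, COUNT(cook_rating < 4) FROM cook "
    "GROUP BY cook_id ORDER BY cook_id"
).fetchall()

# Summing a 1/0 flag counts only the matching rows.
good_cooks = conn.execute("""
    SELECT cook_id,
           SUM(CASE WHEN cook_rating < 4 THEN 1 ELSE 0 END) AS positive_ratings
    FROM cook
    GROUP BY cook_id
    HAVING 100 * SUM(CASE WHEN cook_rating < 4 THEN 1 ELSE 0 END) >= 70 * COUNT(*)
    ORDER BY positive_ratings DESC
""").fetchall()

print(counted)
print(good_cooks)
```

Cook 1 has 7 good ratings out of 10, which meets the 70% threshold exactly, while cook 2 has 1 out of 2 and is filtered out.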
I suggest you change your schema a little to make this kind of query trivial.
Suppose you add 5 columns to your cook table to simply count the number of ratings of each type:
nb_ratings_1 nb_ratings_2 nb_ratings_3 nb_ratings_4 nb_ratings_5
Updating such a table when a new rating is entered in DB is trivial, just as would be recomputing those numbers if having redundancy makes you nervous. And it makes all filterings and sortings fast and easy.

MySQL: Matching inexact values using "ON"

I'm way out of my league here...
I have a mapping table (table1) to assign particular values (value) to a whole number (map_nu). My second table (table2), is a collection of averages (avg) for each user (user_id).
(I couldn't figure out how to properly make a markdown table, please feel free to edit!)
table1:
value map_nu
1 1
1.045 2
1.09 3
1.135 4
1.18 5
1.225 6
1.27 7
1.315 8
1.36 9
1.405 10
table2:
user_id avg
1 1.111
2 1.2
3 1.33333
4 1
5 1.389
6 1.42
7 1.07
The value map_nu is a special number that each user gets assigned according to their average. I need to find a way to match the averages from table2 to the closest value in table1. I only need to match to the second digit past the decimal point, so I've added the TRUNCATE function:
SELECT table2.user_id, map_nu
FROM `table1`
JOIN table2 ON TRUNCATE(table1.value,2)=TRUNCATE(table2.avg,2)
I still miss the values that don't match the averages exactly. Is there a way to pick the nearest truncated value, or even to round to the second decimal? Rounding up or down won't matter as long as it's applied to all values the same way.
I am trying to have the following result (if rounded up):
(user_id)(Map_nu)
----
1 4
2 6
3 6
4 1
5 10
6 11
7 3
Thanks!
I think you might have to do this in 2 separate queries. There is no 'nearest' operator in SQL, so you can either calculate it in your software, or you could use
select map_nu from table1 ORDER BY abs(value - $avg) LIMIT 1
inside a loop. However, that cannot be used directly as a join condition, since ORDER BY and LIMIT are not valid in an ON clause.
Another way of looking at it: your map_nu and value seem to be deterministic in relation to each other, with value = 1 + ((map_nu - 1) * 0.045). So maybe you could make use of that fact and calculate an integer based on that equation, assuming that relationship holds true for all values of map_nu.
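Inverting that equation gives map_nu = ROUND((avg - 1) / 0.045) + 1, which needs no lookup table at all. Here is a rough sketch of that calculation on the averages from the question, using Python's sqlite3 as a stand-in for MySQL. It assumes the 0.045 step really does hold for every row, and it does not clamp the result to the range of map_nu values that actually exist in table1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table2 (user_id INTEGER, avg REAL)")
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(1, 1.111), (2, 1.2), (3, 1.33333), (4, 1),
     (5, 1.389), (6, 1.42), (7, 1.07)],
)

# Invert value = 1 + (map_nu - 1) * 0.045 and round to the nearest step.
rows = conn.execute("""
    SELECT user_id, ROUND((avg - 1) / 0.045) + 1 AS map_nu
    FROM table2
    ORDER BY user_id
""").fetchall()

# ROUND() may come back as a float, so normalise to int on the Python side.
computed = [(user_id, int(map_nu)) for user_id, map_nu in rows]
print(computed)
```

For this data the formula picks the same map_nu as searching table1 for the closest value would.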
This is an awkward database design. What is the data representing and what are you trying to solve? There might be a better way.
Maybe do something like...
SELECT a.user_id, b.map_nu, abs(a.avg - b.value)
FROM
table2 a
join table1 b
left join table1 c on abs(a.avg - b.value) > abs(a.avg - c.value)
where c.value is null
order by a.user_id
This doesn't actually produce the same output as the one you were expecting (it doesn't do any rounding), though you should be able to tweak it from there. The query above produces the output below (with the data you've provided):
user_id map_nu abs(a.avg - b.value)
------- ------ --------------------
1 3 0.0209999999999999
2 5 0.02
3 8 0.01833
4 1 0
5 10 0.016
6 10 0.0149999999999999
7 3 0.02
Beware, though, if you're dealing with large tables. Check the EXPLAIN output of the above query to see whether it is practical to run it within MySQL or whether it is better done outside it.
Also note: it will produce duplicate rows if an avg value is equidistant from two values in table1 (e.g. if the values for map_nu 11 and 12 are 2 and 3 and someone gets an avg of 2.5). Your question doesn't really specify what to do in that case, so you might want to take it into account.
It's taking a little extra work, but I figure the easiest way to get my results will be to map all values to the second decimal place in table1:
1 1
1.01 1
1.02 1
1.03 1
1.04 1
1.05 2
1.06 2
1.07 2
1.08 2
1.09 3
1.1 3
1.11 3
1.12 3
1.13 3
1.14 4
...
Thanks for the suggestions! Sorry I couldn't present the question more clearly.

MySQL: get differences of each sorted column in set of rows

Here is a simple scenario with table characters:
CharacterName GameTime Gold Live
Foo 10 100 3
Foo 20 100 2
Foo 30 95 2
How do I get this output for the query SELECT Gold, Live FROM characters WHERE CharacterName = 'Foo' ORDER BY GameTime:
Gold Live
100 3
0 -1
-5 0
using a MySQL stored procedure (or query), if it's even possible? I thought of using 2 arrays, as one would normally do in a server-side language, but MySQL doesn't have array types.
While I'm aware it's probably easier to do in PHP (my server-side language), I want to know if it's possible to do in MySQL, just as a learning exercise.
Do you have an ID on your table?
GameID CharacterName GameTime Gold Live
----------- ------------- ----------- ----------- -----------
1 Foo 10 100 3
2 Foo 20 100 2
3 Foo 30 95 2
If so you could do a staggered join onto itself
SELECT
c.CharacterName,
CASE WHEN c_offset.Gold IS NOT NULL THEN c.Gold - c_offset.Gold ELSE c.Gold END AS Gold,
CASE WHEN c_offset.Live IS NOT NULL THEN c.Live - c_offset.Live ELSE c.Live END AS Live
FROM Characters c
LEFT OUTER JOIN Characters c_offset
ON c.GameID - 1 = c_offset.GameID
ORDER BY
c.GameTime
Essentially it joins each game row to the previous game row and does a diff between the values. That returns the following for me.
CharacterName Gold Live
------------- ----------- -----------
Foo 100 3
Foo 0 -1
Foo -5 0
One possible solution using a temporary table:
CREATE TABLE characters_by_gametime (
id INTEGER AUTO_INCREMENT PRIMARY KEY,
gold INTEGER,
live INTEGER);
INSERT INTO characters_by_gametime (gold, live)
SELECT gold, live
FROM characters
ORDER BY GameTime;
SELECT
c1.id,
c1.gold - IFNULL(c2.gold, 0) AS gold,
c1.live - IFNULL(c2.live, 0) AS live
FROM
characters_by_gametime c1
LEFT JOIN characters_by_gametime c2
ON c1.id = c2.id + 1
ORDER BY
c1.id;
Of course Eoin's solution is better if your id column follows the order you want in the output.
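If there is no usable id column and you would rather not build a helper table, another option is to look up the previous row by GameTime with a correlated subquery. This is a minimal sketch on the sample rows from the question, run through Python's sqlite3 as a stand-in for MySQL. It assumes GameTime is unique per character, and the PREV template is just a hypothetical helper to avoid repeating the subquery text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE characters "
    "(CharacterName TEXT, GameTime INTEGER, Gold INTEGER, Live INTEGER)"
)
conn.executemany(
    "INSERT INTO characters VALUES (?, ?, ?, ?)",
    [("Foo", 10, 100, 3), ("Foo", 20, 100, 2), ("Foo", 30, 95, 2)],
)

# Subquery template: the given column from the same character's previous row,
# i.e. the row with the largest GameTime below the current one.
PREV = """(SELECT p.{col} FROM characters p
           WHERE p.CharacterName = c.CharacterName
             AND p.GameTime < c.GameTime
           ORDER BY p.GameTime DESC LIMIT 1)"""

# The first row has no predecessor, so COALESCE makes it diff against 0.
diffs = conn.execute(f"""
    SELECT c.Gold - COALESCE({PREV.format(col='Gold')}, 0) AS Gold,
           c.Live - COALESCE({PREV.format(col='Live')}, 0) AS Live
    FROM characters c
    WHERE c.CharacterName = 'Foo'
    ORDER BY c.GameTime
""").fetchall()

print(diffs)
```

On MySQL 8.0 and later, the LAG() window function would be another way to express the same previous-row lookup.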