Prioritizing rows in a mysql result - mysql

I am trying to find a simple solution to a MySQL problem. I have two columns, A and B, and in some rows these columns hold the values X and Y respectively. The data looks like the example below; the table may also contain other values, as shown.
A | B
X | Y
X | Z
F | Y
X | Y
S | T
X | T
S | Y
Now I want to run a query which finds X and Y in their respective columns but shows the results by priority. That is, rows where X and Y appear together should come up first, followed by the rows where only one of the columns matches.
Kindly note that there may be more columns, and the prioritization would apply accordingly. So the more columns match, the higher the row's priority should be when fetching results.
Data comparison is per field only.
I'm just exploring whether this situation can be handled by MySQL or whether I have to handle it in PHP.

You can order by value; it should be something like
... ORDER BY FIELD(A, 'X') DESC, FIELD(B, 'Y') DESC

You can use a "formula" to compute the weights for your ordering:
SELECT A, B FROM TEST
ORDER BY CASE WHEN (A = 'X' AND B = 'Y') THEN 1 ELSE 0 END DESC
Check the example online: http://sqlfiddle.com/#!9/6545e4/8
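The question also asks that rows matching more columns rank higher, not only rows matching all of them. A minimal sketch of that idea, assuming each comparison evaluates to 1 or 0 so the matches can be summed into a score. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the `score` alias and the WHERE filter are illustrative choices, not part of the answers above.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the table name TEST follows the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEST (A TEXT, B TEXT)")
conn.executemany(
    "INSERT INTO TEST VALUES (?, ?)",
    [("X", "Y"), ("X", "Z"), ("F", "Y"), ("X", "Y"), ("S", "T"), ("X", "T"), ("S", "Y")],
)

# Each comparison yields 1 (match) or 0 (no match); their sum is the row's score.
ranked = conn.execute(
    """
    SELECT A, B, (A = 'X') + (B = 'Y') AS score
    FROM TEST
    WHERE A = 'X' OR B = 'Y'
    ORDER BY score DESC
    """
).fetchall()

for row in ranked:
    print(row)
```

With more columns, one more comparison is added to the sum per column, so the ordering extends without changing the shape of the query.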

Related

Resize tables based on number of records in SSRS

I have two table having data like below:
SELECT *
FROM [dbo].[TestTable_1]
ID Value
----------
1 gjha
2 dc
3 d
4 ds
5 dg
6 hn
2nd table:
SELECT *
FROM [dbo].[TestTable_2]
Value
-----
jklsa
dfv
b
grt
trj
h
muik
rg
kuu
wd
gb
nm
wef
I'm fetching the data in SSRS report as below:
Question is:
How can I keep the table sizes the same? That is, if the small table in the SSRS report has 6 records (as it does in this case),
the bigger one should be limited to the same number of rows, and the extra records in the large table should shift to the right.
Here is the expected output from SSRS
Value   Value
------  --------------------
gjha    jklsa | muik | wef
dc      dfv   | rg   |
d       b     | kuu  |
ds      grt   | wd   |
dg      trj   | gb   |
hn      h     | nm   |
Note: The above details are just an example; the number of records is dynamic.
This is not a full answer as it's just what came to mind and is completely untested.
First thing is to search SO for ways to create a multi-column table; there are plenty of answers already, so I won't explain in detail here. They usually involve adding a RowNumber to each row, which you can then use to calculate a matrix row and matrix column number; the column number can be used in a matrix as the column group. (E.g. if the row limit is 6 and the row number is 14, that will have a final row number of 2 (14 mod 6 = 2) and a column number of 3, as Floor(14/6)+1 = 3.)
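The row and column arithmetic can be sketched in a few lines of Python. This is only an illustration of the calculation, not SSRS code, and `matrix_position` is a hypothetical helper name. One assumption differs from the formula as written: the row number is shifted to zero-based before dividing, because a plain `n mod limit` gives 0 for the last row of each column (row 6 with a limit of 6).

```python
def matrix_position(row_number, row_limit):
    """Map a 1-based row number to a 1-based (matrix_row, matrix_column) pair."""
    zero_based = row_number - 1
    return zero_based % row_limit + 1, zero_based // row_limit + 1


# The 13 values of the larger table from the question, wrapped at 6 rows.
values = ["jklsa", "dfv", "b", "grt", "trj", "h", "muik",
          "rg", "kuu", "wd", "gb", "nm", "wef"]
row_limit = 6

grid = {}
for n, value in enumerate(values, start=1):
    matrix_row, matrix_column = matrix_position(n, row_limit)
    grid.setdefault(matrix_row, []).append(value)

for matrix_row in sorted(grid):
    print(" | ".join(grid[matrix_row]))
```

Row number 14 with a limit of 6 still maps to row 2, column 3, as in the example above, and the printed grid matches the expected output in the question.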
Next, create a dataset that just gets the smaller of the row counts of your two tables. Something like
DECLARE @a int
DECLARE @b int
SELECT @a = COUNT(*) FROM myTableA
SELECT @b = COUNT(*) FROM myTableB
SELECT CASE WHEN @a<=@b THEN @a ELSE @b END AS maxRows
Now you have the size of the smallest table, you can pass that as a parameter to the proc that gets the actual data from the two tables (this would be 6 in our example above)
I just answered a similar question here: https://stackoverflow.com/a/56350614/2033717
You can adapt this solution to your situation by replacing the 3 in the expressions with:
=Floor(Count(Fields!ColumnName.Value, "Dataset2") / Count(Fields!ColumnName.Value, "Dataset1"))
In other words, you're determining how many columns you need. And then grouping each row of the dataset into rows and columns of the matrix. This will work if you know the second table can be bigger than the first, but I'm not sure if it will work both ways without some additional conditions on the expressions.

MySQL Query which require multiple joins

I have a system that is used to log children's behavior. If a child is naughty, it is logged as a negative; if the child behaves well, it is logged as a positive.
For instance - if a child is rude it gets a 'Rude' negative and this is logged in the system with minus x points.
My structure can be seen in this sqlfiddle - http://sqlfiddle.com/#!9/46904
In the users_rewards_logged table, the reward_id column is a foreign key linked to either the deductions OR achievements table, depending on the type column.
If type is 1, it is a deduction reward; if type is 2, it is an achievement reward.
I basically want a query to list out something like this:
+------------------------------+
| reward | points | count |
+------------------------------+
| Good Work | 100 | 1 |
| Rude | -50 | 2 |
+------------------------------+
So it tallies up the figures and matches the reward depending on type (1 is a deduction, 2 is an achievement).
What is a good way to do this, based on the sqlfiddle?
Here's a query that gets the above desired results:
SELECT COALESCE(ua.name, ud.name) AS reward,
SUM(url.points) AS points, COUNT(url.logged_id) AS count
FROM users_rewards_logged url
LEFT JOIN users_deductions ud
ON ud.deduction_id = url.reward_id
AND url.type = 1
LEFT JOIN users_achievements ua
ON ua.achievement_id = url.reward_id
AND url.type = 2
GROUP BY url.reward_id, url.type
Your SQLFiddle had the points and type columns in the wrong order for the table users_rewards_logged.
Here's the fixed SQLFiddle with the result:
reward points count
Good Work 100 1
Rude -50 2
Although eggyal is correct--this is rather bad design for your data--what you ask can be done, but requires a UNION clause:
SELECT users_achievements.name, users_rewards_logged.points, COUNT(*)
FROM users_rewards_logged
INNER JOIN users_achievements ON users_achievements.achievement_id = users_rewards_logged.reward_id
WHERE users_rewards_logged.type = 2
GROUP BY 1, 2
UNION
SELECT users_deductions.name, users_rewards_logged.points, COUNT(*)
FROM users_rewards_logged
INNER JOIN users_deductions ON users_deductions.deduction_id = users_rewards_logged.reward_id
WHERE users_rewards_logged.type = 1
GROUP BY 1, 2
There's no reason NOT to combine the achievements and deductions tables and just use non-conflicting codes. If you combined the tables, then you would no longer need the UNION clause--your query would be MUCH simpler.
I noticed that you have two tables (users_deductions and users_achievements) that define the type of reward. As @eggyal stated, you are violating the principle of orthogonal design, which leaves your schema poorly normalized.
So, I have combined the tables users_deductions and users_achievements in one table called reward_type.
The result is in this fiddle: http://sqlfiddle.com/#!9/813d5/6
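To make the combined-table idea concrete, here is a minimal sketch using Python's built-in sqlite3 module in place of MySQL. The column layout of `reward_type`, the trimmed-down `users_rewards_logged` table, and the sample rows (two "Rude" entries of -25 points each) are assumptions made for illustration; the actual schema lives in the fiddles linked above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical merged lookup table: one row per reward, positive or negative.
    CREATE TABLE reward_type (reward_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE users_rewards_logged (
        logged_id INTEGER PRIMARY KEY,
        reward_id INTEGER REFERENCES reward_type (reward_id),
        points INTEGER
    );
    -- Made-up sample data chosen to reproduce the totals asked for in the question.
    INSERT INTO reward_type VALUES (1, 'Good Work'), (2, 'Rude');
    INSERT INTO users_rewards_logged (reward_id, points)
    VALUES (1, 100), (2, -25), (2, -25);
    """
)

# With a single lookup table, one join and one GROUP BY replace the UNION.
totals = conn.execute(
    """
    SELECT rt.name AS reward, SUM(url.points) AS points, COUNT(*) AS cnt
    FROM users_rewards_logged url
    JOIN reward_type rt ON rt.reward_id = url.reward_id
    GROUP BY rt.reward_id, rt.name
    ORDER BY rt.name
    """
).fetchall()

print(totals)
```

The type column is no longer needed to pick the right lookup table, since every reward_id now refers to exactly one row in `reward_type`.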

Select one value from a group based on order from other columns

Problem
Suppose I have this table tab (fiddle available).
| g | a | b | v |
---------------------
| 1 | 3 | 5 | foo |
| 1 | 4 | 7 | bar |
| 1 | 2 | 9 | baz |
| 2 | 1 | 1 | dog |
| 2 | 5 | 2 | cat |
| 2 | 5 | 3 | horse |
| 2 | 3 | 8 | pig |
I'm grouping rows by g, and for each group I want one value from column v. However, I don't want any value, but I want the value from the row with maximal a, and from all of those, the one with maximal b. In other words, my result should be
| 1 | bar |
| 2 | horse |
Current solution
I know of a query to achieve this:
SELECT grps.g,
(SELECT v FROM tab
WHERE g = grps.g
ORDER BY a DESC, b DESC
LIMIT 1) AS r
FROM (SELECT DISTINCT g FROM tab) grps
Question
But I consider this query rather ugly, mostly because it uses a dependent subquery, which feels like a real performance killer. So I wonder whether there is an easier solution to this problem.
Expected answers
The most likely answer I expect to this question would be some kind of add-on or patch for MySQL (or MariaDB) which does provide a feature for this. But I'll welcome other useful inspirations as well. Anything which works without a dependent subquery would qualify as an answer.
If your solution only works for a single ordering column, i.e. couldn't distinguish between cat and horse, feel free to suggest that answer as well as I expect it to be still useful to the majority of use cases. For example, 100*a+b would be a likely way to order the above data by both columns while still using only a single expression.
I have a few pretty hackish solutions in mind, and might add them after a while, but I'll first look and see whether some nice new ones pour in first.
Benchmark results
As it is pretty hard to compare the various answers just by looking at them, I've run some benchmarks on them. This was run on my own desktop, using MySQL 5.1. The numbers won't compare to any other system, only to one another. You probably should be doing your own tests with your real-life data if performance is crucial to your application. When new answers come in, I might add them to my script, and re-run all the tests.
100,000 items, 1,000 groups to choose from, InnoDB:
0.166s for MvG (from question)
0.520s for RichardTheKiwi
2.199s for xdazz
19.24s for Dems (sequential sub-queries)
48.72s for acatt
100,000 items, 50,000 groups to choose from, InnoDB:
0.356s for xdazz
0.640s for RichardTheKiwi
0.764s for MvG (from question)
51.50s for acatt
too long for Dems (sequential sub-queries)
100,000 items, 100 groups to choose from, InnoDB:
0.163s for MvG (from question)
0.523s for RichardTheKiwi
2.072s for Dems (sequential sub-queries)
17.78s for xdazz
49.85s for acatt
So it seems that my own solution so far isn't all that bad, even with the dependent subquery. Surprisingly, the solution by acatt, which uses a dependent subquery as well and which I therefore would have considered about the same, performs much worse. Probably something the MySQL optimizer can't cope with. The solution RichardTheKiwi proposed seems to have good overall performance as well. The other two solutions depend heavily on the structure of the data. With many small groups, xdazz' approach outperforms all others, whereas the solution by Dems performs best (though still not exceptionally well) for few large groups.
SELECT g, a, b, v
FROM (
SELECT *,
@rn := IF(g = @g, @rn + 1, 1) rn,
@g := g
FROM (select @g := null, @rn := 0) x,
tab
ORDER BY g, a desc, b desc, v
) X
WHERE rn = 1;
Single pass. All the other solutions look O(n^2) to me.
This approach doesn't use a subquery.
SELECT t1.g, t1.v
FROM tab t1
LEFT JOIN tab t2 ON t1.g = t2.g AND (t1.a < t2.a OR (t1.a = t2.a AND t1.b < t2.b))
WHERE t2.g IS NULL
Explanation:
The LEFT JOIN works on the basis that when t1.a is at its maximum value, there is no t2.a with a greater value, so the t2 columns of that row will be NULL.
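A quick way to check the anti-join against the question's sample data is the sketch below. It runs the same query through Python's built-in sqlite3 module rather than MySQL; the only addition is an ORDER BY so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (g INTEGER, a INTEGER, b INTEGER, v TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?)",
    [
        (1, 3, 5, "foo"), (1, 4, 7, "bar"), (1, 2, 9, "baz"),
        (2, 1, 1, "dog"), (2, 5, 2, "cat"), (2, 5, 3, "horse"), (2, 3, 8, "pig"),
    ],
)

# A row survives only if no row in the same group has a greater (a, b) pair,
# i.e. the LEFT JOIN found nothing and left the t2 columns NULL.
winners = conn.execute(
    """
    SELECT t1.g, t1.v
    FROM tab t1
    LEFT JOIN tab t2
      ON t1.g = t2.g
     AND (t1.a < t2.a OR (t1.a = t2.a AND t1.b < t2.b))
    WHERE t2.g IS NULL
    ORDER BY t1.g
    """
).fetchall()

print(winners)
```

The tie between cat and horse (both with a = 5) is resolved by the second half of the join condition, which compares b only when the a values are equal.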
Many RDBMS have constructs that are particularly suited to this problem. MySQL isn't one of them.
This leads you to three basic approaches.
Check each record to see if it is one you want, using a correlated sub-query in an EXISTS clause. (@acatt's answer. I understand that MySQL doesn't always optimise this very well, but ensure that you have a composite index on (g,a,b) before assuming that it won't.)
Do a half cartesian product to fulfil the same check. Any record which does not join is a target record. Where each group ('g') is large, this can quickly degrade performance (if there are 10 records for each unique value of g, this will yield ~50 records and discard 49; for a group size of 100 it yields ~5000 records and discards 4999), but it is great for small group sizes. (@xdazz's answer.)
Or use multiple sub-queries to determine the MAX(a) and then the MAX(b)...
Multiple sequential sub-queries...
SELECT
yourTable.*
FROM
(SELECT g, MAX(a) AS a FROM yourTable GROUP BY g ) AS searchA
INNER JOIN
(SELECT g, a, MAX(b) AS b FROM yourTable GROUP BY g, a) AS searchB
ON searchA.g = searchB.g
AND searchA.a = searchB.a
INNER JOIN
yourTable
ON yourTable.g = searchB.g
AND yourTable.a = searchB.a
AND yourTable.b = searchB.b
Depending on how MySQL optimises the second sub-query, this may or may not be more performant than the other options. It is, however, the longest (and potentially least maintainable) code for the given task.
Assuming a composite index on all three search fields (g, a, b), I would presume it to be best for large group sizes of g. But that should be tested.
For small group sizes of g, I'd go with @xdazz's answer.
EDIT
There is also a brute force approach.
Create an identical table, but with an AUTO_INCREMENT column as an id.
Insert your table into this clone, ordered by g, a, b.
The id's can then be found with SELECT g, MAX(id).
This result can then be used to look-up the v values you need.
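The four steps above can be sketched as follows, using Python's built-in sqlite3 module in place of MySQL. The clone name `tab_sorted` and the alias `m` are hypothetical, SQLite's `INTEGER PRIMARY KEY AUTOINCREMENT` stands in for an AUTO_INCREMENT column, and the sketch assumes ids are assigned in the order the sorted SELECT produces its rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tab (g INTEGER, a INTEGER, b INTEGER, v TEXT);
    INSERT INTO tab VALUES
        (1, 3, 5, 'foo'), (1, 4, 7, 'bar'), (1, 2, 9, 'baz'),
        (2, 1, 1, 'dog'), (2, 5, 2, 'cat'), (2, 5, 3, 'horse'), (2, 3, 8, 'pig');

    -- Steps 1 and 2: clone with an auto-incrementing id, filled in sorted order.
    CREATE TABLE tab_sorted (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        g INTEGER, a INTEGER, b INTEGER, v TEXT
    );
    INSERT INTO tab_sorted (g, a, b, v)
    SELECT g, a, b, v FROM tab ORDER BY g, a, b;
    """
)

# Steps 3 and 4: the highest id per group marks the row with maximal (a, b);
# join back on that id to look up v.
picked = conn.execute(
    """
    SELECT s.g, s.v
    FROM tab_sorted s
    JOIN (SELECT g, MAX(id) AS id FROM tab_sorted GROUP BY g) m ON m.id = s.id
    ORDER BY s.g
    """
).fetchall()

print(picked)
```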
This is unlikely to be the best approach. If it is, it is effectively a condemnation of the MySQL optimiser's ability to deal with this type of problem.
That said, every engine has its weak spots. So, personally, I try everything until I think I understand how the RDBMS is behaving and can make my choice :)
EDIT
Example using ROW_NUMBER(). (Oracle, SQL Server, PostgreSQL, etc.)
SELECT
*
FROM
(
SELECT
ROW_NUMBER() OVER (PARTITION BY g ORDER BY a DESC, b DESC) AS sequence_id,
*
FROM
yourTable
)
AS data
WHERE
sequence_id = 1
This can be solved using a correlated query:
SELECT g, v
FROM tab t
WHERE NOT EXISTS (
SELECT 1
FROM tab
WHERE g = t.g
AND (a > t.a
OR (a = t.a AND b > t.b))
)

mySQL - return best results

I want to have a query that returns the best results from a table.
I am defining the best results to be the addition of two columns a + b (each column holds an int)
ie:
entry a b
1 4 5
2 3 2
3 20 30
Entry 3 would be returned because a + b is the highest in this case.
Is there a way to do this? One idea I had was to create another column in the table which holds the addition of a and b and then ORDER by DESC, but that seems a little bit messy.
Any ideas?
Thanks!
SELECT *
FROM mytable
ORDER BY
a + b DESC
LIMIT 1
Adding another column, however, would be a good option, since you could index this column which would improve the query.
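As a quick check of the query against the question's sample rows, here is a small sketch run through Python's built-in sqlite3 module instead of MySQL. The `total` alias is an illustrative addition so the sum is also visible in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (entry INTEGER PRIMARY KEY, a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, 4, 5), (2, 3, 2), (3, 20, 30)],
)

# Sort by the computed sum and keep only the top row.
best = conn.execute(
    """
    SELECT entry, a, b, a + b AS total
    FROM mytable
    ORDER BY total DESC
    LIMIT 1
    """
).fetchone()

print(best)
```

Changing the LIMIT returns the top N entries instead of only the best one.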

MySQL: adding a position column

Given a table:
id | score | position
=====================
1 | 20 | 2
2 | 10 | 3
3 | 30 | 1
What query can I use to optimally set the position columns using the (not nullable) score column?
(I'm sure I've seen this before but I can't seem to get the correct keywords to find it!)
set @rank = 0;
update tbl a join (select score, @rank:=@rank+1 as rank from tbl group by score
order by score desc) b on a.score = b.score set a.position = b.rank;
That would update the position column in one fell swoop. Equal scores get equal positions.
SELECT * FROM your_table ORDER BY score DESC;
Your position column appears to be redundant, although it is hard to tell from the information given in the question. The functionality you appear to want is accomplished using database row ordering, seen in the above example with the 'ORDER BY' expression. Even if you would want a position column for some good reason, remember that most likely there exists an index for the score column anyway, which would in most cases be doing exactly the same thing that a position column would do. Either that, or I completely misunderstand your question.
Calculate it in the language which you are using. Thus your solution will be:
more portable
more readable
equally efficient
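A minimal sketch of what that could look like in application code, here in Python. It assumes the rows have already been fetched as an id-to-score mapping and that equal scores should share a position; both choices are assumptions, not something the question specifies.

```python
# id -> score, taken from the question's sample table
scores = {1: 20, 2: 10, 3: 30}

# Distinct scores, highest first; a score's 1-based index is its position.
ordered = sorted(set(scores.values()), reverse=True)
position_of_score = {score: pos for pos, score in enumerate(ordered, start=1)}

# Look up each row's position through its score.
positions = {row_id: position_of_score[score] for row_id, score in scores.items()}

print(positions)
```

For the sample data this gives id 3 position 1, id 1 position 2 and id 2 position 3, matching the position column shown in the question.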