Missing an option to use SQL's HAVING clause in F3 - mysql

Is there a way to use MySQL's HAVING clause with any of Fat Free Framework's SQL Mapper object's methods? Let's assume I have the following DB table:
+----+-------+--------+
| id | score | weight |
+----+-------+--------+
| 2 | 1 | 1 |
| 2 | 2 | 3 |
| 2 | 3 | 1 |
| 2 | 2 | 2 |
| 3 | 1 | 4 |
| 3 | 3 | 1 |
| 3 | 4 | 3 |
+----+-------+--------+
Now I would like to run the following query:
SELECT id, SUM(score*weight)/SUM(weight) AS weighted_score FROM tableName GROUP BY id HAVING weighted_score>2
Truth be told, I would actually like to count the number of these records, but the count method doesn't support $options.
I can run the query without a HAVING clause and then loop through the results to check weighted_score against the value, but with a growing number of records this will become more and more resource-consuming. Is there any built-in solution to this problem?
EDIT 1:
The way I know how to do it if there is no support for the HAVING clause (based on the manual):
$databaseObject = new DB\SQL(...);
$dataMapper = new \DB\SQL\Mapper($databaseObject, "tableName");
$dataMapper->weightedScore = "SUM(weight*score)/SUM(weight)";
$usersInfo = $dataMapper->find([],["group"=>"id"]);
$place = 1;
foreach ( $usersInfo as $userInfo ) {
if ( $userInfo->weightedScore > 2) $place++;
}
If I were able to use the HAVING clause, the foreach loop would not be needed and the number of items loaded by the query would be reduced:
$databaseObject = new DB\SQL(...);
$dataMapper = new \DB\SQL\Mapper($databaseObject, "tableName");
$dataMapper->weightedScore = "SUM(weight*score)/SUM(weight)";
$usersInfo = $dataMapper->find([],["group"=>"id", "having"=>"weighted_score<2"]); // rough idea
$place = count($usersInfo);
And if the count method supported $options, it would be even simpler, and it would save memory used by the app, as no records would be loaded:
$databaseObject = new DB\SQL(...);
$dataMapper = new \DB\SQL\Mapper($databaseObject, "tableName");
$dataMapper->weightedScore = "SUM(weight*score)/SUM(weight)";
$place = $dataMapper->count([],["group"=>"id", "having"=>"weighted_score<2"]); // rough idea

Use a subquery:
SELECT COUNT(*) FROM (SELECT id, SUM(score*weight)/SUM(weight) AS weighted_score FROM tableName GROUP BY id) AS t WHERE weighted_score > 2;
Hope that helps.

As far as I know, you can put the HAVING clause into the group option:
$usersInfo = $dataMapper->find([],["group"=>"id HAVING weighted_score<2"]);
Another way could be to create a VIEW in MySQL and filter the records on a virtual field in that view.
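
To make the two suggestions above concrete, here is a minimal sketch of both the HAVING query and the counting subquery. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL so that it runs anywhere; the table name `scores` is hypothetical, and the `1.0 *` factor is only there because SQLite, unlike MySQL, performs integer division on integer operands.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (id INTEGER, score INTEGER, weight INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [(2, 1, 1), (2, 2, 3), (2, 3, 1), (2, 2, 2), (3, 1, 4), (3, 3, 1), (3, 4, 3)],
)

# Variant 1: filter the aggregated groups directly with HAVING.
having_rows = conn.execute(
    """
    SELECT id, 1.0 * SUM(score * weight) / SUM(weight) AS weighted_score
    FROM scores
    GROUP BY id
    HAVING weighted_score > 2
    """
).fetchall()

# Variant 2: wrap the aggregate in a derived table and count its rows,
# so no individual records have to be loaded by the application.
count = conn.execute(
    """
    SELECT COUNT(*) FROM (
        SELECT id, 1.0 * SUM(score * weight) / SUM(weight) AS weighted_score
        FROM scores
        GROUP BY id
    ) AS t
    WHERE weighted_score > 2
    """
).fetchone()[0]

print(having_rows)  # [(3, 2.375)]
print(count)        # 1
```

With the sample data, id 2 has a weighted score of exactly 2.0 and id 3 has 2.375, so only id 3 passes the `> 2` filter.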


MySQL: Performance on user settings stored in one row (flat) versus multiple rows (key-value pairs)

Which of the following two queries would be faster?
Setup A
userSetting table just includes all parameters as columns
userSettingId | userId | marketingEmail | weeklyEmail | pushNotifications
--------------------------------------------------------------------
120 | 1 | 1 | 1 | 0
select userSetting.userId, user.email
from userSetting
INNER JOIN user ON userSetting.userId = user.userId
where marketingEmail = 1;
or
Setup B
userBoolSetting table, which keeps key/value pairs, where the value is a boolean (0 or 1)
userBoolSettingId | userId | description | value
-----------------------------------------------------------------
121 | 1 | marketingEmail | 1
122 | 1 | weeklyEmail | 1
123 | 1 | pushNotifications | 0
select userBoolSetting.userId, user.email
from userBoolSetting
INNER JOIN user ON userBoolSetting.userId = user.userId
where description = 'marketingEmail'
AND value = 1;
Also, for the sake of clarity, I'd be looking at performance on a somewhat larger table than these examples. Which query would perform better for a larger data set, say 50-100 parameters rather than the 3 shown?
As has been pointed out many times in this forum, splaying an array of things across columns is not good because it is unmaintainable, etc.
Performance of fetching a hundred rows is not bad.
Throwing them into a JSON string is another option.
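
As a rough illustration of the two layouts from the question, the sketch below builds both in an in-memory SQLite database (a stand-in for MySQL) and checks that the two queries return the same users. The email address is made up for the example, and nothing here measures performance; it only shows that the key/value layout answers the same question without one column per setting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user (userId INTEGER PRIMARY KEY, email TEXT);

    -- Setup A: one column per setting.
    CREATE TABLE userSetting (
        userSettingId INTEGER PRIMARY KEY, userId INTEGER,
        marketingEmail INTEGER, weeklyEmail INTEGER, pushNotifications INTEGER
    );

    -- Setup B: one row per (user, setting) pair.
    CREATE TABLE userBoolSetting (
        userBoolSettingId INTEGER PRIMARY KEY, userId INTEGER,
        description TEXT, value INTEGER
    );

    INSERT INTO user VALUES (1, 'a@example.com');  -- hypothetical address
    INSERT INTO userSetting VALUES (120, 1, 1, 1, 0);
    INSERT INTO userBoolSetting VALUES
        (121, 1, 'marketingEmail', 1),
        (122, 1, 'weeklyEmail', 1),
        (123, 1, 'pushNotifications', 0);
    """
)

flat = conn.execute(
    """
    SELECT s.userId, u.email
    FROM userSetting s
    INNER JOIN user u ON s.userId = u.userId
    WHERE s.marketingEmail = 1
    """
).fetchall()

key_value = conn.execute(
    """
    SELECT b.userId, u.email
    FROM userBoolSetting b
    INNER JOIN user u ON b.userId = u.userId
    WHERE b.description = 'marketingEmail' AND b.value = 1
    """
).fetchall()

print(flat)       # [(1, 'a@example.com')]
print(key_value)  # [(1, 'a@example.com')]
```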

SELECT from Union x 3 using filter of another table

Background
I have a web application which must remove entries from other tables, filtered through a selection of 'tielists' (table 1 -> item_table 1, table 2, table 3, ...). My result set is going to be filthy big unless I filter it through another table, using a user_id. Can someone please help me structure my statement as needed? TY!
Tables
cars_belonging_to_user
-----------------------------
ID | user_id | make | model
----------------------------
1 | 1 | Toyota | Camry
2 | 1 |Infinity| Q55
3 | 1 | DMC | DeLorean
4 | 2 | Acura | RSX
Okay, Now the three 'tielists'
name: tielist_one
-----------------------------
id | id_of_car | id_x | id_y
-----------------------------
1 | 1 | 12 | 22
2 | 2 | 23 | 32
name: tielist_two
-----------------------------
id | id_of_car | id_x | id_z
-----------------------------
1 | 3 | 32 | 22
name: tielist_three
-----------------------------
id | id_of_car | id_x | id_a
-----------------------------
1 | 4 | 45 | 2
Result Set and Code
echo name_of_tielist_table
// I can structure if statements to echo result sets based upon the name
// Future Methodology: if car_id is in tielist_one, delete id_x from x_table, delete id_y from y_table...
// My output should be a double select base:
--SELECT * tielists from WHERE car_id is 1... output name of tielist... then
--SELECT * from specific_tielist where car_id is 1.....delete x_table, delete y_table...
Considering the list will be massive, and the tielist equally long, I must filter the results where car_id(id) = $variable && user_id = $id....
Side Notes
Only one car id will appear once in any single tielist..
This select statement MUST be filtered with user_id = $variable... (and remember, i'm looking for which car id too)
I MUST HAVE THE NAME of the tielist it comes from able to be echo'd into a variable...
I will only be looking for one single id_of_car at any given time, because this select will be contained in a foreach loop.
I was thinking a UNION ALL of the items would do the trick to select the row, but how can I get the name of the tielist the row is in, and how can the filter on user_id be applied?
If you want performance, I would suggest left outer join instead of union all. This will allow the query to make efficient use of indexes for your purpose.
Based on what you say, a car is in exactly one of the lists. This is important for this method to work. Here is the SQL:
select cu.*,
coalesce(tl1.id_x, tl2.id_x, tl3.id_x) as id_x,
tl1.id_y, tl2.id_z, tl3.id_a,
(case when tl1.id is not null then 'One'
when tl2.id is not null then 'Two'
when tl3.id is not null then 'Three'
end) as TieList
from Cars_Belonging_To_User cu left outer join
TieList_One tl1
on cu.id = tl1.id_of_car left outer join
TieList_Two tl2
on cu.id = tl2.id_of_car left outer join
TieList_Three tl3
on cu.id = tl3.id_of_car;
You can then add a where clause to filter as you need.
If you have an index on id_of_car for each tielist table, then the performance should be quite good. If the where clause uses an index on the first table, then the joins and where should all be using indexes, and the query will be quite fast.
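
The left-join approach above, together with the user_id and car id filter the question asks for, can be sketched as follows. This uses Python's sqlite3 module as a stand-in for MySQL; the helper name `find_tielist` is hypothetical, and returning the full table name from the CASE expression (rather than 'One', 'Two', 'Three') is just one possible choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cars_belonging_to_user (
        id INTEGER PRIMARY KEY, user_id INTEGER, make TEXT, model TEXT);
    CREATE TABLE tielist_one (
        id INTEGER PRIMARY KEY, id_of_car INTEGER, id_x INTEGER, id_y INTEGER);
    CREATE TABLE tielist_two (
        id INTEGER PRIMARY KEY, id_of_car INTEGER, id_x INTEGER, id_z INTEGER);
    CREATE TABLE tielist_three (
        id INTEGER PRIMARY KEY, id_of_car INTEGER, id_x INTEGER, id_a INTEGER);

    INSERT INTO cars_belonging_to_user VALUES
        (1, 1, 'Toyota', 'Camry'), (2, 1, 'Infinity', 'Q55'),
        (3, 1, 'DMC', 'DeLorean'), (4, 2, 'Acura', 'RSX');
    INSERT INTO tielist_one VALUES (1, 1, 12, 22), (2, 2, 23, 32);
    INSERT INTO tielist_two VALUES (1, 3, 32, 22);
    INSERT INTO tielist_three VALUES (1, 4, 45, 2);
    """
)


def find_tielist(conn, user_id, car_id):
    """Return (car id, id_x, tielist name) for one car of one user, or None."""
    return conn.execute(
        """
        SELECT cu.id,
               COALESCE(tl1.id_x, tl2.id_x, tl3.id_x) AS id_x,
               CASE WHEN tl1.id IS NOT NULL THEN 'tielist_one'
                    WHEN tl2.id IS NOT NULL THEN 'tielist_two'
                    WHEN tl3.id IS NOT NULL THEN 'tielist_three'
               END AS tielist
        FROM cars_belonging_to_user cu
        LEFT JOIN tielist_one tl1 ON cu.id = tl1.id_of_car
        LEFT JOIN tielist_two tl2 ON cu.id = tl2.id_of_car
        LEFT JOIN tielist_three tl3 ON cu.id = tl3.id_of_car
        WHERE cu.user_id = ? AND cu.id = ?
        """,
        (user_id, car_id),
    ).fetchone()


print(find_tielist(conn, 1, 3))  # (3, 32, 'tielist_two')
print(find_tielist(conn, 1, 4))  # None, because car 4 belongs to user 2
```

The tielist name comes back as an ordinary column, so it can be assigned to a variable and used to decide which of the other tables to delete from.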

selecting specific row number from select

I'm a beginner at MySQL syntax, so there are a few questions I want to ask.
I have a clue DB where users can add clues, and a web method that selects a range of numbers for a random function (random for the sake of the game; there's no point doing the same clue over and over, right?).
But my main problem right now is: what if the author decides to add more clues?
Then my clue DB will look like this:
+--------+-------------+-----------+--------+
| cID | clueDetails | location | author |
+--------+-------------+-----------+--------+
| 1 | abcde | loc 1 | auth 1 |
| 2 | efghi | loc 1 | auth 1 |
| 3 | jklmno | loc 2 | auth 1 |
| 4 | pqrstu | loc 2 | auth 1 |
| 5 | vwxyz | loc 1 | auth 1 |
+--------+-------------+-----------+--------+
If the player selects loc 1 and auth 1, cIDs 1, 2 and 5 will be shown. I can't use my random function effectively, because it selects between the first and last cID for that loc and auth, and 3 and 4 don't fit in. I know this is rather vague, as information is scarce; to understand the whole process you would need to go right down to the game and the methods/functions I have (which would be very long).
Cutting to the chase, my result will be something like the table below. The way to identify a clue is by cID, but if clues are added in a different order (as shown above), my function gets rather screwed up.
EDIT: Assume this random function gives me back 2 clues, because I want to play 2 clues, and that it returns 1 and 3. From the result table below, 1 and 3 should give me cID 1 and cID 5, as they are rows number 1 and 3. (Sorry for the confusion.)
+--------+-------------+-----------+--------+
| cID | clueDetails | location | author |
+--------+-------------+-----------+--------+
| 1 | abcde | loc 1 | auth 1 |
| 2 | efghi | loc 1 | auth 1 |
| 5 | vwxyz | loc 1 | auth 1 |
+--------+-------------+-----------+--------+
So with that, I want to ask if can we select row by its number? e.g row[3] = cID 5, vwxyz, loc 1, auth 1.
As far as I can tell, having done a lot of research, there doesn't seem to be any function in MySQL that allows us to select by row number (though all the articles I found were rather dated, 2010 and before, so I'm not sure whether MySQL has added any new function).
I saw an SO thread, "MySQL - Get row number on select", and from what I can see it generates a field called ranking.
What I want to know is: is this ranking field temporary or permanent? Because if it's just a temporary field, then I could shift the identifier from cID to this numbering.
Or do any of you have a suggestion for working around this issue? I thought of clearing and recreating the DB, but that would take too much time, and it would get slower as the DB grows. Another method is to fill a datatable with all the current clues where loc=?loc and auth=?auth and add them again together with the new (latest) clue, but I figure that would make the cID grow at a very fast rate, and I'm afraid this would cause memory management issues / memory leaks.
EDIT2: As the created field is just a temporary field, and this seems to be the only alternative, I tried this MySQL command:
set @rank=0;
select @rank:=@rank+1 AS rank, cId, clueDetails, location, author from tbl_clue where location = "loc" and author = "auth" order by rank ASC
It seems to display what I want, but my command looks different from what others usually give (more brackets and other stuff). Is my command OK? Will there be any indirect implications caused by it?
You can try this one. Please add a comment if this helps :)
SELECT cID, clueDetails, location, author
FROM
(
SELECT @rownum := @rownum + 1 as `RowNo`,
p.cID,
p.clueDetails,
p.location,
p.author
FROM (
SELECT cID, clueDetails, location, author
FROM myTableName
WHERE location = 'loc 1' AND author = 'auth 1'
) p , (SELECT @rownum:=0) r
) y
WHERE y.RowNo = 3
ORDER BY RowNo
I'm not sure if I understand you correctly, but assuming you end up with:
+--------+-------------+-----------+--------+
| cID | clueDetails | location | author |
+--------+-------------+-----------+--------+
| 1 | abcde | loc 1 | auth 1 |
| 2 | efghi | loc 1 | auth 1 |
| 5 | vwxyz | loc 1 | auth 1 |
+--------+-------------+-----------+--------+
and you only want one record at random instead of 3 records you could do the following:
$query = "THE QUERY";
if ($result = $dbc->query($query))
{
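$num_rows = $result->num_rows; // mysqli result property; mysql_num_rows() does not accept a mysqli result
$random_number = rand(1, $num_rows);
$count = 1;
while($nt = $result->fetch_assoc())
{
if ($count == $random_number) // compare with ==; a single = would assign and always be true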
{
//SAVE THE CLUE DETAILS
}
$count = $count + 1;
}
}
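
As a runnable sketch of the idea in both answers, picking a clue by its position in the filtered result rather than by cID, the example below uses Python's sqlite3 module as a stand-in for MySQL. Note that it swaps in a different technique from the user-variable query above: it relies on `ORDER BY cID LIMIT 1 OFFSET n` to reach the n-th row, and on `random.sample` to draw several distinct clues. The helper names `clue_at` and `random_clues` are hypothetical.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_clue ("
    "cID INTEGER PRIMARY KEY, clueDetails TEXT, location TEXT, author TEXT)"
)
conn.executemany(
    "INSERT INTO tbl_clue VALUES (?, ?, ?, ?)",
    [
        (1, "abcde", "loc 1", "auth 1"),
        (2, "efghi", "loc 1", "auth 1"),
        (3, "jklmno", "loc 2", "auth 1"),
        (4, "pqrstu", "loc 2", "auth 1"),
        (5, "vwxyz", "loc 1", "auth 1"),
    ],
)


def clue_at(conn, location, author, row_no):
    """Return the row_no-th (1-based) clue for a location/author, ordered by cID."""
    return conn.execute(
        "SELECT cID, clueDetails FROM tbl_clue "
        "WHERE location = ? AND author = ? "
        "ORDER BY cID LIMIT 1 OFFSET ?",
        (location, author, row_no - 1),
    ).fetchone()


def random_clues(conn, location, author, how_many):
    """Return how_many distinct clues chosen at random from the filtered rows."""
    rows = conn.execute(
        "SELECT cID, clueDetails FROM tbl_clue "
        "WHERE location = ? AND author = ? ORDER BY cID",
        (location, author),
    ).fetchall()
    return random.sample(rows, how_many)


print(clue_at(conn, "loc 1", "auth 1", 3))  # (5, 'vwxyz')
print(random_clues(conn, "loc 1", "auth 1", 2))
```

Because the position is computed at query time, gaps in cID caused by clues being added in a different order do not matter.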

Mysql query Max not working

What I want to happen is to group by parentid first, then by position, which I have done. Within each group I want the name with the highest rating to be displayed, which isn't happening. Instead, the lowest id for each group is being displayed. The results should be tv1, tv3, tv5, tv7, as these are the highest-rated values for each group.
id | name| parentid| position| rating |
1 | tv1 | 1 | 1 | 6 |
2 | tv2 | 1 | 2 | 5 |
3 | tv3 | 1 | 2 | 7 |
4 | tv4 | 1 | 2 | 3 |
5 | tv5 | 5 | 1 | 8 |
6 | tv6 | 5 | 1 | 2 |
7 | tv7 | 3 | 1 | 9 |
8 | tv8 | 3 | 1 | 3 |
$getquery = mysql_query("SELECT name,MAX(rating) FROM outcomes GROUP BY position,parentid") or die(mysql_error());
while($row=mysql_fetch_assoc($getquery)) {
$name = $row['name'];
$rating = $row['rating'];
echo "<p>Name: $name - $rating</p><p></p>";
}
It's not that the lowest id is being displayed -- you're not actually selecting the id column. Probably what you are seeing is the first entry in the name column for each group.
SELECT name, MAX(rating)
doesn't do what you think it does -- it doesn't instruct MySQL to pick the maximum value from the rating column and also return the name that is associated with that row. (Aside: what do you think it would return if there was a tie for the maximum rating? What do you think it would return if you used AVG rather than MAX?)
What it does instead is return the correctly calculated MAX(rating), and then one of the names out of that group. It doesn't guarantee which one gets returned, and it can change depending on how it decides to execute the query.
In fact, because of the undefined nature of a query such as this, it's not even legal SQL in other databases. (Try this in Postgres, and you'll get an error. Heck, try it in MySQL with the ONLY_FULL_GROUP_BY option enabled, and you'll get a similar error)
If what you want to do is find the maximum rating for each group, and then find the name associated with it, you'll have to do something like this:
SELECT name, max_rating FROM outcomes
JOIN (SELECT position, parentid, MAX(rating) AS max_rating from outcomes group by position, parentid) AS aggregated_table
USING (position, parentid)
WHERE rating = max_rating
(There are four or five other ways to do this, searching this site for mysql and aggregation will likely turn them up)
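
To check the join-against-aggregate query against the sample data from the question, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The `ORDER BY id` is an addition made only so that the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outcomes ("
    "id INTEGER PRIMARY KEY, name TEXT, parentid INTEGER, "
    "position INTEGER, rating INTEGER)"
)
conn.executemany(
    "INSERT INTO outcomes VALUES (?, ?, ?, ?, ?)",
    [
        (1, "tv1", 1, 1, 6),
        (2, "tv2", 1, 2, 5),
        (3, "tv3", 1, 2, 7),
        (4, "tv4", 1, 2, 3),
        (5, "tv5", 5, 1, 8),
        (6, "tv6", 5, 1, 2),
        (7, "tv7", 3, 1, 9),
        (8, "tv8", 3, 1, 3),
    ],
)

# Step 1 (inner query): the maximum rating per (position, parentid) group.
# Step 2 (join + WHERE): keep only the rows whose rating equals that maximum.
best = conn.execute(
    """
    SELECT name, max_rating
    FROM outcomes
    JOIN (SELECT position, parentid, MAX(rating) AS max_rating
          FROM outcomes
          GROUP BY position, parentid) AS aggregated_table
    USING (position, parentid)
    WHERE rating = max_rating
    ORDER BY id
    """
).fetchall()

print(best)  # [('tv1', 6), ('tv3', 7), ('tv5', 8), ('tv7', 9)]
```

If two rows in a group tie for the maximum rating, this query returns both of them.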

Optimizing sql join query, comparing query effectiveness

I'm a student working on a module for moodle cms (course management system) of my college. I have to write some join queries for my module. I can not make changes to table structures, they are pretty much set in stone (I didn't make them, they were given to me).
I have no experience with writing queries for large databases. I've created a working prototype of my module and now I'm trying to organize the code/optimize queries etc.
Tasks:
| id | task |
--------------------
| 1 | task1 |
| 2 | task3 |
| 3 | task3 |
| 4 | task4 |
| ... | ... |
Assets:
| id | asset |
--------------------
| 1 | task1 |
| 2 | task3 |
| 3 | task3 |
| 4 | task4 |
| ... | ... |
TaskAsset:
| id | taskid | assetid | coefficient |
-----------------------------------------------
| 1 | 2 | 33 | coefficient1 |
| 2 | 5 | 35 | coefficient2 |
| 3 | 6 | 36 | coefficient3 |
| 4 | 8 | 37 | coefficient4 |
| 5 | ... | ... | ... |
$query = "SELECT TaskAsset.id as id, Assets.asset AS asset, Tasks.task AS task
, coefficient
FROM Tasks, Assets, TaskAsset
WHERE TaskAsset.taskid = Tasks.id AND TaskAsset.assetid = Assets.id";
$result = mysql_query($query) or die(mysql_error());
while($row = mysql_fetch_array($result))
{
echo $row['id']." - ".$row['asset']." - ".$row['task'] . $row['coefficient'];
echo "<br />";
}
Questions:
1.) So, if table structures are like these, is my query effective?
If they are, is a simple join still effective if I have to join more tables? Like 4 or 5?
2.) How do I rate effectiveness of queries? In phpmyadmin, I can see the time it took for the query to run. I've never used anything else for this because my tables had very few records, so it did not matter.
The only thing that I would do differently is explicitly specify the joins.
$query = "SELECT ta.id as id, a.asset AS asset, t.task AS task
, coefficient
FROM TaskAsset ta
JOIN Tasks t ON ta.taskId = t.id
JOIN Assets a ON ta.assetId = a.id";
This does the same thing, but I personally much prefer it. That said, you should try running EXPLAIN on your query. That is where you'll see the pressure points.
Your query is fine as is from an optimality standpoint, assuming indexes are present on the id fields of the tables. With the right indexes, you can join many more tables and the performance will still be good.
You should try to get yourself familiar with the ANSI join syntax - as this is much easier to read than the old FROM x, y, z ... style joins - and it's also more difficult to get wrong!
This query is appropriate for the results that you want.
TaskAsset is a mapping table that is meant to join columns of Tasks and Assets together by foreign keys. You need to view columns from all three tables for your result set, so this is the most efficient way to do it.
What might be even more important than the query are the indexes in the tables.
You are doing
SELECT ta.id as id, a.asset AS asset, t.task AS task, coefficient
FROM TaskAsset ta
JOIN Tasks t ON ta.taskId = t.id -- equi join here
JOIN Assets a ON ta.assetId = a.id -- another equi join
This query has two equi joins.
Always assign indexes on fields involved in an equi-join.
Consider assigning indexes on fields involved in a where clause (this query doesn't have any but that's beside the point)
Strongly consider putting an index on a field used in a group by clause
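
The advice above (explicit joins, indexes on the join columns, and inspecting the plan) can be tried out with a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL, so the plan comes from SQLite's `EXPLAIN QUERY PLAN`, whose output differs from MySQL's `EXPLAIN`. The index names and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Tasks (id INTEGER PRIMARY KEY, task TEXT);
    CREATE TABLE Assets (id INTEGER PRIMARY KEY, asset TEXT);
    CREATE TABLE TaskAsset (
        id INTEGER PRIMARY KEY, taskid INTEGER, assetid INTEGER, coefficient TEXT);

    -- Indexes on the columns used in the equi-joins (hypothetical names).
    CREATE INDEX idx_taskasset_taskid ON TaskAsset (taskid);
    CREATE INDEX idx_taskasset_assetid ON TaskAsset (assetid);

    -- Hypothetical sample rows.
    INSERT INTO Tasks VALUES (1, 'task1'), (2, 'task2');
    INSERT INTO Assets VALUES (1, 'asset1'), (2, 'asset2');
    INSERT INTO TaskAsset VALUES (1, 1, 2, 'c1'), (2, 2, 1, 'c2');
    """
)

query = """
    SELECT ta.id, a.asset, t.task, ta.coefficient
    FROM TaskAsset ta
    JOIN Tasks t ON ta.taskid = t.id
    JOIN Assets a ON ta.assetid = a.id
    ORDER BY ta.id
"""

rows = conn.execute(query).fetchall()
print(rows)  # [(1, 'asset2', 'task1', 'c1'), (2, 'asset1', 'task2', 'c2')]

# The last column of each plan row is a human-readable description of the step.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
for step in plan:
    print(step)
```

Plan steps that mention an index or primary-key lookup rather than a full scan indicate that the join is using the index; the exact wording depends on the SQLite version.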