I have a database representing something like a bookstore. There's a table containing the categories that books can be in. Some categories are defined simply using another table that contains the category-item relationships. But there are also some categories that can be defined programmatically -- a category for a specific author can be defined using a query (SELECT item_id FROM items WHERE author = "John Smith"). So my categories table has a "query" column; if it's not null, I use this to get the items in the category, otherwise I use the category_items table.
Currently, I have the application (PHP code) make this decision, but this means lots of separate queries when we iterate over all the categories. Is there some way to incorporate this dynamic SQL into a join? Something like:
SELECT c.category, IF(c.query IS NULL, count(i.items), count(EXECUTE c.query))
FROM categories c
LEFT OUTER JOIN category_items i
ON c.category = i.category
EXECUTE requires a prepared statement, but I would need to prepare a different statement for each row. Also, EXECUTE can't be used in expressions; it's only a top-level statement. Suggestions?
What happens when you want to list books by publisher? Country? Language? You'd have to throw them all into a single "category_items" table. How would you pick which dynamic query to execute? The query-within-a-query method is not going to work.
I think your concept of "category" is too broad, which is resulting in overly complicated SQL. I would replace "category" to represent only "genre" (for books). Genres are defined in their own table, and item_genres connects them to the items table. Books-by-author and books-by-genre should just be separate queries at the application level, rather than trying to do them both with the same (sort of) query at the database/SQL level. (If you have music as well as books, they probably shouldn't all be stored in a single "items" table because they're different concepts ... have different genres, author vs. artist, etc.)
I know this does not really solve your problem in the way you'd like, but I think you'll be happier not trying to do it that way.
Here's how I finally ended up solving this in the PHP client.
I decided to just keep the membership in the category_items table, and use the dynamic queries during submission to update this table.
This is the function in my script that's called to update an item's categories during submission or updating. It takes a list of user-selected categories (which can only be chosen from categories that don't have dynamic queries). Using this list and the dynamic queries, it figures out the difference between the categories that an item is currently in and the ones it should be in, and inserts/deletes as necessary to get them in sync. (Note that the actual table names in my DB are not the same as in my question, where I was using somewhat generic terms.)
function update_item_categories($dbh, $id, $requested_cats) {
$data = mysql_check($dbh, mysqli_query($dbh, "select id, query from t_ld_categories where query is not null"), 'getting dynamic categories');
$clauses = array();
while ($row = mysqli_fetch_object($data))
$clauses[] = sprintf('select %d cat_id, (%d in (%s)) should_be_in',
$row->id, $id, $row->query);
if (!$requested_cats) $requested_cats[] = -1; // Dummy entry that never matches cat_id
$requested_cat_string = implode(', ', $requested_cats);
$clauses[] = "select c.id cat_id, (c.id in ($requested_cat_string)) should_be_in
from t_ld_categories c
where member_type = 'lessons' and query is null";
$subquery = implode("\nunion all\n", $clauses);
$query = "select c.cat_id cat_id, should_be_in, (member_id is not null) is_in
from ($subquery) c
left outer join t_ld_cat_members m
on c.cat_id = m.cat_id
and m.member_id = $id";
// printf("<pre>$query</pre>");
$data = mysql_check($dbh, mysqli_query($dbh, $query), 'getting current category membership');
$adds = array();
$deletes = array();
while ($row = mysqli_fetch_object($data)) {
if ($row->should_be_in && !$row->is_in) $adds[] = "({$row->cat_id}, $id)";
elseif (!$row->should_be_in && $row->is_in) $deletes[] = "(cat_id = {$row->cat_id} and member_id = $id)";
}
if ($deletes) {
$delete_string = implode(' or ', $deletes);
mysql_check($dbh, mysqli_query($dbh, "delete from t_ld_cat_members where $delete_string"), 'deleting old categories');
}
if ($adds) {
$add_string = implode(', ', $adds);
mysql_check($dbh, mysqli_query($dbh, "insert into t_ld_cat_members (cat_id, member_id) values $add_string"),
"adding new categories");
}
}
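For readers who want to try the same "materialize the dynamic categories at write time" idea outside PHP, here is a minimal sketch in Python using SQLite as a stand-in for MySQL. The schema (items, categories, cat_members) and the function names are hypothetical and only mirror the generic terms from the question; it is not the author's code, and it assumes the stored queries are trusted, admin-written SQL that returns a column called item_id.

```python
import sqlite3


def make_category_demo_db():
    """Hypothetical miniature schema; real table and column names will differ."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (item_id INTEGER PRIMARY KEY, author TEXT);
        CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT, query TEXT);
        CREATE TABLE cat_members (cat_id INTEGER, member_id INTEGER);
        INSERT INTO items VALUES (1, 'John Smith'), (2, 'Jane Doe');
        INSERT INTO categories VALUES
            (10, 'Fiction', NULL),
            (11, 'History', NULL),
            (12, 'By John Smith',
             'SELECT item_id FROM items WHERE author = ''John Smith''');
        INSERT INTO cat_members VALUES (11, 1);
    """)
    return conn


def sync_item_categories(conn, item_id, requested_cats):
    """Make cat_members match the dynamic queries plus the user-selected categories."""
    should_be_in = set()
    dynamic = conn.execute(
        "SELECT id, query FROM categories WHERE query IS NOT NULL").fetchall()
    for cat_id, query in dynamic:
        # Assumption: the stored query is trusted SQL, never built from user input.
        hit = conn.execute(
            f"SELECT 1 FROM ({query}) AS q WHERE q.item_id = ?", (item_id,)
        ).fetchone()
        if hit:
            should_be_in.add(cat_id)
    # Only static categories (query IS NULL) may be chosen by the user.
    static = {row[0] for row in conn.execute(
        "SELECT id FROM categories WHERE query IS NULL")}
    should_be_in |= static & set(requested_cats)
    is_in = {row[0] for row in conn.execute(
        "SELECT cat_id FROM cat_members WHERE member_id = ?", (item_id,))}
    adds = sorted(should_be_in - is_in)
    deletes = sorted(is_in - should_be_in)
    conn.executemany(
        "INSERT INTO cat_members (cat_id, member_id) VALUES (?, ?)",
        [(cat_id, item_id) for cat_id in adds])
    conn.executemany(
        "DELETE FROM cat_members WHERE cat_id = ? AND member_id = ?",
        [(cat_id, item_id) for cat_id in deletes])
    conn.commit()
    return adds, deletes
```

Once membership is stored this way, counting items per category is a plain join against the membership table, with no per-row dynamic SQL at read time.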
I am not a professional programmer, but I assist a school in automating their assessments. I have a list of just over 1000 students with 3 assessment scores for each one every year, and I need to create a list with the average of these three scores in descending order, limited to the top 30. I can calculate averages and display the results, but I can't sort or limit. In the first part of the code, I select the IDs of all students for the current year ($IDanoatual) and store them in the array $alunos[]. In the second part, I use a for loop to calculate the average of these grades for each student and display them. Both parts query the same table (audp_l_notasfinais). I tried using a foreach statement to filter and sort, but I couldn't resolve this issue.
$sela = "select id_aluno from audp_l_notasfinais
where id_ano = '$IDanoatual'
";
$qsela = mysqli_query($conn,$sela);
$contasel = mysqli_num_rows($qsela);
while ($row = mysqli_fetch_assoc($qsela)){
$alunos[] = $row['id_aluno'];
}
for ($i=0; $i<$contasel; $i++){
$selnotasA = "select avg(NULLIF(nts.mef,0)) as NtutA, aln.stdname Naln
from audp_l_notasfinais nts
inner join audp_c_alunos aln on aln.id_alunos = nts.id_aluno
where nts.id_aluno = '$alunos[$i]' and nts.id_ano='$IDanoatual'
";
$qrynmal = mysqli_query($conn,$selnotasA);
while ($row=mysqli_fetch_assoc($qrynmal)){
echo "Name: ".$row['Naln']." - Average: ".$row['NtutA']."<br>";
}
}
You have not included much detail in your question. Adding your CREATE TABLE statements and some sample data in markdown tables would help and would get you a better response.
It looks like $IDanoatual could be coming from user input, in which case you really need to read about and understand SQL Injection and how to mitigate the risk with prepared statements.
Best guess -
select aln.id_alunos, aln.stdname Naln, avg(NULLIF(nts.mef,0)) as NtutA
from audp_c_alunos aln
inner join audp_l_notasfinais nts
on aln.id_alunos = nts.id_aluno
and nts.id_ano = '$IDanoatual'
group by aln.id_alunos
order by NtutA desc
limit 30;
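To check the shape of that query without touching the real data, here is a small sketch in Python with SQLite standing in for MySQL. The table and column names come from the question; the sample rows, the function names, and the numeric year IDs are made up for illustration. It also passes the year as a bound parameter instead of interpolating it into the SQL string, which is the prepared-statement idea mentioned above.

```python
import sqlite3


def make_scores_demo_db():
    """Made-up sample data; only the table and column names come from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE audp_c_alunos (id_alunos INTEGER PRIMARY KEY, stdname TEXT);
        CREATE TABLE audp_l_notasfinais (id_aluno INTEGER, id_ano INTEGER, mef REAL);
        INSERT INTO audp_c_alunos VALUES (1, 'Ana'), (2, 'Bruno'), (3, 'Carla');
        INSERT INTO audp_l_notasfinais VALUES
            (1, 2023, 8), (1, 2023, 6), (1, 2023, 0),
            (2, 2023, 9), (2, 2023, 9), (2, 2023, 9),
            (3, 2023, 5), (3, 2023, 7), (3, 2023, 6),
            (3, 2022, 10);
    """)
    return conn


def top_averages(conn, year_id, limit=30):
    """One grouped query replaces the per-student loop; zeros are ignored via NULLIF."""
    sql = """
        SELECT aln.id_alunos, aln.stdname, AVG(NULLIF(nts.mef, 0)) AS avg_score
        FROM audp_c_alunos aln
        JOIN audp_l_notasfinais nts ON nts.id_aluno = aln.id_alunos
        WHERE nts.id_ano = ?
        GROUP BY aln.id_alunos, aln.stdname
        ORDER BY avg_score DESC
        LIMIT ?
    """
    # Values are bound as parameters, never concatenated into the SQL text.
    return conn.execute(sql, (year_id, limit)).fetchall()
```

With the sample rows above, Ana's zero score is skipped, so her average is taken over two scores rather than three.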
Can you use Doctrine QueryBuilder to INNER JOIN a temporary table from a full SELECT statement that includes a GROUP BY?
The ultimate goal is to select the best version of a record. I have a viewVersion table that has multiple versions with the same viewId value but different timeMod. I want to find the version with the latest timeMod (and do a lot of other complex joins and filters on the query).
Initially people assume you can do a GROUP BY viewId and then ORDER BY timeMod, but ORDER BY is applied after grouping, so it has no effect on which row each group keeps, and MySQL returns values from an arbitrary row in each group. There are a ton of answers out there (e.g. here) that explain the problem with using GROUP BY and offer a solution, but I am having trouble interpreting the Doctrine docs to find a way to implement the SQL with Doctrine QueryBuilder (if it's even possible). Why don't I just use DQL? I may have to, but I have a lot of dynamic filters and joins that are much easier to do with QueryBuilder, so I wanted to see if that's possible.
Sample MySQL to Reproduce in Doctrine QueryBuilder
SELECT vv.*
FROM view_version vv
#inner join only returns where the result sets overlap, i.e. one record
INNER JOIN (
SELECT MAX(timeMod) maxTimeMod, viewId
FROM view_version
GROUP BY viewId
) version ON version.viewId = vv.viewId AND vv.timeMod = version.maxTimeMod
#join other tables for filter, etc
INNER JOIN view v ON v.id = vv.viewId
INNER JOIN content_type c ON c.id = v.contentTypeId
WHERE vv.siteId=1
AND v.contentTypeId IN (2)
ORDER BY vv.title ASC;
Theoretical Solution via Query Builder (not working)
I am thinking that the JOIN needs to inject a DQL statement, e.g.
$em = $this->getDoctrine()->getManager();
$viewVersionRepo = $em->getRepository('GutensiteCmsBundle:View\ViewVersion');
$queryMax = $viewVersionRepo->createQueryBuilder('v')
->select('MAX(v.timeMod) AS timeModMax')
->addSelect('v.viewId')
->groupBy('v.viewId');
$queryBuilder = $viewVersionRepo->createQueryBuilder('vv')
// I tried putting the query in a parenthesis, to no avail
->join('('.$queryMax->getDQL().')', 'version', 'WITH', 'vv.viewId = version.viewId AND vv.timeMod = version.timeModMax')
// Join other Entities
->join('vv.view', 'view')
->addSelect('view')
->join('view.contentType', 'contentType')
->addSelect('contentType')
// Perform random filters
->andWhere('vv.siteId = :siteId')->setParameter('siteId', 1)
->andWhere('view.contentTypeId IN(:contentTypeId)')->setParameter('contentTypeId', $contentTypeIds)
->addOrderBy('vv.title', 'ASC');
$query = $queryBuilder->getQuery();
$results = $query->getResult();
My code (which may not match the above example perfectly) outputs:
SELECT e, view, contentType
FROM Gutensite\CmsBundle\Entity\View\ViewVersion e
INNER JOIN (
SELECT MAX(v.timeMod) AS timeModMax, v.viewId
FROM Gutensite\CmsBundle\Entity\View\ViewVersion v
GROUP BY v.viewId
) version WITH vv.viewId = version.viewId AND vv.timeMod = version.timeModMax
INNER JOIN e.view view
INNER JOIN view.contentType contentType
WHERE e.siteId = :siteId
AND view.contentTypeId IN (:contentTypeId)
ORDER BY e.title ASC
This Answer seems to indicate that it's possible in other contexts like IN statements, but when I try the above method in the JOIN, I get the error:
[Semantical Error] line 0, col 90 near '(SELECT MAX(v.timeMod)': Error: Class '(' is not defined.
A big thanks to @AdrienCarniero for his alternative query structure for selecting the highest version with a simple JOIN where the entity's timeMod is less than the joined table's timeMod.
Alternative Query
SELECT view_version.*
FROM view_version
#left join to look for any newer version of the same view
LEFT JOIN view_version AS best_version ON best_version.viewId = view_version.viewId AND best_version.timeMod > view_version.timeMod
#join other tables for filter, etc
INNER JOIN view ON view.id = view_version.viewId
INNER JOIN content_type ON content_type.id = view.contentTypeId
WHERE view_version.siteId=1
# LIMIT Best Version
AND best_version.timeMod IS NULL
AND view.contentTypeId IN (2)
ORDER BY view_version.title ASC;
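As a quick sanity check that the two query shapes pick the same rows, here is a minimal sketch in Python with SQLite standing in for MySQL. The view_version columns match the question; the sample rows, the index name, and the function names are assumptions for illustration only. The composite index on (viewId, timeMod) is included as one plausible choice, not a measured recommendation.

```python
import sqlite3


def make_versions_demo_db():
    """Made-up rows: view 1 has two versions, view 2 has one."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE view_version (
            id INTEGER PRIMARY KEY, viewId INTEGER, timeMod INTEGER, title TEXT);
        CREATE INDEX idx_view_time ON view_version (viewId, timeMod);
        INSERT INTO view_version VALUES
            (1, 1, 100, 'Alpha v1'),
            (2, 1, 200, 'Alpha v2'),
            (3, 2, 150, 'Beta v1');
    """)
    return conn


def latest_by_group_join(conn):
    """Original approach: join against a derived table of MAX(timeMod) per viewId."""
    return conn.execute("""
        SELECT vv.viewId, vv.timeMod, vv.title
        FROM view_version vv
        JOIN (SELECT viewId, MAX(timeMod) AS maxTimeMod
              FROM view_version
              GROUP BY viewId) latest
          ON latest.viewId = vv.viewId AND latest.maxTimeMod = vv.timeMod
        ORDER BY vv.title
    """).fetchall()


def latest_by_anti_join(conn):
    """Alternative approach: keep rows for which no newer version exists."""
    return conn.execute("""
        SELECT vv.viewId, vv.timeMod, vv.title
        FROM view_version vv
        LEFT JOIN view_version newer
          ON newer.viewId = vv.viewId AND newer.timeMod > vv.timeMod
        WHERE newer.timeMod IS NULL
        ORDER BY vv.title
    """).fetchall()
```

Note that if two versions of the same view share the highest timeMod, both approaches return both rows, so a tie-breaker would be needed in that case.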
Using Doctrine QueryBuilder
$em = $this->getDoctrine()->getManager();
$viewVersionRepo = $em->getRepository('GutensiteCmsBundle:View\ViewVersion');
$queryBuilder = $viewVersionRepo->createQueryBuilder('vv')
// Join Best Version
->leftJoin('GutensiteCmsBundle:View\ViewVersion', 'bestVersion', 'WITH', 'bestVersion.viewId = vv.viewId AND bestVersion.timeMod > vv.timeMod')
// Join other Entities
->join('vv.view', 'view')
->addSelect('view')
->join('view.contentType', 'contentType')
->addSelect('contentType')
// Perform random filters
->andWhere('vv.siteId = :siteId')->setParameter('siteId', 1)
// LIMIT Joined Best Version
->andWhere('bestVersion.timeMod IS NULL')
->andWhere('view.contentTypeId IN(:contentTypeId)')->setParameter('contentTypeId', $contentTypeIds)
->addOrderBy('vv.title', 'ASC');
$query = $queryBuilder->getQuery();
$results = $query->getResult();
In terms of performance, it really depends on the dataset. See this discussion for details.
TIP: The table should include indexes on both these values (viewId and timeMod) to speed up results. I don't know if it would also benefit from a single index on both fields.
A native SQL query using the original JOIN method may be better in some cases, but compiling the query over an extended range of code that dynamically creates it, and getting the mappings correct is a pain. So this is at least an alternative solution that I hope helps others.
Good evening guys,
I'm a newbie to web programming and I need your help to solve a problem inherent to SQL query.
The database engine I'm using is MySQL and I access it via PHP, here I'll explain a simplified version of my database, just to fix ideas.
Let's suppose to work with a database containing three tables: teams, teams_information, attributes. More precisely:
1) teams is a table containing some basic information about Italian football teams (soccer, not American football :D). It has three fields: 'id' (int, primary key), 'name' (varchar, team name), 'nickname' (varchar, team nickname);
2) attributes is a table containing a list of possible information about a football team, such as city (the city where team plays its home match), captain (team captain's fullname), f_number (number of fans) and so on. This table is formed by three fields: id (int, primary key), attribute_name (varchar, an identifier for the attribute), attribute_desc (text, an explanation of the meaning of attribute). Each record of this table represents a single possible attribute of a football team;
3) teams_information is a table that holds the information available about the teams listed in the teams table. This table contains four fields: id (int, primary key), team_id (int, a foreign key which identifies a team), attribute_id (int, a foreign key which identifies one of the attributes listed in the attributes table), attribute_value (varchar, the value of the attribute). Each record represents a single attribute of a single team. In general, different teams will have different amounts of information, so for some teams a large number of attributes will be available while for others only a few will be.
Note that the relation between teams and teams_information is one to many, and the same relation exists between attributes and teams_information.
Given this model, my goal is to build a grid (maybe with ExtJS 4.1) that shows the user the list of Italian football teams. Each record of this grid will represent a single football team and will contain all possible attributes: some fields may be empty (because the corresponding attribute is unknown for that team), while the others will contain the values stored in the teams_information table for that team.
Accordingly, the grid's fields are: id, team_name and one field for each of the attributes listed in the 'attributes' table.
My question is: can I build such a grid using a SINGLE SQL query (maybe a suitable SELECT query that fetches all the data I need from the database tables)?
Can anyone suggest how to write such a query (if one exists)?
Thanks in advance for helping me.
Regards.
Enrico.
The short answer to your question is no, there is no simple construct in MySQL to achieve the result set you are looking for.
But it is possible to carefully (painstakingly) craft such a query. Here is an example, I trust you will be able to decipher it. Basically, I'm using correlated subqueries in the select list, for each attribute I want returned.
SELECT t.id
, t.name
, t.nickname
, ( SELECT v1.attribute_value
FROM teams_information v1
JOIN attributes a1
ON a1.id = v1.attribute_id AND a1.attribute_name = 'city'
WHERE v1.team_id = t.id ORDER BY 1 LIMIT 1
) AS city
, ( SELECT v2.attribute_value
FROM teams_information v2 JOIN attributes a2
ON a2.id = v2.attribute_id AND a2.attribute_name = 'captain'
WHERE v2.team_id = t.id ORDER BY 1 LIMIT 1
) AS captain
, ( SELECT v3.attribute_value
FROM teams_information v3 JOIN attributes a3
ON a3.id = v3.attribute_id AND a3.attribute_name = 'f_number'
WHERE v3.team_id = t.id ORDER BY 1 LIMIT 1
) AS f_number
FROM teams t
ORDER BY t.id
For 'multi-valued' attributes, you'd have to pull each instance of the attribute separately. (Use the LIMIT to specify whether you are retrieving the first one, the second one, etc.)
, ( SELECT v4.attribute_value
FROM teams_information v4 JOIN attributes a4
ON a4.id = v4.attribute_id AND a4.attribute_name = 'nickname'
WHERE v4.team_id = t.id ORDER BY 1 LIMIT 0,1
) AS nickname_1st
, ( SELECT v5.attribute_value
FROM teams_information v5 JOIN attributes a5
ON a5.id = v5.attribute_id AND a5.attribute_name = 'nickname'
WHERE v5.team_id = t.id ORDER BY 1 LIMIT 1,1
) AS nickname_2nd
, ( SELECT v6.attribute_value
FROM teams_information v6 JOIN attributes a6
ON a6.id = v6.attribute_id AND a6.attribute_name = 'nickname'
WHERE v6.team_id = t.id ORDER BY 1 LIMIT 2,1
) AS nickname_3rd
(I use nickname as an example here because American soccer clubs frequently have more than one nickname, e.g. Chicago Fire Soccer Club has the nicknames 'The Fire', 'La Máquina Roja', 'Men in Red', 'CF97', et al.)
NOT AN ANSWER TO YOUR QUESTION, BUT ...
Have I mentioned numerous times before, how much I dislike working with EAV database implementations? What should IMO be a very simple query turns into an overly complicated beast of a potentially light dimming query.
Wouldn't it be much simpler to create a table where each "attribute" is a separate column? Then queries to return reasonable result sets would look more reasonable...
SELECT id, name, nickname, city, captain, f_number, ... FROM team
But what really makes me shudder is the prospect that some developer is going to decide that the LDQ should be "hidden" in the database as a view, to enable the "simpler" query.
If you go this route, PLEASE PLEASE PLEASE resist any urge you may have to store this query in the database as a view.
I'm going to take a slightly different route. Spencer's answer is fantastic, and it addresses the issue quite well, but there's still a large underlying problem.
The data that you are trying to display on the site is over-normalized in the database. I won't elaborate, since, again, Spencer's answer highlights the issue pretty well.
Rather, I'd like to recommend a solution that denormalizes the data a bit.
Convert all of your Team data into a single table with many columns. (If there is Player data that isn't covered in the question, that would be a second table, but I'll gloss over that for now.)
Sure, you'll have a whole bunch of columns, and a lot of the columns might be NULL for a lot of the rows. It's not normalized, and it's not pretty, but here's the huge advantage that you gain.
Your query becomes:
SELECT * FROM Teams
That's it. That gets displayed right to the website and you are done. You might have to go out of your way to realize this schema, but it would be totally worth the time investment.
I think what you're saying is that you want the rows in the attributes table to appear as columns in the result recordset. If this is correct, then in SQL you would use PIVOT.
A quick search on SO seems to indicate that there is no PIVOT equivalent in MySQL.
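Without a PIVOT operator, a common workaround is conditional aggregation: one MAX(CASE WHEN ... END) column per attribute, grouped by team. Below is a minimal sketch of that technique in Python with SQLite standing in for MySQL. The three tables follow the question; the sample rows and function names are invented, and the code assumes attribute names are plain identifiers (it refuses anything else, since they are spliced into the SQL text as column aliases).

```python
import sqlite3


def make_teams_demo_db():
    """Invented sample rows for the three tables described in the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT, nickname TEXT);
        CREATE TABLE attributes (
            id INTEGER PRIMARY KEY, attribute_name TEXT, attribute_desc TEXT);
        CREATE TABLE teams_information (
            id INTEGER PRIMARY KEY, team_id INTEGER,
            attribute_id INTEGER, attribute_value TEXT);
        INSERT INTO teams VALUES
            (1, 'Juventus', NULL), (2, 'Milan', NULL), (3, 'Roma', NULL);
        INSERT INTO attributes VALUES
            (1, 'city', 'home city'), (2, 'captain', 'team captain');
        INSERT INTO teams_information VALUES
            (1, 1, 1, 'Turin'), (2, 1, 2, 'Rossi'), (3, 2, 1, 'Milan');
    """)
    return conn


def pivot_teams(conn):
    """Return one row per team with one column per known attribute."""
    names = [row[0] for row in conn.execute(
        "SELECT attribute_name FROM attributes ORDER BY id")]
    select_list = ["t.id", "t.name"]
    for name in names:
        # Assumption: attribute names are simple identifiers; reject anything else.
        if not name.isidentifier():
            raise ValueError(f"unsafe attribute name: {name!r}")
        select_list.append(
            f"MAX(CASE WHEN a.attribute_name = '{name}' "
            f"THEN ti.attribute_value END) AS \"{name}\"")
    sql = (
        "SELECT " + ", ".join(select_list) + " "
        "FROM teams t "
        "LEFT JOIN teams_information ti ON ti.team_id = t.id "
        "LEFT JOIN attributes a ON a.id = ti.attribute_id "
        "GROUP BY t.id, t.name ORDER BY t.id")
    return conn.execute(sql).fetchall()
```

Teams with no stored value for an attribute simply get NULL in that column, which matches the "some fields may be empty" requirement of the grid.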
I wrote a simple PHP script to generalize spencer's idea to solve my issue.
Here's the code:
<?php
require_once('includes/db.config.php'); //this file performs connection to mysql
/*
* Following function requires a table name ($table)
* and a number of service fields ($num). Given those parameters
* it returns the number of table fields (excluding service fields).
*/
function get_fields_number($table,$num,$conn)
{
$query = "SELECT * FROM $table";
$result = mysql_query($query,$conn);
return mysql_num_fields($result)-$num; //remember there are $num service fields
}
/*
* Following function requires a table name ($table) and an array
* containing a list of service fields names. Given those parameters,
* it returns the list of field names. That list is contained within an array and
* service fields are excluded.
*/
function get_fields_name($table,$service,$conn)
{
$query = "SELECT * FROM $table";
$result = mysql_query($query,$conn);
$name = array(); //Array to be returned
for ($i=0;$i<mysql_num_fields($result);$i++)
{
if(!in_array(mysql_field_name($result,$i),$service))
{
//currently selected field is not a service field
$name[] = mysql_field_name($result,$i);
}
}
return $name;
}
//Below $conn is db connection created in 'db.config.php'
$query = "SELECT `name` FROM `detail_arg` WHERE visibility = 0";
$res = mysql_query($query,$conn);
if($res===false)
{
$err_msg = mysql_real_escape_string(mysql_error($conn));
echo "{success:false,data:'".$err_msg."'}";
die();
}
$arg = array(); //list of argument names
while($row = mysql_fetch_assoc($res))
{
$arg[] = $row['name'];
}
//Following function writes the select subquery which is
//necessary to build a column containing a single attribute.
function make_subquery($attribute) //$attribute contains attribute name
{
$query = "";
$query.="(SELECT incident_detail.arg_value ";
$query.="FROM incident_detail ";
$query.="INNER JOIN detail_arg ";
$query.="ON incident_detail.arg_id = detail_arg.id AND detail_arg.name='".$attribute."' ";
$query.="WHERE incident.id = incident_detail.incident_id) ";
$query.="AS $attribute";
return $query;
}
/*
echo make_subquery("date"); //debug code
*/
$subquery = array(); //list of subqueries
for($i=0;$i<count($arg);$i++)
{
$subquery[] = make_subquery($arg[$i]);
}
$query = "SELECT "; //final query containing subqueries
$fields = get_fields_name("incident",array("id","visibility"),$conn);
//list of 'incident' table's fields
for($i=0;$i<count($fields);$i++)
{
$query.="incident.".$fields[$i].", ";
}
//insert the subqueries
$sub = implode(", ",$subquery); //separator first; the reversed argument order is not accepted by PHP 8
$query .= $sub;
$query.=" FROM incident ORDER BY incident.id";
echo $query;
?>
I have two tables: Users and Groups
In my table "Users", there is a column called "ID" for all the user ids.
In my table "Groups" there is a column called "Participants", fields in this column are filled with all the user ids like this "PID_134,PID_489,PID_4784," - And there is a column "ID" that identifies a specific group.
What I want to do is create a menu that shows all the users that are not yet in a particular group.
So I need to get all the user IDs that are not yet in the Participants column of the group with a particular ID.
It would be cool if there was a single mysql query for that - But any PHP + MySQL solutions are okay, too.
How does that work? Any guesses?
UPDATE:
I know this is not valid code, but is there a way I could do something like this that would return a list of all those users?
SELECT *
FROM users, groups
WHERE groups.participants NOT LIKE '%PID_'users.id'%' AND groups.id = 1;
Something like this. You just get rid of "PID_" part of ID.
SELECT * FROM [users] WHERE [id] NOT IN
(SELECT replace(id,'PID_','') FROM groups WHERE group_name='group1')
Group1 would be your variable - group id/name of menu that you've opened.
You can select from multiple tables as shown below:
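SELECT users.* FROM users, groups WHERE groups.participants NOT LIKE CONCAT('%PID_', users.id, ',%') AND groups.id = 1;
This will list all users who are not in group id 1. It relies on every entry in participants, including the last one, being followed by a comma, as in your sample data. A more elegant solution can be found by normalizing the data and using joins, but this is simple and will do the trick.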
I believe something like that should help:
SELECT * FROM users WHERE users.id NOT IN (SELECT groups.participants FROM groups)
But this works only if your DB is normalized. So for your case I see only PHP + MySQL solution. Not very elegant, but it does the job.
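<?php
$participants_result = mysql_query("SELECT participants FROM groups");
$ids = array();
while ($row = mysql_fetch_assoc($participants_result))
{
    // participants looks like "PID_134,PID_489,"; the trailing comma yields an empty last element
    foreach (explode(',', $row['participants']) as $instance)
    {
        if ($instance === '') continue;
        // strip the "PID_" prefix so the value matches users.id
        $user_id = (int) str_replace('PID_', '', $instance);
        if (!in_array($user_id, $ids)) $ids[] = $user_id;
    }
}
if (!$ids) $ids[] = -1; // dummy entry so the NOT IN list is never empty
$participants = implode(',', $ids);
$result = mysql_query("SELECT * FROM users WHERE id NOT IN ( $participants )");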
But I highly recommend normalizing the database.
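To make the normalization suggestion concrete, here is a rough sketch in Python with SQLite standing in for MySQL. The group_members junction table, the user_groups table name, and the helper names are hypothetical (they are not in the question); the idea is one row per (group, user) pair instead of a comma-separated string, after which "users not in group X" becomes an ordinary NOT EXISTS query.

```python
import sqlite3


def parse_participants(csv_value):
    """Turn 'PID_134,PID_489,' into [134, 489]; empty trailing entries are skipped."""
    prefix = "PID_"
    return [int(token[len(prefix):])
            for token in csv_value.split(",")
            if token.startswith(prefix)]


def make_groups_demo_db():
    """Hypothetical normalized layout built from the question's sample string."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY);
        CREATE TABLE user_groups (id INTEGER PRIMARY KEY);
        CREATE TABLE group_members (
            group_id INTEGER, user_id INTEGER, PRIMARY KEY (group_id, user_id));
        INSERT INTO users VALUES (7), (13), (134), (489), (4784);
        INSERT INTO user_groups VALUES (1), (2);
    """)
    # One-off migration of the old comma-separated column into junction rows.
    members = parse_participants("PID_134,PID_489,PID_4784,")
    conn.executemany(
        "INSERT INTO group_members (group_id, user_id) VALUES (1, ?)",
        [(user_id,) for user_id in members])
    conn.commit()
    return conn


def users_not_in_group(conn, group_id):
    """IDs of all users that have no membership row for the given group."""
    rows = conn.execute("""
        SELECT u.id
        FROM users u
        WHERE NOT EXISTS (
            SELECT 1 FROM group_members m
            WHERE m.group_id = ? AND m.user_id = u.id)
        ORDER BY u.id
    """, (group_id,)).fetchall()
    return [row[0] for row in rows]
```

The composite primary key on (group_id, user_id) also prevents the same user from being added to a group twice, which the string column cannot guarantee.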
Consider the following:
Table - id, parentid
What I'd like to do is pull all the children (not only direct children, but all of them, i.e. children of children of children etc.) of a specific parent.
So let's say the table contains the following rows: (2, 1), (3, 1), (4, 2), (5, 4)
Then for parentid = 1, the query would return ids 2, 3, 4 AND 5.
Is this possible?
If not (and I guess it's indeed not possible), what are my options?
I really don't want to use dozens of queries...
P.S. I can't change the database structure.
Also, as there might be hundreds of thousands of records in the table, I can't pull them all and do the whole thing using PHP instead.
This might help:
$parentId = 1; // the parent id
$arrAllChild = Array(); // array that will store all children
while (true) {
$arrChild = Array(); // array for storing children in this iteration
$q = 'SELECT `id` FROM `table` WHERE `parentid` IN (' . $parentId . ')';
$rs = mysql_query ($q);
while ($r = mysql_fetch_assoc($rs)) {
$arrChild[] = $r['id'];
$arrAllChild[] = $r['id'];
}
if (empty($arrChild)) { // break if no more children found
break;
}
$parentId = implode(',', $arrChild); // generate comma-separated string of all children and execute the query again
}
print_r($arrAllChild);
You may as well use recursion to do so but I think the above will need fewer iterations.
Hope it helps!
EDIT - I forgot to mention that you can also implement the same logic in a MySQL stored procedure, except that you can't use arrays. The above example is implemented in PHP, as you might have already guessed.
Not in one step.
I have done recursive queries in MySQL using PHP... looping through one level, collecting the data, modifying the query to use the results brought back in the last iteration, running the query again, etc.
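MySQL is not very friendly for this sort of thing: before version 8.0, which added recursive common table expressions (WITH RECURSIVE), it had no single-query way to do it. MSSQL, Oracle, and PostgreSQL support it in a single query.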
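On engines that do support recursive common table expressions, the whole subtree can be fetched in one statement. Here is a minimal sketch in Python with SQLite as the engine; the table name tree and the function names are placeholders, and the rows are the ones from the question. Whether this is usable depends on the database version available, so treat it as an illustration of the technique rather than a drop-in fix.

```python
import sqlite3


def make_tree_demo_db():
    """The (id, parentid) rows from the question in a placeholder table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tree (id INTEGER PRIMARY KEY, parentid INTEGER);
        INSERT INTO tree VALUES (2, 1), (3, 1), (4, 2), (5, 4);
    """)
    return conn


def descendants(conn, parent_id):
    """All children, grandchildren, and so on of parent_id, in one query."""
    rows = conn.execute("""
        WITH RECURSIVE sub(id) AS (
            -- anchor: direct children of the requested parent
            SELECT id FROM tree WHERE parentid = ?
            -- UNION (not UNION ALL) drops rows already seen, so a cycle
            -- in the data cannot make the recursion run forever
            UNION
            SELECT t.id FROM tree t JOIN sub s ON t.parentid = s.id
        )
        SELECT id FROM sub ORDER BY id
    """, (parent_id,)).fetchall()
    return [row[0] for row in rows]
```

An index on parentid is what keeps each recursion step cheap when the table has hundreds of thousands of rows.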
Here is a query I wrote just now for a similar problem:
select if(e.id is not null, e.id, if(d.id is not null, d.id, if(c.id is not null, c.id, if(b.id is not null, b.id, a.id)))) as ID
from groups a
left join groups b on b.parent = a.id
left join groups c on c.parent = b.id
left join groups d on d.parent = c.id
left join groups e on e.parent = d.id
where a.parent = SOMETOPLEVELPARENTIDHERE;
This approach does have a fixed depth limit. I know from my own data that it happens to span at most five levels of depth. If the depth is fairly stable you can accommodate some growth by simply adding more left joins. Also, not sure how the query will perform with hundreds of thousands of records.