SQL issue: one to many relationship and EAV model - mysql

Good evening guys,
I'm new to web programming and need help with an SQL query.
I'm using MySQL, accessed via PHP. Below is a simplified version of my database, just to fix ideas.
Suppose the database contains three tables: teams, teams_information and attributes. More precisely:
1) teams contains basic information about Italian football teams (soccer, not American football :D). It has three fields: id (int, primary key), name (varchar, team name), nickname (varchar, team nickname);
2) attributes lists the kinds of information that can be recorded about a football team, such as city (the city where the team plays its home matches), captain (the team captain's full name), f_number (number of fans) and so on. It has three fields: id (int, primary key), attribute_name (varchar, an identifier for the attribute), attribute_desc (text, an explanation of what the attribute means). Each record represents a single possible attribute of a football team;
3) teams_information holds the information available about the teams listed in the teams table. It has four fields: id (int, primary key), team_id (int, a foreign key identifying a team), attribute_id (int, a foreign key identifying one of the attributes listed in the attributes table), attribute_value (varchar, the value of the attribute). Each record represents a single attribute of a single team. Different teams generally have different amounts of information: many attributes are available for some teams and only a few for others.
Note that the relationship between teams and teams_information is one-to-many, and so is the relationship between attributes and teams_information.
Given this model, I want to build a grid (maybe with ExtJS 4.1) that shows the user the list of Italian football teams. Each record of the grid represents a single team and contains every possible attribute: some fields may be empty (because the corresponding attribute is unknown for that team), while the others contain the values stored in the teams_information table for that team.
So the grid's fields are: id, team_name and one field for each of the attributes listed in the attributes table.
My question is: can I build such a grid with a SINGLE SQL query (maybe a suitable SELECT that fetches all the data I need from the tables)?
Can anyone suggest how to write such a query (if it exists)?
Thanks in advance for helping me.
Regards.
Enrico.

The short answer to your question is no, there is no simple construct in MySQL to achieve the result set you are looking for.
But it is possible to carefully (painstakingly) craft such a query. Here is an example, I trust you will be able to decipher it. Basically, I'm using correlated subqueries in the select list, for each attribute I want returned.
SELECT t.id
     , t.name
     , t.nickname
     , ( SELECT v1.attribute_value
           FROM teams_information v1
           JOIN attributes a1
             ON a1.id = v1.attribute_id AND a1.attribute_name = 'city'
          WHERE v1.team_id = t.id ORDER BY 1 LIMIT 1
       ) AS city
     , ( SELECT v2.attribute_value
           FROM teams_information v2 JOIN attributes a2
             ON a2.id = v2.attribute_id AND a2.attribute_name = 'captain'
          WHERE v2.team_id = t.id ORDER BY 1 LIMIT 1
       ) AS captain
     , ( SELECT v3.attribute_value
           FROM teams_information v3 JOIN attributes a3
             ON a3.id = v3.attribute_id AND a3.attribute_name = 'f_number'
          WHERE v3.team_id = t.id ORDER BY 1 LIMIT 1
       ) AS f_number
  FROM teams t
 ORDER BY t.id
For 'multi-valued' attributes, you'd have to pull each instance of the attribute separately. (Use the LIMIT to specify whether you are retrieving the first one, the second one, etc.)
     , ( SELECT v4.attribute_value
           FROM teams_information v4 JOIN attributes a4
             ON a4.id = v4.attribute_id AND a4.attribute_name = 'nickname'
          WHERE v4.team_id = t.id ORDER BY 1 LIMIT 0,1
       ) AS nickname_1st
     , ( SELECT v5.attribute_value
           FROM teams_information v5 JOIN attributes a5
             ON a5.id = v5.attribute_id AND a5.attribute_name = 'nickname'
          WHERE v5.team_id = t.id ORDER BY 1 LIMIT 1,1
       ) AS nickname_2nd
     , ( SELECT v6.attribute_value
           FROM teams_information v6 JOIN attributes a6
             ON a6.id = v6.attribute_id AND a6.attribute_name = 'nickname'
          WHERE v6.team_id = t.id ORDER BY 1 LIMIT 2,1
       ) AS nickname_3rd
(I use nickname as an example here because American soccer clubs frequently have more than one nickname, e.g. Chicago Fire Soccer Club has the nicknames 'The Fire', 'La Máquina Roja', 'Men in Red', 'CF97', et al.)
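To see the LIMIT offset technique run end to end, here is a minimal sketch in Python. It uses the built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example is self-contained; the `LIMIT offset, count` form is accepted by both). The teams and nicknames are made-up sample rows, and nickname is modelled as a multi-valued attribute stored in teams_information. The helper `nickname_column` is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE attributes (id INTEGER PRIMARY KEY, attribute_name TEXT);
CREATE TABLE teams_information (
    id INTEGER PRIMARY KEY, team_id INTEGER, attribute_id INTEGER, attribute_value TEXT);
-- Hypothetical sample rows, for illustration only.
INSERT INTO teams VALUES (1, 'Team A'), (2, 'Team B');
INSERT INTO attributes VALUES (1, 'nickname');
INSERT INTO teams_information VALUES
    (1, 1, 1, 'Reds'), (2, 1, 1, 'Lions'), (3, 2, 1, 'Blues');
""")

def nickname_column(offset):
    # One correlated subquery per instance: skip `offset` values, return one.
    return f"""(SELECT v.attribute_value
                  FROM teams_information v
                  JOIN attributes a
                    ON a.id = v.attribute_id AND a.attribute_name = 'nickname'
                 WHERE v.team_id = t.id
                 ORDER BY 1 LIMIT {int(offset)}, 1)"""

sql = f"""SELECT t.id, t.name,
                 {nickname_column(0)} AS nickname_1st,
                 {nickname_column(1)} AS nickname_2nd
            FROM teams t ORDER BY t.id"""
rows = conn.execute(sql).fetchall()
print(rows)
```

Because of ORDER BY 1, "first" and "second" here mean alphabetical order of the values, not insertion order. A team with fewer values than requested simply gets NULL in the extra columns.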
NOT AN ANSWER TO YOUR QUESTION, BUT ...
Have I mentioned, numerous times before, how much I dislike working with EAV database implementations? What should IMO be a very simple query turns into an overly complicated beast of a potentially light-dimming query (LDQ).
Wouldn't it be much simpler to create a table where each "attribute" is a separate column? Then queries to return reasonable result sets would look more reasonable...
SELECT id, name, nickname, city, captain, f_number, ... FROM team
But what really makes me shudder is the prospect that some developer is going to decide that the LDQ should be "hidden" in the database as a view, to enable the "simpler" query.
If you go this route, PLEASE PLEASE PLEASE resist any urge you may have to store this query in the database as a view.

I'm going to take a slightly different route. Spencer's answer is fantastic, and it addresses the issue quite well, but there's still a large underlying problem.
The data that you are trying to display on the site is over-normalized in the database. I won't elaborate, since, again, Spencer's answer highlights the issue pretty well.
Rather, I'd like to recommend a solution that denormalizes the data a bit.
Convert all of your Team data into a single table with many columns. (If there is Player data that isn't covered in the question, that would be a second table, but I'll gloss over that for now.)
Sure, you'll have a whole bunch of columns, and a lot of the columns might be NULL for a lot of the rows. It's not normalized, and it's not pretty, but here's the huge advantage that you gain.
Your query becomes:
SELECT * FROM Teams
That's it. That gets displayed right to the website and you are done. You might have to go out of your way to realize this schema, but it would be totally worth the time investment.
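As a rough idea of what the one-off migration could look like, here is a sketch in Python using the built-in sqlite3 module in place of MySQL (an assumption, to keep it runnable on its own). It adds one column per row of the attributes table and copies the values across. The sample rows are made up, every new column is assumed to be plain text, and a real migration would need to choose proper column types.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT, nickname TEXT);
CREATE TABLE attributes (id INTEGER PRIMARY KEY, attribute_name TEXT);
CREATE TABLE teams_information (
    id INTEGER PRIMARY KEY, team_id INTEGER, attribute_id INTEGER, attribute_value TEXT);
-- Hypothetical sample rows, for illustration only.
INSERT INTO teams VALUES (1, 'Team A', 'Alphas'), (2, 'Team B', 'Betas');
INSERT INTO attributes VALUES (1, 'city'), (2, 'captain');
INSERT INTO teams_information VALUES
    (1, 1, 1, 'Torino'), (2, 1, 2, 'Mario Rossi'), (3, 2, 1, 'Roma');
""")

for attr_id, name in conn.execute(
        "SELECT id, attribute_name FROM attributes ORDER BY id").fetchall():
    # Column names cannot be bound as parameters, so allow plain identifiers only.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError(f"unsafe attribute name: {name!r}")
    conn.execute(f"ALTER TABLE teams ADD COLUMN {name} TEXT")
    # Copy the value over; teams without this attribute are left as NULL.
    conn.execute(
        f"UPDATE teams SET {name} = (SELECT ti.attribute_value "
        "FROM teams_information ti "
        "WHERE ti.team_id = teams.id AND ti.attribute_id = ? LIMIT 1)",
        (attr_id,))

rows = conn.execute("SELECT * FROM teams ORDER BY id").fetchall()
print(rows)
```

After this, the grid query really is a plain SELECT on teams, with NULL wherever a team had no value for an attribute.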

I think what you're saying is that you want the rows in the attributes table to appear as columns in the result recordset. If this is correct, then in SQL you would use PIVOT.
A quick search on SO seems to indicate that there is no PIVOT equivalent in MySQL.
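Without a PIVOT keyword, a commonly used workaround is conditional aggregation: group by team and pick each attribute out with MAX(CASE ... END). The sketch below runs it in Python with the built-in sqlite3 module standing in for MySQL (an assumption for the sake of a self-contained example; the SELECT itself uses only standard SQL). The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE attributes (id INTEGER PRIMARY KEY, attribute_name TEXT);
CREATE TABLE teams_information (
    id INTEGER PRIMARY KEY, team_id INTEGER, attribute_id INTEGER, attribute_value TEXT);
-- Hypothetical sample rows, for illustration only.
INSERT INTO teams VALUES (1, 'Team A'), (2, 'Team B'), (3, 'Team C');
INSERT INTO attributes VALUES (1, 'city'), (2, 'captain');
INSERT INTO teams_information VALUES
    (1, 1, 1, 'Torino'), (2, 1, 2, 'Mario Rossi'), (3, 2, 1, 'Roma');
""")

rows = conn.execute("""
SELECT t.id, t.name,
       MAX(CASE WHEN a.attribute_name = 'city'    THEN ti.attribute_value END) AS city,
       MAX(CASE WHEN a.attribute_name = 'captain' THEN ti.attribute_value END) AS captain
  FROM teams t
  LEFT JOIN teams_information ti ON ti.team_id = t.id    -- keep teams with no attributes
  LEFT JOIN attributes a ON a.id = ti.attribute_id
 GROUP BY t.id, t.name
 ORDER BY t.id
""").fetchall()
print(rows)
```

The column list is still fixed in the query text, so a new attribute means a new MAX(CASE ...) line. The LEFT JOINs make sure a team with no recorded attributes still shows up, with NULLs.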

I wrote a simple PHP script to generalize spencer's idea to solve my issue.
Here's the code:
<?php
require_once('includes/db.config.php'); //this file performs connection to mysql
/*
* Following function requires a table name ($table)
* and a number of service fields ($num). Given those parameters
* it returns the number of table fields (excluding service fields).
*/
function get_fields_number($table,$num,$conn)
{
$query = "SELECT * FROM $table";
$result = mysql_query($query,$conn);
return mysql_num_fields($result)-$num; //remember there are $num service fields
}
/*
* Following function requires a table name ($table) and an array
* containing a list of service fields names. Given those parameters,
* it returns the list of field names. That list is contained within an array and
* service fields are excluded.
*/
function get_fields_name($table,$service,$conn)
{
$query = "SELECT * FROM $table";
$result = mysql_query($query,$conn);
$name = array(); //Array to be returned
for ($i=0;$i<mysql_num_fields($result);$i++)
{
if(!in_array(mysql_field_name($result,$i),$service))
{
//currently selected field is not a service field
$name[] = mysql_field_name($result,$i);
}
}
return $name;
}
//Below $conn is db connection created in 'db.config.php'
$query = "SELECT `name` FROM `detail_arg` WHERE visibility = 0";
$res = mysql_query($query,$conn);
if($res===false)
{
$err_msg = mysql_real_escape_string(mysql_error($conn));
echo "{success:false,data:'".$err_msg."'}";
die();
}
$arg = array(); //list of argument names
while($row = mysql_fetch_assoc($res))
{
$arg[] = $row['name'];
}
//Following function writes the select subquery which is
//necessary to build a column containing a single attribute.
function make_subquery($attribute) //$attribute contains attribute name
{
$query = "";
$query.="(SELECT incident_detail.arg_value ";
$query.="FROM incident_detail ";
$query.="INNER JOIN detail_arg ";
$query.="ON incident_detail.arg_id = detail_arg.id AND detail_arg.name='".$attribute."' ";
$query.="WHERE incident.id = incident_detail.incident_id) ";
$query.="AS $attribute";
return $query;
}
/*
echo make_subquery("date"); //debug code
*/
$subquery = array(); //list of subqueries
for($i=0;$i<count($arg);$i++)
{
$subquery[] = make_subquery($arg[$i]);
}
$query = "SELECT "; //final query containing subqueries
$fields = get_fields_name("incident",array("id","visibility"),$conn);
//list of 'incident' table's fields
for($i=0;$i<count($fields);$i++)
{
$query.="incident.".$fields[$i].", ";
}
//insert the subqueries
$sub = implode(", ", $subquery); //separator first: the legacy argument order was removed in PHP 8
$query .= $sub;
$query.=" FROM incident ORDER BY incident.id";
echo $query;
?>
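The same idea, sketched in Python for comparison (sqlite3 replaces MySQL here purely so it runs on its own; that substitution and the sample rows are assumptions, and `build_query` is a hypothetical name). Compared with the script above, attribute names are validated before being used as column aliases, the name comparison uses bound parameters, and each subquery has LIMIT 1 so a duplicated attribute cannot make it return more than one row.

```python
import re
import sqlite3

def build_query(attribute_names):
    """Return (sql, params): one correlated subquery column per attribute."""
    columns, params = ["incident.id"], []
    for name in attribute_names:
        # The name becomes a column alias, which cannot be a bound parameter,
        # so accept plain identifiers only.
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
            raise ValueError(f"unsafe attribute name: {name!r}")
        columns.append(
            "(SELECT d.arg_value FROM incident_detail d "
            "JOIN detail_arg a ON a.id = d.arg_id AND a.name = ? "
            "WHERE d.incident_id = incident.id LIMIT 1) AS " + name)
        params.append(name)
    sql = "SELECT " + ", ".join(columns) + " FROM incident ORDER BY incident.id"
    return sql, params

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE incident (id INTEGER PRIMARY KEY, visibility INTEGER);
CREATE TABLE detail_arg (id INTEGER PRIMARY KEY, name TEXT, visibility INTEGER);
CREATE TABLE incident_detail (incident_id INTEGER, arg_id INTEGER, arg_value TEXT);
-- Hypothetical sample rows, for illustration only.
INSERT INTO incident VALUES (1, 0), (2, 0);
INSERT INTO detail_arg VALUES (1, 'location', 0), (2, 'severity', 0);
INSERT INTO incident_detail VALUES (1, 1, 'north gate'), (1, 2, 'high'), (2, 1, 'lobby');
""")

names = [row[0] for row in conn.execute(
    "SELECT name FROM detail_arg WHERE visibility = 0 ORDER BY id")]
sql, params = build_query(names)
rows = conn.execute(sql, params).fetchall()
print(rows)
```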

Related

MySQL query from an array with limit and order by

I am not a professional programmer, but I assist a school in automating their assessments. I have a list of just over 1000 students with 3 assessment scores for each one every year, and I need to create a list with the average of these three scores in descending order, limited to the top 30. I can calculate averages and display the results, but I can't sort or limit. In the first part of the code, I select the IDs of all students for the current year ($IDanoatual) and store them in the array $alunos[]. In the second part, I use a for loop to calculate the average of these grades for each student and display them. Both pieces of code query the same table (audp_l_notasfinais). I tried using the foreach statement to filter and sort, but I couldn't resolve this issue.
$sela = "select id_aluno from audp_l_notasfinais
where id_ano = '$IDanoatual'
";
$qsela = mysqli_query($conn,$sela);
$contasel = mysqli_num_rows($qsela);
while ($row = mysqli_fetch_assoc($qsela)){
$alunos[] = $row['id_aluno'];
}
for ($i=0; $i<$contasel; $i++){
$selnotasA = "select avg(NULLIF(nts.mef,0)) as NtutA, aln.stdname Naln
from audp_l_notasfinais nts
inner join audp_c_alunos aln on aln.id_alunos = nts.id_aluno
where nts.id_aluno = '$alunos[$i]' and nts.id_ano='$IDanoatual'
";
$qrynmal = mysqli_query($conn,$selnotasA);
while ($row=mysqli_fetch_assoc($qrynmal)){
echo "Name: ".$row['Naln']." - Average: ".$row['NtutA']."<br>";
}
}
You have not included much detail in your question. Adding your CREATE TABLE statements and some sample data in markdown tables would help, and get a better response.
It looks like $IDanoatual could be coming from user input, in which case you really need to read about and understand SQL Injection and how to mitigate the risk with prepared statements.
Best guess -
select aln.id_alunos, aln.stdname Naln, avg(NULLIF(nts.mef,0)) as NtutA
from audp_c_alunos aln
inner join audp_l_notasfinais nts
on aln.id_alunos = nts.id_aluno
and nts.id_ano = '$IDanoatual'
group by aln.id_alunos
order by NtutA desc
limit 30;
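Here is a small runnable sketch of that query with the year passed as a bound parameter instead of being spliced into the string. It is written in Python with the built-in sqlite3 module rather than PHP and MySQL (an assumption, so it can run without a server), the sample rows are made up, and id_ano is assumed to be an integer year. `top_students` is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE audp_c_alunos (id_alunos INTEGER PRIMARY KEY, stdname TEXT);
CREATE TABLE audp_l_notasfinais (id_aluno INTEGER, id_ano INTEGER, mef REAL);
-- Hypothetical sample rows, for illustration only.
INSERT INTO audp_c_alunos VALUES (1, 'Student A'), (2, 'Student B'), (3, 'Student C');
INSERT INTO audp_l_notasfinais VALUES
    (1, 2024, 8), (1, 2024, 6), (1, 2024, 0),
    (2, 2024, 9), (2, 2024, 9), (2, 2024, 9),
    (3, 2023, 10);
""")

def top_students(conn, year_id, limit=30):
    # One grouped query replaces the per-student loop; values are bound, not concatenated.
    return conn.execute("""
        SELECT aln.id_alunos, aln.stdname AS Naln, AVG(NULLIF(nts.mef, 0)) AS NtutA
          FROM audp_c_alunos aln
          JOIN audp_l_notasfinais nts
            ON aln.id_alunos = nts.id_aluno AND nts.id_ano = ?
         GROUP BY aln.id_alunos, aln.stdname
         ORDER BY NtutA DESC
         LIMIT ?
    """, (year_id, limit)).fetchall()

ranking = top_students(conn, 2024, limit=2)
print(ranking)
```

NULLIF(nts.mef, 0) turns a score of 0 into NULL, so AVG ignores it: Student A's average is taken over 8 and 6 only.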

MySQL n-part Query for 1-n Relationship

(Just started learning SQL a few days ago so sorry if this is a stupid question!)
I have three tables, Users, Addresses, and AddressCategories. Each User has multiple Addresses, but no more than 1 Address per AddressCategory. I would like to make a single query that searches for Users based on different criteria for each AddressCategory.
Table structure looks like:
Users:
id
1
2
AddressCategories:
category
HomeAddress
WorkAddress
Addresses:
userId  category     address
1       HomeAddress  1 Washington Street
1       WorkAddress  53 Elm Avenue
2       HomeAddress  7 Bernard Street
Let's say I want to search for all users whose home address contains the word "Street" and work address contains the word "Avenue". I can use the query:
SELECT * FROM Users
INNER JOIN Addresses a1 ON Users.id=a1.userId
INNER JOIN Addresses a2 ON Users.id=a2.userId
WHERE a1.category='HomeAddress' AND a1.address LIKE '%Street%'
AND a2.category='WorkAddress' AND a2.address LIKE '%Avenue%'
If I want to query across an arbitrary number of AddressCategories, I can dynamically build a query using the same principle above:
// dictionary of query parts
const q_parts = {
  HomeAddress: 'Street',
  WorkAddress: 'Avenue',
  // ...
};
// build the query string piece by piece
let q_str1 = "", q_str2 = "";
let i = 0;
for (const q in q_parts) {
  i++;
  // template literals need backticks for ${...} to be interpolated
  q_str1 += `INNER JOIN Addresses a${i} ON Users.id=a${i}.userId `;
  q_str2 += (i === 1) ? "WHERE " : "AND ";
  q_str2 += `a${i}.category='${q}' AND a${i}.address LIKE '%${q_parts[q]}%' `;
}
// complete query string
const q_str = "SELECT * FROM Users " + q_str1 + q_str2;
The way I'm doing it now works, but it's easy to make a mistake building the query string and the final string quickly becomes enormous as the number of categories grows. Seems like there must be a better way. What is the right way to perform such queries in MySQL? (Or is there a problem with how I've organized my tables?)
You can use a query builder such as Knex for this.
Official site: https://knexjs.org/
Npm link: https://www.npmjs.com/package/knex
Here is a sample query that avoids joining many times. It is not tested; it is just an idea.
You can use CASE WHEN ... THEN in the WHERE clause to check each category against its own pattern, and then filter on the number of matching categories per user (GROUP BY).
SELECT *
FROM Users
INNER JOIN
  (SELECT userId,
          count(category) AS categoryCount
   FROM Addresses
   WHERE address LIKE CASE
           WHEN category = 'HomeAddress' THEN '%Street%'
           WHEN category = 'WorkAddress' THEN '%Avenue%'
         END
   GROUP BY userId) a ON Users.id = a.userId
WHERE categoryCount = ? -- inject the number of categories you are filtering on here
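A variation on the same counting idea, as a runnable sketch: one OR-ed condition per category, then keep the users who matched all of them. It is written in Python with the built-in sqlite3 module instead of MySQL (an assumption so that it is self-contained), uses the sample rows from the question, and binds every value as a parameter. `find_users` is a hypothetical name.

```python
import sqlite3

def find_users(conn, criteria):
    """criteria maps category -> substring the address in that category must contain."""
    if not criteria:
        raise ValueError("at least one criterion is required")
    conditions = " OR ".join("(category = ? AND address LIKE ?)" for _ in criteria)
    params = []
    for category, word in criteria.items():
        params += [category, f"%{word}%"]
    sql = ("SELECT userId FROM Addresses WHERE " + conditions +
           " GROUP BY userId HAVING COUNT(DISTINCT category) = ? ORDER BY userId")
    # A user qualifies only if every requested category produced a match.
    return [row[0] for row in conn.execute(sql, params + [len(criteria)])]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (id INTEGER PRIMARY KEY);
CREATE TABLE Addresses (userId INTEGER, category TEXT, address TEXT);
INSERT INTO Users VALUES (1), (2);
INSERT INTO Addresses VALUES
    (1, 'HomeAddress', '1 Washington Street'),
    (1, 'WorkAddress', '53 Elm Avenue'),
    (2, 'HomeAddress', '7 Bernard Street');
""")

both = find_users(conn, {"HomeAddress": "Street", "WorkAddress": "Avenue"})
home_only = find_users(conn, {"HomeAddress": "Street"})
print(both, home_only)
```

The query text grows by one short condition per category instead of one join, and the values never end up inside the SQL string.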

How to retrieve, values, names and group names in one SQL query

I used to have an EAV schema with 4 tables in MySQL 5.7:
1. articles
2. article attributes
3. attribute names
4. attribute group names
After running into huge complexity, I learned from another question that this is not a good schema. So I got rid of table 2, where all the attributes were stored, and saved them directly into table 1, either as values or as value IDs, as the STI model suggests.
Now I ended up with 3 tables:
1. articles
2. attribute names
3. attribute group names
At first it looked like it made my life easier, but while trying to replace a simple query that fetched all attribute group names and attribute names of a specific article, I realized that this is not ideal either.
My previous query looked like this:
SELECT
cag.name_de,
cag.attr_group_id,
attr.attr_de,
attr.attr_id
FROM
articles_attr aa,
cat_attr attr,
cat_attr_groups cag
WHERE
aa.article_id = '181206'
AND aa.attr_id = attr.attr_id
AND cag.attr_group_id = attr.attr_group_id
Now with the new schema, I would need at least this:
Get all group names like e.g. "color"
SELECT
name_de,
attr_group_id
FROM
cat_attr_groups
Get all indirect values which have an ID like e.g. "green"
SELECT
attr.attr_group_id,
attr.attr_de
FROM
articles a,
cat_attr attr
WHERE
a.article_id = '181206'
AND (
(a.dial_c_id = attr.attr_id)
OR (a.dial_n_id = attr.attr_id)
OR (a.bracelet_color_id = attr.attr_id)
)
// pseudo code
$attr[$row->attr_group_id] = $row->attr_de;
Get all direct values:
SELECT
jewels,
vibrations
FROM
articles a
WHERE
a.article_id = '181206'
// pseudo code
$attr[4] = $row->jewels;
Map group names with group ids
foreach($attr AS $key => $value){
// somehow
}
This does not seem very elegant. How could I design my schema better, or how could these queries be rewritten to retrieve the values in an acceptable query time?

Database Design - Multilingual Website

I have a multilingual website whose users can have their profile in different languages. For example, each user could have their profile published in "English", "French" and "Spanish", something like LinkedIn.
Now, say I'm a user viewing the website in "English". When I go to another member's profile page, I should see that profile in "English"; if it is not available in that language, I should see it in that member's "main_lang".
So I have a "members" table with a "profile_published_langs" column, which stores the languages each member has published their profile in as a comma-separated list ("english,spanish"), and a "main_lang" column, which is the member's main language (the profile is definitely published in main_lang, since we ask for those details at sign-up).
On the other hand, the members' details are stored in separate tables per language, such as "members_details_english", "members_details_spanish" and "members_details_french".
I would like to do this with a join, but that doesn't seem possible with the structure I have. Currently I need 2 queries to load a member's details in the scenario above. My code in "members_profile.php" is:
// FIRST QUERY
$check_member = mysql_query("SELECT main_lang, profile_published_langs FROM members WHERE id = '$this_user_id'");
while($row = mysql_fetch_array($check_member)){
$this_main_lang = $row['main_lang'];
$this_profile_published_langs = $row['profile_published_langs'];
}
$this_profile_published_langs_arr = explode(',', $this_profile_published_langs);
if(!in_array($lang, $this_profile_published_langs_arr)) $lang = $this_main_lang;
$details_table = 'members_details_' . $lang;
// SECOND QUERY
$get_details = mysql_query("SELECT * FROM $details_table WHERE member_id = '$this_user_id'");
while($row_details = mysql_fetch_array($get_details)){
//blah blah
}
Is there any better way to achieve this? maybe someway to query once and not twice? any better database structure for this scenario?
I would appreciate any kind of help
Try considering language as just another entity in the database. Use the table members to store all data that does not depend on the language, and have a second table for the data that does: members_i18n, short for members_internationalization.
In the first table you can have a column called main_language_id and use the second table to store columns for data in different languages for each member by relating to it with member_id and language_id. This way you can fetch data for each member in all languages, just their main language, or any specific set of languages you need.
Plus, you won't need to use serialized data in your tables like profile_published_langs.
So a few example queries would be:
-- Main language
SELECT *
FROM member AS m
JOIN member_i18n AS mi
ON m.member_id = mi.member_id
AND m.main_language_id = mi.language_id
-- Specific language
SELECT *
FROM member AS m
JOIN member_i18n AS mi
ON m.member_id = mi.member_id
WHERE mi.language_id = 'eng'
-- All languages
SELECT *
FROM member AS m
JOIN member_i18n AS mi
ON m.member_id = mi.member_id
EDIT:
Personally, I usually use a third table with languages that looks like this:
CREATE TABLE `language` (
`language_id` char(3) NOT NULL,
`name` varchar(30) NOT NULL,
`code2` char(2) NOT NULL,
PRIMARY KEY (`language_id`)
);
-- Sample data
INSERT INTO `language` (`language_id`, `name`, `code2`) VALUES
('deu', 'Deutsch', 'de'),
('eng', 'English', 'en');
I found it to be very useful when printing out multilingual data.
EDIT 2:
So to fetch data for a user in the "current" language and their main language, just write a single query like this:
-- Current language (i.e. 'eng') + member's main language
SELECT *
FROM member AS m
JOIN member_i18n AS mi
ON m.member_id = mi.member_id
WHERE mi.language_id = m.main_language_id
OR mi.language_id = 'eng'
You'll end up with one or two rows, depending on the member's profile.
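If exactly one row is wanted (the requested language when it exists, otherwise the main language), the same query can be narrowed with ORDER BY and LIMIT 1. Below is a sketch of that in Python with the built-in sqlite3 module standing in for MySQL (an assumption, to keep it self-contained). The `bio` column and the sample rows are made up, and `profile_for` is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE member (member_id INTEGER PRIMARY KEY, main_language_id TEXT);
CREATE TABLE member_i18n (member_id INTEGER, language_id TEXT, bio TEXT);
-- Hypothetical sample rows; `bio` is a made-up language-dependent column.
INSERT INTO member VALUES (1, 'deu'), (2, 'deu');
INSERT INTO member_i18n VALUES
    (1, 'deu', 'Hallo'), (1, 'eng', 'Hello'), (2, 'deu', 'Guten Tag');
""")

def profile_for(conn, member_id, lang):
    return conn.execute("""
        SELECT mi.language_id, mi.bio
          FROM member AS m
          JOIN member_i18n AS mi ON m.member_id = mi.member_id
         WHERE m.member_id = ?
           AND (mi.language_id = ? OR mi.language_id = m.main_language_id)
         ORDER BY (mi.language_id = ?) DESC   -- requested language sorts first
         LIMIT 1
    """, (member_id, lang, lang)).fetchone()

translated = profile_for(conn, 1, "eng")   # member 1 has an English profile
fallback = profile_for(conn, 2, "eng")     # member 2 does not, so main language is used
print(translated, fallback)
```

The comparison in ORDER BY evaluates to 1 or 0, so the row in the requested language, if any, sorts ahead of the main-language row.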
You can have a table members (id_user, mainlang, firstname, ...) and a table profile (id_user, language, and all data in that language), so that each user has one row per language. Your select then looks like this:
select * from profile where id_user = $userId and language = $lang
I would probably build a user table, a language table and a mapping table:
user_main - all the usual columns, with a main language column here
language table - whatever you want to keep (code etc.), plus a column holding the
details table name (equivalent to your user_details_spanish etc. tables)
mapping table - user id, language id (a row here means that that
particular user has a profile in that language)
Now, I would agree that you might still need two queries, but I think it's much more manageable, since the table name is available from the language table and you might keep it in the session (e.g. where you let the user choose from a drop-down which language to switch to, the table name can be available right there, so that you don't have to fetch it)...
Let me know if this direction of thought helps.. perhaps I can help further...

Per-row dynamic sql

I have a database representing something like a bookstore. There's a table containing the categories that books can be in. Some categories are defined simply using another table that contains the category-item relationships. But there are also some categories that can be defined programmatically -- a category for a specific author can be defined using a query (SELECT item_id FROM items WHERE author = "John Smith"). So my categories table has a "query" column; if it's not null, I use this to get the items in the category, otherwise I use the category_items table.
Currently, I have the application (PHP code) make this decision, but this means lots of separate queries when we iterate over all the categories. Is there some way to incorporate this dynamic SQL into a join? Something like:
SELECT c.category, IF(c.query IS NULL, count(i.items), count(EXECUTE c.query))
FROM categories c
LEFT OUTER JOIN category_items i
ON c.category = i.category
EXECUTE requires a prepared statement, but I need to prepare a different statement for each row. Also, EXECUTE can't be used in expressions, it's just a toplevel statement. Suggestions?
What happens when you want to list books by publisher? Country? Language? You'd have to throw them all into a single "category_items" table. How would you pick which dynamic query to execute? The query-within-a-query method is not going to work.
I think your concept of "category" is too broad, which is resulting in overly complicated SQL. I would replace "category" to represent only "genre" (for books). Genres are defined in their own table, and item_genres connects them to the items table. Books-by-author and books-by-genre should just be separate queries at the application level, rather than trying to do them both with the same (sort of) query at the database/SQL level. (If you have music as well as books, they probably shouldn't all be stored in a single "items" table because they're different concepts ... have different genres, author vs. artist, etc.)
I know this does not really solve your problem in the way you'd like, but I think you'll be happier not trying to do it that way.
Here's how I finally ended up solving this in the PHP client.
I decided to just keep the membership in the category_items table, and use the dynamic queries during submission to update this table.
This is the function in my script that's called to update an item's categories during submission or updating. It takes a list of user-selected categories (which can only be chosen from categories that don't have dynamic queries), and using this and the dynamic queries it figures out the difference between the categories that an item is currently in and the ones it should be in, and inserts/deletes as necessary to get them in sync. (Note that the actual table names in my DB are not the same as in my question, I was using somewhat generic terms.)
function update_item_categories($dbh, $id, $requested_cats) {
$data = mysql_check($dbh, mysqli_query($dbh, "select id, query from t_ld_categories where query is not null"), 'getting dynamic categories');
$clauses = array();
while ($row = mysqli_fetch_object($data))
$clauses[] = sprintf('select %d cat_id, (%d in (%s)) should_be_in',
$row->id, $id, $row->query);
if (!$requested_cats) $requested_cats[] = -1; // Dummy entry that never matches cat_id
$requested_cat_string = implode(', ', $requested_cats);
$clauses[] = "select c.id cat_id, (c.id in ($requested_cat_string)) should_be_in
from t_ld_categories c
where member_type = 'lessons' and query is null";
$subquery = implode("\nunion all\n", $clauses);
$query = "select c.cat_id cat_id, should_be_in, (member_id is not null) is_in
from ($subquery) c
left outer join t_ld_cat_members m
on c.cat_id = m.cat_id
and m.member_id = $id";
// printf("<pre>$query</pre>");
$data = mysql_check($dbh, mysqli_query($dbh, $query), 'getting current category membership');
$adds = array();
$deletes = array();
while ($row = mysqli_fetch_object($data)) {
if ($row->should_be_in && !$row->is_in) $adds[] = "({$row->cat_id}, $id)";
elseif (!$row->should_be_in && $row->is_in) $deletes[] = "(cat_id = {$row->cat_id} and member_id = $id)";
}
if ($deletes) {
$delete_string = implode(' or ', $deletes);
mysql_check($dbh, mysqli_query($dbh, "delete from t_ld_cat_members where $delete_string"), 'deleting old categories');
}
if ($adds) {
$add_string = implode(', ', $adds);
mysql_check($dbh, mysqli_query($dbh, "insert into t_ld_cat_members (cat_id, member_id) values $add_string"),
"adding new categories");
}
}
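Stripped of the SQL generation, the core of the function above is a set difference between the categories an item should be in and the ones it is in now. A minimal sketch of just that step, in Python with the built-in sqlite3 module (an assumption made so it runs standalone): the `desired` list is assumed to have been computed already from the user's selection and the dynamic queries, and `plan_changes` and `sync_categories` are hypothetical names.

```python
import sqlite3

def plan_changes(current, desired):
    """Return (to_add, to_delete) that turn `current` into `desired`."""
    current, desired = set(current), set(desired)
    return sorted(desired - current), sorted(current - desired)

def sync_categories(conn, member_id, desired):
    current = [row[0] for row in conn.execute(
        "SELECT cat_id FROM t_ld_cat_members WHERE member_id = ?", (member_id,))]
    to_add, to_delete = plan_changes(current, desired)
    # Values are bound as parameters rather than interpolated into the SQL text.
    conn.executemany(
        "INSERT INTO t_ld_cat_members (cat_id, member_id) VALUES (?, ?)",
        [(cat_id, member_id) for cat_id in to_add])
    conn.executemany(
        "DELETE FROM t_ld_cat_members WHERE cat_id = ? AND member_id = ?",
        [(cat_id, member_id) for cat_id in to_delete])
    return to_add, to_delete

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_ld_cat_members (cat_id INTEGER, member_id INTEGER)")
# Hypothetical starting state: item 42 is in categories 1, 2 and 3.
conn.executemany("INSERT INTO t_ld_cat_members VALUES (?, ?)", [(1, 42), (2, 42), (3, 42)])

changes = sync_categories(conn, 42, desired=[2, 3, 5])
remaining = [row[0] for row in conn.execute(
    "SELECT cat_id FROM t_ld_cat_members WHERE member_id = 42 ORDER BY cat_id")]
print(changes, remaining)
```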