sql: selecting a value in an attributes table - mysql

I have a database with the following schema:
thing
    id
    id_thing_type
thing_attribute
    id
    name
    id_thing_type
thing_attribute_value
    id_thing_attribute
    value_date
Thing has many Thing_Attributes joined on thing.id_thing_type = thing_attribute.id_thing_type
Thing_Attributes have one Thing_Attribute_Value joined on thing_attribute.id = thing_attribute_value.id_thing_attribute
I am trying to write a query that will return the value of a specific attribute (with a certain name) for a specific thing record (with a certain id). thing.id represents a unique row.
Said another way, thing_attribute and thing both have an id_thing_type. These tables need to be joined on id_thing_type = id_thing_type. Think of this as each type of thing has its own unique set of thing_attributes. The task is to find the value of a specific attribute of a specific thing.
This is what I have so far, however it returns many rows:
SELECT t.id, tav.value_date
FROM thing_attribute_value tav
JOIN thing_attribute ta
ON tav.id_thing_attribute = ta.id
JOIN thing t
ON ta.id_thing_type = t.id_thing_type
WHERE ta.name = 'Birth Date'
AND t.id = '123'
Here is an example result. As you can see, many rows are returned, all with the same id for thing, but with different dates.
123,2015-12-02
123,2014-11-02
123,2013-07-11
123,2014-03-12
etc....

So, as discussed in the comments to my original answer:
Thing has a Thing_Type_Id column, and Thing_Attribute has a Thing_Type_Id column. These have a 1:1 relationship.
Thing_Attribute has a Thing_Attribute_Value_Id column, and Thing_Attribute_Value has an Id column. These have a 1:many relationship.
This is what is causing your query to fail. The error is actually in your data relationships, not the query.
Let's try to piece this together:
Thing
- needs a unique id
- can have any number of other columns
- will have many attributes, but this column will not be in this table
Thing_Attributes
- needs a unique id
- needs a Thing id (fk to Thing)
- needs a Type id (fk to Thing_Attribute_Type)
- needs a Value
Thing_Attribute_Value - can go away - Replace it with a lookup for Type
Thing_Attribute_Type
- needs unique id
- needs Value
With this relational structure you can have many Attributes for each Thing
You use an id field for Attribute Type so you are not repeating strings.
Use a lookup that gets an Attribute_Type value based on the Id.
This is an overhaul to your current relationship model, but with your current one you cannot isolate a given attribute value to a given Thing, which I think was at least one of the goals in building your Thing objects.
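A minimal sketch of the structure outlined above, using Python's built-in sqlite3 module as a stand-in for MySQL. The table and column names (id_thing, id_type, value), the UNIQUE constraint, and the sample rows are my assumptions for illustration, not your actual schema:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all names and rows here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE thing (id INTEGER PRIMARY KEY);
CREATE TABLE thing_attribute_type (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE thing_attribute (
    id INTEGER PRIMARY KEY,
    id_thing INTEGER NOT NULL REFERENCES thing(id),
    id_type INTEGER NOT NULL REFERENCES thing_attribute_type(id),
    value TEXT NOT NULL,
    UNIQUE (id_thing, id_type)  -- at most one value per attribute per thing
);
INSERT INTO thing (id) VALUES (123), (124);
INSERT INTO thing_attribute_type (id, name) VALUES (1, 'Birth Date'), (2, 'Color');
INSERT INTO thing_attribute (id_thing, id_type, value) VALUES
    (123, 1, '2015-12-02'),
    (123, 2, 'red'),
    (124, 1, '2014-11-02');
""")


def attribute_value(thing_id, attribute_name):
    """Return the single value of one named attribute for one thing, or None."""
    row = conn.execute(
        """
        SELECT ta.value
        FROM thing_attribute ta
        JOIN thing_attribute_type tat ON tat.id = ta.id_type
        WHERE ta.id_thing = ? AND tat.name = ?
        """,
        (thing_id, attribute_name),
    ).fetchone()
    return row[0] if row else None


print(attribute_value(123, "Birth Date"))  # 2015-12-02
```

Because each attribute row carries the id of the thing it belongs to, the lookup returns one row instead of every 'Birth Date' stored for that thing type.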

Related

How to join 2 sql tables where one table contains multiple values in a single column

Currently, this is what my SELECT code looks like:
SELECT student.stu_code, user.f_name, user.l_name
FROM user
INNER JOIN student
ON student.stu_code = user.user_id
INNER JOIN course
ON course.stu_code ?????;
Basically, to elaborate the student table inherits from user table, therefore I had user_id = stu_code. What I'm confused about is how to join course table with student table.
Let's say that the course table has a course code (PK), a few other attributes and a stu_code column, however, the student code column has multiple values inside a single column to represent that multiple students are taking the course and stored as VARCHAR.
Example: Student table has stu_code string value of '123' and course table has a stu_code with string value of '123, 246, 369'.
How would I go about joining these two tables together and separating the stu_code in the course table so that it represents 3 separate stu_code values -> i.e. '123', '246', '369'.
Any help is greatly appreciated!
"however, the student code column has multiple values inside a single column to represent that multiple students are taking the course and stored as VARCHAR."
Your data model is broken. Put your effort into fixing the data model. You want a junction/association table courseStudents or perhaps enrolled, with columns like:
stu_code (foreign key to students)
course_code (foreign key to courses)
enrollment_date
and so on
What is wrong with your data model? Here are a few things:
You are storing numbers as a string.
You are putting multiple values into a string column.
You cannot define foreign key relationships.
SQL has poor string handling capabilities.
SQL has a great way to store lists of things. It is not called "string". It is called "table".
Your data model is hindering you from elegant solutions.
You cannot join your two tables efficiently. Both columns contain strings, but the strings do not follow the same rules: one holds a single code and the other holds a list of codes. You must therefore transform the data in order to join them. There are a few ways to do this; one is to use a regular expression function.
You can use it to test whether the stu_code appears in the list of codes. You can also do this dynamically, constructing the pattern itself from the values on the left and right sides of the join.
join based on REGEXP
SELECT student.stu_code, user.f_name, user.l_name
FROM user
INNER JOIN student
ON student.stu_code = user.user_id
INNER JOIN course
ON course.stu_code REGEXP CONCAT('[[:<:]]', student.stu_code, '[[:>:]]')
Assuming tables and data:
Student
- - - -
stu_code
123
Course
- - - -
stu_code
'123, 246, 369'
Example:
http://sqlfiddle.com/#!9/672b57f/4
about the regular expression
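In MySQL the regex syntax can be a little bit different from what you may be used to. [[:<:]] and [[:>:]] are the word-boundary markers (start of word and end of word) in Henry Spencer's regex notation, which MySQL used before version 8.0.4.
From MySQL 8.0.4 onward the regex engine is ICU, which does not support [[:<:]] and [[:>:]]; use the more typical \b there instead. Recent MariaDB versions also accept \b.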
more about that here : https://dev.mysql.com/doc/refman/8.0/en/regexp.html
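To see why the word boundary matters, here is a small sketch of the same test in Python's re module. This only illustrates the matching logic with \b and is not MySQL's regex engine; the function name takes_course is made up for the example:

```python
import re


def takes_course(stu_code, code_list):
    """True if stu_code appears as a whole token in a comma-separated list."""
    # re.escape guards against codes that contain regex metacharacters.
    pattern = r"\b" + re.escape(stu_code) + r"\b"
    return re.search(pattern, code_list) is not None


print(takes_course("123", "123, 246, 369"))   # True
print(takes_course("23", "123, 246, 369"))    # False: "23" is only part of "123"
print(takes_course("123", "1234, 246, 369"))  # False: "123" is only part of "1234"
```

Without the boundaries, a plain substring search would wrongly match the second and third cases.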
about efficiency
In large datasets the performance will be awful: every row has to be scanned and the function evaluated for each one. In a large set you might get some gains by joining on LIKE first (which is faster than REGEXP). The LIKE quickly filters out rows that cannot match, and the REGEXP then only has to confirm the remaining candidates.
Perhaps your model was based upon an assumption of having a courses table with very few rows?
It is ironic, because you have made your course table unnecessarily large. You would actually be better off with an intermediary table that represents the many-to-many nature (the fact that students can take many courses and courses can have many students) with 1 row per unique relationship. While this table would be an order of magnitude "longer", it would be leaner, it could be indexed, and query performance would be faster.
The courses table does not need to have any awareness of the student list, and thus you can alter courses by removing courses.stu_code once you change the model. (Aside: it might be useful if courses cached a hint of the expected student count for that course.)
possible link table
would be a new table like this (note how it only ever needs these 2 columns)
stu_course_lnk
- - - - - - - -
stu_code  course_id
123       ABC
124       ABC
...
123       XYZ
...
124       LMN
then you add joins of
...
student.stu_code = stu_course_lnk.stu_code
and
stu_course_lnk.course_id = course.id
...
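The link-table joins can be tried end to end with a sketch like the following. It uses Python's sqlite3 module as a stand-in for MySQL; the student names, the course titles, and the helper students_in are invented sample data, and course.id is assumed to be the course's primary key:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; names and titles are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (user_id TEXT PRIMARY KEY, f_name TEXT, l_name TEXT);
CREATE TABLE student (stu_code TEXT PRIMARY KEY REFERENCES user(user_id));
CREATE TABLE course (id TEXT PRIMARY KEY, title TEXT);
CREATE TABLE stu_course_lnk (
    stu_code TEXT NOT NULL REFERENCES student(stu_code),
    course_id TEXT NOT NULL REFERENCES course(id),
    PRIMARY KEY (stu_code, course_id)  -- one row per unique enrollment
);
INSERT INTO user VALUES ('123', 'Ann', 'Lee'), ('124', 'Bob', 'Ray');
INSERT INTO student VALUES ('123'), ('124');
INSERT INTO course VALUES ('ABC', 'Algebra'), ('XYZ', 'Physics'), ('LMN', 'History');
INSERT INTO stu_course_lnk VALUES
    ('123', 'ABC'), ('124', 'ABC'), ('123', 'XYZ'), ('124', 'LMN');
""")


def students_in(course_id):
    """List (stu_code, f_name, l_name) for every student enrolled in a course."""
    return conn.execute(
        """
        SELECT student.stu_code, user.f_name, user.l_name
        FROM user
        INNER JOIN student ON student.stu_code = user.user_id
        INNER JOIN stu_course_lnk ON student.stu_code = stu_course_lnk.stu_code
        INNER JOIN course ON stu_course_lnk.course_id = course.id
        WHERE course.id = ?
        ORDER BY student.stu_code
        """,
        (course_id,),
    ).fetchall()


print(students_in("ABC"))  # [('123', 'Ann', 'Lee'), ('124', 'Bob', 'Ray')]
```

Both join conditions are plain equality tests, so the composite primary key on the link table can serve them as an index.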

Is there way to add multiple values to 1 ID in access

I have a table that has Act ID, and another table that has Act ID and percentage complete. The second table can have multiple entries for different days. I need the sum of the percentages for each Act ID added to the first table.
ZA.108381.080
First table
Act ID Percent Date
ZA.108381.110 Total from 2 table
ZA.108381.120
ZA.108476.020
ZA.108381.110 25% 5/25/19
ZA.108381.110 75 6/1/19
ZA.108381.120
ZA.108476.020
This would generally be considered bad practice. Your primary key should uniquely identify a row in its table, and any other data related to that key should be stored in separate columns.
However, since an answer is not the place for a lecture: if you want to store multiple values in your Act ID column, I would suggest changing your primary key to something more generic, such as "RowID", and then using VBA to insert multiple values into this field.
However, changing the primary key late in a database's life may cause a lot of issues or be difficult. So good luck.
Storing calculated values in a table has the disadvantage that these entries may become outdated as the other table is updated. It is preferable to query the tables on the fly to always get up-to-date results:
SELECT A.ActID, SUM(B.Percentage) AS SumPercent
FROM
table1 A
LEFT JOIN table2 B
ON A.ActID = B.ActID
GROUP BY A.ActID
ORDER BY A.ActID
This query allows you to add additional columns from the first table. If you only need the ActID from table 1, then you can simplify the query, and instead take it from table 2:
SELECT ActID, SUM(Percentage) AS SumPercent
FROM table2
GROUP BY ActID
ORDER BY ActID
If you have spaces or other special characters in a column or table name, you must escape it with []. E.g. [Act ID].
Do not change the IDs in the table. If you want to have the result displayed as the ID merged with the sum, change the query to
SELECT A.ActID & "." & Format(SUM(B.Percentage), "0.000") AS Result
FROM ...
See also: SQL GROUP BY Statement (w3schools)
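Here is a small runnable sketch of the LEFT JOIN aggregation, using Python's sqlite3 module as a stand-in for Access. The table names table1 and table2 follow the query above; the EntryDate column name and the ISO date format are my assumptions, and the rows are taken from the example in the question:

```python
import sqlite3

# In-memory SQLite stands in for Access; EntryDate is a hypothetical column name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (ActID TEXT PRIMARY KEY);
CREATE TABLE table2 (ActID TEXT, Percentage REAL, EntryDate TEXT);
INSERT INTO table1 VALUES ('ZA.108381.110'), ('ZA.108381.120'), ('ZA.108476.020');
INSERT INTO table2 VALUES
    ('ZA.108381.110', 25, '2019-05-25'),
    ('ZA.108381.110', 75, '2019-06-01');
""")

# LEFT JOIN keeps every ActID from table1, even those with no entries in table2.
rows = conn.execute("""
SELECT A.ActID, SUM(B.Percentage) AS SumPercent
FROM table1 A
LEFT JOIN table2 B ON A.ActID = B.ActID
GROUP BY A.ActID
ORDER BY A.ActID
""").fetchall()

for act_id, total in rows:
    print(act_id, total)
```

Act IDs without any percentage entries come back with a NULL sum (None in Python), which is how you can tell "no entries" apart from a total of 0.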

MySQL join 2 tables on non-unique column and with timestamp conditions

I have 2 MySQL Tables: "parts_revisions" and "categories_revisions". My goal is to use the revisions data in these tables to create a log that lists out all the changes made to parts and categories. Listing the changes to "parts" in one single SQL statement has proven tricky though! Here is the situation:
All entries of each table have "timestamp" columns.
Every parts_revisions entry has a "categoryId" that basically links it to the categories_revisions table. (Every part is a child of a parent category.)
All I want to do is list out all the parts_revisions, but use the human-friendly "name" column from the categories_revisions table based on the categoryId column in parts_revisions. This will make the log more readable.
The trick is that, because there are usually multiple revisions for each category within the categories_revisions table, I cannot do just one big ol' join on the categoryId column to get the name. The categoryId column is non-unique, and "name"s may vary. What I have to do is get the latest categories_revisions entry that has a timestamp that is no later than the timestamp of the parts_revisions entry. In other words, we want to get the appropriate category name that was in use AT THE TIME the part revision was made.
Not sure if this matches your table structure, but here's a go at it. It's a bit of an ugly subquery inside a subquery. Guessing it won't be terribly efficient.
select part_name,
       category,
       (select name
        from categories_revisions
        where categories_revisions.match_id = parts_revisions.category
          and categories_revisions.timestamp = (select MAX(categories_revisions.timestamp)
                                                from categories_revisions
                                                where categories_revisions.match_id = parts_revisions.category
                                                  and categories_revisions.timestamp <= parts_revisions.timestamp)) as name
from parts_revisions;
http://sqlfiddle.com/#!2/da74e/1/0
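As an alternative sketch of the same "name in use at the time" lookup, the nested MAX() can be swapped for a correlated subquery with ORDER BY ... DESC LIMIT 1. This is a different technique from the query above, shown here with Python's sqlite3 module as a stand-in for MySQL. The column names follow the question (categoryId, name, timestamp); the part_name column and the sample rows are made up:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories_revisions (categoryId INTEGER, name TEXT, timestamp TEXT);
CREATE TABLE parts_revisions (part_name TEXT, categoryId INTEGER, timestamp TEXT);
INSERT INTO categories_revisions VALUES
    (1, 'Bolts',     '2020-01-01 00:00:00'),
    (1, 'Fasteners', '2020-03-01 00:00:00');
INSERT INTO parts_revisions VALUES
    ('M4 bolt',    1, '2020-02-01 00:00:00'),
    ('M4 bolt v2', 1, '2020-04-01 00:00:00');
""")

# For each part revision, pick the newest category revision that is
# no later than the part revision's own timestamp.
rows = conn.execute("""
SELECT p.part_name,
       (SELECT c.name
        FROM categories_revisions c
        WHERE c.categoryId = p.categoryId
          AND c.timestamp <= p.timestamp
        ORDER BY c.timestamp DESC
        LIMIT 1) AS category_name
FROM parts_revisions p
ORDER BY p.timestamp
""").fetchall()

print(rows)  # [('M4 bolt', 'Bolts'), ('M4 bolt v2', 'Fasteners')]
```

An index on (categoryId, timestamp) in categories_revisions would likely help either form of the query, though I have not measured it.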

Displaying Results if a Condition Matches a Name

There is a table in our database that contains customer information including their first and last name. The first and last name are stored as separate fields and not together as one name. There is also a table which stores a referral field. In this field, someone can place the name of the customer that referred them to our services.
I would like to utilize a query that will take the referral field (which would contain the name of a prior customer) and match it up to the record to that prior customer.
I thought the below would work:
SELECT APPLICATION_ID
FROM APPLICATION_TABLE
JOIN APPU_USER ON APPU_APPLICATION_ID = APPLICATION_ID
LEFT JOIN APBD_APP_BASIC_DATA ON APBD_APPLICATION_ID = APPLICATION_ID
WHERE CONCAT(APPU_FIRST_NAME,' ',APPU_LAST_NAME) = APBD_REFERRAL_STRING;
What do I need to utilize to be able to do this?
Everything looks fine in your query. It is good practice to qualify column names with their table names when a query uses two or more tables, to avoid conflicts between identically named fields. Something like:
LEFT JOIN APBD_APP_BASIC_DATA ON APBD_APP_BASIC_DATA.APBD_APPLICATION_ID = APPLICATION_TABLE.APPLICATION_ID
Also, keep in mind that
CONCAT(APPU_FIRST_NAME,' ',APPU_LAST_NAME) = APBD_REFERRAL_STRING;
can cause problems if the referral string is in the format "last name,first name" or "first name, last name", or if it contains two spaces between the names.
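One way to make the comparison more forgiving is to normalize both sides before comparing. The following is only a sketch in Python of that idea, not something your database does for you. The function names are hypothetical, and it assumes that a comma always means "last name, first name", which will be wrong for referral strings written as "first name, last name":

```python
import re


def normalize_name(raw):
    """Reduce a free-text name to 'first last', lowercased and single-spaced."""
    text = raw.strip()
    if "," in text:
        # Assumption: a comma means 'last, first', so swap the two parts.
        last, first = (part.strip() for part in text.split(",", 1))
        text = f"{first} {last}"
    return re.sub(r"\s+", " ", text).lower()


def referral_matches(first_name, last_name, referral):
    """Compare a customer's name with a referral string after normalizing both."""
    return normalize_name(f"{first_name} {last_name}") == normalize_name(referral)


print(referral_matches("Jane", "Doe", "Jane  Doe"))  # True (double space)
print(referral_matches("Jane", "Doe", "Doe,Jane"))   # True (last,first)
print(referral_matches("Jane", "Doe", "John Doe"))   # False
```

Even with normalization, matching on free-text names stays fragile (nicknames, typos, two customers with the same name), so treat any match as a candidate rather than a certainty.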

The optimal way to store multiple-selection survey answers in a database

I'm currently working on a survey creation/administration web application with PHP/MySQL. I have gone through several revisions of the database tables, and I once again find that I may need to rethink the storage of a certain type of answer.
Right now, I have a table that looks like this:
survey_answers
id PK
eid
sesid
intvalue Nullable
charvalue Nullable
id = unique value assigned to each row
eid = Survey question that this answer is in reply to
sesid = The survey 'session' (information about the time and date of a survey take) id
intvalue = The value of the answer if it is a numerical value
charvalue = the value of the answer if it is a textual representation
This allowed me to continue using MySQL's mathematical functions to speed up processing.
I have however found a new challenge: storing questions that have multiple responses.
An example would be:
Which of the following do you enjoy eating? (choose all that apply)
Girl Scout Cookies
Bacon
Corn
Whale Fat
Now, when I want to store the result, I'm not sure of the best way to handle it.
Currently, I have a table just for multiple choice options that looks like this:
survey_element_options
id PK
eid
value
id = unique value associated with each row
eid = question/element that this option is associated with
value = textual value of that option
With this setup, I then store my returned multiple selection answers in 'survey_answers' as strings of comma-separated ids of the element_options rows that were selected in the survey (i.e. something like "4,6,7,9"). I'm wondering if that is indeed the best solution, or if it would be more practical to create a new table that would hold each answer chosen, and then reference back to a given answer row which in turn references back to the element and ultimately the survey.
EDIT
for anyone interested, here is the approach I ended up taking (In PhpMyAdmin Relations View):
And a rudimentary query to gather the counts for a multiple select question would look like this:
SELECT e.question AS question, eo.value AS value, COUNT(eo.value) AS count
FROM survey_elements e, survey_element_options eo, survey_answer_options ao
WHERE e.id = 19
AND eo.eid = e.id
AND ao.oid = eo.id
GROUP BY eo.value
This really depends on a lot of things.
Generally, storing lists of comma-separated values in a database is bad, especially if you plan to do anything remotely intelligent with that data, such as any kind of advanced reporting on the answers.
The best relational way to store this is to also define the answers in a second table and then link them to the user's response to a question in a third table (with multiple entries per user-question, or possibly user-survey-question if the user could take multiple surveys with the same question on it).
This can get slightly complex. A possible scenario, as a simple example:
Example tables:
Users (Username, UserID)
Questions (qID, QuestionsText)
Answers (AnswerText [in this case example could be reusable, but this does cause an extra layer of complexity as well], aID)
Question_Answers ([Available answers for this question, multiple entries per question] qaID, qID, aID),
UserQuestionAnswers (qaID, uID)
Note: Meant as an example, not a recommendation
Convert primary key to not unique index and add answers for the same question under the same id.
For example.
id | eid | sesid | intval | charval
3 45 30 2
3 45 30 4
You can still add another column for regular unique PK if needed.
Keep things simple. No need for relation here.
It's a horses for courses thing really.
You can store as a comma-separated string (but then what happens when you have a literal comma in one of your answers?).
You can store as a one-to-many table, such as:
survey_element_answers
id PK
survey_answers_id FK
intvalue Nullable
charvalue Nullable
And then loop over that table. If you picked one answer, it would create one row in this table. If you pick two answers, it will create two rows in this table, etc. Then you would remove the intvalue and charvalue from the survey_answers table.
Another choice, since you're already storing the element options in their own table, is to create a many-to-many table, such as:
survey_element_answers
id PK
survey_answers_id FK
survey_element_options_id FK
Again, one row per option selected.
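A sketch of how the many-to-many variant might be queried, using Python's sqlite3 module as a stand-in for MySQL. The table layouts follow the ones listed above; the sample sessions and selections are invented, and element 19 is borrowed from the question's example query:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE survey_element_options (id INTEGER PRIMARY KEY, eid INTEGER, value TEXT);
CREATE TABLE survey_answers (id INTEGER PRIMARY KEY, eid INTEGER, sesid INTEGER);
CREATE TABLE survey_element_answers (
    id INTEGER PRIMARY KEY,
    survey_answers_id INTEGER REFERENCES survey_answers(id),
    survey_element_options_id INTEGER REFERENCES survey_element_options(id)
);
INSERT INTO survey_element_options (id, eid, value) VALUES
    (1, 19, 'Girl Scout Cookies'), (2, 19, 'Bacon'), (3, 19, 'Corn'), (4, 19, 'Whale Fat');
INSERT INTO survey_answers (id, eid, sesid) VALUES (1, 19, 30), (2, 19, 31);
-- Session 30 picked cookies and bacon; session 31 picked only bacon.
INSERT INTO survey_element_answers (survey_answers_id, survey_element_options_id) VALUES
    (1, 1), (1, 2), (2, 2);
""")

# Count how often each option of question 19 was selected (0 if never).
counts = conn.execute("""
SELECT o.value, COUNT(a.id) AS picks
FROM survey_element_options o
LEFT JOIN survey_element_answers a ON a.survey_element_options_id = o.id
WHERE o.eid = ?
GROUP BY o.id, o.value
ORDER BY o.id
""", (19,)).fetchall()

print(counts)
# [('Girl Scout Cookies', 1), ('Bacon', 2), ('Corn', 0), ('Whale Fat', 0)]
```

Since every selection is its own row, counting is a plain GROUP BY with no string splitting involved.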
Another option yet again is to store a bitmask value. This will remove the need for a many-to-many table.
survey_element_options
id PK
eid FK
value Text
optionnumber unique for each eid
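optionbitmask 2 ^ (optionnumber - 1)
optionnumber should be unique for each eid, and increment starting with one, so option 1 gets the bit value 1, option 2 gets 2, option 3 gets 4, and so on. (The ^ here means "to the power of". In MySQL the ^ operator is bitwise XOR, so compute the value with 1 << (optionnumber - 1) or POW(2, optionnumber - 1).) This will impose a limit of 63 options if you are using bigint, or 31 options if you are using int.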
And then in your survey_answers
id PK
eid
sesid
answerbitmask bigint
Answerbitmask is calculated by adding together the optionbitmask values of every option the user selected. For example, if 7 (1 + 2 + 4) were stored in answerbitmask, then that means that the user selected the first three options.
Joins can be done by:
WHERE survey_answers.answerbitmask & survey_element_options.optionbitmask > 0
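The bitmask arithmetic can be sketched in a few lines of Python. This assumes 1-based option numbers where option 1 maps to the bit value 1, so that a stored 7 means the first three options, matching the example above. The function names and the OPTIONS list are made up for illustration:

```python
# Hypothetical option list for one question; position 1 is the first option.
OPTIONS = ["Girl Scout Cookies", "Bacon", "Corn", "Whale Fat"]


def option_bitmask(option_number):
    """Bit value for a 1-based option number: 1 -> 1, 2 -> 2, 3 -> 4, 4 -> 8."""
    return 1 << (option_number - 1)


def encode(selected_numbers):
    """Combine the selected option numbers into one answerbitmask."""
    mask = 0
    for number in selected_numbers:
        # OR equals adding the bit values, but stays correct if a number repeats.
        mask |= option_bitmask(number)
    return mask


def decode(mask):
    """Return the 1-based option numbers whose bit is set in the mask."""
    return [n for n in range(1, len(OPTIONS) + 1) if mask & option_bitmask(n)]


print(encode([1, 2, 3]))  # 7
print(decode(7))          # [1, 2, 3]
print(decode(10))         # [2, 4]
```

The test in decode is the same bitwise AND as the WHERE clause above: an option counts as selected when the AND of the two masks is non-zero.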
So yeah, there's a few options to consider.
If you don't use the id as a foreign key in another query, or if you can query results using the sesid, try a many to one relationship.
Otherwise I'd store multiple choice answers as a serialized array, such as JSON or through PHP's serialize() function.