I have two tables: one called tweets and one called references. tweets consists of the columns tweet_id and classified, among others. references consists of the columns tweet_id and class_id.
The tweet_id column in references contains only a fraction of the tweet_ids in tweets.
What I would like to do is combine these tables so that the result shows the columns r.tweet_id, t.classified and r.class_id.
I've come up with this query, but for some reason it returns zero rows. In reality, however, there are about 900 values in r.tweet_id, all of which exist in t.tweet_id.
SELECT 't.tweet_id', 't.classified', 'r.tweet_id', 'r.class_id'
FROM `tweets` t, `references` r
WHERE 'r.tweet_id' = 't.tweet_id'
Could somebody tell me what I am doing wrong and how I should change my script in order to get the desired outcome?
MySQL uses backticks ` to escape schema object names (columns, tables, databases) and apostrophes ' and double quotes " to delimit strings, so your condition compares the string 'r.tweet_id' with the string 't.tweet_id', which is always false. Do this instead:
SELECT t.tweet_id, t.classified, r.tweet_id, r.class_id
FROM tweets AS t
INNER JOIN `references` AS r ON r.tweet_id = t.tweet_id
Note that you only have to escape the word references, because it is a reserved word in MySQL; you can omit the other backticks.
Also, if you want to display rows like 1, 2, NULL, NULL (tweets that weren't classified), you can use LEFT JOIN instead of INNER JOIN; if you allow multiple classifications per tweet, some GROUP BY (aggregate) functions may come in handy.
BTW: PostgreSQL uses " for schema object names and ' for strings.
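To see the difference between quoted strings and identifiers in practice, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows are made up, and SQLite is an assumption on my part: it quotes the references table name with double quotes where MySQL would use backticks, but single-quoted text is a string literal in both.

```python
import sqlite3


def demo_join():
    """Run the quoted, INNER JOIN and LEFT JOIN variants on made-up data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tweets (tweet_id INTEGER PRIMARY KEY, classified INTEGER);
        CREATE TABLE "references" (tweet_id INTEGER, class_id INTEGER);
        INSERT INTO tweets VALUES (1, 1), (2, 1), (3, 0);
        INSERT INTO "references" VALUES (1, 10), (2, 20);
    """)

    # Single quotes make string literals: two different strings are compared,
    # so the condition is false for every row combination.
    quoted = conn.execute("""
        SELECT t.tweet_id
        FROM tweets t, "references" r
        WHERE 'r.tweet_id' = 't.tweet_id'
    """).fetchall()

    # Unquoted identifiers compare the actual column values.
    inner = conn.execute("""
        SELECT t.tweet_id, t.classified, r.class_id
        FROM tweets AS t
        INNER JOIN "references" AS r ON r.tweet_id = t.tweet_id
        ORDER BY t.tweet_id
    """).fetchall()

    # LEFT JOIN keeps tweets without a matching row; r.class_id is NULL there.
    left = conn.execute("""
        SELECT t.tweet_id, t.classified, r.class_id
        FROM tweets AS t
        LEFT JOIN "references" AS r ON r.tweet_id = t.tweet_id
        ORDER BY t.tweet_id
    """).fetchall()

    conn.close()
    return quoted, inner, left
```

Calling demo_join() returns the three result sets in that order, so you can compare the empty result of the quoted version with the two join variants.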
Related
Currently, this is what my SELECT code looks like:
SELECT student.stu_code, user.f_name, user.l_name
FROM user
INNER JOIN student
ON student.stu_code = user.user_id
INNER JOIN course
ON course.stu_code ?????;
Basically, to elaborate: the student table inherits from the user table, which is why user_id = stu_code. What I'm confused about is how to join the course table with the student table.
Let's say that the course table has a course code (PK), a few other attributes and a stu_code column, however, the student code column has multiple values inside a single column to represent that multiple students are taking the course and stored as VARCHAR.
Example: Student table has stu_code string value of '123' and course table has a stu_code with string value of '123, 246, 369'.
How would I go about joining these two tables together and separating the stu_code in the course table so that it represents 3 separate stu_code values, i.e. '123', '246', '369'?
Any help is greatly appreciated!
however, the student code column has multiple values inside a single column to represent that multiple students are taking the course and stored as VARCHAR.
Your data model is broken. Put your effort into fixing the data model. You want a junction/association table courseStudents or perhaps enrolled, with columns like:
stu_code (foreign key to students)
course_code (foreign key to courses)
enrollment_date
and so on
What is wrong with your data model? Here are a few things:
You are storing numbers as a string.
You are putting multiple values into a string column.
You cannot define foreign key relationships.
SQL has poor string handling capabilities.
SQL has a great way to store lists of things. It is not called "string". It is called "table".
Your data model is ~broken~ hindering you from elegant solutions.
You cannot join your two tables efficiently. While both columns contain strings, they do not hold data that follows the same rules, so you must transform the data in order to join them. There are a few ways to do this; one is to use a regular expression function.
You can use it to test whether the stu_code matches the list of codes. Further, you can do this dynamically, constructing the test pattern itself from the values on the left and right sides of the join.
join based on REGEXP
SELECT student.stu_code, user.f_name, user.l_name
FROM user
INNER JOIN student
ON student.stu_code = user.user_id
INNER JOIN course
ON course.stu_code REGEXP CONCAT('[[:<:]]',student.stu_code,'[[:>:]]')
Assuming tables and data:
Student
- - - -
stu_code
123
Course
- - - -
stu_code
'123, 246, 369'
Example:
http://sqlfiddle.com/#!9/672b57f/4
about the regular expression
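In MySQL the regex syntax can be a little bit different. [[:<:]] and [[:>:]] are the word-boundary markers (start and end of a word) in Spencer notation.
If you have a new enough version of MySQL/MariaDB you can use the more typical \b notation. MySQL 8.0.4 and later use ICU, where the Spencer markers are no longer supported and \b has to be written as '\\b' inside a string literal.
More about that here: https://dev.mysql.com/doc/refman/8.0/en/regexp.html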
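The word-boundary idea can be tried out without a MySQL server. The following is a rough sketch that uses Python's sqlite3 module with a user-defined regexp function, so it is a stand-in and not the MySQL implementation; the course_code column and the sample rows are assumptions for illustration. The list column is the subject of the match and the pattern is built from the single student code.

```python
import re
import sqlite3


def _regexp(pattern, value):
    # SQLite evaluates "value REGEXP pattern" as regexp(pattern, value).
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0


def students_per_course():
    """Join students to courses whose comma-separated list contains their code."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("regexp", 2, _regexp)
    conn.executescript("""
        CREATE TABLE student (stu_code TEXT PRIMARY KEY);
        CREATE TABLE course (course_code TEXT PRIMARY KEY, stu_code TEXT);
        INSERT INTO student VALUES ('123'), ('246'), ('23');
        INSERT INTO course VALUES ('ABC', '123, 246, 369'), ('XYZ', '1234, 246');
    """)
    # \b is the word boundary in Python's re module; it stops '23' from
    # matching inside '123' and '123' from matching inside '1234'.
    rows = conn.execute(r"""
        SELECT student.stu_code, course.course_code
        FROM student
        INNER JOIN course
            ON course.stu_code REGEXP ('\b' || student.stu_code || '\b')
        ORDER BY student.stu_code, course.course_code
    """).fetchall()
    conn.close()
    return rows
```

Student '23' appears in no course here, which is exactly what a plain substring test would get wrong.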
about efficiency
In large datasets the performance will be awful: every row has to be scanned and the function evaluated for each of them. In a large set you might get some gains by joining on LIKE first (which is faster than REGEXP). LIKE is much faster at filtering out rows that cannot match, and the REGEXP then only has to confirm the remaining candidates.
Perhaps your model was based upon an assumption of having a courses table with very few rows?
It is ironic, because you have made your course table unnecessarily large. You would actually be better off with an intermediary table that represents the many-to-many nature (the fact that students can take many courses and courses can have many students) with 1 row per unique relationship. While this table would be an order of magnitude "longer", it would be leaner, it could be indexed, and query performance would be faster.
The courses table does not need to have any awareness of the student list and thus you can alter courses by removing courses.stu_code once you change the model (aside: It might be useful if courses cached a hint of the expected student count for that course)
possible link table
would be a new table like this (note how it only ever needs these 2 columns)
stu_course_lnk
- - - - - - - -
stu_code course_id
123 ABC
124 ABC
...
123 XYZ
...
124 LMN
then you add joins of
...
student.stu_code = stu_course_lnk.stu_code
and
stu_course_lnk.course_id = course.id
...
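As a rough illustration of the link-table approach, here is a sketch using Python's sqlite3 module in place of MySQL. The table definitions are assumptions reduced to the key columns, and the rows are the sample values from the listing above.

```python
import sqlite3


def students_in_course(course_id):
    """Return the student codes enrolled in one course via the link table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE student (stu_code INTEGER PRIMARY KEY);
        CREATE TABLE course (id TEXT PRIMARY KEY);
        -- One row per (student, course) relationship; the composite key
        -- prevents duplicates and doubles as an index for the join.
        CREATE TABLE stu_course_lnk (
            stu_code  INTEGER NOT NULL REFERENCES student (stu_code),
            course_id TEXT    NOT NULL REFERENCES course (id),
            PRIMARY KEY (stu_code, course_id)
        );
        INSERT INTO student VALUES (123), (124);
        INSERT INTO course VALUES ('ABC'), ('XYZ'), ('LMN');
        INSERT INTO stu_course_lnk VALUES
            (123, 'ABC'), (124, 'ABC'), (123, 'XYZ'), (124, 'LMN');
    """)
    rows = conn.execute("""
        SELECT student.stu_code
        FROM student
        INNER JOIN stu_course_lnk
            ON student.stu_code = stu_course_lnk.stu_code
        INNER JOIN course
            ON stu_course_lnk.course_id = course.id
        WHERE course.id = ?
        ORDER BY student.stu_code
    """, (course_id,)).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

Both joins are plain equality comparisons, so no string parsing is involved at query time.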
I have an SQL table on users where particular accounts are tagged with country code (2 letter words in uppercase) while other substrings in the tags (all separated by commas) are either in lowercase or more than 2 letters long.
In user table
Eg:
id User_tags
1 alu,US,ATD
2 GB,xx
3 ol,tuds,FR
Users 1,2 and 3 are tagged to countries US, GB and FR and I need to extract them from the user_tags column. I understand that regex functions are needed but I am not able to make them work in an SQL query.
Create a country code reference table and join to it using the query below. I don't have SQL Server open, so I'm unable to double-check the syntax, but it should work. Wrapping both sides in commas keeps a code from matching the tail of a longer tag.
Select *
From yourtable y left join refcountry r
On charindex(',' + r.code + ',', ',' + y.string + ',') > 0
Note that this might be slow for a large dataset.
If you are not on SQL Server, find the equivalent of charindex in your RDBMS.
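For a quick check of the delimiter trick, here is a minimal sketch in Python with sqlite3, where instr() plays the role of charindex. The table names users and refcountry, the three reference codes and the fourth user are assumptions for illustration; the first three rows come from the question.

```python
import sqlite3


def country_per_user():
    """Match each user's tag list against a reference table of country codes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, User_tags TEXT);
        CREATE TABLE refcountry (code TEXT PRIMARY KEY);
        INSERT INTO users VALUES
            (1, 'alu,US,ATD'), (2, 'GB,xx'), (3, 'ol,tuds,FR'), (4, 'ol,FRA');
        INSERT INTO refcountry VALUES ('US'), ('GB'), ('FR');
    """)
    # Wrapping both sides in commas means a code only matches a whole tag,
    # so 'FR' does not match the longer tag 'FRA'.
    rows = conn.execute("""
        SELECT y.id, r.code
        FROM users y
        LEFT JOIN refcountry r
            ON instr(',' || y.User_tags || ',', ',' || r.code || ',') > 0
        ORDER BY y.id
    """).fetchall()
    conn.close()
    return rows
```

Because of the LEFT JOIN, a user without any known country code still appears, with NULL (None in Python) as the code.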
I want to order all rows of a table ("posts") by a (sort-)value ("sortDate") that is stored in a second table ("meta").
The (sort-)value of the second table is stored as a key-value pair. the key is 'publishDate'
The linking column between both tables is "postID".
The (sort-)value of the second table is optional, or can be entered multiple times.
-> If the (sort-)value is entered multiple times, i want to use the maximum.
-> If the (sort-)value is not present in the second table, i want to use the "postDate" - value of the first table instead.
This is my solution:
SELECT posts.postID,posts.postDate,metaDate.publishDate,
CASE
WHEN metaDate.publishDate is null Then posts.postDate
ELSE metaDate.publishDate
END AS sortDate /*fallback for those rows that do not have a matching key-value pair in second table*/
From posts
Left Join
(
Select meta.postID,MAX(metaValue) as publishDate
From meta
Where meta.metaKey = 'publishDate'
GROUP BY meta.postID
) As metaDate /*create a table with the maximum of publishDate, therefor handle multiple entries*/
ON posts.postID = metaDate.postID
ORDER BY sortDate DESC;
see also
sqlfiddle with this solution --->
Is there a smarter / faster way to do this?
As I am not an SQL expert: is there anything I have overlooked?
(Background:
the structure of the tables is a wordpress-database-structure, therefore it is given, a related topic would be "sort posts by custom fields in wordpress" - but the solutions i found did not handle multiple or optional custom fields)
Thanks for comments and support
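For anyone who wants to experiment with this, here is a small sketch of the same query run through Python's sqlite3 module, with COALESCE standing in for the CASE expression (both pick the first non-NULL value). The sample rows and the ISO date strings are assumptions; MAX on a text column only orders dates correctly when they are stored in a sortable format like this.

```python
import sqlite3


def posts_by_sort_date():
    """Order posts by the latest publishDate meta value, else by postDate."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE posts (postID INTEGER PRIMARY KEY, postDate TEXT);
        CREATE TABLE meta (postID INTEGER, metaKey TEXT, metaValue TEXT);
        INSERT INTO posts VALUES
            (1, '2020-01-01'), (2, '2020-02-01'), (3, '2020-03-01');
        INSERT INTO meta VALUES
            (1, 'publishDate', '2020-05-01'),
            (1, 'publishDate', '2020-04-01'),
            (2, 'otherKey', 'x');
    """)
    rows = conn.execute("""
        SELECT posts.postID,
               COALESCE(metaDate.publishDate, posts.postDate) AS sortDate
        FROM posts
        LEFT JOIN (
            -- One row per post with the maximum publishDate.
            SELECT meta.postID, MAX(meta.metaValue) AS publishDate
            FROM meta
            WHERE meta.metaKey = 'publishDate'
            GROUP BY meta.postID
        ) AS metaDate
            ON posts.postID = metaDate.postID
        ORDER BY sortDate DESC
    """).fetchall()
    conn.close()
    return rows
```

Post 1 sorts by its latest meta value, while posts 2 and 3 fall back to postDate because they have no publishDate entry.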
I am looking to perform an exact match on a phrase within specified delimiters in MySQL. I have the following data in a full text index field.
,garden furniture,patio heaters,best offers,best deals,
I am performing the following query which is returning the aforementioned record.
SELECT id, tags
FROM Store
WHERE MATCH(tags) AGAINST(',garden,' IN BOOLEAN MODE)
I only want to return records which contain the value: ,garden, not ,garden furniture, or ,country garden, etc.
It is currently performing a greedy match and ignoring the comma delimiters specified in the query. I have attempted to escape the commas to force them to be included in the query, but this does not work.
Is it possible to specify non-alphanumeric delimiters as part of the match? I want to be able to perform an exact match, like a regular expression, i.e. '/,garden,/'.
From the docs:
Modify a character set file: This requires no recompilation. The true_word_char() macro uses a “character type” table to distinguish letters and numbers from other characters. You can edit the contents of the <ctype><map> array in one of the character set XML files to specify that ',' is a “letter.” Then use the given character set for your FULLTEXT indexes. For information about the <ctype><map> array format, see Section 9.3.1, “Character Definition Arrays”.
Another option is to add a new collation.
Either way, you'll have to rebuild the index:
REPAIR TABLE Store QUICK;
Only MATCH ... AGAINST can use the full-text index for your search.
However, if your table is not too big, you can use:
SELECT id, tags
FROM Store
WHERE tags LIKE 'garden' OR tags LIKE 'garden,%' OR tags LIKE '%,garden,%' OR tags LIKE '%,garden'
There are other options (find_in_set), but I really don't want to go into those, because they perform even worse than the above SQL.
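Here is a minimal sketch of the LIKE approach, run through Python's sqlite3 module as a stand-in for MySQL. It assumes every tags value starts and ends with a comma, as in the question's sample row, so a single '%,tag,%' pattern is enough; the other two rows are made up. It also assumes the tag itself contains no % or _ characters.

```python
import sqlite3


def stores_with_exact_tag(tag):
    """Return ids of stores whose comma-delimited tags contain exactly `tag`."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Store (id INTEGER PRIMARY KEY, tags TEXT);
        INSERT INTO Store VALUES
            (1, ',garden furniture,patio heaters,best offers,best deals,'),
            (2, ',garden,best deals,'),
            (3, ',country garden,');
    """)
    # The commas on both sides of the tag force a whole-tag match.
    pattern = "%," + tag + ",%"
    rows = conn.execute(
        "SELECT id FROM Store WHERE tags LIKE ? ORDER BY id", (pattern,)
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

A leading % wildcard means no ordinary index can help, which is why this only suits small tables.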
The real problem: never use CSV in a database!
Using CSV in a database is a really, really bad idea, because:
• It is wasteful, your data is not normalized
• You cannot join on a CSV field
• You cannot use indexes on a CSV field
• Full-text indexes do not play nice with separators (as you've seen)
The answer is to create 2 extra tables.
Table tag (innoDB)
----------
id integer primary key auto_increment
tag varchar(50) //one tag per row!
Table tag_link (innoDB)
--------------
store_id integer foreign key references store(id)
tag_id integer foreign key references tag(id)
primary key = (store_id + tag_id) //composite PK
Now you can easily do all sorts of queries on tags.
SELECT s.id, GROUP_CONCAT(t2.tag) FROM store s
INNER JOIN tag_link tl1 ON (s.id = tl1.store_id)
INNER JOIN tag t1 ON (t1.id = tl1.tag_id)
INNER JOIN tag_link tl2 ON (s.id = tl2.store_id)
INNER JOIN tag t2 ON (t2.id = tl2.tag_id)
WHERE t1.tag = 'garden'
GROUP BY s.id
This will select one tag named garden (using t1 and tl1), find all stores linked to that tag and then get all tags linked to those stores (using t2 and tl2).
Very fast and very flexible.
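To try the normalized layout end to end, here is a sketch in Python with sqlite3 standing in for MySQL/InnoDB. The DDL is my reading of the table outline above, and the sample stores and tags are made up.

```python
import sqlite3


def tags_of_stores_tagged(tag_name):
    """Map each store carrying `tag_name` to the sorted list of all its tags."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE store (id INTEGER PRIMARY KEY);
        CREATE TABLE tag (id INTEGER PRIMARY KEY, tag TEXT NOT NULL UNIQUE);
        CREATE TABLE tag_link (
            store_id INTEGER NOT NULL REFERENCES store (id),
            tag_id   INTEGER NOT NULL REFERENCES tag (id),
            PRIMARY KEY (store_id, tag_id)
        );
        INSERT INTO store VALUES (1), (2), (3);
        INSERT INTO tag VALUES
            (1, 'garden'), (2, 'garden furniture'), (3, 'patio heaters');
        INSERT INTO tag_link VALUES (1, 1), (1, 3), (2, 2), (3, 1);
    """)
    rows = conn.execute("""
        SELECT s.id, GROUP_CONCAT(t2.tag)
        FROM store s
        INNER JOIN tag_link tl1 ON (s.id = tl1.store_id)
        INNER JOIN tag t1 ON (t1.id = tl1.tag_id)
        INNER JOIN tag_link tl2 ON (s.id = tl2.store_id)
        INNER JOIN tag t2 ON (t2.id = tl2.tag_id)
        WHERE t1.tag = ?
        GROUP BY s.id
    """, (tag_name,)).fetchall()
    conn.close()
    # GROUP_CONCAT does not promise an order, so sort after splitting.
    return {store_id: sorted(tags.split(",")) for store_id, tags in rows}
```

Store 2 is tagged only 'garden furniture' and is not returned for 'garden', because the comparison is an exact equality on a whole tag.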
I have a table products where one of the columns is relatedproducts. relatedproducts contains strings of concatenated product IDs (column productid) separated by colons, e.g. abc-123:foo-prod:ada69. Due to some bad design, a product may be removed from the products table and still be referenced in the relatedproducts column.
So I need an SQL query that goes through all rows in the products table, checks the relatedproducts column by exploding the data (hence the "exploding" in the title) and sees if each referenced product exists in the same products table. However, I am a novice at SQL and am having trouble writing the join/regexp query to do this.
Any help will be appreciated!
MySQL can match regexp's, but unfortunately cannot return the matched substring.
You'd better do it using FIND_IN_SET:
SELECT *
FROM products p
JOIN products rel
ON FIND_IN_SET(rel.productid, REPLACE(p.relatedproducts, ':', ','))
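A join like this returns the references that still exist. To list the dangling ones, one possible approach (a sketch, not the only way) is to explode the column in application code. The example below uses Python with sqlite3; the helper names make_products_db and dangling_references and the sample rows are hypothetical.

```python
import sqlite3


def make_products_db():
    """Build a small in-memory products table with one dangling reference."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (productid TEXT PRIMARY KEY, relatedproducts TEXT);
        INSERT INTO products VALUES
            ('abc-123', 'foo-prod:ada69'),
            ('foo-prod', 'abc-123'),
            ('bar', NULL);
    """)
    return conn


def dangling_references(conn):
    """Return (productid, missing_reference) pairs, ordered by productid."""
    existing = {row[0] for row in conn.execute("SELECT productid FROM products")}
    missing = []
    query = "SELECT productid, relatedproducts FROM products ORDER BY productid"
    for productid, related in conn.execute(query):
        # Explode the colon-separated list; NULL or empty means no references.
        for ref in (related or "").split(":"):
            if ref and ref not in existing:
                missing.append((productid, ref))
    return missing
```

With the sample data, dangling_references(make_products_db()) reports that abc-123 still points at the removed product ada69.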