Multiple table joins in rails - mysql

How do I write the MySQL query below in Rails ActiveRecord?
select
A.*,
B.*
from
raga_contest_applicants_songs AS A
join
raga_contest_applicants AS B
ON B.contest_applicant_id = A.contest_applicant_id
join
raga_contest_rounds AS C
ON C.contest_cat_id = B.contest_cat_id
WHERE
C.contest_cat_id = contest_cat_id
GROUP BY
C.contest_cat_id
I know how to write joins on two tables; however, I'm not very confident on how to use join on 3 tables.

To rewrite the SQL query you've got in your question, I think it should be like the following (though I'm having a hard time fully visualizing your model relationships, so this is a bit of guesswork):
RagaContestApplicantsSong.
joins(:contest_cat, :raga_contest_applicants => [:raga_contest_rounds]).
group('raga_contest_rounds.contest_cat_id')
...such that the joins method takes care of both of the two joins as well as the WHERE clause, followed finally by the group call.
For further reference:
If you're joining multiple associations to the same model you can simply list them:
Post.joins(:category, :comments)
Returns all posts that have a category and at least one comment
If you're joining nested tables you can list them as in a hash:
Post.joins(:comments => :guest)
Returns all posts that have a comment made by a guest
Nested associations, multiple level:
Category.joins(:posts => [{:comments => :guest}, :tags])
Returns all categories that have posts, where those posts have at least one comment made by a guest and at least one tag
You can also chain ActiveRecord Query Interface calls such that:
Post.joins(:category, :comments)
...produces the same SQL as...
Post.joins(:category).joins(:comments)
If all else fails, you can always pass a SQL fragment directly into the joins method as a stepping stone from your working query to something more ARQI-centric:
Client.joins('LEFT OUTER JOIN addresses ON addresses.client_id = clients.id')
=> SELECT clients.* FROM clients LEFT OUTER JOIN addresses ON addresses.client_id = clients.id

Related

How best to retrieve a record with related table items in using JOOQ?

First off, I apologise for the title, but I can't think of a better way to word it.
Secondly I have the following table:
Profiles table:
Primary key: profileName
Repositories table:
Composite primary key: (profileName, repository_name), where profileName references the Profiles primary key
This simulates a 1 - n relationship between the profiles table and the repositories table.
I recently discovered jOOQ and am using it to retrieve and store data in the db. I have this code for retrieving a profile from the db:
profile = db.select().from(Profiles.PROFILES, Repositories.REPOSITORIES).fetch().stream()
.filter(t -> t.getValue(Profiles.PROFILES.PROFILENAME).equalsIgnoreCase(profileName))
.limit(1) //limit it to just the one result
.map(this::convertToProfile)
.collect(Collectors.toList()).get(0);
This works fine, but I am unsure how to improve it to include the retrieval of any repositories found in the repositories table. That is to say, repositories are optional rather than mandatory for a profile.
My only option right now is to create a 'second cycle logic' to retrieve the repositories using the profile name before unmarshalling the data.
Push operations into the database
Your query will be very slow as your data grows. Why? Because your SQL query only runs the cartesian product between the PROFILES and REPOSITORIES tables, whereas the filter predicate and the limit are then applied in Java memory.
The database never knows what you want to do with that cross product, so it has to produce all of it, which is slow. If you give the database more information by "pushing the predicate down" into the jOOQ/SQL query, the whole thing will run much faster (although the resulting stream is technically equivalent). So, instead, write this:
profile = db.select()
.from(PROFILES, REPOSITORIES)
// WHERE and LIMIT are moved "up" into the SQL query:
.where(PROFILES.PROFILENAME.equalIgnoreCase(profileName))
.limit(1)
.fetch().stream()
.map(this::convertToProfile)
.collect(Collectors.toList()).get(0);
This query is equivalent to yours (so not yet correct), but much faster.
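The effect of pushing the predicate down can be sketched outside of jOOQ. The following is a minimal illustration only, using Python's built-in sqlite3 module as a stand-in for the real database; the table contents and the helper names (build_db, rows_fetched_client_side, rows_fetched_pushed_down) are made up for the example.

```python
import sqlite3

def build_db():
    """In-memory stand-in for the PROFILES and REPOSITORIES tables (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE profiles (profileName TEXT PRIMARY KEY);
        CREATE TABLE repositories (
            profileName TEXT REFERENCES profiles(profileName),
            repository_name TEXT,
            PRIMARY KEY (profileName, repository_name)
        );
        INSERT INTO profiles VALUES ('alice'), ('bob'), ('carol');
        INSERT INTO repositories VALUES
            ('alice', 'r1'), ('alice', 'r2'), ('bob', 'r3');
    """)
    return conn

def rows_fetched_client_side(conn):
    # No WHERE clause: the database hands back the full cross product,
    # and any filtering would have to happen in application memory.
    return conn.execute("SELECT * FROM profiles, repositories").fetchall()

def rows_fetched_pushed_down(conn, name):
    # The predicate is part of the SQL, so only matching rows leave the database.
    return conn.execute(
        "SELECT * FROM profiles, repositories "
        "WHERE lower(profiles.profileName) = lower(?)",
        (name,),
    ).fetchall()

conn = build_db()
all_rows = rows_fetched_client_side(conn)
filtered = rows_fetched_pushed_down(conn, "ALICE")
print(len(all_rows), len(filtered))
```

With three profiles and three repositories, the unfiltered query transfers every combination, while the filtered one transfers only the rows for the requested profile. The gap grows with the product of the two table sizes.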
Getting joins correctly
The above query still runs a cartesian product between the two tables. You probably want to join them, instead. There are two ways of joining in SQL:
Using the WHERE clause
Just add a JOIN predicate to the WHERE clause and you're set:
profile = db.select()
.from(PROFILES, REPOSITORIES)
// Join predicate here:
.where(PROFILES.PROFILENAME.equal(REPOSITORIES.PROFILENAME))
.and(PROFILES.PROFILENAME.equalIgnoreCase(profileName))
.limit(1)
.fetch().stream()
.map(this::convertToProfile)
.collect(Collectors.toList()).get(0);
This is also called INNER JOIN, which can be written using the JOIN clause for improved readability:
Using the (INNER) JOIN clause:
Most people will find this syntax more readable, because JOIN predicates are clearly separated from "ordinary" predicates:
profile = db.select()
.from(PROFILES)
// Join expression and predicates here:
.join(REPOSITORIES)
.on(PROFILES.PROFILENAME.equal(REPOSITORIES.PROFILENAME))
// Ordinary predicates remain in the where clause:
.where(PROFILES.PROFILENAME.equalIgnoreCase(profileName))
.limit(1)
.fetch().stream()
.map(this::convertToProfile)
.collect(Collectors.toList()).get(0);
"optional" JOIN
In SQL, this is called an OUTER JOIN, or more particularly a LEFT (OUTER) JOIN:
profile = db.select()
.from(PROFILES)
// LEFT JOIN expression and predicates here:
.leftJoin(REPOSITORIES)
.on(PROFILES.PROFILENAME.equal(REPOSITORIES.PROFILENAME))
.where(PROFILES.PROFILENAME.equalIgnoreCase(profileName))
.limit(1)
.fetch().stream()
.map(this::convertToProfile)
.collect(Collectors.toList()).get(0);
Note that for a profile without repositories, the list of REPOSITORIES will not be empty but will contain a single repository with all values set to NULL. That's how an OUTER JOIN works.
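To make the NULL row concrete, here is a small sketch of the same LEFT JOIN run through Python's built-in sqlite3 module instead of jOOQ. The data and the function name profile_with_repositories are hypothetical. The sketch deliberately has no LIMIT, on the assumption that you want every repository of the profile, since each repository arrives as its own joined row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profiles (profileName TEXT PRIMARY KEY);
    CREATE TABLE repositories (
        profileName TEXT REFERENCES profiles(profileName),
        repository_name TEXT,
        PRIMARY KEY (profileName, repository_name)
    );
    INSERT INTO profiles VALUES ('alice'), ('bob');
    INSERT INTO repositories VALUES ('alice', 'r1'), ('alice', 'r2');
""")

def profile_with_repositories(conn, name):
    """Fetch one profile plus its (optional) repositories in a single query."""
    rows = conn.execute(
        "SELECT p.profileName, r.repository_name "
        "FROM profiles p "
        "LEFT JOIN repositories r ON r.profileName = p.profileName "
        "WHERE lower(p.profileName) = lower(?) "
        "ORDER BY r.repository_name",
        (name,),
    ).fetchall()
    if not rows:
        return None  # no such profile at all
    # A profile without repositories still yields one row whose repository
    # columns are NULL (None in Python); drop that placeholder here.
    repos = [repo for _, repo in rows if repo is not None]
    return {"profileName": rows[0][0], "repositories": repos}

print(profile_with_repositories(conn, "Alice"))
print(profile_with_repositories(conn, "bob"))
```

The same grouping step would be needed on the Java side: collect the joined rows per profile and skip the all-NULL repository.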

Finding which of an array of IDs has no record with a single query

I'm generating prepared statements with PHP PDO to pull in information from two tables based on an array of IDs.
Then I realized that if an ID passed had no record I wouldn't know.
I'm locating records with
SELECT
r.`DEANumber`,
TRIM(r.`ActivityCode`) AS ActivityCode,
TRIM(r.`ActivitySubCode`) as ActivitySubCode,
-- other fields...
a.Activity
FROM
`registrants` r,
`activities` a
WHERE r.`DEAnumber` IN ( ?,?,?,?,?,?,?,? )
AND a.Code = ActivityCode
AND a.Subcode = ActivitySubCode
But I am having trouble figuring out the negative join that says which of the IDs has no record.
If two tables were involved I think I could do it like this
SELECT
r.DEAnumber
FROM registrant r
LEFT JOIN registrant2 r2 ON r.DEAnumber = r2.DEAnumber
WHERE r2.DEAnumber IS NULL
But I'm stumped as to how to use the array of IDs here. Obviously I could iterate over the array and track which queries had no result, but it seems like such a manual and wasteful way to go...
"Obviously I could iterate over the array and track which queries had no result, but it seems like such a manual and wasteful way to go."
What could be a real waste is spending time solving this non-existent "problem".
Yes, you could iterate, either manually or using syntactic sugar like array_diff() in PHP.
I suggest that instead of making your query more complex (and thus harder to maintain) for little gain, you just move on.
As old man Knuth once said, 'premature optimization is the root of all evil'.
The only help from PDO I can think of is a fetch mode that uses the IDs (the first selected column) as keys of the returned array, so you can do it without an [explicitly written] loop, like
$stmt->execute($ids);
$data = $stmt->fetchAll(PDO::FETCH_UNIQUE);
$notFound = array_diff($ids, array_keys($data));
Yet a manual loop would have taken only two extra lines, which is, honestly, not that big a deal.
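The same "fetch once, then diff against the requested IDs" idea can be sketched in Python with the built-in sqlite3 module. This is an illustration of the technique only, not the PDO code: the table contents and the function name find_missing_ids are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE registrants (DEANumber TEXT PRIMARY KEY, ActivityCode TEXT);
    INSERT INTO registrants VALUES ('A1', 'C'), ('B2', 'M');
""")

def find_missing_ids(conn, ids):
    """Return the requested IDs that have no registrant row, in request order."""
    if not ids:
        return []
    # One placeholder per ID, as in the prepared statement from the question.
    placeholders = ",".join("?" for _ in ids)
    found = {
        row[0]
        for row in conn.execute(
            f"SELECT DEANumber FROM registrants WHERE DEANumber IN ({placeholders})",
            list(ids),
        )
    }
    # Equivalent in spirit to array_diff($ids, array_keys($data)).
    return [i for i in ids if i not in found]

print(find_missing_ids(conn, ["A1", "X9", "B2", "Z0"]))
```

Only one query is sent, and the comparison is a cheap in-memory set lookup.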
You are on the right track: a left join that filters out matches will give you the rows that have no match. You just need to move all conditions on the left-joined table up into the join.
If you leave the conditions on the joined table in the where clause, you effectively turn it into an inner join, because the where clause is applied to the rows after the join is made, and the all-NULL rows produced for non-matches fail those conditions.
Change the query to use proper join syntax, specifying a left join, with the conditions on activities moved to the join's ON clause:
SELECT
r.DEANumber,
TRIM(r.ActivityCode) AS ActivityCode,
TRIM(r.ActivitySubCode) AS ActivitySubCode,
-- other fields...
a.Activity
FROM registrants r
LEFT JOIN activities a ON a.Code = r.ActivityCode
AND a.Subcode = r.ActivitySubCode
WHERE r.DEAnumber IN (?,?,?,?,?,?,?,?)
In your app code, if Activity is null then you know there was no activity for that id.
This won't affect performance much, other than to return (potentially) more rows.
To just select all registrants without activities:
select r.DEAnumber
from registrants r
left join activities a on a.Code = r.ActivityCode
and a.Subcode = r.ActivitySubCode
where r.DEAnumber IN ( ?,?,?,?,?,?,?,? )
and a.Code is null
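The difference between putting a condition on the joined table in WHERE versus ON can be checked with a tiny experiment. Below is a sketch using Python's built-in sqlite3 module as a stand-in for MySQL, with made-up rows: registrant A1 has a matching activity, B2 does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE registrants (DEANumber TEXT, ActivityCode TEXT, ActivitySubCode TEXT);
    CREATE TABLE activities (Code TEXT, Subcode TEXT, Activity TEXT);
    INSERT INTO registrants VALUES ('A1', 'C', '1'), ('B2', 'Z', '9');
    INSERT INTO activities VALUES ('C', '1', 'Chemist');
""")

ids = ["A1", "B2"]
marks = ",".join("?" for _ in ids)

# Condition on the joined table left in WHERE: B2's all-NULL row is discarded,
# so the LEFT JOIN behaves like an INNER JOIN.
filter_in_where = conn.execute(f"""
    SELECT r.DEANumber, a.Activity
    FROM registrants r
    LEFT JOIN activities a ON a.Code = r.ActivityCode
    WHERE r.DEANumber IN ({marks}) AND a.Subcode = r.ActivitySubCode
    ORDER BY r.DEANumber""", ids).fetchall()

# Same condition moved into ON: B2 survives with a NULL activity.
filter_in_on = conn.execute(f"""
    SELECT r.DEANumber, a.Activity
    FROM registrants r
    LEFT JOIN activities a ON a.Code = r.ActivityCode
                          AND a.Subcode = r.ActivitySubCode
    WHERE r.DEANumber IN ({marks})
    ORDER BY r.DEANumber""", ids).fetchall()

# Anti-join: keep only the registrants for which no activity matched.
without_activity = [row[0] for row in conn.execute(f"""
    SELECT r.DEANumber
    FROM registrants r
    LEFT JOIN activities a ON a.Code = r.ActivityCode
                          AND a.Subcode = r.ActivitySubCode
    WHERE r.DEANumber IN ({marks}) AND a.Code IS NULL""", ids)]

print(filter_in_where, filter_in_on, without_activity)
```

Note that this finds registrants without a matching activity. An ID with no registrants row at all never appears in either result, so detecting those still needs a comparison against the requested IDs.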

MySQL - Fix multiple records

SELECT cf.FK_collection, c.collectionName,
uf.FK_userMe, uf.FK_userYou,
u.userId, u.username
FROM userFollows as uf
INNER JOIN collectionFollows as cf ON uf.FK_userMe = cf.FK_user
INNER JOIN collections as c ON cf.FK_collection = c.collectionId
INNER JOIN users as u ON uf.FK_userYou = u.userId
WHERE uf.FK_userMe = 2
Hey guys.
I'm trying to write this query, and of course it won't do what I want: it returns multiple rows, which is in some way what I want, and yet it's not. Let me try to explain:
I'm trying to get both collectionFollows and userFollows, to show a user's activity on the site. But when doing this, I get multiple rows from userFollows even though the user only follows one other user. This occurs because the user is following multiple collections.
So when I show my result it will return like this:
John is following 'webdesign'
John is following 'Lisa'
John is following 'programming'
John is following 'Lisa'
I would like to know if I have to make multiple queries or use a subquery. What would be best practice? And how would I write the query then?
You are actually combining two quite unrelated queries. I would keep them as separate queries, especially since you report them like that too. You could, if you like, use UNION ALL to combine those queries. This way, you have just a list of names of items you follow, regardless of the type of item it is. If you want, you can specify that too.
SELECT
cf.FK_user,
cf.FK_collection as followItem,
c.collectionName as followName,
'collection' as followType
FROM collectionFollows as cf
INNER JOIN collections as c ON cf.FK_collection = c.collectionId
WHERE cf.FK_user = 2
UNION ALL
SELECT
uf.FK_userMe,
u.userId,
u.username,
'user' as followType
FROM userFollows as uf
INNER JOIN users as u ON uf.FK_userYou = u.userId
WHERE uf.FK_userMe = 2
An alternative would be to filter unique values in PHP, but even then your query will fail. Because of the inner joins, you will not get any results if a user only follows other users or only follows collections. You need at least one of each to get any results.
You could change INNER JOIN to LEFT JOIN, but then you would still have to post-process the result to filter out the duplicates and the NULL values.
UNION ALL is fast. It just sticks two query results together without further processing. This is different from UNION, which also filters out duplicates (like DISTINCT). In this case that is not needed, because I assume a user can only follow a given collection or other user once, so these queries will never return duplicate records. If that is indeed the case, UNION ALL will do just fine and will be faster than UNION.
Apart from UNION ALL, two separate queries are fine too.
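Both behaviours can be reproduced on a toy dataset. The sketch below uses Python's built-in sqlite3 module in place of MySQL, and the rows are invented to match the example from the question: John (user 2) follows Lisa and two collections.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userId INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE collections (collectionId INTEGER PRIMARY KEY, collectionName TEXT);
    CREATE TABLE userFollows (FK_userMe INTEGER, FK_userYou INTEGER);
    CREATE TABLE collectionFollows (FK_user INTEGER, FK_collection INTEGER);
    INSERT INTO users VALUES (1, 'Lisa'), (2, 'John');
    INSERT INTO collections VALUES (10, 'webdesign'), (11, 'programming');
    INSERT INTO userFollows VALUES (2, 1);
    INSERT INTO collectionFollows VALUES (2, 10), (2, 11);
""")

# Original approach: one big join. Lisa is repeated once per followed collection.
joined = conn.execute("""
    SELECT c.collectionName, u.username
    FROM userFollows uf
    INNER JOIN collectionFollows cf ON uf.FK_userMe = cf.FK_user
    INNER JOIN collections c ON cf.FK_collection = c.collectionId
    INNER JOIN users u ON uf.FK_userYou = u.userId
    WHERE uf.FK_userMe = ?
    ORDER BY c.collectionName""", (2,)).fetchall()

# UNION ALL approach: each follow appears exactly once, tagged with its type.
follows = conn.execute("""
    SELECT 'collection' AS followType, c.collectionName AS followName
    FROM collectionFollows cf
    INNER JOIN collections c ON cf.FK_collection = c.collectionId
    WHERE cf.FK_user = ?
    UNION ALL
    SELECT 'user', u.username
    FROM userFollows uf
    INNER JOIN users u ON uf.FK_userYou = u.userId
    WHERE uf.FK_userMe = ?
    ORDER BY followType, followName""", (2, 2)).fetchall()

print(joined)
print(follows)
```

The ORDER BY on the combined result is only there to make the output deterministic; drop it if the display order does not matter.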

Proper SqlAlchemy query using contains_eager

I have a query that I have found a working solution for. I am not sure if I am performing the query properly and I was hoping to find out. The tables appear as follows:
The query is:
q = session.query(Person).outerjoin(PetOwner).join(Animal).options(contains_eager("petowner_set"), contains_eager("petowner_set.animal"))
There is a relationship on person connecting it to petowner.
It would be easy if the join from person to petowner AND the join from petowner to animal were both inner joins or both outer joins. However, the join from person to petowner is an outer join and the join from petowner to animal is an inner join. To accomplish this, I added two contains_eager calls to the options.
Is this the correct way to accomplish this?
Short answer: From what I can see, you should be using outerjoin for both, unless you do not want to see Persons who have no animals.
First, let's take a look at the JOINs:
both INNER: in this case the result of your query will return only those Persons that have at least one animal (assuming that any PetOwner.animal is not nullable)
OUTER for PetOwner, INNER for Animal: same as above (again, assuming that any PetOwner.animal is not nullable)
both OUTER: the result of your query will return all Persons, irrespective of whether they own an Animal or not
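The three cases can be verified with plain SQL. Here is a minimal sketch using Python's built-in sqlite3 module rather than SQLAlchemy; the table and column names (person, petowner, animal, person_id, animal_id) are assumptions for illustration, since the original schema is not shown. Ann owns an animal, Bob does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE animal (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE petowner (person_id INTEGER NOT NULL, animal_id INTEGER NOT NULL);
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO animal VALUES (1, 'Rex');
    INSERT INTO petowner VALUES (1, 1);
""")

def person_names(first_join, second_join):
    """Run Person -> PetOwner -> Animal with the given join keywords."""
    sql = f"""
        SELECT DISTINCT p.name
        FROM person p
        {first_join} petowner po ON po.person_id = p.id
        {second_join} animal a ON a.id = po.animal_id
        ORDER BY p.name"""
    return [row[0] for row in conn.execute(sql)]

both_inner = person_names("JOIN", "JOIN")
outer_then_inner = person_names("LEFT JOIN", "JOIN")
both_outer = person_names("LEFT JOIN", "LEFT JOIN")
print(both_inner, outer_then_inner, both_outer)
```

The inner join to animal discards Bob's row because its animal_id is NULL after the outer join, which is why the mixed case gives the same result as two inner joins.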
Then, what does contains_eager do? According to the documentation,
... will indicate to the query that the given attribute should be eagerly loaded from columns currently in the query.
What this means is that when you access Person.petowner_set, no additional database query will be required, because SA will load the relationships from your original query. This has absolutely no impact on how your JOINs work, and only affects the loading of the relationships. This is simply a data loading optimization.
I think the only difference is that you should chain the calls to contains_eager, like this:
q = (session.query(Person)
.outerjoin(PetOwner)
.join(Animal)
.options(
# each chained call is relative to the entity loaded by the previous one
contains_eager("petowner_set").contains_eager("animal")
)
)
see: https://docs.sqlalchemy.org/en/latest/orm/loading_relationships.html#controlling-loading-via-options

SQL to Filter on multiple dimensions with a many-to-many relationship

In my rails app I have three tables to deal with the many-to-many relationship between courses and categories
courses
course_categories_courses
course_categories
I have groups of categories, and I want to allow filtering of the listing of the courses by categories through an interface like:
Location
very near
near
far
Type
short
medium
long
To search for medium types either near or far I had thought of using:
SELECT distinct courses.*
FROM `courses`
inner join course_categories on
course_categories_courses.course_category_id = course_categories.id
and (
course_categories.id in ('medium')
and course_categories.id in ('near', 'far')
)
but that's not working. Anyone able to point me in the right direction please?
You generally want to put 'relationships' in the join conditions, and the values (literals) to filter by in the where clause:
ex. join condition - on course_categories_courses.course_category_id = course_categories.id
ex. where clause - where course_categories.id in ('near', 'far')
So you might want to rewrite your query that way. But more than that, how can both of these possibly be true?
(course_categories.id in ('medium') and course_categories.id in ('near', 'far'))
Is that what you intend?
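One common way around this, shown here only as a sketch of a different technique, is to join the category tables once per filter group, so that each condition is checked against its own category row. The example uses Python's built-in sqlite3 module; the name column on course_categories and the course_id / course_category_id columns on the join table are assumptions (the latter follow the Rails default for such join tables), and the data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE courses (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course_categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course_categories_courses (course_id INTEGER, course_category_id INTEGER);
    INSERT INTO courses VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO course_categories VALUES
        (1, 'very near'), (2, 'near'), (3, 'far'),
        (4, 'short'), (5, 'medium'), (6, 'long');
    -- A: near + medium, B: far + short, C: very near + medium
    INSERT INTO course_categories_courses VALUES
        (1, 2), (1, 5), (2, 3), (2, 4), (3, 1), (3, 5);
""")

# Single join: one category row can never be 'medium' AND 'near'/'far' at once.
single_join = [row[0] for row in conn.execute("""
    SELECT DISTINCT c.name
    FROM courses c
    JOIN course_categories_courses cc ON cc.course_id = c.id
    JOIN course_categories cat ON cat.id = cc.course_category_id
    WHERE cat.name IN ('medium') AND cat.name IN ('near', 'far')""")]

# One pair of joins per filter group: each group matches its own category row.
per_group_join = [row[0] for row in conn.execute("""
    SELECT DISTINCT c.name
    FROM courses c
    JOIN course_categories_courses t_link ON t_link.course_id = c.id
    JOIN course_categories t ON t.id = t_link.course_category_id
                            AND t.name IN ('medium')
    JOIN course_categories_courses l_link ON l_link.course_id = c.id
    JOIN course_categories l ON l.id = l_link.course_category_id
                            AND l.name IN ('near', 'far')
    ORDER BY c.name""")]

print(single_join, per_group_join)
```

Only course A is both medium and near or far, so it is the only one returned by the second query.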
You have to join both tables, and you are using the id field where type and location (according to your description) should be used:
SELECT distinct courses.* FROM `courses`
-- join keys assume Rails' default join-table columns (course_id, course_category_id)
inner join course_categories_courses on course_categories_courses.course_id = courses.id
inner join course_categories on
course_categories.id = course_categories_courses.course_category_id
where (course_categories.type in ('medium') and course_categories.location in ('near', 'far'))
Syntactically, the query you wrote is fine. There are two unusual things about it, as others have noted:
Filtering in an inner join's ON clause. (This is typically done in the WHERE clause, but perhaps you have your reasons.)
course_categories.id in ('medium'). Most would expect a number here, but perhaps this is just for demonstration purposes.
Possible reasons for it "not working":
You're getting an error that your Ruby code has swept under the rug and that you're not sharing.
It's working; there simply are no records matching your criteria.
Ways to debug:
Run it in Workbench (or some other client) and check the results.
If there are no errors and no records are returned, run some queries in Workbench:
Select count(*) from course_categories where course_categories.id in ('medium')
Select count(*) from course_categories where course_categories.id in ('near', 'far')
If you get the results you wanted in Workbench, then it's probably your client (Ruby) code; a bad connection string, perhaps?