MySQL conditional JOIN using keys from different tables for matching - mysql

I have read and learned a lot here, got a lot of problems solved without asking anything so far, but I couldn't cope with this one.
I have three tables:
-programs
->ProgramID (Primary Unique)
->ProgramName
->AuditID
-audits
->AuditID (Primary Unique)
->VenueName
->Address
-events
->EventID (Primary Unique)
->EventDate
->ProgramID
->AuditID
Events are instances of those programs that are on stage that day. Multiple instances of one program might be performed on one day, or none if it's not played. This is basically a temporary table, repopulated daily. Tickets can be bought from programs that are played on said day.
Audits are the actual stages where the plays are performed. A theatre can have several stages.
Programs are plays that stay in a theatre's repertoire for years. The last stage where a program was performed is stored with it, so that when someone retrieves information about a program that is not being played that day, the stage can still be linked.
I am using this query now:
SELECT
programs.ProgramID,
programs.ProgramName,
programs.AuditID,
events.EventID,
events.EventDate,
audits.VenueName,
audits.Address
FROM programs
LEFT JOIN events
ON programs.ProgramID = events.ProgramID
JOIN audits
ON programs.AuditID = audits.AuditID
WHERE programs.ProgramID = :ProgramID
If there is no event of that program the LEFT JOIN gives me information about the program and the audit leaving the event info blank. Good.
Also okay when the event is at the same audit where it was played last time. Great.
The problem is if it's played in a different audit, because the JOIN for the audits table is made based on programs.AuditID key, which is obsolete now.
What I would need is a conditional JOIN, so that when the events.AuditID is not null, the audits table is joined using events.AuditID instead of programs.AuditID.
Thanks in advance!

I believe you should be able to JOIN using IFNULL(events.AuditID, programs.AuditID) to solve this problem...
SELECT
programs.ProgramID,
programs.ProgramName,
programs.AuditID,
events.EventID,
events.EventDate,
audits.VenueName,
audits.Address
FROM programs
LEFT JOIN events
ON programs.ProgramID = events.ProgramID
JOIN audits
ON IFNULL(events.AuditID, programs.AuditID) = audits.AuditID
WHERE programs.ProgramID = :ProgramID
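A minimal sketch of the conditional join above, run against an in-memory SQLite database from Python rather than MySQL (an assumption made only so the example is self-contained). It uses COALESCE, which is standard SQL and behaves like IFNULL for two arguments. The sample rows and the venue_rows helper are invented for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; all sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE audits   (AuditID INTEGER PRIMARY KEY, VenueName TEXT, Address TEXT);
    CREATE TABLE programs (ProgramID INTEGER PRIMARY KEY, ProgramName TEXT, AuditID INTEGER);
    CREATE TABLE events   (EventID INTEGER PRIMARY KEY, EventDate TEXT,
                           ProgramID INTEGER, AuditID INTEGER);

    INSERT INTO audits   VALUES (1, 'Main Stage', '1 Theatre Sq'), (2, 'Studio', '2 Side St');
    INSERT INTO programs VALUES (10, 'Hamlet', 1), (11, 'Macbeth', 1);
    -- Hamlet is played today on a different stage than last time.
    INSERT INTO events   VALUES (100, '2024-01-01', 10, 2);
""")

QUERY = """
    SELECT p.ProgramName, e.EventID, a.VenueName
    FROM programs AS p
    LEFT JOIN events AS e ON p.ProgramID = e.ProgramID
    -- Prefer the event's stage; fall back to the program's last known stage.
    JOIN audits AS a ON COALESCE(e.AuditID, p.AuditID) = a.AuditID
    WHERE p.ProgramID = :ProgramID
"""

def venue_rows(program_id):
    """Hypothetical helper: run the query for one program."""
    return conn.execute(QUERY, {"ProgramID": program_id}).fetchall()

hamlet = venue_rows(10)    # has an event on stage 2
macbeth = venue_rows(11)   # no event today, falls back to stage 1
print(hamlet)
print(macbeth)
```

Hamlet picks up the venue of today's event, while Macbeth, which has no event, still resolves to the stage stored on the program.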

Related

How to chain SQL queries

I have four tables: series, seasons, episodes, images. Each series consists of multiple seasons which consists of multiple episodes. Each episode has one or more images attached to it. Now I would like to retrieve one series including all its seasons, episodes and images.
SELECT * FROM series
LEFT JOIN seasons ON seasons.seasons_series_id=series.series_id
LEFT JOIN episodes ON episodes.episodes_seasons_id=seasons.seasons_id
LEFT JOIN images ON images.images_id=episodes.episodes_images_id
WHERE series.series_id=1
The above query does not work, because seasons_id is not available when running the second LEFT JOIN etc. Should I be using nested queries instead?
In the query posted to the question, the seasons_id generally IS available for that second LEFT JOIN (and the third, if it comes to it).
When you add additional JOINs to a query, those JOINs take into account not only the table from the original FROM clause but also the entire result set built up by the preceding JOINs. This is one reason why always using an alias for your tables is a good idea: it's possible to include the same table in a query more than once via a JOIN, and aliases help keep the separate instances of the same table straight.
The only case when your seasons_id would not be available is when you have a series record that does not have any seasons records associated with it. In this case, you would have a NULL value in your results for the seasons_id, and you would further have no way in the schema shown to connect any episode record with that series record at all. In this schema, every series must have at least one season if it is to have any episodes or images. Thus, the missing seasons_id wouldn't matter anyway, because you couldn't ever hope to match any episode records for that series.
There's nothing wrong with your query. If a relationship breaks down and the LEFT JOIN shows, for example, a season 2 with no known episodes, then there won't be any images for those non-existent episodes. It doesn't stop the series from having two seasons; you just see results like:
Game of thrones, season 1, episode 1, image 1
Game of thrones, season 2, null, null
If your database enforces relationships, you will never be able to insert images for Game of Thrones season 2 episode 1 before that episode exists, because the episode has to exist first to be a parent to the child images. If your database doesn't enforce relationships, you can go ahead and insert a load of images and give them all an episode ID of 971, which you predict is what s2 e1 will get when you do get around to inserting it, but they won't show in your query because they are orphans as long as the episode with ID 971 doesn't exist in the DB.
If you're hoping your query will show these orphaned images, you'll have to write it in a different way.
You might have missing columns because you are not using aliases for your tables, so MySQL does not know which column belongs to each table. Try using an alias for every table and run it again.
Hope this helps
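To illustrate the point made in the answers above, here is a small sketch of the chained LEFT JOINs with table aliases, run against in-memory SQLite from Python (an assumption for the sake of a runnable example; the question is about MySQL). The key columns follow the question; series_name, seasons_number, episodes_title and images_url, along with all the rows, are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE series   (series_id INTEGER PRIMARY KEY, series_name TEXT);
    CREATE TABLE seasons  (seasons_id INTEGER PRIMARY KEY, seasons_series_id INTEGER,
                           seasons_number INTEGER);
    CREATE TABLE images   (images_id INTEGER PRIMARY KEY, images_url TEXT);
    CREATE TABLE episodes (episodes_id INTEGER PRIMARY KEY, episodes_seasons_id INTEGER,
                           episodes_images_id INTEGER, episodes_title TEXT);

    INSERT INTO series   VALUES (1, 'Game of Thrones');
    INSERT INTO seasons  VALUES (1, 1, 1), (2, 1, 2);   -- season 2 has no episodes yet
    INSERT INTO images   VALUES (1, 'image1.jpg');
    INSERT INTO episodes VALUES (1, 1, 1, 'Episode 1');
""")

# Each LEFT JOIN can refer to columns of any table joined before it.
rows = conn.execute("""
    SELECT s.series_name, se.seasons_number, e.episodes_title, i.images_url
    FROM series AS s
    LEFT JOIN seasons  AS se ON se.seasons_series_id = s.series_id
    LEFT JOIN episodes AS e  ON e.episodes_seasons_id = se.seasons_id
    LEFT JOIN images   AS i  ON i.images_id = e.episodes_images_id
    WHERE s.series_id = 1
    ORDER BY se.seasons_number
""").fetchall()

for row in rows:
    print(row)
```

The season without episodes still appears, with NULL (None in Python) in the episode and image columns.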

MYSQL: mass update data into existing table

I have a huge performance issue when uploading data into a MySQL db. As an example, say I have special tools to mine personal information about thousands of people.
I have one tool that mines the phone numbers of the people. Another that mines say the home address of the people. Another mines the photos of the person. So for this example, say there are 100000 people of Country A. I will have to mine data from different countries later on. These mining tools will finish at different times. The mining of phone numbers takes 20 mins. Mining of photos takes 1 week. Mining of the addresses takes 3 days.
The customer wants to see the data as soon as possible in an existing table/db. I wrote some scripts to detect when one tool finishes to start uploading row by row data. However, this seems to take a REALLY long time (using UPDATE ...).
Is there a faster way to do this?
The table that exists in the db is structured like this:
Columns: ID_COUNTRY,ID_PERSON,FULL NAME,PHONE,BLOB_PHOTO,ADDRESS
Yes, there is a faster way. Put the data from each of the processes into a separate table, by inserting into the table.
You will then have to create a query to gather the data:
select *
from people p
left outer join phones ph
  on p.personid = ph.personid
left outer join addresses a
  on p.personid = a.personid
left outer join photos pho
  on p.personid = pho.personid;
Each individual table should start off empty. When the results are available, the table can be loaded using insert. This has at least two advantages. (1) inserts are faster than updates, and bulk inserts may be faster still. (2) The data is available in some tables without blocking inserts into the rest of the tables.
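A rough sketch of the approach described above, using in-memory SQLite from Python in place of MySQL (an assumption, as are the column names other than personid and all the sample rows). Only the phone tool has finished here, so its results are bulk-inserted into their own table while the other tables stay empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people    (personid INTEGER PRIMARY KEY, fullname TEXT);
    CREATE TABLE phones    (personid INTEGER PRIMARY KEY, phone TEXT);
    CREATE TABLE addresses (personid INTEGER PRIMARY KEY, address TEXT);
    CREATE TABLE photos    (personid INTEGER PRIMARY KEY, photo BLOB);

    INSERT INTO people VALUES (1, 'Ann Example'), (2, 'Bob Example');
""")

# The phone tool finished first: load its whole result set in one batch
# instead of issuing one UPDATE per person.
phone_results = [(1, '555-0100'), (2, '555-0101')]
conn.executemany("INSERT INTO phones (personid, phone) VALUES (?, ?)", phone_results)
conn.commit()

# Addresses and photos are not loaded yet, so those columns come back NULL.
rows = conn.execute("""
    SELECT p.fullname, ph.phone, a.address, pho.photo
    FROM people AS p
    LEFT OUTER JOIN phones    AS ph  ON p.personid = ph.personid
    LEFT OUTER JOIN addresses AS a   ON p.personid = a.personid
    LEFT OUTER JOIN photos    AS pho ON p.personid = pho.personid
    ORDER BY p.personid
""").fetchall()

for row in rows:
    print(row)
```

When the address and photo tools finish later, their results can be inserted the same way without touching the rows that are already loaded.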

Better practice: saving count result into field or not?

I'm developing a music streaming site where I have two major tables: 'activity' and 'music'. Activity saves, among other things, every song reproduction into a new record.
Every time I select from music I need to fetch the number of reproductions of every song. So, what would be the better practice
SELECT music.song, music.artist, COUNT(activity.id) AS reproductions
FROM music LEFT JOIN activity USING (song_id) WHERE music.song_id = XX
GROUP BY music.song_id
Or would it be better to save the number of reproductions into a new field in the music table and query this:
SELECT song, artist, reproductions FROM music WHERE music.song_id = XX
This last query is, of course, much easier. But to use it, every time I play a sound file I would have to run two queries: one INSERT into the activity table, and one UPDATE of the reproductions field in the music table.
What would be the better practice in this scenario?
Well, this depends on the response times these two queries will have over time.
Once the tables become huge (hypothetically), the second query will perform better.
Keep in mind that over time even the INSERT might become costly; you might consider some form of data warehousing if you expect millions of rows in the DB.
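For comparison, here is a small sketch of both options side by side, using in-memory SQLite from Python instead of MySQL (an assumption for runnability). Wrapping the INSERT and the UPDATE in one transaction is my own suggestion for keeping the stored counter in step with the activity rows; it is not something the question or answer prescribes. The record_play helper and the sample song are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE music    (song_id INTEGER PRIMARY KEY, song TEXT, artist TEXT,
                           reproductions INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE activity (id INTEGER PRIMARY KEY AUTOINCREMENT, song_id INTEGER);

    INSERT INTO music (song_id, song, artist) VALUES (1, 'Example Song', 'Example Artist');
""")

def record_play(song_id):
    """Hypothetical helper: log one play and bump the stored counter."""
    # "with conn" commits both statements together, or rolls both back on error.
    with conn:
        conn.execute("INSERT INTO activity (song_id) VALUES (?)", (song_id,))
        conn.execute(
            "UPDATE music SET reproductions = reproductions + 1 WHERE song_id = ?",
            (song_id,),
        )

for _ in range(3):
    record_play(1)

# Option 1: count the activity rows at read time.
counted = conn.execute("""
    SELECT music.song, music.artist, COUNT(activity.id) AS reproductions
    FROM music LEFT JOIN activity USING (song_id)
    WHERE music.song_id = ?
    GROUP BY music.song_id
""", (1,)).fetchone()

# Option 2: read the counter maintained at write time.
stored = conn.execute(
    "SELECT song, artist, reproductions FROM music WHERE song_id = ?", (1,)
).fetchone()

print(counted, stored)
```

Both reads agree as long as every play goes through the same write path; the trade-off is an extra UPDATE per play in exchange for a cheaper read.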

SQL Multiple Inner Joins on the same field

I am currently working as a programmer in a small startup aiming to provide pay-per-view content online, and I have been assigned to develop the metadata database for the movie catalogue.
I have two main tables, movie and people, where *movie_ID* and *people_ID* are the primary keys respectively for each table. Both tables have a many-to-many relationship.
To represent different relations I am currently using link tables, for example, actor_movie would store the *movie_ID* and corresponding *people_ID* for each of the actors in the movie, while the director_movie table would store the *movie_ID* and the director(s) *people_ID*. Same goes for writer, composers and producers.
Now, my problem is that I need to craft out a query that returns all the actors, directors, producers, writers, composers, etc. etc. in one single table to be passed on the frontend Web UI as a list of all the persons involved in the movie.
I'm currently stumped as to how to create a multiple SELECT query that would JOIN all the link tables together based on the *movie_ID* and *people_ID* and then return the details of each of the person in the people table as well.
An example of what I have written so far is:
SELECT
movie.titleMovie,
people.namePeople
FROM
movie movie
INNER JOIN actorlinkmovie acm ON acm.idMovie = movie.idMovie
INNER JOIN people people ON people.idPeople = acm.idPeople
What I would like to have happen is:
SELECT
movie.idMovie,
movie.titleMovie,
movie.descMovie,
movie.dateMovie,
movie.runtimeMovie,
movie.langMovie,
movie.ratingMovie,
people.namePeople
FROM
htv_movie movie
INNER JOIN htv_actorlinkmovie acm ON acm.idMovie = movie.idMovie
INNER JOIN htv_directorlinkmovie dcm ON dcm.idMovie = movie.idMovie
INNER JOIN htv_producerlinkmovie pcm ON pcm.idMovie = movie.idMovie
INNER JOIN htv_people people WHERE people.idPeople = dcm.idPeople AND people.idPeople = acm.idPeople AND people.idPeople = pcm.idPeople
And it should return all the related people for a single movie.
I would like to get some input about the whole design, since I'm pretty new at designing a whole database (first time, actually), and on whether this design would be suitable if I need to scale up to about 5000 movies (the current company aim). This database will pretty much serve as the website's backend as well.
Thanks.
UPDATE: Temporarily worked out a dirty solution using PHP variables and a template SQL query. Looks like doing multiple inner joins wasn't that required after all. Thanks for the suggestions though.
You can achieve your goal like this:
SELECT MovieName,
       dbo.GetDirector(MovieID),
       dbo.GetActors(MovieID),
       dbo.GetWriter(MovieID)
FROM Movie
where:
dbo.GetDirector(MovieID) is a function that will return directors in the movie.
dbo.GetActors(MovieID) is a function that will return actors in the movie.
dbo.GetWriter(MovieID) is a function that will return writers in the movie.
If there are some other tables then you can make functions for those tables as well.
Hope this helps.
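A different technique from the function-based answer above, offered only as a sketch: the link tables can be stacked with UNION ALL, each tagged with a role, and then joined to people once, which gives one row per person involved in the movie. This is run against in-memory SQLite from Python rather than the asker's database (an assumption). The table and key names follow the question; the role column and all sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie  (idMovie INTEGER PRIMARY KEY, titleMovie TEXT);
    CREATE TABLE people (idPeople INTEGER PRIMARY KEY, namePeople TEXT);
    CREATE TABLE actorlinkmovie    (idMovie INTEGER, idPeople INTEGER);
    CREATE TABLE directorlinkmovie (idMovie INTEGER, idPeople INTEGER);
    CREATE TABLE producerlinkmovie (idMovie INTEGER, idPeople INTEGER);

    INSERT INTO movie  VALUES (1, 'Example Movie');
    INSERT INTO people VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO actorlinkmovie    VALUES (1, 1), (1, 2);
    INSERT INTO directorlinkmovie VALUES (1, 3);
    INSERT INTO producerlinkmovie VALUES (1, 1);   -- Alice also produces
""")

# Stack the link tables, tagging each row with its role, then join people once.
credits = conn.execute("""
    SELECT m.titleMovie, links.role, p.namePeople
    FROM movie AS m
    JOIN (
        SELECT idMovie, idPeople, 'actor'    AS role FROM actorlinkmovie
        UNION ALL
        SELECT idMovie, idPeople, 'director' AS role FROM directorlinkmovie
        UNION ALL
        SELECT idMovie, idPeople, 'producer' AS role FROM producerlinkmovie
    ) AS links ON links.idMovie = m.idMovie
    JOIN people AS p ON p.idPeople = links.idPeople
    WHERE m.idMovie = ?
    ORDER BY links.role, p.namePeople
""", (1,)).fetchall()

for row in credits:
    print(row)
```

Joining every link table directly to the same people row, as in the attempted query, only matches people who hold all roles at once; stacking the link tables avoids that.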

Conditional columns in MySQL that need to do joins

I've researched related questions on the site but failed to find a solution. What I have is a user activity table in MySQL. It lists all kinds of events of a user within a photo community site. Depending on the event that took place, I need to query certain data from other tables.
I'll explain it in a more practical way by using two examples. First, a simple event, where the user joined the site. This is what the row in the activity table would look like:
event: REGISTERED
user_id: 19 (foreign key to user table)
date: current date
image_id: null, since this event has nothing to do with images
It is trivial to query this. Now an event for which extra data needs to be queried. This event indicates a user that uploaded an image:
event: IMAGEUPLOAD
user_id: 19 (foreign key to user table)
date: current date
image_id: 12
This second event needs to do a join to the image table to get the image URL column from that table. A third event could be about a comment vote, where I would need to do a join to the comments table to get extra columns.
In essence, I need a way to conditionally select extra columns (not rows) per row based on the event type. This is easy to do when the columns come from the same table, but I'm struggling to do this using joins from other tables. I hope to do this in one, conditional query without the use of a stored procedure.
Is this possible?
You could make the joins depend on the event type, like:
select *
from Events e
left join Image i
on e.event = 'IMAGEUPLOAD'
and e.image_id = i.id
left join comments c
on e.event = 'COMMENT'
and e.comment_id = c.id
If there's one column that is shared among all linked tables, for example create_date, you can coalesce to select the one that's not NULL:
select coalesce(i.create_date, c.create_date, ...) as create_date
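The event-dependent joins above can be tried out with a small sketch like the following, which runs against in-memory SQLite from Python instead of MySQL (an assumption). The url and body columns and all sample rows are made up; comment_id comes from the answer's query rather than from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Events   (id INTEGER PRIMARY KEY, event TEXT, user_id INTEGER,
                           image_id INTEGER, comment_id INTEGER);
    CREATE TABLE Image    (id INTEGER PRIMARY KEY, url TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT);

    INSERT INTO Image    VALUES (12, 'http://example.com/12.jpg');
    INSERT INTO comments VALUES (7, 'Nice shot');
    INSERT INTO Events   VALUES (1, 'REGISTERED',  19, NULL, NULL),
                                (2, 'IMAGEUPLOAD', 19, 12,   NULL),
                                (3, 'COMMENT',     19, NULL, 7);
""")

# Each LEFT JOIN only matches rows of its own event type; for every other
# event type the joined columns are NULL.
rows = conn.execute("""
    SELECT e.event, i.url, c.body
    FROM Events AS e
    LEFT JOIN Image AS i
           ON e.event = 'IMAGEUPLOAD' AND e.image_id = i.id
    LEFT JOIN comments AS c
           ON e.event = 'COMMENT' AND e.comment_id = c.id
    ORDER BY e.id
""").fetchall()

for row in rows:
    print(row)
```

Every row has the same set of columns; the ones that do not apply to a given event type simply come back as NULL (None in Python).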
Doing precisely what you want to do is not possible. A SELECT is designed to return a list of tuples/rows, and each has the same number of elements/columns.
What you really are doing here is collecting 2 different kinds of information, and you're going to have to process the 2 different kinds of information separately anyway, which should be a hint that you're doing something slightly wrong. Instead, pull the various event types out individually, perform whatever additional operations you need to do to convert them to your common output type (eg. HTML if this is for a website), and then interleave them together at that stage.