Multitable counting and multiplying in same query - mysql

I have got a somewhat complicated problem. This is my situation (ERD).
For a dashboard I need to create a pivot table that shows the total number of competences used by the vacancies. Therefore I need to:
Count the amount of vacancies per template
Count the amount of templates per competence
and last: multiply these numbers to get the total amount of comps used.
I have the first query:
SELECT vacancytemplate_id, count(id)
FROM vacancies
group by vacancytemplate_id;
And the second query isn't that difficult either, but I don't know what the right solution will be. I'm literally brainstuck. My mind can't comprehend how I can achieve the next step and put it down in a query. Please kind stranger, help me out :)
EDIT: my desired result is something like this
NameOfComp, NrOfTimesUsed
Leading, 17
Inspiring, 2
EDIT2: the meta query should look like:
SELECT NameOfComp, (count of the competences used by templates) * (number of vacancies per template)
EDIT3: http://sqlfiddle.com/#!9/2773ca SQLFiddle
Thanks a lot!

If I am understanding your request correctly, you are wanting a count of competences per vacancy. This can be done very simply due to your table structure:
Select v.ID, count(*) from vacancy as v inner join CompTemplate_Table as CT
on v.Template_ID = CT.Template_ID group by v.ID;
The reason you can do only one join is because there will be a record in the CompTemplate_Table for every competency in each template. Additionally, the same key is used to join vacancy to templates as is used to join templates to CompTemplate_Table, so they represent the same key value (and you can skip joining the Templates table if you don't need data from there).
If you are wanting to add this data to a pivot table, I will leave that exercise to you. There are a number of tutorials available if you do a quick google search and it should not be that hard.
UPDATE: For the second query you are looking at something like:
Select cp.NameOfComp, count(*) from vacancy as v inner join CompTemplate_Table as CT
on v.Template_ID = CT.Template_ID inner join competencies as CP
on CP.ID = CT.Comp_ID
group by CP.NameOfComp
The differences here are that you are adding in the competencies table, as you need data from it, and grouping by CP.NameOfComp instead of the vacancy id. You can also restrict this to specific templates, competencies, or vacancies by adding search conditions (e.g. where CP.ID = 12345).
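To check that a plain join really yields "templates per competence times vacancies per template", here is a minimal sketch. It runs the second query against an in-memory SQLite database from Python; SQLite only stands in for MySQL here, the table and column names are the ones used in this answer, and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE competencies (ID INTEGER PRIMARY KEY, NameOfComp TEXT);
CREATE TABLE CompTemplate_Table (Template_ID INTEGER, Comp_ID INTEGER);
CREATE TABLE vacancy (ID INTEGER PRIMARY KEY, Template_ID INTEGER);

-- Invented sample data.
INSERT INTO competencies VALUES (1, 'Leading'), (2, 'Inspiring');
-- Template 10 uses both competences, template 20 uses only 'Leading'.
INSERT INTO CompTemplate_Table VALUES (10, 1), (10, 2), (20, 1);
-- Two vacancies use template 10, three vacancies use template 20.
INSERT INTO vacancy VALUES (1, 10), (2, 10), (3, 20), (4, 20), (5, 20);
""")

# Each vacancy row is repeated once per competence in its template,
# so COUNT(*) per competence is already the multiplied total.
rows = conn.execute("""
    SELECT CP.NameOfComp, COUNT(*) AS NrOfTimesUsed
    FROM vacancy AS v
    INNER JOIN CompTemplate_Table AS CT ON v.Template_ID = CT.Template_ID
    INNER JOIN competencies AS CP ON CP.ID = CT.Comp_ID
    GROUP BY CP.NameOfComp
    ORDER BY NrOfTimesUsed DESC
""").fetchall()

for name, used in rows:
    print(name, used)
```

With this data 'Leading' is counted 2 + 3 = 5 times and 'Inspiring' 2 times, so no explicit multiplication step is needed.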

Related

MySQL INNER JOIN with GROUP BY and COUNT(*)

I've never been able to get my head around INNER JOINs (or any other JOIN types for that matter) so I'm struggling to work out how to use it in my specific situation. In fact, I'm not even sure if it's what I need. I've looked at other examples and read tutorials but my brain just doesn't seem to work the way needed to truly get it (or it doesn't function at all).
Here's the scenario:
I have two tables -
phone_numbers - this table has a list of phone numbers that
belong to lots of different customers. A single customer can have
multiple numbers. For simplicity's sake, we'll say the fields are
'number_id', 'customer_id', 'phone_number'.
call_history - this table has a record of every single call that one of these
numbers in the first table could have had. There's a record for
every individual call going back years. Again, for simplicity,
we'll say the relevant fields are customer_id, phone_number,
call_start_time.
What I'm trying to accomplish is to find all of the numbers that belong to a particular customer_id in the phone numbers table and use that information to search through the call_history table and find the number of calls each phone number has received, and group that by the number of calls for each number, preferably also showing zeros where a number hasn't received any calls at all.
The reason the zero calls is important is because that's the data I'm interested in. Otherwise, I could just get all the information out of the call_history table. But what I'm trying to achieve is find the numbers with no activity.
All I've been able to accomplish is run one query to get all of the numbers belonging to one customer:
SELECT customer_id, phone_number FROM phone_numbers WHERE customer_id = Y;
Then run a second query to get all phone calls for that customer_id for a set duration:
SELECT customer_id, phone_number, COUNT(*) FROM call_history WHERE customer_id = Y and call_start_time >= DATE_SUB(SYSDATE(), INTERVAL 30 DAY) GROUP BY phone_number;
I've then had to use the data returned from both queries and use a VLOOKUP function in Excel to match number of calls for each individual number from the second query to the list of all numbers from the first query, thus leaving blanks in my "all numbers" table and identifying those numbers that had no calls for that time period.
I'm hoping there's some way to do all of this with a single query and return a table of results, listing the zero number of calls with it and eliminate the whole manual Excel bit as it's not overly efficient and prone to human error.
Without at least a workable example from you, it's not easy to re-create your situation. Anyway, INNER JOIN might not return the result as how you expected. In my short time with MySQL, I mainly use 2 types of JOIN; one is already mentioned and the other is LEFT JOIN. From what I can understand in your question, what you want to achieve can be done by using LEFT JOIN instead of INNER JOIN. I may not be the best person to explain this to you but this is how I understand it:
INNER JOIN - only returns rows that match the ON clause between two (or more) tables.
LEFT JOIN - returns every row from the table on the left side of the join, with NULL in the right table's columns when ON finds no match on the right side .. unless a WHERE condition on a right-table column filters those rows out again.
Now, here is my query suggestion and hopefully it'll be useful for you:
SELECT A.customer_id, A.phone_number,
SUM(CASE WHEN B.call_start_time >= DATE_SUB(SYSDATE(), INTERVAL 30 DAY)
THEN 1 ELSE 0 END) AS Total
FROM phone_numbers A
LEFT JOIN call_history B
ON A.customer_id=B.customer_id
AND A.phone_number=B.phone_number
GROUP BY A.customer_id,A.phone_number;
What I did here is LEFT JOIN the phone_numbers table with call_history on customer_id and phone_number (matching on customer_id alone would pair every number with all of that customer's calls). I also moved the WHERE call_start_time >= .. condition into a CASE expression in the SELECT, since putting it in WHERE would discard the unmatched rows and turn this into an ordinary inner join.
Here is an example fiddle : https://www.db-fiddle.com/f/hriFWqVy5RGbnsdj8i3aVG/1
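The zero-count behaviour can be tried out with a small sketch like the one below. It uses Python's sqlite3 module as a stand-in for MySQL, so the DATE_SUB(SYSDATE(), INTERVAL 30 DAY) expression is replaced by a fixed cutoff parameter; the sample rows and the cutoff date are invented, and the join here matches on both customer_id and phone_number so that each number only counts its own calls.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE phone_numbers (number_id INTEGER, customer_id INTEGER, phone_number TEXT);
CREATE TABLE call_history (customer_id INTEGER, phone_number TEXT, call_start_time TEXT);

-- Invented sample data: customer 7 owns three numbers.
INSERT INTO phone_numbers VALUES
    (1, 7, '555-0001'), (2, 7, '555-0002'), (3, 7, '555-0003');
INSERT INTO call_history VALUES
    (7, '555-0001', '2024-03-20 10:00:00'),
    (7, '555-0001', '2024-03-25 11:30:00'),
    (7, '555-0002', '2023-01-05 09:00:00');  -- older than the cutoff
""")

# Hypothetical fixed cutoff; in MySQL this would be DATE_SUB(SYSDATE(), INTERVAL 30 DAY).
cutoff = "2024-03-01 00:00:00"

# The date test lives in the CASE expression, not in WHERE, so numbers
# without recent calls (or without any calls) stay in the result with 0.
rows = conn.execute("""
    SELECT A.customer_id, A.phone_number,
           SUM(CASE WHEN B.call_start_time >= ? THEN 1 ELSE 0 END) AS Total
    FROM phone_numbers A
    LEFT JOIN call_history B
           ON A.customer_id = B.customer_id
          AND A.phone_number = B.phone_number
    WHERE A.customer_id = ?
    GROUP BY A.customer_id, A.phone_number
    ORDER BY A.phone_number
""", (cutoff, 7)).fetchall()

for row in rows:
    print(row)
```

Filtering on A.customer_id in WHERE is safe because it refers to the left table; only conditions on the right table need to stay out of WHERE.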
For an INNER JOIN, you would write it like this:
SELECT pn.customer_id, pn.phone_number FROM phone_numbers AS pn, call_history AS ch WHERE pn.customer_id = ch.customer_id AND ch.call_start_time >= DATE_SUB(SYSDATE(), INTERVAL 30 DAY) GROUP BY pn.customer_id, pn.phone_number;
Just list the tables you want to join and add the join condition to the WHERE clause.

SQL: Column Must Appear in the GROUP BY Clause Or Be Used in an Aggregate Function

I'm doing what I would have expected to be a fairly straightforward query on a modified version of the imdb database:
select primary_name, release_year, max(rating)
from titles natural join primary_names natural join title_ratings
group by year
having title_category = 'film' and year > 1989;
However, I'm immediately running into
"column must appear in the GROUP BY clause or be used in an aggregate function."
I've tried researching this but have gotten confusing information; some examples I've found for this problem look structurally identical to mine, where others state that you must group every single selected parameter, which defeats the whole purpose of a group as I'm only wanting to select the maximum entry per year.
What am I doing wrong with this query?
Expected result: table with 3 columns which displays the highest-rated movie of each year.
If you want the maximum entry per year, then you should do something like this:
select r.*
from ratings r
where r.rating = (select max(r2.rating) from ratings r2 where r2.year = r.year) and
r.year > 1989;
In other words, group by is the wrong approach to writing this query.
I would also strongly encourage you to forget that natural join exists at all. It is an abomination. It uses the names of common columns for joins. It does not even use properly declared foreign key relationships. In addition, you cannot see what columns are used for the join.
While I am at it, another piece of advice: qualify all column names in queries that have more than one table reference. That is, include the table alias in the column name.
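As a rough illustration of the correlated-subquery approach, the sketch below runs it through Python's sqlite3 module. SQLite stands in for the asker's database, the ratings table is collapsed into a single hypothetical table with title, year and rating columns, and the films are invented.

```python
import sqlite3

# In-memory SQLite database; the single ratings table is a simplification.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ratings (title TEXT, year INTEGER, rating REAL);
-- Invented sample data.
INSERT INTO ratings VALUES
    ('Old Film', 1985, 9.0),
    ('Film A',   1990, 7.1),
    ('Film B',   1990, 8.4),
    ('Film C',   1991, 6.5),
    ('Film D',   1991, 6.0);
""")

# For every row, the subquery looks up the best rating of that row's year;
# only rows that equal this maximum survive the filter.
rows = conn.execute("""
    SELECT r.title, r.year, r.rating
    FROM ratings r
    WHERE r.rating = (SELECT MAX(r2.rating)
                      FROM ratings r2
                      WHERE r2.year = r.year)
      AND r.year > 1989
    ORDER BY r.year
""").fetchall()

for row in rows:
    print(row)
```

If two films tie for the best rating in a year, both rows are returned.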
If you want to display all the columns, you can use a window function like:
select primary_name, year, max(rating) over (partition by year) as rating
from titles
natural join primary_names
natural join ratings
where title_type = 'film' and year > 1989;

MySQL: Count then sort by the count total

I know other posts talk about this, but I haven't been able to apply anything to this situation.
This is what I have so far.
SELECT *
FROM ccParts, ccChild, ccFamily
WHERE parGC = '26' AND
parChild = chiId AND
chiFamily = famId
ORDER BY famName, chiName
What I need to do is see the total number of ccParts with the same ccFamily in the results. Then, sort by the total.
It looks like this is close to what you want:
SELECT f.famId, f.famName, pc.parCount
FROM (
SELECT c.chiFamily AS famId, count(*) AS parCount
FROM
ccParts p
JOIN ccChild c ON p.parChild = c.chiId
WHERE p.parGC ='26'
GROUP BY c.chiFamily
) pc
JOIN ccFamily f ON f.famId = pc.famId
ORDER BY pc.parCount
The inline view (between the parentheses) is the headliner: it does your grouping and counting. Note that you do not need to join table ccFamily there to group by family, as table ccChild already carries the family information. If you don't need the family name (i.e. if its ID were sufficient), then you can stick with the inline view alone, and there ORDER BY count(*). The outer query just associates family name with the results.
Additionally, MySQL provides a non-standard mechanism by which you could combine the outer query with the inline view, but in this case it doesn't help much with either clarity or concision. The query I provided should be accepted by any SQL implementation, and it's to your advantage to learn such syntax and approaches first.
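A minimal sketch of the inline-view query in action, using Python's sqlite3 module in place of MySQL. The column names come from the question; the column types, the parId column and all sample rows are assumptions made for the example.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ccFamily (famId INTEGER PRIMARY KEY, famName TEXT);
CREATE TABLE ccChild (chiId INTEGER PRIMARY KEY, chiName TEXT, chiFamily INTEGER);
CREATE TABLE ccParts (parId INTEGER PRIMARY KEY, parGC TEXT, parChild INTEGER);

-- Invented sample data.
INSERT INTO ccFamily VALUES (1, 'Bolts'), (2, 'Nuts');
INSERT INTO ccChild VALUES (1, 'M4', 1), (2, 'M6', 1), (3, 'Hex', 2);
INSERT INTO ccParts VALUES
    (1, '26', 1), (2, '26', 2), (3, '26', 2), (4, '26', 3), (5, '99', 3);
""")

# The inline view counts parts per family; the outer query only adds the name
# and sorts by the count.
rows = conn.execute("""
    SELECT f.famId, f.famName, pc.parCount
    FROM (
        SELECT c.chiFamily AS famId, COUNT(*) AS parCount
        FROM ccParts p
        JOIN ccChild c ON p.parChild = c.chiId
        WHERE p.parGC = '26'
        GROUP BY c.chiFamily
    ) pc
    JOIN ccFamily f ON f.famId = pc.famId
    ORDER BY pc.parCount
""").fetchall()

for row in rows:
    print(row)
```

Add DESC to the ORDER BY if the families with the most parts should come first.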
In the SELECT, add something like count(ccParts) as count then ORDER BY count instead? Not sure about the structure of your tables so you might need to improvise.

Remove duplicates from LEFT JOIN query

I am using the following JOIN statement:
SELECT *
FROM students2014
JOIN notes2014 ON (students2014.Student = notes2014.NoteStudent)
WHERE students2014.Consultant='$Consultant'
ORDER BY students2014.LastName
to retrieve a list of students (students2014) and corresponding notes for each student stored in (notes2014).
Each student has multiple notes within the notes2014 table and each note has an ID that corresponds with each student's unique ID. The above statement is returning the list of students but duplicating every student that has more than one note. I only want to display the latest note for each student (which is determined by the highest note ID).
Is this possible?
You need another join based on the MAX noteId you got from your select.
Something like this should do it (not tested; next time I'd recommend pasting a link to http://sqlfiddle.com/ with your table structure and some sample data):
SELECT *
FROM students s
LEFT JOIN (
SELECT MAX(NoteId) max_id, NoteStudent
FROM notes
GROUP BY NoteStudent
) aux ON aux.NoteStudent = s.Student
LEFT JOIN notes n2 ON aux.max_id = n2.NoteId
If I may say so, the fact that a table is called students2014 is a big code smell. You'd be much better off with a students table and a year field, for many reasons (just a couple: you won't need to change your DB structure every year, querying across years is much, much easier, etc, etc). Perhaps you "inherited" this, but I thought I'd mention it.
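For a quick test of the "join back on the MAX note ID" idea, here is a small sketch run through Python's sqlite3 module rather than MySQL. It uses the simplified students and notes table names from the query above; the NoteText column and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (Student INTEGER PRIMARY KEY, LastName TEXT);
CREATE TABLE notes (NoteId INTEGER PRIMARY KEY, NoteStudent INTEGER, NoteText TEXT);

-- Invented sample data: Adams has two notes, Baker one, Clark none.
INSERT INTO students VALUES (1, 'Adams'), (2, 'Baker'), (3, 'Clark');
INSERT INTO notes VALUES (1, 1, 'first'), (2, 1, 'second'), (3, 2, 'only');
""")

# aux holds one row per student with the highest NoteId;
# joining notes again on that id fetches the full latest note.
rows = conn.execute("""
    SELECT s.LastName, n2.NoteId, n2.NoteText
    FROM students s
    LEFT JOIN (
        SELECT MAX(NoteId) AS max_id, NoteStudent
        FROM notes
        GROUP BY NoteStudent
    ) aux ON aux.NoteStudent = s.Student
    LEFT JOIN notes n2 ON aux.max_id = n2.NoteId
    ORDER BY s.LastName
""").fetchall()

for row in rows:
    print(row)
```

Each student appears exactly once, and a student without notes is kept with NULL (None in Python) in the note columns because both joins are LEFT JOINs.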
GROUP the query by studentId and select the MAX of the noteId
Try :
SELECT
students2014.Student,
IFNULL(MAX(NoteId),0)
FROM students2014
LEFT JOIN notes2014 ON (students2014.Student = notes2014.NoteStudent)
WHERE students2014.Consultant='$Consultant'
GROUP BY students2014.Student
ORDER BY students2014.LastName

How do I get MYSQL to join a whole table?

I have a SELECT query that returns the response based on an unique ID, so I always get just one row.
I thought that I could save my machine an extra SELECT query if I simply added the prices table to the result, and read them to memory later on.
Would that be a good approach or am I missing something ?
(I tried it out and seems to get the job done)
SELECT *
FROM subscriptions
LEFT JOIN prices ON 1=1
WHERE subscriptions.ID = 100
edit: The prices table has no ID. I just need to get the complete table, I used to have a different SELECT just for that
This looks like a terrible idea... you should join the subscriptions table to the prices table using the foreign key that you (supposedly/should) have.
Assuming your prices table has a subscription ID column then your query should look something like this:
SELECT *
FROM subscriptions LEFT JOIN prices ON subscriptions.ID=prices.ID
WHERE subscriptions.ID=100
What your ON 1=1 query will do is produce a cartesian join - not too bad since you're limiting the 'subscriptions' side of things to a single record, but it will still produce as many rows as there are records on the price side. Where this gets bad is when you've got multiple rows on both sides. Then you get n x m results - think of how big the result set would be if you had 50,000 subscriptions joined against 1000 prices: 50,000 x 1,000 = 50 million result rows.
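The n x m growth is easy to see on a tiny scale. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the columns of both tables (other than subscriptions.ID) and all rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subscriptions (ID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE prices (label TEXT, amount REAL);

-- Invented sample data: 3 subscriptions, 4 prices.
INSERT INTO subscriptions VALUES (100, 'basic'), (101, 'plus'), (102, 'pro');
INSERT INTO prices VALUES ('monthly', 5.0), ('yearly', 50.0),
                          ('student', 3.0), ('team', 20.0);
""")

# One subscription row joined to every price row: 1 x 4 rows.
filtered = conn.execute("""
    SELECT COUNT(*)
    FROM subscriptions
    LEFT JOIN prices ON 1=1
    WHERE subscriptions.ID = 100
""").fetchone()[0]

# Without the WHERE clause every subscription meets every price: 3 x 4 rows.
unfiltered = conn.execute("""
    SELECT COUNT(*)
    FROM subscriptions
    LEFT JOIN prices ON 1=1
""").fetchone()[0]

print(filtered, unfiltered)
```

In the filtered case each of the 4 result rows also carries a full copy of the subscription's columns, which is the extra data a separate SELECT on prices would avoid.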
First off, this approach is going to be much less clear what you're doing than two SELECT statements unless there is an actual relation between the tables. Second, it's probably going to be slower, because you're transferring much more data (each row of prices additionally gets all the fields from subscriptions copied).
If subscriptions and prices are related, you want to change that ON condition to use the relation, so you're only pulling the data you need.
SELECT *
FROM subscriptions s LEFT JOIN prices p ON (s.subscription_id = p.subscription_id)
WHERE s.subscription_id = 100
One thing you definitely don't want to do is this:
SELECT *
FROM subscriptions s LEFT JOIN prices p ON (1=1)
as that'd pull the full Cartesian product. Once your tables get sufficiently large, that will run you out of temporary table space.
Why does your condition have 1=1?
I think it should be something like this:
SELECT s.*,p.*
FROM subscriptions as s
LEFT JOIN prices as p ON p.product_id=s.product_id
WHERE s.ID = 100
Show me the full field lists of the subscriptions and prices tables so I can help you further.
This?
SELECT *
FROM subscriptions, prices
WHERE subscriptions.ID = 100
You'll get horrible results like this, but it seems this is what you wanted.
The table with fewer rows will have its rows repeated. Again, this is not good practice.
Use two SELECTs.
This is a cross join http://en.wikipedia.org/wiki/Join_(SQL)#Cross_join
which means your resultset will contain as many rows as you have in the prices table.
So I guess it is not a good idea