SELECT DISTINCT ON Simple Table Unique Values - mysql

I have two tables, one with unique LTV (lifetime values) with around 3300 records and then the transaction log with more than 5000 transactions.
Whenever I run the following query it keeps showing me duplicate values. I just want to look up the person's first name and last name from the first column.
SELECT
SociAll.firstname,
SociAll.lastname,
SociLTV.Email,
SociLTV.LTV
FROM
SociAll
INNER JOIN SociLTV ON SociAll.Email = SociLTV.Email
Sometimes the same email address is repeated 3 or 4 times depending on the number of transactions from that given user, even though the LTV is the exact same value.
How can I have only 1 record per email address on this Query?

Try this:
SELECT
SociAll.firstname,
SociAll.lastname,
SociLTV.Email,
Sum(SociLTV.LTV)
FROM
SociAll
INNER JOIN SociLTV ON SociAll.Email = SociLTV.Email
GROUP BY SociAll.firstName,SociAll.LastName,SociAll.Email
You can also use COUNT(), MIN(), MAX(), etc. on the last column. If you don't care about the last column at all, remove it.
You can also do the following if you don't need anything from the SociLTV records:
SELECT DISTINCT
SociAll.firstname,
SociAll.lastname,
SociLTV.Email
FROM
SociAll
INNER JOIN SociLTV ON SociAll.Email = SociLTV.Email
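To see why the join repeats rows and how grouping collapses them, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are invented for illustration. Because the LTV is identical on every repeated row, it uses MAX() rather than SUM(), which would multiply the LTV by the number of transactions.

```python
import sqlite3

# SQLite stands in for MySQL here; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SociAll (firstname TEXT, lastname TEXT, Email TEXT, Price REAL);
    CREATE TABLE SociLTV (Email TEXT PRIMARY KEY, LTV REAL);
    INSERT INTO SociAll VALUES
        ('Ann', 'Lee', 'ann@example.com', 10.0),
        ('Ann', 'Lee', 'ann@example.com', 15.0),
        ('Bob', 'Ray', 'bob@example.com', 7.5);
    INSERT INTO SociLTV VALUES
        ('ann@example.com', 25.0),
        ('bob@example.com', 7.5);
""")

join_sql = """
    FROM SociAll
    INNER JOIN SociLTV ON SociAll.Email = SociLTV.Email
"""

# Plain join: one output row per transaction, so Ann appears twice.
plain_rows = conn.execute(
    "SELECT SociAll.firstname, SociAll.lastname, SociLTV.Email, SociLTV.LTV"
    + join_sql
).fetchall()

# Grouped join: one row per customer; MAX() picks the (identical) LTV.
grouped_rows = conn.execute(
    "SELECT SociAll.firstname, SociAll.lastname, SociLTV.Email, MAX(SociLTV.LTV)"
    + join_sql
    + "GROUP BY SociAll.firstname, SociAll.lastname, SociLTV.Email "
    + "ORDER BY SociLTV.Email"
).fetchall()

print(len(plain_rows), "rows without grouping")
print(grouped_rows)
```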

We don't need a sum of the LTV since the LTV already has the final value. To answer the question I had a list of lifetime values of each customer.
SELECT
SociAll.firstname,
SociAll.lastname,
SociLTV.Email,
SociLTV.LTV
FROM
SociAll
INNER JOIN SociLTV ON SociAll.Email = SociLTV.Email
GROUP BY SociAll.firstName,SociAll.LastName,SociAll.Email
If I wasn't comparing the two tables and simply had a list of transactions, this query works great. It's a derivative of your solution.
SELECT
SociAll.Email,
SociAll.firstname,
SociAll.lastname,
Sum(SociAll.Price) as LTV
FROM
SociAll
GROUP BY SociAll.firstName,SociAll.LastName,SociAll.Email
As I posed the question I was using a pre-established 'LTV' table export from a list of customers and their lifetime value from an excel spreadsheet.
Thank you so much for your contributions. I hope others find this post useful.

Related

MySQL INNER JOIN with GROUP BY and COUNT(*)

I've never been able to get my head around INNER JOINs (or any other JOIN types for that matter) so I'm struggling to work out how to use it in my specific situation. In fact, I'm not even sure if it's what I need. I've looked at other examples and read tutorials but my brain just doesn't seem to work the way needed to truly get it (or it doesn't function at all).
Here's the scenario:
I have two tables -
phone_numbers - this table has a list of phone numbers that
belong to lots of different customers. A single customer can have
multiple numbers. For simplicity's sake, we'll say the fields are
'number_id', 'customer_id', 'phone_number'.
call_history - this table has a record of every single call that one of these
numbers in the first table could have had. There's a record for
every individual call going back years. Again, for simplicity,
we'll say the relevant fields are customer_id, phone_number,
call_start_time.
What I'm trying to accomplish is to find all of the numbers that belong to a particular customer_id in the phone numbers table and use that information to search through the call_history table and find the number of calls each phone number has received, and group that by the number of calls for each number, preferably also showing zeros where a number hasn't received any calls at all.
The reason the zero calls is important is because that's the data I'm interested in. Otherwise, I could just get all the information out of the call_history table. But what I'm trying to achieve is find the numbers with no activity.
All I've been able to accomplish is run one query to get all of the numbers belonging to one customer:
SELECT customer_id, phone_number FROM phone_numbers WHERE customer_id = Y;
Then run a second query to get all phone calls for that customer_id for a set duration:
SELECT customer_id, phone_number, COUNT(*) FROM call_history WHERE customer_id = Y and call_start_time >= DATE_SUB(SYSDATE(), INTERVAL 30 DAY) GROUP BY phone_number;
I've then had to use the data returned from both queries and use a VLOOKUP function in Excel to match number of calls for each individual number from the second query to the list of all numbers from the first query, thus leaving blanks in my "all numbers" table and identifying those numbers that had no calls for that time period.
I'm hoping there's some way to do all of this with a single query and return a table of results, listing the zero number of calls with it and eliminate the whole manual Excel bit as it's not overly efficient and prone to human error.
Without at least a workable example from you, it's not easy to re-create your situation. In any case, INNER JOIN might not return the result you expect. In my short time with MySQL, I have mainly used two types of JOIN: INNER JOIN and LEFT JOIN. From what I understand of your question, what you want can be achieved by using LEFT JOIN instead of INNER JOIN. I may not be the best person to explain this, but this is how I understand it:
INNER JOIN - returns only the rows that satisfy the ON clause in both (or all) of the joined tables.
LEFT JOIN - returns every row from the table on the left side of the join, with NULL in the right-hand columns where ON finds no match in the table on the right side. The exception is a WHERE condition on a right-table column, which filters those NULL rows out again.
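The difference between the two join types can be sketched with a tiny data set. This is only an illustration: it runs on Python's sqlite3 module rather than MySQL, and the phone numbers and call time are made up.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE phone_numbers (number_id INTEGER, customer_id INTEGER, phone_number TEXT);
    CREATE TABLE call_history (customer_id INTEGER, phone_number TEXT, call_start_time TEXT);
    INSERT INTO phone_numbers VALUES (1, 7, '555-0001'), (2, 7, '555-0002');
    INSERT INTO call_history VALUES (7, '555-0001', '2024-01-05 10:00:00');
""")

# Same query twice; only the join keyword changes.
query = """
    SELECT pn.phone_number, ch.call_start_time
    FROM phone_numbers AS pn
    {join} call_history AS ch ON ch.phone_number = pn.phone_number
    ORDER BY pn.phone_number
"""

# INNER JOIN drops 555-0002 because it has no matching call.
inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()

# LEFT JOIN keeps 555-0002 and fills the call column with NULL (None).
left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()

print(inner_rows)
print(left_rows)
```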
Now, here is my query suggestion and hopefully it'll be useful for you:
SELECT A.customer_id, A.phone_number,
SUM(CASE WHEN B.call_start_time >= DATE_SUB(SYSDATE(), INTERVAL 30 DAY)
THEN 1 ELSE 0 END) AS Total
FROM phone_numbers A
LEFT JOIN call_history B
ON A.customer_id=B.customer_id
AND A.phone_number=B.phone_number
GROUP BY A.customer_id,A.phone_number;
What I did here is LEFT JOIN the phone_numbers table with call_history on customer_id and phone_number (so each number is matched only with its own calls), and move the call_start_time >= .. condition out of WHERE into a CASE expression in the SELECT, since putting it in WHERE would turn this back into an inner join.
Here is an example fiddle : https://www.db-fiddle.com/f/hriFWqVy5RGbnsdj8i3aVG/1
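As a rough, self-contained check of the idea, the sketch below counts recent calls per number and keeps the zeros. It makes a few assumptions: SQLite (via Python's sqlite3) replaces MySQL, a fixed cutoff string replaces DATE_SUB(SYSDATE(), INTERVAL 30 DAY) so the result is repeatable, the sample rows are invented, and the join matches on phone_number as well as customer_id so that each number gets its own count.

```python
import sqlite3

# SQLite stands in for MySQL; rows and the cutoff date are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE phone_numbers (number_id INTEGER, customer_id INTEGER, phone_number TEXT);
    CREATE TABLE call_history (customer_id INTEGER, phone_number TEXT, call_start_time TEXT);
    INSERT INTO phone_numbers VALUES
        (1, 7, '555-0001'), (2, 7, '555-0002'), (3, 7, '555-0003');
    INSERT INTO call_history VALUES
        (7, '555-0001', '2024-03-20 09:00:00'),
        (7, '555-0001', '2024-03-25 14:30:00'),
        (7, '555-0002', '2023-11-02 08:15:00');
""")

cutoff = "2024-03-01 00:00:00"  # plays the role of "30 days ago"

# Date test inside CASE: numbers with no recent calls still appear, with 0.
counts = conn.execute("""
    SELECT pn.phone_number,
           SUM(CASE WHEN ch.call_start_time >= ? THEN 1 ELSE 0 END) AS total
    FROM phone_numbers AS pn
    LEFT JOIN call_history AS ch
           ON ch.customer_id = pn.customer_id
          AND ch.phone_number = pn.phone_number
    WHERE pn.customer_id = ?
    GROUP BY pn.phone_number
    ORDER BY pn.phone_number
""", (cutoff, 7)).fetchall()

# Date test in WHERE: the NULL rows are filtered out, so the zeros vanish.
filtered = conn.execute("""
    SELECT pn.phone_number, COUNT(*) AS total
    FROM phone_numbers AS pn
    LEFT JOIN call_history AS ch
           ON ch.customer_id = pn.customer_id
          AND ch.phone_number = pn.phone_number
    WHERE pn.customer_id = ? AND ch.call_start_time >= ?
    GROUP BY pn.phone_number
""", (7, cutoff)).fetchall()

print(counts)
print(filtered)
```

The numbers with a total of 0 in the first result are the inactive ones the question is after.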
For an inner join, you would do it this way:
SELECT pn.customer_id, pn.phone_number FROM phone_numbers AS pn, call_history AS ch WHERE pn.customer_id = ch.customer_id AND pn.phone_number = ch.phone_number AND ch.call_start_time >= DATE_SUB(SYSDATE(), INTERVAL 30 DAY) GROUP BY pn.customer_id, pn.phone_number;
Just list whichever tables you want to join and add the join condition.

Join error and order by

I'm trying to write a query which does the below:
For every guest who has the word “Edinburgh” in their address show the total number of nights booked. Be sure to include 0 for those guests who have never had a booking. Show last name, first name, address and number of nights. Order by last name then first name.
I am having problems with making the join work properly,
ER Diagram Snippet:
Here is my current (broken) solution:
SELECT last_name, first_name, address, nights
FROM booking
RIGHT JOIN guest ON (booking.booking_id = guest.id)
WHERE address LIKE '%Edinburgh%';
Here are the results from that query:
The query is partially complete; I'm hoping someone can help me out and create a working version. I'm currently learning SQL, so apologies if it's a rather basic or dumb question!
Your query seems almost correct. You were joining the booking id with the guest id, which gave you some results because of overlapping (matching) ids, but this most likely doesn't correspond to the foreign keys. You should join guest_id from booking to id from guest.
I'd add grouping to sum all booked nights for a particular guest (assuming that nights is an integer):
SELECT g.last_name, g.first_name, g.address, SUM(b.nights) AS nights
FROM guest AS g
LEFT JOIN booking AS b ON b.guest_id = g.id
WHERE g.address LIKE '%Edinburgh%'
GROUP BY g.last_name, g.first_name, g.address;
Are you sure that nights spent should be calculated using nights field? Why can it be null? If you'd like to show zero for null values just wrap it up with a coalesce function like that:
COALESCE(SUM(b.nights), 0)
Notes:
Rewrote the RIGHT JOIN as a LEFT JOIN; that doesn't affect the results, it's just cleaner to me
Using aliases, e.g. AS g, makes the code shorter when specifying join columns
Reference every column with its table alias to avoid ambiguity
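Putting those pieces together, here is a small sketch of the whole query, including the ordering the exercise asks for. It runs on SQLite through Python's sqlite3 module rather than MySQL, the guest and booking rows are invented, and the column names are taken from the queries in this thread (the real schema may differ).

```python
import sqlite3

# SQLite stands in for MySQL; the sample guests and bookings are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE guest (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, address TEXT);
    CREATE TABLE booking (booking_id INTEGER PRIMARY KEY, guest_id INTEGER, nights INTEGER);
    INSERT INTO guest VALUES
        (1, 'Amy', 'Reid', '12 High St, Edinburgh'),
        (2, 'Ben', 'Adams', '3 Leith Walk, Edinburgh'),
        (3, 'Cal', 'Shaw', '9 King St, Glasgow');
    INSERT INTO booking VALUES (10, 1, 2), (11, 1, 3), (12, 3, 4);
""")

# LEFT JOIN keeps guests without bookings; COALESCE turns their NULL sum into 0.
rows = conn.execute("""
    SELECT g.last_name, g.first_name, g.address,
           COALESCE(SUM(b.nights), 0) AS nights
    FROM guest AS g
    LEFT JOIN booking AS b ON b.guest_id = g.id
    WHERE g.address LIKE '%Edinburgh%'
    GROUP BY g.id, g.last_name, g.first_name, g.address
    ORDER BY g.last_name, g.first_name
""").fetchall()

for row in rows:
    print(row)
```

Grouping by g.id as well keeps two guests who happen to share a name and address from being merged into one row.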
SELECT g.first_name,
g.last_name,
g.address,
COALESCE(Sum(b.nights), 0)
FROM booking b
RIGHT JOIN guest g
ON ( b.guest_id = g.id )
WHERE g.address LIKE '%Edinburgh%'
GROUP BY g.last_name,
g.first_name,
g.address;
This post answers your questions about how to make the query.
MySQL SUM with same ID
You can simply use COALESCE as referenced here to avoid the NULL Values
How do I get SUM function in MySQL to return '0' if no values are found?

Multitable counting and multiplying in same query

I have got a somewhat complicated problem. This is my situation (ERD).
For a dashboard I need to create a pivot table that shows the total number of competences used by the vacancies. Therefore I need to:
Count the number of vacancies per template
Count the number of templates per competence
and finally, multiply these numbers to get the total number of competences used.
I have the first query:
SELECT vacancytemplate_id, count(id)
FROM vacancies
group by vacancytemplate_id;
And the second query isn't that difficult either, but I don't know what the right solution will be. I'm literally brainstuck. My mind can't comprehend how I can achieve the next step and put it down in a query. Please kind stranger, help me out :)
EDIT: my desired result is something like this
NameOfComp, NrOfTimesUsed
Leading, 17
Inspiring, 2
EDIT2: the meta query should look like:
SELECT NameOfComp, (count of the competences used by templates) * (number of vacancies per template)
EDIT3: http://sqlfiddle.com/#!9/2773ca SQLFiddle
Thanks a lot!
If I am understanding your request correctly, you are wanting a count of competences per vacancy. This can be done very simply due to your table structure:
Select v.ID, count(*) from vacancy as v inner join CompTemplate_Table as CT
on v.Template_ID = CT.Template_ID group by v.ID;
The reason you can do only one join is because there will be a record in the CompTemplate_Table for every competency in each template. Additionally, the same key is used to join vacancy to templates as is used to join templates to CompTemplate_Table, so they represent the same key value (and you can skip joining the Templates table if you don't need data from there).
If you are wanting to add this data to a pivot table, I will leave that exercise to you. There are a number of tutorials available if you do a quick google search and it should not be that hard.
UPDATE: For the second query you are looking at something like:
Select CP.NameOfComp, count(*) from vacancy as v inner join CompTemplate_Table as CT
on v.Template_ID = CT.Template_ID inner join competencies as CP
on CP.ID = CT.Comp_ID
group by CP.NameOfComp
The differences here are that you are adding in the competencies table, since you need data from it, and grouping by CP.NameOfComp instead of the vacancy id. You can also restrict this to specific templates, competencies, or vacancies by adding search conditions (e.g. where CP.ID = 12345).
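A quick way to convince yourself that the joins already do the "multiplication" is to run them on a toy data set. The sketch below is only an assumption-laden illustration: it uses SQLite via Python's sqlite3, the table and column names are the ones guessed at in this answer (not necessarily those in the ERD or the fiddle), and the rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL; table names follow the answer's guesses.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vacancy (ID INTEGER PRIMARY KEY, Template_ID INTEGER);
    CREATE TABLE CompTemplate_Table (Template_ID INTEGER, Comp_ID INTEGER);
    CREATE TABLE competencies (ID INTEGER PRIMARY KEY, NameOfComp TEXT);
    INSERT INTO competencies VALUES (1, 'Leading'), (2, 'Inspiring');
    -- Template 1 uses both competences, template 2 only 'Leading'.
    INSERT INTO CompTemplate_Table VALUES (1, 1), (1, 2), (2, 1);
    -- Three vacancies use template 1, two use template 2.
    INSERT INTO vacancy VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2);
""")

# Each vacancy row is repeated once per competence of its template,
# so COUNT(*) per competence is already "vacancies x templates".
rows = conn.execute("""
    SELECT cp.NameOfComp, COUNT(*) AS NrOfTimesUsed
    FROM vacancy AS v
    INNER JOIN CompTemplate_Table AS ct ON v.Template_ID = ct.Template_ID
    INNER JOIN competencies AS cp ON cp.ID = ct.Comp_ID
    GROUP BY cp.NameOfComp
    ORDER BY cp.NameOfComp
""").fetchall()

print(rows)
```

Here 'Leading' is counted 3 + 2 = 5 times and 'Inspiring' 3 times, which matches counting vacancies per template and summing over the templates that contain each competence.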

Remove duplicates from LEFT JOIN query

I am using the following JOIN statement:
SELECT *
FROM students2014
JOIN notes2014 ON (students2014.Student = notes2014.NoteStudent)
WHERE students2014.Consultant='$Consultant'
ORDER BY students2014.LastName
to retrieve a list of students (students2014) and corresponding notes for each student stored in (notes2014).
Each student has multiple notes within the notes2014 table and each note has an ID that corresponds with each student's unique ID. The above statement returns the list of students but duplicates every student that has more than one note. I only want to display the latest note for each student (which is determined by the highest note ID).
Is this possible?
You need another join based on the MAX noteId you got from your select.
Something like this should do it (not tested; next time I'd recommend pasting a link to http://sqlfiddle.com/ with your table structure and some sample data):
SELECT *
FROM students s
LEFT JOIN (
SELECT MAX(NoteId) max_id, NoteStudent
FROM notes
GROUP BY NoteStudent
) aux ON aux.NoteStudent = s.Student
LEFT JOIN notes n2 ON aux.max_id = n2.NoteId
If I may say so, the fact that a table is called students2014 is a big code smell. You'd be much better off with a students table and a year field, for many reasons (just a couple: you won't need to change your DB structure every year, querying across years is much, much easier, etc, etc). Perhaps you "inherited" this, but I thought I'd mention it.
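For what it's worth, the "join against the MAX note id" approach can be tried out on a small data set. This sketch uses SQLite through Python's sqlite3 in place of MySQL, keeps the table names from the question, and assumes a hypothetical NoteText column plus invented rows. It also passes the consultant as a bound parameter instead of interpolating '$Consultant' into the SQL string.

```python
import sqlite3

# SQLite stands in for MySQL; NoteText and all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students2014 (Student INTEGER PRIMARY KEY, LastName TEXT, Consultant TEXT);
    CREATE TABLE notes2014 (NoteId INTEGER PRIMARY KEY, NoteStudent INTEGER, NoteText TEXT);
    INSERT INTO students2014 VALUES
        (1, 'Smith', 'c1'), (2, 'Jones', 'c1'), (3, 'Brown', 'c1');
    INSERT INTO notes2014 VALUES
        (1, 1, 'first'), (2, 1, 'second'), (3, 2, 'only');
""")

# The derived table finds the highest NoteId per student; the second join
# fetches that single note. LEFT JOINs keep students without any notes.
rows = conn.execute("""
    SELECT s.Student, s.LastName, n.NoteId, n.NoteText
    FROM students2014 AS s
    LEFT JOIN (
        SELECT NoteStudent, MAX(NoteId) AS max_id
        FROM notes2014
        GROUP BY NoteStudent
    ) AS aux ON aux.NoteStudent = s.Student
    LEFT JOIN notes2014 AS n ON n.NoteId = aux.max_id
    WHERE s.Consultant = ?
    ORDER BY s.LastName
""", ("c1",)).fetchall()

for row in rows:
    print(row)
```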
GROUP the query by studentId and select the MAX of the noteId
Try :
SELECT
students2014.Student,
IFNULL(MAX(NoteId),0)
FROM students2014
LEFT JOIN notes2014 ON (students2014.Student = notes2014.NoteStudent)
WHERE students2014.Consultant='$Consultant'
GROUP BY students2014.Student
ORDER BY students2014.LastName

Counting records from another mysql table with a left join

I have two DB tables, one that store events and the second that stores any associated comments for that event.
DB Tables:
events: id, owner_id, timestamp
comments: cmt_id, parent_id(events id), cmt_time
I'm trying to get the last 5 comments for each event based on a specific owner_id.
This is how I'm joining my tables:
SELECT * FROM `events`
LEFT JOIN comments ON comments.parent_id=events.id
WHERE owner_id=X
ORDER BY timestamp DESC LIMIT 0,5
Any idea how I can get the number of comments based on the event_id?
Your question is about the number of comments for each event (at least as I interpret it). For this, you want to use a group by:
SELECT e.id, COUNT(c.parent_id) as NumComments
FROM events e left JOIN
comments c
ON c.parent_id=e.id
WHERE e.owner_id = X
group by e.id;
As for the query in your question: it does not do what you want it to do ("I'm trying to get the last 5 comments for each event based on a specific owner_id."). Instead, it is getting the last five comments for a given user. Period.
You can do the table join and then use the COUNT() function to count how many comments are associated with a given event_id
http://www.w3schools.com/sql/sql_func_count.asp
I would give an example, but I'm not entirely sure what you would like your end dataset to look like. COUNT(col) counts the rows in each group of the query result where col is not NULL.
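As a hedged illustration of counting comments per event, the sketch below compares COUNT(col) with COUNT(*) on a LEFT JOIN. It uses SQLite via Python's sqlite3 instead of MySQL, the table layout from the question, and invented rows.

```python
import sqlite3

# SQLite stands in for MySQL; the events and comments are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, owner_id INTEGER, timestamp TEXT);
    CREATE TABLE comments (cmt_id INTEGER PRIMARY KEY, parent_id INTEGER, cmt_time TEXT);
    INSERT INTO events VALUES
        (1, 42, '2024-01-01 12:00:00'),
        (2, 42, '2024-01-02 12:00:00'),
        (3, 99, '2024-01-03 12:00:00');
    INSERT INTO comments VALUES
        (1, 1, '2024-01-01 13:00:00'),
        (2, 1, '2024-01-01 14:00:00'),
        (3, 3, '2024-01-03 13:00:00');
""")

# COUNT(c.parent_id) skips the NULL produced for an event with no comments,
# while COUNT(*) counts that placeholder row and would report 1 instead of 0.
rows = conn.execute("""
    SELECT e.id,
           COUNT(c.parent_id) AS NumComments,
           COUNT(*) AS JoinedRows
    FROM events AS e
    LEFT JOIN comments AS c ON c.parent_id = e.id
    WHERE e.owner_id = ?
    GROUP BY e.id
    ORDER BY e.id
""", (42,)).fetchall()

print(rows)
```

Event 2 has no comments, so NumComments is 0 while JoinedRows is 1, which is why the column form of COUNT is the one to use with a LEFT JOIN.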