ORDER BY # of rows with same option ID from different table - mysql

I've made a voting script which already works, but I wanted to practice some MySQL and do the sorting/filtering of the results via SQL queries instead of fetching an entire table as an array and working through it with loops.
I've run into an issue with sorting the options of a vote by the number of times each one was voted on. My DB has 3 tables related to this script:
vote_votes - which contains the cast votes, one row per vote cast
vote_options - which contains the possible options for every vote
vote_list - which contains the list of votes with title, type etc.
My original script just got all the options that matched the currently visited vote's ID using
SELECT * FROM vote_options WHERE vote_options.vid = $voteID
then counted the rows matching each option's unique ID inside vote_votes in an array, then sorted it by the number of rows with a custom sorting function.
I want to do the same in one query, and I think it's possible, I just don't know how. Here's my current SELECT statement:
SELECT
options.optid as id,
options.value as value
FROM vote_options options
WHERE vid = {$vote['vid']};
Basically, inside vote_votes, each entry has a unique entryid and an optid column, and I want to add it to the query in a way that these entries are counted as WHERE vote_votes.optid = options.optid (Option IDs are unique, so no need to also look for vote ID).
I was hoping this would work, but it's obviously wrong. This is the closest I got before giving up and asking a question here.
SELECT
options.optid as id,
options.value as value
FROM vote_options options
WHERE vid = {$vote['vid']}
ORDER BY (
SELECT COUNT(*)
FROM vote_votes
WHERE vote_votes.optid = options.optid
) DESC;

I've found the solution; it was just a matter of moving the SELECT in brackets to the correct place:
SELECT
options.optid as id,
options.value as value,
options.vid as voteid,
(
SELECT COUNT(*)
FROM vote_votes votes
WHERE votes.optid = options.optid
) as votes
FROM vote_options options
WHERE options.vid = {$vote['vid']}
ORDER BY votes DESC;
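The same ordering can also be expressed with a LEFT JOIN and GROUP BY instead of a correlated subquery. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the sample rows are made up, and the function name options_by_votes is hypothetical. It also binds the vote ID as a parameter rather than interpolating it into the string.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vote_options (optid INTEGER PRIMARY KEY, vid INTEGER, value TEXT);
    CREATE TABLE vote_votes (entryid INTEGER PRIMARY KEY, optid INTEGER);
    -- Hypothetical sample data: three options for vote 1, one for vote 2.
    INSERT INTO vote_options VALUES (1, 1, 'Red'), (2, 1, 'Green'), (3, 1, 'Blue'), (4, 2, 'Other');
    INSERT INTO vote_votes (optid) VALUES (2), (2), (2), (1), (4);
""")

def options_by_votes(conn, vote_id):
    """Return (id, value, votes) rows for one vote, most-voted option first."""
    sql = """
        SELECT o.optid AS id, o.value AS value, COUNT(v.entryid) AS votes
        FROM vote_options o
        LEFT JOIN vote_votes v ON v.optid = o.optid  -- keeps options with zero votes
        WHERE o.vid = ?
        GROUP BY o.optid, o.value
        ORDER BY votes DESC, o.optid
    """
    return conn.execute(sql, (vote_id,)).fetchall()

results = options_by_votes(conn, 1)
print(results)
```

COUNT(v.entryid) only counts matched rows, so an option nobody voted for still appears with a count of 0.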

Related

Chat mySQL table, select statement to find unique conversations

I have a table of chat messages:
id, nToUserID, nFromUserID, strMessage
I'm trying to find unique occurrences of messages between two users. NOT all the messages, just whether there is at least one message to a user or from a user. I'll use this to show a list of "conversations" which could then be clicked to view the full chat thread.
I tried using a DISTINCT select, but that appeared to still give me multiple records between the same users.
I thought about a left JOIN, but again that appears to give me multiple or empty records.
If I understand correctly, you could use least() and greatest():
select distinct least(nToUserID, nFromUserID), greatest(nToUserID, nFromUserID)
from t;
If you want the other users, then use:
select distinct (case when nToUserID = ? then nFromUserID else nToUserID end) as userID
from t
where ? in (nToUserID, nFromUserID);
The ? is the id of the user you want the connections to.
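To see both queries run, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. SQLite has no least()/greatest(), so the two-argument scalar MIN()/MAX() take their place; the sample messages and the helper name partners are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, nToUserID INTEGER,
                    nFromUserID INTEGER, strMessage TEXT);
    -- Hypothetical sample messages between users 1, 2 and 3.
    INSERT INTO t (nToUserID, nFromUserID, strMessage) VALUES
        (1, 2, 'hi'), (2, 1, 'hello'), (1, 3, 'hey'), (3, 2, 'yo'), (2, 1, 'bye');
""")

# One row per unordered pair of users. In MySQL these would be
# LEAST(...) and GREATEST(...); SQLite's scalar MIN/MAX do the same job.
pairs = conn.execute("""
    SELECT DISTINCT MIN(nToUserID, nFromUserID) AS user_a,
                    MAX(nToUserID, nFromUserID) AS user_b
    FROM t
    ORDER BY user_a, user_b
""").fetchall()

def partners(conn, user_id):
    """Return the IDs of everyone who has exchanged a message with user_id."""
    sql = """
        SELECT DISTINCT CASE WHEN nToUserID = ? THEN nFromUserID
                             ELSE nToUserID END AS userID
        FROM t
        WHERE ? IN (nToUserID, nFromUserID)
        ORDER BY userID
    """
    return [row[0] for row in conn.execute(sql, (user_id, user_id))]

print(pairs)
print(partners(conn, 1))
```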
You can use the count function to get the unique id pairs and their respective number of conversations.
SELECT nFromUserId, nToUserId, count(id) FROM `table` GROUP BY nFromUserId, nToUserId
Though this doesn't work if you need to count nFromUserId and nToUserId interchangeably.
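If the two directions should be counted together, one option is to group by the normalized pair instead of the raw columns. This is a sketch under assumptions: Python's sqlite3 stands in for MySQL (so scalar MIN()/MAX() replace LEAST()/GREATEST()), and the table name messages and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (id INTEGER PRIMARY KEY, nToUserId INTEGER,
                           nFromUserId INTEGER, strMessage TEXT);
    -- Hypothetical sample data.
    INSERT INTO messages (nToUserId, nFromUserId, strMessage) VALUES
        (1, 2, 'hi'), (2, 1, 'hello'), (1, 3, 'hey'), (3, 2, 'yo'), (2, 1, 'bye');
""")

# Group on the (smaller ID, larger ID) pair so that 1->2 and 2->1
# land in the same conversation.
conversation_counts = conn.execute("""
    SELECT MIN(nFromUserId, nToUserId) AS user_a,
           MAX(nFromUserId, nToUserId) AS user_b,
           COUNT(id) AS message_count
    FROM messages
    GROUP BY MIN(nFromUserId, nToUserId), MAX(nFromUserId, nToUserId)
    ORDER BY user_a, user_b
""").fetchall()

print(conversation_counts)
```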

Accessing multiple tables at the same time and multiplying values in SQL

My problem is very specific and I couldn't figure out a better name for the title.
I have 3 tables, which are Pessoa (Person), Bicicleta (Bicycle) and Viagem (Trip):
What I want to do is select the names of the individuals who had a trip, in alphabetical order, together with the Avaliacao (Evaluation) multiplied by Valor_Viagem (Trip cost).
What I tried to do (not working properly nor finished):
select distinct PESSOA.Nome, VIAGEM.Avaliacao, VIAGEM.Id_Bicicleta, BICICLETA.Valor_Viagem from PESSOA, VIAGEM
join BICICLETA ON VIAGEM.Id_Bicicleta = BICICLETA.Id where PESSOA.Email IN (
SELECT Email_Utilizador FROM VIAGEM
);
Which gives me:
^This is NOT what I want, as stated before.
I am also not 100% sure what you are looking for, but I assume you need a list of distinct names with Avaliacao * Valor_Viagem summed for each person (so a person with 5 trips gets the sum of five Avaliacao * Valor_Viagem products).
That is very easy to achieve:
select PESSOA.Nome, SUM(VIAGEM.Avaliacao * BICICLETA.Valor_Viagem) AS trip_cost
from PESSOA
join VIAGEM ON VIAGEM.Email_Utilizador = PESSOA.Email
join BICICLETA ON VIAGEM.Id_Bicicleta = BICICLETA.Id
GROUP BY PESSOA.Nome
ORDER BY PESSOA.Nome;
What happens is the following:
first you compute the product for each trip
then you use the GROUP BY clause to group persons with identical names together
using SUM in combination with GROUP BY sums the values over all rows in each group, in this case all records with the same PESSOA.Nome
A word of warning
This assumes you will have distinct names. This appears risky. Better assign each person a unique Id and use this Id as foreign key instead of the name.
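The effect of that warning can be sketched by grouping on a unique key instead of the name. The example below uses Python's sqlite3 as a stand-in for MySQL. The schema is an assumption pieced together from the column names in the question (the original table diagram is not available), with PESSOA.Email acting as the unique key, and all rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema, reconstructed from the column names used in the question.
    CREATE TABLE PESSOA (Email TEXT PRIMARY KEY, Nome TEXT);
    CREATE TABLE BICICLETA (Id INTEGER PRIMARY KEY, Valor_Viagem REAL);
    CREATE TABLE VIAGEM (Id INTEGER PRIMARY KEY, Email_Utilizador TEXT,
                         Id_Bicicleta INTEGER, Avaliacao INTEGER);
    -- Hypothetical data: two different people are both called 'Ana'.
    INSERT INTO PESSOA VALUES ('ana1@example.com', 'Ana'), ('ana2@example.com', 'Ana'),
                              ('rui@example.com', 'Rui'), ('eva@example.com', 'Eva');
    INSERT INTO BICICLETA VALUES (1, 2.0), (2, 3.0);
    INSERT INTO VIAGEM (Email_Utilizador, Id_Bicicleta, Avaliacao) VALUES
        ('ana1@example.com', 1, 5), ('ana1@example.com', 2, 4),
        ('rui@example.com', 2, 3), ('ana2@example.com', 1, 1);
""")

# Grouping by the unique Email keeps the two 'Ana' rows apart;
# grouping by Nome alone would merge them into one total.
trip_costs = conn.execute("""
    SELECT p.Nome, p.Email, SUM(v.Avaliacao * b.Valor_Viagem) AS trip_cost
    FROM PESSOA p
    JOIN VIAGEM v ON v.Email_Utilizador = p.Email
    JOIN BICICLETA b ON b.Id = v.Id_Bicicleta
    GROUP BY p.Email, p.Nome
    ORDER BY p.Nome, p.Email
""").fetchall()

print(trip_costs)
```

Because the joins are inner joins, a person without any trip (Eva in the sample data) does not appear in the result.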

SQL Query sorting rows by duplicate name keeping lowest in result

I've got a table with 11 columns and I want to create a query that removes the rows with duplicate names in the Full Name column but keeps the row with the lowest value in the Result column. Currently I have this.
SELECT
MIN(sql363686.Results2014.Result),
sql363686.Results2014.Temp,
sql363686.Results2014.`Full Name`,
sql363686.Results2014.Province,
sql363686.Results2014.BirthDate,
sql363686.Results2014.Position,
sql363686.Results2014.Location,
sql363686.Results2014.Date
FROM
sql363686.Results2014
WHERE
sql363686.Results2014.Event = '50m Freestyle'
AND sql363686.Results2014.Gender = 'M'
AND sql363686.Results2014.Agegroup = 'Junior'
GROUP BY
sql363686.Results2014.`Full Name`
ORDER BY
sql363686.Results2014.Result ASC ;
At first glance it seems to work fine and I get all the correct values, but I seem to be getting a different (wrong) value in the Position column than what I have in my database table. All other values seem to be right. Any ideas on what I'm doing wrong?
I'm currently using dbVisualizer connected to a MySQL database. Also, my knowledge and experience with SQL is the bare minimum.
Use group by and a join:
select r.*
from sql363686.Results2014 r
join (select fullname, min(result) as minresult
from sql363686.Results2014
group by fullname
) rr
on rr.fullname = r.fullname and rr.minresult = r.result;
You have fallen into the trap of the nonstandard MySQL extension to GROUP BY.
(I'm not going to work with all those fully qualified column names; it's unnecessary and verbose.)
I think you're looking for each swimmer's best time in a particular event, and you're trying to pull that from a so-called denormalized table. It looks like your table has these columns.
Result
Temp
FullName
Province
BirthDate
Position
Location
Date
Event
Gender
Agegroup
So, the first step is to locate the best time in each event for each swimmer. To do this we need to make a couple of assumptions.
A person is uniquely identified by FullName, BirthDate, and Gender.
An event is uniquely identified by Event, Gender, Agegroup.
This subquery will get the best time for each swimmer in each event.
SELECT MIN(Result) BestResult,
FullName,BirthDate, Gender,
Event, Agegroup
FROM Results2014
GROUP BY FullName,BirthDate, Gender, Event, Agegroup
This gets you a virtual table with each person's fastest result in each event (using the definitions of person and event mentioned earlier).
Now the challenge is to go find out the circumstances of each person's best time. Those circumstances include Temp, Province, Position, Location, Date. We'll do that with a JOIN between the original table and our virtual table, like this
SELECT resu.Event,
resu.Gender,
resu.Agegroup,
resu.Result,
resu.Temp,
resu.FullName,
resu.Province,
resu.BirthDate,
resu.Position,
resu.Location,
resu.Date
FROM Results2014 resu
JOIN (
SELECT MIN(Result) BestResult,
FullName,BirthDate, Gender,
Event, Agegroup
FROM Results2014
GROUP BY FullName,BirthDate, Gender, Event, Agegroup
) best
ON resu.Result = best.BestResult
AND resu.FullName = best.FullName
AND resu.BirthDate = best.BirthDate
AND resu.Gender = best.Gender
AND resu.Event = best.Event
AND resu.Agegroup = best.Agegroup
ORDER BY resu.Agegroup, resu.Gender, resu.Event, resu.FullName, resu.BirthDate
Do you see how this works? You need an aggregate query that pulls the best times. Then you need to use the column values in that aggregate query in the ON clause to go get the details of the best times from the detail table.
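A trimmed-down version of that aggregate-then-join pattern can be tried out as follows. This is a sketch only: Python's sqlite3 stands in for MySQL, the table keeps just a subset of the columns listed above, and the swimmers and times are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Subset of the columns from the question; all rows are hypothetical.
    CREATE TABLE Results2014 (Result REAL, FullName TEXT, BirthDate TEXT, Gender TEXT,
                              Event TEXT, Agegroup TEXT, Position INTEGER, Location TEXT);
    INSERT INTO Results2014 VALUES
        (27.5,  'Al Smith', '2000-01-01', 'M', '50m Freestyle', 'Junior', 3, 'Durban'),
        (26.5,  'Al Smith', '2000-01-01', 'M', '50m Freestyle', 'Junior', 1, 'Cape Town'),
        (28.25, 'Bo Jones', '2001-05-05', 'M', '50m Freestyle', 'Junior', 2, 'Durban');
""")

# The subquery finds each swimmer's best time per event; the join then
# fetches the full detail row (Position, Location) that holds that time.
best_rows = conn.execute("""
    SELECT resu.FullName, resu.Result, resu.Position, resu.Location
    FROM Results2014 resu
    JOIN (
        SELECT MIN(Result) AS BestResult, FullName, BirthDate, Gender, Event, Agegroup
        FROM Results2014
        GROUP BY FullName, BirthDate, Gender, Event, Agegroup
    ) best
      ON resu.Result = best.BestResult
     AND resu.FullName = best.FullName
     AND resu.BirthDate = best.BirthDate
     AND resu.Gender = best.Gender
     AND resu.Event = best.Event
     AND resu.Agegroup = best.Agegroup
    ORDER BY resu.FullName
""").fetchall()

print(best_rows)
```

Position and Location now come from the same row as the minimum Result, which is what the plain GROUP BY query could not guarantee. If a swimmer has two rows tied for the best time, both are returned.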
If you want to report on just one event you can include an appropriate WHERE clause right before ORDER BY as follows.
WHERE resu.Event = '50m Freestyle'
AND resu.Gender = 'M'
AND resu.Agegroup = 'Junior'

SQL performance of a large number of sum()s

Within my J2EE web application, I need to generate a bar chart representing the percentage of users in the system with specific alerts. (EDIT - I forgot to mention, the graph only deals with alerts associated with the first situation of each user, thus the min(date) ).
A simplified (but structurally similar) version of my database schema is as follows :
users { id, name }
situations { id, user_id, date }
alerts { id, situation_id, alertA, alertB }
where users to situations are 1-n, and situations to alerts are 1-1.
I've omitted datatypes but the alerts (alertA and B) are booleans. In my actual case, there are many such alerts (30-ish).
So far, this is what I have come up with :
select sum(alerts.alertA), sum(alerts.alertB)
from alerts, (
select id, min(date)
from situations
group by user_id) as situations
where situations.id = alerts.situation_id;
and then divide these sums by
select count(users.id) from users;
This seems far from ideal.
Your recommendations/advice as to how to improve this query would be most appreciated (or maybe I need to re-think my database schema)...
Thanks,
Anthony
PS. I was also thinking of using a trigger to refresh a chart specific table whenever the alerts table is updated but I guess that's a subject for a different query (if it turns out to be problematic).
First, think about your schema again. You will have a lot of different alerts and you probably don't want to add a separate column for every one of those.
Consider changing your alerts table to something like { id, situation_id, type, value } where type would be (A,B,C,....) and value would be your boolean.
Your task to calculate the percentages would then split up into:
(1) Count the total number of users:
SELECT COUNT(id) AS total FROM users
(2) Find the "first" situation for each user:
SELECT MIN(situations.id) AS id, situations.user_id
-- selects the minimum date for every user_id
FROM (SELECT user_id, MIN(date) AS min_date
FROM situations
GROUP BY user_id) AS first_situation
-- gets the situations.id for user with minimum date
JOIN situations ON
first_situation.user_id = situations.user_id AND
first_situation.min_date = situations.date
-- limits number of situations per user to 1 (possible min_date duplicates)
GROUP BY situations.user_id
(3) Count users for whom an alert is set in at least one of the situations in the subquery:
SELECT
alerts.type,
COUNT(situations.user_id)
FROM ( ... situations.user_id, situations.id ... ) AS situations
JOIN alerts ON
situations.id = alerts.situation_id
WHERE
alerts.value = 1
GROUP BY
alerts.type
Put those three steps together to get something like:
SELECT
alerts.type,
COUNT(situations.user_id) / MAX(users.total) AS fraction
FROM (SELECT MIN(situations.id) AS id, situations.user_id
FROM (SELECT user_id, MIN(date) AS min_date
FROM situations
GROUP BY user_id) AS first_situation
JOIN situations ON
first_situation.user_id = situations.user_id AND
first_situation.min_date = situations.date
GROUP BY situations.user_id
) AS situations
JOIN alerts ON
situations.id = alerts.situation_id
CROSS JOIN (SELECT COUNT(id) AS total FROM users) AS users
WHERE
alerts.value = 1
GROUP BY
alerts.type
All queries written from my head without testing. Even if they don't work exactly like that, you should still get the idea!
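For anyone who wants to try the idea, here is a small sketch that runs the combined query against the suggested { id, situation_id, type, value } layout. It uses Python's sqlite3 as a stand-in for MySQL, the data is invented, and the short aliases (fd, s, fs, a, u) are my own. SQLite divides integers as integers, hence the `* 1.0`, which MySQL does not need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE situations (id INTEGER PRIMARY KEY, user_id INTEGER, date TEXT);
    CREATE TABLE alerts (id INTEGER PRIMARY KEY, situation_id INTEGER, type TEXT, value INTEGER);
    -- Hypothetical data: 4 users, user 4 has no situation at all.
    INSERT INTO users VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
    INSERT INTO situations VALUES (1, 1, '2020-01-01'), (2, 1, '2020-02-01'),
                                  (3, 2, '2020-01-15'), (4, 3, '2020-03-01');
    INSERT INTO alerts (situation_id, type, value) VALUES
        (1, 'A', 1), (1, 'B', 0), (2, 'A', 0), (2, 'B', 1),
        (3, 'A', 1), (3, 'B', 1), (4, 'A', 0), (4, 'B', 0);
""")

fractions = conn.execute("""
    SELECT a.type,
           COUNT(fs.user_id) * 1.0 / MAX(u.total) AS fraction  -- * 1.0 forces real division in SQLite
    FROM (
        -- one "first" situation per user (MIN(s.id) breaks ties on equal dates)
        SELECT MIN(s.id) AS id, s.user_id
        FROM (SELECT user_id, MIN(date) AS min_date
              FROM situations
              GROUP BY user_id) AS fd
        JOIN situations s ON s.user_id = fd.user_id AND s.date = fd.min_date
        GROUP BY s.user_id
    ) AS fs
    JOIN alerts a ON a.situation_id = fs.id
    CROSS JOIN (SELECT COUNT(id) AS total FROM users) AS u
    WHERE a.value = 1
    GROUP BY a.type
    ORDER BY a.type
""").fetchall()

print(fractions)
```

With this data, alert A is set on the first situation of users 1 and 2, and alert B only on that of user 2, out of 4 users in total.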

Update Based on a common value where there are multiple matches

I've got two tables (MySQL database), one called cities (that has columns 'state_id' and 'stateAB'--state_id is the column I would like to fill, stateAB is the 2-letter code of the state--I want this to serve as the key value).
I have another table called states (that has columns 'id' [this is the value that I want to go into the 'state_id' field of 'cities'], and a 'title' field [2-letter state codes] to be the common-key value).
I wanted to use a simple:
UPDATE cities SET state_id=(SELECT id FROM states WHERE states.title=cities.stateAB)
With the idea being that state_id will be set to the id that is returned where the 2-letter codes match.
The problem is that the following is returned:
#1242 - Subquery returns more than 1 row
I assume this is because the codes match more than once per state, for the simple reason that there are multiple cities per state (and they all have the same state/codes).
I'm not sure how to change this to make it work--I'm sure it's something obvious I'm just missing, but I don't know how to deal with the issue.
This is your query:
UPDATE cities
SET state_id = (SELECT id FROM states WHERE states.title = cities.stateAB);
You are getting the error because states has duplicates in the title column. You can find these by running:
select title, count(*) as numdups
from states
group by title
having count(*) > 1;
You may not care about the duplicates and be happy to select just one id (consistently) when there is a match. If so, you can do:
UPDATE cities
SET state_id = (SELECT MIN(id) FROM states WHERE states.title = cities.stateAB);
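Both steps can be tried on toy data. The sketch below uses Python's sqlite3 as a stand-in for MySQL; the rows, and the name column on cities, are made up. Note that SQLite would not raise error #1242 for the original subquery (it silently takes one row), so only the duplicate check and the MIN(id) update are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE states (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE cities (name TEXT, stateAB TEXT, state_id INTEGER);
    -- Hypothetical data: 'NY' appears twice in states, 'ZZ' has no state row.
    INSERT INTO states VALUES (1, 'NY'), (2, 'CA'), (3, 'NY');
    INSERT INTO cities (name, stateAB) VALUES
        ('Albany', 'NY'), ('Buffalo', 'NY'), ('Fresno', 'CA'), ('Nowhere', 'ZZ');
""")

# Step 1: list the state codes that occur more than once in states.
duplicates = conn.execute("""
    SELECT title, COUNT(*) AS numdups
    FROM states
    GROUP BY title
    HAVING COUNT(*) > 1
""").fetchall()

# Step 2: MIN(id) always yields exactly one value per city row,
# or NULL when no state matches.
conn.execute("""
    UPDATE cities
    SET state_id = (SELECT MIN(id) FROM states WHERE states.title = cities.stateAB)
""")
updated = conn.execute("SELECT name, state_id FROM cities ORDER BY name").fetchall()

print(duplicates)
print(updated)
```

Cities whose code has no match in states end up with a NULL state_id, so it may be worth checking for those afterwards.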