How to select distinct rows from a table without a primary key - sql-server-2008

I need to show a notification on user login if there are any unread messages. So if multiple users send messages (say, 5 each) while the user is offline, these messages should be shown on login. That means I have to show the last message from each user.
I use a join to find the records.
In this scenario MessageFrom is not a primary key.
This is my query:
SELECT
UserMessageConversations.MessageFrom, UserMessageConversations.MessageFromUserName,
UserMessages.MessageTo, UserMessageConversations.IsGroupChat,
UserMessageConversations.IsLocationChat,
UserMessageConversations.Message, UserMessages.UserGroupID,UserMessages.LocationID
FROM
UserMessageConversations
LEFT OUTER JOIN
UserMessages ON UserMessageConversations.UserMessageID = UserMessages.UserMessageID
WHERE
UserMessageConversations.MessageTo = 743
AND UserMessageConversations.ReadFlag = 0
This is the output obtained from the above query.
MessageFrom 582 appears twice. I need only one record for this user.
How is this possible?

I'm not entirely sure I totally understand your question - but one approach would be to use a CTE (Common Table Expression).
With this CTE, you can partition your data by some criterion (here, your MessageFrom) and have SQL Server number the rows starting at 1 within each partition, ordered by some other criterion. That ordering is the part that's unclear from your question: do you care what the rows for each MessageFrom are sorted on? Do you have some kind of MessageDate column that you could order by?
So try something like this:
;WITH PartitionedMessages AS
(
SELECT
umc.MessageFrom, umc.MessageFromUserName,
um.MessageTo, umc.IsGroupChat,
umc.IsLocationChat,
umc.Message, um.UserGroupID, um.LocationID ,
ROW_NUMBER() OVER(PARTITION BY umc.MessageFrom
ORDER BY umc.MessageDate DESC) AS RowNum -- MessageDate is a guess; the question doesn't say what to order by
FROM
dbo.UserMessageConversations umc
LEFT OUTER JOIN
dbo.UserMessages um ON umc.UserMessageID = um.UserMessageID
WHERE
umc.MessageTo = 743
AND umc.ReadFlag = 0
)
SELECT
MessageFrom, MessageFromUserName, MessageTo,
IsGroupChat, IsLocationChat,
Message, UserGroupID, LocationID
FROM
PartitionedMessages
WHERE
RowNum = 1
Here, I am selecting only the "first" entry for each "partition" (i.e. for each MessageFrom), ordered by an imagined MessageDate column so that the most recent message is selected.
Is that close to what you're looking for?
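The numbering-per-partition idea can be tried out on toy data. The following is a minimal sketch in Python using the standard sqlite3 module instead of SQL Server (it needs an SQLite build with window functions, 3.25 or later). The table is cut down to a few columns, MessageDate is still the assumed column from the answer, and the helper name latest_unread_per_sender is made up for illustration.

```python
import sqlite3

# Simplified, hypothetical schema: MessageDate is an assumed column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UserMessageConversations ("
    " MessageFrom INTEGER, MessageTo INTEGER, Message TEXT,"
    " MessageDate TEXT, ReadFlag INTEGER)"
)
conn.executemany(
    "INSERT INTO UserMessageConversations VALUES (?, ?, ?, ?, ?)",
    [
        (582, 743, "first from 582", "2014-01-01", 0),
        (582, 743, "latest from 582", "2014-01-03", 0),
        (600, 743, "only from 600", "2014-01-02", 0),
        (582, 743, "already read", "2014-01-04", 1),
    ],
)


def latest_unread_per_sender(conn, message_to):
    """Return one (MessageFrom, Message) row per sender: the newest unread one."""
    sql = """
        WITH PartitionedMessages AS (
            SELECT MessageFrom, Message,
                   ROW_NUMBER() OVER (PARTITION BY MessageFrom
                                      ORDER BY MessageDate DESC) AS RowNum
            FROM UserMessageConversations
            WHERE MessageTo = ? AND ReadFlag = 0
        )
        SELECT MessageFrom, Message
        FROM PartitionedMessages
        WHERE RowNum = 1
        ORDER BY MessageFrom
    """
    return conn.execute(sql, (message_to,)).fetchall()


rows = latest_unread_per_sender(conn, 743)
print(rows)
```

Sender 582 has two unread messages in the sample data, but only the one with the latest MessageDate survives the RowNum = 1 filter.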

If you think of them as the same rows, I assume you don't care about the Message field.
In this case you can use the DISTINCT clause:
SELECT DISTINCT
UserMessageConversations.MessageFrom, UserMessageConversations.MessageFromUserName,
UserMessages.MessageTo, UserMessageConversations.IsGroupChat,
UserMessageConversations.IsLocationChat,
UserMessages.UserGroupID,UserMessages.LocationID
FROM
UserMessageConversations
LEFT OUTER JOIN
UserMessages ON UserMessageConversations.UserMessageID = UserMessages.UserMessageID
WHERE
UserMessageConversations.MessageTo = 743
AND UserMessageConversations.ReadFlag = 0
In general, with the DISTINCT clause you get one row for every distinct combination of the selected columns.
If your requirement is instead to show a single field for all the messages (for example, every message folded into a single string with a separator between them), you can use an aggregate function, but in SQL Server that is not so easy.
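To make the two options concrete, here is a rough sketch in Python with the standard sqlite3 module. It is not SQL Server: SQLite's GROUP_CONCAT stands in for the folding aggregate, which SQL Server 2008 does not have built in (STRING_AGG only arrived in SQL Server 2017, and FOR XML PATH is the usual workaround before that). The Msgs table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, cut-down table holding only the columns needed here.
conn.execute("CREATE TABLE Msgs (MessageFrom INTEGER, MessageTo INTEGER, Message TEXT)")
conn.executemany(
    "INSERT INTO Msgs VALUES (?, ?, ?)",
    [(582, 743, "hi"), (582, 743, "are you there?"), (600, 743, "ping")],
)

# Option 1: DISTINCT works once the differing Message column is left out.
distinct_rows = conn.execute(
    "SELECT DISTINCT MessageFrom, MessageTo FROM Msgs ORDER BY MessageFrom"
).fetchall()

# Option 2: one row per sender, with all messages folded into one string.
# SQLite does not guarantee the order of the concatenated pieces.
folded = conn.execute(
    "SELECT MessageFrom, COUNT(*), GROUP_CONCAT(Message, ' | ') "
    "FROM Msgs GROUP BY MessageFrom ORDER BY MessageFrom"
).fetchall()

print(distinct_rows)
print(folded)
```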

Related

Include rows which don't match with IN() clause

I have a table called log which contains logs sent by several applications. This table has a varchar field called reference.
I have a table panel in Grafana in which I show how many logs we have grouped by reference values. So the user types one or multiple values in a text field on Grafana like 'ref1', 'ref2', 'ref3' and a query like this is fired:
SELECT reference, count(id)
FROM db.log
WHERE reference IN('ref1', 'ref2', 'ref3')
GROUP BY reference
So far so good, it works as intended. What I would like to do is show a row with count=0 when no log with a given reference exists. I know I could add arbitrary rows using UNION, but I don't think I can do that dynamically in Grafana.
Any ideas?
Use a query that returns all the values for which you want results and left join the table to aggregate:
select t.reference, count(l.id)
from (
select 'ref1' reference union all
select 'ref2' union all
select 'ref3'
) t left join db.log l
on l.reference = t.reference
group by t.reference
See a simplified demo.
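For readers without the demo at hand, the same pattern can be reproduced locally. This sketch uses Python's sqlite3 module and builds the UNION ALL list from a Python list; the helper name counts_for is hypothetical, and how such a statement would be generated inside Grafana is not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, reference TEXT)")
conn.executemany(
    "INSERT INTO log (reference) VALUES (?)", [("ref1",), ("ref1",), ("ref3",)]
)


def counts_for(conn, references):
    """Count log rows per requested reference, keeping zero-count references.

    Assumes `references` is a non-empty list of strings.
    """
    # One "SELECT ? AS reference" per requested value, glued with UNION ALL.
    wanted = " UNION ALL ".join(["SELECT ? AS reference"] * len(references))
    sql = f"""
        SELECT t.reference, COUNT(l.id)
        FROM ({wanted}) AS t
        LEFT JOIN log AS l ON l.reference = t.reference
        GROUP BY t.reference
        ORDER BY t.reference
    """
    return conn.execute(sql, references).fetchall()


result = counts_for(conn, ["ref1", "ref2", "ref3"])
print(result)
```

COUNT(l.id) counts only non-NULL ids, which is why a reference with no matching log row comes back as 0 rather than 1.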

How can I filter out results based on another table. (A reverse join I guess?)

Basically, I have a table which contains two fields: [id, other] which have user tokens stored in them. The goal of my query is to select a random user that has not been selected before. Once the user is selected it is stored in the table shown above. So if Jack selects Jim randomly, Jack cannot select Jim again, and on the flip side, Jim cannot select Jack.
Something like this is what comes to mind:
SELECT * FROM users
WHERE (SELECT * FROM selected WHERE (id=? AND other=?) OR (id=? AND other=?));
Well, first of all, I've read that using sub-queries like this is extremely inefficient, and I'm not even sure I used the correct syntax. The problem, however, is that I have numerous tables in my scenario which I need to filter by, so it would look more like this:
SELECT * FROM users u
WHERE (SELECT * FROM selected WHERE (id=? AND other=?) OR (id=? AND other=?))
AND (SELECT * FROM other_table WHERE (id=? AND other=?) OR (id=? AND other=?))
AND (SELECT * FROM diff_table WHERE (id=? AND value=?))
AND u.type = 'BASIC'
LIMIT = 1
I feel like there's a much, much more efficient way of handling this.
Please note: I don't want a row returned at all if the user's id is present in any of the nested queries. Returning "null" is not sufficient. The reason I have the OR clause is that the user's id can be stored in either the id or the other field, so we need to check both.
I am using PostgreSQL 9.5.3, but I added the MySQL tag as the code is mostly backwards compatible. Fancy PostgreSQL-only solutions are accepted (if any).
You can left join to another table, which produces nulls where no record is found:
Select u.* from users u
left join selected s on s.id = u.id or s.other = u.id
where s.id is null
Using OR in a join condition is unusual, but it should work. The example is kind of silly, but the logic is what matters: left join the first table to the second. Where the second table's column is not null, at least one record matched the join conditions; where it is null, no record was found.
And you are right: avoid the where field = (select statement) logic when you can, as it performs poorly.
Use an outer join filtered on missed joins:
SELECT * FROM users u
LEFT JOIN selected s on u.id in (s.id, s.other) and ? in (s.id, s.other)
WHERE u.id != ?
AND s.id IS NULL
LIMIT 1
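The outer join filtered on missed joins (often called an anti-join) can be checked on a small data set. This is a sketch only, using Python's sqlite3 module rather than PostgreSQL. The user names and the helper candidates are invented, and LIMIT 1 is left out so that every remaining candidate is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id TEXT PRIMARY KEY, type TEXT);
    CREATE TABLE selected (id TEXT, other TEXT);
    INSERT INTO users VALUES
        ('jack', 'BASIC'), ('jim', 'BASIC'), ('ann', 'BASIC'), ('bob', 'BASIC');
    -- jack picked jim, and ann picked jack
    INSERT INTO selected VALUES ('jack', 'jim'), ('ann', 'jack');
""")


def candidates(conn, me):
    """Users that `me` has not been paired with yet, in either direction."""
    sql = """
        SELECT u.id
        FROM users u
        LEFT JOIN selected s
               ON u.id IN (s.id, s.other) AND ? IN (s.id, s.other)
        WHERE u.id != ?
          AND s.id IS NULL      -- keep only users with no matching pair row
          AND u.type = 'BASIC'
        ORDER BY u.id
    """
    return [row[0] for row in conn.execute(sql, (me, me))]


print(candidates(conn, "jack"))
print(candidates(conn, "bob"))
```

Each additional exclusion table from the question would get its own LEFT JOIN plus IS NULL check in the same way.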

MySQL subquery returns more than one row selecting count with subquery

I'm executing this query:
SELECT COUNT(*) AS CONNECTIONS, SERVERS.NAME AS SERVER_NAME, USERS.USERNAME AS USER, CONECT_DATE AS CONNECTION_DATE
FROM CONECTIONS JOIN SERVERS ON(CONECTIONS.SERVERID = SERVERS.ID)
JOIN USERS ON (CONECTIONS.USERID = USERS.ID)
WHERE CONECTIONS.USERID = (SELECT ID FROM USERS
WHERE UPPER(USERNAME) = UPPER((SELECT USERNAME FROM USERS)))
AND CONECT_DATE BETWEEN '2010-06-11' AND '2019-06-11'
GROUP BY SERVERID, CONECT_DATE;
I'm trying to run this query for each user in the USERS table by using a subquery to select everyone, 'UPPER((SELECT USERNAME FROM USERS))', but it fails because the subquery returns more than one row. If I put a USERNAME such as 'ADMIN' in directly, it gives me the results.
You're doing username = (select ...). A subquery used in an equality test can return only ONE row of ONE field. Since you're doing an unfiltered "gimme everything in the users table", you're returning MANY rows of one field each. For that you need an IN match:
SELECT ...
WHERE ... UPPER(username) IN (SELECT UPPER(USERNAME) FROM users) -- note IN instead of =
SQL doesn't permit what is essentially
WHERE foo=1,2,3,4,5,6
which is why the IN operator works, for comparisons of sets of values.
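The difference between comparing against one value and against a set can be seen in a small sketch. It uses Python's sqlite3 module with an invented users table. One caveat: SQLite does not reproduce the MySQL error (1242, "Subquery returns more than 1 row"), because it silently takes the first row of a scalar subquery, so the = case below compares against a literal instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    INSERT INTO users (username) VALUES ('admin'), ('Bob'), ('carol');
""")

# IN compares each row against the whole set returned by the subquery.
with_in = conn.execute(
    "SELECT id FROM users "
    "WHERE UPPER(username) IN (SELECT UPPER(username) FROM users) "
    "ORDER BY id"
).fetchall()

# = compares against exactly one value, here a bound parameter.
with_eq = conn.execute(
    "SELECT id FROM users WHERE UPPER(username) = UPPER(?)", ("admin",)
).fetchall()

print(with_in)
print(with_eq)
```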
This is your where clause:
WHERE CONECTIONS.USERID = (SELECT ID
FROM USERS
WHERE UPPER(USERNAME) = UPPER((SELECT USERNAME FROM USERS))
)
This has at least three problems.
The subquery argument to UPPER() might return more than one value, and likely will, unless the USERS table has 0 or 1 rows.
The = comparison against UPPER() has a similar problem. An IN is more appropriate.
The use of = in the outer WHERE clause has a similar problem. An IN is more appropriate.
The following fixes these problems:
WHERE CONECTIONS.USERID IN (SELECT ID
FROM USERS
WHERE UPPER(USERNAME) IN (SELECT UPPER(USERNAME) FROM USERS)
)
However, I doubt that will fix your overall query. Your knowledge of SQL seems a bit limited. I would suggest that you ask another question, provide sample data, desired results, and explain what you want to do. That will make it easier for knowledgeable people on this site to point you in the right directions for writing your queries.

Join on 3 tables insanely slow on giant tables

I have a query which goes like this:
SELECT insanlyBigTable.description_short,
insanlyBigTable.id AS insanlyBigTable,
insanlyBigTable.type AS insanlyBigTableLol,
catalogpartner.id AS catalogpartner_id
FROM insanlyBigTable
INNER JOIN smallerTable ON smallerTable.id = insanlyBigTable.catalog_id
INNER JOIN smallerTable1 ON smallerTable1.catalog_id = smallerTable.id
AND smallerTable1.buyer_id = 'xxx'
WHERE smallerTable1.cont = 'Y' AND insanlyBigTable.type IN ('111','222','33')
GROUP BY smallerTable.id;
Now, when I run the query first time it copies the giant table into a temp table... I want to know how I can prevent that? I am considering a nested query, or even to reverse the join (not sure the effect would be to run faster), but that is well, not nice. Any other suggestions?
To figure out how to optimize your query, we first have to boil down exactly what it is selecting so that we can preserve that information while we change things around.
What your query does
So, it looks like we need the following
The GROUP BY clause limits the results to at most one row per catalog_id
smallerTable1.cont = 'Y', insanlyBigTable.type IN ('111','222','33'), and buyer_id = 'xxx' appear to be the filters on the query.
And we want data from insanlyBigTable and ... catalogpartner? I would guess that catalogpartner is smallerTable1, due to the id of smallerTable being linked to the catalog_id of the other tables.
I'm not sure what the purpose of putting the buyer_id filter in the ON clause was, but unless you tell me differently, I'll assume its placement there is unimportant.
The point of the query
I am unsure about the intent of the query, based on that GROUP BY clause. You will obtain just one row per catalog_id in insanlyBigTable, but you don't appear to care which row it is. Indeed, the fact that you can run this query at all is due to a non-standard feature in MySQL that lets you SELECT columns that do not appear in the GROUP BY clause. However, you don't get to choose WHICH row each of those columns is taken from. This means you could have information from 4 different rows for each of your selected items.
My best guess, based on column names, is that you are trying to bring back a list of items that are in the same catalog as something that was purchased by a given buyer, but without any more than one item per catalog. In addition, you want something to connect back to the purchased item in that catalog, via the catalogpartner table's id.
So, something probably akin to amazon's "You may like these items because you purchased these other items" feature.
The new query
We want 1 row per insanlyBigTable.catalog_id, based on which catalog_id exists in smallerTable1, after filtering.
SELECT
ibt.description_short,
ibt.id AS insanlyBigTable,
ibt.type AS insanlyBigTableLol,
(
SELECT st.id FROM smallerTable1 st
WHERE st.buyer_id = 'xxx'
AND st.cont = 'Y'
AND st.catalog_id = ibt.catalog_id
LIMIT 1
) AS catalogpartner_id
FROM insanlyBigTable ibt
WHERE ibt.id IN (
SELECT (
SELECT ibt.id AS ibt_id
FROM insanlyBigTable ibt
WHERE ibt.catalog_id = sti.catalog_id
LIMIT 1
) AS ibt_id
FROM (
SELECT DISTINCT(catalog_id) FROM smallerTable1 st
WHERE st.buyer_id = 'xxx'
AND st.cont = 'Y'
AND EXISTS (
SELECT * FROM insanlyBigTable ibt
WHERE ibt.type IN ('111','222','33')
AND ibt.catalog_id = st.catalog_id
)
) AS sti
)
This query should generate the same result as your original query, but it breaks things down into smaller queries to avoid the use (and abuse) of the GROUP BY clause on the insanlyBigTable.
Give it a try and let me know if you run into problems.

SQL select with inner join, sub select and limit

I've been working with this SQL problem for about 2 days now and suspect I'm very close to resolving the issue but just can't seem to find a solution that completely works.
What I'm attempting to do is a selective join on two tables called application_info and application_status that are used to store information about open access journal article funding requests.
application_info has general information about the applicant and uses an auto indexing field called Application_ID as a key field. application_status is used to track the ongoing information about the status of the application (received, under review, funded, denied, withdrawn, etc.) as well as status of the journal article (submitted, accepted, resubmitted, published or rejected) and contains both an Application_ID field and an auto indexing field called Status_ID along with a status text and status date field.
Because we want to keep a running log of application, article, and funding status changes we don't want to overwrite existing rows in the application_status with updated values, but instead want to only show the most recent status values. Because an application will eventually have more than one status change this creates a need to apply some sort of limit on the inner join of the status data to the application data so that only one row is returned for each application ID.
Here's an example of what I am attempting to do in a query that currently throws an error:
-- simplified example
SELECT
application_info.*,
artstatus.Status_ID AS Article_Status_ID,
artstatus.Application_ID AS Article_Application_ID,
artstatus.Status_State_Date AS Article_Status_State_Date,
artstatus.Status_State_Text AS Article_Status_State_Text
FROM application_info
LEFT JOIN (
SELECT
Status_ID,
Application_ID,
Status_State_Text,
Status_State_Date,
Status_State_InitiatedBy,
Status_State_ChangebBy,
Status_State_Notes
FROM application_status
WHERE Status_State_Text LIKE 'Article Status%'
AND Application_ID = application_info.Application_ID -- how to pass the current application_info.Application_ID from the ON clause to here?
-- and Application_ID = 29 -- this would be an option for specific IDs, but not an option for getting a complete list of application IDs with status
-- GROUP BY Application_ID -- reduces the sub query to 1 row (Yeah!) but returns the first row encountered before the ORDER BY comes into play
ORDER BY Status_ID DESC
-- a GROUP BY after the ORDER BY might resolve the issue if we could do a sort first
LIMIT 1 -- only want to get the first (most recent) row, only works correctly if passing an Application_ID
) AS artstatus
ON application_info.Application_ID = artstatus.Application_ID
-- WHERE application_info.Application_ID = 29 -- need to get all IDs with status values as well as for specific ID requests
;
Eliminating the AND Application_ID = application_info.Application_ID portion of the sub query along with the LIMIT causes the select to work, but returns a row for every status for a given application ID. I've tried using MIN/MAX operators but have noticed that, when they work, they return unpredictable rows from the application_status table.
I've also attempted to do sub selects in the ON section of the join, but don't know how to make that work because the end result would always need to return an Application_ID (can both Application_ID and Status_ID be returned and used?).
Any hints on how to get this to work as I'm intending? Can this even be done?
Further edit: working query below. The key was to move the sub query in the join one level deeper and then return just a single status ID.
-- simplified example (now working)
SELECT
application_info.*,
artstatus.Status_ID AS Article_Status_ID,
artstatus.Application_ID AS Article_Application_ID,
artstatus.Status_State_Date AS Article_Status_State_Date,
artstatus.Status_State_Text AS Article_Status_State_Text
FROM application_info
LEFT JOIN (
SELECT
Status_ID,
Application_ID,
Status_State_Text,
Status_State_Date,
Status_State_InitiatedBy,
Status_State_ChangebBy,
Status_State_Notes
FROM application_status AS artstatus_int
WHERE
-- sub query moved one level deeper so current join Application_ID can be passed
-- order by and limit can now be used
Status_ID = (
SELECT status_ID FROM application_status WHERE Application_ID = artstatus_int.Application_ID
AND status_State_Text LIKE 'Article Status%'
ORDER BY Status_ID DESC
LIMIT 1
)
ORDER BY Application_ID, Status_ID DESC
-- no need for GROUP BY or LIMIT here because only one row is returned per Application_ID
) AS artstatus
ON application_info.Application_ID = artstatus.Application_ID
-- WHERE application_info.Application_ID = 29 -- works for specific application ID as well
-- more LEFT JOINS follow
;
You can't have a correlated subquery in the from clause.
Try this idea instead:
select <whatever>
from (select a.*,
(select max(aps.status_id)
from application_status aps
where aps.application_id = a.application_id
) as maxstatusid
from application_info a
) a left outer join
application_status aps
on aps.status_id = a.maxstatusid
. . .
That is, put the correlated subquery in the select clause to get the most recent status. Then join this in to the status table to get other information. And, finish the query with other details.
You seem pretty adept at your SQL skills, so it doesn't seem necessary to rewrite the whole query for you.
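For reference, the idea of computing the latest status id in the select list and then joining back can be run end to end. This is a minimal sketch with Python's sqlite3 module, not MySQL. The tables are reduced to a few columns, the sample rows are invented, and the 'Article Status%' filter from the question is placed inside the correlated subquery as an assumption about the intent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE application_info (Application_ID INTEGER PRIMARY KEY, Applicant TEXT);
    CREATE TABLE application_status (
        Status_ID INTEGER PRIMARY KEY,
        Application_ID INTEGER,
        Status_State_Text TEXT
    );
    INSERT INTO application_info VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO application_status (Application_ID, Status_State_Text) VALUES
        (1, 'Article Status: submitted'),
        (1, 'Article Status: accepted'),
        (2, 'Article Status: submitted'),
        (1, 'Funding Status: funded');
""")

sql = """
    SELECT a.Application_ID, aps.Status_ID, aps.Status_State_Text
    FROM (
        SELECT ai.*,
               -- correlated subquery in the select list: newest article status id
               (SELECT MAX(s.Status_ID)
                FROM application_status s
                WHERE s.Application_ID = ai.Application_ID
                  AND s.Status_State_Text LIKE 'Article Status%') AS maxstatusid
        FROM application_info ai
    ) a
    LEFT OUTER JOIN application_status aps
        ON aps.Status_ID = a.maxstatusid
    ORDER BY a.Application_ID
"""
latest = conn.execute(sql).fetchall()
print(latest)
```

Application 3 has no status rows at all, so the left join keeps it with NULL status columns, which is the behavior the question asks for.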