access get top and last rows - ms-access

Let's say I have a table CData with the columns CName, Amount1, and Amount2.
I want a query that calculates the difference between Amount1 and Amount2 for each distinct CName and returns the roughly 1000 rows with the biggest difference and the roughly 1000 rows with the smallest (or most negative) difference. It doesn't matter whether the results come back in one table or two.
1) I am aware of TOP, so I could do this with two queries sorted by Difference (once ascending, once descending). Is there a way to do this in one query, though? This would save some time.
2) General question: when I define a field in my query (in this example "Difference"), can I use it elsewhere in the query, for example to sort the data by it? Like this (it doesn't work, but it gives you an idea of what I mean):
SELECT CData.CName, CData.Amount2-CData.Amount1 AS Difference
FROM CData
GROUP BY CData.CName
ORDER BY Difference
Or do I always have to do the following:
...
ORDER BY CData.Amount2-CData.Amount1
Not much of a difference in this example, I just wanted to know if that's possible in general.

Sort the first time ASC (Ascending) and the second time DESC (Descending)
SELECT TOP 1000
CData.CName,
CData.Amount2 - CData.Amount1 AS Difference
FROM CData
GROUP BY CData.CName
ORDER BY CData.Amount2 - CData.Amount1 ASC;
SELECT TOP 1000
CData.CName,
CData.Amount2 - CData.Amount1 AS Difference
FROM CData
GROUP BY CData.CName
ORDER BY CData.Amount2 - CData.Amount1 DESC

Which aggregate function do you want to apply to your differences? Avg? Sum?
SELECT CName, avg(Amount2-Amount1) AS Difference
FROM CData
GROUP BY CName
By the way, to do it in 'one' query, you could use a UNION query on two subqueries, one with the TOP 1000 ascending and one with the TOP 1000 descending.
It looks like Access does not allow an alias in the ORDER BY clause. If you build the query in the QBE grid and switch the view from the UI to SQL, you can see that it repeats the calculation in the ORDER BY clause.
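The UNION idea above can be tried out in a small sketch. This is not Access: it uses Python's built-in sqlite3 module as a stand-in, so TOP becomes LIMIT, and SQLite happens to accept the alias Difference in ORDER BY (which the thread reports Access does not). SUM is an assumed choice of aggregate, and the sample rows and the function name top_and_bottom are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: (CName, Amount1, Amount2)
sample = [("a", 10, 15), ("b", 20, 12), ("c", 5, 5), ("d", 1, 30), ("a", 0, 2)]

def top_and_bottom(rows, n):
    """Return the n smallest and n largest per-name differences in one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE CData (CName TEXT, Amount1 INTEGER, Amount2 INTEGER)")
    conn.executemany("INSERT INTO CData VALUES (?, ?, ?)", rows)
    query = """
        SELECT CName, Difference FROM (
            SELECT CName, SUM(Amount2 - Amount1) AS Difference
            FROM CData GROUP BY CName
            ORDER BY Difference ASC LIMIT :n      -- n most negative
        )
        UNION                                     -- drops rows present in both halves
        SELECT CName, Difference FROM (
            SELECT CName, SUM(Amount2 - Amount1) AS Difference
            FROM CData GROUP BY CName
            ORDER BY Difference DESC LIMIT :n     -- n largest
        )
        ORDER BY Difference
    """
    result = conn.execute(query, {"n": n}).fetchall()
    conn.close()
    return result

print(top_and_bottom(sample, 1))
```

In Access itself the two halves would use SELECT TOP n with the calculation repeated in ORDER BY; the structure of the query stays the same.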
Hi, John.
Check out the SO tour for instructions on how to use options such as formatting code.
Not sure if this will work for you, but you can try something like:
select * from
(SELECT TOP 3
CName, Date_Sale, Sum(Amount) AS SumA, 99999-Sum(Amount) as srt
FROM
Data
GROUP BY
CName, Date_Sale
UNION
SELECT TOP 3
CName, Date_Sale, Sum(Amount) AS SumA, Sum(Amount) as srt
FROM
Data
GROUP BY
CName, Date_Sale) u
order by
srt

Related

How to maintain the order of the parameters on the return [duplicate]

I'm selecting a set of account records from a large table (millions of rows) with integer id values. It is about as basic a query as one gets. What I'm doing is building a large comma-separated list and passing it into the query as an "in" clause. Right now the result is completely unordered. What I'd like to do is get the results back in the order of the values in the "in" clause.
I assume instead I'll have to build a temporary table and do a join instead, which I'd like to avoid, but may not be able to.
Thoughts? The size of the query right now is capped at about 60k each, as we're trying to limit the output size, but it could be arbitrarily large, which might rule out an "in" query anyway from a practical standpoint, if not a physical one.
Thanks in advance.
Actually, this is better:
SELECT * FROM your_table
WHERE id IN (5,2,6,8,12,1)
ORDER BY FIELD(id,5,2,6,8,12,1);
Here's the FIELD documentation:
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_field
A bit of a trick....
SELECT * FROM your_table
WHERE id IN (5,2,6,8,12,1)
ORDER BY FIND_IN_SET(id,'5,2,6,8,12,1') DESC;
Note that the list of IDs in FIND_IN_SET is a string, so it's quoted.
Also note that without DESC, the results are returned in REVERSE order to what the list specified.
If your query is 60K, that's a sign that you're doing it the wrong way.
There is no other way to order the result set than by using an ORDER BY clause. You could have a complicated CASE clause in your order by listing all the elements in your IN clause again, but then your query would probably be 120K.
I know you don't want to, but you should put the values in the IN clause in a table or a temporary table and join with it. You can also include a SortOrder column in the temporary table, and order by that. Databases like joins. Doing it this way will help your query to perform well.
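As a rough sketch of the temporary-table approach described above, the following uses Python's sqlite3 module rather than MySQL. The accounts table, the wanted temp table, the sort_order column, and the function name fetch_in_list_order are all hypothetical names chosen for illustration, and the sketch assumes the id list contains no duplicates.

```python
import sqlite3

def fetch_in_list_order(conn, ids):
    """Return (id, name) rows from accounts in the order given by ids."""
    # Temp table holding each wanted id together with its position in the list
    conn.execute("CREATE TEMP TABLE wanted (id INTEGER PRIMARY KEY, sort_order INTEGER)")
    conn.executemany(
        "INSERT INTO wanted (id, sort_order) VALUES (?, ?)",
        [(account_id, position) for position, account_id in enumerate(ids)],
    )
    # The join replaces the IN list; sort_order carries the caller's ordering
    rows = conn.execute(
        "SELECT a.id, a.name FROM accounts AS a "
        "JOIN wanted AS w ON w.id = a.id ORDER BY w.sort_order"
    ).fetchall()
    conn.execute("DROP TABLE wanted")
    return rows

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(i, f"acct{i}") for i in range(1, 13)])
ordered = fetch_in_list_order(conn, [5, 2, 6, 8, 12, 1])
print([row[0] for row in ordered])
```

The SQL text stays the same size no matter how many ids are passed, since the ids travel as bound parameters instead of being spliced into the statement.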
This is what I get on MySQL 8.0. It seems to be the opposite of the answers above.
Sort in the same order as the list specified:
SELECT * FROM your_table
WHERE id IN (5,2,6,8,12,1)
ORDER BY FIND_IN_SET(id,'5,2,6,8,12,1');
Sort in the reverse order of the list specified:
SELECT * FROM your_table
WHERE id IN (5,2,6,8,12,1)
ORDER BY FIND_IN_SET(id,'5,2,6,8,12,1') DESC;
Your first query surely uses an ORDER BY clause. So you could just do a join and use the same ORDER BY clause.
For example, if this was your first query
SELECT customer_id
FROM customer
WHERE customer_id BETWEEN 1 AND 100
ORDER
BY last_name
And this was your second query
SELECT inventory_id
FROM rental
WHERE customer_id in (...the ordered list...)
Combined would be
SELECT r.inventory_id
FROM rental r
INNER
JOIN customer c
ON r.customer_id = c.customer_id
WHERE c.customer_id BETWEEN 1 AND 100
ORDER
BY c.last_name
This is what worked for me
SELECT * FROM your_table
WHERE id IN ('5','2','6','8','12','1')
ORDER BY FIELD(id,'5','2','6','8','12','1');
I added the ids in quotes

Mysql DISTINCT with more than one column (remove duplicates)

My table is called training_session.
I am trying to print some information from my data, but I do not want any duplicates. I still get them somehow; can someone tell me what I am doing wrong?
SELECT DISTINCT athlete_id AND duration FROM training_session
SELECT DISTINCT athlete_id, duration FROM training_session
It works perfectly if I use only one column, but when I add another, it does not work.
I think you misunderstood the use of DISTINCT.
There is a big difference between using DISTINCT and GROUP BY.
Both have a similar goal, but they serve different purposes.
You use DISTINCT if you want to show a set of columns without repeating any combination of values. That means you don't care about calculations or aggregate functions. DISTINCT will show different results as you add more columns to your SELECT (if the table has many columns).
You use GROUP BY if you want one row per distinct value of certain selected columns and you use aggregate (group) functions to calculate data related to each group. Therefore, use GROUP BY if you want to use group functions.
See the group functions you can use at this link:
https://dev.mysql.com/doc/refman/8.0/en/group-by-functions.html
EDIT 1:
It seems like you are trying to get the "latest" duration for each athlete. I'll assume the current scenario, in which there is no ID column.
Here is my alternate solution:
SELECT a.athlete_id ,
( SELECT b.duration
FROM training_session as b
WHERE b.athlete_id = a.athlete_id -- connect
ORDER BY [latest column to sort] DESC
LIMIT 1
) last_duration
FROM training_session as a
GROUP BY a.athlete_id
ORDER BY a.athlete_id
This is a correlated scalar subquery in the SELECT list. With the help of LIMIT 1, it returns only the topmost record. A scalar subquery must return at most one row, or else MySQL raises an error.
MySQL's DISTINCT clause is used to filter out duplicate rows.
If your query was SELECT DISTINCT athlete_id FROM training_session then your output would be:
athlete_id
----------
1
2
3
4
5
6
As soon as you add another column to your query (in your example, the column called duration), DISTINCT applies to the combination of both columns, and each combination in your result is unique; hence the results you're getting. In other words, the query is working correctly.
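To make the point about column combinations concrete, here is a small sketch using Python's sqlite3 module in place of MySQL. The four sample rows are invented for illustration. It also runs the question's `athlete_id AND duration` variant, which is a single boolean expression rather than two columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE training_session (athlete_id INTEGER, duration INTEGER)")
# Hypothetical data: athlete 1 has one exact duplicate row and one different duration
conn.executemany(
    "INSERT INTO training_session VALUES (?, ?)",
    [(1, 30), (1, 30), (1, 45), (2, 30)],
)

# DISTINCT over one column: one row per athlete
one_column = conn.execute(
    "SELECT DISTINCT athlete_id FROM training_session ORDER BY athlete_id"
).fetchall()

# DISTINCT over two columns: one row per (athlete_id, duration) combination
two_columns = conn.execute(
    "SELECT DISTINCT athlete_id, duration FROM training_session "
    "ORDER BY athlete_id, duration"
).fetchall()

# "athlete_id AND duration" is a logical AND, so it collapses to a truth value
and_expression = conn.execute(
    "SELECT DISTINCT athlete_id AND duration FROM training_session"
).fetchall()

print(one_column, two_columns, and_expression)
```

Only the exact duplicate (1, 30) disappears from the two-column result; (1, 45) stays because the pair differs, even though athlete 1 already appears.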

SQL select distinct but "keep first"?

According to another SO post (SQL: How to keep rows order with DISTINCT?), distinct has pretty undefined behavior as far as sorting.
I have a query:
select col_1 from table order by col_2
This can return values like
3
5
3
2
I need to then select a distinct on these that preserves ordering, meaning I want
select distinct(col_1) from table order by col_2
to return
3
5
2
but not
5
3
2
Here is what I am actually trying to do. Col_1 is a user id, and col_2 is a log in timestamp event by that user. So the same user (col_1) can have many login times. I am trying to build a historical list of users in which they were seen in the system. I would like to be able to say "our first user ever was, our second user ever was", and so on.
That post seems to suggest using a GROUP BY, but GROUP BY is not meant to return an ordering of rows, so I do not see how or why it would be applicable here, since it does not appear that GROUP BY will preserve any ordering. In fact, another SO post gives an example where GROUP BY will destroy the ordering I am looking for: see "Peter" in what is the difference between GROUP BY and ORDER BY in sql. Is there any way to guarantee the latter result? The strange thing is, if I were implementing the DISTINCT clause, I would surely do the ORDER BY first, then take the results and do a linear scan of the list and preserve the ordering naturally, so I am not sure why the behavior is so undefined.
EDIT:
Thank you all! I have accepted IMSoP's answer because not only was there an interactive example that I could play around with (thanks for turning me on to SQL Fiddle), but they also explained why several things worked the way they worked, instead of simply "do this". Specifically, it was unclear to me that GROUP BY does not destroy the values in the other columns outside of the GROUP BY (rather, it keeps them in some sort of internal list), and that these values can still be examined in an ORDER BY clause.
This all has to do with the "logical ordering" of SQL statements. Although a DBMS might actually retrieve the data according to all sorts of clever strategies, it has to behave according to some predictable logic. As such, the different parts of an SQL query can be considered to be processed "before" or "after" one another in terms of how that logic behaves.
As it happens, the ORDER BY clause is the very last step in that logical sequence, so it can't change the behaviour of "earlier" steps.
If you use a GROUP BY, the rows have been bundled up into their groups by the time the SELECT clause is run, let alone the ORDER BY, so you can only look at columns which have been grouped by, or "aggregate" values calculated across all the values in a group. (MySQL implements a controversial extension to GROUP BY where you can mention a column in the SELECT that can't logically be there, and it will pick one from an arbitrary row in that group).
If you use a DISTINCT, it is logically processed after the SELECT, but the ORDER BY still comes afterwards. So only once the DISTINCT has thrown away the duplicates will the remaining results be put into a particular order - but the rows that have been thrown away can't be used to determine that order.
As for how to get the result you need, the key is to find a value to sort by which is valid after the GROUP BY/DISTINCT has (logically) been run. Remember that if you use a GROUP BY, any aggregated values are still valid - an aggregate function can look at all the values in a group. This includes MIN() and MAX(), which are ideal for ordering by, because "the lowest number" (MIN) is the same thing as "the first number if I sort them in ascending order", and vice versa for MAX.
So to order a set of distinct foo_number values based on the lowest applicable bar_number for each, you could use this:
SELECT foo_number
FROM some_table
GROUP BY foo_number
ORDER BY MIN(bar_number) ASC
Here's a live demo with some arbitrary data.
EDIT: In the comments, it was discussed why, if an ordering is applied before the grouping / de-duplication takes place, that order is not applied to the groups. If that were the case, you would still need a strategy for which row was kept in each group: the first, or the last.
As an analogy, picture the original set of rows as a set of playing cards picked from a deck, and then sorted by their face value, low to high. Now go through the sorted deck and deal them into a separate pile for each suit. Which card should "represent" each pile?
If you deal the cards face up, the cards showing at the end will be the ones with the highest face value (a "keep last" strategy); if you deal them face down and then flip each pile, you will reveal the lowest face value (a "keep first" strategy). Both are obeying the original order of the cards, and the instruction to "deal the cards based on suit" doesn't automatically tell the dealer (who represents the DBMS) which strategy was intended.
If the final piles of cards are the groups from a GROUP BY, then MIN() and MAX() represent picking up each pile and looking for the lowest or highest value, regardless of the order they are in. But because you can look inside the groups, you can do other things too, like adding up the total value of each pile (SUM) or how many cards there are (COUNT) etc, making GROUP BY much more powerful than an "ordered DISTINCT" could be.
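The "keep first" versus "keep last" choice described above can be checked with a minimal sketch, using Python's sqlite3 module and the values from the question (3, 5, 3, 2 in login order). The table name logins and the helper distinct_ordered are hypothetical; col_2 stands in for the login timestamp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (col_1 INTEGER, col_2 INTEGER)")
# col_1 = user id, col_2 = login time; ordered by col_2 the ids read 3, 5, 3, 2
conn.executemany("INSERT INTO logins VALUES (?, ?)", [(3, 1), (5, 2), (3, 3), (2, 4)])

def distinct_ordered(aggregate):
    """Distinct col_1 values ordered by MIN or MAX of col_2 within each group."""
    # The aggregate name is interpolated into the SQL, so accept only known names
    if aggregate not in ("MIN", "MAX"):
        raise ValueError(aggregate)
    sql = f"SELECT col_1 FROM logins GROUP BY col_1 ORDER BY {aggregate}(col_2)"
    return [row[0] for row in conn.execute(sql)]

keep_first = distinct_ordered("MIN")  # order by each user's first login
keep_last = distinct_ordered("MAX")   # order by each user's latest login
print(keep_first, keep_last)
```

MIN gives the "first seen" ordering the question asks for, and MAX gives the other ordering the question wanted to avoid, so the aggregate is what states the strategy explicitly.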
I would go for something like
select col1
from (
select col1,
rank () over(order by col2) pos
from table
)
group by col1
order by min(pos)
In the subquery I calculate the position, then in the main query I do a group by on col1, using the smallest position to order.
Here is the demo in SQLFiddle (this was Oracle; the MySql info was added later).
Edit for MySql:
select col1
from (
select col1 col1,
@curRank := @curRank + 1 AS pos
from table1, (select @curRank := 0) p
) sub
group by col1
order by min(pos)
And here the demo for MySql.
The GROUP BY in the referenced answer isn't attempting to perform an ordering... it is simply picking a single associated value for the column that we want to be distinct.
Like @bluefeet states, if you want a guaranteed ordering, you must use ORDER BY.
Why can't we specify a value in the ORDER BY that isn't included in the SELECT DISTINCT?
Consider the following values for col1 and col2:
create table yourTable (
col_1 int,
col_2 int
);
insert into yourTable (col_1, col_2) values (1, 1);
insert into yourTable (col_1, col_2) values (1, 3);
insert into yourTable (col_1, col_2) values (2, 2);
insert into yourTable (col_1, col_2) values (2, 4);
With this data, what should SELECT DISTINCT col_1 FROM yourTable ORDER BY col_2 return?
That's why you need the GROUP BY and the aggregate function, to decide which of the multiple values for col_2 you should order by... could be MIN(), could be MAX(), maybe even some other function such as AVG() would make sense in some cases; it all depends on the specific scenario, which is why you need to be explicit:
select col_1
from yourTable
group by col_1
order by min(col_2)
SQL Fiddle Here
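For MySQL only: when you select columns that are not in the GROUP BY, MySQL returns values from one row in the group. In practice this has often been the first record, but the MySQL documentation says the server is free to choose any value from the group, and with ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7.5) such a query is rejected. Where it is allowed, you can try to use this behavior to select which record is returned from each group, like this: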
SELECT foo_number, bar_number
FROM
(
SELECT foo_number, bar_number
FROM some_table
ORDER BY bar_number
) AS t
GROUP BY foo_number
ORDER BY bar_number DESC;
This is more flexible because it allows you to order the records within each group using expressions that are not possible with aggregates - in my case I wanted to return the one with the shortest string in another column.
For completeness, my query looks like this:
SELECT
s.NamespaceId,
s.Symbol,
s.EntityName
FROM
(
SELECT
m.NamespaceId,
i.Symbol,
i.EntityName
FROM ImportedSymbols i
JOIN ExchangeMappings m ON i.ExchangeMappingId = m.ExchangeMappingId
WHERE
i.Symbol NOT IN
(
SELECT Symbol
FROM tmp_EntityNames
WHERE NamespaceId = m.NamespaceId
)
AND
i.EntityName IS NOT NULL
ORDER BY LENGTH(i.RawSymbol), i.RawSymbol
) AS s
GROUP BY s.NamespaceId, s.Symbol;
What this does is return a distinct list of symbols in each namespace, and for duplicated symbols it returns the one with the shortest RawSymbol. When the RawSymbol lengths are the same, it returns the one whose RawSymbol comes first alphabetically.

SQL Server: Selecting DateTime and grouping by Date

This simple SQL problem is giving me a very hard time. Either because I'm seeing the problem the wrong way or because I'm not that familiar with SQL. Or both.
What I'm trying to do: I have a table with several columns and I only need two of them: the datetime when the entry was created and the id of the entry. Note that the hours/minutes/seconds part is important here.
However, I want to group my selection according to the DATE part only. Otherwise all groups will most likely have 1 element.
Here's my query:
SELECT MyDate as DateCr, COUNT(Id) as Occur
FROM MyTable tb WITH(NOLOCK)
GROUP BY CAST(tb.MyDate as Date)
ORDER BY DateCr ASC
However I get the following error from it:
Column "MyTable.MyDate" is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
If I don't do the cast in the GROUP BY, everything is fine. If I cast MyDate to DATE in the SELECT and keep the CAST in the GROUP BY, everything is fine once more. Apparently it wants the same DATE or DATETIME type in the GROUP BY as in the SELECT.
My approach can be completely wrong so I am not necessarily looking to fix the above query, but to find the proper way to do it.
LE: I get the above error on line 1.
LE2: On a second look, my question indeed is not very explicit. You can ignore the above approach if it is completely wrong. Below is a sample scenario
Let me tell you what I need: I want to retrieve (1) the DateTime when each entry was created. So if I have 20 entries, then I want to get 20 DateTimes. Then if I have multiple entries created on the same DAY, I want the number of those entries. For example, let's say I created 3 entries on Monday, 1 on Tuesday and 2 today. Then from my table I need the datetimes of these 6 entries + the number of entries which were created on each day (3 for 19/03/2012, 1 for 20/03/2012 and 2 for 21/03/2012).
I'm not sure why you're objecting to performing the CONVERT in both the SELECT and the GROUP BY. This seems like a perfectly logical way to do this:
SELECT
DateCr = CONVERT(DATE, MyDate),
Occur = COUNT(Id)
FROM dbo.MyTable
GROUP BY CONVERT(DATE, MyDate)
ORDER BY DateCr;
If you want to keep the time portion of MyDate in the SELECT list, why are you bothering to group? Or how do you expect the results to look? You'll have a row for every individual date/time value, where the grouping seems to indicate you want a row for each day. Maybe you could clarify what you want with some sample data and example desired results.
Also, why are you using NOLOCK? Are you willing to trade accuracy for a haphazard turbo button?
EDIT adding a version for the mixed requirements:
;WITH d(DateCr,d,Id) AS
(
SELECT MyDate, d = CONVERT(DATE, MyDate), Id
FROM dbo.MyTable)
SELECT DateCr, Occur = (SELECT COUNT(Id) FROM d AS d2 WHERE d2.d = d.d)
FROM d
ORDER BY DateCr;
Even though this is an old post, I thought I would answer it. The solution below will work with SQL Server 2008 and above. It uses the over clause, so that the individual lines will be returned, but will also count the rows grouped by the date (without time).
SELECT MyDate as DateCr,
COUNT(Id) OVER(PARTITION BY CAST(tb.MyDate as Date)) as Occur
FROM MyTable tb WITH(NOLOCK)
ORDER BY DateCr ASC
Darren White
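The OVER (PARTITION BY ...) idea can be sketched outside SQL Server with Python's sqlite3 module, assuming the bundled SQLite is version 3.25 or newer (the first release with window functions). SQLite's date() plays the role of CAST(... AS Date) here, and the timestamps are invented to match the 3 / 1 / 2 entries per day from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Id INTEGER PRIMARY KEY, MyDate TEXT)")
# Hypothetical entries: 3 on 2012-03-19, 1 on 2012-03-20, 2 on 2012-03-21
conn.executemany(
    "INSERT INTO MyTable (MyDate) VALUES (?)",
    [
        ("2012-03-19 08:15:00",),
        ("2012-03-19 12:30:00",),
        ("2012-03-19 17:45:00",),
        ("2012-03-20 09:00:00",),
        ("2012-03-21 10:00:00",),
        ("2012-03-21 11:30:00",),
    ],
)

# Every row keeps its full datetime; the window counts rows sharing the same day
rows = conn.execute(
    """
    SELECT MyDate AS DateCr,
           COUNT(Id) OVER (PARTITION BY date(MyDate)) AS Occur
    FROM MyTable
    ORDER BY DateCr
    """
).fetchall()

for date_created, occurrences in rows:
    print(date_created, occurrences)
```

Because no GROUP BY is involved, all six rows come back, each carrying the count for its day.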

How to perform COUNT() or COUNT(*)

I have a list of tags in a database.
Ex:
villan
hero
spiderman
superman
superman
I wanted to obtain a sorted list of the tag names in ascending order and the number of times the unique tag appeared in the database. I wrote this code:
Ex:
SELECT hashtag.tag_name
, COUNT( * ) AS number
FROM hashtag
GROUP BY hashtag.tag_name
ORDER BY hashtag.tag_name ASC
This yields the correct result:
hero , 1
spiderman , 1
superman , 2
villan , 1
How can I obtain the full COUNT of this entire list? The answer should be 4 in this case because there are naturally 4 rows. I can't seem to get a correct COUNT() without the statement failing.
Thanks so much for the help! :)
SELECT COUNT(DISTINCT `tag_name`) FROM `hashtag`
Use COUNT(DISTINCT hashtag.tag_name) -- it can't go in the same SELECT you have (except with a UNION of course), but in a SELECT of its own (or an appropriate UNION) it will give the result you want.
I am not sure about the query in MySQL, but this one works fine with Oracle.
SELECT hashtag.tag_name, count(*) FROM hashtag GROUP BY cube(hashtag.tag_name)
To do it exactly as you're describing (to obtain the full count of the resulting list), you'd want to take a count of the results, like:
SELECT COUNT(*) AS uniquetags
FROM (SELECT hashtag.tag_name, COUNT( * ) AS number
FROM hashtag GROUP BY hashtag.tag_name
ORDER BY hashtag.tag_name ASC) t
Of course the ORDER BY clause is unnecessary and gets swallowed by the outer aggregate COUNT, as does the inner COUNT.
Additionally, as a few people have pointed out, the shortcut to this is a COUNT DISTINCT, as in:
SELECT COUNT(DISTINCT hashtag.tag_name)
FROM hashtag
This may or may not use indexes more efficiently, depending on whether it realizes it doesn't have to count everything or not. Someone with more knowledge, please feel free to comment (or just try a couple EXPLAINs).
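Both routes to the total can be compared in a short sketch with Python's sqlite3 module (not MySQL), using the five tags from the question. The derived-table alias t is an arbitrary name, and the ORDER BY is left out of the subquery since it does not affect the count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hashtag (tag_name TEXT)")
conn.executemany(
    "INSERT INTO hashtag VALUES (?)",
    [("villan",), ("hero",), ("spiderman",), ("superman",), ("superman",)],
)

# Route 1: count the rows of the grouped result
count_of_groups = conn.execute(
    """
    SELECT COUNT(*) AS uniquetags
    FROM (SELECT tag_name, COUNT(*) AS number
          FROM hashtag
          GROUP BY tag_name) AS t
    """
).fetchone()[0]

# Route 2: the COUNT(DISTINCT ...) shortcut
count_distinct = conn.execute(
    "SELECT COUNT(DISTINCT tag_name) FROM hashtag"
).fetchone()[0]

print(count_of_groups, count_distinct)
```

One difference to keep in mind: COUNT(DISTINCT ...) ignores NULL tag names, while GROUP BY puts them in a group of their own, so the two counts can differ by one if the column contains NULLs.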