I have a list of tags in a database.
Ex:
villan
hero
spiderman
superman
superman
I wanted to obtain a sorted list of the tag names in ascending order and the number of times the unique tag appeared in the database. I wrote this code:
Ex:
SELECT hashtag.tag_name
, COUNT( * ) AS number
FROM hashtag
GROUP BY hashtag.tag_name
ORDER BY hashtag.tag_name ASC
This yields the correct result:
hero , 1
spiderman , 1
superman , 2
villan , 1
How can I obtain the full COUNT of this entire list? The answer should be 4 in this case because there are naturally 4 rows. I can't seem to get a correct COUNT() without the statement failing.
Thanks so much for the help! :)
SELECT COUNT(DISTINCT `tag_name`) FROM `hashtag`
Use COUNT(DISTINCT hashtag.tag_name) -- it can't go in the same SELECT you have (except with a UNION of course), but on a SELECT of its own (or an appropriate UNION) it will give the result you want.
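I am not sure about the query in MySQL, but this one works fine with Oracle. Note that the extra row CUBE adds holds the total number of rows (5 here), not the number of distinct tags.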
SELECT hashtag.tag_name, count(*) FROM hashtag GROUP BY cube(hashtag.tag_name)
To do it exactly as you're describing (to obtain the full count of the resulting list), you'd want to take a count of the results, like:
SELECT COUNT(*) AS uniquetags
FROM (SELECT hashtag.tag_name, COUNT( * ) AS number
FROM hashtag GROUP BY hashtag.tag_name
ORDER BY hashtag.tag_name ASC) AS t
Of course the ORDER BY clause is unnecessary and gets swallowed by the outer aggregate COUNT, as does the inner COUNT.
Additionally, as a few people have pointed out, the shortcut to this is a COUNT DISTINCT, as in:
SELECT COUNT(DISTINCT hashtag.tag_name)
FROM hashtag
This may or may not use indexes more efficiently, depending on whether it realizes it doesn't have to count everything or not. Someone with more knowledge, please feel free to comment (or just try a couple EXPLAINs).
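To compare the two approaches side by side, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs anywhere). The table and data mirror the question. The derived-table alias `t` is an illustrative name.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hashtag (tag_name TEXT)")
conn.executemany(
    "INSERT INTO hashtag (tag_name) VALUES (?)",
    [("villan",), ("hero",), ("spiderman",), ("superman",), ("superman",)],
)

# Variant 1: count the rows of the grouped result (the derived table gets an alias).
count_via_subquery = conn.execute(
    "SELECT COUNT(*) FROM (SELECT tag_name FROM hashtag GROUP BY tag_name) AS t"
).fetchone()[0]

# Variant 2: the COUNT(DISTINCT ...) shortcut.
count_via_distinct = conn.execute(
    "SELECT COUNT(DISTINCT tag_name) FROM hashtag"
).fetchone()[0]

print(count_via_subquery, count_via_distinct)  # prints "4 4"
```

Both variants return the same number. Which one the optimizer handles better on a real MySQL table is something only EXPLAIN on your own data can tell you.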
I am wondering if a single SELECT can return counts of distinct items for several columns at once. So far it works fine when I do only one count per query, like:
SELECT provincia,count(provincia) as total_provincia
FROM museos
group by provincia ORDER BY provincia
[Image: result of the SELECT with a single count]
Then if I try another field, it also looks the way I need:
SELECT localidad,count(localidad) as total_localidad
FROM museos
group by localidad ORDER BY localidad
[Image: result of the second SELECT query]
But when I try to get both columns with their counts in one SELECT query, which I'd expect to look something like this:
SELECT provincia,localidad,count(localidad) as total_localidad,
count(provincia) as total_provincia
FROM museos
group by provincia,localidad ORDER BY provincia,localidad
Then I got:
[Image: result of the SELECT with both counts]
I've been looking all over Stack Overflow and other sites with examples, but unfortunately I couldn't find anything similar to what I'm trying to do, and I really have no explanation for why only one of the counts seems to work, and always the one for the most granular data, in this case "localidad". Meanwhile the count of "provincia" is ignored, and its total column is just a copy of the values of the first one, as we can see. So, is it possible to write a SELECT query that returns two or more counts made on different columns, in order to get this kind of result:
[Image: expected result of the SELECT query]
I mean, in the end the required result table is organized as a tree, where data like "provincia" is the root and its finer-grained data are the leaves. It's a bit odd to build a query this way, but I don't think it's impossible. I'd be grateful for any help or comments.
You can use over(partition by):
select distinct
provincia, count(provincia) over(partition by provincia) 'count_provincia'
,localidad, count(localidad) over(partition by localidad) 'count_localidad'
from museos
order by provincia;
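Here is a small sketch of why the grouped query returns identical counts and what the window version changes. It uses Python's sqlite3 module (window functions need SQLite 3.25 or newer) instead of the asker's database, and the rows ("A", "x", and so on) are made-up sample data, not the real museos table.

```python
import sqlite3

# Made-up sample data; only the column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE museos (provincia TEXT, localidad TEXT)")
conn.executemany(
    "INSERT INTO museos VALUES (?, ?)",
    [("A", "x"), ("A", "x"), ("A", "y"), ("B", "z")],
)

# GROUP BY provincia, localidad: every group is one (provincia, localidad)
# pair, so both COUNTs count the same rows and always come out equal.
grouped = conn.execute("""
    SELECT provincia, localidad,
           COUNT(localidad) AS total_localidad,
           COUNT(provincia) AS total_provincia
    FROM museos
    GROUP BY provincia, localidad
    ORDER BY provincia, localidad
""").fetchall()

# Window functions: each COUNT gets its own partition, so the two totals
# are computed independently; DISTINCT collapses the repeated rows.
windowed = conn.execute("""
    SELECT DISTINCT
           provincia, COUNT(provincia) OVER (PARTITION BY provincia) AS count_provincia,
           localidad, COUNT(localidad) OVER (PARTITION BY localidad) AS count_localidad
    FROM museos
    ORDER BY provincia, localidad
""").fetchall()

print(grouped)   # [('A', 'x', 2, 2), ('A', 'y', 1, 1), ('B', 'z', 1, 1)]
print(windowed)  # [('A', 3, 'x', 2), ('A', 3, 'y', 1), ('B', 1, 'z', 1)]
```

One thing to watch: PARTITION BY localidad alone would merge two localities that share a name in different provinces. If that can happen in your data, partitioning by provincia, localidad is probably safer.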
My table is called training_session.
I'm trying to print some information from my data, but I don't want any duplicates. I still get them somehow; can someone tell me what I'm doing wrong?
SELECT DISTINCT athlete_id AND duration FROM training_session
SELECT DISTINCT athlete_id, duration FROM training_session
It works perfectly if I use only one column, but when I add another, it does not work.
I think you misunderstood the use of DISTINCT.
There is a big difference between using DISTINCT and GROUP BY.
Both have a similar goal, but they serve different purposes.
You use DISTINCT if you want to show a set of columns without ever repeating a row. That means you don't care about calculations or aggregate (group) functions. DISTINCT will show different results as you keep adding more columns to your SELECT, because it compares the whole combination of selected columns.
You use GROUP BY if you want distinct values for certain selected columns and you use group functions to calculate data related to them. So use GROUP BY whenever you want to use group functions.
Please check the group functions you can use at this link.
https://dev.mysql.com/doc/refman/8.0/en/group-by-functions.html
EDIT 1:
It seems like you are trying to get the "latest" duration of each athlete. I'll assume that scenario, since there is no ID.
Here is my alternate solution:
SELECT a.athlete_id ,
( SELECT b.duration
FROM training_session as b
WHERE b.athlete_id = a.athlete_id -- connect
ORDER BY [latest column to sort] DESC
LIMIT 1
) last_duration
FROM training_session as a
GROUP BY a.athlete_id
ORDER BY a.athlete_id
This is a correlated subquery in the SELECT list. With the help of LIMIT 1, it returns only the topmost record. A subquery used this way must return at most one row, or else the query raises an error.
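A runnable sketch of that subquery, using Python's sqlite3 module rather than MySQL. The `session_date` column is hypothetical: it stands in for the `[latest column to sort]` placeholder above, which the question never names. The rows are made up as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# session_date is a hypothetical column used only to define "latest".
conn.execute(
    "CREATE TABLE training_session "
    "(athlete_id INTEGER, duration INTEGER, session_date TEXT)"
)
conn.executemany(
    "INSERT INTO training_session VALUES (?, ?, ?)",
    [
        (1, 30, "2024-01-01"),
        (1, 45, "2024-01-05"),
        (2, 60, "2024-01-03"),
        (2, 20, "2024-01-02"),
    ],
)

# For each athlete, the subquery picks the duration of the newest session.
latest = conn.execute("""
    SELECT a.athlete_id,
           (SELECT b.duration
            FROM training_session AS b
            WHERE b.athlete_id = a.athlete_id   -- correlate with the outer row
            ORDER BY b.session_date DESC
            LIMIT 1) AS last_duration
    FROM training_session AS a
    GROUP BY a.athlete_id
    ORDER BY a.athlete_id
""").fetchall()

print(latest)  # [(1, 45), (2, 60)]
```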
MySQL's DISTINCT clause is used to filter out duplicate rows from the result set.
If your query was SELECT DISTINCT athlete_id FROM training_session then your output would be:
athlete_id
----------
1
2
3
4
5
6
As soon as you add another column to your query (in your example, the column called duration), DISTINCT applies to the combination of both columns. Each athlete_id and duration pair in your result is unique, hence the results you're getting. In other words, the query is working correctly.
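To see the difference concretely, here is a small sketch with Python's sqlite3 module standing in for MySQL and a few made-up rows (the real table contents are not shown in the question).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE training_session (athlete_id INTEGER, duration INTEGER)")
# Made-up rows: athlete 1 has the duration 30 recorded twice.
conn.executemany(
    "INSERT INTO training_session VALUES (?, ?)",
    [(1, 30), (1, 30), (1, 45), (2, 30)],
)

# DISTINCT over one column: each athlete appears once.
one_column = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT athlete_id FROM training_session ORDER BY athlete_id"
    )
]

# DISTINCT over two columns: only exact duplicate pairs are removed,
# so athlete 1 still appears once per distinct duration.
two_columns = conn.execute(
    "SELECT DISTINCT athlete_id, duration FROM training_session "
    "ORDER BY athlete_id, duration"
).fetchall()

print(one_column)   # [1, 2]
print(two_columns)  # [(1, 30), (1, 45), (2, 30)]
```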
I searched for an answer here and didn't find one close to my question.
I have the following situation: I need to display a person first and then show the rest in ascending order. All the people from the same table. I tried UNION but after that, the SQL seems to mix everything again.
I have tried this:
select name from people where name = 'John'
UNION
select name from people order by name
I used UNION since it does not return duplicated values. But in the end, it mixed up the results and did not show them in the correct order, which should be:
John
Ana
Bruce
What am I doing wrong?
You need to use order by to get what you want. In MySQL, this is pretty easy:
select name
from people
order by (name = 'John') desc, name
Result sets (like tables) represent unordered sets in SQL. The only way to impose an order is to use order by. The order by at the end of a union/union all query applies to the entire query.
As an aside, your code would come close to working if you used union all -- which is much preferred over union. The union operation does additional work to remove duplicates. In this case, that reorders the results, a convenient reminder that you can only depend on the order of results when you use order by.
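The boolean sort key can be tried out quickly with Python's sqlite3 module. SQLite, like MySQL, evaluates `name = 'John'` to 1 or 0, so this is a reasonable stand-in, though it is not the asker's actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany("INSERT INTO people VALUES (?)", [("Bruce",), ("John",), ("Ana",)])

# (name = 'John') is 1 for John and 0 for everyone else, so sorting it
# descending puts John first; the second key sorts the rest by name.
ordered = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM people ORDER BY (name = 'John') DESC, name"
    )
]

print(ordered)  # ['John', 'Ana', 'Bruce']
```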
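You can also use UNION ALL in a derived table:
SELECT name
FROM
(
SELECT 1 AS Row_Id, name
FROM people
WHERE name = 'John'
UNION ALL
SELECT 2 AS Row_Id, name
FROM people
WHERE name <> 'John' -- otherwise John would be listed twice
) t
ORDER BY Row_Id, name -- John first, then the rest alphabetically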
Let's say I have a table CData with the columns CName, Amount1, Amount2.
Now I want to use a query to calculate the difference between Amount1 and Amount2 for each distinct CName and, as a result of the query, get the ~1000 rows with the biggest difference and the ~1000 rows with the smallest (or most negative) difference. It doesn't matter if the results come in one table or two.
1) I am aware of the function TOP and so I could do this with two queries and sort by Difference (once ascending, once descending). Is there a way to do this in one query, though? This would save some time.
2) General question: When I define a field in my query (in this example "Difference"), can I somehow use it to, for example, sort the data by it? Like this (well, it's not working, but to give you an idea of what I mean):
SELECT CData.CName, CData.Amount2-CData.Amount1 AS Difference
FROM CData
GROUP BY CData.CName
ORDER BY Difference
Or do I always have to do the following:
...
ORDER BY CData.Amount2-CData.Amount1
Not much of a difference in this example, I just wanted to know if that's possible in general.
Sort the first time ASC (Ascending) and the second time DESC (Descending)
SELECT TOP 1000
CData.CName,
CData.Amount2 - CData.Amount1 AS Difference
FROM CData
GROUP BY CData.CName
ORDER BY CData.Amount2 - CData.Amount1 ASC
SELECT TOP 1000
CData.CName,
CData.Amount2 - CData.Amount1 AS Difference
FROM CData
GROUP BY CData.CName
ORDER BY CData.Amount2 - CData.Amount1 DESC
Which aggregate function do you want to perform on your differences? Avg? Sum?
SELECT CName, avg(Amount2-Amount1) AS Difference
FROM CData
GROUP BY CName
By the way, to do it in 'one' query, you could use a union query on two subqueries, one with the TOP 1000 ASC, one with the TOP 1000 DESC.
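A rough sketch of the union-of-two-subqueries idea, run through Python's sqlite3 module. Several things here are assumptions: SQLite has no TOP, so LIMIT takes its place; SUM is picked as the aggregate purely for illustration; the `side` label column and the sample rows are made up; and the limit is 2 instead of 1000 to keep the data small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CData (CName TEXT, Amount1 INTEGER, Amount2 INTEGER)")
# Made-up rows; 'a' appears twice, so its differences are summed (40 + 10).
conn.executemany(
    "INSERT INTO CData VALUES (?, ?, ?)",
    [("a", 10, 50), ("a", 0, 10), ("b", 20, 25),
     ("c", 30, 10), ("d", 5, 5), ("e", 0, 100)],
)

# Each inner subquery takes one end of the ranking; UNION ALL glues them
# together, and the outer ORDER BY (on aliases) makes the order explicit.
rows = conn.execute("""
    SELECT side, CName, Difference
    FROM (
        SELECT * FROM (
            SELECT 'largest' AS side, CName,
                   SUM(Amount2 - Amount1) AS Difference
            FROM CData GROUP BY CName
            ORDER BY Difference DESC LIMIT 2
        )
        UNION ALL
        SELECT * FROM (
            SELECT 'smallest' AS side, CName,
                   SUM(Amount2 - Amount1) AS Difference
            FROM CData GROUP BY CName
            ORDER BY Difference ASC LIMIT 2
        )
    )
    ORDER BY side, Difference DESC
""").fetchall()

print(rows)
# [('largest', 'e', 100), ('largest', 'a', 50),
#  ('smallest', 'd', 0), ('smallest', 'c', -20)]
```

In Access itself you would write TOP instead of LIMIT and, as noted elsewhere in this thread, probably repeat the expression in ORDER BY rather than use the alias.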
It looks like Access does not allow you to use an alias in the ORDER BY clause. If you use the QBE grid and switch the view from the UI to SQL, you'll see that it repeats the calculation in the ORDER BY clause.
Hi, John.
Check out the SO tour for instructions on how to use options such as formatting code.
Not sure if this will work for you, but you can try something like:
select * from
(SELECT TOP 3
CName, Date_Sale, Sum(Amount) AS SumA, 99999-Sum(Amount) as srt
FROM
Data
GROUP BY
CName, Date_Sale
UNION
SELECT TOP 3
CName, Date_Sale, Sum(Amount) AS SumA, Sum(Amount) as srt
FROM
Data
GROUP BY
CName, Date_Sale) u
order by
srt
a noob question here!
I wrote this query, but the "group by" is very stupid...
so, how can I correct this?
SELECT
COUNT(*) AS total,
'x' as test
FROM
contents
WHERE name LIKE 'C%'
GROUP BY
test
ORDER BY id ASC
different solutions and info about performances are welcome ( maybe using DISTINCT? )
thanks in advance!
This should perform as well as any other option -
SELECT
LEFT(name, 1) AS first_letter,
COUNT(*) AS total
FROM contents
GROUP BY first_letter
If you want to run this query for a single letter at a time you can add the WHERE clause and drop the GROUP BY -
SELECT COUNT(*) AS total
FROM contents
WHERE name LIKE 'a%'
Let's dissect your query:
SELECT
COUNT(*) AS total,
'x' as test -- Why?
FROM -- Bad formatting.
contents
WHERE name LIKE 'C%'
GROUP BY
test -- Removing 'x' and the whole GROUP BY has the same effect.
ORDER BY id ASC -- The result only contains one row - nothing to sort.
So the query that returns one row with one field, containing the number of rows whose name begins with 'C' would look like this:
SELECT COUNT(*)
FROM contents
WHERE name LIKE 'C%'
Having an index whose leading edge is name would ensure good performance. To understand why, take a look at the Anatomy of an SQL Index.
This should give you the counts for every first letter, in case you want them all:
SELECT
COUNT(*) AS total,
test
FROM
(SELECT substring(name,1,1) as test
from contents) t
GROUP BY test
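For anyone who wants to try the per-letter count and the single-letter count side by side, here is a minimal sketch using Python's sqlite3 module. SQLite is a stand-in here, so `substr` replaces MySQL's `LEFT`/`substring`, and the names in the table are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contents (id INTEGER PRIMARY KEY, name TEXT)")
# Made-up names; only the table and column names come from the question.
conn.executemany(
    "INSERT INTO contents (name) VALUES (?)",
    [("Cat",), ("Car",), ("Dog",), ("apple",), ("Cow",)],
)

# One row per first letter, with the number of names starting with it.
per_letter = conn.execute("""
    SELECT substr(name, 1, 1) AS first_letter, COUNT(*) AS total
    FROM contents
    GROUP BY first_letter
    ORDER BY first_letter
""").fetchall()

# Count for a single letter: no GROUP BY or ORDER BY needed.
total_c = conn.execute(
    "SELECT COUNT(*) FROM contents WHERE name LIKE 'C%'"
).fetchone()[0]

print(per_letter)  # [('C', 3), ('D', 1), ('a', 1)]
print(total_c)     # 3
```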