I have a SQL query whose results I need to order by a field. Typically a query would look something like this in Hibernate:
select * from model order by model.field1, model.field2
The issue is that the first sort needs to follow a very specific order:
the values in field1 have to be sorted in a SPECIFIC sequence.
So instead of listing cars in alphabetical order, for example, I would prefer to sort them as, say,
Lexus, Ford, Toyota, Mazda, Mercedes, etc. Is there a clean way to do this? Currently I get the result and then have to split it into separate lists, which is just not very clean. I do not have the option of modifying anything in the database.
Thanks.
You can create a custom ORDER BY using a CASE expression.
The CASE expression checks each condition and assigns rows that meet it a lower value than rows that do not, so the matching rows sort first.
It's probably easiest to understand given an example:
SELECT mycolumn
FROM model
ORDER BY CASE WHEN model.field1 = 'Lexus' THEN 0
              WHEN model.field1 = 'Ford' THEN 1
              WHEN model.field1 = 'Toyota' THEN 2
              WHEN model.field1 = 'Mazda' THEN 3
              WHEN model.field1 = 'Mercedes' THEN 4
              ELSE 5 -- unlisted values sort last instead of falling back to NULL
         END, model.field2;
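For anyone who wants to try this without touching a real database, here is a minimal sketch using Python's built-in sqlite3 module. The rows (including the unlisted 'Audi') are made up for illustration, and the ELSE branch is there so that unlisted makes land at the end regardless of how a given database orders NULLs.

```python
import sqlite3

# In-memory database with made-up rows, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model (field1 TEXT, field2 TEXT)")
conn.executemany(
    "INSERT INTO model VALUES (?, ?)",
    [
        ("Toyota", "b"), ("Ford", "a"), ("Audi", "z"), ("Lexus", "c"),
        ("Mercedes", "d"), ("Mazda", "e"), ("Ford", "b"),
    ],
)

# Map each make to its position in the desired order, then break ties
# on field2. Anything not listed gets 5 and therefore sorts last.
query = """
SELECT field1, field2
FROM model
ORDER BY CASE field1
             WHEN 'Lexus'    THEN 0
             WHEN 'Ford'     THEN 1
             WHEN 'Toyota'   THEN 2
             WHEN 'Mazda'    THEN 3
             WHEN 'Mercedes' THEN 4
             ELSE 5
         END,
         field2
"""
rows = conn.execute(query).fetchall()
custom_order = [make for make, _ in rows]
print(custom_order)
```

The same ORDER BY clause should carry over to a Hibernate native query, since it only uses standard SQL.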
order by (case
when field1 = 'Lexus' then 1
when field1 = 'Ford' then 2
when field1 = 'Toyota' then 3
...
else null end)
You need to use window functions and the PARTITION BY clause like so:
SELECT ROW_NUMBER() OVER(PARTITION BY manufacturer ORDER BY field1) [RN],
manufacturer,
field1,
field2
FROM model
This will number the rows within each group defined by the PARTITION BY clause, restarting at 1 for every group and ordering by field1 in this example.
EDIT:
If your RDBMS does not support window functions, you can order by manufacturer and then by a second value. This would 'group' all the rows for each manufacturer together:
SELECT manufacturer, field1
FROM model
ORDER BY manufacturer, field1
Related
I have a DataFrame with millions of records and 8 columns.
I want to group it by col1 and col2 and select name_id, max(SUM), col1, col2.
The problem is that name_id is neither in the GROUP BY clause nor wrapped in an aggregate function.
Can you please suggest a method that solves my problem in SQL or PySpark?
Input DataFrame (here SUM = the number of columns that have data, and name_id is unique):
Required output: name_id (as it is), max(SUM), col1, col2
I tried something like this but it's not working:
Any suggestion is welcome!
I tried the code below, which works fine in one scenario but not in others.
Working scenario: when the SUM column has duplicate maximum values, it works fine and returns the max name_id, which is my requirement.
When the SUM column has no duplicate maximum value, it returns null. In the table below, according to the logic, my output should contain name_id = 48981 and name_id = 52214, but I am only getting name_id = 52214.
This is a classic greatest-n-per-group problem. I would suggest using the following solution:
select d.*
from data_frame d
join (
    select col_1, col_2,
           max(sum) max_sum,
           max(name_id) max_name_id
    from data_frame
    group by col_1, col_2
) t on d.col_1 = t.col_1 and
       d.col_2 = t.col_2 and
       d.name_id = t.max_name_id and
       d.sum = t.max_sum
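One caveat with joining on max(sum) and max(name_id) computed independently: if the row with the largest name_id in a group is not also the row with the largest sum, no row satisfies both conditions and that group drops out of the result. A possible variation, sketched below with Python's built-in sqlite3, first finds the maximum per group and then takes the largest name_id among the rows that reach it. The rows are invented, and the column is called total here (an assumption, standing in for the question's SUM column) to avoid any clash with the SUM() function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_frame (name_id INTEGER, total INTEGER, col_1 TEXT, col_2 TEXT)"
)
conn.executemany(
    "INSERT INTO data_frame VALUES (?, ?, ?, ?)",
    [
        (48981, 5, "A", "X"),  # max total, but NOT the max name_id in its group
        (50000, 3, "A", "X"),
        (52214, 4, "B", "Y"),  # tied max total, largest name_id wins
        (52000, 4, "B", "Y"),
    ],
)

# Step 1 (subquery): maximum total per (col_1, col_2) group.
# Step 2 (outer query): among the rows that reach that maximum,
# keep the largest name_id.
query = """
SELECT MAX(d.name_id) AS name_id, m.max_total, d.col_1, d.col_2
FROM data_frame d
JOIN (
    SELECT col_1, col_2, MAX(total) AS max_total
    FROM data_frame
    GROUP BY col_1, col_2
) m ON d.col_1 = m.col_1
   AND d.col_2 = m.col_2
   AND d.total = m.max_total
GROUP BY d.col_1, d.col_2, m.max_total
ORDER BY d.col_1, d.col_2
"""
best_rows = conn.execute(query).fetchall()
print(best_rows)
```

With this shape, a group keeps exactly one row whether or not its maximum is duplicated.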
You seem to want:
select max(name_id), max(sum), col1, col2, max(col3), . . .
from t
group by col1, col2;
Your last column doesn't seem to be using max(), but you have not explained that logic.
I am trying to concatenate 2 columns, then count the number of rows i.e. the total number of times the merged column string exists, but I don't know if it is possible. e.g:
SELECT
CONCAT(column_1,':',column_2 ) as merged_columns,
COUNT(merged_columns)
FROM
table
GROUP BY 1
ORDER BY merged_columns DESC
Note: the colon I've inserted as a part of the string, so my result is something like 12:3. The 'count' then should tell me the number of rows that exist where column_1 =12 and column_2 = 3.
Obviously, it tells me 'merged_columns' isn't a column as it's just an alias for my CONCAT. But is this possible and if so, how?
Old question I know, but the following should work without a temp table (unless I am missing something):
SELECT
CONCAT(column_1,':',column_2 ) as merged_columns,
COUNT(CONCAT(column_1,':',column_2 ))
FROM
table
GROUP BY 1
ORDER BY merged_columns DESC
You can try creating a temp table from your concatenation select and then query that:
SELECT CONCAT(column_1,':',column_2 ) AS mergedColumns
INTO #temp
FROM table
SELECT COUNT(1) AS NumberOfRows,
mergedColumns
FROM #temp
GROUP BY mergedColumns
Hope this answer is what you are looking for.
Try this
SELECT
CONCAT(column_1,column_2 ) as merged_columns,
COUNT(*)
FROM
table
GROUP BY merged_columns
ORDER BY merged_columns DESC
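If it helps, here is a small runnable sketch of the same idea using Python's built-in sqlite3. A few assumptions: the table is called pairs (TABLE itself is a reserved word), the rows are made up, and SQLite's || operator stands in for CONCAT, which is not available in every SQLite version. Grouping on the two source columns avoids having to repeat the concatenation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (column_1 INTEGER, column_2 INTEGER)")
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?)",
    [(12, 3), (12, 3), (12, 4), (7, 3), (12, 3)],
)

# Build the merged string in the SELECT list and count rows per
# (column_1, column_2) combination. The ORDER BY compares the merged
# values as text, so '7:3' sorts after '12:4' in descending order.
query = """
SELECT column_1 || ':' || column_2 AS merged_columns,
       COUNT(*) AS row_count
FROM pairs
GROUP BY column_1, column_2
ORDER BY merged_columns DESC
"""
counts = conn.execute(query).fetchall()
print(counts)
```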
size    color  in_stock
------  -----  --------
small   red    0
large   red    1
xlarge  red    1
When I'm using GROUP BY size,color, the first row with in_stock 0 is being chosen over the second row. Is there any way to have GROUP BY always give priority for rows with in_stock 1, rather than in_stock 0?
The short answer is: No
I suspect (because you don't supply your original query) that you are using something like this:
SELECT size,color, in_stock
FROM atable
GROUP BY size,color
MySQL allows a GROUP BY query to select non-aggregated columns that are not listed in the GROUP BY clause, BUT it only does so by virtue of a server setting (the ONLY_FULL_GROUP_BY SQL mode must be disabled).
see: http://dev.mysql.com/doc/refman/5.0/en/group-by-handling.html
If you use this "feature", you have no control over which row's values are chosen for those non-aggregated columns.
You should NOT rely on this "feature" of MySQL, because if the server settings turn off this extension your queries will no longer work.
You could do something like this instead:
SELECT size,color, MIN(case when in_stock = 1 then in_stock else NULL end)
FROM atable
GROUP BY size,color
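A slightly different way to get the same effect is to aggregate with MAX(in_stock), which yields 1 as soon as any row in the group is in stock and 0 otherwise (instead of NULL). This is a swapped-in alternative to the MIN(CASE ...) query above, not the same query; the sketch runs it with Python's built-in sqlite3 on invented rows that contain a duplicated size/color pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE atable (size TEXT, color TEXT, in_stock INTEGER)")
conn.executemany(
    "INSERT INTO atable VALUES (?, ?, ?)",
    [
        ("small", "red", 0),
        ("small", "red", 1),   # same group as the row above, but in stock
        ("large", "red", 1),
        ("xlarge", "red", 0),
    ],
)

# Every selected column is either grouped or aggregated, so the result
# does not depend on which row the server happens to pick.
query = """
SELECT size, color, MAX(in_stock) AS in_stock
FROM atable
GROUP BY size, color
ORDER BY size
"""
stock = conn.execute(query).fetchall()
print(stock)
```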
You can use syntax as per below
select size, color, in_stock, ....other fields...
from yourtable
where ...conditions if any....
group by size,color
order by in_stock desc;
ORDER BY is always applied after GROUP BY, so if you want to order first and then group, you can use the query below:
select size, color, in_stock
from
(
    select size, color, in_stock, ....other fields...
    from yourtable
    where ...conditions if any....
    order by in_stock desc
) as a
group by size,color;
If you GROUP BY some column(s) and also return a non-aggregated column that is not in the GROUP BY clause, then which row that column's value is taken from is not defined. It might be the first one, it might be the last one. It might change depending on the storage engine, or anything else.
If you specifically wanted the 2nd one, you could do something like this:
SELECT size, color, SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(in_stock ORDER BY FIELD(size, 'small', 'large', 'xlarge')), ',', 2), ',', -1)
FROM atable
GROUP BY size,color
I have a simple query, but I would like to see the results in a specific way. I would like to see 'N/A' at the top of the result, without having to resort to "Case When Then".
Select *
From Ordertype
Results:
Car21
Car34
Bus42
N/A
Thanks,
There are no 'overrides' for ORDER BY. If you want the specific order you're asking for, you'll have to use CASE:
SELECT type
FROM OrderType
ORDER BY
CASE
WHEN type = 'N/A' THEN 1
ELSE 2
END
,type
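To see the effect on the values from the question, here is a quick sketch with Python's built-in sqlite3. The table layout (a single type column) is an assumption based on the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderType (type TEXT)")
conn.executemany(
    "INSERT INTO OrderType VALUES (?)",
    [("Car21",), ("Car34",), ("Bus42",), ("N/A",)],
)

# The CASE puts 'N/A' in bucket 1 and everything else in bucket 2;
# the second sort key orders the rest alphabetically.
query = """
SELECT type
FROM OrderType
ORDER BY CASE WHEN type = 'N/A' THEN 1 ELSE 2 END,
         type
"""
ordered_types = [row[0] for row in conn.execute(query)]
print(ordered_types)
```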
If you want an arbitrary order that is not tied directly to the structure of a column (alphabetical/numerical) but rather to its importance, which only you know in your head, it can be useful to add a Rank column to your table.
Column1  Rank
Car21
Car34    2
Bus42    1
N/A      99
then you can do
select Column1
from Table
order by rank desc, column1
This will put highly ranked items first, then low-ranked items; rows that don't have a rank are sorted alphabetically by column1.
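A rough sketch of the rank-column idea with Python's built-in sqlite3 follows. The table and column names (ranked_types, sort_rank) and the extra unranked 'Van11' row are hypothetical. Note that SQLite treats NULL as smaller than any value, so unranked rows come last under DESC; other databases may place NULLs differently, so check yours.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ranked_types (column1 TEXT, sort_rank INTEGER)")
conn.executemany(
    "INSERT INTO ranked_types VALUES (?, ?)",
    [
        ("Car21", None),   # no rank assigned
        ("Car34", 2),
        ("Bus42", 1),
        ("N/A", 99),
        ("Van11", None),   # second unranked row to show the tie-break
    ],
)

# Highest rank first; rows without a rank fall to the end in SQLite
# and are ordered among themselves by column1.
query = """
SELECT column1
FROM ranked_types
ORDER BY sort_rank DESC, column1
"""
ranked = [row[0] for row in conn.execute(query)]
print(ranked)
```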
You can try this:
SELECT * FROM ordertype ORDER BY ID DESC
to see the newest ones first
I need to export data from an existing TABLE. Until recently I used:
SELECT item_ID, item_Info, ...moreFields... FROM tableName WHERE myCondition=0
Now they have changed the TABLE structure and added a new field, "item_Info2".
When they fill in the TABLE:
if "item_Type" is 1 or 2 or 3 or 4, then "item_Info" is relevant
if "item_Type" is 5, then "item_Info" is empty, and I need "item_Info2" as my query result, instead of "item_Info"
What is the corresponding SELECT command?
[similar question: mysql select any one field out of two with respect to the value of a third field
but from that example I cannot see the syntax for selecting moreFields]
Thanks,
Atara.
You can treat the CASE expression like any other column: separate it from the other columns with commas, etc. Whatever happens inside the CASE should be considered a single column, so in your case you need commas before and after it.
SELECT item_ID, CASE WHEN item_Type = 5 THEN item_Info2 ELSE item_Info END, field_name, another_field_name ...moreFields... FROM tableName WHERE myCondition=0
You could also use an alias to make the result easier to read from the query:
SELECT item_ID, CASE WHEN item_Type = 5 THEN item_Info2 ELSE item_Info END AS 'item_info', field_name, another_field_name ...moreFields... FROM tableName WHERE myCondition=0
To edit the reply on the other question:
SELECT
item_id, item_otherproperty,
CASE WHEN item_Type= 5
THEN item_Info2
ELSE item_Info
END AS `info` FROM table ...
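For reference, a minimal runnable sketch of this CASE-in-SELECT pattern using Python's built-in sqlite3. The table name items and the sample rows are assumptions made up for the demo; the column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    "item_ID INTEGER, item_Type INTEGER, "
    "item_Info TEXT, item_Info2 TEXT, myCondition INTEGER)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "a", None, 0),   # types 1-4 carry their data in item_Info
        (2, 5, None, "b", 0),   # type 5 carries its data in item_Info2
        (3, 5, None, "c", 1),   # filtered out by myCondition
    ],
)

# The CASE expression sits in the select list like any other column
# and is exposed under a single alias.
query = """
SELECT item_ID,
       CASE WHEN item_Type = 5 THEN item_Info2 ELSE item_Info END AS info
FROM items
WHERE myCondition = 0
ORDER BY item_ID
"""
exported = conn.execute(query).fetchall()
print(exported)
```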